## When the Best is not the Best

Posted by Konstantin 04.01.2016

Collecting large amounts of data and then using it to "teach" computers to automatically recognize patterns is pretty much standard practice nowadays. It seems that, given enough data and the right methods, computers can get quite precise at detecting or predicting nearly anything, whether it is face recognition, fraud detection or movie recommendations.

Whenever a new classification system is created, it is taken for granted that the system should be as precise as possible. Of course, classifiers that never make mistakes are rare, but if it is possible, we should strive to have them make as few mistakes as possible, right? Here is a fun example where things are not as obvious.

Consider a bank, which, as is normal for a bank, makes money by giving loans to its customers. Of course, there is always a risk that a customer will default (i.e. not repay the loan). To account for that, the bank has a risk scoring system which, for a given loan application, assesses the probability that the corresponding customer will default. This probability is later used to compute the interest rate offered to the customer. To simplify a bit, the interest rate on a loan might be computed as the sum of the customer's predicted default probability and a fixed profit margin. For example, if a customer is expected to default with probability 10% and the bank wants 5% profit on its loans on average, the loan might be issued at slightly above 15% interest. This would cover both the expected losses due to non-repayments and the profit margin.
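To see where "slightly above 15%" comes from, here is a minimal sketch of the pricing rule. The function names are made up for illustration, and the second function rests on an assumption the post does not spell out: a one-period loan where a defaulting customer repays nothing at all.

```python
def additive_rate(p_default, margin):
    """The simplified rule from the text: default risk plus profit margin."""
    return p_default + margin


def break_even_rate(p_default, margin):
    """Rate at which the expected return per unit lent equals the margin.

    Assumption (not from the post): a defaulted loan repays nothing, and
    every other loan repays principal plus interest after one period, so
    (1 - p_default) * (1 + rate) = 1 + margin.
    """
    return (1 + margin) / (1 - p_default) - 1


simple = additive_rate(0.10, 0.05)
exact = break_even_rate(0.10, 0.05)
print(f"additive rule:   {simple:.2%}")
print(f"break-even rate: {exact:.2%}")
```

Under that assumption the break-even rate comes out a little higher than the plain sum, because the interest is only collected from the customers who actually repay.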

Now, suppose the bank managed to develop a perfect scoring algorithm. That is, each application gets a rating of either 0% or 100% risk. Suppose as well that within a month the bank processes 1000 applications, half of which are predicted to be perfectly good and half perfectly bad. This means that 500 loans get issued with a 5% interest rate, while 500 do not get issued at all.

Consider what would happen if the system did not do such a great job and mistook 50 of the bad applications for good ones. In this case 450 applications would be classified as "100%" risk, while 550 would be assigned a risk score of "9.1%", that is, 50 out of 550 (we still require the system to provide valid risk probability estimates). The bank would then issue a total of 550 loans at 15%. Of course, 50 of those would not get repaid, yet this loss would be covered by the increased interest paid by the honest borrowers. The financial returns are thus exactly the same as with the perfect classifier. However, the bank now has more clients. More applications were signed, and more contract fees were received.
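The arithmetic behind this comparison can be checked with a short sketch. It assumes, for illustration only, that every loan has the same unit principal, runs for one period, and that a defaulted loan repays nothing; the helper name `portfolio_profit` is hypothetical.

```python
def portfolio_profit(n_issued, n_defaults, rate, principal=1.0):
    """Profit from a batch of one-period loans.

    Assumption: a defaulted loan repays nothing, while every other loan
    repays its principal plus interest.
    """
    n_repaid = n_issued - n_defaults
    money_in = n_repaid * principal * (1 + rate)
    money_out = n_issued * principal
    return money_in - money_out


# Perfect classifier: 500 loans at 5%, no defaults.
perfect = portfolio_profit(n_issued=500, n_defaults=0, rate=0.05)

# Imperfect classifier: 550 loans at 15%, 50 of them default.
imperfect = portfolio_profit(n_issued=550, n_defaults=50, rate=0.15)

# Risk score that the imperfect system must report for the mixed group.
predicted_risk = 50 / 550

print(f"predicted risk, mixed group:  {predicted_risk:.1%}")
print(f"profit, perfect classifier:   {perfect:.2f}")
print(f"profit, imperfect classifier: {imperfect:.2f}")
```

With these numbers both portfolios earn the same absolute profit (25 units of principal), although the second one ties up 550 units of capital instead of 500 to do so.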

True, the clients might be a bit less happy about getting a higher interest rate, but assuming they were ready to pay it anyway, the bank does not care. In fact, the bank would be more than happy to segment its customers by offering higher interest rates to low-risk customers anyway. It cannot do so openly, though. Established practices usually constrain banks to use "reasonable" scorecards and to offer better interest rates to low-risk customers.

Hence, at least in this particular example, a "worse" classifier is in fact better for business. Perfect precision is not really the ultimately desired feature. Instead, the system is much more useful when it provides a relevant and "smooth" distribution of predicted risk scores, making sure the scores themselves are decently precise estimates for the probability of a default.
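What "decently precise estimates" means can be made concrete with a simple calibration check: group loans by predicted risk and compare the average score in each group with the share of loans that actually defaulted. The sketch below is only one way to do this; the function name, the equal-width binning and the sample data (the 550 issued loans from the example) are illustrative assumptions.

```python
from collections import defaultdict


def calibration_table(scores, defaulted, n_bins=10):
    """Compare mean predicted risk with the observed default rate per bin.

    scores    -- predicted default probabilities in [0, 1]
    defaulted -- 1 if the loan defaulted, 0 otherwise
    Returns a list of (bin_index, count, mean_score, default_rate) tuples.
    """
    bins = defaultdict(list)
    for score, outcome in zip(scores, defaulted):
        # Equal-width bins; a score of exactly 1.0 goes into the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        bins[index].append((score, outcome))

    table = []
    for index in sorted(bins):
        rows = bins[index]
        mean_score = sum(s for s, _ in rows) / len(rows)
        default_rate = sum(o for _, o in rows) / len(rows)
        table.append((index, len(rows), mean_score, default_rate))
    return table


# The 550 issued loans from the example: all scored 50/550, 50 default.
scores = [50 / 550] * 550
defaulted = [1] * 50 + [0] * 500

for index, count, mean_score, default_rate in calibration_table(scores, defaulted):
    print(f"bin {index}: n={count}, predicted={mean_score:.3f}, observed={default_rate:.3f}")
```

A well-calibrated scoring system shows predicted and observed values close to each other in every bin, regardless of how sharply it separates good customers from bad ones.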
