Quantifying Model Performance: Terms You Should Know

Of all our identity data APIs, the Identity Check API is the one most often used in machine learning (ML) models, and many companies use Identity Check data to improve fraud model performance. The data science teams at these companies need to quantify how those models perform after adding our data. But how do they do that?
Here are a few terms data scientists often use when quantifying the performance of machine learning models: Receiver Operating Characteristic (ROC), Area Under Curve (AUC), and Kolmogorov-Smirnov (KS). ROC and KS are two of the most common ways to evaluate model performance, and since most of our customers use the ROC method, that is the one we will cover in some detail. In this post, we'll discuss ROC and AUC.
Receiver Operating Characteristic (ROC)
An ROC curve is a graph that plots the true positive rate against the false positive rate at different classification thresholds.

Let’s say you would like to evaluate the performance of a fraud detection model. One way to quantify the model's performance is to assess transactions flagged as fraud and held for manual review. In this example, if a transaction is flagged correctly and sent to manual review, it is a true positive: the transaction is truly fraudulent. If a transaction is not flagged as fraud and is indeed from a good customer, it is a true negative. If a transaction is from a good customer but is flagged as fraud and held for manual review, it is a false positive. Finally, if the system fails to flag a transaction that is indeed fraudulent, it is a false negative.
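To make these four outcomes concrete, here is a minimal Python sketch that counts them for a batch of transactions and derives the true positive rate and false positive rate. The labels and flags below are made-up values for illustration, not output from the Identity Check API.

```python
# Count the four outcomes for a batch of transactions.
# is_fraud: ground-truth labels; flagged: whether the model held the
# transaction for manual review. Both lists are illustrative.
is_fraud = [True, False, True, False, False, True, False, False]
flagged  = [True, False, False, True, False, True, False, False]

tp = sum(f and y for f, y in zip(flagged, is_fraud))          # fraud correctly flagged
tn = sum(not f and not y for f, y in zip(flagged, is_fraud))  # good customer correctly passed
fp = sum(f and not y for f, y in zip(flagged, is_fraud))      # good customer held for review
fn = sum(not f and y for f, y in zip(flagged, is_fraud))      # fraud the system missed

true_positive_rate = tp / (tp + fn)    # share of real fraud that gets caught
false_positive_rate = fp / (fp + tn)   # share of good customers who get flagged

print(tp, tn, fp, fn)
print(true_positive_rate, false_positive_rate)
```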
The perfect result for this fraud model example would be that the system catches 100% of the truly fraudulent transactions while impacting 0% of good customers. On the chart below, this result would sit in the top-left corner, where the false positive rate is 0% and the true positive rate is 100%.
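An ROC curve is built by sweeping the model's decision threshold and recording the false positive rate and true positive rate at each setting. The sketch below shows the idea with hypothetical fraud scores; in practice the scores would come from your model.

```python
# Sweep decision thresholds over hypothetical fraud scores and collect
# one (false positive rate, true positive rate) point per threshold.
scores   = [0.95, 0.90, 0.72, 0.60, 0.45, 0.30, 0.20, 0.05]
is_fraud = [True, True, False, True, False, False, False, False]

def roc_point(threshold):
    flagged = [s >= threshold for s in scores]
    tp = sum(f and y for f, y in zip(flagged, is_fraud))
    fp = sum(f and not y for f, y in zip(flagged, is_fraud))
    positives = sum(is_fraud)
    negatives = len(is_fraud) - positives
    return fp / negatives, tp / positives

curve = [roc_point(t) for t in (1.0, 0.8, 0.65, 0.5, 0.25, 0.0)]
print(curve)  # (0.0, 1.0) would be the perfect top-left corner
```

Each point corresponds to one possible operating threshold; the closer the curve hugs the top-left corner, the better the model separates fraud from good customers.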

Area Under Curve (AUC)
Another term you’ll often hear data scientists use when evaluating the performance of machine learning models is Area Under Curve (AUC). The AUC is the area under the ROC curve: it tells you how well the model is performing overall and whether the results improve or worsen when changes are made to the model.
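A common way to compute the AUC is to take the (false positive rate, true positive rate) points from the ROC curve and sum the trapezoids under them. The sketch below uses illustrative points, such as the output of the threshold sweep above.

```python
# Approximate the area under an ROC curve with the trapezoidal rule.
# `curve` holds (false positive rate, true positive rate) points;
# the values here are illustrative.
curve = [(0.0, 0.0), (0.0, 0.33), (0.2, 0.67), (0.4, 1.0), (1.0, 1.0)]

curve = sorted(curve)  # order points by false positive rate
auc = 0.0
for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
    auc += (x1 - x0) * (y0 + y1) / 2  # area of each trapezoid

print(round(auc, 3))  # 1.0 is a perfect model, 0.5 is random guessing
```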
The operating threshold you choose along the ROC curve drives the number of manual reviews and the number of chargebacks. If you know the percentage of fraud captured and the percentage of chargebacks processed, you can estimate how much money the model saves the company by reducing chargebacks.
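As a back-of-the-envelope example, here is how that estimate might look in Python. Every number below is hypothetical; substitute your own transaction volume, fraud rate, chargeback cost, and the capture rate at the threshold you operate at.

```python
# Rough estimate of monthly chargeback savings at one operating point.
# All inputs are hypothetical placeholders.
monthly_transactions = 500_000
fraud_rate           = 0.01    # 1% of transactions are fraudulent
avg_chargeback_cost  = 75.00   # average loss plus fees per chargeback, in dollars
fraud_capture_rate   = 0.85    # share of fraud caught at the chosen threshold

fraud_attempts    = monthly_transactions * fraud_rate
prevented         = fraud_attempts * fraud_capture_rate
estimated_savings = prevented * avg_chargeback_cost

print(f"Estimated monthly chargeback savings: ${estimated_savings:,.0f}")
```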
The AUC only tells you how well the model performs in general; it does not tell you how the business as a whole is doing. The ROC curve, however, can help you understand how much true fraud the system is catching and what the actual business impact of the model is.
See a Live Demo of Identity Check
The Ekata Identity Check API works with the five core traditional and digital data attributes of email, phone, name, address, and IP to help you verify the identities of your customers, ensuring that fraudsters are blocked while good customers are not. Contact us today to see a live demo of Identity Check.

Start a Free Trial

To see how Ekata can reduce fraud risk for your business, contact us for a demo.