Original author is Kurt Weiss, Director of Enterprise Sales at Ekata
Working in the identity verification space, I am often asked by financial institutions for information regarding our “email-to-name” coverage. While I’ll take any opportunity to boast about how Ekata’s 20+ years of data sourcing has created one of the most intelligent identity graphs of linkages between individuals and their phones, emails, and addresses – it’s often beside the point.
While establishing an “email-to-name” match can be a useful signal for traditional Know Your Customer (KYC) compliance purposes, it rarely cracks the top 10 most valuable signals when applied in a machine learning risk model for identity verification. The email signal that we do see consistently moving risk models is “email-first-seen-days”, which tracks when Ekata first saw an email enter our network. As powerful as this signal is, it is never the signal we’re asked about.
Limitations to the Deterministic Approach
The focus on name-matching coverage is a vestige of the deterministic approach to risk that has been at the core of identity verification for decades, especially amongst financial institutions. KYC implores risk teams to affirm these links: Does the applicant live at this address? Is this their phone? Is this their email? The answers to these questions are binary: Is there an “email-to-name” match? Yes or no?
While these are valuable insights, the concept of digital identity has grown increasingly complex and fraudsters have evolved to match that sophistication.
Alternative Approach to Complement Risk Models
For fraudsters today, replicating an “email-to-name” match that passes muster is easier than you would think. What takes more sophistication is the patience to sit on that email and let it age before using it in a malicious scheme. That’s where “email-first-seen-days” shines. The question we should all be asking is “can I trust this email?” And the answer to that question lies in a probabilistic approach.
Not every new email is fraudulent, but new emails correlate to higher risk of fraud. Ekata tracks the age of an email based on when it was first seen in our network. When an email has been seen in the Ekata Identity Network for less than 30 days, fraudulent activity is observed 100x more often than with those addresses seen outside the 30-day window.
For instance, when fraudsters generate a synthetic identity, they will pair a new disposable or temporary email address with hacked phone numbers and addresses. Legitimate customers, however, tend to use long-documented emails, and are thus known to Ekata for several years or more.
Risk signals like “email-first-seen-days” correlate with those behaviors and can identify where fraud is more likely to occur. That is the probabilistic approach. The answer isn’t a binary yes/no; it’s a number: the number of days since Ekata first saw the email, paired with a correlation against known fraud that demonstrates its probabilistic value.
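To make the idea concrete, here is a minimal sketch of how a risk team might measure that correlation on its own labeled history. Everything in it is an assumption for illustration: the function names, the record layout of (days since first seen, confirmed fraud flag), and the 30-day cutoff used as a default. It is not Ekata's method or API.

```python
from datetime import date


def email_first_seen_days(first_seen: date, today: date) -> int:
    """Days since the email was first observed (hypothetical helper)."""
    return max((today - first_seen).days, 0)


def fraud_rate_by_age(records, threshold_days=30):
    """Fraud rate for 'new' vs. 'aged' emails.

    records: iterable of (first_seen_days, is_fraud) pairs taken from
    your own labeled outcomes (assumed layout).
    """
    counts = {"new": [0, 0], "aged": [0, 0]}  # [fraud count, total count]
    for days, is_fraud in records:
        bucket = "new" if days < threshold_days else "aged"
        counts[bucket][0] += int(is_fraud)
        counts[bucket][1] += 1
    return {b: (f / t if t else 0.0) for b, (f, t) in counts.items()}


def relative_risk(rates):
    """How many times more often fraud appears among new emails."""
    return rates["new"] / rates["aged"] if rates["aged"] else float("inf")
```

Running `relative_risk(fraud_rate_by_age(records))` over historical applications gives a single multiplier for new versus aged emails. The value will depend entirely on the data you feed it.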
Benefits of a Probabilistic Approach
To take it even further, we can look at how long the email has been associated with the phone number provided. That again is not a question posed by KYC. However, when leveraged in a model trained for probabilistic fraud risk, it is one of the most important features enabling banks to root out synthetic identities before it’s too late.
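As a rough illustration of how such signals can sit side by side in a model, the sketch below combines a binary KYC-style match with the two age-based signals in a simple logistic score. The field names, the log transform, and especially the weights are made up for the example; a real model would learn its weights from labeled fraud outcomes, and nothing here reflects Ekata's actual features or scoring.

```python
import math
from dataclasses import dataclass


@dataclass
class Signals:
    email_to_name_match: bool      # deterministic, KYC-style answer
    email_first_seen_days: int     # days since the email was first seen
    email_phone_linked_days: int   # days the email and phone have been linked


# Illustrative weights only (assumption); negative means "lowers risk".
WEIGHTS = {
    "bias": 1.0,
    "name_match": -0.5,
    "log_email_age": -0.6,
    "log_link_age": -0.8,
}


def risk_score(s: Signals) -> float:
    """Return a value in (0, 1); higher means riskier under these toy weights."""
    z = (
        WEIGHTS["bias"]
        + WEIGHTS["name_match"] * float(s.email_to_name_match)
        + WEIGHTS["log_email_age"] * math.log1p(s.email_first_seen_days)
        + WEIGHTS["log_link_age"] * math.log1p(s.email_phone_linked_days)
    )
    return 1.0 / (1.0 + math.exp(-z))


# Both applicants pass the binary name match, yet score very differently.
fresh = Signals(email_to_name_match=True, email_first_seen_days=3, email_phone_linked_days=0)
established = Signals(email_to_name_match=True, email_first_seen_days=2000, email_phone_linked_days=1500)
print(risk_score(fresh) > risk_score(established))
```

The point of the design is that the name match stays in the model as one input, while the age-based signals separate two applicants that a yes/no check would treat identically.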
Financial institutions are beholden to a deterministic risk assessment, but that shouldn’t limit the questions they ask to verify an identity. Seeking probabilistic answers complements that deterministic knowledge with a risk assessment that allows institutions to refocus on the customer first in pursuit of compliance diligence.
Ekata is helping financial institutions bridge the gap between deterministic and probabilistic risk, improving KYC by first ensuring that you actually know who your customer is. To find out more about how our risk signals can increase your revenue, contact us today.