Creating a Better Customer Experience with Probabilistic Data

Ekata takes a probabilistic approach to identity verification and risk, in contrast to traditional methods, which are deterministic. The advantages of a probabilistic solution are scale, speed, and cost. Deterministic approaches are not only costly but can also result in poor customer experiences and business outcomes.

So what is deterministic identity verification? Anyone who has applied for a credit card has been through one kind: a Social Security number deterministically ties an individual’s identity to their credit history. As another example, online merchants might hire manual review agents to confirm that a user account matches the shipping data provided for a transaction.

For a business operating online, some form of deterministic verification is necessary, but applying it in every case does not scale. Mixing in probabilistic data lets product owners reserve deterministic verification for the users who pose the most risk.

Let’s explore with a simple analogy.

The Deterministic Farmer and the Disappearing Chickens

Imagine a farmer who needs to identify the culprit who keeps taking chickens from the coops every night. A deterministic approach would be to get indisputable evidence by setting up a coop-mounted camera to catch the chicken-napper. But that approach can be expensive and time-consuming, especially if the farmer has many coops to protect. While the farmer spends all that time hunting down one fox with a single method, more foxes can get in and more chickens can be lost.

The Probabilistic Farmer Finds the Fox

A probabilistic farmer, by contrast, looks for data from a variety of sources. For example, they may look for animal tracks around the coop. The farmer can learn a lot from the tracks:

  • What kind of animal is it?
  • How is it getting in?
  • How big is the animal?
  • Where did it go next?

The animal tracks can offer the same information as video evidence, but they require some inference. The farmer benefits from probabilistic data, which is less expensive and faster to gather and can yield the same relevant information as deterministic data.

In the world of ecommerce and financial services, deterministic approaches to verification can be as simple as knowledge-based checks (for example, “select an address where you have lived before”) or as drastic as ID uploads and credit checks. With each added level of friction, customers are more likely to drop off.

Increasing friction can be a powerful tool to reduce fraud, but if it is applied to every new customer, the trade-off tends toward greater losses in revenue. Ekata’s probabilistic approach can tell you when to require deterministic verification and when to let new users go through a low-friction experience.
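A back-of-the-envelope model can make this trade-off concrete. The sketch below is illustrative only: the function name, its parameters, and every number in the usage lines are hypothetical assumptions, not Ekata figures. It simply adds revenue lost to good customers who abandon a high-friction check to the cost of fraud that gets through.

```python
def policy_loss(*, bad_total, good_checked, bad_checked,
                order_value, fraud_cost, dropoff_rate, catch_rate):
    """Expected loss for one verification policy (hypothetical model).

    bad_total:    number of fraudulent customers overall
    good_checked: good customers sent through deterministic verification
    bad_checked:  fraudulent customers sent through deterministic verification
    dropoff_rate: share of checked good customers who abandon
    catch_rate:   share of checked fraudsters who are stopped
    """
    # Revenue lost when good customers give up on the extra step.
    abandoned_revenue = good_checked * dropoff_rate * order_value
    # Fraud that is never checked, plus fraud the check fails to stop.
    missed_fraud = (bad_total - bad_checked * catch_rate) * fraud_cost
    return abandoned_revenue + missed_fraud


# Made-up scenario: 9,900 good and 100 fraudulent new customers.
common = dict(bad_total=100, order_value=50, fraud_cost=150,
              dropoff_rate=0.2, catch_rate=0.9)

check_everyone = policy_loss(good_checked=9900, bad_checked=100, **common)
check_nobody = policy_loss(good_checked=0, bad_checked=0, **common)
# Targeted: only the riskiest segment is checked (assumed to contain
# 5% of good customers and 80% of fraudsters).
check_targeted = policy_loss(good_checked=495, bad_checked=80, **common)

print(check_everyone, check_nobody, check_targeted)
```

Under these assumed numbers, checking everyone is the most expensive policy and targeted checking the least expensive. The ordering depends entirely on the inputs, so the model is only a way to reason about the trade-off, not a prediction.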

Ekata takes a probabilistic approach to identity risk through two main assets. First, the Ekata Identity Graph is composed of third-party verified data (aka “animal tracks”). Second, the Ekata Identity Network is built from the activity and relationships of identity elements as seen by our customers (aka “assessing where the animal tracks have been seen”).

In addition to other attributes, we return two scores: the Identity Network Score (based on our network) and the Identity Risk Score (based on all the elements in both the network and the graph). Both scores come from machine learning models trained on real fraud, which are inherently probabilistic.

The data provided by Ekata can be thought of as a digital footprint. Certain attribute patterns are typical of good identities, and others are typical of bad identities. The chart below shows the distribution of Identity Network Scores (y axis) and Identity Risk Scores (x axis) for thousands of identities. The identities on the left are good identities, not associated with fraud, and the identities on the right are bad identities, associated with fraud.

[Chart: Identity Network Score (y axis) versus Identity Risk Score (x axis)]

So if we observe a random identity with two low scores (it would fall in the lower left of our chart), we can be fairly certain that the identity is good. We can bet on it because, probabilistically, very few bad identities have been associated with low scores. Thus, using Ekata data, we know to apply friction to those with the highest scores and to give the group with the lowest scores the best customer experience possible.
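As a rough illustration of this routing idea, the sketch below maps the two scores to a verification path. Everything in it is an assumption made for the example: the 0.0 to 1.0 score scale, the two cut-off values, the class and function names, and the middle "standard" tier are hypothetical and are not taken from Ekata's API or documentation.

```python
from dataclasses import dataclass

# Hypothetical cut-offs. Assumes both scores are scaled 0.0 to 1.0 and
# that higher means riskier; real scales and thresholds may differ.
LOW_RISK = 0.2
HIGH_RISK = 0.8


@dataclass
class IdentityScores:
    network_score: float  # stand-in for the Identity Network Score
    risk_score: float     # stand-in for the Identity Risk Score


def choose_experience(scores: IdentityScores) -> str:
    """Pick a verification path from the two scores."""
    # Both scores low: very few bad identities land here, so skip friction.
    if scores.network_score <= LOW_RISK and scores.risk_score <= LOW_RISK:
        return "low_friction"
    # Either score high: reserve deterministic checks for this group.
    if scores.network_score >= HIGH_RISK or scores.risk_score >= HIGH_RISK:
        return "deterministic_verification"
    # Everything in between gets the default flow (an assumed middle tier).
    return "standard"


print(choose_experience(IdentityScores(network_score=0.05, risk_score=0.10)))
print(choose_experience(IdentityScores(network_score=0.90, risk_score=0.40)))
```

In practice the cut-offs would be tuned against a business's own fraud and conversion data rather than fixed in advance.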

Author

Frank Turner

Senior Field Data Scientist, Seattle

Frank is an electrical engineer who fell in love with the study of data science; he even spends his evenings teaching data science bootcamps. He is committed to doing everything he possibly can to understand a customer workflow and to learning from every problem he tackles!
