An Introduction to Analyzing Customer Behavior using AI and Dynamic Time Warping
Online fraud is increasing across all domains and industries.
Apart from identity theft and credential theft, there are a number of people who attempt to appear better on paper by disingenuously answering questions when signing up for a service with a company.
What if people with fraudulent intentions could be identified while they are signing up online?
At ForMotiv, we use Artificial Intelligence to identify a risky or fraudulent customer, in real time, when they sign up for a new account.
In this blog, we demonstrate how these risky or fraudulent customers can be identified when signing up for a new credit card account online. Being reactive is no longer enough when preventing insurance fraud; we must be proactive.
Primitive Fraud Detectors
Before the introduction of online applications, customers would visit a bank or credit card company to complete the paperwork for a credit card application.
Usually, this paperwork was done in front of an agent who could identify fraud by reading the applicant’s body language while they filled in the application.
For instance, if they changed answers multiple times to questions or appeared to be lying when answering a question, the agent would likely further qualify that candidate.
These physical agents were the earliest form of fraud detectors.
Physical agents were no longer needed when the applications were made available online, which led to the customer’s behavior going unnoticed.
If you can’t see them, you obviously can’t read their body language.
So, what can be done now?
To overcome this, we can set up a virtual agent that analyzes behavior based on a customer’s actions.
Virtual Agent — Behavioral Intelligence
A customer interacts with an online application through a computer/mobile that is connected to the internet.
Every action of the customer can be recorded: the inputs given through their keyboard or mouse, and the time taken for each input.
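To make the idea of recording actions concrete, here is a minimal sketch of how timed inputs could be stored and turned into a per-field sequence, the kind of sequence this post calls a micro time series. The `KeyEvent` class, its attribute names, and the `micro_time_series` helper are hypothetical names chosen for illustration, not ForMotiv's actual data model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class KeyEvent:
    # All attribute names here are illustrative assumptions.
    form_field: str   # form field the event belongs to, e.g. "email"
    key: str          # "Input", "Backspace", "Cut", "Copy" or "Paste"
    timestamp: float  # seconds since the page was loaded


def micro_time_series(events: List[KeyEvent], form_field: str) -> List[Tuple[str, float]]:
    """Return (key, seconds since the previous event) pairs for one form field."""
    selected = sorted(
        (e for e in events if e.form_field == form_field),
        key=lambda e: e.timestamp,
    )
    series = []
    previous = None
    for event in selected:
        gap = 0.0 if previous is None else event.timestamp - previous
        series.append((event.key, round(gap, 3)))
        previous = event.timestamp
    return series


# Made-up events for illustration only.
events = [
    KeyEvent("email", "Input", 1.20),
    KeyEvent("email", "Input", 1.45),
    KeyEvent("phone", "Input", 5.00),
    KeyEvent("email", "Backspace", 2.10),
]
print(micro_time_series(events, "email"))
# [('Input', 0.0), ('Input', 0.25), ('Backspace', 0.65)]
```

Storing the gap between events rather than the raw timestamp is a design choice made here so that two customers who start typing at different moments can still be compared.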
Adapting a Speech Recognition Technique to Identify Behavior
In order to identify a customer behavior pattern, each customer’s micro time series (the timed sequence of inputs they used to complete a form field) has to be compared with the micro time series of every other customer.
The comparison of two data points is usually based on the distance between them (Euclidean/Manhattan): the smaller the distance, the more similar they are to each other.
The major problem is that the micro time series of each customer differs in length. A customer can use any combination of keys (Input/Backspace/Cut/Copy/Paste), with varying time between them, to complete a form field, and this combination may not be the same for another customer.
One similar example of this problem is speech recognition.
Suppose a person speaks the same sentence twice, the first time faster and the second time slower.
Traditional Euclidean distance matching compares the points of the two speeches at the same point in time. In this case, as the two speeches are out of sync in time, the Euclidean distance becomes high (showing high dissimilarity).
To solve this, Dynamic Time Warping allows a point in one speech to be matched with a point that occurs earlier or later in the other speech. This is how it works:
1. Each point from speech one is compared with every point of speech two by calculating a distance metric similar to Euclidean distance. Similarly, each key in the micro time series of a customer is compared with the micro time series of another customer.
2. For each point of speech one, the least-distance match among the points of speech two is taken, keeping the matches in time order. In the same way, the least distance is calculated between the micro time series of two different customers.
3. These matches trace a warping path based on the least distances. The closer this path is to a straight line, the more similar the speeches are. DTW can always find such a path, irrespective of the lengths of the two time series.
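The steps above can be sketched in a few lines of Python. This is the textbook cumulative-cost formulation of DTW, using the absolute difference between two values as the local distance; the exact variant used in production is not stated in this post, so treat the function names and the toy sequences as assumptions for illustration. A point-by-point Euclidean comparison is included to show the out-of-sync problem.

```python
import math


def euclidean_lockstep(a, b):
    """Point-by-point Euclidean distance; only defined for equal lengths."""
    if len(a) != len(b):
        raise ValueError("lockstep comparison needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Total cost of the cheapest warping path between series a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest way to align the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])  # local distance between two points
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# The same shape "spoken" fast and slow: different lengths, DTW distance 0.
fast = [0, 1, 2, 1, 0]
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
print(dtw_distance(fast, slow))  # 0.0

# Same length but shifted in time: lockstep distance is high, DTW is not.
early = [0, 1, 2, 1, 0, 0]
late = [0, 0, 1, 2, 1, 0]
print(euclidean_lockstep(early, late))  # 2.0
print(dtw_distance(early, late))        # 0.0
```

For customer behavior, the numeric values would be something like the gaps between key inputs in a form field rather than audio samples.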
Dynamic Time Warping
Dynamic Time Warping (DTW) is a technique, widely used in A.I. applications such as speech recognition, that is very useful for aligning and comparing sequences of unequal length.
Customers’ key inputs likewise have unequal lengths and varying speeds.
The micro time series were grouped by similarity for each form field (email, phone number, last name, etc.).
Except for the outliers, the remaining groups for each form field become the regular filling pattern for that form field. For example, DTW distances are calculated between the various micro time series within the phone number field. A micro time series with a high distance from the other micro time series becomes an outlier, while the rest become the regular filling pattern for the phone number field.
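One simple way to turn pairwise DTW distances into outlier flags is sketched below. It scores each micro time series by its median DTW distance to all the others and flags it when that score exceeds a threshold. The use of the median, the threshold value, and the sample gap data are all assumptions made for this illustration; the post does not specify how "high distance" is decided.

```python
from statistics import median


def dtw_distance(a, b):
    """Cumulative-cost DTW with absolute difference as the local distance."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def flag_outliers(series_list, threshold):
    """Flag each series whose median DTW distance to the others is above threshold."""
    flags = []
    for i, series in enumerate(series_list):
        distances = [
            dtw_distance(series, other)
            for j, other in enumerate(series_list)
            if j != i
        ]
        # The median keeps one unusual series from inflating everyone's score.
        flags.append(median(distances) > threshold)
    return flags


# Made-up gaps (in seconds) between key inputs in the phone number field.
phone_gaps = [
    [0.2, 0.2, 0.3, 0.2],
    [0.2, 0.3, 0.2],
    [0.3, 0.2, 0.2, 0.2, 0.3],
    [0.2, 2.5, 0.1, 3.0, 0.2, 2.8],  # long pauses between keys
]
print(flag_outliers(phone_gaps, threshold=2.0))
# [False, False, False, True]
```

The three series that are not flagged would form the regular filling pattern for the field in this toy example.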
Likewise, DTW is applied to the time series between form fields on all pages, between form fields on each page, and between pages, to generate a pattern for genuine filling.
So, any application time series pattern that doesn’t fall under this common pattern has a higher chance of being a risky or fraudulent user.

Segmenting users with similar typing patterns
Users can be clustered with k-means, based on Euclidean distance, to identify groups of users with similar typing patterns.
The outliers have a higher chance of being risky or fraudulent applicants.
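As a rough illustration of this clustering step, here is a small pure-Python k-means (Lloyd's algorithm) using Euclidean distance. The two features per user (fraction of active time, keys per second), the sample values, and the choice of starting centroids are assumptions for this sketch; a real pipeline would more likely use a library implementation and scale the features first.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(column) / len(members) for column in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Made-up users: (fraction of active time, typing speed in keys/sec).
users = [
    (0.65, 3.5), (0.70, 4.0), (0.75, 4.5),  # slower typists
    (0.95, 6.0), (0.92, 6.5), (0.98, 7.0),  # faster typists
]
labels, centroids = kmeans(users, centroids=[users[0], users[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because typing speed has a much larger numeric range than the active-time fraction, it dominates the distance here, which is why feature scaling matters in practice.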
The plots below are an example of users filling in the email form field.
Using the micro time series, two new features were generated. One is the cumulative percentage of total time used at each key input in the email form field. The other is the aggression rate of users, which is the total number of keys each user has input by a given time.
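The two features can be read in more than one way, so the sketch below shows just one plausible interpretation: the cumulative share of the total field time that has elapsed at each key input, and the count of keys entered up to a given moment. The function names and the sample timestamps are hypothetical.

```python
def cumulative_time_percentage(key_times):
    """Percent of the total time in the field that has elapsed at each key input.

    key_times are seconds since the field was clicked, in increasing order;
    the total time is taken to be the time of the last key (an assumption).
    """
    total = key_times[-1]
    if total == 0:
        return [100.0 for _ in key_times]
    return [100.0 * t / total for t in key_times]


def keys_typed_by(key_times, at_time):
    """Number of key inputs made up to and including at_time (in seconds)."""
    return sum(1 for t in key_times if t <= at_time)


# Made-up key timestamps for one user in the email field.
email_key_times = [0.5, 1.0, 1.5, 4.0]
print(cumulative_time_percentage(email_key_times))  # [12.5, 25.0, 37.5, 100.0]
print(keys_typed_by(email_key_times, at_time=2.0))  # 3
```

In this toy example the user has entered three of four keys in the first half of the time, then paused before the last one.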
Type 1 Users
Total users in this cluster: 33/210 (15%)
Active time: 60% to 80%
Idle time: 40% to 20%
Total Time in the email field: 7 sec to 9 sec
The first letter is typed within 2 seconds after the email field is clicked
Typing speed: 3 keys/sec to 5 keys/sec
These users are slow and less aggressive in typing
Type 2 Users
Total users in this cluster: 156/210 (74%)
Active time: 90% to 100%
Idle time: 10% to 0%
Total Time in email field: less than 6 sec.
The first letter is typed within 4 seconds after the email field is clicked
Typing speed: more than 5 keys/sec
These users are fast and more aggressive in typing
More types of users were segmented to identify their typing patterns.
Type 2 users might have used autofill or more backspaces while filling in the email.
They can be further classified based on error ratio (the ratio of backspaces to total keys). Autofill would have hardly any error ratio and fewer total keys, whereas a high error ratio and high total keys indicate that the user tried multiple email IDs, which could be one of the indicators of a risky customer.
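The error ratio described above is straightforward to compute. The sketch below also shows how it could be combined with the total key count to separate the two cases; the threshold values and the label strings are made up for illustration and would need to be tuned on real data.

```python
def error_ratio(keys):
    """Ratio of Backspace presses to total keys for one form field."""
    if not keys:
        return 0.0
    return keys.count("Backspace") / len(keys)


def classify_email_entry(keys, max_autofill_keys=3, high_error_ratio=0.3,
                         high_total_keys=40):
    """Rough labelling of an email entry; all thresholds are hypothetical."""
    ratio = error_ratio(keys)
    if ratio == 0.0 and len(keys) <= max_autofill_keys:
        return "likely autofill"
    if ratio >= high_error_ratio and len(keys) >= high_total_keys:
        return "possible multiple email attempts"
    return "regular typing"


autofilled = ["Input"]
retyped = ["Input"] * 30 + ["Backspace"] * 20
regular = ["Input"] * 20 + ["Backspace"]

print(classify_email_entry(autofilled))  # likely autofill
print(classify_email_entry(retyped))     # possible multiple email attempts
print(classify_email_entry(regular))     # regular typing
```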
This process is repeated for other form fields to shape the final dataset for the prediction model.
Adding features to the Prediction Model
The aggression rate, the percentage utilization of overall time for each input, the error ratio, the total keys, and the DTW distance between users are some of the new features that can be added to the final dataset. They sit alongside other behavioral features, such as the last hover field, the time taken to submit the application, the mean time taken for each field, the number of tries for each field, and the total number of sessions taken to submit the application, together with the labels, to build the prediction model.
How does this help the company?
Identifying fraudulent users as soon as they submit the application brings huge cost savings, considering present fraud trends and the effort it takes to track fraudsters down.
Written by Vinod Raj