A comprehensive guide to fincrime AI basics

Katherine Gormley, Head of AML Products

There’s no doubt about it—the battle against fraud and financial crime has become increasingly complex. Traditional methods of detection and prevention are no longer sufficient to combat the ever-advancing tactics employed by cybercriminals and fraudsters. 

In response to this challenge, the integration of AI has become a focal point for banks and fintech companies in their quest to safeguard their financial systems. However, a lack of understanding about the applications of AI often holds many organizations back from embracing its potential. 

In this comprehensive guide, we’ll walk through the fundamentals of AI and explore how it can enhance detection, prevention, and investigation in the fight against fraud and financial crime, and how it can help organizations advance their fraud detection and anti-money laundering (FRAML) efforts.

An introduction to fundamental AI concepts

Before diving into the practical applications of AI in fraud and fincrime, it's essential to grasp a series of key concepts that form the foundation of AI technologies. Let's break down these fundamental concepts one by one:

Artificial intelligence (plus, machine learning and deep learning)

Before we talk about anything else, let’s set things straight on the definition of artificial intelligence, which is an extremely broad concept. 

In simple terms, AI is a computer system’s ability to perform cognitive functions that we typically associate with human minds, such as reasoning, learning, perceiving, and so on. 

Related to AI are the terms machine learning (ML) and deep learning (DL). Although these three are often used interchangeably, they aren’t one and the same.

While AI is the overarching concept of creating intelligent machines, ML is a subset of AI that focuses on enabling those machines to learn from data. DL, on the other hand, is a narrower subset of ML that specifically deals with deep neural networks that can learn intricate data patterns. In a nutshell, AI encompasses ML and ML encompasses DL. 

At Resistant AI, most of our AI-powered fraud detection and prevention applications rely on ML.

Classification: supervised, unsupervised, and semi-supervised methods

Put simply, classification is the systematic grouping of observations into categories. More specifically, the classification process involves the task of determining whether a transaction, person, or customer belongs to a specific class (such as legitimate, criminal, or a specific type of crime).

There are three main approaches to classification:

  1. Supervised classification: This method entails training a model with labeled data, which allows it to learn patterns and make predictions based on its resulting knowledge. Labeling data can be understood as assigning it an informative value that a machine learning model can use to contextualize and learn from: for example, that a document or transaction is fraudulent (positive label) or legitimate (negative label), or that it fits a specific behavioral typology (structuring, chargeback fraud, etc.). Supervised classification’s reliance on labeled data means that this approach can often be time-consuming, and a solid level of expertise is required when establishing the labels for input and output variables. However, the output is largely predictable and accurate. You are essentially showing the system what you want to see, and telling it to find more of the same.

  2. Unsupervised classification: In unsupervised classification, data is grouped into clusters without making use of labeled data. Instead, this method allows the AI system to autonomously discover patterns or anomalies in the data. As a result, output can significantly differ in terms of accuracy, but at the same time, unsupervised classification can yield insightful results that wouldn’t be possible when using a supervised approach. In other words, you are letting the system find things on its own, often bringing back things you’d never know to look for in the first place—though not all of it is relevant. 

  3. Semi-supervised classification: Semi-supervised classification brings together elements of supervised and unsupervised methods. In practice, this method relies on a smaller amount of labeled data and a larger pool of unlabeled data to make informed decisions—this delivers results that are both accurate and insightful. Thus, this approach is the most effective one for the dynamic environment of financial crime detection, and we frequently employ semi-supervised classification in our work.
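To make the semi-supervised idea more concrete, here is a minimal sketch in plain Python. It is not how any production system works: the nearest-centroid classifier, the self-training loop, and the two toy features are all assumptions chosen for illustration. A handful of labeled transactions seed the model (the supervised part), and a larger unlabeled pool is then pseudo-labeled and folded back in to refine it.

```python
from math import dist
from statistics import mean

def centroid(points):
    """Mean point of a list of equal-length feature tuples."""
    return tuple(mean(axis) for axis in zip(*points))

def fit_centroids(labeled):
    """Supervised step: one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest to the features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

def self_train(labeled, unlabeled, rounds=3):
    """Semi-supervised step: pseudo-label the unlabeled pool, then refit."""
    centroids = fit_centroids(labeled)
    for _ in range(rounds):
        pseudo = [(x, predict(centroids, x)) for x in unlabeled]
        centroids = fit_centroids(labeled + pseudo)
    return centroids

# Hypothetical features: (normalized amount, normalized transfers per day).
labeled = [((0.1, 0.2), "legitimate"), ((0.9, 0.8), "fraudulent")]
unlabeled = [(0.2, 0.1), (0.15, 0.3), (0.8, 0.9), (0.95, 0.7)]

model = self_train(labeled, unlabeled)
print(predict(model, (0.85, 0.75)))
```

Calling `fit_centroids(labeled)` on its own would be the purely supervised version; the unlabeled pool only adjusts where the class centers end up.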

Replication and generalization

In AI, replication involves making decisions in situations that closely resemble past scenarios, which strongly supports explainability (more on this below). This is because replication fosters consistent decision-making and reliability: You’ve already taught the system that a specific profile of transactions is indicative of criminal behavior and that you want it flagged, and so it flags all further transactions that match that profile for your review.

Generalization, on the other hand, empowers AI to make decisions about unfamiliar situations based on past knowledge and experience. For AI, this is both a powerful and a risky capability. It is powerful because it allows AI to catch new criminals and techniques: When the system detects a completely new behavior that nonetheless has elements that correlate with known criminal typologies, it can recognize them and flag the transactions for review. It is risky because the same flexibility can lead the system to flag unfamiliar behavior that turns out to be legitimate.

Due to the strengths and weaknesses of replication and generalization, it’s important to strike the right balance between these two concepts to build reliable AI systems.
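One rough way to picture the difference is as exact matching versus similarity matching. The sketch below is only an analogy under stated assumptions: the behavior tags, the `KNOWN_TYPOLOGY` set, and the Jaccard-overlap threshold are hypothetical, and real models learn far richer representations than sets of tags.

```python
# Hypothetical behavior tags describing one known criminal typology.
KNOWN_TYPOLOGY = {"many_small_deposits", "just_under_threshold", "rapid_withdrawal"}

def replicate(behaviors):
    """Replication: flag only an exact match with the known profile."""
    return behaviors == KNOWN_TYPOLOGY

def generalize(behaviors, min_overlap=0.5):
    """Generalization: flag when the Jaccard overlap with the profile is high enough."""
    overlap = len(behaviors & KNOWN_TYPOLOGY) / len(behaviors | KNOWN_TYPOLOGY)
    return overlap >= min_overlap

# A new pattern that shares two of three known elements plus one unseen element.
new_pattern = {"many_small_deposits", "just_under_threshold", "crypto_offramp"}

print(replicate(new_pattern))   # exact match fails
print(generalize(new_pattern))  # partial overlap is enough to flag
```

Lowering `min_overlap` catches more novel patterns but also flags more legitimate behavior, which is the balance the section describes.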

AI precision and recall

Precision is the share of true positive results (in other words, actual fraudulent transactions correctly identified as fraudulent) out of all positive results (both true positives and false positives). High precision means a high proportion of relevant results with limited irrelevant results. Recall, on the other hand, is the share of all actual threats that the system detected: true positives out of true positives plus false negatives (fraudulent transactions that were missed).

In the context of anti-fraud and anti-financial crime (AFC), achieving high rates of both precision and recall can be challenging. This is because an overly precise system may overlook new threats, whereas a recall-centric system may generate numerous false positives. As with replication and generalization, there are trade-offs to both of these concepts—finding the right balance between precision and recall is crucial.
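The definitions and the trade-off can be checked with a few lines of Python. This is a minimal sketch: the risk scores, the ground-truth labels, and the two thresholds are made-up values for illustration.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))      # fraud, flagged
    fp = sum(p and not a for p, a in zip(predicted, actual))  # legitimate, flagged
    fn = sum(a and not p for p, a in zip(predicted, actual))  # fraud, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical model risk scores and whether each transaction was really fraud.
scores = [0.95, 0.90, 0.70, 0.60, 0.40, 0.20]
actual = [True, True, False, True, False, False]

def flag(threshold):
    """Flag every transaction whose score is at or above the threshold."""
    return [score >= threshold for score in scores]

strict = precision_recall(flag(0.8), actual)  # fewer alerts, one fraud missed
loose = precision_recall(flag(0.5), actual)   # all fraud caught, one false positive
print(strict, loose)
```

With the strict threshold every alert is real fraud but one case slips through; with the loose threshold nothing is missed but an analyst has to review a false positive.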

Ensemble modeling

The ensemble approach is a machine learning concept where multiple models are combined and overlap to accomplish a task. It can enhance the reliability and accuracy of financial crime detection systems by leveraging diverse models, which collectively contribute to more focused outputs, higher explainability, and robust predictions. Basically, when you analyze a document or a transactional behavior, you do so through multiple different lenses, rather than a single view. Moreover, ensemble modeling gives you the flexibility to pick and choose the kinds of detections you want, can be generalized across different customers in order to learn from others, and offers faster deployment and ease of maintenance.

The alternative to ensemble modeling is the use of a single model.

While a single model is easier to develop (as there are many different models that have already been implemented and are ready to be trained), there are a few important drawbacks to the use of single models: They’re very specific to a customer, they can “overfit” their training data and struggle to generalize when things change, and they can take longer to deploy and be more difficult to update and maintain, as they require much more data.

In the case of Resistant AI’s Document Forensics and Transaction Forensics, using an ensemble of fraud and money laundering detectors means that our team’s findings never rely on a single piece of evidence, and multiple intersecting findings strengthen the overall accuracy and confidence in labeling a document or transaction as fraudulent or trustworthy.
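As a generic illustration of the ensemble idea (and not a description of Resistant AI's actual detectors), the sketch below combines three simple, hypothetical detectors and only flags a transaction when at least two of them agree. The field names and thresholds are assumptions.

```python
def amount_detector(tx):
    """Flags unusually large amounts (hypothetical threshold)."""
    return tx["amount"] > 10_000

def velocity_detector(tx):
    """Flags bursts of activity (hypothetical threshold)."""
    return tx["tx_last_hour"] > 5

def new_counterparty_detector(tx):
    """Flags payments to counterparties first seen less than a day ago."""
    return tx["counterparty_age_days"] < 1

DETECTORS = {
    "amount": amount_detector,
    "velocity": velocity_detector,
    "new_counterparty": new_counterparty_detector,
}

def ensemble_verdict(tx, min_votes=2):
    """Flag only when several independent detectors agree; return the evidence too."""
    findings = [name for name, detect in DETECTORS.items() if detect(tx)]
    return len(findings) >= min_votes, findings

tx = {"amount": 15_000, "tx_last_hour": 8, "counterparty_age_days": 30}
flagged, findings = ensemble_verdict(tx)
print(flagged, findings)
```

Returning the list of findings alongside the verdict is a deliberate choice: it shows which lenses agreed, which supports the explainability point made above.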

Clustering and hypergraph analysis

Clustering is a technique that groups data points or categories together based on their similarities. To detect and combat financial crime, clustering can help identify patterns of suspicious behavior and uncover hidden connections among different entities involved in criminal activities. Almost any characteristic (or feature) of a data set can be used to create the clusters—and hundreds are analyzed at the same time.

For example, in document fraud detection, we use clustering to group fraudulent documents that all share the same structure, the same metadata, the same modification traces, the same scene compositions—even the same color calibration—to tie together organized criminal activity across different accounts.

In transaction monitoring, clustering can be used to group similar lines of business, transaction behaviors (such as items purchased or purchase times), or counterparties to tie together accounts that may seem independent, but actually work together to layer funds.

Hypergraph analysis is a technique that analyzes vast datasets and can capture more complex relationships involving multiple entities at once. Oversimplified, you could imagine Venn diagrams on steroids. It’s particularly useful in fraud detection for identifying cases where clusters of accounts have been created by a single entity, which may indicate the intention to use the network of accounts for structuring or layering.
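A simplified way to see how this surfaces hidden account networks: treat every shared attribute value as a hyperedge that can connect any number of accounts at once, then merge accounts that are reachable through a chain of hyperedges. The sketch below assumes hypothetical account records with `device` and `ip` attributes and uses a basic union-find; real hypergraph analysis works over far more features.

```python
from collections import defaultdict

def build_hyperedges(accounts):
    """Map each shared attribute value to the set of accounts that use it."""
    edges = defaultdict(set)
    for account_id, attributes in accounts.items():
        for key, value in attributes.items():
            edges[(key, value)].add(account_id)
    # Keep only hyperedges that actually link two or more accounts.
    return {attr: members for attr, members in edges.items() if len(members) > 1}

def linked_groups(accounts):
    """Merge accounts connected through any chain of hyperedges (union-find)."""
    parent = {account_id: account_id for account_id in accounts}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for members in build_hyperedges(accounts).values():
        first, *rest = sorted(members)
        for other in rest:
            parent[find(other)] = find(first)

    groups = defaultdict(set)
    for account_id in accounts:
        groups[find(account_id)].add(account_id)
    return [group for group in groups.values() if len(group) > 1]

# Hypothetical accounts: A and B share a device, B and C share an IP address.
accounts = {
    "A": {"device": "d1", "ip": "192.0.2.1"},
    "B": {"device": "d1", "ip": "192.0.2.2"},
    "C": {"device": "d2", "ip": "192.0.2.2"},
    "D": {"device": "d3", "ip": "192.0.2.3"},
}
print(linked_groups(accounts))
```

Accounts A and C share nothing directly, yet they end up in the same group through B, which is the kind of indirect connection the text describes.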

Anomaly detection

Anomaly detection is the process of identifying data points that deviate significantly from expected patterns, known as statistical anomalies. In the anti-financial crime landscape, this is an invaluable practice for detecting unusual and potentially fraudulent transactions or activities. Examples of such anomalies include transaction amounts that far exceed a customer’s historic transactions or the transactions of similar customers within the same segment.
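The first example, an amount far outside a customer's own history, can be sketched with a simple z-score test. This is one basic statistical approach among many; the three-standard-deviation threshold and the sample history are assumptions for illustration.

```python
from statistics import mean, stdev

def is_anomalous(history, amount, threshold=3.0):
    """True if `amount` lies more than `threshold` standard deviations
    from the mean of the customer's historic amounts."""
    if len(history) < 2:
        return False  # not enough history to estimate a spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu  # any change from a perfectly constant history
    return abs(amount - mu) / sigma > threshold

# Hypothetical customer history of payment amounts.
history = [40, 55, 60, 45, 50]

print(is_anomalous(history, 65))    # within the usual range
print(is_anomalous(history, 5000))  # far beyond anything seen before
```

The same function could be run against the amounts of similar customers in a segment instead of a single customer's history to cover the second example.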

When examining customer transaction behaviors, Resistant AI makes use of anomaly detection in our anti-money laundering (AML) findings.

Post-processing

In the context of AI, post-processing involves refining AI models’ output in order to enhance the overall quality of their results. Importantly, this step can help reduce false positives and improve the efficiency of financial crime detection systems. We use post-processing in our work during the assessment of new alerts against historic alerts, using the outcomes to eliminate any repetitive, duplicate alerts and adding context to alerts that are surfaced for analyst review.
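A minimal sketch of this kind of post-processing is shown below. The alert fields (`account`, `typology`, `outcome`) and the rule of suppressing alerts whose identical predecessors were all closed as false positives are assumptions made for illustration, not a description of any specific product.

```python
def post_process(new_alerts, historic_alerts):
    """Drop duplicate or repeatedly dismissed alerts and attach prior outcomes."""
    # Index past analyst outcomes by (account, typology).
    history = {}
    for alert in historic_alerts:
        key = (alert["account"], alert["typology"])
        history.setdefault(key, []).append(alert["outcome"])

    surfaced, seen = [], set()
    for alert in new_alerts:
        key = (alert["account"], alert["typology"])
        if key in seen:
            continue  # duplicate within the new batch
        seen.add(key)
        past = history.get(key, [])
        if past and all(outcome == "false_positive" for outcome in past):
            continue  # same alert was always dismissed before (assumed rule)
        # Add context so the analyst sees how similar alerts were resolved.
        surfaced.append({**alert, "prior_outcomes": past})
    return surfaced

historic = [
    {"account": "acc2", "typology": "layering", "outcome": "false_positive"},
    {"account": "acc3", "typology": "structuring", "outcome": "confirmed"},
]
new = [
    {"account": "acc1", "typology": "structuring"},
    {"account": "acc1", "typology": "structuring"},
    {"account": "acc2", "typology": "layering"},
    {"account": "acc3", "typology": "structuring"},
]
print(post_process(new, historic))
```

Four raw alerts become two for review: one new case with no history and one that repeats a previously confirmed pattern.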

Outcomes of AI implementation

In addition to understanding the foundational AI concepts, it's vital to consider the desirable outcomes that should result from AI implementations. These ultimately depend on the specific use case in financial crime detection, such as authorized push payment (APP) fraud or buy now, pay later (BNPL) fraud.

Explainability and transparency

Explainability is the ability of humans to clearly understand, and act upon, the output of computerized systems (like artificial intelligence). The outputs of these systems must be explainable so that analysts can effectively put them to use.

Transparency goes hand in hand with explainability. It involves making an AI model's decision-making process open and understandable to stakeholders. In the financial industry, transparency isn’t only a matter of good etiquette but also a regulatory requirement.

Increased robustness to evasion and manipulation methods

Financial criminals are constantly devising and revising their strategies to evade detection. With this in mind, ensuring that AI models are robust to ever-evolving evasion and manipulation attempts is crucial. In practice, this means that AI systems must be capable of adapting and evolving to counter new tactics employed by fraudsters—this is precisely why generalization and anomaly detection are so important.

Effective network analysis

Understanding the relationships and flows of information is a powerful tool in combating financial crime. Network analysis allows for the identification of hidden connections and networks used by criminals to carry out illegal activities. Good network analytics can ultimately help expand the reach of AI-led investigations.

How you can leverage AI with ease

The effective integration of AI in anti-fraud and anti-financial crime calls for a solid understanding of essential AI concepts and principles. Equipped with this knowledge, organizations can harness the full potential of AI to fight fraud and financial crime.

To learn how Resistant AI helps banks, fintechs, and payment companies fight fraud and financial crime with AI technology, get in touch with our team.

Want more? Take a look through our AML and fraud glossaries to learn more about the fundamental concepts used in AI.