How to secure LLM-driven processes in financial onboarding
Large language models (LLMs) are effectively transforming the modern business landscape. LLMs have recently become popular thanks to their ability to extract structured concepts from unstructured text as well as their capacity to fluently interpret a structured dataset into a cohesive narrative.
This proficiency in transforming content across different levels of representations allows LLMs to hold a conversation—and it also allows them to transform many aspects of a business.
Are LLMs useful for KYC and customer onboarding?
Today, onboarding and KYC processes are in dire need of transformation. The financial industry has been constantly tightening the onboarding requirements imposed on new customers to address the threats of fraud, money laundering, and financial crime in general. Other industries have been adopting similar requirements, and some form of onboarding is now required when you rent an apartment, lease or rent a car, start a new job, or even apply to a university.
Each of these processes have something in common: they’re typically labor-intensive processes designed to guide new customers through a maze of compliance requirements and formal assertions that they’ve never faced before. They’re designed to extract enough evidence to prove to a regulator that the customer isn’t up to anything suspicious.
"Extracting evidence" is a euphemism for reading, assessing, verifying, and interpreting documents that contain information about the customer being onboarded. So how can LLMs make this process better?
LLMs and deep neural network-based models have great capabilities for unstructured text interpretation and document reading that make them a perfect fit for the workflow and document understanding portion of the onboarding process. We can have endless debates whether generative AI techniques will ever really be creative, but it’s absolutely certain that they are more than good enough to replicate the highly scripted, uninspired work of a call center agent.
In our work at Resistant AI, we consider LLMs with a combination of text and inputs from documents, images, and scans. Using these inputs, we use the documents and user-provided information to assert whether or not certain claims about the future account holder (such as their address, age, name, source of income, and profession) are true.
Are LLMs safe for high-risk applications?
LLMs are a perfect example of a technology that’s inherently insecure. This isn’t a bad thing; most useful technologies are insecure—computers and software fall into this same category, yet we still entrust them with our lives and finances. However, in order to do so, we’ve learned how to safeguard them from the intelligent, motivated adversaries that we now face.
Here’s why LLMs are conceptually insecure: by design, they must process and interpret a vast range of both structured and unstructured input from the outside world. While the broadness of input is what makes them so useful, it’s also what makes them risky, since adversaries can craft the input at will.
To demonstrate why this is dangerous, we can go back about 30 years and look at the buffer overflow problem that has challenged the IT security practitioners for years (and remains to be perfectly solved). Buffer overflow is a problem that is infinitely less complex than any LLM—it’s caused by our inability to securely manage the mere length of an input text. The twist is that the text length is not set in advance in most practical situations, and the treacherously simple problem has led to countless infected systems over the years. Nevertheless, we still have no idea how many undiscovered, exploitable vulnerabilities there are.
We can compare the buffer overflow problem to LLMs’ security issues. LLM security research is an active field, and the owners of the foundational models are responsibly and proactively patching new vulnerabilities that appear on a daily basis. However, the fact remains that vulnerabilities continue to appear on a daily basis. And it can be argued that despite all the theoretical and practical progress being made, the problem of classification security in high-dimensional spaces doesn’t have a perfect solution.
In the same way that the Halting Theorem prevents us from being perfectly able to assess whether a specific program is free of vulnerabilities, we may be never able to validate that a model is safe from manipulation and misuse.
How are LLMs attacked?
Let’s have a look at a list of common attacks on LLM models. More specifically, what are the risks that a LLM faces when deployed in the financial industry?
1. Prompt injection: These attacks occur when an attacker passes an instruction to the classification model as a part of the input data, skewing the decision towards the outcome that they desire to achieve. LLMs are helpful and follow the instruction without applying the common sense and context awareness that every human has.
Image credit: Gustavo Lacerda
While countless prompt injection examples that result in unexpected behavior are fun to watch on social media, they’re less fun once you get fined by a regulator for failing to catch something "that should have been so obvious to anyone".
2. Modified documents: LLMs and LLM-based document processing systems are trustful. They’re optimized to interpret the inputs they receive in good faith, and they do their best to transform unstructured inputs into a structured internal representation that they use for further reasoning. With this in mind, they unintentionally tolerate forgeries, modifications, and fake documents mimicking real-world data, such as the one displayed below.
Please note that in each of the following images, all personal identifiable information (PII) has been changed—these reflect real-world document forgery examples and techniques that we have uncovered.
3. Context-free processing: AI models are forgetful—and that’s a good thing! You certainly don’t want your mortgage application rejected because of the dire financial circumstances of the previous applicant. On the other hand, (un)like humans, LLMs are not designed to notice coincidences that would alert any human loan officer that something is amiss. And why are they (un)like? Simply put, humans don’t have perfect memory either, and there aren’t many employees who would realize that a signature on a document resembles the one from a month ago a little bit too much.
4. Scalable attacks: Broken once, LLMs (and many automated systems) can be exploited a thousand times. Once an attacker discovers a vulnerability leading to a desired outcome, it can easily scale the exploitation. Some detections are easy. Hundreds of thousands of new clients applying overnight are going to get noticed, but two clients per week over the course of a year aren’t so easy to detect.
How can we protect LLMs?
LLMs need to be protected through input sanitization, tight supervision, and constant maintenance. However, having a human in this loop eliminates LLMs’ automation benefits. We need a different kind of AI—one that’s more predictable, more restrictive, more context-aware, and more difficult to mislead or manipulate—designed to protect LLM models that are too positive and vulnerable by their very nature. The following measures can help your organization build a system fit for safeguarding LLMs.
1. Be strict on inputs
• Strictly assess the input quality to limit any hallucination on the model side that would fill in the blanks with "expected" inputs.
• Verify that each document submitted for processing is what it says it is.
• Verify the presence of hidden prompts.
• Verify that no "human obvious" manipulations are present.
• Verify that no non-obvious, high-quality modifications are hidden in the document.
2. Assess the security context
• Verify that the document cannot be traced to a known document mill or other serial fraud threat actor.
• Verify that the document is not an algorithmically produced shallow fake or a deepfake.
• Verify that the document is not part of a serial fraud attempt, either fast or slow.
3. Design your controls for in-depth defense
Almost any single AI model can be manipulated by a determined adversary with sufficient access. At Resistant AI, our approach is to design protective models using the ensemble approach, in which we combine multiple perspectives provided by statistically de-correlated models in order to build a higher quality consensus. Manipulation against one of the models would typically result in higher scores on other models, helping us to detect the manipulated sample.
4. Design your system for rapid reaction
Rapid reaction is essential, as attackers can easily modify their strategies and produce new classes of attacks on a daily basis. A defense model that doesn’t reflect what is happening in near-real-time is bound to get circumvented. Moreover, any model that needs a significant volume of malicious behavior for training will become very expensive very quickly. Each fraudulent sample obtained equates to another fraud that got through undetected.
5. Humans must drive AI instead of being driven by AI systems
We firmly believe that AI should be used to empower humans and to scale-up their decisions. Human supervisors must control the system on a high level and only intervene to manage the anomalies, attacks, or unexpected situations.
While the above list is not exhaustive and is by no means perfect, it’s a start, and it’s grounded in our team’s 20 years of working with AI in security-sensitive applications.
Based on our experience, comprehensive security is not something you achieve simply by going through a checklist. Instead, security is a process of continuous improvement in face of ever-improving threats. A good security program is designed to control risks and keep your organization one step ahead of the attackers.
Getting the right team on your side is critical, and Resistant AI is here to help.