Researchers at the University of Texas at Austin under the supervision of Symmetry Systems CEO Mohit Tiwari discovered a new attack method called ConfusedPilot. This method targets Retrieval Augmented Generation (RAG) based AI systems and allows the manipulation of AI systems. This could lead to misinformation and altered decision-making within an affected organization. This attack would likely mirror the following steps:
- A malicious actor introduces a document containing particularly crafted strings into the target’s environment.
- A user makes a related query and the RAG system retrieves the introduced document.
- The AI reads these strings as user instructions and may disregard relevant and legitimate content, generate a misinformed answer via corrupted information, or falsely attribute information to legitimate sources.
Security leaders weigh in
Stephen Kowski, Field CTO at SlashNext Email Security+:
“One of the biggest risks to business leaders is making decisions based on inaccurate, draft, or incomplete data, which can lead to missed opportunities, lost revenue, and reputational damage. The ConfusedPilot attack highlights this risk by demonstrating how RAG systems can be manipulated by malicious or misleading content in documents not originally presented to the RAG system, causing AI-generated responses to be compromised.
“An interesting part of the attack is the RAG taking instructions from the source documents themselves as if they were in the original prompt, similar to how a human would read a confidential document and say they can't share certain pieces of information. This demonstrates the need for robust data validation, access controls, and transparency in AI-driven systems to prevent such manipulation.
“Ultimately, this can lead to a wide range of unintended outcomes, including but not limited to denial of access to data, presentation of inaccurate information, access to deleted items that should be inaccessible, and other potential attacks by chaining these vulnerabilities together.”
Amit Zimerman, Co-Founder and Chief Product Officer at Oasis Security:
“Attackers are increasingly looking at weaker parts of the perimeter, such as non-human identities (NHIs), which control machine-to-machine access and are increasingly critical in cloud environments. NHIs now outnumber human identities in most organizations, and securing these non-human accounts is vital, especially in AI-heavy architectures like Retrieval-Augmented Generation (RAG) systems.
“To successfully integrate AI-enabled security tools and automation, organizations should start by evaluating the effectiveness of these tools in their specific contexts. Rather than being influenced by marketing claims, teams need to test tools against real-world data to ensure they provide actionable insights and surface previously unseen threats. Existing security frameworks may need to be updated, as older frameworks were designed for non-AI environments. A flexible approach that allows for the continuous evolution of security policies is vital.”
John Bambenek, President at Bambenek Consulting:
“As organizations adopt Gen AI, they want to train in corporate data, but often that is in dynamic repositories like Jira, SharePoint, or even trouble ticket systems. Data may be safe at one point, but can be become dangerous when subtly edited by a malicious insider. AI systems see and parse everything, even data that humans might overlook, which makes the threat even more problematic.
“This is a reminder that the rush to implementing AI systems is far outpacing our ability to grasp much less mitigate the risks.”