How Should Effective AI Red Teams Operate?

As artificial intelligence (AI) is increasingly woven into daily work functions and its capabilities grow, it becomes more important that the interactivity, workflows, and decision pathways of these models are tested and understood. Otherwise, organizations could find themselves exposed through their own AI tools.
This is where the value of AI-specific red teaming comes in. But how can organizations implement this effectively?
To learn more, Security magazine spoke with Dr. Peter Garraghan, CEO and CTO of Mindgard, Professor in Computer Science at Lancaster University, and fellow of the UK Engineering Physical Sciences and Research Council (EPSRC).
Security magazine: Tell us about your background and career.
Garraghan: I began my career focusing on distributed systems, cloud infrastructure, and systems security. My academic work as a Chair Professor at Lancaster University has centered on creating and building computing infrastructure of the future. Over time, it became clear that artificial intelligence — and more specifically deep neural networks — was introducing a new class of security problem, one that did not fit neatly into traditional security frameworks.
Around the mid-2010s, as deep neural networks began to outperform other types of machine learning in complex tasks and the introduction of the transform architecture (that presently underpins modern LLMs and agents), I saw a widening gap between AI capability and AI risk. Organizations were deploying models into critical workflows without robust ways to test how those systems behaved under adversarial manipulation. That realization led me to found Mindgard as a research-driven effort to bring attacker-aligned testing methodologies to AI systems. The intent was not to promote fear around AI, but to ensure that as these systems became embedded in enterprise decision-making, they were subjected to the same scientific rigor and adversarial scrutiny that we expect in other high-risk domains.
Security: How is red teaming different from AI-specific red teaming?
Garraghan: Red teaming is the structured simulation by an adversary to test how an organization withstands challenges. It is not limited to technical systems. A red team may probe strategy, governance, operational processes, physical security, insider risk, or technology controls. The purpose is to think and act like a real attacker to expose blind spots, flawed assumptions, and systemic weaknesses before a genuine adversary does.
AI-specific red teaming applies that adversarial discipline to intrinsically probabilistic systems whose behavior is shaped by data and language. Unlike conventional systems, AI models can be influenced through phrasing, context manipulation, data sources, or tool integrations. Risk often emerges from how models interact with workflows and users rather than from a single technical flaw. As a result, AI red teaming must evaluate behavior and decision pathways over time, particularly as these systems become embedded in business-critical operations.
Security: Why is this an important distinction to make?
Garraghan: Many organizations underestimate AI risk by applying legacy testing assumptions. Asking a model a series of harmful questions and observing refusals is not equivalent to red teaming. It does not reflect how real adversaries operate, nor does it account for indirect or multi-step exploitation.
Security: How should effective AI red teams operate?
Garraghan: Start with adversary emulation; model intent, persistence, and economic motivation, rather than simply testing edge cases. AI systems are interactive, so testing should reflect iterative probing and escalation. A single prompt rarely reveals systemic weaknesses; exploitation often emerges across multiple steps.
Equally important is scope. The model alone is rarely the sole point of failure. Red teams must assess orchestration layers, retrieval mechanisms, external data sources, access controls, and downstream actions. Automation can help scale adversarial testing across thousands of variants, but human expertise remains critical to interpret emergent behavior and design novel attack paths. Finally, results must be measurable and mapped to impact. AI red teaming should produce evidence that informs governance and remediation, not anecdotal observations that lack operational relevance.
Security: Is there anything we haven’t discussed that you’d like to add?
Garraghan: I would emphasize that AI security is about disciplined risk management, not eliminating every possible failure, and that it is a rapidly evolving space both on the scientific and technological front. These systems are probabilistic by design, so the goal is to understand how they fail and to keep those failures within acceptable bounds. We are also entering an era where language acts as a control surface. In AI systems, natural language can directly shape behavior, and as autonomous agents gain access to data and actions, behavioral manipulation shifts from content risk to operational risk. That change requires the same rigor and governance we apply to any security-critical system.
Looking for a reprint of this article?
From high-res PDFs to custom plaques, order your copy today!









