The AI Efficacy Asymmetry Problem

By Francisco Donoso
Image: blurry keyboard. Mohamed Marey via Unsplash
March 24, 2026

Over the last year and a half, we’ve seen AI organizations lay the foundation for the reality of “AI agents.” With Anthropic’s release of the Model Context Protocol (MCP) in November 2024, we saw the first building blocks for letting Large Language Models (LLMs) such as ChatGPT and Claude interact with the real world, via APIs, and have an impact on systems. This meant that LLMs could finally move beyond being simple chatbots. It was only a matter of time before the cybersecurity industry saw the impact of these capabilities.
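As a rough illustration of the pattern MCP standardizes (a declared tool schema plus a dispatcher the host runs on the model's behalf), here is a minimal Python sketch. The names used here (TOOLS, lookup_host, dispatch) are invented for this example and are not the actual MCP SDK.

```python
# Minimal, hypothetical sketch of the tool-calling pattern that MCP standardizes:
# the model sees a declared tool schema and asks the host to invoke it by name.
# Names here (lookup_host, TOOLS, dispatch) are illustrative, not the MCP SDK.

TOOLS = {
    "lookup_host": {
        "description": "Return asset details for a hostname from the CMDB.",
        "parameters": {"hostname": "string"},
    }
}

def lookup_host(hostname: str) -> dict:
    # In a real deployment this would call an internal API; stubbed for the sketch.
    return {"hostname": hostname, "owner": "it-ops", "criticality": "high"}

def dispatch(tool_name: str, arguments: dict) -> dict:
    """Invoke a declared tool on behalf of the model and return the result."""
    handlers = {"lookup_host": lookup_host}
    if tool_name not in handlers:
        raise ValueError(f"Model requested an undeclared tool: {tool_name}")
    return handlers[tool_name](**arguments)

# The LLM emits a structured call like this; the host executes it and feeds
# the JSON result back into the model's context window.
print(dispatch("lookup_host", {"hostname": "dc01.corp.example"}))
```

The point is simply that the model no longer just generates text; it emits structured calls that a host executes against real systems.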

How Has Rapid AI Innovation Changed the Cybersecurity Landscape?

Since the 2024 release of MCP, we’ve seen an incredible amount of innovation from the firms building these frontier models. With the latest releases from OpenAI and Anthropic, we’ve seen models that have significantly improved their ability to interact with external systems. Anthropic’s Claude Sonnet 4.6, released on Feb. 17, boasts significantly increased capabilities when it comes to “computer use.” This means the models no longer have to interact with APIs (via MCP) but can interact with systems just like humans do, through a browser or an application’s user interface.

AI is evolving faster than any foundational technology before it, and now we’re seeing AI labs train new foundational models and build new features, like Claude Cowork, using their own AI models, further accelerating the pace of innovation in this space.

We have started to see innovations that enable LLMs and AI models to be integrated directly into the workflows that both developers and attackers use. Tools like Claude Code or OpenAI’s Codex CLI let users interact with and orchestrate AI agents directly in a terminal or command-line interface (CLI). And recently, we’ve seen some of the havoc (and, quite frankly, amazing things) that Clawdbot (now OpenClaw) can do by letting these agents orchestrate actions.

At the same time, we’ve also started to see significant investment in agentic pentest and attack companies: organizations working to automate the vulnerability discovery and exploitation process. Organizations like XBOW have built AI agents that have earned top spots on capture the flag (CTF) and bug bounty leaderboards.

All of this innovation has culminated in something we all predicted but didn’t expect to deal with so soon: threat actors leveraging these tools to orchestrate and automate broad attacks against organizations. In November 2025, Anthropic reported that it believed a nation-state threat actor leveraged Claude Code to orchestrate and automate much of a cyber espionage campaign, spanning automated reconnaissance, automated exploitation of web application vulnerabilities and even attempted lateral movement. We’ve also seen claims from Google’s Threat Intelligence Group (GTIG) that LLMs have become essential to nation-state threat actors for research, targeting and crafting lures.

How Does This Create the AI Efficacy Asymmetry Problem?

While threat actors have begun adapting their workflows to leverage agentic AI, so have defenders. We’ve seen scores of AI SOC companies emerge, each promising to either fully automate a Security Operations Center (SOC) or significantly scale its human analysts. We’ve integrated such AI tools into our MDR analyst workflows to enable analysts to review the automated investigations, understand the searches or investigative steps that were taken, and then determine whether the activity was malicious and trigger containment actions.
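A minimal sketch of that human-in-the-loop review pattern follows, with assumed names (Verdict, requires_analyst_approval, handle) that are purely illustrative: the agent’s investigation is treated as a draft verdict, and containment fires only after an analyst confirms it or the evidence clears a high bar.

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    """Draft output of an automated investigation (names are illustrative)."""
    alert_id: str
    malicious: bool
    confidence: float                                    # 0.0 - 1.0, as reported by the agent
    evidence: list[str] = field(default_factory=list)    # searches / artifacts reviewed

def requires_analyst_approval(v: Verdict) -> bool:
    # Anything the agent wants to contain gets a human check unless it is
    # high-confidence AND backed by concrete evidence the analyst can replay.
    return v.malicious and (v.confidence < 0.9 or not v.evidence)

def handle(v: Verdict, contain, ask_analyst) -> None:
    if not v.malicious:
        return
    if requires_analyst_approval(v):
        if not ask_analyst(v):           # analyst reviews the evidence trail
            return
    contain(v.alert_id)                  # e.g., EDR host isolation

# Example wiring with stubbed actions:
handle(
    Verdict("alrt-123", malicious=True, confidence=0.72,
            evidence=["suspicious LSASS access on WS-042"]),
    contain=lambda alert: print(f"isolating host for {alert}"),
    ask_analyst=lambda v: True,
)
```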

However, all of this only exacerbates the “cyber arms race” that we’ve been talking about. And in significant ways, it tips the advantage towards the threat actors — something I’m calling the AI Efficacy Asymmetry problem.

LLMs hallucinate. This is inherent to how they were built and trained. They attempt to predict the next most likely word based on their training data. If their training data doesn’t have the perfect match, if you ask the question incorrectly, or if there are issues with your context window, the AI agent will just lie to you. Confidently. This is because these models have been trained to please and be confident in their answers, even if they don’t have the data to back up their generated results. An article from OpenAI stated, “Our new research argues that language models hallucinate because standard training and evaluation procedures reward guessing over acknowledging uncertainty.”

This means that when AI models get things wrong, they do so confidently. That often doesn’t matter in an attack scenario; attackers can just try again. However, it could be detrimental for defenders.

Why Are Hallucinations So Harmful for Defenders?

There have been several benchmarking and academic efforts to test the overall efficacy of current AI models when it comes to cybersecurity performance. In one study by Stanford and Carnegie Mellon universities, which pitted human pentesters against an orchestrated AI agent with access to “standard” and open-source pentest tools, the AI agents were able to reliably identify and exploit real vulnerabilities roughly 80% of the time, with some variation across scaffolding and configuration. In this study, the ARTEMIS AI agent outperformed 90% of the human pentesters. To be fair, some details are missing regarding the backgrounds and skill sets of those pentesters, and there is certainly a wide range of capability among pentesters these days.

Anthropic’s recently released Claude Opus 4.6 model card quoted roughly 66% efficacy in finding vulnerabilities and a roughly 93% success rate against Cybench’s 40 CTF challenges. As part of the model’s release in early February, Anthropic also published news that its Opus 4.6-based AI agent found roughly 500 new zero-day vulnerabilities in open-source software, including more complex bug categories like buffer overflows.

This means that when AI agent efficacy is about 80%, threat actors succeed. If an AI agent hallucinates a vulnerability and attempts to exploit it, it will fail, and that’s OK for attackers. They can just try again with a different bug. All it costs them is tokens.

However, for defenders, the cost of hallucinations could be disastrous. What if your AI SOC agent, connected to your EDR’s network isolation feature, misinterprets an alert and attempts to isolate a domain controller or disable a service account that is critical to the functioning of your business? What if it accidentally disables your CEO’s account because they reported a phishing email that they didn’t actually fall for? What if an AI agent hallucinates evidence that threat actors successfully moved laterally, and your IR team chooses to temporarily disconnect the network from the internet?

Defenders must consider the implications of implementing AI agents with response capabilities, and the consequences of a confidently incorrect AI agent. There must be guardrails that prevent the incorrect execution of potentially disastrous containment procedures. Time and time again, we’ve seen news stories of AI agents taking incorrect action in ways that are highly impactful; one widely covered story detailed how an AI agent accidentally deleted production databases at SaaS companies and then lied about it.
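One way such guardrails can be expressed is sketched below, with hypothetical asset tiers and hostnames: destructive actions are checked against a list of business-critical systems before the agent may execute them, and anything critical is forced to a human.

```python
# Hedged sketch of a containment guardrail: hypothetical tiers and hostnames.
# The idea is that an agent can never isolate or disable a Tier 0/1 asset
# (domain controllers, executive accounts) without explicit human sign-off.

CRITICAL_ASSETS = {          # would come from the CMDB in a real deployment
    "dc01.corp.example": "tier0",
    "dc02.corp.example": "tier0",
    "ceo@corp.example":  "tier1",
}

DESTRUCTIVE_ACTIONS = {"isolate_host", "disable_account", "block_network"}

def guardrail(action: str, target: str, approved_by_human: bool) -> bool:
    """Return True only if the agent may execute the action."""
    if action not in DESTRUCTIVE_ACTIONS:
        return True                       # read-only actions pass through
    tier = CRITICAL_ASSETS.get(target)
    if tier in ("tier0", "tier1"):
        return approved_by_human          # critical assets always need a person
    return True                           # lower-tier assets may be auto-contained

assert guardrail("isolate_host", "dc01.corp.example", approved_by_human=False) is False
assert guardrail("isolate_host", "ws-0042.corp.example", approved_by_human=False) is True
```

The specifics will differ per environment; the design point is that the approval path is enforced outside the model, where a hallucination cannot talk its way around it.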

What Is the Lesson for Our Industry?

For the foreseeable future, while we have AI models that hallucinate, threat actors have an asymmetric advantage. An 80% success rate for threat actors is great. A 20% failure rate for defenders, when the AI agent can take impactful containment actions, is risky, and we need to architect our agentic AI deployments and guardrails accordingly.

KEYWORDS: artificial intelligence (AI), security defense

Francisco Donoso is Chief Product & Technology Officer at Beazley Security.
