Phishing Scams Can Deceive Large Language Models

Netcraft researchers have discovered that large language models (LLMs), when asked where to log into various platforms, return concerning results roughly one-third of the time.
Two-thirds of the time, the LLM provided the correct login URL. The concerning one-third breaks down as follows:
- 30% sent users to domains that were unregistered, parked or inactive (which could leave them at risk of takeover).
- 5% directed users to unrelated organizations.
Essentially, more than one in three responses sent users to a site unassociated with the brand in question.
The research asserts that the tests run were not edge-case prompts; rather, researchers used simple, casual phrases to simulate how an average user might realistically submit a prompt to an LLM.
In one observed instance, the live AI-powered search engine Perplexity directed researchers to a phishing link. According to the researchers, the phishing site was not a subtle scam, yet Perplexity ignored signals such as domain authority and reputation.
Below, security leaders discuss these findings and their implications.
Security Leaders Weigh In
Gal Moyal, CTO Office at Noma Security:
If AI suggests unregistered or inactive domains, threat actors can register those domains and set up phishing sites. As long as users trust AI-provided links, attackers gain a powerful vector to harvest credentials or distribute malware at scale.
Without guardrails enforcing URL correctness, AI responses can mislead users. Guardrails should validate domain ownership before recommending a login page, and any request or response containing a URL can be vetted using common practices such as domain reputation checks and known-malicious-URL lists.
AI can easily become a phishing delivery mechanism, highlighting the urgency of putting runtime protection in place.
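As a rough illustration of the kind of URL vetting Moyal describes, the sketch below checks whether a suggested login URL's domain currently resolves in DNS before showing it to a user. It is a minimal, hypothetical example rather than any vendor's tooling; DNS resolution is only a coarse proxy for whether a domain is registered and active, and a production guardrail would also consult reputation feeds and ownership records.

```python
# Minimal sketch: withhold LLM-suggested login URLs whose domains do not
# resolve in DNS. Resolution failure is only a rough proxy for an
# unregistered or inactive domain, but it catches the obvious cases.
import socket
from urllib.parse import urlparse


def domain_resolves(url: str) -> bool:
    """Return True if the URL's hostname currently resolves in DNS."""
    hostname = urlparse(url).hostname
    if not hostname:
        return False
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def vet_suggested_url(url: str) -> str:
    """Drop URLs that fail the basic DNS check before showing them to users."""
    if domain_resolves(url):
        return url
    return "[link withheld: domain did not resolve]"


if __name__ == "__main__":
    # Both URLs below are illustrative placeholders.
    for candidate in ["https://example.com/login",
                      "https://login.this-domain-should-not-exist-12345.com"]:
        print(candidate, "->", vet_suggested_url(candidate))
```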
Nicole Carignan, Senior Vice President, Security & AI Strategy, and Field CISO at Darktrace:
LLMs provide semantic probabilistic answers with intentional variability to avoid repetitive outputs. Unfortunately, this mitigation strategy can also introduce hallucinations or inaccuracies.
The research shows that approximately one-third of domains provided by the LLM were unregistered, parked, or unavailable — highlighting an emerging risk that can be easily weaponized by threat actors. When AI suggests one of these domains, it opens the door to malicious redirection, phishing, and credential harvesting. This, however, is not a new tactic. Threat actors have been leveraging typo-squatting — registering intentionally misspelled or lookalike domains to deceive users — for more than two decades.
The research also revealed a more dangerous threat: intentional data poisoning or bias injected into promoted GitHub repositories. The compromise of data corpora used in the AI training pipeline underscores a growing AI supply chain risk. Data integrity, sourcing, cleansing, and verification are critical to ensuring the safety and accuracy of LLM-generated outputs.
LLMs can and should have guardrails in place to mitigate this risk. One basic mitigation is to have LLMs ground or source any URL that is cited, essentially removing “generated” hostnames and replacing them with grounded, accurate hostnames.
More broadly, this research points to a deeper issue: users are relying on generated, synthetic content from the outputs of LLMs as if it is fact-based data retrieval. LLMs don’t “retrieve” information — they generate it based on learned semantic probabilities from training data that users typically have no visibility into. Without proper sourcing, these systems become ripe for both inaccuracy and exploitation.
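The grounding step Carignan describes could, in principle, look something like the sketch below: any hostname the model generates in an answer about a known brand is replaced with a verified domain from a curated mapping. The brand names, domains, and helper functions here are hypothetical placeholders, not a real API.

```python
# Hypothetical sketch of URL grounding: hostnames generated by the model
# for a known brand are swapped for the verified domain from a curated
# mapping, instead of being trusted as-is.
import re
from urllib.parse import urlparse

# Curated brand -> verified login domain mapping (would be maintained from
# authoritative sources in a real deployment; these entries are made up).
VERIFIED_DOMAINS = {
    "examplebank": "login.examplebank.com",
    "exampleshop": "www.exampleshop.com",
}

URL_PATTERN = re.compile(r"https?://[^\s)\"']+")


def ground_urls(llm_text: str, brand: str) -> str:
    """Replace generated hostnames in an answer about `brand` with the verified one."""
    verified = VERIFIED_DOMAINS.get(brand.lower())
    if verified is None:
        return llm_text  # no grounding data; caller should flag for review

    def _swap(match: re.Match) -> str:
        url = match.group(0)
        host = urlparse(url).hostname or ""
        if host == verified:
            return url  # already the verified domain
        # Any other hostname in an answer about this brand gets rewritten
        # (coarse, but errs on the safe side for this illustration).
        return f"https://{verified}/"

    return URL_PATTERN.sub(_swap, llm_text)


if __name__ == "__main__":
    answer = "You can sign in at https://examplebank-login.net/secure"
    print(ground_urls(answer, "examplebank"))
```

In practice, a deployment would likely fail closed when a brand is missing from the mapping, withholding the link rather than passing it through; the pass-through above simply keeps the sketch short.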
J Stephen Kowski, Field CTO at SlashNext Email Security+:
AI sending users to unregistered, parked or unavailable URLs creates a perfect storm for cybercriminals. When AI models hallucinate URLs pointing to unregistered domains, attackers can simply register those exact domains and wait for victims to arrive. It’s like having a roadmap of where confused users will end up — attackers just need to set up shop at those addresses and collect whatever sensitive information people try to enter.
This LLM behavior definitely needs immediate attention through input and output filtering systems. Traditional security measures struggle with AI-generated content because it looks legitimate and bypasses normal detection patterns. Real-time URL validation and domain verification before presenting results to users would catch these hallucinated links before they cause damage.
The one instance when AI actually provided a link to a phishing site is the most concerning finding because it shows AI can directly serve up active threats, not just create opportunities for future ones. This demonstrates how AI systems can become unwitting accomplices in phishing campaigns, essentially doing the attacker’s work by delivering malicious links with the authority and trust that comes with an AI recommendation. When users trust AI responses as authoritative, a single malicious link recommendation can compromise thousands of people who would normally be more cautious about clicking suspicious URLs.
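A minimal sketch of the output filtering Kowski recommends might look like the following: URLs are extracted from the model's reply and checked against a local blocklist plus a pluggable reputation callback before the reply is displayed. The blocklist entries and the reputation_lookup hook are assumptions for illustration, not references to any specific product or threat-intelligence feed.

```python
# Illustrative output-filtering step: redact links in an LLM reply that are
# blocklisted or flagged by a reputation check before the reply is shown.
import re
from typing import Callable

URL_PATTERN = re.compile(r"https?://[^\s)\"']+")

# Illustrative entries only; a real deployment would pull these from a feed.
KNOWN_BAD_HOSTS = {
    "examplebank-login.net",
    "secure-exampleshop.top",
}


def filter_reply(reply: str,
                 reputation_lookup: Callable[[str], bool]) -> str:
    """Redact links that are blocklisted or fail the reputation check."""
    def _check(match: re.Match) -> str:
        url = match.group(0)
        host = re.sub(r"^https?://", "", url).split("/")[0].lower()
        if host in KNOWN_BAD_HOSTS or not reputation_lookup(url):
            return "[link removed: failed security checks]"
        return url

    return URL_PATTERN.sub(_check, reply)


if __name__ == "__main__":
    def allow_everything(url: str) -> bool:
        # Stand-in for a real URL-reputation service.
        return True

    reply = "Log in here: https://examplebank-login.net/secure"
    print(filter_reply(reply, allow_everything))
```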