In July 2020, the gaming company Nintendo was compromised in a data breach that commentators described as unprecedented.
The breach, dubbed “the gigaleak,” exposed internal emails and identifying information, as well as a deluge of proprietary source code and other internal documents. But the compromise wasn’t discovered by monitoring network traffic or even by analyzing the dark web. It was first identified through a post on 4chan.
Less-regulated online spaces like imageboards, messaging apps, decentralized platforms, and other obscure sites are increasingly relevant for detecting these types of corporate security compromises. Serious threats can be easily missed if security teams aren’t looking beyond standard digital risk sources like technical and dark web data feeds.
Overlooked risks can cost companies millions in financial and reputational damage — but existing commercial threat intelligence solutions often lack data coverage, especially from these alternative web spaces.
How does this impact corporate security operations, and how can data coverage gaps be addressed?
An evolving corporate risk landscape
Security risk detection is no longer limited to highly anonymized online spaces like the dark web or technical feeds like network traffic data.
While these sources remain crucial, corporate security teams also need to assess obscure social sites, forums, imageboards, messaging apps, decentralized platforms, and paste sites. These spaces are frequently used to circulate leaked data, as with the Nintendo breach, and to discuss or advertise hacking tactics like malware and phishing.
Example of leaked data on RaidForums, a popular hacking website on the deep web—posted/discovered by Echosec Systems
Beyond malware and breach detection, these sources can indicate internal threats, fraud, theft, disinformation, brand impersonation, potentially damaging viral content, and other threats implicating a company or industry.
The rise of hacktivism and extremism on less-regulated networks also poses an increased risk to companies and executives. For example, disinformation or violence targeting high-profile personnel may be discussed and planned on these sites.
Why are these alternative sources becoming more relevant for threat detection?
To start, surface and deep web networks are more accessible to threat actors, even though the dark web may offer more anonymity. They also have greater reach than the dark web, a relatively small and isolated webspace, when the goal is to spread disinformation and leaked data.
Obfuscation tactics in text-based content are also becoming more sophisticated. For example, special characters (e.g. !4$@), intentional typos, code language, or acronyms can be used to hide targeted threats and company names. Adversaries who use these techniques are often less concerned about being detected on surface and deep web sites.
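To make the obfuscation problem concrete, here is a minimal Python sketch of how a monitoring tool might undo simple character substitutions before checking text against a watchlist. The substitution table, the function names, and the sample post are all assumptions made for illustration. They do not describe any particular vendor's method, and real obfuscation (code words, acronyms, deliberate typos) needs far more than a lookup table.

```python
import re

# Hypothetical substitution table: look-alike characters mapped back to letters.
LEET_MAP = str.maketrans({
    "4": "a", "@": "a", "3": "e", "1": "i", "!": "i",
    "0": "o", "$": "s", "5": "s", "7": "t",
})


def normalize(text: str) -> str:
    """Lowercase, reverse look-alike substitutions, then drop everything that is not a letter."""
    text = text.lower().translate(LEET_MAP)
    return re.sub(r"[^a-z]", "", text)


def mentions(text: str, watchlist: list[str]) -> list[str]:
    """Return the watchlist terms that appear in the normalized text."""
    flat = normalize(text)
    return [term for term in watchlist if normalize(term) in flat]


# Made-up post, for illustration only.
post = "dumping N1nt3nd0 creds $oon"
print(mentions(post, ["Nintendo", "Acme"]))
```

Stripping every separator also catches spellings like "n.i.n.t.e.n.d.o", but it can produce false positives when a watchlist term happens to appear across word boundaries, so a sketch like this would only be a first-pass filter ahead of human review.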
Decentralization is also becoming a popular hosting method for threat actors concerned with censorship on mainstream networks and takedowns on the dark web. Decentralization means that content or social media platforms are hosted on multiple global or user-operated servers so that networks are theoretically impossible to dismantle.
CEO-targeted death threat on the decentralized social network Mastodon — discovered by Echosec Systems
While the dark web was once considered a mecca for detecting security threats, these factors are extending relevant intelligence sources to a wider range of alternative sites.
New barriers to threat detection
Emerging online spaces offer valuable security data, but the changing threat landscape is posing new challenges for corporate security. Many alternative threat intelligence sources are obscure enough that analysts may not know they exist, let alone think to look there for threats. Some surface and deep websites, like forums and imageboards, emerge and turn over quickly, making it hard to keep track of what’s currently relevant.
Additionally, many commercial, off-the-shelf APIs provide access to technical security feeds and common sources like the dark web and mainstream social media — but do not offer this alternative data. This creates a functional gap for security teams who realize the value of obscure online sources but may be forced to navigate them manually.
APIs enable security teams to funnel data from online sources directly into their security tooling and interfaces rather than collecting data through manual searches on-site.
Leaked image of a security operations center on social media, discovered by Echosec Systems
For most corporate security teams and operations centers, manual data gathering — which often requires creating dummy accounts — is unsustainable, requiring a significant amount of time and resources.
Efficient threat intelligence access is essential in an industry where security teams are often understaffed and overwhelmed by alerts. According to a recent survey by Forrester Consulting, the average security operations team sees 11,000 daily alerts but only has the resources to address 72% of them.
Putting aside the issue of niche data access, industry research suggests that commercial threat intelligence vendors vary widely in their data coverage — overlapping 4% at most even when tracking the same specific threat groups. This raises concerns about how many critical alerts are missed by security teams and operations centers — and how holistic their data coverage actually is, even when using more than one vendor.
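A security team that subscribes to more than one feed can get a rough sense of this overlap for itself. The Python sketch below compares sets of indicators (IP addresses, domains, and so on) pairwise and reports the share that both feeds contain. The vendor names and indicators are invented, and the measure used here (the Jaccard index) is an assumption: the research cited above may define overlap differently.

```python
from itertools import combinations


def overlap_pct(a: set[str], b: set[str]) -> float:
    """Percentage of the combined indicators that appear in both feeds (Jaccard index)."""
    union = a | b
    return 100.0 * len(a & b) / len(union) if union else 0.0


def pairwise_overlap(feeds: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Compute the overlap percentage for every pair of feeds."""
    return {
        (x, y): overlap_pct(feeds[x], feeds[y])
        for x, y in combinations(sorted(feeds), 2)
    }


# Invented feeds using documentation-reserved addresses and domains.
feeds = {
    "vendor_a": {"192.0.2.1", "192.0.2.2", "bad.example"},
    "vendor_b": {"192.0.2.2", "198.51.100.7"},
}
for pair, pct in pairwise_overlap(feeds).items():
    print(pair, f"{pct:.1f}%")
```

A low figure is not necessarily bad news on its own, since two feeds that barely overlap may each be covering ground the other misses. It does suggest that neither feed alone gives a complete picture.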
Holistic APIs: The future of addressing corporate risk
How do security professionals and operations centers comprehensively access relevant data and accelerate analysis and triage? To address these issues, security teams must rethink their API coverage.
This means adopting commercial threat intelligence solutions that are transparent about their data coverage. Vendors must be able to offer a wider variety of standard and alternative threat sources than is commonly available through off-the-shelf APIs. To achieve this, vendors often must source data in unique ways — such as developing proprietary web crawlers to sit in less-regulated chat applications and forums.
When standard threat intelligence sources are combined with fringe online data in an API, analysts can do their jobs faster than they could by merging conventional feeds with manual navigation. Analysts also get more contextual value within their tooling than they would by viewing different sources separately. It also means that previously overlooked risks on obscure sites are included in a more holistic security strategy.
An API also retains content that has been deleted on the original site since being crawled, allowing for more thorough investigations than those possible with manual searches. This is important on more obscure networks like 4chan where content turns over quickly.
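The idea of retaining crawled content can be sketched in a few lines of Python. The example below keeps an in-memory copy of every post it has seen and records when a post stops appearing on the source, so that it stays searchable afterwards. The class names, fields, and sample posts are hypothetical, and the sketch assumes each crawl pass covers the whole source. A production system would need persistent storage and a more careful notion of deletion.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Snapshot:
    text: str
    first_seen: datetime
    gone_since: Optional[datetime] = None  # set once the post vanishes from the source


class Archive:
    """Keeps a copy of every crawled post, including posts later removed from the site."""

    def __init__(self) -> None:
        self._posts: dict[str, Snapshot] = {}

    def ingest(self, crawl: dict[str, str], now: Optional[datetime] = None) -> None:
        """Record one crawl pass, given as {post_id: text}."""
        now = now or datetime.now(timezone.utc)
        for post_id, text in crawl.items():
            snap = self._posts.get(post_id)
            if snap is None:
                self._posts[post_id] = Snapshot(text, first_seen=now)
            else:
                snap.gone_since = None  # post is live on the source
        # Anything archived earlier but missing from this pass is marked as gone.
        for post_id, snap in self._posts.items():
            if post_id not in crawl and snap.gone_since is None:
                snap.gone_since = now

    def search(self, keyword: str) -> list[tuple[str, Snapshot]]:
        """Case-insensitive keyword search over live and removed posts alike."""
        key = keyword.lower()
        return [(pid, s) for pid, s in self._posts.items() if key in s.text.lower()]


# Made-up posts: the first one disappears between two crawl passes.
archive = Archive()
archive.ingest({"p1": "Selling Nintendo source code", "p2": "unrelated chatter"})
archive.ingest({"p2": "unrelated chatter"})
for post_id, snap in archive.search("nintendo"):
    print(post_id, "removed" if snap.gone_since else "live", "-", snap.text)
```

A manual search run after the second pass would find nothing, while the archive still returns the removed post along with the time it was first seen.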
When collected and catalogued appropriately, a wider variety of online data can be used to train effective machine learning models. These can support faster and more accurate threat detection for overwhelmed security teams. In fact, some emerging APIs have machine learning functionality already built-in so analysts can narrow in on relevant data faster.
As alert volumes grow and threat actors migrate to a greater variety of online spaces, security professionals are likely to become more concerned with their data coverage — and how to integrate alternative data sources effectively into workflows.
These concerns are often a question of API capabilities: how much coverage does the threat intelligence vendor have across technical, surface, deep, and dark web feeds? Which vendors actively seek and include emerging, relevant sources? And does their offering structure and store data to best support machine learning development?
These developments are necessary to drive automation as the cybersecurity skills shortage worsens against increased digital risks year after year. Prioritizing data diversity also enables a more holistic security strategy.
This is essential for the Nintendos of the corporate security world, which need to look further than standard threat intelligence feeds to uncover damaging risks hiding in plain sight.