The latest estimates show 328.77 million terabytes of data are created each day, roughly 120 zettabytes per year, compared to just 2 zettabytes in all of 2010. This growth isn’t expected to slow in the coming years, and organizations increasingly rely on this data to make informed business decisions, conduct research and analysis, and much more. But managing and securing that growth in data volume poses a challenge for security teams, leading to more breaches and resulting in new regulations such as the latest Securities and Exchange Commission (SEC) rules for incident disclosure and risk management.

Data security posture management (DSPM) providers try to provide security visibility into these massive data volumes, but cannot scale to the volume of data created, shared and copied. Plus, a rules-based approach to data classification is slow and inaccurate, unable to keep up with changes in data, resulting in a long time to value. To effectively minimize risk, security teams must keep up with data growth, understand who has access to that data and remediate problems as quickly as data comes in. DSPMs and other legacy data security solutions weren’t designed to handle the data volumes and usage needs of modern enterprises with cloud and generative AI (GenAI) applications, and security teams can’t scale their resources to fill that gap.

In an era of increasing cyber threats and exponential data growth, it’s imperative that organizations understand how current data security solutions are falling short and what’s needed to meet the SEC’s and other regulatory requirements to report significant cyber incidents within just a few days.

Data visibility alone isn’t enough 

Data visibility is undeniably important. There’s no way to protect data that hasn’t been identified. In addition, there’s lost, forgotten and stale data, plus data that is seldom analyzed, including log files, sensor data and other types of unstructured data, such as text documents, emails and audio and video files. GenAI makes it simple to generate images, text and much more, adding to overall data volume. To safely use GenAI models, enterprises must ensure that the models are compliant, secure and will not output sensitive data. Add the ease with which data can be copied, moved and shared in cloud environments, and legacy data security tools can’t keep up. These solutions lack the architecture necessary to accurately identify and classify data at the speed and scale required to ensure data security today.

Rules-based data classification no longer works 

Data classification has been a challenge for a long time, and many legacy data security solutions addressed this problem by using rules to classify data. Regular expressions (also known as regexes) made the identification of some types of data easy, at least when you can consistently match a pattern, like a Social Security number: ###-##-####. But what if the data doesn’t match the pattern exactly? Or what if another number matches that pattern, but isn’t a Social Security number? It’s not difficult to see that this type of data classification is brittle and ill-suited to classifying large volumes of structured and unstructured data. Creating and maintaining the number of rules necessary to classify data at scale takes far more time than security teams can dedicate to the problem. Instead, organizations must seek solutions that leverage artificial intelligence (AI) to identify data types, locations and access information, then classify that data, using AI reasoning to analyze both structured and unstructured data and understand new data types based on context.
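To make the brittleness concrete, here is a minimal Python sketch of the kind of rule described above. The pattern, the sample strings and the helper name `find_matches` are illustrative assumptions, not taken from any particular product.

```python
import re

# Hypothetical rule for the ###-##-#### pattern: three digits, two digits,
# four digits, separated by hyphens.
SSN_RULE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_matches(text):
    """Return every substring of `text` that the rule flags."""
    return SSN_RULE.findall(text)


# A number written exactly as the rule expects is flagged.
formatted = find_matches("Employee SSN: 123-45-6789")

# The same number written with spaces is missed (a false negative).
unformatted = find_matches("Employee SSN: 123 45 6789")

# An unrelated, made-up order reference is flagged (a false positive).
lookalike = find_matches("Order ref 987-65-4321 shipped")

print(formatted, unformatted, lookalike)
```

Each gap could be patched with another rule, such as one for spaces or one that excludes order references, but every patch adds to the set of rules that must be written and maintained.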

Continuous data visibility and classification is essential

Data that has been identified and classified is easier to search, analyze and understand, ensuring that sensitive data is appropriately identified and protected and enabling organizations to comply with data privacy and security regulations. For example, the Health Insurance Portability and Accountability Act (HIPAA) requires covered organizations in the United States to safeguard the security and privacy of protected health information (PHI), a difficult requirement to follow without full data visibility.

While the SEC rules cover only public companies, the scope of a security event is much broader, as it includes impact to and from customers, partners and suppliers. The SEC also requires registrants to disclose the material aspects of an incident’s nature, scope and timing within four business days of determining that a cybersecurity incident is material. When an incident occurs, organizations must rapidly determine the impact of the event, what data was accessed or stolen, which agencies must be notified, and how quickly notification must take place. For organizations that do not have continuous visibility into and classification of data, answering these questions quickly and accurately is impossible, and the answers are often based primarily on guesswork and telemetry. Strong access controls and data visibility show security teams what was compromised and when, enabling organizations to respond rapidly and in detail following a material cyber incident.
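To illustrate how short a four-business-day window is, the following Python sketch counts forward from the date an incident is determined to be material. The function name `disclosure_deadline`, the convention that counting starts the day after the determination, and the caller-supplied holiday set are all assumptions made for illustration. This is not legal guidance on how the deadline is computed.

```python
from datetime import date, timedelta


def disclosure_deadline(determined_on, business_days=4, holidays=frozenset()):
    """Return the date `business_days` business days after `determined_on`.

    Assumptions: weekends are skipped, any dates in `holidays` are skipped,
    and the determination day itself is not counted.
    """
    current = determined_on
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() is 0-4 for Monday-Friday, 5 and 6 for the weekend.
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


# Materiality determined on Thursday, March 7, 2024.
deadline = disclosure_deadline(date(2024, 3, 7))

# Same determination date, with a hypothetical holiday on Monday, March 11.
deadline_with_holiday = disclosure_deadline(
    date(2024, 3, 7), holidays=frozenset({date(2024, 3, 11)})
)

print(deadline, deadline_with_holiday)
```

Under these assumptions, a Thursday determination leaves until the following Wednesday to establish what data was affected and prepare the disclosure.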

Modern enterprises require a modern approach

The reality is that organizations today rely on data to make business decisions, and more data is being created, stored and shared faster than ever before. Traditional data security solutions, including DSPMs, were not designed to identify, classify and secure the variety and volume of data generated today. This extends the time required to detect threats, leaving organizations vulnerable to attack, unable to use data effectively and ill-prepared to respond quickly to incidents. Security threats and the business risk they introduce, combined with new regulations that require rapid reporting of material incidents, mean organizations must ensure that they are ready and able to respond at any time.