It’s undeniable that Machine Learning (ML) is changing the game for securing cloud infrastructure.  Security vendors have rapidly adopted ML as part of their solutions, and for good reason:  By analyzing massive quantities of data, ML can help identify threats, speed incident response, and ease the burden on over-taxed security operations teams.

The problem?  Vendor hype has set very high expectations for the technology, making it difficult to separate fact from fiction.  Despite marketing claims, ML isn’t a silver bullet, and it isn’t appropriate for every use case.  It is, however, one very powerful tool that belongs in every security practitioner’s toolbox for identifying and remediating threats in cloud infrastructure.

How can security leaders navigate these high expectations and find pragmatic uses for machine learning?  It starts with an understanding of what it can do and what it can’t.  In order to deliver on the promise of ML, we first have to dispel some of the misconceptions around it.  This article will highlight some of those misconceptions and offer practical advice for getting started.


Myth: Machine Learning Will Replace Your Security Team

ML does promise to automate repetitive and manual tasks throughout the detection and incident response cycle.  First, it can help reduce mean time to detect (MTTD) by identifying anomalous behaviors at scale.  This is critical when analyzing billions of events from multiple sources of security telemetry, and is something humans are not equipped to do.  Second, ML can help provide much-needed context that helps security analysts triage and investigate findings.

Is ML here to take our jobs as security professionals?  Absolutely not.  Despite vendor claims of “hands-free” operation, humans are critical to proper implementation, training, and ongoing use of data to detect and remediate threats.

Remember:  machine learning uses statistical analysis of data to find patterns and make observations.  Take anomaly detection, for example:  It uses unsupervised learning to highlight behavioral outliers in security data, but it does not make judgments about good vs. bad, or low-risk vs. high-risk.  Given the highly dynamic, ephemeral nature of cloud infrastructure, those decisions must be made by people with both security expertise and an understanding of organizational context.  We still need humans to draw conclusions from ML findings; in fact, establishing a human-machine feedback loop will tune and improve models, make findings more relevant over time, and allow humans to focus on higher-value work.  In this way, it’s helpful to view ML as a force multiplier.
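To make that distinction concrete, here is a minimal sketch of unsupervised outlier detection over hypothetical API-call counts.  The data, the median-absolute-deviation score, and the 3.5 threshold are all illustrative assumptions, not a production detector:

```python
from statistics import median

# Hypothetical hourly API-call counts per principal -- illustrative data only.
api_calls = {
    "alice": [12, 15, 11, 14, 13],
    "carol": [10, 11, 9, 10, 480],  # one sudden burst of activity
}

def mad_outliers(values, threshold=3.5):
    """Flag points whose modified z-score exceeds `threshold`, using the
    median absolute deviation (MAD) so an extreme point can't hide by
    inflating the spread the way it would with a plain standard deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all -> nothing stands out
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

for user, counts in api_calls.items():
    print(user, mad_outliers(counts))  # only carol's burst is flagged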


Myth: ML Can Generate Meaningful Insights From Any Data

Effective ML-powered security strategies require data, and lots of it, to properly train models.  As infrastructure and the application stack become more layered, security telemetry becomes foundational for holistically assessing risk and building effective ML models.

The type of data needed is highly dependent on the organizational and team goals for ML. For cloud infrastructure security, you need to have deep and broad security telemetry that can enable your ML system to analyze data points from across your infrastructure. Without a complete view of your infrastructure, your ML system will continue to draw false conclusions due to a lack of depth and breadth in the data.  That depth of telemetry will also prove useful in establishing context for investigating findings.  Beware of “black box” ML implementations that are unable to explain why they surfaced a particular finding.

Data quality is also a key contributor to the efficacy of ML.  Garbage in, garbage out, as the popular saying goes.  Data must be normalized and well modeled so that it can be analyzed at scale, with security experts providing input on feature engineering.
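As a sketch of that normalization step, the snippet below maps records from two telemetry formats onto one common schema before any feature engineering happens.  The field names mimic CloudTrail and VPC flow log records, but the events and schema are assumptions for illustration:

```python
# Hypothetical raw events from two different telemetry sources.
cloudtrail_event = {"eventName": "ConsoleLogin",
                    "sourceIPAddress": "198.51.100.7",
                    "userIdentity": {"userName": "alice"}}
vpc_flow_event = {"action": "ACCEPT", "srcaddr": "203.0.113.9", "user": "bob"}

def normalize(event):
    """Map source-specific fields onto a common schema so downstream
    models see consistent feature names regardless of origin."""
    if "eventName" in event:  # CloudTrail-style record
        return {"action": event["eventName"],
                "src_ip": event["sourceIPAddress"],
                "user": event["userIdentity"]["userName"]}
    return {"action": event["action"],  # flow-log-style record
            "src_ip": event["srcaddr"],
            "user": event["user"]}
```

Once every source emits the same `action` / `src_ip` / `user` shape, features can be engineered once and applied uniformly, rather than per log format.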


Myth: ML Can Be Used as the Sole Means of Threat Detection

Some vendors position ML as a single solution for all the challenges around securing cloud infrastructure.  They herald it as an “easy button,” promising a set-it-and-forget-it approach.

The reality:  ML is just one piece of the puzzle.  It’s important to remember that ML can detect anomalies in security data, but not all anomalies are threats and not all threats are anomalies.  Further, anomalous behavior can change over time and become normal as models retrain.

To address this limitation, ML should be paired with in-depth, rules-based detection to cover both known threats and “unknown” threats.  With rules-based monitoring covering known risks and machine learning surfacing anomalies, security teams gain coverage across both categories of threat.
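One way to sketch that pairing: a couple of explicit rules for known-bad patterns, alongside an anomaly score supplied by a separate model.  The rules, field names, and 0.9 threshold here are illustrative assumptions:

```python
def rule_findings(event):
    """Known-threat coverage: explicit rules for patterns we can name."""
    findings = []
    if event.get("action") == "DeleteTrail":  # e.g., audit-log tampering
        findings.append("rule: audit logging disabled")
    if event.get("src_ip", "").startswith("203.0.113."):  # illustrative blocklist
        findings.append("rule: traffic from blocked network")
    return findings

def triage(event, anomaly_score, threshold=0.9):
    """Unknown-threat coverage: surface high anomaly scores for human review,
    layered on top of the rule matches."""
    findings = rule_findings(event)
    if anomaly_score > threshold:
        findings.append(f"anomaly: score {anomaly_score:.2f} exceeds {threshold}")
    return findings

# A known-bad action fires a rule even with a low anomaly score,
# while a novel behavior with no matching rule still surfaces via its score.
print(triage({"action": "DeleteTrail", "src_ip": "10.0.0.1"}, 0.2))
print(triage({"action": "GetObject", "src_ip": "10.0.0.1"}, 0.95))
```

Neither layer alone covers both cases, which is the point: the rule catches the threat that isn’t anomalous, and the score catches the anomaly no rule anticipated.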

With an understanding of the above myths, security leaders can find practical ways to leverage ML as part of a cloud security practice.  Machines are exceedingly good at applying statistical analysis to data, but that isn’t enough.  Humans bring much-needed security expertise, organizational context, and decision-making to the interpretation of ML findings.  Coupled with rules-based detection, this gives security teams visibility into both known and unknown threats.  This is the true promise of machine learning:  threat detection that solves real-world problems, gets better and more relevant over time, and allows humans to focus on higher-value activities.