Security Magazine logo

Machine Learning: How It Works

By John McClurg
May 1, 2019

Machine learning, as applied here to file analysis, follows a four-phase process: collection, extraction, learning and classification.

Collection

Like DNA analysis, file analysis starts with massive quantities of data: specific types of files (executables, PDFs, Microsoft Word® documents, Java, etc.). Millions of files are collected from industry sources, proprietary repositories and inputs from active computers.

The goal is to ensure:

  • statistically significant sample sizes
  • sample files of the broadest type and authorship (author groups such as Microsoft, Adobe, etc.)
  • an unbiased collection that does not over-represent specific file types.

Files are then reviewed and placed into three buckets: known and verified valid; known and verified malicious; and unknown. An accurate review is imperative: placing a malicious file in the valid bucket, or a valid file in the malicious bucket, would bias the resulting models.
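The bucketing and collection checks described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline the article describes: the `Sample` record, the `Label` names and the 50 percent `max_share` threshold are all hypothetical and chosen only for the example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    VALID = "known and verified valid"
    MALICIOUS = "known and verified malicious"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Sample:
    name: str
    file_type: str  # e.g. "exe", "pdf", "doc"
    label: Label


def bucket(samples):
    """Group reviewed samples into the three buckets."""
    buckets = {label: [] for label in Label}
    for s in samples:
        buckets[s.label].append(s)
    return buckets


def over_collected_types(samples, max_share=0.5):
    """Return file types whose share of the corpus exceeds max_share.

    max_share is a hypothetical threshold, not a value from the article.
    """
    counts = Counter(s.file_type for s in samples)
    total = sum(counts.values())
    return sorted(t for t, n in counts.items() if n / total > max_share)


corpus = [
    Sample("a.exe", "exe", Label.VALID),
    Sample("b.exe", "exe", Label.MALICIOUS),
    Sample("c.exe", "exe", Label.UNKNOWN),
    Sample("d.pdf", "pdf", Label.VALID),
]
buckets = bucket(corpus)
skewed = over_collected_types(corpus)  # executables are 3 of 4 samples
```

In this toy corpus executables make up three quarters of the samples, so the check flags them as over-collected; a real corpus would need far larger sample sizes before such a ratio means anything.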

Extraction

The extraction of attributes follows, which is substantively different from behavior identification or malware analysis historically conducted by threat researchers. Rather than seeking things analysts believe might be malicious, this approach leverages the compute capacity of machines and data-mining to identify the broadest possible set of file characteristics — some as basic as the file size and others as complex as the first logic leap in the binary.

The atomic characteristics are then extracted, depending on file type (.exe, .dll, .com, .pdf, .java, .doc, .ppt, etc.). By identifying the broadest possible set of attributes, manual classification bias is removed. Using millions of attributes also raises the cost to an attacker of creating a piece of malware that could go undetected. This attribute identification and extraction process creates a file genome comparable to the human genome. It can be used to mathematically determine the expected characteristics of files, just as human DNA analysis is used to determine the characteristics and behaviors of cells.
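To make attribute extraction concrete, here is a minimal sketch that derives a handful of numeric characteristics from a file's raw bytes. The attribute set (size, byte entropy, distinct byte count, printable ratio and a check for the "MZ" signature that begins DOS/Windows executables) is an assumption made for illustration; the article describes millions of attributes and does not list them.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def extract_attributes(data: bytes) -> dict:
    """Return a small, illustrative attribute dictionary for one file."""
    counts = Counter(data)
    printable = sum(n for b, n in counts.items() if 32 <= b < 127)
    return {
        "size": len(data),
        "entropy": byte_entropy(data),
        "distinct_bytes": len(counts),
        "printable_ratio": printable / len(data) if data else 0.0,
        "starts_with_mz": data[:2] == b"MZ",
    }


# A synthetic stand-in for file contents; real input would be read from disk.
attrs = extract_attributes(b"MZ" + bytes(range(256)))
```

Each value is cheap to compute and carries no judgment about maliciousness on its own, which matches the article's point that attributes are gathered broadly rather than chosen by an analyst's suspicion.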

Learning

Once extracted, the output is normalized and converted to numerical values for use in statistical models. Vectorization and machine learning are then applied to remove human-introduced bias and to speed analytical processing. Using the attributes identified during extraction, mathematicians then develop statistical models that predict whether a file is benign or malicious. Dozens of models are created and measured against key metrics to ensure predictive accuracy. Ineffective models are scrapped. Effective models are subjected to multiple levels of testing.

The first level starts with a sample of known files. Later stages involve the entire file corpus (tens of millions of files). The final models are then loaded into a production environment for use in file classification.
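The learning phase can be illustrated with a toy example: normalize the attribute vectors, fit a statistical model on labeled files, and test it on files held back from training. The sketch below uses logistic regression trained by gradient descent purely as a stand-in, since the article does not say which model families are used. The two attributes (entropy and size in KB), the sample values, the learning rate and the epoch count are all invented for illustration.

```python
import math


def fit_scaler(rows):
    """Record each column's minimum and range from the training rows."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return lows, spans


def scale(rows, scaler):
    """Normalize rows using the training minimum and range."""
    lows, spans = scaler
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=1.0, epochs=3000):
    """Fit weights and bias by batch gradient descent on log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Estimated probability that a scaled attribute vector is malicious."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def accuracy(model, X, y):
    hits = sum((predict_proba(model, x) >= 0.5) == bool(yi) for x, yi in zip(X, y))
    return hits / len(y)


# Invented toy data: [entropy, size_kb]; label 1 = malicious, 0 = benign.
train_raw = [[7.6, 120], [7.9, 300], [7.2, 80], [4.1, 200], [5.0, 950], [3.8, 40]]
train_y = [1, 1, 1, 0, 0, 0]
test_raw = [[7.5, 500], [4.5, 100]]
test_y = [1, 0]

scaler = fit_scaler(train_raw)
model = train_logistic(scale(train_raw, scaler), train_y)
test_X = scale(test_raw, scaler)
test_accuracy = accuracy(model, test_X, test_y)
```

A model that scored poorly on the held-out files would be the kind of "ineffective model" the article says is scrapped; in practice the comparison would run across many candidate models and a far larger corpus.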

It’s important to remember that for every file scrutinized, millions of attributes are analyzed to differentiate between legitimate files and malware. This is how machine learning identifies malware – whether known or unknown – and achieves unprecedented levels of accuracy. It divides a single file into an astronomical number of characteristics and analyzes each against hundreds of millions of other files to reach a decision about the health of each characteristic.

Classification

Once built, the statistical models can be used by math engines to classify unknown files (i.e., files never seen before). This analysis takes milliseconds and is extremely precise because of the breadth of the file characteristics analyzed.

Because it relies on statistical models, the classification is not opaque. A “confidence score” is included as part of the process. This score provides incremental insight that can inform decisions regarding what action to take: block, quarantine, monitor or analyze further.

An important distinction between a machine-learning approach and a traditional approach is that the mathematical approach builds models that specifically determine whether a file is benign or malicious. It returns a response of “suspicious” if confidence about a file's malicious intent is less than 20 percent and there are no other indications of maliciousness. An enterprise can thus gain a holistic perspective on the files running in its environment.
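One way to picture how a confidence score informs the choice of action is a simple threshold table. In the sketch below, only the 20 percent cut-off and the "suspicious" response come from the text above. The 60 and 90 percent cut-offs, the "undetermined" label and the pairing of each verdict with an action are hypothetical choices made for illustration.

```python
SUSPICIOUS_CUTOFF = 0.20  # from the article
QUARANTINE_CUTOFF = 0.60  # hypothetical
BLOCK_CUTOFF = 0.90       # hypothetical


def decide(confidence, other_indicators=False):
    """Map a malicious-confidence score in [0, 1] to a (verdict, action) pair."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= BLOCK_CUTOFF:
        return ("malicious", "block")
    if confidence >= QUARANTINE_CUTOFF:
        return ("malicious", "quarantine")
    if confidence >= SUSPICIOUS_CUTOFF or other_indicators:
        # "undetermined" is a placeholder label, not a term from the article.
        return ("undetermined", "analyze further")
    # Below 20 percent with no other indications of maliciousness.
    return ("suspicious", "monitor")


results = [decide(0.95), decide(0.70), decide(0.10), decide(0.10, other_indicators=True)]
```

Keeping the cut-offs as named constants makes the policy easy to audit and tune, which fits the article's point that a score-based classification is not opaque.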

KEYWORDS: artificial intelligence (AI) cyber risk management cybersecurity trends

John McClurg

John McClurg served as Sr. Vice President, CISO and Ambassador-At-Large in BlackBerry's/Cylance’s Office of Security & Trust. McClurg previously was CSO at Dell; Vice President of Global Security at Honeywell International, Lucent Technologies/Bell Laboratories; and in the U.S. Intelligence Community, as a twice-decorated member of the Federal Bureau of Investigation.

Copyright ©2026. All Rights Reserved BNP Media, Inc. and BNP Media II, LLC.
