Dark Reading is part of the Informa Tech Division of Informa PLC

4/12/2021
11:55 AM

Microsoft Uses Machine Learning to Predict Attackers' Next Steps

Researchers build a model to attribute attacks to specific groups based on tactics, techniques, and procedures, and then figure out their next move.

Microsoft is developing ways to use machine learning to turn attackers' specific approaches to compromising targeted systems into models of behavior that can be used to automate the attribution of attacks to specific actors and predict the most likely next attack steps. 

In a research blog published earlier this month, the software giant stated it has used data collected on threat actors through its endpoint and cloud security products to train a large, probabilistic machine-learning model that can associate a series of tactics, techniques, and procedures (TTPs), which are the signals defenders can glean from an ongoing cyberattack, with a specific group. The model can also reverse the association: Once an attack is attributed to a specific group, the machine-learning system can use its knowledge to predict the most likely next attack step that defenders will observe.

The machine-learning approach could lead to quicker response times to active threats, better attribution of attacks, and more context on ongoing attacks, says Tanmay Ganacharya, partner director for security research at Microsoft.

"It's critical to detect an attack as early as possible, determine the scope of the compromise, and predict how it will progress," he says. "How an attack proceeds depends on the attacker's goals and the set of tactics, techniques, and procedures that they utilize, [and we focus] on quickly associating observed behaviors and characteristics to threat actors and providing important insights to respond to attacks."

In the early April blog post, Microsoft described the research into machine learning and threat intelligence that uses TTPs from the MITRE ATT&CK framework, the attack chain, and the massive data set of trillions of daily security signals from its 400,000 customers to model threat actors. Just as defenders use playbooks to respond to attacks and not forget important steps in the heat of the moment, attackers typically have a standard way of conducting attacks. The machine-learning approach attempts to model their behavior.

Companies are early in the process of adopting machine learning for threat intelligence processing and enrichment. While about 70% of companies are using machine learning with threat intelligence in some way, 54% of those companies are currently dissatisfied with the technology, according to the SANS Institute's "2021 SANS Cyber Threat Intelligence Survey."

Providing useful information using machine learning could help, the Microsoft 365 Defender Research team stated in its blog.

"We are still in the early stages of realizing the value of this approach, yet we already have had much success, especially in detecting and informing customers about human-operated attacks, which are some of the most prevalent and impactful threats today," the company wrote.

To enable its research, the company consumes data from its Microsoft Defender anti-malware software and services to create collections of TTPs. Using those signals, the company's researchers implemented a Bayesian network model — which in cybersecurity is most commonly associated with anti-spam engines — because it is "well suited for handling the challenges of our specific problem, including high dimensionality, interdependencies between TTPs, and missing or uncertain data," they said.

Bayes' theorem can calculate the probability, given certain TTPs and historical patterns, of a certain group being behind the attacks. 
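
To make the calculation concrete, here is a minimal sketch of attribution with Bayes' theorem. It is not Microsoft's model: it assumes the TTPs are independent given the group (a naive Bayes simplification, whereas a Bayesian network can capture dependencies between TTPs), and the group names, priors, and likelihoods are invented for illustration.

```python
from math import prod

# Hypothetical priors P(group) and likelihoods P(TTP observed | group).
# All names and numbers are invented for illustration only.
PRIORS = {"GroupA": 0.5, "GroupB": 0.5}
LIKELIHOODS = {
    "GroupA": {"transfer_of_tools": 0.8, "disable_security_tools": 0.6},
    "GroupB": {"transfer_of_tools": 0.2, "disable_security_tools": 0.3},
}


def attribute(observed_ttps):
    """Return P(group | observed TTPs), assuming TTPs are independent given the group."""
    # Unnormalized posterior: P(group) * product of P(ttp | group).
    scores = {
        group: PRIORS[group] * prod(LIKELIHOODS[group][t] for t in observed_ttps)
        for group in PRIORS
    }
    # Dividing by the total applies the denominator of Bayes' theorem.
    total = sum(scores.values())
    return {group: score / total for group, score in scores.items()}


posterior = attribute(["transfer_of_tools", "disable_security_tools"])
print(posterior)
```

With these made-up numbers, observing both TTPs shifts most of the probability to the hypothetical GroupA, because both behaviors are more typical of that group's historical pattern.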

"Massive data can provide insights humans cannot through supervised learning," Ganacharya says. "In this case, the TTPs are used as variables in a Bayesian network model, which is a complex statistical tool used to correlate alerts from various detection systems and [predict] future attack stages. These insights help analysts in attribution when a specific actor is present, allowing focused investigations."

Using the probability model also gives analysts additional tools to predict an attacker's next potential action. If certain TTPs are observed — the Transfer of Tools and Disable Security Tools from the MITRE ATT&CK framework, for example — the model will predict the attacks the defender will most likely see next.
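
One way to picture the prediction step is as a weighted average over the candidate groups: each group's typical next technique is weighted by how likely that group is to be behind the attack. The sketch below shows that idea only. The group names, the posterior values, and the per-group next-step probabilities are all hypothetical, and the article does not describe how Microsoft's model actually computes its predictions.

```python
# Hypothetical probability of each group given the TTPs observed so far,
# and hypothetical per-group probabilities for the next technique.
# All names and numbers are invented for illustration only.
group_posterior = {"GroupA": 0.75, "GroupB": 0.25}
next_step_given_group = {
    "GroupA": {"credential_dumping": 0.6, "lateral_movement": 0.3, "data_encryption": 0.1},
    "GroupB": {"credential_dumping": 0.2, "lateral_movement": 0.2, "data_encryption": 0.6},
}


def predict_next(group_posterior, next_step_given_group):
    """Marginalize over groups: P(next) = sum over g of P(g | observed) * P(next | g)."""
    prediction = {}
    for group, p_group in group_posterior.items():
        for ttp, p_ttp in next_step_given_group[group].items():
            prediction[ttp] = prediction.get(ttp, 0.0) + p_group * p_ttp
    # Rank candidate next steps from most to least likely.
    return dict(sorted(prediction.items(), key=lambda kv: kv[1], reverse=True))


ranked = predict_next(group_posterior, next_step_given_group)
print(ranked)
```

An analyst reading the ranked output would look first for the top-ranked technique, while keeping in mind that the ranking is only as good as the attribution and the historical data behind it.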

In addition, the model can be easily updated with new information as attackers change their approaches to compromising targets, the company said.
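
Why a probabilistic model is easy to update can be illustrated with simple counting. In the sketch below, which is an assumption about one possible approach and not a description of Microsoft's system, each newly attributed incident increments per-group counts, and the estimated probability of a TTP for a group is recomputed from those counts with add-one smoothing. The class name, group name, and TTP labels are hypothetical.

```python
from collections import Counter, defaultdict


class TtpCounts:
    """Per-group TTP counts with add-one (Laplace) smoothing."""

    def __init__(self):
        self.incidents = Counter()              # group -> number of incidents seen
        self.ttp_counts = defaultdict(Counter)  # group -> TTP -> incidents featuring it

    def update(self, group, ttps):
        """Fold one newly attributed incident into the counts."""
        self.incidents[group] += 1
        for ttp in set(ttps):  # count each TTP at most once per incident
            self.ttp_counts[group][ttp] += 1

    def likelihood(self, group, ttp):
        """Smoothed estimate of P(TTP appears in an incident | group)."""
        return (self.ttp_counts[group][ttp] + 1) / (self.incidents[group] + 2)


model = TtpCounts()
before = model.likelihood("GroupA", "disable_security_tools")
model.update("GroupA", ["disable_security_tools", "transfer_of_tools"])
after = model.likelihood("GroupA", "disable_security_tools")
print(before, after)
```

The estimate rises after the new incident is recorded, with no retraining step beyond updating the counts.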

Yet challenges remain. Building the model requires good data on threat actors and their specific TTPs. Human experts are still needed to evaluate that data and, for now, to interpret the model's results for customers.

"If the training data does not represent the true behaviors, the model can make poor predictions," Ganacharya says. "This could result in security operations taking incorrect actions to halt the attack, either wasting critical response time by following false leads or impacting users who are not part of the attack."

Veteran technology journalist of more than 20 years. Former research engineer. Written for more than two dozen publications, including CNET News.com, Dark Reading, MIT's Technology Review, Popular Science, and Wired News. Five awards for journalism, including Best Deadline ...
 
