Actionable Detections: An Analysis of ATT&CK Evaluations Data Part 2 of 2

The Tech Platform
Oct 3, 2020

In part 1 of this blog series, we introduced how you can break down and understand detections by security products. When analyzing ATT&CK Evaluations results, we have found it helpful to assess and deconstruct each detection based on three key benchmarks:

Availability — Is the detection capability gathering the necessary data?
Efficacy — Can the gathered data be processed into meaningful information?
Actionability — Is the provided information sufficient to act on?

The first two benchmarks (availability and efficacy) are naturally defined by data sources, from which context is derived. Context, meaning an understanding of the true meaning and consequences of events, is what enables actionability. However, context is limited by the raw data inputs you consume (availability) and by whether you consume enough of the right data to make sense of the situation (efficacy).

In this second and final part of this blog series, we address the always relevant question of “so what?” and provide insight into why we are so excited to introduce protections into the next round of ATT&CK Evaluations. Detecting malicious events is not the final solution to thwarting adversaries, as some action needs to be taken to mitigate, remediate, and prevent current and future threat activity. The context provided by data sources is what enables us to make these actionable decisions, whether they are operational (e.g., killing a malicious process) or strategic (e.g., hardening an environment in an attempt to prevent the execution of malicious processes).

Actionability: The So What

Every detection ends with actionability, where the value of the entire detection process is realized. The actionable decisions we make begin with the available context surrounding a detection. However, generating detections does not guarantee successful actionability, as many other factors challenge the strength of a detection’s context and must be addressed. We will explore and highlight these critical factors in the following case study.

Case Study: Credential Dumping (T1003)

Day 2 of the APT29 Emulation included a very interesting implementation of Credential Dumping (T1003). As described in publicly available cyber threat intelligence, APT29 has dumped plain-text credentials from victims using a PowerShell implementation of Mimikatz (Invoke-Mimikatz) hidden in and executed from a custom WMI class. Similar to the APT29 malware, we emulated this complex behavior in a single PowerShell script that was evaluated as steps 14.B.1 through 14.B.6, which leads us to our first actionability challenge.

stepFourteen_credDump.ps1 used to emulate the APT29 credential dumping behavior

Factor 1: Detecting a Behavior is not Detecting Every Technique: The One to Many Problem

We’ve learned a lot during our ATT&CK Evaluations journey. One of our biggest realizations relates to the difference between experimentation and real-world application. In the lab, we’re interested in capturing and analyzing every available data point to garner the maximum amount of specific and measurable results that we can analyze and draw conclusions from. However, reality is often much different, as real-world success may be based on maximizing the value of a single, seemingly less significant data point within an experiment.

This idea is highlighted by the credential dumping case study. The credential dumping behavior of the APT29 emulation was evaluated as six different but connected techniques, each with its own detection criteria and results.

Techniques associated with the emulated APT29 credential dumping behavior

These granular results are critical to Evaluations, where we aim to identify strengths/gaps and ultimately promote improvements, one technique at a time. But as defenders in the real world, do we actually need to detect every technique within this behavior to have a fighting chance at actionability?

The answer to this question of course circles back to context. Detecting each technique within this behavior contributes an integral piece to understanding the entire scenario and how a defender could respond:

Potential defensive actions based on the emulated APT29 credential dumping behavior

As demonstrated above, the detection of each individual technique may provide unique context that can lead to a more complete actionable response. However, we can also see that the defensive action associated with each individual technique could prevent the behavior, as interrupting even a single technique of this behavior would stop the adversary from successfully obtaining credentials. Also, each defensive action could reveal more context that leads to the detection of the other connected techniques (e.g., investigating the WMI class would reveal the code to download and execute Mimikatz). These conclusions on the interrelationship between connected techniques lead to our next factor of actionability.

Factor 2: The Value Chain of Correlated Detections

Although we provide Evaluations results one technique at a time, in reality, breaches are a series of connected techniques and behaviors. As the credential dumping case study shows, the behavior is a series of functionally dependent techniques an adversary uses to accomplish a single goal (obtaining credentials). One break in that process may render the behavior unsuccessful.
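The dependency described above can be sketched as a small Python model. This is only an illustration under stated assumptions, not part of the Evaluations methodology: the step labels paraphrase the descriptions in this post, step 14.B.6 is left out because it is not described here, and `behavior_succeeds` is a hypothetical helper that assumes each step depends on every step before it.

```python
# Steps of the emulated credential dumping behavior, in execution order.
# Labels paraphrase this post; 14.B.6 is omitted because the post
# does not describe it.
CRED_DUMP_CHAIN = [
    ("14.B.1", "abnormal WMI execution"),
    ("14.B.2", "process discovery (T1057)"),
    ("14.B.3", "write of the Mimikatz file"),
    ("14.B.4", "credential dumping (T1003)"),
    ("14.B.5", "saving the Mimikatz results"),
]


def behavior_succeeds(chain, blocked_steps):
    """Return (succeeded, completed_step_ids) for a chain of dependent steps.

    Assumption: each step depends on every step before it, so the first
    blocked step halts everything that follows.
    """
    completed = []
    for step_id, _description in chain:
        if step_id in blocked_steps:
            return False, completed
        completed.append(step_id)
    return True, completed


if __name__ == "__main__":
    # No defensive action: the whole chain runs.
    print(behavior_succeeds(CRED_DUMP_CHAIN, set()))
    # Interrupting a single step stops the behavior.
    print(behavior_succeeds(CRED_DUMP_CHAIN, {"14.B.3"}))
```

In this simplified model, blocking only 14.B.3 returns a failed behavior with just 14.B.1 and 14.B.2 completed, which mirrors the point that one break in the process may be enough.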

This concept directly relates to the Evaluations’ Correlated detection modifier (known in the APT3 Evaluation round as Tainted). Defined as presenting a detection “as being descendant of events previously identified as suspicious/malicious,” this modifier highlights another factor of the actionability of a detection. Specifically, the actionability of a detection can be enhanced by detections of previous techniques and behaviors.

Example application of the Correlated modifier

To clarify this point, let’s review the credential dumping case study. Typically, the defensive responses available for discovery techniques, such as the process discovery (T1057) in step 14.B.2, are less impactful. Unless the adversary’s discovery activity can easily be recognized as potentially malicious (such as scanning entire IP ranges), these techniques may blend into the “noise” of benign user activity. Since discovery techniques often utilize legitimate system utilities (such as binaries or protocols regularly used by users and services), preventing execution of these techniques may render systems unusable.

Mitigation provided for ATT&CK T1057 — Process Discovery

So how does the Correlated modifier enhance actionability? Even if the process discovery in Step 14.B.2 is detected, as defenders, what can we confidently do with this information? Is killing every process discovering other processes an appropriate response, or do we need more context to make a better decision? In this case, detecting that technique alone is probably not enough to take action, but if we connect 14.B.2 back to 14.B.1 and recognize that the process discovery is being executed from an abnormal WMI execution (more context), we may have what we need to take a sound defensive action.
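One way to picture this is a lineage check: a detection counts as correlated when it descends from an event that was already flagged as suspicious. The Python sketch below is a hedged illustration only. The `Event` fields, the `is_correlated` helper, and the example events (including the benign `user-shell` parent) are hypothetical and do not represent any vendor's data model or the Evaluations' scoring logic.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Event:
    event_id: str
    description: str
    parent_id: Optional[str] = None  # event that spawned this one, if known
    suspicious: bool = False         # flagged as suspicious on its own merits


def is_correlated(event_id: str, events: Dict[str, Event]) -> bool:
    """Return True if any ancestor of the event was flagged as suspicious."""
    seen = set()  # guards against accidental cycles in the lineage data
    current = events[event_id].parent_id
    while current is not None and current not in seen:
        seen.add(current)
        ancestor = events.get(current)
        if ancestor is None:  # lineage is incomplete; stop walking
            break
        if ancestor.suspicious:
            return True
        current = ancestor.parent_id
    return False


# Hypothetical events: the same process discovery activity seen twice,
# once under the abnormal WMI execution and once under a normal user shell.
EXAMPLE_EVENTS = {
    e.event_id: e
    for e in [
        Event("14.B.1", "abnormal WMI execution", suspicious=True),
        Event("14.B.2", "process discovery (T1057)", parent_id="14.B.1"),
        Event("user-shell", "interactive user shell"),
        Event("benign-discovery", "process discovery (T1057)", parent_id="user-shell"),
    ]
}

if __name__ == "__main__":
    for event_id in ("14.B.2", "benign-discovery"):
        print(event_id, is_correlated(event_id, EXAMPLE_EVENTS))
```

Under these assumptions, the process discovery in 14.B.2 is correlated because its parent is the flagged WMI execution, while the identical activity under an unflagged user shell is not. Neither event is suspicious on its own; only the lineage separates them.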

The power of correlation does not just exist within a single behavior. As we previously discussed, a breach is a series of connected behaviors. In our credential dumping case study, the behaviors of step 14.B are preceded by various detectable behaviors such as executing a malicious payload (Step 11.A) and bypassing UAC to elevate privileges (Step 14.A). Correlation enhances actionability by providing more context, not specifically to a single technique but rather to the entire story of behaviors. This leads to our final factor of actionability, which addresses how to detect the gaps in this story.

Factor 3: The Cost of “Misses”

In a perfect world, every story has a complete beginning, middle, and end. Each part of the story builds upon the previous parts and flows into the next. With detections, we capture this as correlation, where our context of the adversary’s story increases with each new detection. But does that context disappear if a piece is missing?

Looking back at the credential dumping case study, we are reminded that, although not ideal, in the real world we can possibly tolerate “misses.” For example, even if we did not detect the credential dumping technique (14.B.4), we could potentially still understand the behavior based on the surrounding context. Detections capturing the write of the Mimikatz file (14.B.3) and saving the Mimikatz results (14.B.5) could fill the gap (at least enough to take action) based on correlation and the surrounding context of the story.
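As a rough sketch of this idea, the Python snippet below lists steps that were not detected but sit between two steps that were. It rests on a deliberately simple assumption, namely that a miss bracketed by detections on both sides can be inferred from context. Real inference depends on what the surrounding detections actually contain, and `inferred_misses` is a hypothetical helper, not an Evaluations concept.

```python
# Ordered step identifiers for the case study (14.B.6 is omitted because
# this post does not describe it).
STEPS = ["14.B.1", "14.B.2", "14.B.3", "14.B.4", "14.B.5"]


def inferred_misses(ordered_steps, detected):
    """Return undetected steps that lie between two detected steps.

    Simplifying assumption: a miss bracketed by detections on both sides
    can be inferred from the surrounding context.
    """
    positions = [i for i, step in enumerate(ordered_steps) if step in detected]
    if len(positions) < 2:
        return []  # nothing brackets a gap
    first, last = positions[0], positions[-1]
    return [s for s in ordered_steps[first:last + 1] if s not in detected]


if __name__ == "__main__":
    # The Mimikatz file write and the saved results are detected,
    # but the credential dumping itself is missed.
    print(inferred_misses(STEPS, {"14.B.3", "14.B.5"}))
```

With detections for 14.B.3 and 14.B.5 only, the helper returns 14.B.4 as the step the surrounding detections bracket, which matches the example above.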

Bringing Everything Together: See the Forest for the Trees

Context is key, but as the credential dumping case study highlighted, detecting a behavior is not detecting every technique. If we organize and interpret our data correctly, we may not need to connect every piece of the puzzle to understand and act on the situation in front of us.

Can we determine what this incomplete image is?

As Keith McCammon outlined during his ATT&CKcon 2.0 presentation, Prioritizing Data Sources for Minimum Viable Detection, we need to focus on “the probable” rather than “the possible.” In the case of detections, this translates to the conclusion that with the right context we don’t need to detect everything to be effective. We must learn to operate with and make the most of what we have. While we should continually innovate and improve, this is another practical recognition of how we interpret the ATT&CK Evaluations results and how understanding detection capabilities can make us better defenders.

Actionability in the Context of ATT&CK Evaluations

In this two-part blog series, we discussed how we deconstruct and analyze detections using the availability, efficacy, and actionability benchmarks. As explained both in this post and in part 1, we continuously try to evolve and advance the way we execute and share Evaluations results. Along with adding data sources to the detection categories to address availability and efficacy, we will make additional adjustments to our Carbanak and FIN7 evaluations. As we have shared previously, these will include the introduction of the protections evaluations and a new approach to illuminating each vendor’s alert and correlation strategy. We believe these changes will further highlight the actionability of each detection.

Carbanak+FIN7 Evaluation Protection Categories

We hope that this series, as well as the corresponding changes to ATT&CK Evaluations, enhances your ability to use the results. Please reach out to us with any additional feedback or ideas on how we can provide more value. As always, stay healthy and safe.

Source: Medium.com