Background
The US Government's top cyber defense agency, CISA, has been cataloging known cyber exploitation activity and attributing it to the underlying vulnerability. In a previous blog, we identified that ~90% of the CISA-identified attacks had vulnerabilities with Remote Code Execution (RCE) attributes. RCEs allow attackers to (a) readily infiltrate workloads and (b) run arbitrary code on the victim workload. At the same time, enterprises are swamped by a tsunami of vulnerabilities that they can't patch fast enough. Despite pouring billions into protecting enterprise workloads, we are falling further behind bad actors instead of closing the gap. Rapidly.
What is at the heart of this problem?
Code can be hostile at two stages in its lifespan - first, before it starts executing, and second, at runtime, while the code executes.
To detect if code is hostile even before it starts executing, the anti-virus component of the XDR maintains a database of signatures of known malicious code. Unfortunately, this detection mechanism falls apart as soon as the attacker makes even a small change to their malicious code. An additional complication: over 450K new malware samples are reported every day.
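The fragility of signature matching can be illustrated with a minimal sketch. It assumes the simplest possible signature, a SHA-256 digest of the whole payload; real anti-virus engines also use byte patterns and heuristics, so this is an illustration of the principle, not a model of any product. The payload bytes and the names SIGNATURE_DB and is_known_malware are hypothetical.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-malicious payloads.
KNOWN_BAD_PAYLOAD = b"echo pwned; rm -rf /tmp/victim"
SIGNATURE_DB = {hashlib.sha256(KNOWN_BAD_PAYLOAD).hexdigest()}


def is_known_malware(payload: bytes) -> bool:
    """Return True only if the payload's digest is in the signature database."""
    return hashlib.sha256(payload).hexdigest() in SIGNATURE_DB


original = KNOWN_BAD_PAYLOAD
variant = KNOWN_BAD_PAYLOAD + b" "  # one trailing space, same behavior

print(is_known_malware(original))  # True
print(is_known_malware(variant))   # False
```

A single added byte produces a completely different digest, so the variant is no longer recognized even though it behaves identically.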
Detecting if code is hostile as it executes is a little more complicated. A typical cyberattack on a workload involves a series of preparatory steps, followed by the initial infiltration step, then the actual attack, and finally the post-attack cleanup steps. These steps are collectively known as the attack's kill chain. Each step in the kill chain has a specific Tactic, a specific Technique, and a precise Procedure (or code) that achieves the objective of that stage of the kill chain. By analyzing millions of attacks over time, MITRE has built the ATT&CK Framework, which catalogs hundreds of tactics (the first T in TTP) and techniques (the second T in TTP) used by bad actors, along with the exact procedures (the P in TTP). To detect a cyberattack, the Detect-And-Respond class of security control (XDR) looks for a series of TTPs occurring in order. When a monitored workload starts executing code and known TTPs are detected, the XDR control raises a "suspicion" score by a predefined amount. When the suspicion score crosses a pre-configured threshold, the security control declares that the workload is under attack. Unfortunately, using TTPs to detect a cyberattack suffers from several systemic vulnerabilities, as enumerated below.
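The scoring scheme described above can be sketched roughly as follows. The technique IDs are real MITRE ATT&CK identifiers, but the weights, the threshold of 70, and the function names are assumptions made for illustration; actual XDR products use their own proprietary scoring.

```python
from typing import Dict, Iterable

# Hypothetical per-technique weights (the "predefined amount" added per TTP).
TTP_WEIGHTS: Dict[str, int] = {
    "T1059": 20,  # Command and Scripting Interpreter
    "T1055": 30,  # Process Injection
    "T1486": 50,  # Data Encrypted for Impact
}
ALERT_THRESHOLD = 70  # assumed pre-configured threshold


def suspicion_score(observed: Iterable[str]) -> int:
    """Sum the weights of observed techniques; uncataloged ones add nothing."""
    return sum(TTP_WEIGHTS.get(technique, 0) for technique in observed)


def under_attack(observed: Iterable[str]) -> bool:
    """Declare an attack once the score reaches the threshold."""
    return suspicion_score(observed) >= ALERT_THRESHOLD


early = ["T1059", "T1055"]
late = ["T1059", "T1055", "T1486"]

print(suspicion_score(early), under_attack(early))  # 50 False
print(suspicion_score(late), under_attack(late))    # 100 True
```

In this sketch the alert fires only once the encryption technique has already been observed, and a procedure that is absent from the catalog contributes nothing to the score.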
Systemic Vulnerability #1: Malware can Bypass, Unhook, or Even Terminate XDR Telemetry
A recent article shows how bad actors can bypass the runtime telemetry from the workload that XDRs rely on. Worse, yet another recent article shows how to terminate XDR telemetry. With no telemetry from the XDR agent, the workload is flying blind.
Systemic Vulnerability #2: Over-Dependence on a Predefined (Potentially Outdated) Catalog of Threats
Whenever an XDR vendor believes it has captured all possible TTPs, the attacker changes their procedure and manages to sneak past the XDR. Bad actors hold a significant advantage and can blindside the detect-and-respond security control.
Systemic Vulnerability #3: Detection Algorithm Cannot Distinguish between Legitimate and Malicious TTP
Each TTP involves the code issuing a series of system calls to the kernel. When the XDR picks up a TTP, it is hard to distinguish whether the observed TTP is the consequence of the monitored application's legitimate behavior or some attacker-induced behavior. Unless the security control has observed the application in pristine conditions while the enterprise exercised all legitimate application activity, it will deliver false positives. It is interesting to note that the OS alone drops ~100K executables on the workload. Learning every legitimate behavior of every executable on every workload in the enterprise can become a serious logistical challenge. To distinguish between legitimate and malicious code, XDR uses several thresholds described below.
Systemic Vulnerability #4: Arbitrary Line-In-The-Sand in Quantity-Based TTP Threshold
Let's say the bad actor has infiltrated an application such as WinRAR that encrypts files legitimately. The XDR may need to treat two files getting encrypted in a minute as normal behavior, but what if 15 files were to get encrypted in a minute? To avoid triggering a false positive, the XDR must watch the application execute normally for a sufficiently long time to "learn" whether encrypting 15 files per minute represents legitimate behavior of the app or not.
The line-in-the-sand of 15 (or whatever the number) is arbitrary and may differ for different apps. To abuse this threshold, bad actors can slow their attack down just a little to float under the radar of this threshold.
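A minimal sketch of such a quantity-based threshold, and of how slowing down evades it, is shown here. The sliding-window design, the class name EncryptionRateMonitor, and the helper triggers_alert are assumptions for illustration; the limit of 15 files per minute comes from the example above.

```python
from collections import deque


class EncryptionRateMonitor:
    """Flags a process that encrypts more than `max_per_window` files per window."""

    def __init__(self, max_per_window: int = 15, window_seconds: float = 60.0):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._events = deque()  # timestamps of recent encryption events

    def record(self, timestamp: float) -> bool:
        """Record one encryption event; return True if the threshold is exceeded."""
        self._events.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while self._events[0] <= timestamp - self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.max_per_window


def triggers_alert(seconds_between_files: float, total_files: int) -> bool:
    """Simulate encrypting files at a fixed pace and report whether an alert fires."""
    monitor = EncryptionRateMonitor()
    return any(
        monitor.record(i * seconds_between_files) for i in range(total_files)
    )


print(triggers_alert(2.0, 100))  # True: 30 files per minute exceeds the limit
print(triggers_alert(5.0, 100))  # False: 12 files per minute stays under it
```

In both runs all 100 files end up encrypted; the slower attacker simply takes a few more minutes and never crosses the line.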
Systemic Vulnerability #5: Arbitrary Line-In-The-Sand Related to Detection Metrics
Let's consider a ransomware attack. Not knowing who or what code is at work, the security control will need a metric to decide if legitimate code is updating a file or if malicious code is doing so. Many XDRs measure a file's entropy ratio, defined as the ratio of plain text data vs. non-plain text data. An unencrypted text file will have a very high entropy ratio, whereas an encrypted file will have a very low entropy ratio.
Unfortunately, there is no clear-cut value of the entropy ratio from which the XDR can definitively conclude whether malware is encrypting a file or not. A sophisticated attacker can easily encrypt only some chunks of data in the file so that the entropy ratio returns a sufficiently high value, yet the file has become useless. Malware like LockFile has successfully abused these metric-based lines in the sand using the above-mentioned technique.
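The partial-encryption trick can be sketched as follows. This is an illustration under stated assumptions: the "entropy ratio" is approximated as the fraction of bytes that are printable text, the cutoff of 0.8 is arbitrary, and the XOR keystream is a toy stand-in for a real cipher. The block size and spacing are hypothetical and are not a description of any specific malware family.

```python
import hashlib

PRINTABLE = set(range(32, 127)) | {9, 10, 13}  # printable ASCII plus tab, LF, CR
RATIO_THRESHOLD = 0.8  # assumed cutoff: below this, the file is deemed encrypted


def plaintext_ratio(data: bytes) -> float:
    """Fraction of bytes that look like plain text (proxy for the entropy ratio)."""
    if not data:
        return 1.0
    return sum(byte in PRINTABLE for byte in data) / len(data)


def keystream(length: int) -> bytes:
    """Deterministic pseudo-random bytes; a toy stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def scramble(data: bytes, block: int = 16, every: int = 1) -> bytes:
    """XOR one `block`-byte chunk out of every `every` chunks with the keystream."""
    stream = keystream(len(data))
    out = bytearray(data)
    for start in range(0, len(data), block * every):
        for i in range(start, min(start + block, len(data))):
            out[i] ^= stream[i]
    return bytes(out)


document = b"The quick brown fox jumps over the lazy dog.\n" * 400
fully_encrypted = scramble(document, every=1)
partly_encrypted = scramble(document, every=8)

print(plaintext_ratio(fully_encrypted) < RATIO_THRESHOLD)   # True: flagged
print(plaintext_ratio(partly_encrypted) < RATIO_THRESHOLD)  # False: not flagged
```

With only one chunk in eight scrambled, the file is damaged throughout, yet the metric stays comfortably on the "legitimate" side of the assumed cutoff.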
Systemic Vulnerability #6: XDR uses Arbitrary Line-In-The-Sand for Observation-Period
Knowing that XDRs must observe code for a while before reaching a verdict, sophisticated bad actors can create short-fuse malware that completes its evil intent before the observation period ends. Shown below is a partial list of some recent short-fuse malware that we found in the wild.
Figure 2: Recent Short-Fuse Malware Examples
Systemic Vulnerability #7: Extended Time Taken by XDR To Complete Analysis
As I read through this Bleeping Computer article, the following sentence caught my attention: "Microsoft Security Copilot is an AI-powered security analysis tool that enables analysts to respond to threats quickly, process signals at machine speed, and assess risk exposure in minutes," Redmond says. It became evident that since TTPs are post-execution artifacts, the XDR will never be able to prevent some amount of damage from occurring, even if the telemetry were to be processed by ultra-powerful analysis.
Positive Security Model: A New Approach to Achieve Zero Dwell Time Cyber Protection
Contemporary XDR security controls continue leveraging post-exploitation telemetry to make cyber decisions. This strategy is not working out, and a serious rethink is needed. A Positive Security Model is a next-generation security control with the following properties:
- It allows only such code to execute that comes from ISVs the enterprise trusts, is explicitly authorized, and has had its integrity explicitly verified. Such code may be vulnerable, but at least it is not malware. A Positive Security Model control can then focus on keeping the code safe at runtime.
- It does not leverage any post-exploitation telemetry in making its decisions. Instead, it tracks attack signals that occur before the attacker begins to execute their malicious code. A Positive Security Model control focuses on whether the code about to execute has been provided or influenced by an attacker or is the original code from a trusted software vendor.
- It does not rely on arbitrary metrics that an attacker can subvert. Instead, it relies on invariant telemetry with a binary (true/false) nature and hence does not make ambiguous decisions. A Positive Security Model control ensures that legitimate code executes within the invariant guardrails extracted from the code rather than focusing on the attacker, whose behavior is highly unpredictable. One example of an invariant characteristic of code is the geometry of the process memory occupied by legitimate code. Once the code has loaded in memory, this geometry is invariant. By contrast, XDRs leverage ambiguous metrics such as YARA Rules, entropy ratio, etc., that the attacker can subvert at will.
- Like the firewall that allows network traffic that matches explicitly defined rules and denies all other traffic, a Positive Security Model allows only trusted code to execute and denies all other code from executing. With this approach, the attacker loses the ability to execute malicious code on a workload, even if the application is vulnerable and the bad actor can infiltrate it.
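The default-deny idea in the list above can be sketched as a small allowlist check. It covers only the pre-execution decision (is this code authorized, and is its integrity intact?) and does not model the runtime protections described above. The class name ExecutionAllowlist, the file names, and the use of SHA-256 as the integrity check are assumptions for illustration.

```python
import hashlib
from typing import Dict


def digest(code: bytes) -> str:
    """SHA-256 digest used here as a stand-in integrity check."""
    return hashlib.sha256(code).hexdigest()


class ExecutionAllowlist:
    """Default-deny: code runs only if explicitly authorized and unmodified."""

    def __init__(self) -> None:
        self._trusted: Dict[str, str] = {}  # executable name -> expected digest

    def authorize(self, name: str, code: bytes) -> None:
        """Record code from a trusted vendor as explicitly authorized."""
        self._trusted[name] = digest(code)

    def may_execute(self, name: str, code: bytes) -> bool:
        """Binary decision: True only for authorized code with intact integrity."""
        expected = self._trusted.get(name)
        return expected is not None and expected == digest(code)


allowlist = ExecutionAllowlist()
vendor_binary = b"\x7fELF...trusted vendor build"  # placeholder bytes
allowlist.authorize("backup-agent", vendor_binary)

print(allowlist.may_execute("backup-agent", vendor_binary))            # True
print(allowlist.may_execute("backup-agent", vendor_binary + b"\x90"))  # False
print(allowlist.may_execute("dropper", b"attacker payload"))           # False
```

The decision is binary, with no score or threshold to tune: tampered code and unknown code are both denied, regardless of what the attacker's payload does.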
Conclusion
Despite the billions poured into it, the Detect-and-Respond (XDR) security model hasn't lived up to its promise of keeping enterprises safe. Attackers are routinely taking advantage of systemic vulnerabilities in the XDR class of controls. Enterprises should switch to a Positive Security Model for workloads because ever-evolving attacker tactics and techniques do not subvert these controls. A Positive Security Model control can defend against known and unknown malware and keep code safe at runtime even if the code has known or unknown vulnerabilities.
To learn more about the Positive Security Model, please visit our website, www.virsec.com.