In this blog post, I’m going to walk through the NetSpectre vulnerability, what this means to our customers, and what Red Hat and other industry partners are doing to address it.
Please note that, based on Red Hat’s understanding, the maximum measured leakage rate from successfully exploiting this vulnerability is on the order of 15-60 bits (2-8 bytes) per hour on a local network and much lower over the internet, and we do not yet have real-world examples of vulnerable code. Nonetheless, the risk posed by sophisticated attackers capable of mounting Advanced Persistent Threats (APTs) that use techniques like NetSpectre against sensitive installations is real. It is important to remember, however, that an attacker would need a very significant amount of time to pull off a real-world attack.
We began 2018 with a series of important security disclosures affecting nearly all modern microprocessors. These vulnerabilities, known as Spectre and Meltdown, as well as the additional variants (e.g. Speculative Store Buffer Bypass, Bounds Check Bypass Store, etc.) that have been disclosed since, target the actual internal implementation of processors, or the microarchitecture. Vulnerable processors run user programs and operating systems entirely within the (architecture) specifications to which they are designed, only yielding results that are in line with those specs, yet microarchitectural optimizations - such as speculative execution - intended to improve performance may result in leaking unintended information from vulnerable designs. These leaks are not direct, like a traditional software exploit, but instead are through side-channels, or mechanisms by which internal processor state can be inferred by attackers.
In exploiting these vulnerabilities, suitably skilled attackers are able to observe the impact of running programs upon shared resources, such as data caches, Translation Lookaside Buffers (TLBs), and other internal processor structures. This is known as side-channel analysis. One processor optimization in particular, speculative execution, has proven vulnerable to such analysis. During speculative execution, the processor attempts to predict future program behavior (branch prediction) and tentatively performs future operations that may or may not be needed, so data used by later parts of a program may be pulled into these shared processor resources before it is known whether the data will be used. Speculative execution was always considered to be a safe design decision because speculatively loaded data was not intended to be visible to programmers or users.
We now know that side-channel analysis of the processor microarchitectural state can be used to infer and leak the content of speculatively loaded data. While Meltdown got much of the press attention in early 2018, thanks to its ready exploitability and compelling visual demos showing application and operating system memory being dumped on vulnerable processors, it was actually quite straightforward to mitigate in software (albeit the operating system level software changes were not trivial to write). A performance hit was taken to address Meltdown through changes such as Page Table Isolation (PTI), but in many cases, the impact isn’t too high and the software mitigations allowed continued use of impacted processors while we wait for future silicon fixes.
Please keep your software updated to mitigate these vulnerabilities, and continue to do so as processors and microcode evolve to address these issues; we will continue to modify the operating system accordingly.
Similarly, Spectre variant 2 can also be mitigated through various software changes that address what is, at heart, a hardware problem. In Spectre variant 2, hardware known as a branch predictor within the processor is “trained” into mispredicting future program execution in order to load secrets that can then be leaked through side-channel analysis. The specific hardware targeted by this vulnerability is the indirect predictor, which is used to guess the target location of code within a program that might run in the future as a result of some function or pointer reference. Like other predictors, it does this by observing historical behavior patterns.
Spectre v2 mitigation uses new hardware control interfaces that were added via in-field updates to processor microcode and firmware. These interfaces change the indirect predictor behavior, either making it much harder to train or separating the branch pattern history between different contexts (e.g. applications, the kernel, etc.). The exact mechanism available varies by processor and is based upon vendor-provided workarounds. As with Meltdown, Spectre variant 2 is not eliminated via software, only mitigated at the cost of some performance until it can be addressed in future silicon.
Spectre variant 1 is a different beast. This vulnerability, known as bounds check bypass (which also has a newer sub-variant known as bounds check bypass store), also exploits speculative execution, but it does so in a more subtle manner, targeting direct branches, unlike the indirect variety exploited in Spectre variant 2.
Consider the following example code taken from the original January 2018 Spectre paper:
if (x < array1_size)
    y = array2[array1[x] * 4096];
When this code is compiled, it will take the form of a conditional branch that runs the assignment (y = array2[array1[x] * 4096]) only in the case that x is less than the upper bounds check (x < array1_size). Processors implementing speculation may seek to perform the assignment on the second line prior to knowing whether the bounds check is correct. The process of resolving the conditional expression upon which the branch occurs is known as resolving the branch, and it can be delayed by tiny fractions of a second, yielding enough time for a modern processor to decide to speculatively run ahead in the program. If this happens, the array access as part of the assignment to y might actually be beyond the limit of the array.
The above code is known as a Spectre gadget. By carefully training the direct predictor (through many calls to the code with correctly in-bounds parameters), the processor will come to predict that later calls to this code path will always use in-bounds data. As a result, subsequent calls into the code with an out-of-bounds value may result in unsafe speculation in which the processor performs an access using the untrusted user data as an offset into memory. If there is secret data in memory stored after array1, then exceeding its bounds could result in a second access (to array2) at an offset that is dependent upon the value of the secret data. This has a measurable impact upon the shared data cache that an attacker with local access can use to infer the content of the secret. By carefully monitoring cache behavior (the time it takes to load data from the cache), this technique can be used to dump secret keys and other information.
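To make the role of the 4096 multiplier concrete, here is a minimal sketch (my own illustration, not code from the Spectre paper) of the mapping between a secret byte and the offset into array2 that the second access touches. The names PROBE_STRIDE, probe_offset, and byte_from_offset are hypothetical. Because each of the 256 possible byte values lands at its own offset, 4096 bytes apart from its neighbors, an observer who learns which offset was accessed learns the byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stride taken from the gadget above: array2[array1[x] * 4096]. */
#define PROBE_STRIDE 4096u

/* Offset into array2 touched by the (possibly speculative) second access. */
static size_t probe_offset(uint8_t secret_byte)
{
    return (size_t)secret_byte * PROBE_STRIDE;
}

/* Reverse mapping: which byte value corresponds to an observed offset. */
static uint8_t byte_from_offset(size_t offset)
{
    assert(offset % PROBE_STRIDE == 0);   /* must be a stride multiple */
    assert(offset / PROBE_STRIDE < 256);  /* must map to a byte value  */
    return (uint8_t)(offset / PROBE_STRIDE);
}
```

For example, a secret byte of 0x41 maps to offset 0x41000, and that offset maps back to 0x41.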
As in the other variants, Spectre variant 1 gadgets rely upon the use of a shared processor cache. Modern processors actually implement many levels of caches between the (fast) core of the processor, and the (slower) external DDR memory chips. The caches are much smaller than the external memory chips, only keeping copies of recently used data, and so less recently used data is frequently evicted from the caches as a result of other memory accesses. Further, the behavior of the cache can be predicted based upon its known design (organization), and specific locations can be flushed through other unrelated accesses. Thus, an attacker can first arrange to flush parts of the cache, then measure which of several locations was reloaded by speculative activity, and from this infer the value of secrets, often a single bit at a time.
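The final inference step described above can be sketched as follows. This is a simplified model, not a working attack: a real attacker would measure reload times with a high-resolution timer, whereas here the latencies are simply passed in as an array. The function name fastest_slot, the 256-slot layout (one slot per possible byte value), and the hit threshold are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PROBE_SLOTS 256  /* assumed: one probe location per byte value */

/*
 * Given the reload latency of every probe slot, return the index of the
 * fastest slot that is below hit_threshold (i.e. looks like a cache hit),
 * or -1 if no slot looks cached.
 */
static int fastest_slot(const unsigned latency[PROBE_SLOTS],
                        unsigned hit_threshold)
{
    int best = -1;
    unsigned best_latency = hit_threshold;

    assert(latency != NULL);
    for (int i = 0; i < PROBE_SLOTS; i++) {
        if (latency[i] < best_latency) {
            best_latency = latency[i];
            best = i;
        }
    }
    return best;
}
```

If every slot reloads slowly, nothing was cached and the function reports -1; if exactly one slot reloads quickly, its index is the inferred value.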
Spectre variant 1 is what really lends itself to the ghost analogy used in the Spectre logo. This is because such gadgets are hard to find, and since the fundamental concept of branch prediction is so critical to the performance of every modern microprocessor, it isn’t easy to remove this vulnerability in future processor designs. Thus, we have so far relied on software changes. Over the past year, special purpose scanners have been created that attempt to search the source code for Spectre gadgets, and certain third-party compilers have been updated to insert automatic speculation hardening. In the case of the Linux kernel, the Smatch tool can be used to scan for many vulnerable gadgets, and then a manual clamping of a vulnerable bounds check can be used to force speculation only to use certain limit values.
For an example of Spectre-v1 hardening in the Linux kernel, we can examine the file kernel/bpf/syscall.c in the upstream tree maintained by Linus Torvalds. In the function find_and_alloc_map, an untrusted value of type is provided by user packet filter code as an offset into an array of valid bpf_map_types.
Traditional limit checks are applied:
if (type >= ARRAY_SIZE(bpf_map_types))
But processor speculation may result in code following this limit check executing before the conditional test is resolved, meaning that a value of type outside of the allowed range might be used to index into the bpf_map_types array. Thus, it is necessary to clamp the value of type under speculation.
This is performed using the following:
type = array_index_nospec(type, ARRAY_SIZE(bpf_map_types));
The value of type is not actually modified by this code, except under speculation, where it might be clamped to a safe value using a careful masking operation defined in array_index_nospec.
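To show what such a masking operation can look like, here is a userspace sketch modeled on the generic, branch-free mask calculation used by the kernel. It is an approximation only: the function names index_mask_nospec and clamp_index_nospec are hypothetical, and as far as I understand the real kernel helper also hides the index from the optimizer and has architecture-specific versions, both of which this sketch omits. It also assumes a compiler, such as GCC or Clang, where right-shifting a negative long is an arithmetic shift.

```c
#include <assert.h>
#include <limits.h>

/*
 * Returns all ones if index < size, and zero otherwise, without using a
 * conditional branch. If index >= size, then (size - 1 - index) wraps
 * around and sets the top bit, so the inverted value has a clear top bit
 * and the arithmetic shift yields 0.
 */
static unsigned long index_mask_nospec(unsigned long index,
                                       unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index))
                           >> (sizeof(long) * CHAR_BIT - 1));
}

/* In-bounds indexes pass through unchanged; others are forced to 0. */
static unsigned long clamp_index_nospec(unsigned long index,
                                        unsigned long size)
{
    assert(size > 0 && size <= (unsigned long)LONG_MAX);
    return index & index_mask_nospec(index, size);
}
```

With a size of 8, indexes 0 through 7 are returned unchanged, while 8 or 100 are clamped to 0. The point of computing the result with arithmetic rather than a comparison and branch is that there is no branch left for the processor to mispredict.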
When the variants of Spectre were first disclosed, it was widely presumed that this was a local-only class of attack. It was assumed that the attacker would need to have direct shared use of the same cache memory structure as the victim (gadget) code. As a result, mitigations focused on removing the ability to leak data across privilege boundaries - for example, in preventing a malicious user application from attacking the kernel and disclosing secrets. In addition, less focus was placed by the industry on direct attacks against application software.
NetSpectre proves the danger of jumping to conclusions (although the current observed access rates to unauthorized data are slow as noted below). The brilliant team of researchers at TU Graz in Austria have once again demonstrated a completely novel attack. In fact, they’ve gone one step further because NetSpectre also discloses a completely new side-channel that doesn’t rely upon caches at all. As we mentioned previously, there is no inherent need for side-channels to be of any particular variety. In this case, the Graz team discovered that AVX (Advanced Vector Extensions) hardware in recent Intel processors includes an optimization which will power down half of the vector unit if a wide vector hasn’t been used recently (within the last 1ms). By carefully timing how long AVX operations take, it is possible to leak data about the top half of a vector. We won’t dive into that further here, but it’s worth reading about it in more detail in the NetSpectre paper.
The core premise of NetSpectre is built upon the realization that microarchitectural state can be observed through the latency of network operations, and that it can be altered to perform remote cache flushing against a target. The attack is split into two halves. In the first, a disclosure gadget similar to a traditional Spectre gadget causes some secret data to be loaded into the processor caches as a result of microarchitectural operations. Just as with the original Spectre attacks, this gadget must be found in existing software (for example, the Linux kernel, or a web server on a remote machine), and it must execute in response to some data ultimately provided over the network, for example, in a web page or SSH request. The example given in the paper sets a single flag variable based upon an out-of-bounds array access:
if (x < bitstream_length)
    if (bitstream[x])
        flag = true
By first mistraining the processor branch prediction hardware (through packets containing normal data) into predicting the array access to be in-bounds (causing the following code to be run speculatively), this disclosure gadget can be used to cause the flag variable to be set under speculation based upon whether the value at bitstream[x] is true or false.
Once this is done, a second gadget known as a transmit gadget is used to convey the value of the flag variable to the remote attacker. Transmit gadgets are far simpler, needing only to perform some activity based upon the microarchitectural state created by the disclosure gadget. As an example, simply using the flag variable above in some code path would cause execution time to vary depending upon whether flag was already loaded into the processor’s caches (as a result of being set to true). All that is left is the need to have a means to flush flag from the caches ahead of time. This is achieved simply by sending a large number of network packets, such as in a request to fetch a file from a web server. Graz coin the term Thrash+Reload for this alternative to the traditional Evict+Reload used by machine-local malicious attacks.
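The interplay of thrashing, the disclosure gadget, and the transmit gadget can be illustrated with a deterministic toy model. To be clear about what this is: it performs no real speculation and no real timing, so it is not an exploit. A boolean stands in for “flag is in the cache”, the mispredicted bounds check is modeled by simply skipping it, and the latencies are arbitrary numbers. Every name and value below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITSTREAM_BITS 8u  /* bits that are architecturally in-bounds */

/* Toy victim memory: one in-bounds byte followed by one secret byte. */
static const uint8_t memory[2] = { 0xFF, 0xA5 };

static bool flag_cached;   /* stands in for flag's cache line state */

/* Models flooding the victim with traffic so that flag is evicted. */
static void thrash(void) { flag_cached = false; }

/* Models the disclosure gadget running with a mispredicted bounds
 * check: the bit at x is read even when x >= BITSTREAM_BITS. */
static void disclosure_gadget_mispredicted(size_t x)
{
    assert(x < sizeof(memory) * 8);
    if ((memory[x / 8] >> (x % 8)) & 1u)
        flag_cached = true;
}

/* Models the transmit gadget: response latency in arbitrary units. */
static unsigned transmit_latency(void) { return flag_cached ? 10u : 100u; }

/* Attacker loop: recover the 8 bits stored just past the bitstream. */
static uint8_t leak_byte_after_bitstream(void)
{
    uint8_t value = 0;

    for (size_t bit = 0; bit < 8; bit++) {
        thrash();
        disclosure_gadget_mispredicted(BITSTREAM_BITS + bit);
        if (transmit_latency() < 50u)   /* fast response: bit was 1 */
            value |= (uint8_t)(1u << bit);
    }
    return value;
}
```

In this model the loop recovers the secret byte 0xA5 one bit per round trip. In a real attack, each of these steps is noisy, which is a large part of why the achievable rates are so low.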
Using these techniques of disclosure and transmit gadgets, as well as the newly discovered AVX side-channel (which allows for an increase in leakage rate up to the 60 bits per hour quoted), the Graz team is able to leak data (that is, gain unauthorized access to it) at a rate of between 15 and 60 bits per hour. That’s bits per hour, not bytes. While this is, of course, a small number, it is still a concern, especially for long-running network services like SSH daemons, secure web servers, and other programs that contain secret crypto keys. While those keys may be thousands of bytes long, they are within the realm of feasible targets for a persistent attacker with sufficient skill and time to pull off an attack.
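To put those rates in perspective, the arithmetic is simple enough to sketch. The helper name hours_to_leak is hypothetical, and the 2048-bit secret used in the example that follows is an assumption chosen purely for illustration.

```c
#include <assert.h>

/* Time, in hours, to leak a secret of the given size at a steady rate. */
static double hours_to_leak(double secret_bits, double bits_per_hour)
{
    assert(bits_per_hour > 0.0);
    return secret_bits / bits_per_hour;
}
```

Under these assumptions, a 2048-bit secret would take roughly 34 hours at 60 bits per hour and roughly 137 hours (a little under six days) at 15 bits per hour, assuming the attack runs uninterrupted and error-free.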
The NetSpectre paper is, of course, representative of amazing and novel research. But what can we do to mitigate the impact upon customers, and to help the broader communities of customers, developers, and partners that we are a part of? First, the good news. We have yet to find an actual example of vulnerable application code in network-accessible services. While we are certain that these will be found over time, it is some consolation that a real-world attack would be difficult at this stage. Red Hat has been working on tooling to help detect suitable gadget code, both in the form of binary and source scanners. A (lightly) modified Smatch tool has been running against various parts of the Red Hat Enterprise Linux userspace and we are busily analyzing the results. If we find vulnerable gadgets, we will remediate them using existing known techniques, such as bounds clamping (via masks), or in some cases processor context serialization code.
Red Hat is working with other members of the community to further understand the impact of NetSpectre on existing scanning tools, and we plan to aid in adapting them to find as many potential gadgets as possible. Because the process of reviewing these becomes tedious and difficult to maintain over time, we are also working with others on compiler improvements, and other longer-term solutions. We are encouraged by the work that Google has done on Speculative Load Hardening (SLH), and by Microsoft’s efforts to enhance its compilers. Red Hat’s tools team is looking at similar improvements for GCC. We will continue to work to deliver mitigations to our customers.
Finally, I have had the privilege and pleasure of spending time with the Graz team in the months since the original Meltdown and Spectre vulnerabilities, and the subsequent variants, were disclosed. In that time, I have witnessed a team that is both talented and committed to the industry standard practice of coordinated disclosure. Many of us in the industry are grateful to the Graz team, as well as others, for the ability to collaborate on mitigations that protect our customers and communities ahead of time. These are true white hats who rightly deserve recognition for helping to keep us all safe from harm. We look forward to continuing to collaborate with the research community and encourage researchers to contact us with future vulnerabilities they would like to collaborate on.
Jon Masters is a computer architect at Red Hat.