to CPU design
• Exploit speculative execution to perform side-channel attacks
  ◦ Speculation is unclean, altering microarchitectural state
  ◦ Exfiltration via CPU data cache violates isolation guarantees
• Discovered by several independent teams in 2017
  ◦ Jann Horn, Google Project Zero
  ◦ Paul Kocher
  ◦ Haas & Prescher, Cyberus Technology
  ◦ Gruss, Lipp, Mangard, & Schwarz, Graz University of Technology
  ◦ Others
execution will be followed from before direction is known and fetches instructions from the target
  ◦ Direct Jumps, Function Calls: Direction & Target known.
  ◦ Conditional Jump: if-then-else, for-loop. Target known. Direction?
  ◦ Indirect Jumps, Function Return: Direction known (taken). Target?
• Otherwise, pipeline is stalled until direction known
• Misprediction Delay can be 10–20 clock cycles
• Typically get >90% accuracy
• Predictor state stored in Branch Target Buffer
https://www.cc.gatech.edu/~milos/Teaching/CS6290F07/6_BranchPred.pdf
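To make the accuracy claim concrete, here is a minimal sketch of a direction predictor for a single conditional branch. It assumes a 2-bit saturating counter, a common textbook scheme that the slide does not name; `predictor_t`, `predict`, `train`, and `simulate_loop` are hypothetical names, and real predictors keep many such entries indexed by branch address and history.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy predictor: one 2-bit saturating counter.
   States 0-1 predict "not taken"; states 2-3 predict "taken". */
typedef struct {
    unsigned counter; /* always in the range 0..3 */
} predictor_t;

static bool predict(const predictor_t *p) {
    return p->counter >= 2;
}

/* Move the counter one step toward the observed outcome, saturating at 0 and 3. */
static void train(predictor_t *p, bool taken) {
    if (taken && p->counter < 3) {
        p->counter++;
    } else if (!taken && p->counter > 0) {
        p->counter--;
    }
}

/* Simulates the backward branch of a loop with `iters` iterations:
   taken on every iteration except the last one.
   Returns the number of correct predictions. */
static size_t simulate_loop(predictor_t *p, size_t iters) {
    size_t correct = 0;
    for (size_t i = 0; i < iters; i++) {
        bool taken = (i + 1 < iters);
        if (predict(p) == taken) {
            correct++;
        }
        train(p, taken);
    }
    return correct;
}
```

Starting from a zeroed counter, a 100-iteration loop is mispredicted only on its first two iterations and on the loop exit, which is consistent with the >90% figure for loop-heavy code.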
are fully resolved (branch condition, memory load, etc.)
• Instructions may be executed in a different order, in parallel
• Processor saves checkpoint at start of speculative region
• Following instructions are speculatively executed
• If correct, the speculatively executed instructions can be retired
• Otherwise, rollback of transient instructions
• Also, rollback upon exception
• No architecturally visible side-effects from rolling back
of transient instructions
• Old register states are preserved by checkpoint; can be restored
• Memory writes are buffered; can be discarded
• Cache modifications are not rolled back
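The three rollback rules can be sketched as a toy model. This is an illustration only, not a description of any real core: `core_t`, `checkpoint_t`, and the helper functions are hypothetical, and the data cache is reduced to one flag per line.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS  4
#define NUM_LINES 8

/* Hypothetical toy core: architectural registers, a count of buffered
   stores, and a per-line "is cached" flag standing in for the data cache. */
typedef struct {
    uint64_t regs[NUM_REGS];
    int      pending_stores;      /* buffered, not yet written to memory */
    bool     cached[NUM_LINES];   /* microarchitectural state */
} core_t;

typedef struct {
    uint64_t regs[NUM_REGS];
} checkpoint_t;

/* Save the register state at the start of a speculative region. */
static checkpoint_t take_checkpoint(const core_t *c) {
    checkpoint_t cp;
    memcpy(cp.regs, c->regs, sizeof cp.regs);
    return cp;
}

/* A transient load: writes a register and pulls one line into the cache. */
static void transient_load(core_t *c, int reg, int line, uint64_t value) {
    c->regs[reg] = value;
    c->cached[line] = true;
}

/* A transient store: only buffered, never made visible in this model. */
static void transient_store(core_t *c) {
    c->pending_stores++;
}

/* Rollback restores registers and discards buffered stores,
   but deliberately leaves the cache flags untouched. */
static void rollback(core_t *c, const checkpoint_t *cp) {
    memcpy(c->regs, cp->regs, sizeof c->regs);
    c->pending_stores = 0;
}
```

After a checkpoint, a transient load, and a rollback, the register holds its old value again while the touched cache line stays marked as cached. That surviving flag is the state the attacks read back.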
illegal access to privileged memory that eventually causes exception
• Takes advantage of race condition in permissions check
• Reads kernel memory
  ◦ Or physical memory of other processes
  ◦ Or memory of other Docker containers
  ◦ Or hypervisor memory
  ◦ Or other Cloud tenants’ memory
• Breaks security assumptions about memory isolation
• Attacker runs code directly
• Affects Intel, high-end ARM, POWER. Not AMD.
window between illegal access and exception raised
• Sequence aborted; instructions discarded
• Transmission: probe[y] now in cache; can deduce secret

    secret = *pointer     // Read byte of inaccessible kernel memory
    y = secret * 4096     // 4KB pages
    z = probe[y]          // One element of probe loaded to cache

• Before: probe, array of 256 4KB pages, flushed from cache
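The encoding and decoding around the probe array can be sketched as a toy model. This is not an exploit: it assumes a hypothetical `toy_cache_t` with one flag per 4 KB page, whereas a real receiver has to infer that flag by timing a reload of each page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 256

/* Toy stand-in for the data cache: one flag per 4 KB page of probe.
   A real attack infers this flag from access latency. */
typedef struct {
    bool page_cached[NUM_PAGES];
} toy_cache_t;

/* "Before": flush every page of the probe array from the cache. */
static void flush_probe(toy_cache_t *cache) {
    for (size_t i = 0; i < NUM_PAGES; i++) {
        cache->page_cached[i] = false;
    }
}

/* Models y = secret * 4096; z = probe[y].
   Only the cache side effect of the load is represented. */
static void transient_access(toy_cache_t *cache, uint8_t secret) {
    size_t y = (size_t)secret * PAGE_SIZE;
    cache->page_cached[y / PAGE_SIZE] = true;
}

/* Receiver: scan all 256 pages for the one that is cached.
   Returns the recovered byte, or -1 if no page is cached. */
static int recover_byte(const toy_cache_t *cache) {
    for (int i = 0; i < NUM_PAGES; i++) {
        if (cache->page_cached[i]) {
            return i;
        }
    }
    return -1;
}
```

The 4096-byte stride gives each possible byte value its own page, so the index of the single cached page is the secret byte itself.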
suggested that it was as if the transient instructions were actually executed in an alternate universe and somehow reached across and touched a cache line in this universe.
• The number of Flush+Reload probes is the bottleneck
• Faster to extract one bit at a time (shift and mask): needs only 2 Flush+Reloads per bit × 8 bits = 16 per byte, rather than up to 256 when probing one page per byte value
• 503 KB/s read rate with error rate as low as 0.02%
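The probe-count arithmetic can be checked with a small model of the bit-at-a-time channel. As before, this is a sketch under assumptions: `bit_channel_t` and its functions are hypothetical, and a boolean flag stands in for the timing measurement of a real Flush+Reload.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-bit channel: two probe lines, one per bit value. */
typedef struct {
    bool     line_cached[2];
    unsigned probes;          /* Flush+Reload probes performed so far */
} bit_channel_t;

/* Sender side: shift and mask select one bit, which selects one line. */
static void send_bit(bit_channel_t *ch, uint8_t secret, unsigned bit) {
    ch->line_cached[0] = false;                   /* flush both lines */
    ch->line_cached[1] = false;
    ch->line_cached[(secret >> bit) & 1u] = true; /* transient access */
}

/* Receiver side: probe both lines, matching the count of 2 per bit. */
static unsigned receive_bit(bit_channel_t *ch) {
    unsigned value = 0;
    for (unsigned i = 0; i < 2; i++) {
        ch->probes++;
        if (ch->line_cached[i]) {
            value = i;
        }
    }
    return value;
}

/* Leak one byte as eight single-bit transfers, least significant bit first. */
static uint8_t leak_byte(bit_channel_t *ch, uint8_t secret) {
    uint8_t out = 0;
    for (unsigned bit = 0; bit < 8; bit++) {
        send_bit(ch, secret, bit);
        out |= (uint8_t)(receive_bit(ch) << bit);
    }
    return out;
}
```

Leaking one byte this way costs 16 probes in the model, against a scan of up to 256 pages when a whole byte is encoded at once.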
that would not happen during correct execution
  ◦ Victim: separate process, sandbox host, etc.
• Attacker locates exploitable code sequence in victim
• Attacker mistrains branch predictor with valid calls
• Leaks via side-channel
• Harder to exploit; harder to mitigate.
Attacker mistrains branch predictor with valid offsets to an array bounds check
• Attacker evicts array bound and probe array from cache
• Meanwhile, victim speculatively loads from invalid offset
• Similar to Meltdown, except:
  ◦ Victim code executes in Spectre
  ◦ Uses branch misprediction, not exception
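The kind of victim code sequence the attacker looks for can be sketched as follows. The names `array1`, `array1_size`, `probe`, `sink`, and `victim` are illustrative assumptions rather than code from any real victim, and the return value exists only so the architectural behaviour can be checked.

```c
#include <stddef.h>
#include <stdint.h>

#define STRIDE 4096

/* Hypothetical victim data; all names here are illustrative. */
static uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8,
                             9, 10, 11, 12, 13, 14, 15, 16};
static size_t  array1_size = 16;
static uint8_t probe[256 * STRIDE];

/* Keeps the compiler from optimising the dependent load away. */
static volatile uint8_t sink;

/* Returns 1 if the access was performed architecturally, 0 if the
   bounds check rejected the offset. */
static int victim(size_t x) {
    if (x < array1_size) {                         /* check the attacker mistrains */
        sink = probe[(size_t)array1[x] * STRIDE];  /* secret-dependent load */
        return 1;
    }
    return 0;
}
```

Architecturally, an out-of-bounds `x` is always rejected. The leak arises only if the mistrained predictor lets the two loads run speculatively before `array1_size` arrives from memory, which leaves a line of `probe` in the cache that depends on the byte at `array1[x]`.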
• GCC, LLVM, MSVC, etc. can now insert speculation barrier
• Using /Qspectre switch, “[MSVC] compiler detects that a range-checked integer is used as an index to load a value that is used to compute the address of a subsequent load” and inserts LFENCE barrier before first load; i.e., before reading secret.
https://blogs.msdn.microsoft.com/vcblog/2018/01/15/spectre-mitigations-in-msvc/
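Placing the barrier by hand looks roughly like the sketch below. It assumes an x86-64 target, where `_mm_lfence()` from `<emmintrin.h>` emits LFENCE; the `SPECULATION_BARRIER` macro, the data, and `victim_hardened` are hypothetical names, and on other targets the macro is an empty placeholder that provides no protection.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>                      /* _mm_lfence */
#define SPECULATION_BARRIER() _mm_lfence()
#else
/* Placeholder for other targets: no barrier is issued here. */
#define SPECULATION_BARRIER() ((void)0)
#endif

#define STRIDE 4096

/* Hypothetical victim data; all names here are illustrative. */
static uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8,
                             9, 10, 11, 12, 13, 14, 15, 16};
static size_t  array1_size = 16;
static uint8_t probe[256 * STRIDE];
static volatile uint8_t sink;

/* Returns 1 if the access was performed, 0 if the offset was rejected. */
static int victim_hardened(size_t x) {
    if (x < array1_size) {
        /* Barrier before the first load, i.e. before reading the secret:
           later loads wait until the bounds check has resolved. */
        SPECULATION_BARRIER();
        sink = probe[(size_t)array1[x] * STRIDE];
        return 1;
    }
    return 0;
}
```

The barrier costs time on every call, including the in-bounds ones, which is one reason compilers insert it only where they detect the pattern quoted above.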
7-minute presentation about Meltdown to Freely Speaking Toastmasters in Seattle. I used large, hand-drawn sheets of paper on a flip chart. I know I lost them on the final slide.