Meltdown & Spectre

Explaining the Meltdown, Spectre, and Project Zero papers


George V. Reilly

March 01, 2018


  1. Meltdown & Spectre
    George V. Reilly, @georgevreilly
    Papers We Love Seattle, 2018-03-01
  2. Agenda
    • What are Meltdown and Spectre?
    • Concepts
      ◦ CPU Cache, Branch Prediction, Speculative Execution, etc.
    • Meltdown (Variant 3, Rogue Data Cache Load)
    • Spectre (Variant 1, Bounds Check Bypass)
    • Spectre (Variant 2, Branch Target Injection)
  3. What are Meltdown & Spectre?
    • Serious security flaws due to CPU design
    • Exploit speculative execution to perform side-channel attacks
      ◦ Speculation is unclean, altering microarchitectural state
      ◦ Exfiltration via CPU data cache violates isolation guarantees
    • Discovered by several independent teams in 2017
      ◦ Jann Horn, Google Project Zero
      ◦ Paul Kocher
      ◦ Haas & Prescher, Cyberus Technology
      ◦ Gruss, Lipp, Mangard, & Schwarz, Graz University of Technology
      ◦ Others
  4. Variants 1 and 2: Spectre
    • Trick other apps into accessing arbitrary locations in their memory
    • Variant 1: MIS-PREDICT: bounds check bypass (CVE-2017-5753)
      ◦ Speculation leaks sensitive data
      ◦ Primarily affects JITs and interpreters
    • Variant 2: BTI: branch target injection (CVE-2017-5715)
      ◦ Tricks indirect branch to attacker-controlled destination
      ◦ Primarily affects kernels and hypervisors
  5. Variant 3: Meltdown
    • Variant 3: PRIV-LOAD: rogue data cache load (CVE-2017-5754)
    • Breaks mechanism that keeps apps from accessing arbitrary OS memory
    • Unprivileged code reads privileged data
    • Primarily affects kernels (and architecturally equivalent software)
  6. CPU Architecture https://investorshub.advfn.com/boards/read_msg.aspx?message_id=95215022

  7. L1, L2, L3 Cache Hierarchy http://www.tomshardware.com/reviews/dual-xeon-duo,664-3.html

  8. Virtual Memory https://medium.com/@mattklein123/meltdown-spectre-explained-6bc8634cc0c2

  9. Page Permissions https://wiki.osdev.org/Paging

  10. Out-of-Order Execution https://renesasrulz.com/doctor_micro/rx_blog/b/weblog/posts/pipeline-and-out-of-order-instruction-execution-optimize-performance

  11. Branch Prediction
    • Branch Prediction: processor guesses which path of execution will be followed before the direction is known, and fetches instructions from the predicted target
      ◦ Direct Jumps, Function Calls: Direction & Target known.
      ◦ Conditional Jump: if-then-else, for-loop. Target known. Direction?
      ◦ Indirect Jumps, Function Return: Direction known (taken). Target?
    • Without prediction, pipeline would stall until direction is known
    • Misprediction delay can be 10–20 clock cycles
    • Typically get >90% accuracy
    • Predictor state stored in Branch Target Buffer
    https://www.cc.gatech.edu/~milos/Teaching/CS6290F07/6_BranchPred.pdf
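    A minimal sketch of how a direction predictor can reach this kind of accuracy, assuming a classic 2-bit saturating counter (the slide does not say which scheme any particular CPU uses, and real predictors are far more elaborate). The function name and the loop workload are illustrative only.

```python
def simulate_two_bit_predictor(outcomes, state=2):
    """Return the fraction of branches predicted correctly.

    `state` is a saturating counter in 0..3; values 2 and 3 predict "taken".
    This is a textbook scheme, used here only as an illustration.
    """
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct / len(outcomes)


# Back-edge of a 20-iteration loop that runs 50 times:
# taken 19 times, then not taken once when the loop exits.
loop_branch = ([True] * 19 + [False]) * 50
accuracy = simulate_two_bit_predictor(loop_branch)
print(f"accuracy: {accuracy:.1%}")
```

    In this toy workload the only misprediction is the loop exit, which is why simple loops are so predictable.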
  12. Speculative Execution
    • Speculative Execution: processor executes code before antecedents are fully resolved (branch condition, memory load, etc.)
    • Instructions may be executed in a different order, and in parallel
    • Processor saves checkpoint at start of speculative region
    • Following instructions are speculatively executed
    • If the speculation was correct, the speculatively executed instructions can be retired
    • Otherwise, rollback of transient instructions
    • Also, rollback upon exception
    • No architecturally visible side-effects from rolling back
  13. Speculation Example
        /* The condition depends on two loads; the CPU may pick a side before they complete. */
        if ((foo_array[index1] ^ foo_array[index2]) == 0) {
            result = bar_array[100];
        } else {
            result = bar_array[200];
        }
    https://rwc.iacr.org/2018/Slides/Horn.pdf
  14. Misspeculation
    • Exceptions and incorrect branch predictions can cause rollback of transient instructions
    • Old register states are preserved by checkpoint; can be restored
    • Memory writes are buffered; can be discarded
    • Cache modifications are not rolled back
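    The asymmetry in these bullets can be sketched with a toy model. Everything here (`ToyCore`, the 64-byte line size, the addresses) is a hypothetical illustration, not a description of any real pipeline: registers are restored from the checkpoint, buffered stores are dropped, but the set of cached lines keeps whatever the transient loads touched.

```python
class ToyCore:
    """Toy model: registers and the store buffer roll back; the cache does not."""

    LINE_SIZE = 64  # assumed cache-line size in bytes

    def __init__(self):
        self.registers = {"result": 0}
        self.memory = {}
        self.store_buffer = {}
        self.cached_lines = set()

    def load(self, address):
        self.cached_lines.add(address // self.LINE_SIZE)  # side effect on the cache
        return self.store_buffer.get(address, self.memory.get(address, 0))

    def store(self, address, value):
        self.store_buffer[address] = value  # buffered, not yet visible in memory

    def run_speculatively(self, instructions, prediction_correct):
        checkpoint = dict(self.registers)  # saved at start of speculative region
        for instruction in instructions:
            instruction(self)
        if prediction_correct:
            self.memory.update(self.store_buffer)  # retire buffered writes
        else:
            self.registers = checkpoint  # restore old register state
        self.store_buffer.clear()  # retired above, or discarded on rollback


def transient_body(core):
    core.registers["result"] = core.load(0x1000)
    core.store(0x2000, 42)


core = ToyCore()
core.memory[0x1000] = 7
core.run_speculatively([transient_body], prediction_correct=False)
print(core.registers, core.memory.get(0x2000), sorted(core.cached_lines))
```

    After the rollback the register and memory state look untouched, yet the cache line for address 0x1000 is still present. That leftover is what the attacks in the following slides measure.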
  15. Meltdown: Rogue Data Cache Load
    • Uses transient instructions after illegal access to privileged memory that eventually causes exception
    • Takes advantage of race condition in permissions check
    • Reads kernel memory
      ◦ Or physical memory of other processes
      ◦ Or memory of other Docker containers
      ◦ Or hypervisor memory
      ◦ Or other Cloud tenants’ memory
    • Breaks security assumptions about memory isolation
    • Attacker runs code directly
    • Affects Intel, high-end ARM, POWER. Not AMD.
  16. Kernel Mem Present in Userland https://databricks.com/blog/2018/01/16/meltdown-and-spectre-exploits-and-mitigation-strategies.html

  17. Meltdown Code
    • Transient instructions (L2, L3) executed in time window between illegal access and exception raised
    • Sequence aborted; instructions discarded
    • Transmission: probe[y] now in cache; can deduce secret
        1  secret = *pointer      Read byte of inaccessible kernel memory
        2  y = secret * 4096      4KB pages
        3  z = probe[y]           One element of probe loaded to cache
    • Before: probe, array of 256 4KB pages, flushed from cache
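    The transmit and receive sides of this sequence can be sketched as a toy simulation. This is not an exploit: the cache is modelled as a Python set, the fault as an exception, and names such as `PermissionFault`, `leak_byte` and the sample data are hypothetical. A real receiver would time reloads of each probe page rather than look them up in a set.

```python
PAGE_SIZE = 4096


class PermissionFault(Exception):
    """Stands in for the exception raised by the illegal access."""


def transient_sequence(kernel_memory, pointer, cached_pages):
    """Toy model of lines 1-3; the fault arrives only after they have run."""
    secret = kernel_memory[pointer]      # line 1: illegal read, value used transiently
    y = secret * PAGE_SIZE               # line 2: one probe page per byte value
    cached_pages.add(y // PAGE_SIZE)     # line 3: probe[y] is pulled into the cache
    raise PermissionFault("permission check completes; results are discarded")


def flush_reload(cached_pages):
    """Receiver: report which of the 256 probe pages reload fast (are cached)."""
    return [page for page in range(256) if page in cached_pages]


def leak_byte(kernel_memory, pointer):
    cached_pages = set()                 # "Before": probe array flushed from cache
    try:
        transient_sequence(kernel_memory, pointer, cached_pages)
    except PermissionFault:
        pass                             # architectural state is rolled back
    hits = flush_reload(cached_pages)
    return hits[0] if len(hits) == 1 else None


kernel_memory = b"hunter2"               # hypothetical privileged data
leaked = bytes(leak_byte(kernel_memory, i) for i in range(len(kernel_memory)))
print(leaked)
```

    The index of the single cached probe page equals the secret byte, which is why the probe array needs 256 pages.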
  18. Observing the Cache

  19. Alternate Universe
    Speaking of speculative execution influencing the cache, Eric suggested that it was as if the transient instructions were actually executed in an alternate universe and somehow reached across and touched a cache line in this universe.
  20. Single-Bit Transmission
    • Recovering a whole byte at once requires 256 Flush+Reloads
    • The number of Flush+Reloads is the bottleneck
    • Faster to extract one bit at a time (shift and mask), as it needs only 2 Flush+Reloads per bit × 8 bits = 16 per byte
    • 503 KB/s read rate with error rate as low as 0.02%
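    The probe-count arithmetic and the shift-and-mask reassembly can be written out as a short sketch. The `read_bit` callback is a hypothetical stand-in for one single-bit leak through the cache, and the default of 2 reloads per bit is taken from the slide.

```python
def probes_per_byte_full():
    """One Flush+Reload per candidate byte value."""
    return 256


def probes_per_byte_single_bit(reloads_per_bit=2, bits=8):
    """Figure from the slide: 2 Flush+Reloads per bit, 8 bits per byte."""
    return reloads_per_bit * bits


def recover_byte_bitwise(read_bit):
    """Reassemble a byte from eight single-bit leaks (shift and mask)."""
    value = 0
    for bit in range(8):
        value |= (read_bit(bit) & 1) << bit
    return value


secret = 0xA7  # hypothetical secret byte
recovered = recover_byte_bitwise(lambda bit: (secret >> bit) & 1)
print(probes_per_byte_full(), probes_per_byte_single_bit(), hex(recovered))
```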
  21. Exception Handling / Suppression
    • Fork child; let it crash; parent observes cache
    • Catch with signal() or Structured Exception Handling
    • Suppress exceptions with Transactional Memory (TSX)
  22. Kernel Page-Table Isolation https://databricks.com/blog/2018/01/16/meltdown-and-spectre-exploits-and-mitigation-strategies.html

  23. Spectre Attacks
    • Attacker induces victim to speculatively perform ops that would not happen during correct execution
      ◦ Victim: separate process, sandbox host, etc.
    • Attacker locates exploitable code sequence in victim
    • Attacker mistrains branch predictor with valid calls
    • Leaks via side-channel
    • Harder to exploit; harder to mitigate.
  24. #1: Conditional Branch Misprediction
    • Bounds Check Bypass Attack
    • Attacker mistrains branch predictor with valid offsets to an array bounds check
    • Attacker evicts array bound and probe array from cache
    • Victim then speculatively loads from an invalid offset
    • Similar to Meltdown, except:
      ◦ In Spectre, it is the victim's code that executes
      ◦ Uses branch misprediction, not exception
  25. Victim Code
        offset = some_untrusted_value;       // attacker-controlled
        if (offset < array->size) {          // array->size uncached ⇒ miss
            secret = array->data[offset];    // out-of-bounds access
            y = ((secret & 1) << 8);         // y = 0 or 256
            z = probe->data[y];              // probe->data also uncached
        }
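    A toy simulation of the mistrain-then-attack flow around this victim code, under loud assumptions: the predictor is a 2-bit counter, the secret sits directly after the array in memory, and the cache is a Python set. `ToyVictim` and `leak_low_bit` are hypothetical names, and nothing here touches real hardware.

```python
class ToyVictim:
    def __init__(self, data, secret_memory):
        # Assumed layout: the secret directly follows the array in memory.
        self.size = len(data)
        self.memory = list(data) + list(secret_memory)
        self.counter = 0              # 2-bit counter; >= 2 predicts "in bounds"
        self.cached_probe = set()     # probe->data offsets currently cached

    def call(self, offset):
        predicted_in_bounds = self.counter >= 2
        actually_in_bounds = offset < self.size
        # The body runs if predicted (speculatively) or if the check passes.
        if (predicted_in_bounds or actually_in_bounds) and offset < len(self.memory):
            secret = self.memory[offset]
            y = (secret & 1) << 8     # y = 0 or 256
            self.cached_probe.add(y)  # probe->data[y] is loaded into the cache
        # The check resolves: a mispredicted body is rolled back, but
        # cached_probe keeps its contents. Then train the predictor.
        if actually_in_bounds:
            self.counter = min(self.counter + 1, 3)
        else:
            self.counter = max(self.counter - 1, 0)


def leak_low_bit(victim, out_of_bounds_offset):
    for i in range(4):                      # mistrain with valid offsets
        victim.call(i % victim.size)
    victim.cached_probe.clear()             # evict the probe array
    victim.call(out_of_bounds_offset)       # speculative out-of-bounds load
    return 1 if 256 in victim.cached_probe else 0


victim = ToyVictim(data=[1, 2, 3, 4], secret_memory=[0x53])
print(leak_low_bit(victim, out_of_bounds_offset=4))
```

    Without the training calls the predictor says "out of bounds", the body never runs, and nothing leaks in this model.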
  26. Flow of Execution https://databricks.com/blog/2018/01/16/meltdown-and-spectre-exploits-and-mitigation-strategies.html

  27. Successful PoC Attacks
    • Spectre paper demonstrates JavaScript attack to read browser’s private memory.
      ◦ Ad malware stealing credentials from other tabs
      ◦ SharedArrayBuffer feature disabled
      ◦ performance.now() resolution reduced
    • Project Zero demos kernel attack using eBPF
      ◦ Unprivileged userspace code supplies bytecode to kernel
      ◦ Kernel validates the bytecode but ...
  28. Bounds Check Bypass Mitigation
    • Requires recompilation of vulnerable code
    • GCC, LLVM, MSVC, etc. can now insert speculation barrier
    • Using /Qspectre switch, “[MSVC] compiler detects that a range-checked integer is used as an index to load a value that is used to compute the address of a subsequent load” and inserts LFENCE barrier before first load; i.e., before reading secret.
    https://blogs.msdn.microsoft.com/vcblog/2018/01/15/spectre-mitigations-in-msvc/
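    The intended effect of the barrier can be sketched in a toy model: with the barrier in place, the loads after the bounds check cannot start until the check has actually resolved, so the predictor's guess no longer matters. The function and its parameters are hypothetical, and this models only the ordering effect, not the LFENCE instruction itself.

```python
def victim_call(memory, size, offset, predictor_says_in_bounds, barrier):
    """Return the set of probe offsets touched by one call (toy model).

    With barrier=True the body waits for the real bounds check, which is
    the effect the inserted speculation barrier is meant to have.
    """
    touched = set()
    in_bounds = offset < size
    if barrier:
        body_runs = in_bounds
    else:
        body_runs = predictor_says_in_bounds or in_bounds
    if body_runs and offset < len(memory):
        secret = memory[offset]
        touched.add((secret & 1) << 8)
    return touched


memory = [2, 4, 6, 8, 0x53]   # four array elements, then a hypothetical secret
print(victim_call(memory, 4, 4, predictor_says_in_bounds=True, barrier=False))
print(victim_call(memory, 4, 4, predictor_says_in_bounds=True, barrier=True))
```

    Legitimate in-bounds calls behave the same with or without the barrier in this sketch. The cost on real hardware is that the loads wait for the check instead of running ahead of it.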
  29. #2: Poisoning Indirect Branches
    • Indirect branches may jump to more than 2 targets
      ◦ e.g., function pointer, vtable pointer, RET
    • If target address causes cache miss, and branch predictor is mistrained, speculative execution occurs at attacker-chosen location
    • Processor unwinds the speculation
    • But, exfiltration (or worse) via covert channel
    • GPZ demo: read host memory from KVM guest
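    A toy sketch of how a mistrained predictor can steer a victim's indirect branch. The key assumption, labelled here because the slide does not spell it out, is that the predictor is indexed by only part of the branch address, so an attacker branch can alias with a victim branch. `ToyBTB`, the 8-bit index, and all addresses are hypothetical.

```python
BTB_INDEX_BITS = 8  # assumed: toy predictor indexed by the low address bits only


class ToyBTB:
    def __init__(self):
        self.entries = {}

    def _index(self, branch_address):
        return branch_address & ((1 << BTB_INDEX_BITS) - 1)

    def predict(self, branch_address):
        return self.entries.get(self._index(branch_address))

    def update(self, branch_address, target):
        self.entries[self._index(branch_address)] = target


def indirect_call(btb, branch_address, real_target, speculated):
    """Execute one indirect branch; record any wrongly speculated target."""
    predicted = btb.predict(branch_address)
    if predicted is not None and predicted != real_target:
        speculated.append(predicted)         # transient execution at wrong target
    btb.update(branch_address, real_target)  # misprediction found and unwound
    return real_target


GADGET, LEGIT = 0xBAD0, 0x600D               # hypothetical code addresses
ATTACKER_BRANCH, VICTIM_BRANCH = 0x1234, 0x9934  # same low 8 bits (0x34)

btb, speculated = ToyBTB(), []
indirect_call(btb, ATTACKER_BRANCH, GADGET, speculated)         # attacker trains
result = indirect_call(btb, VICTIM_BRANCH, LEGIT, speculated)   # victim runs
print(hex(result), [hex(t) for t in speculated])
```

    The victim still ends up at its legitimate target, but only after transiently running the gadget, which is where the cache side channel comes in.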
  30. Branch Target Injection https://databricks.com/blog/2018/01/16/meltdown-and-spectre-exploits-and-mitigation-strategies.html

  31. Indirect Branch Mitigations
    • Recompilation required
    • Microcode patches from Intel: Indirect Branch Restricted Speculation, other instructions
      ◦ Huge overhead
      ◦ Linus Torvalds: “the patches are COMPLETE AND UTTER GARBAGE”
    • RETPOLINE: RET Trampoline
      ◦ Confuse speculation with a dummy infinite loop
      ◦ Helper func pushes target onto return stack, then RET to target
      ◦ GCC & LLVM: yes. MSVC: unknown.
    RETPOLINE: https://support.google.com/faqs/answer/7625886
  32. References
    • https://meltdownattack.com/: Meltdown/Spectre papers
    • https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
    • https://www.youtube.com/watch?v=6O8LTwVfTVs: Horn
    • https://blog.acolyer.org/2018/01/15/meltdown/
    • https://blog.acolyer.org/2018/01/16/spectre-attacks-exploiting-speculative-execution/
    • https://medium.com/@mattklein123/meltdown-spectre-explained-6bc8634cc0c2
    • https://databricks.com/blog/2018/01/16/meltdown-and-spectre-exploits-and-mitigation-strategies.html
  33. https://www.xkcd.com/1938/

  34. Backup Slides

  35. Non-Technical Meltdown Speech
    On January 15th, 2018, I gave a 7-minute presentation about Meltdown to Freely Speaking Toastmasters in Seattle. I used large, hand-drawn sheets of paper on a flip chart. I know I lost them on the final slide.
  39. MeltdownPrime & SpectrePrime
    • Exploit cache-coherency protocols by observing caching behavior differences across multiple cores
    https://www.schneier.com/blog/archives/2018/02/new_spectremelt.html
  40. http://slideplayer.com/slide/10942870/