
Synthesizing Evidence of Emergent Computation (PriSC 2022)

Scott Moore
January 21, 2022


Transcript

  1. Synthesizing Evidence of Emergent Computation

    Scott Moore, Jennifer Paykin, Olivier Savary Bélanger
    Principles of Secure Compilation, January 22, 2022
  2. Studying insecure compilation

    © 2022 Galois, Inc.

    • SEEC is a tool for automatically finding counter-examples to secure compilation
    • Why build tools to demonstrate insecure compilation?
      1. As tools to help write secure compilers by enabling designers to test before proving
      2. As tools to understand existing compilers, which we know to be insecure, but not always how or why
      3. As tools to understand system security in general: secure compilation is a framework for formalizing the intended behaviors of any interactive system
  3. Exploits as insecure compilation

    • Many exploit authors think about hacking as an exercise of “programming a weird machine” of unintended functionality enabled by unexpected inputs
    • These “weird behaviors” or “emergent computations” are system behaviors that emerge from implementation artifacts, design mistakes, or adversarial environments.
    • Understanding the weird behaviors of a system and their characteristics is crucial to understanding its vulnerability.

    [Figure: Implementing a virtual machine using out-of-bounds memory operations in bitmap decoding. From “A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution,” Ian Beer & Samuel Groß of Google Project Zero.]

    Reference: Jennifer Paykin, Eric Mertens, Mark Tullsen, Luke Maurer, Benoît Razet, and Scott Moore, PriSC 2022.
  4. Insecure compilation models emergent computation

    [Diagram: a component P linked with a context C, shown once in the high-level model and once in the low-level model]

    Definition: A source component P admits emergent computation if there is a context C from a class of interactions 𝒞 such that the behavior of C[P] cannot be simulated by P in the high-level model.
  5. SEEC (https://github.com/GaloisInc/SEEC)

    • Framework for modeling systems and Synthesizing Evidence of Emergent Computations
    • SEEC is a domain-specific language for:
      • writing interpreters for models and compilers
      • writing queries to identify vulnerabilities
    • Builds on Rosette for symbolic execution
      • solver-aided programming DSL in Racket
      • verification and synthesis using Counter-Example Guided Inductive Synthesis (CEGIS)
  6. SEEC Workflow

    [Diagram: SEEC workflow]
    • Inputs: executable models of source and target languages and compiler; queries written in the query DSL
    • Outputs: evidence of insecure compilation (counter-examples), or bounded validation of secure compilation
    • Counter-examples lead to a revised compiler, implementation, or specification
  7. Levels of evidence

    Higher levels of evidence intuitively represent “more dangerous” kinds of compiler insecurity because they give more flexibility to the adversary.

    1. A program and a context that have changed behavior under compilation
    2. A program and a context that demonstrate emergent computation (any behavior not present at the source level)
    3. A program and context that implement a specific emergent behavior (e.g., boolean AND, OR, etc.)
    4. A program and collection of contexts that implement a set of emergent behaviors that can be composed in sequence (e.g., NAND = NOT ∘ AND)
  8. Running example: Set API

  9. Modeling sets

    [Diagram: candidate models placed on an axis from more abstract to more concrete]
    • pure sets, nondeterministic select
    • sets as hash tables, eager select
    • sets as lists, nondeterministic select [abstract]
    • sets as lists, eager select [concrete]
  10. A simple set API

        (define-grammar set-api
          (set         ::= list<integer>)
          (method      ::= (insert integer) (remove integer) (member? integer) select)
          (interaction ::= list<method>))
  11. Set API - abstract semantics

        (define (abstract-interpret interaction state)
          (match interaction
            ...
            [(set-api (cons (insert v:value) r:interaction))
             (abstract-interpret r (abstract-insert state v))]
            ...))

        ; Inserting first removes any existing copy, so the list never holds duplicates.
        (define (abstract-insert s v)
          (set-api (cons ,v ,(abstract-remove s v))))
  12. Set API - abstract semantics

        (define (abstract-interpret interaction state)
          (match interaction
            ...
            [(set-api (cons select r:interaction))
             (set-api (cons ,(abstract-select state)
                            ,(abstract-interpret r state)))]
            ...))

        ; select may return any element: each step nondeterministically stops or recurses.
        (define (abstract-select s)
          (match s
            [(set-api nil) #f]
            [(set-api (cons x:value nil)) x]
            [(set-api (cons x:value r:vallist))
             (if (nondet!) x (abstract-select r))]))
  13. Set API - concrete semantics

        (define (concrete-interpret interaction state)
          (match interaction
            [(set-api nil) (set-api nil)]
            [(set-api (cons (insert v:value) r:interaction))
             (concrete-interpret r (concrete-insert state v))]
            [(set-api (cons select r:interaction))
             (set-api (cons ,(concrete-select state)
                            ,(concrete-interpret r state)))]
            ...))

        ; insert conses unconditionally, so duplicates are possible.
        (define (concrete-insert s v)
          (set-api (cons ,v ,s)))

        ; select is eager: it always returns the head of the list.
        (define (concrete-select s)
          (match s
            [(set-api nil) (set-api #f)]
            [(set-api (cons x:value r:vallist)) x]))
  14. Set API - compiler

        (define-language abstract
          #:grammar set-api
          #:expression interaction #:size 4
          #:context set #:size 2
          #:link snoc
          #:evaluate (uncurry abstract-interpret))

        (define-language concrete
          #:grammar set-api
          #:expression interaction #:size 4
          #:context set #:size 2
          #:link snoc
          #:evaluate (uncurry concrete-interpret))

        ; Very simple compiler just swaps implementations
        (define-compiler abstract-to-concrete
          #:source abstract
          #:target concrete
          #:behavior-relation equal?
          #:context-relation equal?
          #:compile id)
  15. Level 1: A program and a context that have changed behavior under compilation

    ∃ e, ctx. behavior(ctx[e]) ≠ behavior(compile(ctx)[compile(e)])

        > (find-changed-component abstract-to-concrete)
        The following expression...
        ((insert -1) ((insert 5) (select)))
        ...has behavior...
        ((ite nondet$9 5 -1))
        ...in the following source-level context...
        *null*
        ====
        Compiles to...
        ((insert -1) ((insert 5) (select)))
        ...with emergent behavior...
        (5)
        ...in the following target-level context...
        *null*
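The level 1 query can be imitated by brute force on a tiny model. The sketch below is a plain-Python approximation of the two set semantics from the slides (it does not use SEEC or Rosette); it assumes an empty initial context, represents the nondeterministic source model by the set of all traces it allows, and uses `None` where the slides return `#f`. All function names are hypothetical.

```python
from itertools import product

VALUES = (0, 1)
METHODS = [(op, v) for op in ("insert", "remove", "member?") for v in VALUES] + [("select",)]


def abstract_behaviors(interaction, state=()):
    """Every output trace the nondeterministic abstract model allows."""
    if not interaction:
        return {()}
    (op, *args), rest = interaction[0], interaction[1:]
    if op == "insert":  # drop any old copy first, so the list has no duplicates
        return abstract_behaviors(rest, (args[0],) + tuple(x for x in state if x != args[0]))
    if op == "remove":
        return abstract_behaviors(rest, tuple(x for x in state if x != args[0]))
    if op == "member?":
        outs = {args[0] in state}
    else:  # select: any element, or None on the empty set
        outs = set(state) or {None}
    return {(o,) + t for o in outs for t in abstract_behaviors(rest, state)}


def concrete_behavior(interaction, state=()):
    """The single output trace of the list-backed model with eager select."""
    out = []
    for op, *args in interaction:
        if op == "insert":
            state = (args[0],) + state
        elif op == "remove" and args[0] in state:  # removes the first occurrence only
            i = state.index(args[0])
            state = state[:i] + state[i + 1:]
        elif op == "member?":
            out.append(args[0] in state)
        elif op == "select":
            out.append(state[0] if state else None)
    return tuple(out)


def find_changed_component(max_len=3):
    """Shortest interaction whose source and target behaviors differ, or None."""
    for n in range(1, max_len + 1):
        for e in product(METHODS, repeat=n):
            src, tgt = abstract_behaviors(e), concrete_behavior(e)
            if src != {tgt}:
                return e, src, tgt
    return None


slide_expr = (("insert", -1), ("insert", 5), ("select",))
print(abstract_behaviors(slide_expr), concrete_behavior(slide_expr))
print(find_changed_component())
```

On the slide's expression the source model allows either 5 or -1 while the target always yields 5, which is a changed behavior but not yet an emergent one, since 5 is still a permitted source result.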
  16. Level 2: A program and a context that demonstrate emergent computation (any behavior not present at the source level)

    ∃ e, ctx_t. ∀ ctx_s. behavior(ctx_s[e]) ≠ behavior(ctx_t[compile(e)])

        > (find-weird-component abstract-to-concrete #:source-expression-size 5)
        The following expression...
        ((insert -8) ((insert -8) ((remove -8) (select))))
        ...has emergent behavior...
        (-8)
        ...witnessed by the following target-level context...
        *null*
  17. Set API - concrete semantics

        (define (concrete-insert s v)
          (set-api (cons ,v ,s)))

        ; Stops after the first match, so duplicates left by concrete-insert survive.
        (define (concrete-remove s v)
          (match s
            [(set-api nil) (set-api nil)]
            [(set-api (cons x:value r:vallist))
             (if (equal? x v)
                 r
                 (set-api (cons ,x ,(concrete-remove r v))))]))
  18. Set API - concrete semantics

        (define (concrete-insert s v)
          (set-api (cons ,v ,s)))

        ; Fix: on a match, keep removing from the rest of the list
        ; (the matching branch was previously just r).
        (define (concrete-remove s v)
          (match s
            [(set-api nil) (set-api nil)]
            [(set-api (cons x:value r:vallist))
             (if (equal? x v)
                 (concrete-remove r v)
                 (set-api (cons ,x ,(concrete-remove r v))))]))
  19. Level 2: A program and a context that demonstrate emergent computation (any behavior not present at the source level)

    ∃ e, ctx_t. ∀ ctx_s. behavior(ctx_s[e]) ≠ behavior(ctx_t[compile(e)])

        > (find-weird-component abstract-to-concrete #:source-expression-size 5)
        No weird behavior found
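The level 2 query asks for a target behavior that no source context can reproduce. The sketch below is a brute-force Python approximation, not SEEC itself: it assumes a two-value domain, initial contexts of at most two elements on both sides, and `None` in place of `#f`. The names `remove_first` and `remove_all` are hypothetical labels for the buggy and fixed versions of `concrete-remove`.

```python
from itertools import product

VALUES = (0, 1)
METHODS = [(op, v) for op in ("insert", "remove", "member?") for v in VALUES] + [("select",)]
CONTEXTS = [c for n in range(3) for c in product(VALUES, repeat=n)]  # initial sets, size <= 2


def remove_all(state, v):
    return tuple(x for x in state if x != v)


def remove_first(state, v):
    if v not in state:
        return state
    i = state.index(v)
    return state[:i] + state[i + 1:]


def abstract_behaviors(interaction, state):
    """Every output trace the nondeterministic abstract model allows."""
    if not interaction:
        return {()}
    (op, *args), rest = interaction[0], interaction[1:]
    if op == "insert":
        return abstract_behaviors(rest, (args[0],) + remove_all(state, args[0]))
    if op == "remove":
        return abstract_behaviors(rest, remove_all(state, args[0]))
    outs = {args[0] in state} if op == "member?" else (set(state) or {None})
    return {(o,) + t for o in outs for t in abstract_behaviors(rest, state)}


def concrete_behavior(interaction, state, remove):
    """Output trace of the list-backed model, parameterised by its remove function."""
    out = []
    for op, *args in interaction:
        if op == "insert":
            state = (args[0],) + state
        elif op == "remove":
            state = remove(state, args[0])
        elif op == "member?":
            out.append(args[0] in state)
        else:
            out.append(state[0] if state else None)
    return tuple(out)


def find_weird_component(remove, max_len=3):
    """Find (e, ctx_t, behavior) such that no source context yields that behavior."""
    for n in range(1, max_len + 1):
        for e in product(METHODS, repeat=n):
            for ctx_t in CONTEXTS:
                tgt = concrete_behavior(e, ctx_t, remove)
                if all(tgt not in abstract_behaviors(e, ctx_s) for ctx_s in CONTEXTS):
                    return e, ctx_t, tgt
    return None


print(find_weird_component(remove_first))  # buggy remove: a witness exists
print(find_weird_component(remove_all))    # fixed remove: None within these bounds
```

As with the SEEC query, a `None` result is only bounded validation: it says nothing about interactions or contexts larger than the ones enumerated.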
  20. Level 3: A program and context that implement a specific emergent behavior (e.g., boolean AND, OR, etc.)

    ∃ g. ∀ ctx. specification(g, ctx, behavior(ctx[g]))

        (define (add1? prog res-set)
          (let ([init-set (cdr prog)])
            (equal? (seec-length res-set)
                    (+ 1 (seec-length init-set)))))

        > (find-gadget concrete add1?)
        Synthesized a gadget
        ==Expression==
        (insert 0)
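The `add1?` query can be approximated by enumerating single methods and checking the specification against every small context. This is a Python sketch under stated assumptions (two-value domain, contexts of at most two elements, one-method gadgets, and a specification over the resulting state); `concrete_step`, `abstract_step`, and `find_gadget` are hypothetical names, not SEEC functions.

```python
from itertools import product

VALUES = (0, 1)
METHODS = [(op, v) for op in ("insert", "remove", "member?") for v in VALUES] + [("select",)]
CONTEXTS = [c for n in range(3) for c in product(VALUES, repeat=n)]


def concrete_step(state, method):
    """List-backed model: insert conses unconditionally."""
    op, *args = method
    if op == "insert":
        return (args[0],) + state
    if op == "remove":
        return tuple(x for x in state if x != args[0])
    return state  # member? and select only observe the state


def abstract_step(state, method):
    """Set model: insert first removes any existing copy."""
    op, *args = method
    if op in ("insert", "remove"):
        rest = tuple(x for x in state if x != args[0])
        return (args[0],) + rest if op == "insert" else rest
    return state


def add1(ctx, result):
    """Specification: the resulting state is exactly one element longer."""
    return len(result) == len(ctx) + 1


def find_gadget(step, spec):
    """First method satisfying spec in every context, or None."""
    for g in METHODS:
        if all(spec(ctx, step(ctx, g)) for ctx in CONTEXTS):
            return g
    return None


print(find_gadget(concrete_step, add1))
print(find_gadget(abstract_step, add1))
```

The gadget exists only in the concrete model: in the set model, inserting an element that is already present leaves the size unchanged, so no single method grows every context.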
  21. Level 4: A program and collection of contexts that implement a set of emergent behaviors that can be composed

    ∃ ⅅ, g_1, …, g_n. ∀ ctx, i. ⅅ(ctx[g_i]) = specification_i(ⅅ(ctx))

    Can we find a “decoder” that interprets a set as a boolean and gadgets implementing f_id = (λ x . x) and f_not = (λ x . not x)?

        > (find-related-gadgets concrete obs-int bool-funs)
        Decoder: (member? -6)
        Gadget 0: (select)
        Gadget 1: (if (member? -6) (remove -6) (insert -6))
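The decoder and gadgets reported on this slide can be checked against the level 4 condition by exhaustive enumeration. The Python sketch below verifies rather than synthesizes; it assumes the fixed `concrete-remove` (all occurrences removed), a three-value domain, and contexts of at most three elements. The function names are hypothetical.

```python
from itertools import product

KEY = -6
CONTEXTS = [c for n in range(4) for c in product((KEY, 0, 1), repeat=n)]


def insert(state, v):
    return (v,) + state


def remove(state, v):  # fixed version: removes every occurrence
    return tuple(x for x in state if x != v)


def decoder(state):
    """Decoder (member? -6): reads the set as a boolean."""
    return KEY in state


def gadget_id(state):
    """Gadget 0, (select): observes the set without changing it."""
    return state


def gadget_not(state):
    """Gadget 1: (if (member? -6) (remove -6) (insert -6))."""
    return remove(state, KEY) if decoder(state) else insert(state, KEY)


SPECS = [(gadget_id, lambda b: b), (gadget_not, lambda b: not b)]


def composable(gadgets_with_specs):
    """Check decoder(g(s)) == f(decoder(s)) for every gadget and context."""
    return all(decoder(g(s)) == f(decoder(s))
               for g, f in gadgets_with_specs for s in CONTEXTS)


print(composable(SPECS))
# Gadgets compose: applying NOT twice decodes to the original boolean.
print(all(decoder(gadget_not(gadget_not(s))) == decoder(s) for s in CONTEXTS))
```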
  22. Case Studies

  23. CS1: printf-oriented programming

    • Emergent computations admitted by malformed calls to the C printf API
    • Powerful interface: can read and write memory
    • Implementations enable additional emergent computation:
      • Access stack variables not passed as parameters
      • Read and write variables at inconsistent types
    • Goal: demonstrate that SEEC can synthesize examples corresponding to hand-crafted printf gadgets
    • Outcome: synthesized gadgets for boolean operations
  24. CS2: API design

    • Goals:
      • Explore SEEC-assisted workflow for designing an implementation of a specification
      • Experiment with synthesizing composable gadgets
    • Outcome:
      • Synthesis identified interesting behaviors in buggy implementations of a set API
      • Demonstrated use of relational verification to synthesize composable gadgets
  25. CS3: Data-Oriented Programming

    • Language-level vulnerability classes are a critical application of tools for reasoning about emergent computation and mitigations
    • “Data-oriented programming,” which relies on existing control flow, is particularly interesting because emergent computations are program-specific
    • Goals:
      • Demonstrate applicability of SEEC to complex semantics and compilers
      • Explore synthesis of alternative emergent computational models
  26. CS3: Data-Oriented Programming

    [Diagram: a component P linked with a context C, shown once in the source language and once in the target language]

    TinyC (source language):
    • C-like language with variables, pointers, & procedures
    • Memory-safety enforced by semantics
    • Contexts: input command writes user input to memory object

    TinyA (target language):
    • Unstructured control flow
    • Allocations are integer offsets in linear heap
    • input command writes unbounded number of bytes to supplied address

    Tiny Compiler:
    • Compiles structured to unstructured control flow (JMP/CALL/RET)
    • Introduces explicit allocation of stack frames
  27. CS3: Data-Oriented Programming

        void main (int password) {
          int candidate;
          int auth;
          int secret;
          auth = 0;
          secret = 42;
          input(& candidate);
          if (candidate = password) {
            auth = 1;
          }
          guarded-fun(auth, secret);
        }

        void guarded-fun (int auth, int secret) {
          if (auth) {
            output(secret);
          }
        }

    • auth and secret stored on stack after candidate
    • Unchecked write by input can corrupt auth or both auth and secret
    • Outcome: attacks that:
      • bypass authentication
      • overwrite secret
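The two attacks can be reproduced on a toy model of the compiled stack frame. The Python sketch below is an illustration, not the TinyA semantics: it assumes a three-slot frame laid out as candidate, auth, secret, treats the comparison in `main` as an equality test, and models `input` as copying every supplied word upward from the address of `candidate` with no bounds check. The names `run_main` and `FRAME_SLOTS` are hypothetical.

```python
FRAME_SLOTS = 3  # assumed layout: index 0 = candidate, 1 = auth, 2 = secret


def run_main(password, user_input):
    """Simulate main; returns the value passed to output, or None if nothing is output."""
    frame = [0] * FRAME_SLOTS
    frame[1] = 0   # auth = 0
    frame[2] = 42  # secret = 42
    # Target-level input: writes all supplied words starting at &candidate.
    # Words beyond the modelled frame are ignored by this sketch.
    for offset, word in enumerate(user_input[:FRAME_SLOTS]):
        frame[offset] = word
    if frame[0] == password:
        frame[1] = 1
    # guarded-fun(auth, secret)
    return frame[2] if frame[1] else None


PASSWORD = 99
print(run_main(PASSWORD, [0]))        # wrong password, one word: no output
print(run_main(PASSWORD, [0, 1]))     # second word overwrites auth: secret leaks
print(run_main(PASSWORD, [0, 1, 7]))  # third word overwrites secret as well
```

Neither attack changes control flow: the program follows its intended path with corrupted data, which is what distinguishes data-oriented programming from control-flow hijacking.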
  28. CS4: Heap allocator

    • Heap allocators have witnessed a long arms race between exploits and mitigations
    • Pragmatic concerns lead to complex data structures and algorithms
    • Difficult to reason carefully about potential weaknesses
    • Goal: show utility of SEEC for iterative design of a secure allocator through design → detect → mitigate cycle
  29. CS4: Heap allocator

    [Diagram: a component P linked with a context C, shown once in the high-level model and once in the low-level model]

    Heap allocator specification (high-level model):
    • Abstract operations: malloc, free, read, write
    • “Typed” pointers with base and offset

    Concrete heap allocator (low-level model):
    • Explicit free-list management
    • In-line metadata: allocated bit, size, free-list next and previous

    “Compiler”:
    • Refinement relation between abstract model and compatible concrete heaps
    • Encodes permissible non-determinism of heap layouts and free-list organization
  30. CS4: Heap allocator

    • Outcome: Synthesized attacks that enable:
      • Resizing a block (overwrite size metadata)
      • Forcing the address of the next allocation
    • Demonstrated power of composition:
      • Synthesizing next-alloc directly takes over an hour
      • Instead, next-alloc can be composed from:
        • find-head: (alloc 1), (copy 1 2), (free 1)
        • insert-in-freelist: (write <target> 2), (alloc 1)
      • Synthesized in seconds!
    • Iterative development uncovered interesting corner cases such as dangling or forged pointers
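Forcing the address of the next allocation can be illustrated with a much simpler allocator than the one in the case study. The Python sketch below assumes fixed-size blocks, a singly linked free list whose next pointer is stored in-line in the first word of each free block, and a `write` that performs no liveness check. `ToyHeap` is hypothetical and does not reproduce the SEEC heap model or its gadget syntax.

```python
class ToyHeap:
    """Fixed-size blocks; a free block's first word holds the next free address (-1 = end)."""

    BLOCK = 2

    def __init__(self, n_blocks):
        self.mem = [0] * (n_blocks * self.BLOCK)
        self.head = None
        # Free blocks from the highest address down so the lowest ends up at the head.
        for addr in reversed(range(0, len(self.mem), self.BLOCK)):
            self.free(addr)

    def free(self, addr):
        self.mem[addr] = -1 if self.head is None else self.head
        self.head = addr

    def malloc(self):
        if self.head is None:
            raise MemoryError("out of blocks")
        addr = self.head
        nxt = self.mem[addr]
        self.head = None if nxt == -1 else nxt
        return addr

    def write(self, addr, value):
        self.mem[addr] = value  # no check that addr belongs to a live block


def force_next_alloc(target):
    heap = ToyHeap(4)
    p = heap.malloc()       # take the head of the free list
    heap.free(p)            # p is the head again, but the caller still holds it
    heap.write(p, target)   # dangling write corrupts the in-line next pointer
    heap.malloc()           # returns p and installs target as the new head
    return heap.malloc()    # attacker-chosen address


print(force_next_alloc(6))
```

Without the dangling write the final `malloc` would return address 2; the attack works because the allocator trusts metadata stored in memory the program can still reach.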
  31. Conclusion

  32. Lessons learned

    • Having an oracle of unexpected behaviors is powerful!
    • But scalability is a huge challenge!
      • State-space explosion under symbolic execution is difficult to control and debug
      • For larger systems, deductive symbolic execution + CEGIS alone may be too blunt an instrument
      • Found partially-concrete testing very valuable
      • As was “sketching” emergent computations
    • But exhaustive checking of small models is informative!
  33. Next steps

    • Applications to more realistic systems
    • Novel counter-example search strategies based on secure compilation proof techniques such as back-translation
    • Automatic identification of composable gadget collections
    • Let us help secure your next compiler or system!

    SEEC is open source at https://github.com/GaloisInc/SEEC
    Reach out to us at {scott,jpaykin,olivier}@galois.com
  34. Backup

  35. Formalizing gadget composition

    • Given:
      • a set of composable target operations f_i : τ_i → τ_i′
      • a set of decoders ⅅ_τ : State → τ
    • a set of gadgets g_i is composable if:
      ∀ s, ∀ i. ⅅ_{τ_i′}(g_i(s)) = f_i(ⅅ_{τ_i}(s))
    • The set of decoders is a “natural transformation” from the gadgets to the target computational environment

    [Diagram: commuting squares. Top row: states s, s, s connected by gadgets g and g′, with composite g;g′. Bottom row: τ, τ′, τ″ connected by f and f′, with composite f;f′. Vertical arrows: decoders ⅅ, ⅅ′, ⅅ″.]
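The composability condition is what makes the outer rectangle of the diagram commute. As a short worked step, writing ⅅ, ⅅ′, ⅅ″ for the decoders at τ, τ′, τ″ and assuming g and g′ each satisfy the condition for f and f′ respectively, the composite gadget g;g′ decodes to the composite operation f;f′:

```latex
\begin{align*}
\mathbb{D}''\bigl(g'(g(s))\bigr)
  &= f'\bigl(\mathbb{D}'(g(s))\bigr)
     && \text{(composability of } g' \text{ at the state } g(s)\text{)} \\
  &= f'\bigl(f(\mathbb{D}(s))\bigr)
     && \text{(composability of } g \text{ at the state } s\text{)}
\end{align*}
```

The first step relies on the condition holding for all states, including states that are only reachable by running another gadget first.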