TMPA-2017: Vellvm - Verifying the LLVM

TMPA-2017: Tools and Methods of Program Analysis
3-4 March, 2017, Hotel Holiday Inn Moscow Vinogradovo, Moscow
Vellvm - Verifying the LLVM
Steve Zdancewic (Professor, University of Pennsylvania, USA)

For video follow the link: https://youtu.be/jDPAtUfnoBU

Would like to know more?
Visit our website:
www.tmpaconf.org
www.exactprosystems.com/events/tmpa

Follow us:
https://www.linkedin.com/company/exactpro-systems-llc?trk=biz-companies-cym
https://twitter.com/exactpro

Exactpro

March 23, 2017
Transcript

  1. Collaborators
     •  Jianzhou Zhao
     •  Dmitri Garbuzov
     •  William Mansky
     •  Christine Rizkallah
     •  Richard Zhang
     •  Milo M.K. Martin
     •  Santosh Nagarakatte
     •  Gil Hur
     •  Jeehoon Kang
     •  Viktor Vafeiadis
  2. The Science of Deep Specification
     •  Andrew Appel (Princeton)
     •  Adam Chlipala (MIT)
     •  Zhong Shao (Yale)
     •  Benjamin Pierce (U. Penn.)
     •  Stephanie Weirich (U. Penn.)
  3. Deep Specifications
     •  Rich – expressive description
     •  Formal – mathematical, machine-checked
     •  2-Sided – tested from both sides
     •  Live – connected to real, executable code
     Goal: Advance the reliability, safety, security, and cost-effectiveness of software (and hardware).
  4. The Coq Interactive Theorem Prover
     •  Based on dependent type theory
     •  Pure functional language + datatypes
     •  Constructive proofs ⇒ executable code
     •  Automation: tactics + inference
     ⇒ formalization tool of choice for DeepSpec team
     [Developed at INRIA]
  5. Motivation: SoftBound/CETS
     •  Buffer overflow vulnerabilities.
     •  Detect spatial/temporal memory safety violations in legacy C code.
     •  Implemented as an LLVM pass.
     •  What about correctness?
     [Nagarakatte, et al. PLDI ’09, ISMM ’10] http://www.cis.upenn.edu/acg/softbound/
  6. Inspiration: CompCert
     [Xavier Leroy, INRIA Rocquencourt]
     Optimizing C Compiler: proved correct end-to-end with machine-checked proof in Coq
     Diagram labels: C language; CompCert Compiler; ISA
     rich, formal, 2-sided, live
  7. Does Such Verification Work?
     Csmith Tool: random test-case generation [Yang et al. PLDI 2011]
     Diagram labels: Source Programs; LLVM; {8 other C compilers} + CompCert
     Figures shown: 79 bugs (25 critical); 202 bugs; 325 bugs in total
  8. YES! Verification Works
     "The striking thing about our CompCert results is that the middle-end bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible benefits for compiler users." – Regehr et al. 2011
  9. The Vellvm Project
     Diagram labels: Typed SSA IR; Analysis; Optimizations/Transformations
     •  Formal semantics
     •  Facilities for creating simulation proofs
     •  Implemented in Coq
     •  Extract passes for use with LLVM compiler
     •  Example: verified memory safety instrumentation
     [Zhao et al. POPL 2012, CPP 2012, PLDI 2013]
  10. Vellvm Framework
      LLVM pipeline labels: C Source Code; LLVM IR; Transform; Other Optimizations; LLVM IR; Target
      Bridge labels: LLVM OCaml Bindings; Printer; Parser; Extract
      Coq: Syntax; Operational Semantics; Memory Model; Type System and SSA; Proof Techniques & Metatheory
  11. Vellvm Framework
      LLVM pipeline labels: C Source Code; LLVM IR; Verified Transform; Other Optimizations; LLVM IR; Target
      Bridge labels: LLVM OCaml Bindings; Printer; Parser; Extract
      Coq: Syntax; Operational Semantics; Memory Model; Type System and SSA; Proof Techniques & Metatheory
  12. Plan
      •  Tour of the LLVM IR
      •  Vellvm infrastructure
         – Operational Semantics
         – SSA Metatheory + Proof Techniques
      •  Case studies:
         – SoftBound memory safety
         – mem2reg
      •  Conclusion
  13. LLVM IR by Example
      Control-flow Graphs:
        + Labeled blocks
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
      loop:
        r3 = ...
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
      exit:
        r7 = ...
        r8 = r1 x r2
        r9 = r7 + r8
  14. LLVM IR by Example
      Control-flow Graphs:
        + Labeled blocks
        + Binary Operations
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
      loop:
        r3 = ...
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
      exit:
        r7 = ...
        r8 = r1 x r2
        r9 = r7 + r8
  15. LLVM IR by Example
      Control-flow Graphs:
        + Labeled blocks
        + Binary Operations
        + Branches/Return
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
        br r0 loop exit
      loop:
        r3 = ...
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = ...
        r8 = r1 x r2
        r9 = r7 + r8
        ret r9
  16. LLVM IR by Example
      Control-flow Graphs:
        + Labeled blocks
        + Binary Operations
        + Branches/Return
        + Static Single Assignment (each variable assigned only once, statically)
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
        br r0 loop exit
      loop:
        r3 = ...
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = ...
        r8 = r1 x r2
        r9 = r7 + r8
        ret r9
  17. LLVM IR by Example
      Control-flow Graphs:
        + Labeled blocks
        + Binary Operations
        + Branches/Return
        + Static Single Assignment
        + φ nodes
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
        br r0 loop exit
      loop:
        r3 = φ[0;entry][r5;loop]
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = φ[0;entry][r5;loop]
        r8 = r1 x r2
        r9 = r7 + r8
        ret r9
  18. LLVM IR by Example
      Control-flow Graphs:
        + Labeled blocks
        + Binary Operations
        + Branches/Return
        + Static Single Assignment
        + φ nodes (choose values based on predecessor blocks)
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
        br r0 loop exit
      loop:
        r3 = φ[0;entry][r5;loop]
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = φ[0;entry][r5;loop]
        r8 = r1 x r2
        r9 = r7 + r8
        ret r9
  19. (Unoptimized) LLVM IR Code
      example.c:
        unsigned factorial(unsigned n) {
          unsigned acc = 1;
          while (n > 0) {
            acc = acc * n;
            n = n - 1;
          }
          return acc;
        }
      example.ll:
        define i32 @factorial(i32 %n) nounwind uwtable ssp {
        entry:
          %1 = alloca i32, align 4
          %acc = alloca i32, align 4
          store i32 %n, i32* %1, align 4
          store i32 1, i32* %acc, align 4
          br label %start
        start:                          ; preds = %entry, %then
          %3 = load i32* %1, align 4
          %4 = icmp ugt i32 %3, 0
          br i1 %4, label %then, label %else
        then:                           ; preds = %start
          %6 = load i32* %acc, align 4
          %7 = load i32* %1, align 4
          %8 = mul i32 %6, %7
          store i32 %8, i32* %acc, align 4
          %9 = load i32* %1, align 4
          %10 = sub i32 %9, 1
          store i32 %10, i32* %1, align 4
          br label %start
        else:                           ; preds = %start
          %12 = load i32* %acc, align 4
          ret i32 %12
        }
  20. Other Parts of the LLVM IR
      op ::= %uid | constant | undef                     Operands
      bop ::= add | sub | mul | shl | …                  Operations
      cmpop ::= eq | ne | slt | sle | …                  Comparison
      insn ::=
        | %uid = alloca ty                               Stack Allocation
        | %uid = load ty op1                             Load
        | store ty op1, op2                              Store
        | %uid = getelementptr ty op1 …                  Address Calculation
        | %uid = call rt fun(…args…)                     Function Calls
        | …
      phi ::=
        | φ[op1;lbl1]...[opn;lbln]
      terminator ::=
        | ret %ty op
        | br op label %lbl1, label %lbl2
        | br label %lbl
  21. Plan
      •  Tour of the LLVM IR
      •  Vellvm infrastructure
         – Operational Semantics
         – SSA Metatheory + Proof Techniques
      •  Case studies:
         – SoftBound memory safety
         – mem2reg
      •  Conclusion
  22. LLVM IR Semantics
      SSA CFG ≈ functional program [Appel 1998]
      We know how to model this and prove properties about the models.
      +
      •  Types & Memory Layout
         –  structured, recursive types
         –  type-directed projection
         –  type casts
      •  Effects
         –  structured heap load/store
         –  system calls (I/O)
         –  nondeterminism
  23. LLVM’s memory model
      •  Manipulate structured types.
      %ST = type {i10,[10 x i8*]}
      High-level Representation: i10 | i8* i8* i8* i8* i8* i8* i8* i8* i8* i8*
      %val = load %ST* %ptr
      …
      store %ST* %ptr, %new
  24. LLVM’s memory model
      •  Manipulate structured types.
      •  Semantics is given in terms of byte-oriented low-level memory.
         –  padding & alignment
         –  physical subtyping
      %ST = type {i10,[10 x i8*]}
      High-level Representation: i10 | i8* i8* i8* i8* i8* i8* i8* i8* i8* i8*
      Low-level Representation (index: cell):
         0: b(10, 136)
         1: b(10, 2)
         2: uninit
         3: uninit
         4: ptr(Blk32,0,0)
         5: ptr(Blk32,0,1)
         6: ptr(Blk32,0,2)
         7: ptr(Blk32,0,3)
         8: ptr(Blk32,8,0)
         9: ptr(Blk32,8,1)
        10: ptr(Blk32,8,2)
        11: ptr(Blk32,8,3)
        12: …
      %val = load %ST* %ptr
      …
      store %ST* %ptr, %new
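The byte-oriented encoding on this slide can be illustrated with a short Python sketch. It is not Vellvm's memory model: the cell tuples, the helper names `encode_int`, `pad_to` and `decode_int`, the little-endian byte order, the 4-cell alignment, and the sample value 648 (chosen because it encodes to the cells `b(10, 136)`, `b(10, 2)` shown in the figure) are all assumptions made for illustration.

```python
# Illustrative sketch only: memory cells are modelled as tuples, loosely
# following the figure's notation. All names here are hypothetical.

def encode_int(bits, value):
    """Split an integer of width `bits` into byte cells ("b", bits, byte).

    Assumes little-endian order: the low byte comes first.
    """
    nbytes = (bits + 7) // 8
    return [("b", bits, (value >> (8 * i)) & 0xFF) for i in range(nbytes)]


def pad_to(cells, align):
    """Append ("uninit",) padding cells until len(cells) is a multiple of align."""
    padding = (-len(cells)) % align
    return cells + [("uninit",)] * padding


def decode_int(bits, cells):
    """Read an integer of width `bits` back from the cells.

    Returns None (standing in for undef) if a needed cell is missing or
    was not written as a byte of an integer with the same width.
    """
    nbytes = (bits + 7) // 8
    if len(cells) < nbytes:
        return None
    value = 0
    for i, cell in enumerate(cells[:nbytes]):
        if cell[0] != "b" or cell[1] != bits:
            return None
        value |= cell[2] << (8 * i)
    return value


# An i10 field occupies two byte cells plus two cells of padding
# (assuming 4-cell alignment, as the figure suggests).
field = pad_to(encode_int(10, 648), 4)
print(field)
print(decode_int(10, field))  # reading back at the width that was written
print(decode_int(16, field))  # reading at a different width
```

Tagging each byte with the width of the value it came from is what lets a load detect, at run time, that it is reading cells at a type other than the one they were stored at.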
  25. Dynamic Physical Subtyping
      Figure labels: Blk0, Blk1, Blk32, i10
      Memory cells, first block shown (index: cell):
         0: b(10, 136)
         1: b(10, 2)
         2: uninit
         3: uninit
         4: ptr(Blk32,0,0)
         5: ptr(Blk32,0,1)
         6: ptr(Blk32,0,2)
         7: ptr(Blk32,0,3)
         8: ptr(Blk32,8,0)
         9: ptr(Blk32,8,1)
        10: ptr(Blk32,8,2)
        11: ptr(Blk32,8,3)
        12: …
      Memory cells, second block shown (index: cell):
         0: b(16, 1)
         1: b(16, 0)
         2: uninit
         3: uninit
         4: uninit
         5: uninit
         6: uninit
         7: uninit
         8: ptr(Blk1,0,0)
         9: ptr(Blk1,0,1)
        10: ptr(Blk1,0,2)
        11: ptr(Blk1,0,3)
        12: …
      load i16* ⇒ 1 ✓
      load i16* ⇒ undef ✗
      [Nita, et al. POPL ’08]
  26. Sources of Undefined Behavior
      Target-dependent Results (Nondeterminism):
      •  Uninitialized variables:
           %v = add i32 %x, undef
      •  Uninitialized memory:
           %ptr = alloca i32
           %v = load (i32*) %ptr
      •  Ill-typed memory usage
      Fatal Errors (Stuck States):
      •  Out-of-bounds accesses
      •  Access dangling pointers
      •  Free invalid pointers
      •  Invalid indirect calls
  27. Sources of Undefined Behavior
      Target-dependent Results (Nondeterminism):
      •  Uninitialized variables:
           %v = add i32 %x, undef
      •  Uninitialized memory:
           %ptr = alloca i32
           %v = load (i32*) %ptr
      •  Ill-typed memory usage
      Stuck States: defined by a predicate on the program configuration.
           Stuck(f, σ) = BadFree(f, σ) ∨ BadLoad(f, σ) ∨ BadStore(f, σ) ∨ … ∨ …
  28. undef
      •  What is the value of %y after running the following?
           %x = or i8 undef, 1
           %y = xor i8 %x, %x
      •  One plausible answer: 0
      •  Not LLVM’s semantics! (LLVM is more liberal to permit more aggressive optimizations)
  29. undef
      •  Partially defined values are interpreted nondeterministically as sets of possible values:
           %x = or i8 undef, 1
           %y = xor i8 %x, %x
           ⟦i8 undef⟧ = {0,…,255}
           ⟦i8 1⟧ = {1}
           ⟦%x⟧ = {a or b | a∈⟦i8 undef⟧, b∈⟦i8 1⟧} = {1,3,5,…,255}
           ⟦%y⟧ = {a xor b | a∈⟦%x⟧, b∈⟦%x⟧} = {0,2,4,…,254}
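The set-valued reading of `undef` can be checked mechanically. The following is a minimal Python sketch of that interpretation for the two-instruction example; the helper names `const` and `lift` are hypothetical and are not part of Vellvm or LLVM.

```python
# Sketch: interpret i8 values as sets of possible bytes.
UNDEF_I8 = set(range(256))  # [[i8 undef]] = {0, ..., 255}


def const(n):
    """A fully defined i8 constant denotes a singleton set."""
    return {n & 0xFF}


def lift(op, xs, ys):
    """Apply a binary operation to every pair of possible operand values."""
    return {op(a, b) & 0xFF for a in xs for b in ys}


# %x = or i8 undef, 1
x = lift(lambda a, b: a | b, UNDEF_I8, const(1))
# %y = xor i8 %x, %x   (each use of %x may pick a different value)
y = lift(lambda a, b: a ^ b, x, x)

print(sorted(x)[:4], "...", max(x))
print(sorted(y)[:4], "...", max(y))
```

Because the two operands of `xor` are drawn independently from the set for `%x`, the result is every even byte rather than only 0.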
  30. LLVM_ND Operational Semantics
      •  Define a transition relation: f ⊢ σ1 ⟼ σ2
         –  f is the program
         –  σ is the program state: pc, locals (δ), stack, heap
      •  Nondeterministic
         –  δ maps local %uids to sets.
         –  Step relation is nondeterministic
      •  Mostly straightforward (given the heap model)
         –  One wrinkle: phi-nodes executed atomically
  31. Deterministic Refinement
      Diagram axes: Small Step / Big Step; Nondeterministic / Deterministic
      LLVM_ND ∋ LLVM_D
      Instantiate ‘undef’ with default value (0 or null) ⇒ deterministic.
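As a small illustration of what "refinement" means here, the Python sketch below evaluates the earlier example (`%x = or i8 undef, 1` followed by `%y = xor i8 %x, %x`) twice: once with `undef` ranging over all bytes, and once with `undef` fixed to the default value 0. The function names `eval_nd` and `eval_det` are hypothetical; the point is only that the deterministic answer is one of the answers the nondeterministic semantics allows.

```python
def eval_nd(undef_choices):
    """Nondeterministic reading: return the set of possible values of %y."""
    xs = {(u | 1) & 0xFF for u in undef_choices}   # %x = or i8 undef, 1
    return {a ^ b for a in xs for b in xs}         # %y = xor i8 %x, %x


def eval_det(default=0):
    """Deterministic reading: undef is instantiated with a default value."""
    x = (default | 1) & 0xFF
    return x ^ x


possible = eval_nd(range(256))
chosen = eval_det()
print(chosen, chosen in possible)
```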
  32. Big-step Deterministic Refinements
      Diagram axes: Small Step / Big Step; Nondeterministic / Deterministic
      LLVM_ND ∋ LLVM_D ≈ LLVM_Interp
      Bisimulation up to “observable events”:
      •  external function calls
  33. Big-step Deterministic Refinements
      [Tristan, et al. POPL ’08, Tristan, et al. PLDI ’09]
      Diagram axes: Small Step / Big Step; Nondeterministic / Deterministic
      LLVM_ND ∋ LLVM_D; LLVM_Interp ≈ LLVM_D ≿ LLVM*_DFn ≿ LLVM*_DB
      Simulation up to “observable events”:
      •  useful for encapsulating behavior of function calls
      •  large step evaluation of basic blocks
  34. Plan
      •  Tour of the LLVM IR
      •  Vellvm infrastructure
         – Operational Semantics
         – SSA Metatheory + Proof Techniques
      •  Case studies:
         – SoftBound memory safety
         – mem2reg
      •  Conclusion
  35. Reasoning About LLVM Code
      How do we prove that a program transformation is correct with respect to the defined operational semantics?
      •  Safety Invariants (preservation and progress)
      •  Simulation techniques
  36. Key SSA Invariant
      entry:
        r0 = ...
        r1 = ...
        r2 = ...                        (definition of r2)
        br r0 loop exit
      loop:
        r3 = φ[0;entry][r5;loop]
        r4 = r1 x r2                    (use of r2)
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = φ[0;entry][r5;loop]
        r8 = r1 x r2                    (use of r2)
        r9 = r7 + r8
        ret r9
  37. Key SSA Invariant
      entry:
        r0 = ...
        r1 = ...
        r2 = ...                        (definition of r2)
        br r0 loop exit
      loop:
        r3 = φ[0;entry][r5;loop]
        r4 = r1 x r2                    (use of r2)
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = φ[0;entry][r5;loop]
        r8 = r1 x r2                    (use of r2)
        r9 = r7 + r8
        ret r9
      The definition of a variable must dominate its uses.
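To make the dominance invariant concrete, here is a minimal Python sketch that computes block-level dominators for the three-block CFG on the slide and checks that each definition's block dominates the blocks where it is used. It is a simplification and not Vellvm's formalization: it ignores instruction order inside a block, treats a φ operand as a use at the end of the corresponding predecessor block, and the names `dominators`, `defs`, `uses` and `defs_dominate_uses` are hypothetical.

```python
# CFG from the slide: entry branches to loop or exit; loop branches to loop or exit.
succs = {"entry": ["loop", "exit"], "loop": ["loop", "exit"], "exit": []}


def dominators(succs, entry):
    """Iterative dataflow: dom(b) = {b} united with the intersection of dom(p)
    over all predecessors p of b."""
    blocks = list(succs)
    preds = {b: [p for p in blocks if b in succs[p]] for b in blocks}
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            pred_doms = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[b]:
                dom[b], changed = new, True
    return dom


# Block in which each register is defined.
defs = {"r0": "entry", "r1": "entry", "r2": "entry",
        "r3": "loop", "r4": "loop", "r5": "loop", "r6": "loop",
        "r7": "exit", "r8": "exit", "r9": "exit"}

# (register, block where it is used). The phi operand r5 is recorded as a
# use at the end of its predecessor block, loop.
uses = [("r0", "entry"), ("r1", "loop"), ("r2", "loop"), ("r3", "loop"),
        ("r4", "loop"), ("r5", "loop"), ("r6", "loop"),
        ("r1", "exit"), ("r2", "exit"), ("r7", "exit"), ("r8", "exit"),
        ("r9", "exit")]


def defs_dominate_uses(defs, uses, dom):
    return all(defs[r] in dom[b] for r, b in uses)


dom = dominators(succs, "entry")
print(dom)
print(defs_dominate_uses(defs, uses, dom))
# A direct (non-phi) use of r5 in exit would break the invariant, because
# exit can be reached from entry without passing through loop.
print(defs_dominate_uses(defs, uses + [("r5", "exit")], dom))
```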
  38. Safety Properties
      •  A well-formed program never accesses undefined variables:
           If ⊢ f and f ⊢ σ0 ⟼* σ then σ is not stuck.
           ⊢ f             program f is well formed
           σ               program state
           f ⊢ σ ⟼* σ      evaluation of f
      •  Initialization: If ⊢ f then wf(f, σ0).
      •  Preservation: If ⊢ f and f ⊢ σ ⟼ σ’ and wf(f, σ) then wf(f, σ’)
      •  Progress: If ⊢ f and wf(f, σ) then f ⊢ σ ⟼ σ’
  39. Safety Properties
      •  A well-formed program never accesses undefined variables:
           If ⊢ f and f ⊢ σ0 ⟼* σ then σ is not stuck.
           ⊢ f             program f is well formed
           σ               program state
           f ⊢ σ ⟼* σ      evaluation of f
      •  Initialization: If ⊢ f then wf(f, σ0).
      •  Preservation: If ⊢ f and f ⊢ σ ⟼ σ’ and wf(f, σ) then wf(f, σ’)
      •  Progress: If ⊢ f and wf(f, σ) then done(f,σ) or stuck(f,σ) or f ⊢ σ ⟼ σ’
  40. Well-formed States
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
        br r0 loop exit
      loop:
        r3 = φ[0;entry][r5;loop]
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = φ[0;entry][r5;loop]
        r8 = r1 x r2
        r9 = r7 + r8
        ret r9
      State σ is:
        pc = program counter
        δ = local values
  41. Well-formed States
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
        br r0 loop exit
      loop:
        r3 = φ[0;entry][r5;loop]
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = φ[0;entry][r5;loop]
        r8 = r1 x r2
        r9 = r7 + r8
        ret r9
      State σ is:
        pc = program counter
        δ = local values
      sdom(f,pc) = variable defns. that strictly dominate pc.
  42. Well-formed States
      entry:
        r0 = ...
        r1 = ...
        r2 = ...
        br r0 loop exit
      loop:
        r3 = φ[0;entry][r5;loop]
        r4 = r1 x r2
        r5 = r3 + r4
        r6 = r5 ≥ 100
        br r6 loop exit
      exit:
        r7 = φ[0;entry][r5;loop]
        r8 = r1 x r2
        r9 = r7 + r8
        ret r9
      State σ contains:
        pc = program counter
        δ = local values
      sdom(f,pc) = variable defns. that strictly dominate pc.
      wf(f,σ) = ∀r∊sdom(f,pc). ∃v. δ(r) = ⎣v⎦
      “All variables in scope are initialized.”
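The well-formedness predicate can be sketched in a few lines of Python for the example function. This is an illustration under stated assumptions, not the Coq definition: a program counter is modelled as a hypothetical pair `(block, index)`, the block-level strict dominators are written out by hand for this CFG, and `None` stands for a variable with no value, so `δ(r) = ⎣v⎦` becomes "δ has a non-None entry for r".

```python
# Registers defined in each block, in instruction order (terminators omitted).
defs_in_block = {
    "entry": ["r0", "r1", "r2"],
    "loop": ["r3", "r4", "r5", "r6"],
    "exit": ["r7", "r8", "r9"],
}

# Blocks that strictly dominate each block, written out by hand for this CFG.
strict_block_doms = {"entry": [], "loop": ["entry"], "exit": ["entry"]}


def sdom(pc):
    """Definitions that strictly dominate pc = (block, index of next instruction)."""
    block, index = pc
    in_scope = set()
    for b in strict_block_doms[block]:
        in_scope.update(defs_in_block[b])
    # Definitions earlier in the same block also dominate pc.
    in_scope.update(defs_in_block[block][:index])
    return in_scope


def wf(pc, delta):
    """All variables in scope at pc have a value in delta."""
    return all(delta.get(r) is not None for r in sdom(pc))


# About to execute "r5 = r3 + r4" in block loop.
pc = ("loop", 2)
good = {"r0": 1, "r1": 2, "r2": 3, "r3": 0, "r4": 6}
bad = {"r0": 1, "r1": 2, "r2": 3, "r3": 0}  # r4 is in scope but missing

print(sorted(sdom(pc)))
print(wf(pc, good), wf(pc, bad))
```

Note that `r5` through `r9` are not required to have values at this point: only the definitions that strictly dominate the program counter are in scope.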
  43. Generalizing Safety
      •  Definition of wf:
           wf(f,(pc, δ)) = ∀r∊sdom(f,pc). ∃v. δ(r) = ⎣v⎦
      •  Generalize like this:
           wf(f,(pc, δ)) = P f (δ|sdom(f,pc))
           where P : Program ⟶ Locals ⟶ Prop
      •  Methodology: for a given P prove three theorems:
           Initialization(P)   Preservation(P)   Progress(P)
      Consider only variables in scope ⇒ P defined relative to the dominator tree of the CFG.
  44. Instantiating
      •  For usual safety:
           P_safety f δ = ∀r∊dom(δ). ∃v. δ(r) = ⎣v⎦
      •  For semantic properties:
           P_sem f δ = ∀r. f[r] = ⎣rhs⎦ ⇒ δ(r) = ⟦rhs⟧δ
      •  Useful for verifying correctness of:
         – code motion, dead variable elimination, common expression elimination, etc.
  45. Plan
      •  Tour of the LLVM IR
      •  Vellvm infrastructure
         – Operational Semantics
         – SSA Metatheory + Proof Techniques
      •  Case studies:
         – SoftBound memory safety
         – mem2reg
      •  Conclusion
  46. SoftBound
      Pipeline labels: C Source Code; LLVM IR; SoftBound; Other Optimizations; LLVM IR; Target
      •  Implemented as an LLVM pass.
      •  Detect spatial/temporal memory safety violations in legacy C code.
      •  Good test case:
         –  Safety Critical ⇒ Proof cost warranted
         –  Non-trivial Memory transformation
  47. SoftBound
      Pipeline labels: C Source Code; LLVM IR; SoftBound; Other Optimizations; LLVM IR; Target
      Before:
        %p = call malloc [10 x i8]
        %q = gep %p, i32 0, i32 255
        store i8 0, %q
      After:
        %p = call malloc [10 x i8]
        %p_base = gep %p, i32 0
        %p_bound = gep %p, i32 0, i32 10
        %q = gep %p, i32 0, i32 255
        %q_base = %p_base
        %q_bound = %p_bound
        assert %q_base <= %q /\ %q+1 < %q_bound
        store i8 0, %q
      •  Maintain base and bound for all pointers
      •  Propagate metadata on assignment
      •  Check that a pointer is within its bounds when being accessed
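The three instrumentation steps (maintain base and bound, propagate on assignment, check on access) can be sketched in Python as follows. This is a toy model, not SoftBound or its Coq formalization: the `Heap` class, `gep`, `checked_store` and `BoundsViolation` are hypothetical names, pointers are plain integers carried together with their metadata in a tuple, and the upper comparison is written as `p + size <= bound`, the usual form for a one-past-the-end bound, whereas the slide writes its assertion with a strict `<`.

```python
class BoundsViolation(Exception):
    """Raised where the instrumented program would abort."""


class Heap:
    def __init__(self):
        self.next_free = 0x1000
        self.bytes = {}

    def malloc(self, size):
        """Return (pointer, base, bound) for a fresh block of `size` bytes."""
        p = self.next_free
        self.next_free += size
        return p, p, p + size


def gep(ptr_with_meta, offset):
    """Pointer arithmetic: the address changes, the metadata is propagated."""
    p, base, bound = ptr_with_meta
    return p + offset, base, bound


def checked_store(heap, ptr_with_meta, value, size=1):
    """Check the access against base and bound before writing."""
    p, base, bound = ptr_with_meta
    if not (base <= p and p + size <= bound):
        raise BoundsViolation(f"store of {size} byte(s) at {p:#x} "
                              f"outside [{base:#x}, {bound:#x})")
    heap.bytes[p] = value


heap = Heap()
p = heap.malloc(10)              # %p = call malloc [10 x i8]
checked_store(heap, gep(p, 9), 0)    # last valid byte: allowed
try:
    checked_store(heap, gep(p, 255), 0)  # %q = gep %p, i32 0, i32 255
except BoundsViolation as err:
    print("caught:", err)
```

Keeping the metadata next to the pointer value in a tuple is only a convenience of this sketch.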
  48. Disjoint Metadata
      •  Maintain pointer bounds in a separate memory space.
      •  Key Invariant: Metadata cannot be corrupted by bounds violation.
      Figure labels: User memory (%p, %q, %i1, %i3, %i6); Disjoint metadata (%pbase, %pbound, %qbase, %qbound)
  49. Proving SoftBound Correct
      1.  Define SoftBound(f,σ) = (fs,σs)
          –  Transformation pass implemented in Coq.
      2.  Define predicate: MemoryViolation(f,σ)
      3.  Construct a non-standard operational semantics: f ⊢ σ ⟼_SB σ’
          –  Builds in safety invariants “by construction”:
             f ⊢ σ ⟼*_SB σ’ ⇒ ¬MemoryViolation(f,σ’)
      4.  Show that the instrumented code simulates the “correct” code:
             SoftBound(f,σ) = (fs,σs) ⇒ [f ⊢ σ ⟼*_SB σ’] ≿ [fs ⊢ σs ⟼* σ’s]
  50. Lessons About SoftBound
      •  Found several bugs in our C++ implementation
         – Interaction of undef, ‘null’, and metadata initialization.
      •  Simulation proofs suggested a redesign of SoftBound’s handling of stack pointers.
         – Use a “shadow stack”
         – Simplify the design/implementation
         – Significantly more robust (e.g. varargs)
  51. Competitive Runtime Overhead
      [Chart: runtime overhead, axis from 0% to 250%; legend entry: Extracted]
      The performance of extracted SoftBound is competitive with the non-verified original
  52. Plan
      •  Tour of the LLVM IR
      •  Vellvm infrastructure
         – Operational Semantics
         – SSA Metatheory + Proof Techniques
      •  Case studies:
         – SoftBound memory safety
         – mem2reg
      •  Conclusion
  53. mem2reg in LLVM
      Front-ends w/o SSA construction
      The LLVM IR w/o φ-nodes
      •  imperative variables ⇒ stack allocas
      •  no φ-nodes
      •  trivially in SSA form
      mem2reg
      •  Promote stack allocas to temporaries
      •  Insert minimal φ-nodes
      The LLVM IR in the minimal SSA form
      SSA-based optimizations
      Backends
  54. mem2reg Example
      int x = 0;
      if (y > 0)
        x = 1;
      return x;

      The LLVM IR in the trivial SSA form:
      l1:
        %p = alloca i32
        store 0, %p
        %b = %y > 0
        br %b, %l2, %l3
      l2:
        store 1, %p
        br %l3
      l3:
        %x = load %p
        ret %x
  55. mem2reg Example
      int x = 0;
      if (y > 0)
        x = 1;
      return x;

      The LLVM IR in the trivial SSA form:
      l1:
        %p = alloca i32
        store 0, %p
        %b = %y > 0
        br %b, %l2, %l3
      l2:
        store 1, %p
        br %l3
      l3:
        %x = load %p
        ret %x

      Minimal SSA after mem2reg:
      l1:
        %b = %y > 0
        br %b, %l2, %l3
      l2:
        br %l3
      l3:
        %x = φ[1,%l2][0,%l1]
        ret %x
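A quick way to see that the two versions agree is to transliterate each into Python and compare their results. The sketch below does that; the function names `before_mem2reg` and `after_mem2reg` are hypothetical, the stack slot is modelled as a dictionary entry, and the φ-node is modelled by remembering which block control arrived from. Agreement on a few inputs is only a sanity check, not the simulation proof the talk is about.

```python
def before_mem2reg(y):
    """Trivial-SSA version: x lives in the stack slot %p."""
    mem = {}
    mem["p"] = 0         # l1: %p = alloca i32; store 0, %p
    b = y > 0            #     %b = %y > 0
    if b:                #     br %b, %l2, %l3
        mem["p"] = 1     # l2: store 1, %p; br %l3
    x = mem["p"]         # l3: %x = load %p
    return x             #     ret %x


def after_mem2reg(y):
    """Minimal-SSA version: the slot is gone, a phi selects the value."""
    b = y > 0                        # l1: %b = %y > 0
    pred = "l2" if b else "l1"       # block from which l3 is entered
    x = {"l2": 1, "l1": 0}[pred]     # l3: %x = phi [1, %l2] [0, %l1]
    return x                         #     ret %x


for y in (-2, 0, 3):
    print(y, before_mem2reg(y), after_mem2reg(y))
```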
  56. mem2reg Algorithm
      •  Main operations
         – Phi placement (Lengauer-Tarjan algorithm)
         – Renaming of the variables
         – Removing loads/stores
      •  Intermediate stage breaks SSA invariant
         – Defining semantics & well formedness non-trivial
  57. vmem2reg Algorithm
      •  Incremental algorithm
      •  Pipeline of "micro transformations"
         – Preserves SSA semantics
         – Preserves well-formedness
      See: [Aycock & Horspool 2002]
      Pipeline labels: max φs; LAS/LAA; DSE; DAE; elim φ; Find alloca
  58. How to Establish Correctness?
      Pipeline labels: max φs; LAS/LAA; DSE; DAE; elim φ; Find alloca
      1.  Simple aliasing properties (e.g. to determine promotability)
      2.  Instantiate proof technique for
          –  Substitution
          –  Dead Instruction Elimination
             P_DIE = …
             Initialize(P_DIE)   Preservation(P_DIE)   Progress(P_DIE)
      4.  Put it all together to prove composition of “pipeline” correct.
      Figure labels: Aliasing Properties; subst; DIE
  59. vmem2reg is Correct
      Theorem: The vmem2reg algorithm preserves the semantics of the source program.
      Proof: Composition of simulation relations from the “mini” transformations, each built using instances of the sdom proof technique. (See Coq Vellvm development.) □
  60. Runtime overhead of verified mem2reg
      [Chart: Speedup Over LLVM-O0, axis from 0% to 200%; series: LLVM's mem2reg, Extracted mem2reg; benchmarks: sjeng, go, compress, ijpeg, gzip, vpr, mesa, art, ammp, equake, libquantum, lbm, milc, bzip2, parser, twolf, mcf, h264, Geo. mean]
      Vmem2reg: 77%
      LLVM’s mem2reg: 81%
      (LLVM’s mem2reg promotes allocas used by intrinsics)
  61. Plan
      •  Tour of the LLVM IR
      •  Vellvm infrastructure
         – Operational Semantics
         – SSA Metatheory + Proof Techniques
      •  Case studies:
         – SoftBound memory safety
         – mem2reg
      •  Conclusion
  62. Ongoing Work
      •  Modular Semantics
         –  Factor out memory model [CAV 15]
         –  Linking/separate compilation
      •  For:
         –  more extensibility/robustness to changes
         –  verifying more analyses and optimizations / program transformations
         –  support for (relaxed) concurrency
         –  better support for casts [PLDI 15]
      Figure labels: LLVM SSA core IR; Memory Model / IO; Concurrency
  63. •  Deep Specifications
         –  rich, formal, 2-sided, live
      •  Layers of abstraction
         –  Layer Calculus in CertiKOS [Shao et al.]
         –  Good for proofs!
         –  Bad for performance?
         –  Implications for theory / proof engineering?
      •  Compositional specification
         –  Compositional CompCert [Stewart, et al. PLDI 15]
      •  Software Engineering ⇒ Proof Engineering
         –  Coq development methodology [CPDT: Chlipala]
      What engineering principles enable large-scale deep specifications?
  64. Conclusions
      •  Proof techniques for verifying LLVM transformations
      •  Verified:
         –  SoftBound & vmem2reg
         –  Similar performance to native implementations
      •  Future:
         –  Integration with other DeepSpec projects
      [http://www.cis.upenn.edu/~stevez/vellvm/]