A Verified Compiler from Isabelle/HOL to CakeML

Lars Hupel
February 19, 2018

Transcript

  1. A Verified Compiler from Isabelle/HOL to CakeML
    Lars Hupel
    Technische Universität München
    February 19th, 2018

  3. Isabelle
    ▶ interactive proof assistant
    ▶ powerful automation
    ▶ classical and equational reasoning
    ▶ decision procedures (e.g. linear arithmetic)
    ▶ integration with external automated theorem provers
    ▶ ...
    ▶ supports functional programming

  4. Low-level Isabelle
    ▶ generic proof assistant
    ▶ supports multiple object logics
    ▶ kernel: intuitionistic higher-order logic with natural deduction
    ▶ Isabelle/HOL built on top of the kernel
    ▶ ML code can be embedded almost everywhere
    ▶ theory syntax (commands) and term syntax (logic) can be extended

  5. If you want to make an apple pie from scratch ...

  7. Isabelle/HOL for Users
    People say Isabelle and mean Isabelle/HOL.
    ▶ inductive predicates
    ▶ datatypes and “refinement” types
    ▶ recursive functions
    ▶ pattern matching
    ▶ type classes (actually a feature of the kernel)

  10. Defining Datatypes
    datatype α list = Nil | Cons α (α list)
    This specification introduces:
    ▶ an induction principle
    ▶ a recursor
    rec_list :: α ⇒ (β ⇒ β list ⇒ α ⇒ α) ⇒ β list ⇒ α
    ▶ injective, non-overlapping constructors (“freely constructed”)
    ▶ a lot more

  15. Defining Functions
    ▶ HOL has a definition principle:
    x = λy₁ … yₙ. t
    where t must not depend on x itself
    ▶ introduces a new constant and an axiom
    ▶ primrec defines a function using a recursor (“expressible as a fold”)
    ▶ fun allows more flexible recursion, but termination must be proved
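
To illustrate what “expressible as a fold” means, here is a minimal Python sketch that mirrors the shape of the rec_list recursor. Python lists stand in for the α list datatype, and the helper names append and length are assumptions chosen for this sketch; this is not how Isabelle implements primrec internally.

```python
def rec_list(nil_case, cons_case, xs):
    """Recursor in the shape of rec_list :: α ⇒ (β ⇒ β list ⇒ α ⇒ α) ⇒ β list ⇒ α."""
    if not xs:
        return nil_case
    head, tail = xs[0], xs[1:]
    # The step function sees the head, the tail, and the result for the tail.
    return cons_case(head, tail, rec_list(nil_case, cons_case, tail))


def append(xs, ys):
    # append Nil ys = ys
    # append (Cons x xs) ys = Cons x (append xs ys)
    return rec_list(ys, lambda x, _tail, rec: [x] + rec, xs)


def length(xs):
    # Another primitive-recursive function expressed through the same recursor.
    return rec_list(0, lambda _x, _tail, rec: rec + 1, xs)
```

Every recursive call goes through rec_list on the immediate tail, which is why such definitions need no separate termination argument.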

  17. Functional Programming in Isabelle
    Definitions
    datatype α list = Nil | Cons α (α list)
    primrec append where
    append Nil ys = ys
    append (Cons x xs) ys = Cons x (append xs ys)
    Proofs
    lemma append xs (append ys zs) = append (append xs ys) zs
    by (induction xs) simp+

  18. Functional Programming in Isabelle
    Definitions
    datatype α list = Nil | Cons α (α list)
    fun append where
    append Nil ys = ys
    append (Cons x xs) ys = Cons x (append xs ys)
    Proofs
    lemma append xs (append ys zs) = append (append xs ys) zs
    by (induction xs) simp+

  20. Advanced Functional Programming
    Automatic termination proof
    fun fib where
    fib 0 = 1
    fib (Suc 0) = 1
    fib (Suc (Suc n)) = fib n + fib (Suc n)
    Manual termination proof
    function f91 where
    f91 n = (if 100 < n then n − 10 else f91 (f91 (n + 11)))
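
The two definitions above can be transliterated into Python to see why the second one needs more care. This is only an executable sketch of the equations on the slide; Python integers stand in for natural numbers, which is an assumption of the sketch.

```python
def fib(n):
    # Mirrors the three equations of the slide, including fib 0 = 1.
    if n == 0:
        return 1
    if n == 1:
        return 1
    return fib(n - 2) + fib(n - 1)


def f91(n):
    # Nested recursion: the argument of the outer call is itself a result of f91,
    # so showing that the argument "gets smaller" needs facts about f91's results.
    return n - 10 if 100 < n else f91(f91(n + 11))
```

For fib, both recursive calls are on structurally smaller arguments, which a termination checker can see syntactically. For f91, the outer call's argument is only known once the inner call has returned, which is why the slide lists it under manual termination proofs.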

  22. Evaluating Expressions
    We want to evaluate functions for concrete inputs, e.g. fib 10.
    1. using term rewriting (by simp)
    ▶ certified, but slow
    2. using code generation (by eval)
    ▶ fast, but not certified

  23. Code Generation
    Isabelle can generate code for ML, Haskell, Scala and OCaml
    ML
    datatype 'a list = Nil | Cons of 'a * 'a list;
    fun append Nil xs = xs
      | append (Cons (y, ys)) xs = Cons (y, append ys xs);

  24. Code Generation
    Isabelle can generate code for ML, Haskell, Scala and OCaml
    Scala
    abstract sealed class list[A]
    final case class Nila[A]() extends list[A]
    final case class Cons[A](a: A, b: list[A]) extends list[A]
    def append[A](x0: list[A], xs: list[A]): list[A] = (x0, xs) match {
      case (Nila(), xs) => xs
      case (Cons(y, ys), xs) => Cons[A](y, append[A](ys, xs))
    }

  25. Code Generation Pipeline
    1. input: Set of equations
    2. preprocess
    3. build dependency graph, compute SCCs
    4. translate to intermediate language
    5. serialize to target language
    6. output: Source text
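
Step 3 of the pipeline above can be sketched in Python. The code below uses Tarjan's algorithm, which is one standard way to compute strongly connected components; the slides do not say which algorithm Isabelle uses, and the constant names in the example graph (even, odd, main) are hypothetical.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps each constant to the constants it depends on."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    result = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop the whole component.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            result.append(sorted(component))

    for node in graph:
        if node not in index:
            visit(node)
    return result


# Hypothetical dependency graph: even and odd are mutually recursive.
deps = {"even": ["odd"], "odd": ["even"], "main": ["even", "append"], "append": []}
components = strongly_connected_components(deps)
```

Mutually recursive constants end up in one component, and each component is emitted only after the components it depends on, so the result can be serialized in that order.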

  27. Certifying Code Generation
    Idea: Transform equations into intermediate formal object
    Intermediate AST is a value in the logic
    Magnus O. Myreen and Scott Owens. Proof-producing synthesis of ML from higher-order logic. ICFP 2012.
    Magnus O. Myreen and Scott Owens. Proof-producing translation of higher-order logic into pure and stateful ML. JFP 2014.

  29. Certifying Code Generation
    Approach by Myreen & Owens
    ▶ define a datatype for ML syntax, formalize semantics (specified in Lem)
    ▶ define relators between HOL values and ML values, e.g.
    rel_int :: ML_val ⇒ int ⇒ bool
    ▶ when code generator is invoked on constant f,
      ▶ define a logical constant f_ML containing the AST
      ▶ prove theorem relating f to f_ML using the type’s relator
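
The idea of a relator can be sketched in Python as a predicate that decides whether a deep-embedded ML value represents a given host-level value. The classes Lit and Con below are a toy stand-in for ML_val invented for this sketch; they are not the CakeML value datatype, and rel_list is an assumed companion to the rel_int relator named on the slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Lit:
    """Toy ML value: an integer literal."""
    value: int


@dataclass(frozen=True)
class Con:
    """Toy ML value: a constructor applied to argument values."""
    name: str
    args: Tuple[object, ...] = ()


def rel_int(ml_val, n):
    # Holds if the ML value is exactly the literal n.
    return isinstance(ml_val, Lit) and ml_val.value == n


def rel_list(rel_elem):
    # Lifts a relator on elements to a relator on lists.
    def rel(ml_val, xs):
        if not isinstance(ml_val, Con):
            return False
        if ml_val.name == "Nil":
            return ml_val.args == () and xs == []
        if ml_val.name == "Cons" and len(ml_val.args) == 2 and xs:
            head, tail = ml_val.args
            return rel_elem(head, xs[0]) and rel(tail, xs[1:])
        return False
    return rel


ml_list = Con("Cons", (Lit(1), Con("Cons", (Lit(2), Con("Nil")))))
```

In the approach described above, the relator appears in the statement of the theorem that connects f with its generated AST; here it is only an executable check.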

  32. Certified Code Generation
    Our Approach
    Stage 1 (certifying)
    ▶ define a higher-order lambda calculus with term-rewriting semantics
    ▶ define relators between HOL values and lambda terms, e.g.
    rel_int :: term ⇒ int ⇒ bool
    ▶ when code generator is invoked on constant f,
      ▶ define a logical constant f_λ containing the TRS
      ▶ prove theorem relating f to f_λ using the type’s relator
    λ-terms are conceptually much simpler!
    requires type class elimination
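
A term-rewriting semantics can be sketched in a few lines of Python. The sketch below is deliberately first-order: it has no λ-abstraction, so it is a simplification of the calculus described above, not a model of it. The term encoding (tuples for applications, strings for pattern variables) is an assumption of the sketch.

```python
def match(pattern, term, subst):
    """Extend subst so that the instantiated pattern equals term, or return None."""
    if isinstance(pattern, str):  # pattern variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def instantiate(term, subst):
    if isinstance(term, str):
        return subst[term]
    return (term[0],) + tuple(instantiate(arg, subst) for arg in term[1:])


def rewrite_step(rules, term):
    """One step: try the root first, then arguments left to right. None if normal."""
    for lhs, rhs in rules:
        subst = match(lhs, term, {})
        if subst is not None:
            return instantiate(rhs, subst)
    for i, arg in enumerate(term[1:], start=1):
        reduced = rewrite_step(rules, arg)
        if reduced is not None:
            return term[:i] + (reduced,) + term[i + 1:]
    return None


def normalize(rules, term):
    while (nxt := rewrite_step(rules, term)) is not None:
        term = nxt
    return term


# The append equations as rewrite rules; "x", "xs", "ys" are pattern variables.
append_rules = [
    (("append", ("Nil",), "ys"), "ys"),
    (("append", ("Cons", "x", "xs"), "ys"), ("Cons", "x", ("append", "xs", "ys"))),
]
A, B, NIL = ("a",), ("b",), ("Nil",)
start = ("append", ("Cons", A, NIL), ("Cons", B, NIL))
```

Here the rules form a list and are tried in order purely for convenience; with a set of equations, as on the slide, the order carries no meaning.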

  35. Certified Code Generation
    Our Approach
    Stage 2 (certified)
    ▶ reuse ML syntax and semantics (export Lem to Isabelle)
    ▶ define a HOL function
    compile :: (term × term) set ⇒ ML_val
    ▶ prove it correct once and for all

  36. Challenges
    ▶ Isabelle supports type classes, ML doesn’t
      certifying dictionary construction
    ▶ users can specify custom equations
      need to figure out termination and induction principles (like fun)
    ▶ generation of relators for complex data types
      complex proof tactics to accommodate non-standard recursion
    ▶ set of code equations is unordered
      need to specify wellformedness conditions
    ▶ transformation from term rewriting to big-step semantics
      multiple compiler phases

  39. Challenge: Custom Code Equations
    What the user specified
    sum_by f = sum ◦ map f
    What the user proved
    sum_by f [] = 0
    sum_by f (x # xs) = f x + sum_by f xs
    What the system needs
    sum_by monoid_β f [] = zero monoid_β
    sum_by monoid_β f (x # xs) = plus monoid_β (f x) (sum_by monoid_β f xs)
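
The equations the system needs pass the type class instance as an explicit dictionary argument. A minimal Python sketch of that dictionary-passing style follows; the Monoid record and the two instances int_monoid and str_monoid are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

B = TypeVar("B")


@dataclass(frozen=True)
class Monoid(Generic[B]):
    """Explicit dictionary holding the operations of the type class."""
    zero: B
    plus: Callable[[B, B], B]


def sum_by(monoid, f, xs):
    # sum_by monoid f [] = zero monoid
    if not xs:
        return monoid.zero
    # sum_by monoid f (x # xs) = plus monoid (f x) (sum_by monoid f xs)
    return monoid.plus(f(xs[0]), sum_by(monoid, f, xs[1:]))


# Hypothetical instances, one dictionary per instance declaration.
int_monoid = Monoid(zero=0, plus=lambda a, b: a + b)
str_monoid = Monoid(zero="", plus=lambda a, b: a + b)
```

The overloaded + and 0 of the user-level equations become plain field lookups, which is what makes the result expressible in a language without type classes.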

  40. Challenge: Term Rewriting to Big-Step
    [Figure, built up over three slides: a chain of compiler phases with their semantics]
    Compiler phases, in order:
    de Bruijn terms → Named bound variables → Explicit pattern matching → Sequential clauses → Evaluation semantics
    Semantics along the chain, in order:
    ▶ R :: (term × term) set, t, t′ :: term
      R ⊢ t −→ t′
    ▶ R :: (term × nterm) set, t, t′ :: nterm
      R ⊢ t −→ t′
    ▶ R :: (string × pterm) set, t, t′ :: pterm
      R ⊢ t −→ t′
    ▶ rs :: (string × sterm) list, t, t′ :: sterm
      rs ⊢ t −→ t′
    ▶ rs :: (string × sterm) list, σ :: string ⇀ sterm, t, u :: sterm
      rs, σ ⊢ t ↓ u
    ▶ rs :: (string × value) list, σ :: string ⇀ value, t :: sterm, u :: value
      rs, σ ⊢ t ↓ u
    ▶ σ :: string ⇀ value, t :: sterm, u :: value
      σ ⊢ t ↓ u
    Legend: compiler phase; semantics refinement; semantics belonging to the phase
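
The first step of this chain, from de Bruijn terms to named bound variables, can be sketched in Python. The class names and the naming scheme x0, x1, ... are assumptions of the sketch and are not taken from the formalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    """de Bruijn index: 0 refers to the innermost enclosing binder."""
    index: int


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Abs:
    """Nameless abstraction (de Bruijn style)."""
    body: object


@dataclass(frozen=True)
class NVar:
    name: str


@dataclass(frozen=True)
class NAbs:
    """Abstraction with a named bound variable."""
    var: str
    body: object


def to_named(term, env=()):
    """Replace indices by names; env lists binder names, innermost first."""
    if isinstance(term, Bound):
        return NVar(env[term.index])
    if isinstance(term, Const):
        return term
    if isinstance(term, App):
        return App(to_named(term.fun, env), to_named(term.arg, env))
    if isinstance(term, Abs):
        # The binder depth gives a name that is fresh among enclosing binders.
        name = f"x{len(env)}"
        return NAbs(name, to_named(term.body, (name,) + env))
    raise TypeError(f"not a de Bruijn term: {term!r}")


k_combinator = Abs(Abs(Bound(1)))  # λ. λ. 1, i.e. λx. λy. x
```

The sketch only covers closed terms: a loose index with no enclosing binder raises an IndexError.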

  43. Key Insights
    ▶ reusable Lem specifications are a game changer
    ▶ less certifying code, more certified proofs
    ▶ feature parity is challenging
    ▶ performance is a significant issue

  44. Q & A
    lars.hupel.info  larsrh  larsr_h