Annotating Deeply Embedded Languages

Presented at the Haskell Implementers Workshop 2022: https://icfp22.sigplan.org/home/hiw-2022#About
Video: TBD

Haskell's strong static typing and the ease with which the programmer can create and manipulate algebraic data types make it a good choice for implementing (embedded) programming languages. However, a common problem with deeply embedded languages is that there is no strong connection between the source program the user writes, the abstract syntax tree that is generated for that embedded program, and the object code that is eventually generated, compiled, and executed for that program. Once the embedded language moves beyond the prototype phase, this disconnect makes it almost impossible for a user to debug their program or determine where the execution time is being spent.

In this talk we discuss recent work on annotating the abstract syntax tree of the deeply embedded language Accelerate. We present our approach for automatically gathering source location information for embedded expressions using a novel approach based on implicit parameters. Unlike the existing HasCallStack mechanism provided by GHC, our approach statically ensures that the required source location information is available when needed. By annotating the abstract syntax tree of the program with this information, it is now possible to map the embedded program back to the original source code. We demonstrate our current progress on using this information to improve compiler diagnostics, as well as to enable profiler and debugger integration for embedded programs.

Trevor L. McDonell

September 11, 2022

Transcript

  1. Embedded Languages: A language written inside of another language. May be domain specific. That other language is called the host language.
  2. Deeply Embedded Languages (https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/gadt.html)

        data Exp a where
          Lit  :: Int -> Exp Int
          Succ :: Exp Int -> Exp Int
          ...

        ans :: Exp Int
        ans = Succ (Lit 41)
  3. Deeply Embedded Languages (https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/gadt.html)

        data Exp a where
          Lit  :: Int -> Exp Int
          Succ :: Exp Int -> Exp Int
          ...

        ans :: Exp Int
        ans = Succ (Lit 41)

        eval :: Exp Int -> Int
        eval (Lit i)  = i
        eval (Succ x) = 1 + eval x
        ...
  6. Deeply Embedded Languages (https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/gadt.html)

        data Exp a where
          Lit  :: Int -> Exp Int
          Succ :: Exp Int -> Exp Int
          ...

        ans :: Exp Int
        ans = Succ (Lit 41)

        eval :: Exp Int -> Int
        eval (Lit i)  = i
        eval (Succ x) = 1 + eval x
        ...

        Labels on slide: academics
  7. Deeply Embedded Languages (https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/gadt.html)

        data Exp a where
          Lit  :: Int -> Exp Int
          Succ :: Exp Int -> Exp Int
          ...

        ans :: Exp Int
        ans = Succ (Lit 41)

        eval :: Exp Int -> Int
        eval (Lit i)  = i
        eval (Succ x) = 1 + eval x
        ...

        Labels on slide: academics, me
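The slide fragments above can be assembled into a small module. This is a minimal sketch, not the talk's code: the `render` interpreter is a hypothetical addition, included only to show that a deep embedding lets several interpreters walk the same tree.

```haskell
{-# LANGUAGE GADTs #-}

-- A deeply embedded expression language, following the slides.
data Exp a where
  Lit  :: Int -> Exp Int
  Succ :: Exp Int -> Exp Int

-- The embedded program is an ordinary Haskell value: a syntax tree.
ans :: Exp Int
ans = Succ (Lit 41)

-- Interpreter 1 (from the slides): evaluate the tree.
eval :: Exp Int -> Int
eval (Lit i)  = i
eval (Succ x) = 1 + eval x

-- Interpreter 2 (hypothetical, not on the slides): pretty-print the tree.
render :: Exp a -> String
render (Lit i)  = show i
render (Succ x) = "succ (" ++ render x ++ ")"
```

Loaded into GHCi, `eval ans` and `render ans` both consume the same value `ans`, which is the property that makes the embedding "deep".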
  8. Deeply Embedded Languages. Advantages: integrates with the host language; no separate parsing step. Disadvantages: no separate parsing step.
  9. Deeply Embedded Languages. Advantages: integrates with the host language; no separate parsing step. Disadvantages: no separate parsing step. Deep embeddings lack context information.
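To make "lack context information" concrete, here is a small sketch under stated assumptions: the `Div` constructor, `safeEval`, and the two example programs are hypothetical and do not appear in the talk. Two expressions written at different places in the host program produce structurally identical trees, so a failure at run time cannot be traced back to the line that built the offending node.

```haskell
{-# LANGUAGE GADTs #-}

data Exp a where
  Lit :: Int -> Exp Int
  Div :: Exp Int -> Exp Int -> Exp Int   -- hypothetical constructor

-- Evaluation that reports failure instead of crashing.
safeEval :: Exp Int -> Either String Int
safeEval (Lit i)   = Right i
safeEval (Div x y) = do
  a <- safeEval x
  b <- safeEval y
  if b == 0
    then Left "division by zero"   -- no way to say *where* in the source
    else Right (a `div` b)

-- Written on different source lines, yet the trees are indistinguishable.
programA :: Exp Int
programA = Div (Lit 1) (Lit 0)

programB :: Exp Int
programB = Div (Lit 1) (Lit 0)
```

Both programs fail with the same message, and nothing in the tree records which definition was responsible.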
  10. The Problem • Let’s write an embedded program! 1. Write some code (https://github.com/tmcdonell/lulesh-accelerate)
  13. The Problem • Let’s write an embedded program! 1. Write some code 2. Run it! (https://github.com/tmcdonell/lulesh-accelerate)
  17. The Problem • Let’s write an embedded program! 1. Write some code 2. Run it! 👍 (https://github.com/tmcdonell/lulesh-accelerate)
  21. The Problem • Let’s write an embedded program! 1. Write some code 2. Run it! 👍 3. … (https://github.com/tmcdonell/lulesh-accelerate)
  28. The Problem • Let’s write an embedded program! 1. Write some code 2. Run it! 👍 3. … 4. Profit? 🤔 (https://github.com/tmcdonell/lulesh-accelerate)
  29. The Problem: There is a disconnect between the embedded program the user writes, the abstract syntax tree generated for that program, and the optimised code that is eventually executed.
  30. Objective: 1. Recover context in deeply embedded programs 2. Annotate the embedded program with that information 3. Find other uses for the annotation system
  31. The Idea: The program is described by some abstract syntax tree. This AST is built via smart constructors. These smart constructors should generate and store the necessary annotations.
  32. The Idea: These smart constructors can be encountered as regular functions, type class methods, or pattern synonyms. (Embedded Pattern Matching, McDonell T.L., Meredith J.D., and Keller G.)
  33. The Idea: These smart constructors can be encountered as regular functions, type class methods, or pattern synonyms. (Embedded Pattern Matching, McDonell T.L., Meredith J.D., and Keller G.)

        constant :: Int -> Exp Int
        constant = Lit
  34. The Idea: These smart constructors can be encountered as regular functions, type class methods, or pattern synonyms. (Embedded Pattern Matching, McDonell T.L., Meredith J.D., and Keller G.)

        constant :: Int -> Exp Int
        constant = Lit

        instance Num (Exp a) where
          (+) = PrimApp PrimAdd
  35. The Idea: These smart constructors can be encountered as regular functions, type class methods, or pattern synonyms. (Embedded Pattern Matching, McDonell T.L., Meredith J.D., and Keller G.)

        constant :: Int -> Exp Int
        constant = Lit

        instance Num (Exp a) where
          (+) = PrimApp PrimAdd

        pattern Maybe_ :: Exp a -> Exp (Maybe a)
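The three ways a user reaches a smart constructor can be shown on the toy language from the earlier slides. This is a simplified sketch: the `Add` constructor, the `Succ_` pattern synonym, and the stubbed `Num` methods are assumptions made for illustration and are much simpler than Accelerate's `PrimApp` or `Maybe_`.

```haskell
{-# LANGUAGE FlexibleInstances, GADTs, PatternSynonyms #-}

data Exp a where
  Lit  :: Int -> Exp Int
  Succ :: Exp Int -> Exp Int
  Add  :: Exp Int -> Exp Int -> Exp Int   -- hypothetical constructor

-- 1. A regular function.
constant :: Int -> Exp Int
constant = Lit

-- 2. Type class methods: (+) and integer literals build AST nodes.
instance Num (Exp Int) where
  (+)         = Add
  fromInteger = Lit . fromInteger
  (*)         = error "(*): not part of this sketch"
  abs         = error "abs: not part of this sketch"
  signum      = error "signum: not part of this sketch"
  negate      = error "negate: not part of this sketch"

-- 3. A bidirectional pattern synonym used as a constructor.
pattern Succ_ :: Exp Int -> Exp Int
pattern Succ_ x = Succ x

eval :: Exp Int -> Int
eval (Lit i)   = i
eval (Succ x)  = 1 + eval x
eval (Add x y) = eval x + eval y

-- All three entry points used in one embedded expression.
example :: Exp Int
example = Succ_ (constant 1 + 2)
```

Any annotation scheme has to work for all three entry points, since the user cannot tell which one a given piece of syntax goes through.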
  36. Annotations: Store metadata for an AST node. Should be easily extensible. Adding them shouldn’t change the user-facing language.
  37. Storing Annotations (Trees that Grow, Najd S. and Peyton Jones S.)

        data Ann = Ann { ... }

        data Exp a where
          Lit  :: Ann -> Int -> Exp Int
          Succ :: Ann -> Exp Int -> Exp Int
          ...

        mkAnn :: ... => Ann
        mkAnn = Ann { ... }

        constant :: Int -> Exp Int
        constant = Lit mkAnn
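A minimal sketch of the storage scheme follows. The slides elide the fields of `Ann` and the constraint on `mkAnn`, so the single `note` field, the unconstrained `mkAnn`, and the helpers `succ_`, `annOf`, and `withNote` are all placeholders invented for this example.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical payload: the real record's fields are elided on the slides.
newtype Ann = Ann { note :: Maybe String }

data Exp a where
  Lit  :: Ann -> Int -> Exp Int
  Succ :: Ann -> Exp Int -> Exp Int

-- Default annotation, created by every smart constructor.
mkAnn :: Ann
mkAnn = Ann { note = Nothing }

-- The user-facing types are unchanged by the extra field.
constant :: Int -> Exp Int
constant = Lit mkAnn

succ_ :: Exp Int -> Exp Int
succ_ = Succ mkAnn

-- Read or update the annotation on the root node.
annOf :: Exp a -> Ann
annOf (Lit a _)  = a
annOf (Succ a _) = a

withNote :: String -> Exp a -> Exp a
withNote s (Lit a i)  = Lit  a { note = Just s } i
withNote s (Succ a x) = Succ a { note = Just s } x

-- Evaluation simply ignores the annotations.
eval :: Exp Int -> Int
eval (Lit _ i)  = i
eval (Succ _ x) = 1 + eval x
```

Because only the smart constructors mention `Ann`, user programs such as `succ_ (constant 41)` look exactly as they did before annotations were added.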
  42. Source Locations: Associate AST fragments back to their original source location. Use that for diagnostics, profiling, debugging…
  43. Source Locations
        1. GHC Call Stacks: GHC.Stack
           • Created at compile time
           • Functions require a HasCallStack constraint
        2. RTS Execution Stacks: GHC.ExecutionStack
  44. Source Locations
        1. GHC Call Stacks: GHC.Stack
           • Created at compile time
           • Functions require a HasCallStack constraint
        2. RTS Execution Stacks: GHC.ExecutionStack
           • Runtime backtraces!
           • No changes to user code required!
  45. Source Locations
        1. GHC Call Stacks: GHC.Stack
           • Created at compile time
           • Functions require a HasCallStack constraint
        2. RTS Execution Stacks: GHC.ExecutionStack
           • Runtime backtraces!
           • No changes to user code required!
           • Currently unusable 🙁
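For the second mechanism, `GHC.ExecutionStack` in `base` exposes `showStackTrace :: IO (Maybe String)`. The sketch below only shows the call shape; the wrapper name `whereAmI` is hypothetical. Whether a backtrace is actually produced depends on how GHC was built, and on many installations the result is simply `Nothing`, which fits the "currently unusable" remark on the slide.

```haskell
import GHC.ExecutionStack (showStackTrace)

-- No constraint is required on this function or on its callers.
whereAmI :: IO String
whereAmI = do
  trace <- showStackTrace
  -- Fall back to a placeholder when the runtime cannot unwind the stack.
  pure (maybe "<no backtrace available>" id trace)
```

The appeal is that user code stays untouched; the cost is that the information is only as good as the runtime's unwinding support.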
  46. Source Locations. Three scenarios:
        Regular functions: GHC Call Stacks (existing)
        Type class methods: RTS Execution Stacks
        Pattern synonyms: GHC Call Stacks (plus some trickery)
        https://gitlab.haskell.org/ghc/ghc/-/issues/19289
  47. HasCallStack

        printError :: HasCallStack => String -> IO ()
        printError msg = putStrLn msg >> print callStack
  48. HasCallStack

        printError :: HasCallStack => String -> IO ()
        printError msg = putStrLn msg >> print callStack

      desugars to…

        printError :: (?callStack :: CallStack) => String -> IO ()
        printError msg = putStrLn msg >> print ?callStack
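The equivalence between the `HasCallStack` form and the implicit-parameter form can be checked with a small experiment. The names `whoCalled`, `whoCalled'`, `direct`, and `direct'` are hypothetical; the functions return the call-stack entries as strings rather than printing them, so the two variants can be compared.

```haskell
{-# LANGUAGE ImplicitParams #-}

import GHC.Stack (CallStack, HasCallStack, callStack, getCallStack)

-- HasCallStack form: names of the functions on the stack, innermost first.
whoCalled :: HasCallStack => [String]
whoCalled = map fst (getCallStack callStack)

-- The same function written against the underlying implicit parameter.
whoCalled' :: (?callStack :: CallStack) => [String]
whoCalled' = map fst (getCallStack ?callStack)

-- Callers without a constraint: the stack starts at the call site below.
direct :: [String]
direct = whoCalled

direct' :: [String]
direct' = whoCalled'
```

Each entry also carries a `SrcLoc` with the file, line, and column of the call site, which is the information an annotation system wants to keep.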
  49. HasCallStack

        main :: HasCallStack => IO ()
        main = foo

        foo :: IO ()                    -- silent error: no HasCallStack constraint!
        foo = bar

        bar :: HasCallStack => IO ()    -- only prints ‘bar’
        bar = print callStack
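The silent truncation is easy to reproduce. In this sketch, which uses hypothetical names (`fooCut`, `fooKept`, `viaKept`) and returns the stack instead of printing it, the only difference between the two intermediate functions is the presence of the `HasCallStack` constraint.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- Innermost function: report which functions are on the call stack.
bar :: HasCallStack => [String]
bar = map fst (getCallStack callStack)

-- No constraint: compiles without any warning, but the chain is cut here.
fooCut :: [String]
fooCut = bar

-- With the constraint, the caller of fooKept is recorded as well.
fooKept :: HasCallStack => [String]
fooKept = bar

viaKept :: [String]
viaKept = fooKept
```

`fooCut` yields a stack containing only `bar`, while `viaKept` yields `bar` followed by `fooKept`. Nothing in the types distinguishes the two situations, which is the gap the talk's `SourceMapped` constraint is meant to close.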
  52. SourceMapped

        data OpaqueType = NotExported

        type SourceMapped = ( ?requiresSourceMapping :: OpaqueType, HasCallStack )

        -- Throws an error if the caller did not have the HasCallStack constraint
        sourceMap :: HasCallStack => (SourceMapped => a) -> a
        sourceMap k = ...
  53. SourceMapped

        data OpaqueType = NotExported

        type SourceMapped = ( ?requiresSourceMapping :: OpaqueType, HasCallStack )

        -- Throws an error if the caller did not have the HasCallStack constraint
        sourceMap :: HasCallStack => (SourceMapped => a) -> a
        sourceMap k = ...

      sourceMap is the only way to satisfy the SourceMapped constraint.
  54. SourceMapped

        main :: HasCallStack => IO ()
        main = foo

        foo :: IO ()    -- silent error: no HasCallStack constraint
        foo = bar

        bar :: HasCallStack => IO ()
        bar = print callStack

        qux :: HasCallStack => IO ()
        qux = sourceMap bar

        -- Runtime error: no HasCallStack
  55. SourceMapped

        main :: HasCallStack => IO ()
        main = foo

        foo :: IO ()    -- silent error: no HasCallStack constraint
        foo = bar

        bar :: HasCallStack => IO ()
        bar = print callStack

        qux :: HasCallStack => IO ()
        qux = sourceMap bar

        SourceMapped
        -- Runtime error: no HasCallStack
  56. SourceMapped

        main :: HasCallStack => IO ()
        main = foo

        foo :: IO ()    -- silent error: no HasCallStack constraint
        foo = bar

        bar :: HasCallStack => IO ()
        bar = print callStack

        qux :: HasCallStack => IO ()
        qux = sourceMap bar

        SourceMapped
        -- Compilation error: unbound implicit parameter
        -- Runtime error: no HasCallStack
  58. SourceMapped

        main :: HasCallStack => IO ()
        main = foo

        foo :: IO ()    -- silent error: no HasCallStack constraint
        foo = bar

        bar :: HasCallStack => IO ()
        bar = print callStack

        qux :: HasCallStack => IO ()
        qux = sourceMap bar

        SourceMapped
        -- Compilation error: unbound implicit parameter
        -- Works!
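One way the `SourceMapped` idea could be realised is sketched below. The body of `sourceMap` is elided on the slides, so the `let ?requiresSourceMapping = ...` binding is an assumption, and the runtime check mentioned in the slide's comment is deliberately left out. In a real library `OpaqueType` would live in a module that does not export its constructor, so that `sourceMap` is the only way to bind the parameter.

```haskell
{-# LANGUAGE ConstraintKinds, ImplicitParams, RankNTypes #-}

import GHC.Stack (HasCallStack, callStack, getCallStack)

-- In a real library the constructor would not be exported.
data OpaqueType = NotExported

type SourceMapped = (?requiresSourceMapping :: OpaqueType, HasCallStack)

-- Assumed implementation: discharge the extra implicit parameter.
-- The runtime check described on the slide is omitted in this sketch.
sourceMap :: HasCallStack => (SourceMapped => a) -> a
sourceMap k = let ?requiresSourceMapping = NotExported in k

-- A function that insists on source mapping being available.
bar :: SourceMapped => [String]
bar = map fst (getCallStack callStack)

-- Accepted: sourceMap provides the SourceMapped constraint.
qux :: HasCallStack => [String]
qux = sourceMap bar

-- Rejected at compile time (unbound implicit parameter), unlike the
-- plain HasCallStack version, which silently loses frames:
--
-- foo :: [String]
-- foo = bar
```

Unlike `HasCallStack`, GHC never solves `?requiresSourceMapping` on its own, so forgetting the constraint becomes a type error rather than a truncated stack.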
  59. Putting it together

        data Ann = Ann { locations :: HashSet CallStack, ... }

        mkAnn :: SourceMapped => Ann
        mkAnn = Ann { locations = capture ?callStack }
          where capture = ...

        constant :: HasCallStack => Int -> Exp Int
        constant = sourceMap $ Lit mkAnn
  60. Putting it together

        data Ann = Ann { locations :: HashSet CallStack, ... }

        mkAnn :: SourceMapped => Ann
        mkAnn = Ann { locations = capture ?callStack }
          where capture = ...

        constant :: HasCallStack => Int -> Exp Int
        constant = sourceMap $ Lit mkAnn

      • inlining • loop unrolling • [no] fast math • …
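The pieces can be combined into one end-to-end sketch. Several simplifications are assumptions made here, not the talk's design: `locations` is a plain list of call-stack entries instead of a `HashSet CallStack` (which would need the unordered-containers package), `capture` is replaced by `getCallStack`, `sourceMap` uses an assumed body without the runtime check, and `annOf`, `userProgram`, and `frames` are hypothetical helpers.

```haskell
{-# LANGUAGE ConstraintKinds, GADTs, ImplicitParams, RankNTypes #-}

import GHC.Stack (HasCallStack, SrcLoc, getCallStack)

data OpaqueType = NotExported

type SourceMapped = (?requiresSourceMapping :: OpaqueType, HasCallStack)

-- Assumed body; the slides elide it.
sourceMap :: HasCallStack => (SourceMapped => a) -> a
sourceMap k = let ?requiresSourceMapping = NotExported in k

-- Simplification: a list of (function name, location) pairs.
newtype Ann = Ann { locations :: [(String, SrcLoc)] }

-- Capture the call stack that is statically guaranteed to be present.
mkAnn :: SourceMapped => Ann
mkAnn = Ann { locations = getCallStack ?callStack }

data Exp a where
  Lit  :: Ann -> Int -> Exp Int
  Succ :: Ann -> Exp Int -> Exp Int

-- Smart constructor: records where it was called from.
constant :: HasCallStack => Int -> Exp Int
constant = sourceMap (Lit mkAnn)

annOf :: Exp a -> Ann
annOf (Lit a _)  = a
annOf (Succ a _) = a

-- User code needs no constraint of its own.
userProgram :: Exp Int
userProgram = constant 41

-- Function names recorded on the root node, innermost first.
frames :: Exp a -> [String]
frames = map fst . locations . annOf
```

The entry for `constant` carries the `SrcLoc` of the call inside `userProgram`, which is the link from an AST node back to the line the user wrote.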
  61. Case study: Accelerate
        • Deeply embedded language for data-parallel array computations
          - Multiple backends (CPU, GPU, …)
          - Multiple expression types (collective array and scalar expressions)
          - Multiple AST types (surface language in HOAS, internal language is first-order)
  62. Accelerate: Sharing recovery
        • Finds shared parts of the program based on stable names; moves those parts to explicit binders
        • NEW:
          - Keep track of annotation state during sharing recovery
          - Enable terms to be inlined by ignoring sharing
  63. Accelerate: Array fusion
        • Producer/consumer fusion rewrites the program to combine operations

        map f . map g   rewrites into…   map (f . g)
  64. Accelerate: Array fusion
        • Producer/consumer fusion rewrites the program to combine operations
        • NEW:
          - Multiple AST nodes may be merged; new annotation contains the union of the call stack sets
          - Source locations may be disjoint
          - Optimisation flags might differ…

        map f . map g   rewrites into…   map (f . g)
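The annotation merge during fusion can be illustrated on a toy array language. Everything here is hypothetical and far simpler than Accelerate: `Loc` stands in for a captured call stack, `Arr` has only `Use` and `Map`, and a list with `Data.List.union` stands in for a set.

```haskell
import Data.List (union)

-- Hypothetical stand-in for a captured source location.
type Loc = (FilePath, Int)

newtype Ann = Ann { locations :: [Loc] }
  deriving (Eq, Show)

-- Merging two annotations takes the union of their locations.
instance Semigroup Ann where
  Ann a <> Ann b = Ann (a `union` b)

-- A toy array language: input data and an annotated map.
data Arr
  = Use [Int]
  | Map Ann (Int -> Int) Arr

-- Producer/consumer fusion: map f . map g  ==>  map (f . g).
-- The fused node keeps the locations of both original nodes.
fuse :: Arr -> Arr
fuse (Map a f (Map b g xs)) = fuse (Map (a <> b) (f . g) xs)
fuse (Map a f xs)           = Map a f (fuse xs)
fuse arr                    = arr

run :: Arr -> [Int]
run (Use xs)     = xs
run (Map _ f xs) = map f (run xs)
```

After fusion a single node can point at several, possibly non-adjacent, source lines, which is why the slide stores a set of call stacks rather than a single one.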
  65. Profiling LULESH
        • One kernel accounts for 66% of the runtime
          - A good candidate to experiment with loop unrolling…
        Ryzen 9 5900x: 8.66 s -> 9.6 s
        RTX 2080 Super: 3.13 s -> 3.07 ± 0.04 s
  67. Profiling LULESH
        • One kernel accounts for 66% of the runtime
          - A good candidate to experiment with loop unrolling…
        Ryzen 9 5900x: 8.66 s -> 9.6 s
          • 14x higher L1 inst. cache miss rate
          • 7x higher LL inst. cache miss rate
        RTX 2080 Super: 3.13 s -> 3.07 ± 0.04 s
  68. Summary: New annotation system opens the door to a better developer experience. Future work: 1. Expression-level debugging and profiling 2. Granular loop optimisations 3. …? 4. Merge it