The Shape of Kotlin

mvndy_hd
November 30, 2019

At Kotlin Day 2019 in London, I share how I stumbled onto AST parsing, debunk the black box that is the Kotlin compiler, and show how 47 Degrees is taking advantage of AST parsing to elevate the power of Kotlin metaprogramming.



  1. The Shape of Kotlin
    AST Parsing and its role in the Kotlin compiler
    Amanda Hinchman-Dominguez
  2. The Shape of Kotlin
    • How I stumbled onto AST parsing
    • Debunk the black box that is the Kotlin Compiler
    • How 47 Degrees is leveraging AST parsing in Arrow-meta to elevate the power of Kotlin metaprogramming
  4. Detecting UI inputs
    • Abstract Syntax Tree Parsing (AST)
    • Program Structure Interface (PSI)
  6. 47 Degrees is a global consulting firm and certified Lightbend and Databricks Partner
    Specializing in
  9. By studying AST, we can learn a lot about the Kotlin compiler.
  11.
    • AST is a form of abstracted representation that is generated and used in several roles within the compiler
    • AST is a tree made of nodes that have a direct mapping to the text ranges in the underlying document
      ◦ Bottom-most nodes of an AST match individual tokens
      ◦ Higher nodes match multiple-token fragments
  12. What does AST parsing tell us?
    • AST parsing tells us how code has been written by the end user
    • AST parsing gives all but punctuation in the analyzed text range including tokens
  16. Parsing Phase
    • Builds the AST tree
    • Analyzes the tree and augments with complete information
  17. Analysis Phase
    • The basis for any compiler optimization
    • Transforms the input program into unoptimized intermediate representation
    • Generates Program Structure Interface (PSI)
  18. Resolution Phase
    • Generates 2 symbol tables:
      ◦ One to accompany AST
      ◦ Another for the associated generated IR model
    • PSI enhanced with descriptors which have been type-checked
    • Optimizations performed on IR to improve quality and performance of machine code
  21. Lexer
    • Lexer breaks code text into a sequence of lexical tokens
    • Lexer may break code into multiple fragments while scanning or into lexemes
  22. Syntax Analyzer
    Builds the AST tree. In later phases of the compiler, the parse tree is often:
    • Analyzed
    • Augmented
    • Transformed
  23. Semantic Analyzer
    • Compiler performs type checking and semantic analysis on the AST tree
    • Generates symbol table & IR
  24. Intermediate Code Generator
    • Outputs unoptimized intermediate representation (IR)
    • Analysis performed on IR
      ◦ Control flow
      ◦ Call stacks
  25. Intermediate Code Optimizer
    • Machine-dependent optimizations on IR
    • Improves performance & quality of produced machine code
    • Resource & storage decisions
  31.
    • IR is generated as another form of abstracted representation for CPU-level architecture
    • PSI and IR each have symbol tables mapping their nodes to descriptors
    [Diagram: PSI, Descriptor, IR]
  33. Arrow-meta intercepts AST & its resulting models
    • AST allows us to alter the surface level of the language without changing the rest of the compiler (although we can and usually do)
  46. Sources Cited
    • Arrow-meta:
    • Kotlin compiler crash course:
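
The Lexer and AST slides above describe tokens, text ranges, and nodes that map back onto the source. The following is a minimal sketch of that idea, not the Kotlin compiler's implementation: `Token`, `Leaf`, `Binary`, `lex`, and `parse` are hypothetical names invented for illustration, and the toy grammar (operands joined by `+` or `*`, left to right, with no operator precedence) is an assumption made to keep the example short.

```kotlin
// Hypothetical types for illustration; none of this is the Kotlin compiler's API.

// A token is a lexeme plus the text range it covers in the source.
data class Token(val text: String, val range: IntRange)

// Every node knows which part of the source text it maps to.
sealed class Node {
    abstract val range: IntRange
}

// Bottom-most node: matches exactly one token.
data class Leaf(val token: Token) : Node() {
    override val range: IntRange get() = token.range
}

// Higher node: spans a multiple-token fragment.
data class Binary(val left: Node, val op: Token, val right: Node) : Node() {
    override val range: IntRange get() = left.range.first..right.range.last
}

// Lexer: breaks the code text into a sequence of tokens, skipping whitespace.
fun lex(source: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var i = 0
    while (i < source.length) {
        val c = source[i]
        when {
            c.isWhitespace() -> i++
            c.isLetterOrDigit() -> {
                val start = i
                while (i < source.length && source[i].isLetterOrDigit()) i++
                tokens += Token(source.substring(start, i), start until i)
            }
            c == '+' || c == '*' -> {
                tokens += Token(c.toString(), i..i)
                i++
            }
            else -> error("Unexpected character '$c' at offset $i")
        }
    }
    return tokens
}

// Parser: builds the tree left to right (assumption: no operator precedence).
fun parse(tokens: List<Token>): Node {
    var pos = 0
    fun operand(): Node {
        val t = tokens.getOrNull(pos) ?: error("Expected an operand at end of input")
        require(t.text != "+" && t.text != "*") { "Expected an operand but found '${t.text}'" }
        pos++
        return Leaf(t)
    }
    var node: Node = operand()
    while (pos < tokens.size) {
        val op = tokens[pos++]
        require(op.text == "+" || op.text == "*") { "Expected an operator but found '${op.text}'" }
        node = Binary(node, op, operand())
    }
    return node
}

fun main() {
    val source = "a + 42 * b"
    val tokens = lex(source)
    println(tokens.map { it.text })        // prints [a, +, 42, *, b]
    val ast = parse(tokens)
    println(ast.range)                     // prints 0..9
    println(source.substring(ast.range))   // prints a + 42 * b
}
```

Because each node carries a range, any subtree can be mapped back to the exact text the end user wrote. Whitespace is dropped by the lexer but is still covered by the ranges of the higher nodes.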