The Shape of Kotlin

November 30, 2019

At Kotlin Day 2019 in London, I share how I stumbled onto AST parsing, demystify the Kotlin compiler, and show how 47 Degrees is taking advantage of AST parsing to elevate the power of Kotlin metaprogramming.



    47 Degrees is a global consulting firm and certified Lightbend and Databricks Partner specializing in […]
  5. 9. By studying the AST, we can learn a lot about the Kotlin compiler.
  6. 11. • The AST is a form of abstract representation that is generated and used in several roles within the compiler
    • The AST is a tree made of nodes that map directly to text ranges in the underlying document
    • The bottom-most nodes of an AST match individual tokens
    • Higher nodes match multiple-token fragments
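The relationship between nodes and text ranges described on this slide can be sketched in a few lines of Kotlin. This is only an illustration under stated assumptions: `Node`, `Leaf`, `Composite`, `textOf`, and `exampleProperty` are hypothetical names invented here, not classes from the Kotlin compiler, and the node kinds are made-up labels.

```kotlin
// Hypothetical node model for illustration only; these are not the
// Kotlin compiler's own classes.
sealed class Node {
    // Inclusive character offsets into the source text.
    abstract val range: IntRange
}

// A bottom-most node: it corresponds to exactly one token.
data class Leaf(val kind: String, override val range: IntRange) : Node()

// A higher node: it spans a multi-token fragment, from the start of its
// first child to the end of its last child (assumes at least one child).
data class Composite(val kind: String, val children: List<Node>) : Node() {
    override val range: IntRange
        get() = children.first().range.first..children.last().range.last
}

// Recovers the source text that a node maps to.
fun textOf(node: Node, source: String): String = source.substring(node.range)

// Hand-built tree for the source text "val x = 1".
fun exampleProperty(): Pair<String, Composite> {
    val source = "val x = 1"
    val tree = Composite(
        "PROPERTY",
        listOf(
            Leaf("VAL_KEYWORD", 0..2),
            Leaf("IDENTIFIER", 4..4),
            Leaf("EQ", 6..6),
            Leaf("INTEGER_LITERAL", 8..8)
        )
    )
    return source to tree
}
```

Calling `textOf` on the root of `exampleProperty()` returns the whole declaration, while calling it on a single leaf returns just that token's text.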
  7. 12. What does AST parsing tell us? (Amanda Hinchman-Dominguez)
    • AST parsing tells us how code has been written by the end user
    • AST parsing gives all but punctuation in the analyzed text range including tokens
  9. 16. Parsing Phase
    • Builds the AST
    • Analyzes the tree and augments it with complete information
  10. 17. Analysis Phase
    • The basis for any compiler optimization
    • Transforms the input program into an unoptimized intermediate representation
    • Generates the Program Structure Interface (PSI)
  11. 18. Resolution Phase
    • Generates 2 symbol tables:
      ◦ One to accompany the AST
      ◦ Another for the associated generated IR model
    • PSI is enhanced with descriptors that have been type-checked
    • Optimizations are performed on the IR to improve the quality and performance of the machine code
  12. 21. Lexer
    • The lexer breaks code text into a sequence of lexical tokens
    • While scanning, the lexer may break code into multiple fragments, or lexemes
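To make the lexer slide concrete, here is a minimal hand-written lexer in Kotlin. It is a sketch under stated assumptions, not how the Kotlin compiler tokenizes code: `Token`, `lex`, `KEYWORDS`, and the four token kinds are hypothetical names chosen for illustration, and only a tiny subset of the language is handled.

```kotlin
// Hypothetical token model; the kinds are illustrative labels, not the
// Kotlin compiler's token types.
data class Token(val kind: String, val text: String, val start: Int)

// A deliberately tiny keyword set, assumed for this sketch.
val KEYWORDS = setOf("val", "var", "fun")

// Scans the source left to right and emits one token per lexeme.
fun lex(source: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var i = 0
    while (i < source.length) {
        val c = source[i]
        when {
            // Whitespace separates tokens and is skipped.
            c.isWhitespace() -> i++
            // Identifiers and keywords: a letter or '_' followed by letters, digits, or '_'.
            c.isLetter() || c == '_' -> {
                val start = i
                while (i < source.length && (source[i].isLetterOrDigit() || source[i] == '_')) i++
                val text = source.substring(start, i)
                tokens += Token(if (text in KEYWORDS) "KEYWORD" else "IDENTIFIER", text, start)
            }
            // Integer literals: a run of digits.
            c.isDigit() -> {
                val start = i
                while (i < source.length && source[i].isDigit()) i++
                tokens += Token("INT", source.substring(start, i), start)
            }
            // Anything else becomes a single-character symbol token.
            else -> {
                tokens += Token("SYMBOL", c.toString(), i)
                i++
            }
        }
    }
    return tokens
}
```

For the input `val x = 42`, `lex` yields four tokens in order: a keyword, an identifier, a symbol, and an integer literal, each carrying the offset where its lexeme starts.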
  13. 22. Syntax Analyzer
    • Builds the AST
    • The parse tree is often analyzed, augmented, and transformed in later phases of the compiler
  14. 23. Semantic Analyzer
    • The compiler walks the AST to perform type checking and semantic analysis
    • Generates the symbol table & IR
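The idea of checking a tree against a symbol table can be sketched as follows. This is a toy example under stated assumptions: the `Expr` hierarchy, the `Type` enum, and `typeOf` are hypothetical names, and the symbol table is passed in as a plain map rather than generated by the analyzer as the slide describes.

```kotlin
// Hypothetical expression tree for illustration only.
sealed class Expr
data class IntLit(val value: Int) : Expr()
data class StrLit(val value: String) : Expr()
data class Ref(val name: String) : Expr()
data class Plus(val left: Expr, val right: Expr) : Expr()

enum class Type { INT, STRING }

// Walks the tree bottom-up and returns the type of the expression.
// Throws IllegalStateException on an unresolved name or a type mismatch.
// `symbols` stands in for a symbol table mapping names to their types.
fun typeOf(expr: Expr, symbols: Map<String, Type>): Type = when (expr) {
    is IntLit -> Type.INT
    is StrLit -> Type.STRING
    is Ref -> symbols[expr.name] ?: error("Unresolved reference: ${expr.name}")
    is Plus -> {
        val left = typeOf(expr.left, symbols)
        val right = typeOf(expr.right, symbols)
        // In this toy language both operands must have the same type.
        if (left == right) left else error("Type mismatch: $left + $right")
    }
}
```

With `x` registered as `Type.INT`, the expression `Plus(Ref("x"), IntLit(1))` checks as `Type.INT`, whereas `Plus(Ref("x"), StrLit("a"))` is rejected with a type mismatch.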
  15. 24. Intermediate Code Generator
    • Outputs unoptimized intermediate representation (IR)
    • Analysis performed on IR
      ◦ Control flow
      ◦ Call stacks
  16. 25. Intermediate Code Optimizer
    • Machine-dependent optimizations on IR
    • Improves performance & quality of produced machine code
    • Resource & storage decisions
  17. 31. • IR is generated as another form of abstract representation for CPU-level architecture
    • PSI and IR each have symbol tables mapping their nodes to descriptors
    (Diagram labels: PSI, Descriptor, IR)
  18. 33. • Arrow-meta intercepts the AST & its resulting models
    • The AST allows us to alter the surface level of the language without changing the rest of the compiler (although we can and usually do)
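The claim that rewriting the AST can change the surface of a language without touching later phases can be illustrated with a small tree rewrite. This sketch does not use Arrow Meta or any compiler API: `Tree`, `Unless`, `desugar`, and `eval` are hypothetical names, and `eval` merely stands in for "the rest of the compiler", which only understands `If` and `Not`.

```kotlin
// Hypothetical tree for a toy boolean language; not a compiler API.
sealed class Tree
data class Bool(val value: Boolean) : Tree()
data class Not(val operand: Tree) : Tree()
data class If(val condition: Tree, val thenBranch: Tree, val elseBranch: Tree) : Tree()

// New surface syntax that the later phase knows nothing about.
data class Unless(val condition: Tree, val thenBranch: Tree, val elseBranch: Tree) : Tree()

// Rewrites every Unless node into an equivalent If/Not combination,
// leaving all other nodes structurally unchanged.
fun desugar(tree: Tree): Tree = when (tree) {
    is Bool -> tree
    is Not -> Not(desugar(tree.operand))
    is If -> If(desugar(tree.condition), desugar(tree.thenBranch), desugar(tree.elseBranch))
    is Unless -> If(Not(desugar(tree.condition)), desugar(tree.thenBranch), desugar(tree.elseBranch))
}

// Stand-in for the later compiler phases: it handles only the core nodes
// and rejects any surface syntax that was not rewritten first.
fun eval(tree: Tree): Boolean = when (tree) {
    is Bool -> tree.value
    is Not -> !eval(tree.operand)
    is If -> if (eval(tree.condition)) eval(tree.thenBranch) else eval(tree.elseBranch)
    is Unless -> error("Unless must be desugared before evaluation")
}
```

Here `eval(desugar(Unless(Bool(false), Bool(true), Bool(false))))` evaluates to `true`, while passing the same `Unless` node to `eval` directly fails, because only the rewrite step knows about the new construct.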
