The Shape of Kotlin

mvndy_hd
November 30, 2019

At Kotlin Day 2019 in London, I share how I stumbled onto AST parsing, demystify the black box that is the Kotlin compiler, and show how 47 Degrees is using AST parsing to elevate the power of Kotlin metaprogramming.

Transcript

  1. 1.

    The Shape of Kotlin: AST Parsing and its role in the Kotlin compiler
    Amanda Hinchman-Dominguez, 47deg.com
  2. 2.

    The Shape of Kotlin
    • How I stumbled onto AST parsing
    • Demystify the black box that is the Kotlin compiler
    • How 47 Degrees is leveraging AST parsing in Arrow-meta to elevate the power of Kotlin metaprogramming
  3. 4.

    Detecting UI inputs
    • Abstract Syntax Tree (AST) parsing
    • Program Structure Interface (PSI)
  4. 6.

    47 Degrees is a global consulting firm and certified Lightbend and Databricks Partner
    Specializing in
  5. 9.

    By studying the AST, we can learn a lot about the Kotlin compiler.
  6. 11.

    • The AST is a form of abstracted representation that is generated and used in several roles within the compiler
    • The AST is a tree made of nodes that map directly to text ranges in the underlying document
    • Bottom-most nodes of an AST match individual tokens
    • Higher nodes match multiple-token fragments
  7. 12.

    What does AST parsing tell us?
    • AST parsing tells us how code has been written by the end user
    • AST parsing gives us everything in the analyzed text range, including tokens, except punctuation
  8. 14.

    2.plus(3)
    Element.DOT_QUALIFIED_EXPRESSION
      Element.INTEGER_CONSTANT
        Token.NUMBER
      Token.DOT
      Element.CALL_EXPRESSION
        Element.REFERENCE_EXPRESSION
          Token.IDENTIFIER
        Element.VALUE_ARGUMENT_LIST
          Token.LPAR
          Element.VALUE_ARGUMENT
            Element.INTEGER_CONSTANT
              Token.INTEGER_LITERAL
          Token.RPAR
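
The relationship between nodes and text ranges for `2.plus(3)` can be sketched with a hand-built tree. This is a toy model, not the real Kotlin PSI: the `Node` class, its offsets, and the helper functions are assumptions made for illustration, and the node names simply echo the ones on the slide.

```kotlin
// Toy AST node; NOT the real Kotlin PSI classes.
data class Node(
    val type: String,
    val start: Int,                        // inclusive offset into the source text
    val end: Int,                          // exclusive offset into the source text
    val children: List<Node> = emptyList()
) {
    val isLeaf: Boolean get() = children.isEmpty()
    fun text(source: String): String = source.substring(start, end)
}

const val SOURCE = "2.plus(3)"

// Hand-written tree for SOURCE; a real parser would build this from tokens.
fun buildTree(): Node =
    Node("DOT_QUALIFIED_EXPRESSION", 0, 9, listOf(
        Node("INTEGER_CONSTANT", 0, 1, listOf(Node("INTEGER_LITERAL", 0, 1))),
        Node("DOT", 1, 2),
        Node("CALL_EXPRESSION", 2, 9, listOf(
            Node("REFERENCE_EXPRESSION", 2, 6, listOf(Node("IDENTIFIER", 2, 6))),
            Node("VALUE_ARGUMENT_LIST", 6, 9, listOf(
                Node("LPAR", 6, 7),
                Node("VALUE_ARGUMENT", 7, 8, listOf(
                    Node("INTEGER_CONSTANT", 7, 8, listOf(Node("INTEGER_LITERAL", 7, 8)))
                )),
                Node("RPAR", 8, 9)
            ))
        ))
    ))

// Collects the leaves left to right; each leaf covers exactly one token.
fun leaves(node: Node): List<Node> =
    if (node.isLeaf) listOf(node) else node.children.flatMap { leaves(it) }

// Renders the tree with two spaces of indentation per level.
fun render(node: Node, source: String, depth: Int = 0): String =
    buildString {
        append("  ".repeat(depth)).append(node.type)
            .append(" \"").append(node.text(source)).append("\"\n")
        node.children.forEach { append(render(it, source, depth + 1)) }
    }

fun main() {
    print(render(buildTree(), SOURCE))
}
```

Concatenating the text of the leaves reproduces the source, while a higher node such as the call expression covers the multi-token fragment `plus(3)`.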
  9. 16.

    Parsing Phase
    • Builds the AST
    • Analyzes the tree and augments it with complete information
  10. 17.

    Analysis Phase
    • The basis for any compiler optimization
    • Transforms the input program into an unoptimized intermediate representation
    • Generates the Program Structure Interface (PSI)
  11. 18.

    Resolution Phase
    • Generates 2 symbol tables:
      ◦ One to accompany the AST
      ◦ Another for the associated generated IR model
    • PSI is enhanced with descriptors which have been type-checked
    • Optimizations are performed on the IR to improve the quality and performance of machine code
  12. 21.

    Lexer
    • Lexer breaks code text into a sequence of lexical tokens
    • Lexer may break code into multiple fragments while scanning, or into lexemes
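
As a rough illustration of what a lexer does, the sketch below splits a source string into typed tokens. It is a minimal hand-rolled scanner, not the Kotlin compiler's lexer; `TokenType`, `Token`, and `lex` are hypothetical names, and only the handful of token kinds needed for `2.plus(3)` are supported.

```kotlin
// Hypothetical token kinds; the names echo the slide but this is not the real lexer.
enum class TokenType { INTEGER_LITERAL, IDENTIFIER, DOT, LPAR, RPAR, WHITE_SPACE }

data class Token(val type: TokenType, val text: String, val offset: Int)

fun lex(source: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var i = 0
    while (i < source.length) {
        val start = i
        val c = source[i]
        val type = when {
            c.isDigit() -> {
                // Consume a run of digits as one integer literal.
                while (i < source.length && source[i].isDigit()) i++
                TokenType.INTEGER_LITERAL
            }
            c.isLetter() || c == '_' -> {
                // Consume letters, digits, and underscores as one identifier.
                while (i < source.length && (source[i].isLetterOrDigit() || source[i] == '_')) i++
                TokenType.IDENTIFIER
            }
            c.isWhitespace() -> {
                while (i < source.length && source[i].isWhitespace()) i++
                TokenType.WHITE_SPACE
            }
            c == '.' -> { i++; TokenType.DOT }
            c == '(' -> { i++; TokenType.LPAR }
            c == ')' -> { i++; TokenType.RPAR }
            else -> throw IllegalArgumentException("Unexpected character '$c' at offset $i")
        }
        // The lexeme is the exact slice of source text the token covers.
        tokens += Token(type, source.substring(start, i), start)
    }
    return tokens
}

fun main() {
    lex("2.plus(3)").forEach { println("${it.type} '${it.text}' @${it.offset}") }
}
```

Whitespace is kept as a token here so that joining all lexemes gives back the original text; a scanner could just as well drop it.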
  13. 22.

    Syntax Analyzer
    Builds the AST. In later phases of the compiler, the parse tree is often:
    • Analyzed
    • Augmented
    • Transformed
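
To make the tree-building step concrete, here is a small recursive-descent parser for a made-up expression grammar that covers inputs like `2.plus(3)`. It is only a sketch: the grammar, the `Expr` classes, and the regex-based `tokenize` helper are assumptions for this example and bear no relation to the Kotlin compiler's actual parser.

```kotlin
// Toy tree shapes; hypothetical, not Kotlin PSI elements.
sealed class Expr
data class IntConst(val value: Int) : Expr()
data class Call(val receiver: Expr, val name: String, val argument: Expr) : Expr()

class Parser(private val tokens: List<String>) {
    private var pos = 0

    private fun peek(): String? = tokens.getOrNull(pos)
    private fun next(): String = tokens.getOrNull(pos++) ?: error("Unexpected end of input")
    private fun expect(text: String) {
        val actual = next()
        require(actual == text) { "Expected '$text' but found '$actual'" }
    }

    // expression := primary ( '.' IDENTIFIER '(' expression ')' )*
    private fun parseExpression(): Expr {
        var result: Expr = parsePrimary()
        while (peek() == ".") {
            next()
            val name = next()
            require(name.first().isLetter() || name.first() == '_') { "Expected a name but found '$name'" }
            expect("(")
            val argument = parseExpression()
            expect(")")
            result = Call(result, name, argument)
        }
        return result
    }

    // primary := INTEGER
    private fun parsePrimary(): Expr {
        val token = next()
        return IntConst(token.toIntOrNull() ?: error("Expected an integer but found '$token'"))
    }

    fun parseAll(): Expr {
        val expr = parseExpression()
        require(peek() == null) { "Unexpected trailing token '${peek()}'" }
        return expr
    }
}

// Simplistic tokenizer: characters the pattern does not match are silently skipped.
fun tokenize(source: String): List<String> =
    Regex("""\d+|[A-Za-z_]\w*|[.()]""").findAll(source).map { it.value }.toList()

fun parse(source: String): Expr = Parser(tokenize(source)).parseAll()

fun main() {
    println(parse("1.plus(2).times(3)"))
}
```

Chained calls nest to the left, so the receiver of `times` is itself the `plus` call node.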
  14. 23.

    Semantic Analyzer
    • Compiler walks the AST to perform type checking and semantic analysis
    • Generates symbol table & IR
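
A very small version of type checking against a symbol table might look like the following. Everything here is an assumption made for the example: the `Type` enum, `FunctionSymbol`, `SymbolTable`, and `sampleSymbols` are hypothetical and far simpler than the descriptors the Kotlin compiler actually produces.

```kotlin
// Toy expression tree; hypothetical, not Kotlin PSI.
sealed class Expr
data class IntConst(val value: Int) : Expr()
data class StringConst(val value: String) : Expr()
data class Ref(val name: String) : Expr()
data class Call(val receiver: Expr, val name: String, val argument: Expr) : Expr()

enum class Type { INT, STRING }

// Describes one known member function: receiver.name(parameter): returns
data class FunctionSymbol(val receiver: Type, val name: String, val parameter: Type, val returns: Type)

class SymbolTable(
    private val variables: Map<String, Type>,
    functions: List<FunctionSymbol>
) {
    private val functionsByKey = functions.associateBy { Triple(it.receiver, it.name, it.parameter) }

    fun variable(name: String): Type =
        variables[name] ?: error("Unresolved reference: $name")

    fun function(receiver: Type, name: String, parameter: Type): FunctionSymbol =
        functionsByKey[Triple(receiver, name, parameter)]
            ?: error("No function $name($parameter) on $receiver")
}

// Walks the tree bottom-up; throws if a name or call cannot be resolved.
fun typeOf(expr: Expr, symbols: SymbolTable): Type = when (expr) {
    is IntConst -> Type.INT
    is StringConst -> Type.STRING
    is Ref -> symbols.variable(expr.name)
    is Call -> symbols.function(
        typeOf(expr.receiver, symbols), expr.name, typeOf(expr.argument, symbols)
    ).returns
}

// Sample table: one Int variable `x` and Int.plus(Int): Int.
fun sampleSymbols(): SymbolTable = SymbolTable(
    variables = mapOf("x" to Type.INT),
    functions = listOf(FunctionSymbol(Type.INT, "plus", Type.INT, Type.INT))
)

fun main() {
    println(typeOf(Call(IntConst(2), "plus", Ref("x")), sampleSymbols()))
}
```

A call such as `2.plus("a")` fails in this sketch because the table holds no `plus` that accepts a `STRING` argument.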
  15. 24.

    Intermediate Code Generator
    • Outputs unoptimized intermediate representation (IR)
    • Analysis performed on IR
      ◦ Control flow
      ◦ Call stacks
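
One way to picture "unoptimized IR" is a flat list of simple instructions produced by walking the tree. The sketch below lowers a toy expression tree into a three-address-style instruction list; the `Instr` classes and temporary naming are invented for illustration and are not the Kotlin compiler's IR.

```kotlin
// Toy expression tree; hypothetical, not Kotlin PSI.
sealed class Expr
data class IntConst(val value: Int) : Expr()
data class Call(val receiver: Expr, val name: String, val argument: Expr) : Expr()

// Toy linear IR: every instruction writes its result into a named temporary.
sealed class Instr
data class LoadConst(val target: String, val value: Int) : Instr()
data class Invoke(val target: String, val function: String, val receiver: String, val argument: String) : Instr()

class IrGenerator {
    private val instructions = mutableListOf<Instr>()
    private var nextTemp = 0

    private fun fresh(): String = "t${nextTemp++}"

    // Emits instructions for expr and returns the temporary that holds its value.
    private fun lower(expr: Expr): String = when (expr) {
        is IntConst -> fresh().also { instructions += LoadConst(it, expr.value) }
        is Call -> {
            val receiver = lower(expr.receiver)
            val argument = lower(expr.argument)
            fresh().also { instructions += Invoke(it, expr.name, receiver, argument) }
        }
    }

    fun generate(expr: Expr): List<Instr> {
        lower(expr)
        return instructions.toList()
    }
}

fun generateIr(expr: Expr): List<Instr> = IrGenerator().generate(expr)

fun main() {
    generateIr(Call(IntConst(2), "plus", IntConst(3))).forEach(::println)
}
```

No optimization happens here: each constant gets its own temporary, even where a later pass could fold the whole call into a single value.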
  16. 25.

    Intermediate Code Optimizer
    • Machine-dependent optimizations on IR
    • Improves performance & quality of produced machine code
    • Resource & storage decisions
  17. 31.

    • IR is generated as another form of abstracted representation for CPU-level architecture
    • PSI and IR each have symbol tables mapping their nodes to descriptors
    [Diagram labels: PSI, Descriptor, IR]
  18. 33.

    Arrow-meta intercepts AST & its resulting models
    • AST allows us to alter the surface level of the language without changing the rest of the compiler (although we can and usually do)
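
The idea of altering the surface of a language by rewriting its tree can be sketched without any compiler at all. The example below is not Arrow-meta's API; it is a toy bottom-up rewrite over a hypothetical `Expr` tree, with a made-up rule that treats `add` as an alias for `plus`.

```kotlin
// Toy expression tree; hypothetical, not Kotlin PSI and not Arrow-meta.
sealed class Expr
data class IntConst(val value: Int) : Expr()
data class Call(val receiver: Expr, val name: String, val argument: Expr) : Expr()

// Applies `rule` bottom-up: children are rewritten first, then the rebuilt node.
fun rewrite(expr: Expr, rule: (Expr) -> Expr): Expr = when (expr) {
    is IntConst -> rule(expr)
    is Call -> rule(
        expr.copy(
            receiver = rewrite(expr.receiver, rule),
            argument = rewrite(expr.argument, rule)
        )
    )
}

// Hypothetical surface-level rule: every call named `add` becomes a call to `plus`.
val addAlias: (Expr) -> Expr = { node ->
    if (node is Call && node.name == "add") node.copy(name = "plus") else node
}

fun main() {
    val original = Call(Call(IntConst(1), "add", IntConst(2)), "add", IntConst(3))
    println(rewrite(original, addAlias))
}
```

Later phases would only ever see `plus`, which is the sense in which a tree-level change leaves the rest of the pipeline untouched.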
  19. 46.

    Sources Cited
    • Arrow-meta: https://github.com/arrow-kt/arrow-meta
    • Kotlin compiler crash course: https://github.com/ahinchman1/Kotlin-Compiler-Crash-Course