Compiler in JavaScript using ANTLR

Slides from my talk at the BerlinJS meetup in August 2018. The talk covered language compilers and the basic principles behind them.

Alena Khineika

August 16, 2018

Transcript

  1. SOURCE-TO-SOURCE TRANSFORMATION
    • Find existing parsers for specific programming languages
    • Create your own parser
  2. SOURCE-TO-SOURCE TRANSFORMATION
    • Find existing parsers for specific programming languages
    • Create your own parser
    • Use some tools and libraries to generate a parser
  3. 1. LEXICAL ANALYSIS
    Raw Code: { X : 1 }
    Tokens: OpenBrace {, Identifier x, Colon :, Identifier 1, CloseBrace }
    Input (JS) → Lexical Analysis → Tokens
  4. FROM STRING TO TREE
    Raw Code: { X : 1 }
    Tokens: OpenBrace {, Identifier x, Colon :, Identifier 1, CloseBrace }
    Tree Structure
  5. +

  6. ANTLR
    • ANother Tool for Language Recognition
    • Generates parsers based on grammars
    • Written in Java, but has a JavaScript runtime
  7. Parse Tree
    • Represents concrete syntax
    • Contains unusable information
    Abstract Syntax Tree
    • Represents abstract syntax
    • Contains only meaningful information
  8. ANTLR AUXILIARY FILES
    Input (JS): Characters
    → *Lexer.js (Lexical Analysis) → Tokens
    → *Parser.js (Syntax Analysis) → Tree
    → *Visitor.js (Tree Traversal) → New Tree
  9. http://bit.ly/compiler-in-javascript
    This article can be viewed as both a practical guide for writing a JavaScript compiler and a theoretical resource that describes the basic concepts and principles of compiler design.
  10. Listener
    → enterPropertyAssignment(ctx) → enterPropertyName(ctx) → enterIdentifierName(ctx) → enterTerminal(ctx)
    ← exitTerminal(ctx) ← exitIdentifierName(ctx) ← exitPropertyName(ctx) ← exitPropertyAssignment(ctx)
    • Methods are called automatically
    • Methods can’t return a value
    • You should store values outside the tree
    Visitor
    ↓ visitPropertyNameAndValue(ctx) ↓ visitPropertyAssignment(ctx) ↓ visitPropertyName(ctx) ↓ visitIdentifierName(ctx) ↓ visitTerminal(ctx)
    • You should explicitly call child methods
    • Methods can return any custom type
    • You can modify existing nodes
  11. JAVASCRIPT INPUT: { x: 1 }
    VISITOR: 1 ROOT NODE LEAF NO CHANGES HERE 2 FORMATTING x: 1 ‘x’: 1 3
  12. JAVASCRIPT INPUT: { x: 1 }
    VISITOR: 1 ROOT NODE LEAF NO CHANGES HERE 2 FORMATTING x: 1 ‘x’: 1 3
    PYTHON OUTPUT: { ‘x’: 1 }
  13. LEFT HAND SIDE VISITING
    Standard methods: visit(), visitTerminal()
    NO CHANGES HERE 2
    * The same logic is applicable for the right side
  14. FLOATING POINT NUMBERS • DIFFERENT NUMERAL SYSTEMS • SINGLE AND DOUBLE QUOTES
    What do we support? • Don't damage Pi • Answer should be accurate
    USER INPUT FORMAT: Spaces, empty lines etc.
    ESCAPE SEQUENCES: Have to be escaped
    COMMENTS: Yes / No
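
The pipeline the slides walk through (lexical analysis, tree building, then a visitor that rewrites `{ x: 1 }` as `{ 'x': 1 }`) can be sketched in plain JavaScript. This is a hand-written illustration under stated assumptions, not ANTLR-generated code: `tokenize`, `parseObjectLiteral`, `PythonVisitor`, and `compile` are hypothetical names, the node shapes are invented for the example, and the `NumericLiteral` token type is a label chosen for this sketch (the slides tag `1` as `Identifier`). It handles only flat object literals with identifier keys and numeric values.

```javascript
// Hypothetical mini-pipeline. With ANTLR, the lexer and parser would be
// generated from a grammar instead of written by hand.

// Step 1, lexical analysis: raw characters -> tokens.
// Characters the pattern does not recognize are silently skipped (sketch only).
function tokenize(source) {
  const pattern = /([{}:,])|([A-Za-z_$][\w$]*)|(\d+(?:\.\d+)?)/g;
  const punctuation = { '{': 'OpenBrace', '}': 'CloseBrace', ':': 'Colon', ',': 'Comma' };
  const tokens = [];
  for (const match of source.matchAll(pattern)) {
    if (match[1]) tokens.push({ type: punctuation[match[1]], text: match[1] });
    else if (match[2]) tokens.push({ type: 'Identifier', text: match[2] });
    else tokens.push({ type: 'NumericLiteral', text: match[3] });
  }
  return tokens;
}

// Step 2, syntax analysis: tokens -> tree.
function parseObjectLiteral(tokens) {
  let pos = 0;
  const expect = (type) => {
    const token = tokens[pos++];
    if (!token || token.type !== type) throw new SyntaxError(`Expected ${type}`);
    return token;
  };
  expect('OpenBrace');
  const properties = [];
  while (tokens[pos] && tokens[pos].type !== 'CloseBrace') {
    if (properties.length > 0) expect('Comma');
    const name = expect('Identifier');
    expect('Colon');
    const value = expect('NumericLiteral');
    properties.push({
      type: 'PropertyAssignment',
      name: { type: 'PropertyName', text: name.text },
      value: { type: 'Literal', text: value.text },
    });
  }
  expect('CloseBrace');
  return { type: 'ObjectLiteral', properties };
}

// Step 3, tree traversal: each visit method explicitly visits its children
// and returns a string, mirroring the visitor properties listed on the slides.
class PythonVisitor {
  visit(node) {
    return this[`visit${node.type}`](node);
  }
  visitObjectLiteral(node) {
    const body = node.properties.map((property) => this.visit(property)).join(', ');
    return body ? `{ ${body} }` : '{}';
  }
  visitPropertyAssignment(node) {
    return `${this.visit(node.name)}: ${this.visit(node.value)}`;
  }
  visitPropertyName(node) {
    return `'${node.text}'`; // formatting step: quote the key for Python
  }
  visitLiteral(node) {
    return node.text; // leaf: no changes here
  }
}

function compile(source) {
  return new PythonVisitor().visit(parseObjectLiteral(tokenize(source)));
}
```

Calling `compile('{ x: 1 }')` returns the string `{ 'x': 1 }`. Because every visit method returns a value, the output is assembled bottom-up without any state stored outside the tree, which is the trade-off the slides draw between visitors and listeners.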