
ANTLR4


A (rather simple) overview

https://github.com/migulorama/AutoAnalyze

Duarte Duarte

April 08, 2014


Transcript

  1. ANTLR 4: A (RATHER SIMPLE) OVERVIEW. By Duarte Duarte, Luís Cleto, Miguel Marques and Ruben Cordeiro. COMP/MIEIC/FEUP - 2013/2014
  2. WHAT'S ANTLR 4? ANTLR 4 is a parser generator and language tool: given a grammar, it generates a parser that can build parse trees automatically.
  3. INDUSTRY ADOPTION: Twitter search uses ANTLR for query parsing. Oracle uses ANTLR within the SQL Developer IDE. The NetBeans IDE parses C++ with ANTLR.
  4. WHAT DOES PARSING WITH ANTLR4 MEAN? The structure of the input is described as a grammar. The recognizer is divided into a lexer and a parser. The parse tree can be generated automatically.
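The lexer half of that split can be pictured with a small hand-written tokenizer. This is a plain-Java sketch, not code generated by ANTLR: the class name `TinyLexer`, the `KIND:text` token strings, and the choice to skip whitespace are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer: groups raw characters into tokens,
// the job a lexer performs before the parser sees any input.
final class TinyLexer {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // assumption: whitespace only separates tokens
            } else if (c >= 'a' && c <= 'z') {
                int start = i;
                while (i < input.length() && input.charAt(i) >= 'a' && input.charAt(i) <= 'z') i++;
                tokens.add("ID:" + input.substring(start, i));
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') i++;
                tokens.add("INT:" + input.substring(start, i));
            } else if ("=();".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c)); // single-character punctuation token
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x = 42;")); // prints [ID:x, =, INT:42, ;]
    }
}
```

A parser would then consume this token list rather than raw characters, which is what keeps the two halves of the recognizer independent.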
  5. ANTLR4 ALL(*)
    Generates a recursive-descent parser like the one you would build by hand.
    Performs dynamic grammar analysis while parsing and saves the results, much as a JIT does.
    Allows direct left recursion; indirect left recursion is not supported.
    Parsing time is typically near-linear.
  6. ANTLR4 ALL(*): By moving grammar analysis to parse-time, it's possible to avoid the common pitfalls of ambiguity and left-recursion.
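To see why left recursion is a pitfall, consider a rule such as `expr: expr '*' expr | expr '+' expr | INT`. Coded naively as a recursive function, `expr()` would call itself before consuming any input and never terminate. Hand-written parsers commonly replace that recursion with a loop guarded by operator precedence (precedence climbing), and ANTLR 4 rewrites direct left-recursive rules along broadly similar lines. The sketch below is plain Java under stated assumptions, not ANTLR output: the class name `PrecedenceClimbing` and the operator set (`+`, `-`, `*`) are hypothetical.

```java
// Hypothetical sketch: evaluates integer expressions with precedence climbing,
// a loop-based stand-in for the left-recursive rule expr: expr op expr | INT.
final class PrecedenceClimbing {
    private final String src;
    private int pos = 0;

    private PrecedenceClimbing(String src) { this.src = src.replace(" ", ""); }

    static int eval(String text) {
        PrecedenceClimbing p = new PrecedenceClimbing(text);
        int value = p.expr(0);
        if (p.pos != p.src.length()) throw new IllegalArgumentException("unexpected input at " + p.pos);
        return value;
    }

    // Parse one operand, then keep absorbing operators whose precedence is at least minPrec.
    private int expr(int minPrec) {
        int left = primary();
        while (pos < src.length()) {
            char op = src.charAt(pos);
            int prec = precedence(op);
            if (prec < minPrec) break;     // also stops on non-operators (precedence -1)
            pos++;                         // consume the operator
            int right = expr(prec + 1);    // prec + 1 makes the operator left-associative
            if (op == '*') left *= right;
            else if (op == '+') left += right;
            else left -= right;
        }
        return left;
    }

    // INT: one or more digits.
    private int primary() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (pos == start) throw new IllegalArgumentException("expected INT at " + pos);
        return Integer.parseInt(src.substring(start, pos));
    }

    private static int precedence(char op) {
        switch (op) {
            case '*': return 2;
            case '+': case '-': return 1;
            default: return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("1 + 2 * 3")); // prints 7
        System.out.println(eval("8 - 2 - 1")); // prints 5
    }
}
```

The `8 - 2 - 1` case is the one that matters for left recursion: it groups as `(8 - 2) - 1`, which is what the left-recursive form of the rule expresses.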
  7. GRAMMAR SPECIFICATION
    grammar assignment;
    stat: ID '=' expr ';'   // match an assignment
        | ID '=' expr ';'   // oops! an exact duplicate of previous alternative
        | expr ';'          // expression statement; can match "f();"
        ;
    expr: ID '(' ')'
        | INT
        ;
    ID  : [a-z]+ ;          // match one or more of any lowercase letter
    INT : [0-9]+ ;          // match integers
    The ANTLR tool generates recursive-descent parsers from grammar rules such as the ones above.
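For comparison, here is roughly what a hand-written recursive-descent recognizer for the `stat` and `expr` rules above might look like: one method per rule, one branch per alternative. It is a sketch, not the code ANTLR generates. The class name `AssignmentRecognizer` is hypothetical, the duplicate alternative is dropped because it can never match anything new, and the input is read character by character without a separate lexer.

```java
// Hypothetical hand-written recognizer for:
//   stat: ID '=' expr ';' | expr ';' ;
//   expr: ID '(' ')' | INT ;
final class AssignmentRecognizer {
    private final String in;
    private int pos = 0;

    private AssignmentRecognizer(String in) { this.in = in; }

    // Returns which alternative of stat matched, or throws on a syntax error.
    static String recognize(String input) {
        AssignmentRecognizer r = new AssignmentRecognizer(input);
        String kind = r.stat();
        if (r.pos != input.length()) throw r.error("end of input");
        return kind;
    }

    private String stat() {
        int start = pos;
        if (id() && peek() == '=') {   // both alternatives may start with ID, so look one step further
            pos++;                     // consume '='
            expr();
            expect(';');
            return "assignment";
        }
        pos = start;                   // not an assignment: rewind and try the other alternative
        expr();
        expect(';');
        return "expression statement";
    }

    private void expr() {
        if (id()) { expect('('); expect(')'); }
        else if (!intLit()) throw error("expression");
    }

    private boolean id() { return run('a', 'z'); }      // ID : [a-z]+
    private boolean intLit() { return run('0', '9'); }  // INT : [0-9]+

    // Consume one or more characters in [lo, hi]; report whether any were consumed.
    private boolean run(char lo, char hi) {
        int start = pos;
        while (pos < in.length() && in.charAt(pos) >= lo && in.charAt(pos) <= hi) pos++;
        return pos > start;
    }

    private char peek() { return pos < in.length() ? in.charAt(pos) : '\0'; }

    private void expect(char c) {
        if (peek() != c) throw error("'" + c + "'");
        pos++;
    }

    private IllegalArgumentException error(String expected) {
        return new IllegalArgumentException("expected " + expected + " at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(recognize("x=f();")); // prints assignment
        System.out.println(recognize("f();"));   // prints expression statement
    }
}
```

The manual rewind in `stat()` is the kind of lookahead decision a hand-written parser has to hard-code for each rule.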
  8. PARSE-TREE LISTENERS AND VISITORS: Given this concrete structure, we can use the tree-walking mechanisms that ANTLR generates automatically, rather than writing the same tree-walking boilerplate code for each application.
  9. PARSE-TREE LISTENERS AND VISITORS: Options for separating actions from grammars in ANTLR4:
    Listeners (SAX style)
    Visitors (DOM style)
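The difference between the two styles can be sketched in plain Java on a tiny hand-made tree. None of this is ANTLR's generated API: the `Num` and `Add` node types, the `Listener` and `Visitor` interfaces, and the `walk` helper are all hypothetical names. With a listener, a walker drives the traversal and fires callbacks; with a visitor, the visitor itself decides when to descend, and its methods can return values.

```java
import java.util.ArrayList;
import java.util.List;

final class TreeSketch {
    // A tiny hand-made tree: either a number leaf or an addition node.
    interface Node { <T> T accept(Visitor<T> v); }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public <T> T accept(Visitor<T> v) { return v.visitNum(this); }
    }

    static final class Add implements Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> v) { return v.visitAdd(this); }
    }

    // Visitor style: explicit calls into children, with return values.
    interface Visitor<T> { T visitNum(Num n); T visitAdd(Add a); }

    // Listener style: callbacks only; the walker below controls the traversal.
    interface Listener { void enterAdd(Add a); void exitAdd(Add a); void onNum(Num n); }

    static void walk(Node node, Listener l) {
        if (node instanceof Add) {
            Add a = (Add) node;
            l.enterAdd(a);
            walk(a.left, l);
            walk(a.right, l);
            l.exitAdd(a);
        } else {
            l.onNum((Num) node);
        }
    }

    // A calculator written as a visitor: each visit returns the subtree's value.
    static int evaluate(Node tree) {
        return tree.accept(new Visitor<Integer>() {
            public Integer visitNum(Num n) { return n.value; }
            public Integer visitAdd(Add a) { return a.left.accept(this) + a.right.accept(this); }
        });
    }

    // A listener that only records the order in which events arrive.
    static List<String> trace(Node tree) {
        List<String> events = new ArrayList<>();
        walk(tree, new Listener() {
            public void enterAdd(Add a) { events.add("enterAdd"); }
            public void exitAdd(Add a) { events.add("exitAdd"); }
            public void onNum(Num n) { events.add("num " + n.value); }
        });
        return events;
    }

    public static void main(String[] args) {
        Node tree = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        System.out.println(evaluate(tree)); // prints 6
        System.out.println(trace(tree));    // prints [enterAdd, num 1, enterAdd, num 2, num 3, exitAdd, exitAdd]
    }
}
```

In both cases the application logic lives outside the tree definition, which mirrors how listeners and visitors keep actions out of the grammar.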
  10. PRACTICAL EXAMPLE (Java): We built a calculator without having to insert raw actions into the grammar. The grammar is kept application-independent and programming-language neutral.
  11. SUMMARY: ANTLR4's release codename is Honey Badger. It was named after the extremely resilient animal since ANTLR4 takes whatever you give it: it just doesn't give a damn!