Slide 1

JavaCC & JTB
Aggelos Biboudis

Slide 2

JavaCC
• generates a lexical analyzer (token manager)
• generates a top-down parser (LL(k))
• the generated parser is a recursive descent parser
• tree building with JJTree and JTB
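To make "token manager" and "recursive descent parser" concrete, here is a minimal hand-written sketch in Java. It is not the code JavaCC generates; the class name MiniParser, the tiny expression grammar, and all method names are assumptions chosen for illustration. Each non-terminal becomes one method, and each decision looks only at the next token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written sketch; JavaCC-generated code looks different.
class MiniParser {
    private final List<String> tokens;
    private int pos = 0;

    MiniParser(String input) {
        this.tokens = tokenize(input);
    }

    // "Token manager": splits the input into numbers and one-character symbols.
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {            // plays the role of SKIP
                i++;
            } else if (Character.isDigit(c)) {          // plays the role of a NUM token
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                out.add(input.substring(start, i));
            } else if ("+*()".indexOf(c) >= 0) {
                out.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return out;
    }

    // One token of lookahead: inspect the next token without consuming it.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private void consume(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("Expected " + expected + " but found " + peek());
        }
        pos++;
    }

    // Expr -> Term ( "+" Term )*
    int expr() {
        int value = term();
        while (peek().equals("+")) {
            consume("+");
            value += term();
        }
        return value;
    }

    // Term -> Factor ( "*" Factor )*
    int term() {
        int value = factor();
        while (peek().equals("*")) {
            consume("*");
            value *= factor();
        }
        return value;
    }

    // Factor -> NUM | "(" Expr ")"
    int factor() {
        String t = peek();
        if (t.equals("(")) {
            consume("(");
            int value = expr();
            consume(")");
            return value;
        }
        if (!Character.isDigit(t.charAt(0))) {
            throw new IllegalStateException("Expected a number but found " + t);
        }
        pos++;
        return Integer.parseInt(t);
    }

    // Parses and evaluates a whole input string, requiring end of input afterwards.
    static int evaluate(String input) {
        MiniParser parser = new MiniParser(input);
        int value = parser.expr();
        parser.consume("<EOF>");
        return value;
    }
}
```

For example, MiniParser.evaluate("1 + 2 * 3") returns 7, because Term binds tighter than Expr in the grammar given in the comments.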

Slide 3

How to use
• javacc Simple1.jj
• javac *.java
• java Simple1

Slide 4

JavaCC file
• Options
• PARSER_BEGIN(name) - PARSER_END(name)
• Lexical specifications
  – SKIP, TOKEN, SPECIAL_TOKEN, MORE
• List of productions
  – non-terminal declaration followed by :
  – Java declarations and statements within {}
  – lexical tokens as strings or regular expressions
  – use non-terminals with [...]
  – actions

Slide 5

Generated files (name is the parser name given in PARSER_BEGIN)
• <name>.java: The generated parser.
• <name>TokenManager.java: The generated token manager (or scanner/lexical analyzer).
• <name>Constants.java: A bunch of useful constants.
• Also some boilerplate, such as "Token.java" and "ParseException.java"

Slide 6

Options
• LOOKAHEAD
• CHOICE_AMBIGUITY_CHECK
• OTHER_AMBIGUITY_CHECK
• STATIC
• SUPPORT_CLASS_VISIBILITY_PUBLIC
• DEBUG_PARSER
• DEBUG_LOOKAHEAD
• DEBUG_TOKEN_MANAGER
• ERROR_REPORTING
• JAVA_UNICODE_ESCAPE
• UNICODE_INPUT
• IGNORE_CASE
• USER_TOKEN_MANAGER
• USER_CHAR_STREAM
• BUILD_PARSER
• BUILD_TOKEN_MANAGER
• TOKEN_EXTENDS
• TOKEN_FACTORY
• TOKEN_MANAGER_USES_PARSER
• SANITY_CHECK
• FORCE_LA_CHECK
• COMMON_TOKEN_ACTION
• CACHE_TOKENS
• OUTPUT_DIRECTORY

Slide 7

Productions
• javacode_production
  – Java code instead of EBNF, for productions that are not context-free or are otherwise difficult to express in the grammar
  – treated as a black box
• bnf_production
  – local_lookahead
  – java_block
  – "(" expansion_choices ")" [ "+" | "*" | "?" ]
  – "[" expansion_choices "]"
  – [ java_assignment_lhs "=" ] regular_expression
  – [ java_assignment_lhs "=" ] java_identifier "(" java_expression_list ")"
• regular_expr_production
• token_manager_decls

Slide 8

Regular Expressions in JavaCC
• < ID: ["a"-"z","A"-"Z","_"] ( ["a"-"z","A"-"Z","_","0"-"9"] )* >
• ( ... )+
• ( ... )?
• ( r1 | r2 | ... )
• ["a"-"z"]
• ~[] (any character)
• ~["\n","\r"] (any character except the newline characters)
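For readers more used to java.util.regex, the sketch below writes two of the expressions above in that notation. This is only a comparison aid: JavaCC compiles its own regular expression syntax into the token manager and does not use java.util.regex. The class and method names are assumptions made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical comparison class; not part of JavaCC or its generated code.
class IdTokenDemo {
    // Counterpart of < ID: ["a"-"z","A"-"Z","_"] ( ["a"-"z","A"-"Z","_","0"-"9"] )* >
    static final Pattern ID = Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*");

    // Counterpart of ( ~["\n","\r"] )* : any run of characters without a line break
    static final Pattern REST_OF_LINE = Pattern.compile("[^\\n\\r]*");

    // True when the whole string is a valid identifier under the ID pattern.
    static boolean isId(String s) {
        return ID.matcher(s).matches();
    }

    // True when the string contains no line break characters.
    static boolean isSingleLine(String s) {
        return REST_OF_LINE.matcher(s).matches();
    }
}
```

Under these patterns, isId("foo_1") holds while isId("1foo") does not, since the first character class excludes digits.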

Slide 9

Choice conflict
• No backtracking!
  – Decisions are based on local information

Slide 10

Choice conflict (2)
• Warning: Choice conflict involving two expansions at
  line 25, column 3 and line 31, column 3 respectively.
  A common prefix is:
  Consider using a lookahead of 2 for earlier expansion.

Slide 11

Choice Conflict (3)
• Turn the grammar into LL(1)
• Make use of LOOKAHEAD
  – Global lookahead via the LOOKAHEAD option (do not increase it without good reason!)
  – Local lookahead at the choice point
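The two remedies can be sketched in plain Java. The grammar in the comments, the class name LookaheadDemo, and the use of strings as token kinds are all assumptions for illustration; in a real grammar the first remedy would be written as LOOKAHEAD(2) at the choice point, and the second as a rewritten production.

```java
import java.util.List;

// Hypothetical illustration; token kinds are plain strings here.
class LookaheadDemo {
    // Choice point:  Statement  -> Assignment | Call
    //                Assignment -> ID "=" ID
    //                Call       -> ID "(" ")"
    // Both alternatives start with ID, so one token cannot decide between them.
    // Peeking at the second token (a lookahead of 2) can.
    static String chooseWithLookahead2(List<String> tokens) {
        if (tokens.size() < 2 || !tokens.get(0).equals("ID")) {
            throw new IllegalArgumentException("Expected ID followed by \"=\" or \"(\"");
        }
        switch (tokens.get(1)) {
            case "=": return "Assignment";
            case "(": return "Call";
            default:  throw new IllegalArgumentException("Unexpected token: " + tokens.get(1));
        }
    }

    // LL(1) alternative: left-factor the grammar so the common prefix is consumed first.
    //                Statement -> ID ( "=" ID | "(" ")" )
    // After consuming ID, a single token of lookahead is enough.
    static String chooseLeftFactored(List<String> tokens) {
        int pos = 0;
        if (pos >= tokens.size() || !tokens.get(pos).equals("ID")) {
            throw new IllegalArgumentException("Expected ID");
        }
        pos++; // consume the common prefix
        String next = pos < tokens.size() ? tokens.get(pos) : "<EOF>";
        if (next.equals("=")) return "Assignment";
        if (next.equals("(")) return "Call";
        throw new IllegalArgumentException("Unexpected token: " + next);
    }
}
```

Both methods reach the same decision; the left-factored version does so without ever looking past the next token, which is why rewriting the grammar is usually preferred over raising the lookahead.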

Slide 12

Other lookaheads
• Syntactic lookahead
• Semantic lookahead

Slide 13

Java Tree Builder
• Consumes a jj grammar file and generates
  – Syntax tree classes based on the productions in the grammar
  – Visitor design pattern
  – Visitor and GJVisitor interfaces
  – Two depth-first visitors: DepthFirstVisitor (simply for visiting) and GJDepthFirst (with generic return type and arguments)
  – A JavaCC grammar (output file jtb.out.jj) that builds the tree during parsing
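The visitor arrangement can be sketched with a few hand-written classes. The names Node, Visitor and DepthFirstVisitor follow the slide, but the bodies below are simplified assumptions, and the node classes Num and Add and the SumVisitor subclass are invented for this example; the classes JTB actually generates are derived from the grammar and are more elaborate.

```java
// Hypothetical, simplified stand-ins for generated classes.
interface Visitor {
    void visit(Num n);
    void visit(Add n);
}

interface Node {
    void accept(Visitor v);
}

// Leaf node holding an integer literal.
class Num implements Node {
    final int value;
    Num(int value) { this.value = value; }
    public void accept(Visitor v) { v.visit(this); }
}

// Inner node with two children, stored in public fields.
class Add implements Node {
    final Node f0;
    final Node f1;
    Add(Node f0, Node f1) { this.f0 = f0; this.f1 = f1; }
    public void accept(Visitor v) { v.visit(this); }
}

// Default traversal: visit every child in order and do nothing else.
class DepthFirstVisitor implements Visitor {
    public void visit(Num n) { }
    public void visit(Add n) {
        n.f0.accept(this);
        n.f1.accept(this);
    }
}

// A user-written visitor overrides only the cases it cares about.
class SumVisitor extends DepthFirstVisitor {
    int total = 0;
    @Override
    public void visit(Num n) { total += n.value; }
}
```

A tree such as new Add(new Num(1), new Add(new Num(2), new Num(3))) accepted by a SumVisitor leaves total at 6. The traversal logic lives in the base class, so each new analysis is a small subclass.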

Slide 14

Java Tree Builder

Slide 15

How to use (from the jtb example)
• jtb subscheme.jj
• Write the visitor that the jj file uses (FreeVarFinderVisitor in the example)
• javacc jtb.out.jj
• javac SubScheme.java
• java SubScheme < inputfile

Slide 16

Tree Node Interface and Classes
• Based on the RHS of the production, the generated public fields have one of the following types
  – Node
  – NodeListInterface
  – NodeChoice
  – NodeList
  – NodeListOptional
  – NodeOptional
  – NodeSequence
  – NodeToken

Slide 17

JavaCC vs SableCC
• SableCC supports LALR(1)
• JavaCC supports LL(k), with LL(1) as the default
• SableCC is more object-oriented, JavaCC more imperative
• Explicit lookahead can seem like a hack, but it is more straightforward.

Slide 18

Resources
• http://javacc.java.net/doc/
• http://java.net/projects/javacc
• http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm
• http://compilers.cs.ucla.edu/jtb/jtb-2003/docs.html