Slide 1

Slide 1 text

SableCC
1. Based on the thesis of Étienne M. Gagnon: http://sablecc.sourceforge.net/downloads/thesis.pdf
Aggelos Biboudis

Slide 2

Slide 2 text

Some History
• [1975] Lex (tokenizer: turns a stream of characters into a stream of tokens)
– Lex builds a function implementing a deterministic finite automaton that recognizes regular expressions in linear time
– Lex can read arbitrary input and determine what each part of the input is. This is called "tokenizing".
• [1975] YACC (parser generator that uses LALR(1), via table-based bottom-up parsing)
– Parses a stream of tokens
• [1989] PCCTS (parser generator that uses LL(k) and builds recursive-descent parsers)
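To make the "deterministic finite automaton" point concrete, here is a minimal hand-written sketch of the kind of scanner Lex generates. It is not Lex output: the class name `TinyLexer`, the two token kinds (NUMBER, IDENT) and the output format are assumptions chosen for illustration. Each character is inspected a bounded number of times, so the scan is linear in the input length.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not generated by Lex.
final class TinyLexer {
    // DFA states; ERROR means "no transition for this character".
    private static final int START = 0, NUMBER = 1, IDENT = 2, ERROR = -1;

    // Transition function of the automaton.
    private static int step(int state, char c) {
        switch (state) {
            case START:
                if (Character.isDigit(c)) return NUMBER;
                if (Character.isLetter(c)) return IDENT;
                return ERROR;
            case NUMBER:
                return Character.isDigit(c) ? NUMBER : ERROR;
            case IDENT:
                return Character.isLetterOrDigit(c) ? IDENT : ERROR;
            default:
                return ERROR;
        }
    }

    // Splits the input into tokens, always taking the longest match.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (Character.isWhitespace(input.charAt(i))) { i++; continue; }
            int state = START;
            int start = i;
            while (i < input.length()) {
                int next = step(state, input.charAt(i));
                if (next == ERROR) break;   // longest match reached
                state = next;
                i++;
            }
            if (state == START) {
                throw new IllegalArgumentException("Unexpected character at index " + start);
            }
            String kind = (state == NUMBER) ? "NUMBER" : "IDENT";
            tokens.add(kind + ":" + input.substring(start, i));
        }
        return tokens;
    }
}
```

Calling `TinyLexer.tokenize("x1 42 foo")` yields one IDENT, one NUMBER and one IDENT token.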

Slide 3

Slide 3 text

What is SableCC
SableCC is an OO framework that is based only on the lexical and grammatical definitions of the compiled language
• The parser automatically builds the AST
• AST nodes are strictly typed
• Each analysis is written in its own class
• Analyses are kept separate from the node classes
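The last two points can be sketched with a small visitor-style example. The classes below are hand-written stand-ins, not SableCC output: `Node`, `NumberNode`, `PlusNode`, `Analysis` and `PostfixPrinter` are hypothetical names. The idea shown is that each node type is its own class, and an analysis lives entirely in a separate class that the nodes dispatch to.

```java
// Hypothetical stand-ins for generated node classes (not real SableCC output).
interface Analysis {
    void caseNumber(NumberNode node);
    void casePlus(PlusNode node);
}

abstract class Node {
    // Each node dispatches to the method matching its own type.
    abstract void apply(Analysis analysis);
}

final class NumberNode extends Node {
    final int value;
    NumberNode(int value) { this.value = value; }
    @Override void apply(Analysis analysis) { analysis.caseNumber(this); }
}

final class PlusNode extends Node {
    final Node left;
    final Node right;
    PlusNode(Node left, Node right) { this.left = left; this.right = right; }
    @Override void apply(Analysis analysis) { analysis.casePlus(this); }
}

// One analysis, in its own class; the node classes are not modified.
final class PostfixPrinter implements Analysis {
    private final StringBuilder out = new StringBuilder();

    @Override public void caseNumber(NumberNode node) {
        out.append(node.value).append(' ');
    }

    @Override public void casePlus(PlusNode node) {
        node.left.apply(this);    // visit operands first
        node.right.apply(this);
        out.append("+ ");         // then emit the operator
    }

    String result() { return out.toString().trim(); }
}
```

Adding a second analysis (say, an evaluator) would mean writing another class that implements `Analysis`, leaving the node classes untouched.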

Slide 4

Slide 4 text

General Steps
1. Create a SableCC specification file containing the lexical definitions and the grammar
2. Run SableCC with the specification file as input
3. Create the working classes
4. Create a main class that activates the lexer, the parser and the working classes
5. Compile everything with the Java compiler

Slide 5

Slide 5 text

Specification files
• Lexical and grammar definitions only
• A destination root Java package (where the generated files are placed)
• Lexical definitions use regular expressions
• The grammar is written in BNF

Slide 6

Slide 6 text

Generated Files
• Four packages are generated: lexer, parser, node and analysis
– lexer: the lexer and its exceptions
– parser: the parser and its exceptions
– node: node classes for a typed AST
– analysis: one interface and three classes for AST walking

Slide 7

Slide 7 text

Creating the specification file
• >java SableCC postfix.grammar

Slide 8

Slide 8 text

Output

Slide 9

Slide 9 text

Create a translation class

Slide 10

Slide 10 text

Adding a main and compiling
• >javac postfix\Compiler.java
• >java postfix.Compiler

Slide 11

Slide 11 text

Lexer
• Package declaration
• Characters and character sets
– Char, Decimal, Hex, Range, Union, Difference etc.
• Regular expressions
– line_comment = '/' '/' [[0 .. 0xFFFF] - [10 + 13]]* (10 | 13 | 10 13)
• Helpers (not macros)
– h = 'a' | 'b', t = 'a' h 'b' (t matches "aab" and "abb"; treating h as a textual replacement would be a pitfall)
• Tokens with optional lookahead
• States (e.g. bol, inline, incomment)
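The helper pitfall can be checked with a small sketch. It uses `java.util.regex` as a stand-in for SableCC's own regular expression engine, and the class name `HelperVsMacro` is hypothetical. With helper semantics, h is substituted as a unit, so t is 'a' ('a' | 'b') 'b'. With naive textual replacement, t becomes 'a' 'a' | 'b' 'b', which groups as ('a' 'a') | ('b' 'b') because concatenation binds tighter than alternation.

```java
import java.util.regex.Pattern;

// Hypothetical illustration using java.util.regex, not SableCC itself.
final class HelperVsMacro {
    // Helper semantics: h is treated as one unit inside t.
    static final Pattern AS_HELPER = Pattern.compile("a(?:a|b)b");

    // Macro semantics: h pasted in textually, changing the grouping.
    static final Pattern AS_MACRO = Pattern.compile("aa|bb");

    static boolean helperMatches(String s) {
        return AS_HELPER.matcher(s).matches();
    }

    static boolean macroMatches(String s) {
        return AS_MACRO.matcher(s).matches();
    }
}
```

Under the helper reading, "aab" and "abb" match; under the macro reading they do not, and "aa" and "bb" match instead.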

Slide 12

Slide 12 text

Parser
• Parser class that builds a typed AST automatically while processing the input
• Productions
– EBNF syntax (*, +, ?)
  • Optional: x?
  • Possibly empty list: x*
  • Non-empty list: x+
– No action code in the specification (what is an action?)
– Naming rules: (part1_part2_...) -> PPart1Part2
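The naming rule on this slide can be written out as a small function. This is a sketch of the convention as stated above, not code taken from SableCC; the class name `SableNames` and the method name `productionClassName` are hypothetical.

```java
// Hypothetical helper illustrating the naming rule; not part of SableCC.
final class SableNames {
    // Maps a production name such as "part1_part2" to "PPart1Part2":
    // prefix "P", then every underscore-separated part capitalized.
    static String productionClassName(String productionName) {
        StringBuilder className = new StringBuilder("P");
        for (String part : productionName.split("_")) {
            if (part.isEmpty()) {
                continue; // skip empty parts from stray underscores
            }
            className.append(Character.toUpperCase(part.charAt(0)));
            className.append(part.substring(1));
        }
        return className.toString();
    }
}
```

For example, `SableNames.productionClassName("part1_part2")` returns `"PPart1Part2"`.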

Slide 13

Slide 13 text

Resources
• http://sablecc.sourceforge.net/
• http://www.comp.nus.edu.sg/~sethhetu/rooms/Tutorials/EclipseAndSableCC.html