Slide 1

Slide 1 text

An Analysis of the Redesign of the CoffeeScript Compiler
Michael Ficarra

Slide 2

Slide 2 text

CoffeeScript
http://coffeescript.org
"a little language that compiles into JavaScript"
I prefer "an alternate syntax for idiomatic JS"

Slide 3

Slide 3 text

Michael Ficarra
/michaelficarra
● CoffeeScript maintainer
  ○ worked on jashkenas/coffee-script for years
  ○ influential in the language's development
● contribute to many ECMAScript projects
  ○ constellation/escodegen
  ○ constellation/esmangle
  ○ documentcloud/underscore
  ○ kriskowal/es5-shim
● and plenty of my own -- check them out

Slide 4

Slide 4 text

http://www.kickstarter.com/projects/michaelficarra/make-a-better-coffeescript-

Slide 5

Slide 5 text

Project Goals
● separation of concerns
  ○ modularity
  ○ use and expose standardised IRs
● bug fixes
  ○ especially two-pass symbol generation
● source maps
● better error reporting
● mild extensibility
  ○ support multiple (similar) compilation targets
  ○ syntax extension is out of scope

Slide 6

Slide 6 text

Where do we start?
● Definitions: define the language
  ○ jashkenas/coffee-script is overly permissive
    ■ loosely defines the language as whatever passes through the compiler without an error
    ■ these need to be disallowed
  ○ jashkenas/coffee-script is sometimes too restrictive
    ■ mostly due to parser failings
    ■ these need to be allowed
$ coffee -bep 'a is b and c = d'
var c;
a === b && (c = d);
$ coffee -bep 'fn ->, ->'
Error: Parse error on line 1: Unexpected ','

Slide 7

Slide 7 text

Where do we start?
● Definitions: define the language with
  ○ consistent syntactic rules
  ○ consistent semantics to go with them
  ○ an AST format that can represent CoffeeScript programs
● Process
  ○ break down compilation into individual components
  ○ provide an interface for composition

Slide 8

Slide 8 text

Independent Components
● Preprocessor: CS → context free CS
● Parser: context free CS → CS AST
● Compiler: CS AST → JS AST
● Code Generator: JS AST → JS + source map

Slide 9

Slide 9 text

Independent Components
● Optimiser: CS AST → CS AST
● CS Code Generator: CS AST → CS
● Analysis (Predicate): CS AST → Yes / No

Slide 10

Slide 10 text

Compositions
● jashkenas/coffee-script: CS → JS
  ○ preprocessor
  ○ parser
  ○ compiler
  ○ JS code generator
  ○ discard the source map
● Syntax Formatter: CS → CS
  ○ preprocessor
  ○ parser
  ○ CS code generator
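The two compositions on this slide can be read as ordinary function composition. The sketch below assumes each component is a unary function; the stage functions are hypothetical stand-ins that only tag their input, not the real compiler's API, so that the shape of the composition stays visible.

```javascript
// Hypothetical stand-ins for the independent components. Each one only
// records which representation it produced and what it was given.
const stage = (name) => (input) => ({ stage: name, from: input });

const preprocess = stage('context-free CS');
const parse = stage('CS AST');
const compile = stage('JS AST');
const generateCS = stage('CS');

// The JS code generator yields both code and a source map.
const generateJS = (jsAst) => ({
  code: { stage: 'JS', from: jsAst },
  map: { stage: 'source map', from: jsAst },
});

// Left-to-right composition of unary functions.
const pipe = (...fns) => (input) => fns.reduce((value, fn) => fn(value), input);

// jashkenas/coffee-script behaviour: CS in, JS out, source map discarded.
const csToJs = pipe(preprocess, parse, compile, generateJS, (out) => out.code);

// Syntax formatter: CS in, CS out.
const formatCS = pipe(preprocess, parse, generateCS);
```

Because every stage consumes and produces a named representation, a new tool is just another list of stages passed to `pipe`.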

Slide 11

Slide 11 text

CLI: Composition and I/O
pipeline: CS → context free CS → CS AST → JS AST → JS + source map (or CS AST → CS)
● input source (defaults to stdin): --input, --cli
● output destination: --output
● preprocessed (not standardised): N/A
● parsed: --parse
● compiled: --compile
● JavaScript: --js
● source map: --source-map
● CoffeeScript: --cscodegen

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Parsing
● Chose to generate the parser from a parsing expression grammar (PEG)
● Upsides of PEGs
  ○ operates in time linear to input length
  ○ better error reporting
    ■ can enumerate all valid inputs following read position
  ○ good JS tooling support available at the time
  ○ fully describe the syntax of the language in one place
    ■ no separate lexer
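The error-reporting point can be illustrated with a small sketch. It assumes a hand-written matcher standing in for what a generated PEG parser might contain; the toy grammar, the function names, and the shape of the result are all hypothetical. The idea is to remember the furthest position at which any alternative failed, together with everything that would have been accepted there.

```javascript
// Hand-written stand-in for a generated PEG parser (illustrative only).
// A parser maps a position to the next position, or null on failure.
function createMatcher(input) {
  let furthest = 0;          // furthest position at which a match failed
  let expected = new Set();  // everything that would have been valid there

  const fail = (pos, description) => {
    if (pos > furthest) { furthest = pos; expected = new Set(); }
    if (pos === furthest) expected.add(description);
    return null;
  };

  const literal = (text) => (pos) =>
    input.startsWith(text, pos) ? pos + text.length : fail(pos, JSON.stringify(text));

  // Sequence: every part must match, one after another.
  const sequence = (...parts) => (pos) => {
    for (const part of parts) {
      pos = part(pos);
      if (pos === null) return null;
    }
    return pos;
  };

  // Ordered choice: the first alternative that matches wins.
  const choice = (...alternatives) => (pos) => {
    for (const alternative of alternatives) {
      const next = alternative(pos);
      if (next !== null) return next;
    }
    return null;
  };

  const endOfInput = (pos) => (pos === input.length ? pos : fail(pos, 'end of input'));

  // Hypothetical toy grammar:
  //   Start <- Value "," Value EOF
  //   Value <- "true" / "false" / "null"
  const value = choice(literal('true'), literal('false'), literal('null'));
  const start = sequence(value, literal(','), value, endOfInput);

  return function match() {
    if (start(0) !== null) return { ok: true };
    return { ok: false, position: furthest, expected: [...expected].sort() };
  };
}
```

For the input `true,nope` the matcher reports a failure at offset 5 and lists all three `Value` alternatives as what could have followed.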

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

Parsing
● Chose to generate the parser from a parsing expression grammar (PEG)
● Downsides of PEGs
  ○ not runtime extensible like parser combinators
    ■ builds parsers from other parsers
    ■ built at runtime, so may be overridden or extended
  ○ can only accept context-free languages
    ■ parser for context-sensitive languages needs an additional stack
    ■ PDA accepts context-free languages
    ■ LBA is needed to accept context-sensitive languages

Slide 24

Slide 24 text

Preprocessing
● one really simple job
  ○ keep stack of context tokens as input is read
  ○ insert context boundary markers
● additional benefits
  ○ assures pairing chars are paired before parsing
  ○ enforces consistent indentation style
context boundaries:
  (INDENT) (DEDENT)
  " "
  """ """
  { }
  ` `
  ' '
  ''' '''
  ( )
  #{ }
  / /
  /// ///
  [ ]
  # (line terminator)
  ### ###
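A minimal sketch of the indentation part of that job follows. It assumes space-only indentation and ignores every other context (strings, comments, brackets); the function name and the exact marker spelling are assumptions, not the real preprocessor's output format.

```javascript
// Illustrative only: keep a stack of indentation widths and emit explicit
// (INDENT)/(DEDENT) markers so a context-free parser can see block structure.
function markIndentation(source) {
  const INDENT = '(INDENT)';
  const DEDENT = '(DEDENT)';
  const stack = [0]; // widths of the currently open indentation contexts
  const out = [];

  for (const line of source.split('\n')) {
    if (line.trim() === '') { out.push(''); continue; } // blank lines carry no context
    const width = line.match(/^ */)[0].length;
    const text = line.slice(width);
    let prefix = '';

    if (width > stack[stack.length - 1]) {
      stack.push(width);
      prefix = INDENT;
    } else {
      while (width < stack[stack.length - 1]) {
        stack.pop();
        prefix += DEDENT;
      }
      // Dedenting to a width that was never opened is rejected up front.
      if (width !== stack[stack.length - 1]) {
        throw new Error('inconsistent indentation: ' + JSON.stringify(line));
      }
    }
    out.push(prefix + text);
  }

  // Close any contexts still open at end of input.
  let tail = '';
  while (stack.length > 1) { stack.pop(); tail += DEDENT; }
  if (tail !== '') out.push(tail);
  return out.join('\n');
}
```

Rejecting a dedent to an unopened width is what makes the indentation check a side effect of preprocessing rather than a parser concern.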

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

https://developer.mozilla.org/en-US/docs/SpiderMonkey/Parser_API

Slide 27

Slide 27 text

Spidermonkey AST Example
ariya/esprima input:
  { block: statement }
ariya/esprima output:
  {
    type: 'Program',
    body: [
      {
        type: 'BlockStatement',
        body: [
          {
            type: 'LabeledStatement',
            label: { type: 'Identifier', name: 'block' },
            body: {
              type: 'ExpressionStatement',
              expression: { type: 'Identifier', name: 'statement' }
            }
          }
        ]
      }
    ]
  }
ariya/esprima input:
  ({object: expression})
ariya/esprima output:
  {
    type: 'Program',
    body: [
      {
        type: 'ExpressionStatement',
        expression: {
          type: 'ObjectExpression',
          properties: [
            {
              type: 'Property',
              key: { type: 'Identifier', name: 'object' },
              value: { type: 'Identifier', name: 'expression' },
              kind: 'init'
            }
          ]
        }
      }
    ]
  }

Slide 28

Slide 28 text

Spidermonkey AST Tools
ariya/esprima (JS → JS AST)
● ECMAScript 5 parser
● extremely true to spec.
  ○ aside from some minor restrictions around early errors
● harmony branch
yahoo/istanbul (JS AST → JS AST (instrumented))
● instruments Spidermonkey AST for code coverage
● instrumented code produces standardised report (LCOV)

Slide 29

Slide 29 text

Spidermonkey AST Tools
constellation/escodegen (JS AST → JS)
● JS code generator
● configurable formatting with minification defaults
● guarantees parse(gen(tree)) == tree
mozilla/sweet.js (JS (using macros) + macro defs. → JS AST)
● result of Tim Disney's Mozilla internship
● Creates augmented parser using user-provided macro definitions

Slide 30

Slide 30 text

Spidermonkey AST Tools
constellation/esmangle (JS AST → JS AST)
● generates semantically equivalent, syntactically minimal AST
● more difficult (and fun) than it sounds
● name mangling
● constant folding
● fixed-point evaluation of set of declarative rules
● 2 phases
  ○ AST simplification rules
    ■ !!!a => !a
  ○ syntactic simplification (AST expansion) rules
    ■ a.Infinity => a[1/0]
    ■ true => !0
● declarative rule specification is extensible and modular
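Fixed-point evaluation of declarative rules can be sketched in a few lines. The rule format and function names below are assumptions made for illustration and are not esmangle's actual API; each rule either returns a replacement node or null, and the whole rule set is reapplied until the tree stops changing.

```javascript
// Illustrative rule sets, one per phase. A rule returns a replacement
// node, or null when it does not apply.
const isNot = (n) => n !== null && typeof n === 'object' &&
  n.type === 'UnaryExpression' && n.operator === '!';

const simplificationRules = [
  // !!!a => !a
  (node) => (isNot(node) && isNot(node.argument) && isNot(node.argument.argument)
    ? node.argument.argument
    : null),
];

const expansionRules = [
  // true => !0
  (node) => (node.type === 'Literal' && node.value === true
    ? { type: 'UnaryExpression', operator: '!', prefix: true,
        argument: { type: 'Literal', value: 0 } }
    : null),
];

// One bottom-up pass: rewrite the children, then try each rule on the result.
function rewriteOnce(node, rules) {
  if (Array.isArray(node)) return node.map((child) => rewriteOnce(child, rules));
  if (node === null || typeof node !== 'object') return node;
  const copy = {};
  for (const key of Object.keys(node)) copy[key] = rewriteOnce(node[key], rules);
  for (const rule of rules) {
    const replacement = rule(copy);
    if (replacement !== null) return replacement;
  }
  return copy;
}

// Repeat passes until nothing changes: the fixed point of the rule set.
function rewriteToFixpoint(ast, rules) {
  let current = ast;
  for (;;) {
    const next = rewriteOnce(current, rules);
    if (JSON.stringify(next) === JSON.stringify(current)) return next;
    current = next;
  }
}
```

Adding an optimisation then means appending a function to a rule list, which is the sense in which a declarative specification is extensible and modular.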

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

Spidermonkey AST Tools
constellation/estraverse
● extracted from esmangle project
● escodegen also uses it
● provides AST traversal functions
● implements simple visitor pattern on Spidermonkey AST
pufuwozu/brushtail
● tail call elimination on spidermonkey ASTs
● uses estraverse and escope
constellation/escope
● extracted from esmangle project
● provides static scope analysis
● predicates such as
  ○ isStatic (detects global, with, presence of direct eval)
  ○ isArgumentsMaterialized
● you probably don't know catch variables are block scoped in JS
  ○ escope does
  ○ (and CoffeeScript fixes this for you anyway)

Slide 33

Slide 33 text

Spidermonkey AST
● not perfect
  ○ some trees are impossible syntactic constructs
      {
        type: 'IfStatement',
        test: ...,
        consequent: {
          type: 'IfStatement',
          test: ...,
          consequent: ...,
          alternate: null
        },
        alternate: ...
      }
  ○ no way to represent directive statements
● still better than alternatives
  ○ adoption has hit critical mass
  ○ interop with those tools is too valuable
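The tree above is the dangling-else case: printed without braces it reads `if (a) if (b) c; else d;`, which parses with the `else` attached to the inner `if`. One way a code generator can cope is to wrap the consequent in a block before printing. The helper names below are hypothetical, and the sketch only considers chains of `IfStatement` nodes, not loops or labels whose bodies end in an open `if`.

```javascript
// True when `node` is an if-chain whose last `if` has no `else`, so a
// following `else` would attach to it once the tree is printed bare.
function endsWithOpenIf(node) {
  if (node === null || node.type !== 'IfStatement') return false;
  return node.alternate === null || endsWithOpenIf(node.alternate);
}

// Return a copy whose consequent is wrapped in a BlockStatement whenever
// printing it without braces would change which `if` owns the `else`.
function guardDanglingElse(node) {
  if (node.type !== 'IfStatement' || node.alternate === null) return node;
  if (!endsWithOpenIf(node.consequent)) return node;
  return Object.assign({}, node, {
    consequent: { type: 'BlockStatement', body: [node.consequent] },
  });
}
```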

Slide 34

Slide 34 text

Use Standardised IRs!
● take advantage of other open source projects
● your users can extract parts of your project
● in case of jashkenas/coffee-script
  ○ compiler and parser/rewriter are highly coupled
  ○ code generation is intermixed with compilation
  ○ code gen bugs are common
  ○ code gen logic is strewn throughout the compiler
  ○ no consistent concept of target's syntax
    ■ statement vs. expression ({} is different in different positions)
    ■ operator precedence
    ■ special syntactic constructs (esp. surrounding `new` operator)
    ■ significant whitespace

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

Doing it Right
● esprima
● acorn
● estraverse
● escope
● escodegen
● esmangle
● brushtail
● Sweet.js
● istanbul
● ibrik
● code painter
● LLJS
● RumCoke
● JSX

Slide 38

Slide 38 text

Calling You Out
● TypeScript
● ClojureScript
● UglifyJS
● UglifyJS2 (sigh)
● Dart
● Google Closure Compiler
● Roy (soon!)
● LiveScript (soon!)
● jashkenas/coffee-script

Slide 39

Slide 39 text

Optimisation / Compilation
● declarative rule specification
  ○ inherently extensible
● optimiser: fixpoint evaluation strategy
Optimiser: CS AST → CS AST
Compiler: CS AST → JS AST

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

Symbol Generation
● long-running problem with jashkenas/coffee-script
● common issue for our users
● very difficult to fix with the current compiler design
$ coffee -bep '_this = 0; fn = => this'
var fn, _this = this;
_this = 0;
fn = function() { return _this; };

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

No content

Slide 50

Slide 50 text

No content

Slide 51

Slide 51 text

Symbol Generation
● did you catch my hypocrisy?
● that IR is neither standardised nor exposed
● don't want to force this to be two operations
  ○ steps can be interleaved for performance
  ○ but the IR might actually be useful; it's a tradeoff
Compiler (in reality): CS AST → JS AST + gensyms → JS AST
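Two-pass symbol generation over a "JS AST + gensyms" tree can be sketched as below. The `GenSym` node type, its `id` and `hint` fields, and the `$n` naming scheme are all assumptions made for illustration; the sketch also treats the whole program as one scope, which a real compiler would refine.

```javascript
// Pass 1: collect every identifier name the program already uses.
function collectNames(node, names = new Set()) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectNames(child, names));
  } else if (node !== null && typeof node === 'object') {
    if (node.type === 'Identifier') names.add(node.name);
    Object.keys(node).forEach((key) => collectNames(node[key], names));
  }
  return names;
}

// Pass 2: replace each hypothetical GenSym placeholder with an Identifier
// whose name collides with nothing collected in pass 1. Placeholders that
// share an id resolve to the same name.
function resolveGenSyms(node, used, assigned = new Map()) {
  if (Array.isArray(node)) return node.map((child) => resolveGenSyms(child, used, assigned));
  if (node === null || typeof node !== 'object') return node;
  if (node.type === 'GenSym') {
    if (!assigned.has(node.id)) {
      let name = node.hint;
      let n = 0;
      while (used.has(name)) {
        n += 1;
        name = node.hint + '$' + n;
      }
      used.add(name);
      assigned.set(node.id, name);
    }
    return { type: 'Identifier', name: assigned.get(node.id) };
  }
  const copy = {};
  for (const key of Object.keys(node)) copy[key] = resolveGenSyms(node[key], used, assigned);
  return copy;
}
```

Because names are chosen only after the whole tree has been seen, a user variable called `_this` can no longer be clobbered by a compiler-generated one.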

Slide 52

Slide 52 text

Source Maps
● set of mappings from section of JavaScript to section of source text directly responsible for producing it
● supported in Chrome
● Firefox support coming soon
  ○ see bugzilla #771597
● Debug as if the source text is actually running in your JS interpreter

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

Source Maps
1. preserve source info in parser
2. preserve source info through transformations
  ○ optimiser
  ○ compiler
3. modify escodegen to create a CST instead of a string
4. use mozilla/source-map to generate source map and flatten CST to JS
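Steps 3 and 4 can be sketched without any library. The CST shape below (strings, or nodes with an optional `loc` and a `children` array) is an assumption, not escodegen's real output, and the sketch only collects mapping records rather than calling mozilla/source-map. Lines are counted from 1 and columns from 0, the convention source maps use.

```javascript
// Illustrative CST: a node is either a string of generated code, or
// { loc: { line, column } | null, children: [...] } where `loc` points
// back into the original source text.
function flatten(cst) {
  let code = '';
  let line = 1;    // generated position, 1-based line
  let column = 0;  // generated position, 0-based column
  const mappings = [];

  (function visit(node) {
    if (typeof node === 'string') {
      for (let i = 0; i < node.length; i += 1) {
        if (node[i] === '\n') { line += 1; column = 0; } else { column += 1; }
      }
      code += node;
      return;
    }
    // Record where this node's output starts and where it came from.
    if (node.loc) mappings.push({ generated: { line, column }, original: node.loc });
    node.children.forEach(visit);
  })(cst);

  return { code, mappings };
}
```

Each record pairs a generated position with an original one, so feeding them to a source map generator and emitting `code` can happen in the same walk.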

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

Image by Ryan Florence

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

Current Status
● fixed over 50 open bugs
● implemented 20 accepted enhancements
● fairly stable interfaces
● 98% feature complete
● extensible design
● source map generation + esmangle integration
● great parser and runtime error reporting
● being integrated with a popular IDE
People are using it and contributing!

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

http://michaelficarra.github.com/CoffeeScriptRedux/

Slide 66

Slide 66 text

Future Work
● minor bug fixes
● loosen some whitespace restrictions
● more complete test suite
● rewrite parser actions in CoffeeScript
● remove some accidental mutation in compiler and optimiser rules
● update text editor plugins
● consider performance
● release 2.0, replace jashkenas/coffee-script
● fork and make it my own

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

Summary
● carefully choose your IRs
  ○ use standards whenever possible
  ○ expose them
  ○ take advantage of others' tools that operate on your IRs
  ○ for structured JS representation, use Mozilla's Spidermonkey API
  ○ JS code gen in JS from this representation is a solved problem; use escodegen or equivalent
● declarative behaviour specification is inherently extensible
● this compiler is a huge improvement over what we had before
  ○ start using it right now
  ○ report bugs and tell me what to work on next

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

No content