Slide 1

Slide 1 text

Don’t Be Afraid of ASTs Jamund Ferguson

Slide 2

Slide 2 text

Our Basic Plan
1. High-level overview
2. Static Analysis with ASTs
3. Transforming and refactoring
4. A quick look at the Mozilla Parser API (de-facto standard AST format)

Slide 3

Slide 3 text

An abstract syntax tree is basically a DOM for your code.

Slide 4

Slide 4 text

An AST makes it easier to inspect and manipulate your code with confidence.

Slide 5

Slide 5 text

{ "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }

Slide 6

Slide 6 text

var fullstack = node + ui;

Slide 7

Slide 7 text

{ "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }
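To make the "DOM for your code" idea concrete, here is a minimal sketch of inspecting that tree. The AST is written out by hand in the same shape as the slide rather than produced by a parser, and `walk` is a hypothetical helper, not part of any library.

```javascript
// Hand-written AST for `var fullstack = node + ui;` (same shape as the slide).
var ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'fullstack' },
      init: {
        type: 'BinaryExpression',
        left: { type: 'Identifier', name: 'node' },
        operator: '+',
        right: { type: 'Identifier', name: 'ui' }
      }
    }]
  }]
};

// Hypothetical helper: calls visit(node) for every object that has a string `type`.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach(function (child) { walk(child, visit); });
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') visit(node);
    Object.keys(node).forEach(function (key) { walk(node[key], visit); });
  }
}

// Collect the name of every Identifier in the tree.
var names = [];
walk(ast, function (node) {
  if (node.type === 'Identifier') names.push(node.name);
});
console.log(names); // [ 'fullstack', 'node', 'ui' ]
```

Most of the tools later in the talk are some variation on this loop: visit every node and react to the node types you care about.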

Slide 8

Slide 8 text

Things Built On ASTs
• Syntax Highlighting
• Code Completion
• Static Analysis (e.g. JSLint)
• Code Coverage
• Minification
• JIT Compilation
• Source Maps
• Compile-to-JS Languages
So much more…

Slide 9

Slide 9 text

Static Analysis

Slide 10

Slide 10 text

It’s not just about formatting.

Slide 11

Slide 11 text

Fix a bug. Add a unit test. Fix a similar bug…

Slide 12

Slide 12 text

Write some really solid static analysis. Never write that same type of bug again.

Slide 13

Slide 13 text

function loadUser(req, res, next) {
  User.loadUser(function(err, user) {
    req.session.user = user;
    next();
  });
}
Bad Example
We forgot to handle the error!

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

handle-callback-err
1. Each time a function is declared, check whether it has an error* parameter.
2. If so, set a count to 0.
3. Increment the count each time the error parameter is used.
4. At the end of the function, warn if the count is still 0.
* The parameter name can be defined by the user.
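The four steps above can be sketched without any linter at all. This is a simplified illustration, not ESLint's actual implementation: the ASTs are hand-written, `id`, `walk`, and `findUnhandledErrors` are hypothetical helpers, the warning text is illustrative, and the sketch ignores scoping (a nested function that redeclares `err` would be counted as a use).

```javascript
function id(name) { return { type: 'Identifier', name: name }; }

// Hypothetical helper: calls visit(node) for every object with a string `type`.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach(function (child) { walk(child, visit); });
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') visit(node);
    Object.keys(node).forEach(function (key) { walk(node[key], visit); });
  }
}

function findUnhandledErrors(ast, errorName) {
  var warnings = [];
  walk(ast, function (node) {
    var isFunction = node.type === 'FunctionExpression' ||
                     node.type === 'FunctionDeclaration';
    if (!isFunction) return;
    // Step 1: does this function declare an error parameter?
    var hasErrorParam = node.params.some(function (p) { return p.name === errorName; });
    if (!hasErrorParam) return;
    // Step 2: start a counter at 0.
    var count = 0;
    // Step 3: count every use of the error parameter inside the body.
    walk(node.body, function (inner) {
      if (inner.type === 'Identifier' && inner.name === errorName) count++;
    });
    // Step 4: warn when the parameter was never used.
    if (count === 0) warnings.push('Expected error to be handled.');
  });
  return warnings;
}

// function (err, user) { req.session.user = user; next(); }
var badCallback = {
  type: 'FunctionExpression',
  params: [id('err'), id('user')],
  body: { type: 'BlockStatement', body: [
    { type: 'ExpressionStatement', expression: {
      type: 'AssignmentExpression', operator: '=',
      left: { type: 'MemberExpression',
              object: { type: 'MemberExpression', object: id('req'), property: id('session') },
              property: id('user') },
      right: id('user') } },
    { type: 'ExpressionStatement', expression: {
      type: 'CallExpression', callee: id('next'), arguments: [] } }
  ] }
};

// Same callback with `if (err) return next(err);` added at the top.
var goodCallback = {
  type: 'FunctionExpression',
  params: badCallback.params,
  body: { type: 'BlockStatement', body: [
    { type: 'IfStatement', test: id('err'), alternate: null,
      consequent: { type: 'ReturnStatement', argument: {
        type: 'CallExpression', callee: id('next'), arguments: [id('err')] } } }
  ].concat(badCallback.body.body) }
};

console.log(findUnhandledErrors(badCallback, 'err').length);  // 1
console.log(findUnhandledErrors(goodCallback, 'err').length); // 0
```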

Slide 16

Slide 16 text

Static Analysis • Complexity Analysis • Catching Mistakes • Consistent Style

Slide 17

Slide 17 text

History Lesson
• 1995: JavaScript
• 2002: JSLint started by Douglas Crockford
• 2011: JSHint comes out as a fork of JSLint. Esprima AST parser released.
• 2012: plato, escomplex, complexity-report
• 2013: Nicholas Zakas releases ESLint. Marat Dulin releases JSCS.

Slide 18

Slide 18 text

My static analysis tool of choice is ESLint.

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

JSHint Mixes the rule engine with the parser

Slide 21

Slide 21 text

Examples

Slide 22

Slide 22 text

no-console
return {
  "MemberExpression": function(node) {
    if (node.object.name === "console") {
      context.report(node, "Unexpected console statement.");
    }
  }
};
https://github.com/eslint/eslint/blob/master/lib/rules/no-console.js
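A rule like this is just an object mapping node types to handlers, and something else has to walk the tree and call them. The sketch below shows that division of labor with a hypothetical `runRule` function and a fake `context` that only records messages. It is a toy stand-in for a rule engine, not ESLint's real API, and the AST for `console.log(x);` is written by hand.

```javascript
// Hypothetical mini rule engine: walks the tree and calls the rule's
// handler (if any) for each node type it encounters.
function runRule(ast, createRule) {
  var messages = [];
  var context = {
    report: function (node, message) { messages.push(message); }
  };
  var handlers = createRule(context);
  (function visit(node) {
    if (Array.isArray(node)) { node.forEach(function (n) { visit(n); }); return; }
    if (!node || typeof node !== 'object') return;
    if (typeof handlers[node.type] === 'function') handlers[node.type](node);
    Object.keys(node).forEach(function (key) { visit(node[key]); });
  })(ast);
  return messages;
}

// The no-console logic from the slide, wrapped in a function that receives `context`.
function noConsole(context) {
  return {
    MemberExpression: function (node) {
      if (node.object.name === 'console') {
        context.report(node, 'Unexpected console statement.');
      }
    }
  };
}

// Hand-written AST for `console.log(x);`
var ast = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: {
        type: 'MemberExpression',
        computed: false,
        object: { type: 'Identifier', name: 'console' },
        property: { type: 'Identifier', name: 'log' }
      },
      arguments: [{ type: 'Identifier', name: 'x' }]
    }
  }]
};

var messages = runRule(ast, noConsole);
console.log(messages); // [ 'Unexpected console statement.' ]
```

Because the rule never touches the parser, the same rule object could be run against a tree produced by any parser that emits this node format.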

Slide 23

Slide 23 text

no-loop-func
function checkForLoops(node) {
  var ancestors = context.getAncestors();
  if (ancestors.some(function(ancestor) {
    return ancestor.type === "ForStatement" ||
      ancestor.type === "WhileStatement" ||
      ancestor.type === "DoWhileStatement";
  })) {
    context.report(node, "Don't make functions within a loop");
  }
}
return {
  "FunctionExpression": checkForLoops,
  "FunctionDeclaration": checkForLoops
};
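The rule above leans on the engine to know each node's ancestors. One way to provide that, sketched here under the assumption of a plain recursive walk, is to pass a growing ancestor list down the tree. `walkWithAncestors` and `findFunctionsInLoops` are hypothetical names, and the AST for `while (x) { fns.push(function () {}); }` is hand-written and simplified.

```javascript
var LOOP_TYPES = ['ForStatement', 'WhileStatement', 'DoWhileStatement'];
var FUNCTION_TYPES = ['FunctionExpression', 'FunctionDeclaration'];

// Hypothetical helper: calls visit(node, ancestors) for every AST node,
// where `ancestors` lists the enclosing nodes from the root downward.
function walkWithAncestors(node, ancestors, visit) {
  if (Array.isArray(node)) {
    node.forEach(function (child) { walkWithAncestors(child, ancestors, visit); });
    return;
  }
  if (!node || typeof node !== 'object') return;
  var isNode = typeof node.type === 'string';
  if (isNode) visit(node, ancestors);
  var next = isNode ? ancestors.concat(node) : ancestors;
  Object.keys(node).forEach(function (key) {
    walkWithAncestors(node[key], next, visit);
  });
}

function findFunctionsInLoops(ast) {
  var found = [];
  walkWithAncestors(ast, [], function (node, ancestors) {
    var inLoop = ancestors.some(function (a) {
      return LOOP_TYPES.indexOf(a.type) !== -1;
    });
    if (inLoop && FUNCTION_TYPES.indexOf(node.type) !== -1) found.push(node);
  });
  return found;
}

var emptyFunction = {
  type: 'FunctionExpression',
  params: [],
  body: { type: 'BlockStatement', body: [] }
};

// Hand-written AST for: while (x) { fns.push(function () {}); }
var loopAst = {
  type: 'Program',
  body: [{
    type: 'WhileStatement',
    test: { type: 'Identifier', name: 'x' },
    body: {
      type: 'BlockStatement',
      body: [{
        type: 'ExpressionStatement',
        expression: {
          type: 'CallExpression',
          callee: {
            type: 'MemberExpression',
            object: { type: 'Identifier', name: 'fns' },
            property: { type: 'Identifier', name: 'push' }
          },
          arguments: [emptyFunction]
        }
      }]
    }
  }]
};

console.log(findFunctionsInLoops(loopAst).length); // 1
```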

Slide 24

Slide 24 text

max-params
var numParams = context.options[0] || 3;
function checkParams(node) {
  if (node.params.length > numParams) {
    context.report(node, "This function has too many parameters ({{count}}). Maximum allowed is {{max}}.", {
      count: node.params.length,
      max: numParams
    });
  }
}
return {
  "FunctionDeclaration": checkParams,
  "FunctionExpression": checkParams
};

Slide 25

Slide 25 text

no-jquery
function isjQuery(name) {
  return name === '$' || name === 'jquery' || name === 'jQuery';
}
return {
  "CallExpression": function(node) {
    var name = node.callee && node.callee.name;
    if (isjQuery(name)) {
      context.report(node, 'Please avoid using jQuery here.');
    }
  }
};

Slide 26

Slide 26 text

More Complex Rules • indent • no-extend-native • no-next-next • security • internationalization

Slide 27

Slide 27 text

Other Areas for Static Analysis
Code complexity and visualization is another area where static analysis is really useful. Plato is an exciting start, but I believe many more interesting things can be done in this area.

Slide 28

Slide 28 text

Recap
• Static analysis can help you catch real bugs and keep your code maintainable
• ESLint and JSCS both use ASTs to inspect your code, which makes it easy to add new rules cleanly
• Static analysis can also help you manage your code complexity
• What exactly does a for loop sound like?

Slide 29

Slide 29 text

Transforms

Slide 30

Slide 30 text

Sometimes you want to step into the future, but something is keeping you in the past.

Slide 31

Slide 31 text

Maybe it’s Internet Explorer

Slide 32

Slide 32 text

Maybe it’s the size of your code base

Slide 33

Slide 33 text

ASTs to the rescue!

Slide 34

Slide 34 text

Tools like falafel and recast give you an API to manipulate an AST and then convert that back into source code.

Slide 35

Slide 35 text

Two Types of AST Transformations
Regenerative: Regenerate the full file from the AST, often losing comments and non-essential formatting. Fine for code not read by humans (e.g. browserify transforms).
Partial-source transformation: Regenerate only the parts of the source that have changed, based on the AST modifications. Nicer for one-time changes in source.
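A partial-source transformation can be pictured as splicing new text into the original string at a node's character offsets, leaving everything else byte-for-byte intact. The sketch below assumes a node carrying a `range` of `[start, end)` offsets, which parsers such as Esprima can attach when asked. The node is hand-written and `updateNode` is a hypothetical helper.

```javascript
var source = 'var fullstack = node + ui;';

// Hand-written node for the `ui` identifier. `range` holds the
// [start, end) character offsets of the node within `source`.
var uiNode = { type: 'Identifier', name: 'ui', range: [23, 25] };

// Hypothetical helper: replace only the text covered by the node's range.
function updateNode(source, node, replacement) {
  return source.slice(0, node.range[0]) +
         replacement +
         source.slice(node.range[1]);
}

var output = updateNode(source, uiNode, 'browserify');
console.log(output); // var fullstack = node + browserify;
```

Comments, whitespace, and formatting outside the range survive untouched, which is why this style suits one-time changes to code that people will keep reading.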

Slide 36

Slide 36 text

Build a Simple Browserify Transform

Slide 37

Slide 37 text

var fullstack = node + ui; var fullstack = node + browserify;

Slide 38

Slide 38 text

4 Steps
1. Buffer up the stream of source code
2. Convert the source into an AST
3. Transform the AST
4. Re-generate and output the source

Slide 39

Slide 39 text

Step 1
var through = require('through');
var buffer = [];
return through(function write(data) {
  buffer.push(data);
}, function end() {
  var source = buffer.join('');
});
Use through to grab the source code

Slide 40

Slide 40 text

Step 2
var falafel = require('falafel');
function end() {
  var source = buffer.join('');
  var out = falafel(source, parse).toString();
}
Use falafel to create an AST

Slide 41

Slide 41 text

Step 3
function parse(node) {
  if (node.type === 'Identifier' && node.name === 'ui') {
    node.update('browserify');
  }
}
Use falafel to transform the AST

Slide 42

Slide 42 text

Step 4
function end() {
  var source = buffer.join('');
  var out = falafel(source, parse).toString();
  this.queue(out);
  this.queue(null); // end the stream
}
Stream the source with through and close the stream

Slide 43

Slide 43 text

var through = require('through');
var falafel = require('falafel');

module.exports = function() {
  var buffer = [];
  return through(function write(data) {
    buffer.push(data);
  }, function end() {
    // Join with '' so chunk boundaries never insert characters into the source
    var source = buffer.join('');
    var out = falafel(source, parse).toString();
    this.queue(out);
    this.queue(null); // close the stream
  });
};

function parse(node) {
  if (node.type === 'Identifier' && node.name === 'ui') {
    node.update('browserify');
  }
}
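The buffering half of this transform does not depend on through or falafel. As a rough sketch, the same shape can be built on Node's built-in stream.Transform. Here `createBufferedTransform` is a hypothetical name, and `naiveRewrite` is a deliberately crude whole-word string replace standing in for the AST step so the example runs without extra packages. It is not a substitute for a real parse.

```javascript
var Transform = require('stream').Transform;

// Hypothetical factory: `rewrite` is any function (source string) -> new source string.
function createBufferedTransform(rewrite) {
  var chunks = [];
  return new Transform({
    transform: function (chunk, encoding, callback) {
      chunks.push(chunk); // Step 1: buffer up the stream
      callback();
    },
    flush: function (callback) {
      var source = Buffer.concat(chunks).toString('utf8');
      this.push(rewrite(source)); // Steps 2-4 happen inside `rewrite`
      callback();
    }
  });
}

// Stand-in for the AST step (an assumption for this demo, NOT a real parse).
function naiveRewrite(source) {
  return source.replace(/\bui\b/g, 'browserify');
}

var stream = createBufferedTransform(naiveRewrite);
var output = '';
stream.on('data', function (data) { output += data; });
stream.on('end', function () { console.log(output); });

// The source arrives in two chunks that split the word `node` in half.
stream.write('var fullstack = no');
stream.end('de + ui;');
```

Note that the chunks split an identifier in two. This is why the source has to be fully buffered and joined without a separator before it is parsed.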

Slide 44

Slide 44 text

It Works! browserify -t ./ui-to-browserify.js code.js (function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="function"&&require;if(!u&&a)return a(o,!0);if(i)return i(o,!0);var f=new Error("Cannot find module '"+o+"'");throw f.code="MODULE_NOT_FOUND",f} var l=n[o]={exports:{}};t[o][0].call(l.exports,function(e){var n=t[o][1] [e];return s(n?n:e)},l,l.exports,e,t,n,r)}return n[o].exports}var i=typeof require=="function"&&require;for(var o=0;o

Slide 45

Slide 45 text

Lots of code to do something simple?

Slide 46

Slide 46 text

Probably, but… It will do exactly what is expected 100% of the time.

Slide 47

Slide 47 text

And it’s a building block for building a bunch of cooler things.

Slide 48

Slide 48 text

What sort of cooler things?

Slide 49

Slide 49 text

How about performance?

Slide 50

Slide 50 text

No content

Slide 51

Slide 51 text

V8 doesn’t do it, but there’s nothing stopping you*.

Slide 52

Slide 52 text

*Except it’s hard.

Slide 53

Slide 53 text

A Basic Map/Filter
var a = [1, 2, 3];
var b = a.filter(function(n) {
  return n > 1;
}).map(function(k) {
  return k * 2;
});

Slide 54

Slide 54 text

Faster Like This
var a = [1, 2, 3];
var b = [];
for (var i = 0; i < a.length; i++) {
  if (a[i] > 1) {
    b.push(a[i] * 2);
  }
}

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

A Basic Recast Script
var fs = require('fs');
var recast = require('recast');
var code = fs.readFileSync('code.js', 'utf-8');
var ast = recast.parse(code);
var faster = transform(ast);
var output = recast.print(faster).code;

Slide 57

Slide 57 text

function transform(ast) {
  var transformedAST = new MapFilterEater({
    body: ast.program.body
  }).visit(ast);
  return transformedAST;
}

var Visitor = recast.Visitor;
var MapFilterEater = Visitor.extend({
  init: function(options) {},
  visitForStatement: function(ast) {},
  visitIfStatement: function(ast) {},
  visitCallExpression: function(ast) {},
  visitVariableDeclarator: function(ast) {}
});

Slide 58

Slide 58 text

How Does it Work?
1. Move the right side of the b declaration into a for loop
2. Set b = []
3. Place the .filter() contents inside of an if statement
4. Unwrap the .map contents and .push() them into b
5. Replace all of the local counters with a[_i]

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

And Voila….
var a = [1, 2, 3];
var b = [];
for (var i = 0; i < a.length; i++) {
  if (a[i] > 1) {
    b.push(a[i] * 2);
  }
}

Slide 64

Slide 64 text

Worth the effort? YES!

Slide 65

Slide 65 text

The most well-read documentation for how to engineer your app is the current codebase.

Slide 66

Slide 66 text

If you change your code, you can change the future.

Slide 67

Slide 67 text

Knowledge What is an AST and what does it look like?

Slide 68

Slide 68 text

Parser
1. Read your raw JavaScript source.
2. Parse out every single thing that’s happening.
3. Return an AST that represents your code
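To show those three steps end to end, here is a toy parser. It is only a sketch: it understands exactly one statement shape, `var <name> = <name> + <name> + ...;`, silently skips characters it does not recognize, and does not reject keywords used as names. Real parsers such as Esprima or Acorn implement the full grammar. The `tokenize` and `parse` names are hypothetical.

```javascript
// Step 1: read the raw source and split it into tokens.
function tokenize(source) {
  return source.match(/[A-Za-z_$][A-Za-z0-9_$]*|[=+;]/g) || [];
}

// Steps 2 and 3: consume the tokens and return an AST.
function parse(source) {
  var tokens = tokenize(source);
  var pos = 0;

  function next() { return tokens[pos++]; }

  function expect(value) {
    var token = next();
    if (token !== value) {
      throw new SyntaxError('Expected ' + value + ' but found ' + token);
    }
  }

  function identifier() {
    var token = next();
    if (typeof token !== 'string' || !/^[A-Za-z_$]/.test(token)) {
      throw new SyntaxError('Expected identifier but found ' + token);
    }
    return { type: 'Identifier', name: token };
  }

  // a + b + c parses as ((a + b) + c), i.e. left-associative.
  function expression() {
    var left = identifier();
    while (tokens[pos] === '+') {
      next();
      left = { type: 'BinaryExpression', left: left, operator: '+', right: identifier() };
    }
    return left;
  }

  expect('var');
  var id = identifier();
  expect('=');
  var init = expression();
  expect(';');

  return {
    type: 'Program',
    body: [{
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [{ type: 'VariableDeclarator', id: id, init: init }]
    }]
  };
}

var ast = parse('var fullstack = node + ui;');
console.log(JSON.stringify(ast, null, 2));
```

For this one input, the printed tree has the same node types and nesting as the JSON shown at the start of the talk.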

Slide 69

Slide 69 text

Esprima is a very popular* parser that converts your code into an abstract syntax tree. *FB recently forked it to add support for ES6 and JSX

Slide 70

Slide 70 text

Parsers narcissus ZeParser Treehugger Uglify-JS Esprima Acorn

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

Esprima follows the Mozilla Parser API, which is a well-documented AST format used internally by Mozilla (and now by basically everyone else*)

Slide 73

Slide 73 text

var fullstack = node + ui;

Slide 74

Slide 74 text

{ "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }

Slide 75

Slide 75 text

{ "type": "Program", "body": [ { "type": "VariableDeclaration", "kind": "var", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "fullstack" }, "init": { "type": "BinaryExpression", "left": { "type": "Identifier", "name": "node" }, "operator": "+", "right": { "type": "Identifier", "name": "ui" } } } ] } ] }

Slide 76

Slide 76 text

Node Types SwitchCase (1) Property (1) Literal (1) Identifier (1) Declaration (3) Expression (14) Statement (18)

Slide 77

Slide 77 text

Expression Types • FunctionExpression • MemberExpression • CallExpression • NewExpression • ConditionalExpression • LogicalExpression • UpdateExpression • AssignmentExpression • BinaryExpression • UnaryExpression • SequenceExpression • ObjectExpression • ArrayExpression • ThisExpression

Slide 78

Slide 78 text

Statement Types • DebuggerStatement • ForInStatement • ForStatement • DoWhileStatement • WhileStatement • CatchClause • TryStatement • ThrowStatement • ReturnStatement • SwitchStatement • WithStatement • ContinueStatement • BreakStatement • LabeledStatement • IfStatement • ExpressionStatement • BlockStatement • EmptyStatement

Slide 79

Slide 79 text

Debugging ASTs

Slide 80

Slide 80 text

• When debugging, console.log(ast) will not print a large nested AST properly. Instead you can use util.inspect:
 
 var util = require('util');
 var tree = util.inspect(ast, { depth: null });
 console.log(tree);
• When transforming code, start with the AST you want and then work backward.
• Often this means pasting code into the Esprima online visualization tool or just outputting the trees into JS files and manually diffing them.

Slide 81

Slide 81 text

Oftentimes it helps to print out the code representation of a single node. 
 
 In recast you can do:
 var source = recast.prettyPrint(ast, { tabWidth: 2 }).code;
 
 In ESLint you can get the source of the current node with:
 var source = context.getSource(node);

Slide 82

Slide 82 text

ASTs can turn your code into play-dough

Slide 83

Slide 83 text

Totally worth the effort!

Slide 84

Slide 84 text

the end @xjamundx

Slide 85

Slide 85 text

Related
- http://pegjs.majda.cz/
- https://www.kickstarter.com/projects/michaelficarra/make-a-better-coffeescript-compiler
- http://coffeescript.org/documentation/docs/grammar.html
- https://github.com/padolsey/parsers-built-with-js
- https://github.com/zaach/jison
- https://github.com/substack/node-falafel - esprima-based code modifier
- https://github.com/substack/node-burrito - uglify-based AST walking code-modifier
- https://github.com/jscs-dev/node-jscs/blob/e745ceb23c5f1587c3e43c0a9cfb05f5ad86b5ac/lib/js-file.js - JSCS’s way of walking the AST
- https://www.npmjs.org/package/escodegen - converts an AST into real code again
- https://www.npmjs.org/package/ast-types - esprima-ish parser
- http://esprima.org/demo/parse.html - the most helpful tool
- https://github.com/RReverser/estemplate - AST-based search and replace
- https://www.npmjs.org/package/aster - build system thing
Technical Papers
- http://aosd.net/2013/escodegen.html
Videos / Slides
- http://slidedeck.io/benjamn/fluent2014-talk
- http://vimeo.com/93749422
- https://speakerdeck.com/michaelficarra/spidermonkey-parser-api-a-standard-for-structured-js-representations
- https://www.youtube.com/watch?v=fF_jZ7ErwUY
- https://speakerdeck.com/ariya/bringing-javascript-code-analysis-to-the-next-level
Just in Time Compilers
- http://blogs.msdn.com/b/ie/archive/2012/06/13/advances-in-javascript-performance-in-ie10-and-windows-8.aspx
- https://blog.mozilla.org/luke/2014/01/14/asm-js-aot-compilation-and-startup-performance/
- https://blog.indutny.com/4.how-to-start-jitting
Podcasts
- http://javascriptjabber.com/082-jsj-jshint-with-anton-kovalyov/
- http://javascriptjabber.com/054-jsj-javascript-parsing-asts-and-language-grammar-w-david-herman-and-ariya-hidayat/

Slide 86

Slide 86 text

RANDOM EXTRA SLIDES

Slide 87

Slide 87 text

No content

Slide 88

Slide 88 text

Static analysis tools like ESLint and JSCS provide an API to let you inspect an AST to make sure it’s following certain patterns.

Slide 89

Slide 89 text

No content

Slide 90

Slide 90 text

No content

Slide 91

Slide 91 text

No content

Slide 92

Slide 92 text

No content

Slide 93

Slide 93 text

function isEmptyObject( obj ) {
  for ( var name in obj ) {
    return false;
  }
  return true;
}

Slide 94

Slide 94 text

static analysis > unit testing > functional testing

Slide 95

Slide 95 text

No content

Slide 96

Slide 96 text

function loadUser(req, res, next) { User.loadUser(function(err, user) { if (err) { next(err); } req.session.user = user; next(); }); } Another Example

Slide 97

Slide 97 text

Abstract Christmas Trees Program VariableDeclarator FunctionExpression ExpressionStatement Identifier Identifier VariableDeclaration

Slide 98

Slide 98 text

No content
