Don't be Afraid of ASTs

An early guide to abstract syntax trees in JavaScript

Jamund Ferguson

November 18, 2014

Transcript

  1. Our Basic Plan

    1. High-level overview
    2. Static Analysis with ASTs
    3. Transforming and refactoring
    4. A quick look at the Mozilla Parser API (de-facto standard AST format)
  2. {
       "type": "Program",
       "body": [
         {
           "type": "VariableDeclaration",
           "kind": "var",
           "declarations": [
             {
               "type": "VariableDeclarator",
               "id": { "type": "Identifier", "name": "fullstack" },
               "init": {
                 "type": "BinaryExpression",
                 "left": { "type": "Identifier", "name": "node" },
                 "operator": "+",
                 "right": { "type": "Identifier", "name": "ui" }
               }
             }
           ]
         }
       ]
     }
  4. Things Built On ASTs

    • Syntax Highlighting
    • Code Completion
    • Static Analysis (aka JSLint, etc.)
    • Code Coverage
    • Minification
    • JIT Compilation
    • Source Maps
    • Compile-to-JS Languages

    So much more…
  5. function loadUser(req, res, next) {
       User.loadUser(function(err, user) {
         req.session.user = user;
         next();
       });
     }

    Bad Example: we forgot to handle the error!
  6. handle-callback-err

    1. Each time a function is declared, check whether it has an error* parameter
    2. If so, set a count to 0
    3. Increment the count each time the error is used
    4. At the end of the function, warn if the count is still 0

    * the parameter name can be defined by the user
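
    The four steps above can be sketched without any linter at all. This is a minimal illustration, not the actual handle-callback-err source: the AST for the callback from the bad example is written by hand, `id`, `countIdentifier` and `checkFunction` are hypothetical helper names, and the warning text is only illustrative. It also simplifies by counting every identifier with a matching name, so it ignores shadowing and would count a property such as `foo.err`.

```javascript
// Small helper to keep the hand-written AST readable.
function id(name) { return { type: 'Identifier', name: name }; }

// Hand-written AST for:
//   function(err, user) { req.session.user = user; next(); }
var callback = {
  type: 'FunctionExpression',
  params: [id('err'), id('user')],
  body: {
    type: 'BlockStatement',
    body: [{
      type: 'ExpressionStatement',
      expression: {
        type: 'AssignmentExpression',
        operator: '=',
        left: {
          type: 'MemberExpression',
          computed: false,
          object: {
            type: 'MemberExpression',
            computed: false,
            object: id('req'),
            property: id('session')
          },
          property: id('user')
        },
        right: id('user')
      }
    }, {
      type: 'ExpressionStatement',
      expression: { type: 'CallExpression', callee: id('next'), arguments: [] }
    }]
  }
};

// Steps 2 and 3: count Identifier nodes with the given name below `node`.
function countIdentifier(node, name) {
  if (Array.isArray(node)) {
    return node.reduce(function (sum, child) {
      return sum + countIdentifier(child, name);
    }, 0);
  }
  if (!node || typeof node !== 'object') {
    return 0;
  }
  var own = (node.type === 'Identifier' && node.name === name) ? 1 : 0;
  return Object.keys(node).reduce(function (sum, key) {
    return sum + countIdentifier(node[key], name);
  }, own);
}

// Steps 1 and 4: only check functions that declare the error parameter,
// and warn when it is never referenced in the body.
function checkFunction(fn, errName) {
  var hasErrParam = fn.params.some(function (param) {
    return param.name === errName;
  });
  if (!hasErrParam) {
    return null;
  }
  return countIdentifier(fn.body, errName) === 0
    ? 'Expected error to be handled.'
    : null;
}

console.log(checkFunction(callback, 'err'));
```

    Passing a different parameter name as the second argument corresponds to the user-defined name mentioned in the footnote.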
  7. History Lesson

    • 1995: JavaScript
    • 2002: JSLint started by Douglas Crockford
    • 2011: JSHint comes out as a fork of JSLint. Esprima AST parser released.
    • 2012: plato, escomplex, complexity-report
    • 2013: Nicholas Zakas releases ESLint. Marat Dulin releases JSCS.
  8. no-console

    return {
      "MemberExpression": function(node) {
        if (node.object.name === "console") {
          context.report(node, "Unexpected console statement.");
        }
      }
    };

    https://github.com/eslint/eslint/blob/master/lib/rules/no-console.js
  9. no-loop-func

    function checkForLoops(node) {
      var ancestors = context.getAncestors();
      if (ancestors.some(function(ancestor) {
        return ancestor.type === "ForStatement" ||
          ancestor.type === "WhileStatement" ||
          ancestor.type === "DoWhileStatement";
      })) {
        context.report(node, "Don't make functions within a loop");
      }
    }

    return {
      "FunctionExpression": checkForLoops,
      "FunctionDeclaration": checkForLoops
    };
  10. max-params

    var numParams = context.options[0] || 3;

    function checkParams(node) {
      if (node.params.length > numParams) {
        context.report(node, "This function has too many parameters ({{count}}). Maximum allowed is {{max}}.", {
          count: node.params.length,
          max: numParams
        });
      }
    }

    return {
      "FunctionDeclaration": checkParams,
      "FunctionExpression": checkParams
    };
  11. no-jquery

    function isjQuery(name) {
      return name === '$' || name === 'jquery' || name === 'jQuery';
    }

    return {
      "CallExpression": function(node) {
        var name = node.callee && node.callee.name;
        if (isjQuery(name)) {
          context.report(node, 'Please avoid using jQuery here.');
        }
      }
    };
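
    The rules above all share one shape: they receive a `context` and return handlers keyed by node type. To see how such a rule might be driven, here is a rough sketch with a stand-in `context` and a hand-written AST. `runRule` is a hypothetical driver written for this example, not ESLint's real implementation, and the rule is wrapped in a plain function (`noJquery`) purely for illustration.

```javascript
// The no-jquery rule from the slide, wrapped in a function that takes `context`.
function noJquery(context) {
  function isjQuery(name) {
    return name === '$' || name === 'jquery' || name === 'jQuery';
  }
  return {
    CallExpression: function (node) {
      var name = node.callee && node.callee.name;
      if (isjQuery(name)) {
        context.report(node, 'Please avoid using jQuery here.');
      }
    }
  };
}

// Hypothetical driver: walk every node and call the handler registered
// for its type, collecting whatever the rule reports.
function runRule(rule, ast) {
  var messages = [];
  var context = {
    report: function (node, message) { messages.push(message); }
  };
  var handlers = rule(context);

  (function walk(node) {
    if (Array.isArray(node)) {
      node.forEach(function (child) { walk(child); });
      return;
    }
    if (!node || typeof node !== 'object') {
      return;
    }
    if (typeof handlers[node.type] === 'function') {
      handlers[node.type](node);
    }
    Object.keys(node).forEach(function (key) { walk(node[key]); });
  })(ast);

  return messages;
}

// Hand-written AST for: $('#app');
var ast = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: '$' },
      arguments: [{ type: 'Literal', value: '#app' }]
    }
  }]
};

var messages = runRule(noJquery, ast);
console.log(messages);
```

    A real linter also tracks ancestors, options and source locations, which is what `context.getAncestors()` and `context.options` expose in the rules above.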
  12. Other Areas for Static Analysis

    Code complexity analysis and visualization is another area where static analysis is really useful. Plato is an exciting start, but I believe many more interesting things can be done in this area.
  13. Recap

    • Static analysis can help you catch real bugs and keep your code maintainable
    • ESLint and JSCS both use ASTs to inspect your code, which makes it easy to add new rules cleanly
    • Static analysis can also help you manage your code complexity
    • What exactly does a for loop sound like?
  14. Tools like falafel and recast give you an API to manipulate an AST and then convert that back into source code.
  15. Two Types of AST Transformations

    Regenerative
    Regenerate the full file from the AST. This often loses comments and non-essential formatting. Fine for code not read by humans (e.g. browserify transforms).

    Partial-source transformation
    Regenerate only the parts of the source that have changed based on the AST modifications. Nicer for one-time changes in source.
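
    The partial-source idea can be illustrated with plain string offsets. In this sketch the edit list is written by hand and `applyEdits` is a hypothetical helper; in practice the start and end offsets would come from the parser (Esprima, for example, can attach a `range` to each node). Everything outside the edited ranges, including the comment, is left exactly as it was.

```javascript
var source = 'var fullstack = node + ui; // keep this comment';

// Hypothetical edit: replace the characters occupied by the `ui` identifier.
var start = source.indexOf('ui');
var edits = [{ start: start, end: start + 2, text: 'browserify' }];

function applyEdits(source, edits) {
  // Apply edits from the end of the file backward so that
  // earlier offsets remain valid after each replacement.
  return edits
    .slice()
    .sort(function (a, b) { return b.start - a.start; })
    .reduce(function (out, edit) {
      return out.slice(0, edit.start) + edit.text + out.slice(edit.end);
    }, source);
}

var output = applyEdits(source, edits);
console.log(output);
// var fullstack = node + browserify; // keep this comment
```

    A regenerative transform would instead print the whole file from the tree, which is why comments and formatting tend to disappear.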
  16. 4 Steps

    1. Buffer up the stream of source code
    2. Convert the source into an AST
    3. Transform the AST
    4. Re-generate and output the source
  17. Step 1

    var through = require('through');
    var buffer = [];

    return through(function write(data) {
      buffer.push(data);
    }, function end() {
      var source = buffer.join('');
    });

    Use through to grab the source code
  18. Step 2

    var falafel = require('falafel');

    function end() {
      var source = buffer.join('');
      var out = falafel(source, parse).toString();
    }

    Use falafel to create an AST
  19. Step 3

    function parse(node) {
      // Identifier nodes carry their text in `name`
      if (node.type === 'Identifier' && node.name === 'ui') {
        node.update('browserify');
      }
    }

    Use falafel to transform the AST
  20. Step 4

    function end() {
      var source = buffer.join('');
      var out = falafel(source, parse).toString();
      this.queue(out);
      this.queue(null); // end the stream
    }

    Stream the source with through and close the stream
  21. var through = require('through');
      var falafel = require('falafel');

      module.exports = function() {
        var buffer = [];

        return through(function write(data) {
          buffer.push(data);
        }, function end() {
          // chunks are arbitrary slices of the file, so join without a separator
          var source = buffer.join('');
          var out = falafel(source, parse).toString();
          this.queue(out);
          this.queue(null); // close the stream
        });
      };

      function parse(node) {
        if (node.type === 'Identifier' && node.name === 'ui') {
          node.update('browserify');
        }
      }
  22. It Works!

    browserify -t ./ui-to-browserify.js code.js

    (function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="function"&&require;if(!u&&a)return a(o,!0);if(i)return i(o,!0);var f=new Error("Cannot find module '"+o+"'");throw f.code="MODULE_NOT_FOUND",f}var l=n[o]={exports:{}};t[o][0].call(l.exports,function(e){var n=t[o][1][e];return s(n?n:e)},l,l.exports,e,t,n,r)}return n[o].exports}var i=typeof require=="function"&&require;for(var o=0;o<r.length;o++)s(r[o]);return s})({1:[function(require,module,exports){
    var fullstack = node + browserify;
    },{}]},{},[1]);
  23. A Basic Map/Filter

    var a = [1, 2, 3];
    var b = a.filter(function(n) {
      return n > 1;
    }).map(function(k) {
      return k * 2;
    });
  24. Faster Like This

    var a = [1, 2, 3];
    var b = [];
    for (var i = 0; i < a.length; i++) {
      if (a[i] > 1) {
        b.push(a[i] * 2);
      }
    }
  25. A Basic Recast Script

    var fs = require('fs');
    var recast = require('recast');

    var code = fs.readFileSync('code.js', 'utf-8');
    var ast = recast.parse(code);
    var faster = transform(ast);
    var output = recast.print(faster).code;
  26. function transform(ast) {
        var transformedAST = new MapFilterEater({
          body: ast.program.body
        }).visit(ast);
        return transformedAST;
      }

      var Visitor = recast.Visitor;
      var MapFilterEater = Visitor.extend({
        init: function(options) {},
        visitForStatement: function(ast) {},
        visitIfStatement: function(ast) {},
        visitCallExpression: function(ast) {},
        visitVariableDeclarator: function(ast) {}
      });
  27. How Does it Work?

    1. Move the right side of the b declaration into a for loop
    2. Set b = []
    3. Place the .filter() contents inside of an if statement
    4. Unwrap the .map() contents and .push() them into b
    5. Replace all of the local counters with a[_i]
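
    To make the five steps concrete, here is a deliberately simplified sketch that works on strings rather than on AST nodes. A real script would do the same moves on the tree through the recast visitor shown earlier; the `parts` object, `inline` and `buildLoop` are hypothetical names standing in for what the visitor would extract and emit.

```javascript
// Hypothetical description of the pieces pulled out of:
//   var b = a.filter(function(n) { return n > 1; })
//            .map(function(k) { return k * 2; });
var parts = {
  source: 'a',
  target: 'b',
  filter: { param: 'n', expr: 'n > 1' },
  map: { param: 'k', expr: 'k * 2' }
};

// Step 5: replace whole-word uses of a callback parameter with a[i].
function inline(expr, param, replacement) {
  return expr.replace(new RegExp('\\b' + param + '\\b', 'g'), replacement);
}

function buildLoop(p) {
  var item = p.source + '[i]';
  return [
    // Step 2: the target starts out as an empty array.
    'var ' + p.target + ' = [];',
    // Step 1: the work moves into a for loop over the source array.
    'for (var i = 0; i < ' + p.source + '.length; i++) {',
    // Step 3: the filter body becomes the if condition.
    '  if (' + inline(p.filter.expr, p.filter.param, item) + ') {',
    // Step 4: the map body is pushed into the target.
    '    ' + p.target + '.push(' + inline(p.map.expr, p.map.param, item) + ');',
    '  }',
    '}'
  ].join('\n');
}

var loopSource = buildLoop(parts);
console.log(loopSource);
```

    String replacement like this breaks down quickly (nested scopes, multi-statement callbacks), which is the reason for doing the real transformation on the AST.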
  28. And Voilà…

    var a = [1, 2, 3];
    var b = [];
    for (var i = 0; i < a.length; i++) {
      if (a[i] > 1) {
        b.push(a[i] * 2);
      }
    }
  29. Parser

    1. Read your raw JavaScript source.
    2. Parse out every single thing that’s happening.
    3. Return an AST that represents your code
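
    As a toy illustration of those three steps, the sketch below parses only statements of the form `var x = a + b;` and returns a tree in the same shape as the JSON shown in this deck. It is nothing like a real parser such as Esprima: the function names are made up for this example, unknown characters are silently skipped, and keywords are not distinguished from identifiers.

```javascript
// Steps 1 and 2: read the raw source and split it into tokens.
function tokenize(source) {
  return source.match(/[A-Za-z_$][\w$]*|[=+\-;]/g) || [];
}

// Step 3: build a Parser-API-shaped tree for
//   var <id> = <id> (+|-) <id> ... ;
function parse(source) {
  var tokens = tokenize(source);
  var pos = 0;

  function next() { return tokens[pos++]; }

  function expect(value) {
    var token = next();
    if (token !== value) {
      throw new SyntaxError('Expected ' + value + ' but found ' + token);
    }
  }

  function identifier() {
    var token = next();
    if (!/^[A-Za-z_$]/.test(token || '')) {
      throw new SyntaxError('Expected identifier but found ' + token);
    }
    return { type: 'Identifier', name: token };
  }

  // Left-associative chain of + and - operators.
  function expression() {
    var left = identifier();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      var operator = next();
      left = {
        type: 'BinaryExpression',
        left: left,
        operator: operator,
        right: identifier()
      };
    }
    return left;
  }

  expect('var');
  var id = identifier();
  expect('=');
  var init = expression();
  expect(';');

  return {
    type: 'Program',
    body: [{
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [{ type: 'VariableDeclarator', id: id, init: init }]
    }]
  };
}

var ast = parse('var fullstack = node + ui;');
console.log(JSON.stringify(ast, null, 2));
```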
  30. Esprima is a very popular* parser that converts your code into an abstract syntax tree.

    * FB recently forked it to add support for ES6 and JSX
  31. Esprima follows the Mozilla Parser API, a well-documented AST format used internally by Mozilla (and now by basically everyone else*)
  32. {
        "type": "Program",
        "body": [
          {
            "type": "VariableDeclaration",
            "kind": "var",
            "declarations": [
              {
                "type": "VariableDeclarator",
                "id": { "type": "Identifier", "name": "fullstack" },
                "init": {
                  "type": "BinaryExpression",
                  "left": { "type": "Identifier", "name": "node" },
                  "operator": "+",
                  "right": { "type": "Identifier", "name": "ui" }
                }
              }
            ]
          }
        ]
      }
  34. Node Types

    SwitchCase (1)
    Property (1)
    Literal (1)
    Identifier (1)
    Declaration (3)
    Expression (14)
    Statement (18)
  35. Expression Types

    • FunctionExpression
    • MemberExpression
    • CallExpression
    • NewExpression
    • ConditionalExpression
    • LogicalExpression
    • UpdateExpression
    • AssignmentExpression
    • BinaryExpression
    • UnaryExpression
    • SequenceExpression
    • ObjectExpression
    • ArrayExpression
    • ThisExpression
  36. Statement Types

    • DebuggerStatement
    • ForInStatement
    • ForStatement
    • DoWhileStatement
    • WhileStatement
    • CatchClause
    • TryStatement
    • ThrowStatement
    • ReturnStatement
    • SwitchStatement
    • WithStatement
    • ContinueStatement
    • BreakStatement
    • LabeledStatement
    • IfStatement
    • ExpressionStatement
    • BlockStatement
    • EmptyStatement
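
    Because every node carries a `type` string like the ones listed above, a single generic walker can visit any tree in this format. The sketch below, which assumes nothing beyond plain objects and arrays, counts node types in the AST for `var fullstack = node + ui;`. `walk` is a hypothetical helper written for this example, not part of any library.

```javascript
// The AST for `var fullstack = node + ui;` as a plain object literal.
var ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'fullstack' },
      init: {
        type: 'BinaryExpression',
        left: { type: 'Identifier', name: 'node' },
        operator: '+',
        right: { type: 'Identifier', name: 'ui' }
      }
    }]
  }]
};

// Call visit(node) for every object that has a string `type`,
// then recurse into all of its properties.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach(function (child) { walk(child, visit); });
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') {
      visit(node);
    }
    Object.keys(node).forEach(function (key) {
      walk(node[key], visit);
    });
  }
}

var counts = {};
walk(ast, function (node) {
  counts[node.type] = (counts[node.type] || 0) + 1;
});
console.log(counts);
```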
  37. • When debugging, console.log(ast) will not print a large nested AST properly. Instead you can use util.inspect:

        var util = require('util');
        var tree = util.inspect(ast, { depth: null });
        console.log(tree);

    • When transforming code, start with the AST you want and then work backward.
    • Often this means pasting code into the Esprima online visualization tool, or just outputting the trees into JS files and manually diffing them.
  38. Oftentimes it helps to print out the code representation of a single node.

    In recast you can do:

        var source = recast.prettyPrint(ast, { tabWidth: 2 }).code;

    In ESLint you can get the source of the current node with:

        var source = context.getSource(node);
  39. Related

    - http://pegjs.majda.cz/
    - https://www.kickstarter.com/projects/michaelficarra/make-a-better-coffeescript-compiler
    - http://coffeescript.org/documentation/docs/grammar.html
    - https://github.com/padolsey/parsers-built-with-js
    - https://github.com/zaach/jison
    - https://github.com/substack/node-falafel - esprima-based code modifier
    - https://github.com/substack/node-burrito - uglify-based AST walking code-modifier
    - https://github.com/jscs-dev/node-jscs/blob/e745ceb23c5f1587c3e43c0a9cfb05f5ad86b5ac/lib/js-file.js - JSCS’s way of walking the AST
    - https://www.npmjs.org/package/escodegen - converts an AST into real code again
    - https://www.npmjs.org/package/ast-types - esprima-ish parser
    - http://esprima.org/demo/parse.html - the most helpful tool
    - https://github.com/RReverser/estemplate - AST-based search and replace
    - https://www.npmjs.org/package/aster - build system thing

    Technical Papers
    - http://aosd.net/2013/escodegen.html

    Videos / Slides
    - http://slidedeck.io/benjamn/fluent2014-talk
    - http://vimeo.com/93749422
    - https://speakerdeck.com/michaelficarra/spidermonkey-parser-api-a-standard-for-structured-js-representations
    - https://www.youtube.com/watch?v=fF_jZ7ErwUY
    - https://speakerdeck.com/ariya/bringing-javascript-code-analysis-to-the-next-level

    Just in Time Compilers
    - http://blogs.msdn.com/b/ie/archive/2012/06/13/advances-in-javascript-performance-in-ie10-and-windows-8.aspx
    - https://blog.mozilla.org/luke/2014/01/14/asm-js-aot-compilation-and-startup-performance/
    - https://blog.indutny.com/4.how-to-start-jitting

    Podcasts
    - http://javascriptjabber.com/082-jsj-jshint-with-anton-kovalyov/
    - http://javascriptjabber.com/054-jsj-javascript-parsing-asts-and-language-grammar-w-david-herman-and-ariya-hidayat/
  40. Static analysis tools like ESLint and JSCS provide an API to let you inspect an AST to make sure it’s following certain patterns.
  41. function isEmptyObject( obj ) {
        for ( var name in obj ) {
          return false;
        }
        return true;
      }
  42. function loadUser(req, res, next) {
        User.loadUser(function(err, user) {
          if (err) {
            next(err);
          }
          req.session.user = user;
          next();
        });
      }

    Another Example