Bringing JavaScript Code Analysis to The Next Level

EmpireJS 2012 Talk

Ariya Hidayat

October 22, 2012

Transcript

  1. Bringing JavaScript Code Analysis to The Next Level

    New York, October 22, 2012. Ariya Hidayat.

  2. whoami

  3. From Spelling Checker to Grammar Enforcement

    "Your so wrong, therefore you loose!" No misspelled word. Wrong choice of words!
  4. Quality: Practical Aspects

    Avoid silly mistakes. Write readable code. Do not provoke ambiguities. Improve future maintenance. Learn better code patterns.

  5. Multiple Layers of Defense

  6. Tools and Mistakes Likelihood

    [Figure: S = Skill, C = Complexity. Label: Average Engineer.]
  7. Better Use of the Tools

    Engineer, tools, feedback cycle. Not everyone is a JavaScript ninja. Boring, repetitive, time-consuming.
  8. Adaptive Quality Criteria

    Explicit: customize analysis options, define new sets of rules.
    Implicit: infer from high-quality sample, observe the engineer’s behavior.

  9. Foundation

  10. JavaScript in the Browser

    [Diagram: User Interface, Browser Engine, Render Engine, JavaScript Engine, Networking I/O, Graphics Stack, Data Persistence.]
  11. JavaScript Engine Building Blocks

    [Diagram: Source, Parser, Syntax Tree, Virtual Machine/Interpreter, Runtime (built-in objects, host objects, ...).] Fast and conservative.
  12. Tokenization

        var answer = 42;

    var: keyword. answer: identifier. =: equal sign. 42: number. ;: semicolon.
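
The tokenization step on this slide can be sketched in a few lines. This is a toy regular-expression tokenizer, not Esprima's actual algorithm; `tokenize`, the abbreviated `KEYWORDS` set, and the token type names are assumptions made for illustration.

```javascript
// Toy tokenizer: an illustrative sketch, not how a production parser works.
// KEYWORDS is deliberately incomplete; tokenize() is a hypothetical helper.
const KEYWORDS = new Set(['var', 'function', 'if', 'else', 'return', 'while', 'for']);

function tokenize(source) {
  const text = source.trim();
  // Alternatives, in order: identifier-like word, decimal integer, one punctuator.
  const pattern = /\s*(?:([A-Za-z_$][\w$]*)|(\d+)|([=;(){}.,+\-*\/<>!]))/y;
  const tokens = [];
  while (pattern.lastIndex < text.length) {
    const position = pattern.lastIndex;
    const match = pattern.exec(text);
    if (match === null) {
      throw new SyntaxError('Unexpected character near offset ' + position);
    }
    const [, word, number, punctuator] = match;
    if (word !== undefined) {
      tokens.push({ type: KEYWORDS.has(word) ? 'Keyword' : 'Identifier', value: word });
    } else if (number !== undefined) {
      tokens.push({ type: 'Numeric', value: number });
    } else {
      tokens.push({ type: 'Punctuator', value: punctuator });
    }
  }
  return tokens;
}

console.log(tokenize('var answer = 42;'));
```

For the slide's input this yields five tokens: a keyword, an identifier, a punctuator, a number, and a closing punctuator.
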
  13. Syntax Tree

    Variable Declaration: Identifier (answer), Literal Constant (42).

  14. JavaScript Parser Written in JavaScript

    UglifyJS, ZeParser, Esprima, Traceur, Narcissus, Acorn
  15. Specification Conformance

    • ECMA-262 compliant
    • Automatic semicolon insertion
    • Strict Mode, e.g. “use strict”
    • Unicode for identifiers

        'use strict';
        var 日本語 = 1
        return 日本語
  16. Heavily Tested

    • > 550 unit tests
    • Compatibility tests
    • 100% code coverage
    • Performance tests

    Enforced during development
  17. Sensible Syntax Tree

        answer = 42

        {
            type: "Program",
            body: [
                {
                    type: "ExpressionStatement",
                    expression: {
                        type: "AssignmentExpression",
                        operator: "=",
                        left: { type: "Identifier", name: "answer" },
                        right: { type: "Literal", value: 42 }
                    }
                }
            ]
        }

    https://developer.mozilla.org/en/SpiderMonkey/Parser_API
    Try online! http://esprima.org/demo/parse.html
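
Because the tree is plain nested objects and arrays, a generic walk needs no knowledge of individual node types. The sketch below hard-codes the tree from this slide; `traverse` is a hypothetical helper written for illustration, not a function from Esprima.

```javascript
// The syntax tree shown on the slide for `answer = 42`.
const syntax = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: { type: 'Identifier', name: 'answer' },
      right: { type: 'Literal', value: 42 }
    }
  }]
};

// traverse() is a hypothetical helper: it visits every object that has a
// string `type` property, depth first, in property order.
function traverse(node, visit) {
  if (node === null || typeof node !== 'object') {
    return;
  }
  if (Array.isArray(node)) {
    node.forEach(function (child) { traverse(child, visit); });
    return;
  }
  if (typeof node.type === 'string') {
    visit(node);
  }
  Object.keys(node).forEach(function (key) { traverse(node[key], visit); });
}

const types = [];
traverse(syntax, function (node) { types.push(node.type); });
console.log(types.join(' > '));
```

The printed chain runs from `Program` down to the `Identifier` and `Literal` leaves.
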
  18. Specification, Parser Code, Syntax Tree

    ECMA-262 Annex A.4: while ( Expression ) Statement

        function parseWhileStatement() {
            var test, body;

            expectKeyword('while');
            expect('(');
            test = parseExpression();
            expect(')');
            body = parseStatement();

            return {
                type: 'WhileStatement',
                test: test,
                body: body
            };
        }
  19. High Performance

    Speed comparison: Esprima versus UglifyJS V1, on Chrome 18 and Internet Explorer 9.
    [Chart values: 922 ms, 567 ms, 620 ms, 233 ms.]
    Benchmark corpus: jQuery, Prototype, MooTools, ExtCore, ...
  20. Syntax Node Location

        answer = 42

        {
            type: "ExpressionStatement",
            expression: {
                type: "AssignmentExpression",
                operator: "=",
                left: { type: "Identifier", name: "answer", range: [0, 6] },
                right: { type: "Literal", value: 42, range: [9, 11] },
                range: [0, 11]
            },
            range: [0, 11]
        }
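
Each `range` is a pair of character offsets into the source, with the end offset exclusive, so the original text of a node can be recovered by slicing. A minimal sketch using the two leaf nodes from this slide; `textOf` is a hypothetical helper name.

```javascript
const source = 'answer = 42';

// Leaf nodes copied from the slide.
const left = { type: 'Identifier', name: 'answer', range: [0, 6] };
const right = { type: 'Literal', value: 42, range: [9, 11] };

// textOf() is a hypothetical helper: range[0] is the first character of the
// node, range[1] is one past its last character.
function textOf(node) {
  return source.slice(node.range[0], node.range[1]);
}

console.log(textOf(left));
console.log(textOf(right));
```
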
  21. Error Tolerant

    Useful for IDE, editors, ...

        var msg = "Hello';                  Mismatched quote
        person..age = 18;                   Too many dots
        if (person.                         Incomplete, still typing
        'use strict'; with (person) { }     Strict mode violation
  22. Handle the Comments

    Documentation tool: https://github.com/thejohnfreeman/jfdoc
    Code annotation: https://github.com/goatslacker/node-typedjs

        // Life, Universe, and Everything
        answer = 42

        comments: [
            {
                range: [0, 34],
                type: "Line",
                value: " Life, Universe, and Everything"
            }
        ]
  23. Forward Looking

    Experimental ‘harmony’ branch

    Block scope:

        { let x; const y = 0; }

    Object initializer shorthand:

        var point = {x, y};

    Module & class:

        module LinearAlgebra {
            export const System = 'Cartesian';
        }

        class Vector3 {
            constructor (x, y, z) {
                this.x = x; this.y = y; this.z = z;
            }
        }

    Destructuring assignment:

        point = {14, 3, 77};
        {x, y, z} = point;
  24. Code Regeneration

    Source -> Esprima -> Syntax Tree -> Escodegen -> Source

    Syntax transformation: shorten variable name, inline short function, remove dead code, obfuscate.
  25. Tools

  26. Brackets of Tools

    Static analysis. Inspection. Dynamic analysis. Transformation.

  27. Inspection

  28. Syntax Tree Visualization

        answer = 42

    http://esprima.org/demo/parse.html

  29. Syntax Demystifying

    Block statement

  30. Syntax Validation

    Try online! http://esprima.org/demo/validate.html

        esvalidate test.js

  31. Part of Continuous Integration

    JUnit XML + Jenkins

  32. Linter vs Validator

    Validator: looks for specification violations, does not care about coding style, works well on generated/minified code.
    Linter: searches for suspicious patterns, warns on style inconsistencies, works well on hand-written code.
  33. (Git) Precommit Hook

        files=$(git diff-index --name-only HEAD | grep -P '\.js$')
        for file in $files; do
            esvalidate $file
            if [ $? -eq 1 ]; then
                echo "Syntax error: $file"
                exit 1
            fi
        done
  34. Code Outline

    Eclipse. Functions, variables.

  35. Content Assist/Autocomplete/IntelliSense

    Scripted

  36. Fragment Highlighting

    http://esprima.org/demo/highlight.html

  37. Most Popular Keywords

    http://ariya.ofilabs.com/2012/03/most-popular-javascript-keywords.html

    this 3229, function 3108, if 3063, return 2878, var 2116, else 562, for 436, new 232, in 225, typeof 188, while 143, case 122, break 115, try 84, catch 84, delete 72, throw 38, switch 35, continue 25, default 15, instanceof 14, do 12, void 10, finally 4

        var fs = require('fs'),
            esprima = require('esprima'),
            files = process.argv.splice(2);

        files.forEach(function (filename) {
            var content = fs.readFileSync(filename, 'utf-8'),
                tokens = esprima.parse(content, { tokens: true }).tokens;

            tokens.forEach(function (token) {
                if (token.type === 'Keyword') {
                    console.log(token.value);
                }
            });
        });
  38. Most Popular Statements

    http://ariya.ofilabs.com/2012/04/most-popular-javascript-statements.html

    ExpressionStatement 6728, BlockStatement 6353, IfStatement 3063, ReturnStatement 2878, VariableDeclaration 2116, FunctionDeclaration 371, ForStatement 293, ForInStatement 143, WhileStatement 131, BreakStatement 115, TryStatement 84, EmptyStatement 66, ThrowStatement 38, SwitchStatement 35, ContinueStatement 25, DoWhileStatement 12, LabeledStatement 6

        var fs = require('fs'),
            esprima = require('esprima'),
            files = process.argv.splice(2);

        files.forEach(function (filename) {
            var content = fs.readFileSync(filename, 'utf-8'),
                syntax = esprima.parse(content);

            JSON.stringify(syntax, function (key, value) {
                if (key === 'type') {
                    if (value.match(/Declaration$/) ||
                        value.match(/Statement$/)) {
                        console.log(value);
                    }
                }
                return value;
            });
        });
  39. Identifier Length Distribution

    http://ariya.ofilabs.com/2012/05/javascript-identifier-length-distribution.html

    [Histogram: identifier length (0 to 45 characters) against count (0 to 750).]

    Mean of the identifier length is 8.27 characters.

        prototype-1.7.0.0.js    SCRIPT_ELEMENT_REJECTS_TEXTNODE_APPENDING
        prototype-1.7.0.0.js    MOUSEENTER_MOUSELEAVE_EVENTS_SUPPORTED
        jquery-1.7.1.js         subtractsBorderForOverflowNotVisible
        jquery.mobile-1.0.js    getClosestElementWithVirtualBinding
        prototype-1.7.0.0.js    HAS_EXTENDED_CREATE_ELEMENT_SYNTAX
  40. More Code Metrics

    Cyclomatic complexity. Comment density. Expression depth. Duplicated/similar fragment. Native objects/functions usage.

  41. Static Analysis

  42. “Code Linting”

        if (x == 9) {
            // do Something
        }

    Not a strict equal

        var fs = require('fs'),
            esprima = require('./esprima'),
            files = process.argv.splice(2);

        files.forEach(function (filename) {
            var content = fs.readFileSync(filename, 'utf-8'),
                syntax = esprima.parse(content, { loc: true });

            JSON.stringify(syntax, function (key, value) {
                if (key === 'test' && value.operator === '==')
                    console.log('Line', value.loc.start.line);
                return value;
            });
        });
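
The slide uses the `JSON.stringify` replacer callback as a cheap tree walker: the callback sees every key/value pair, so any node stored under a `test` key can be inspected. The sketch below runs the same idea on a hand-written tree for `if (x == 9) { }` so that it works without a parser; the tree shape follows the format shown earlier in the deck, `findLooseEquality` is a hypothetical name, and the extra null check is an assumption added here because a `test` property can be empty (for example in `for (;;)`).

```javascript
// Hand-written tree for: if (x == 9) { }
const ifTree = {
  type: 'Program',
  body: [{
    type: 'IfStatement',
    test: {
      type: 'BinaryExpression',
      operator: '==',
      left: { type: 'Identifier', name: 'x' },
      right: { type: 'Literal', value: 9 },
      loc: { start: { line: 1, column: 4 }, end: { line: 1, column: 10 } }
    },
    consequent: { type: 'BlockStatement', body: [] },
    alternate: null
  }]
};

// findLooseEquality() is a hypothetical helper. The replacer is called for
// every property, which makes JSON.stringify double as a traversal.
function findLooseEquality(tree) {
  const lines = [];
  JSON.stringify(tree, function (key, value) {
    if (key === 'test' && value && value.operator === '==') {
      lines.push(value.loc.start.line);
    }
    return value;
  });
  return lines;
}

console.log(findLooseEquality(ifTree));
```
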
  43. “Boolean Trap” Finder

    http://ariya.ofilabs.com/2012/06/detecting-boolean-traps-with-esprima.html

    Obfuscated choice:

        var volumeSlider = new Slider(false);

    Double-negative:

        component.setHidden(false);
        filter.setCaseInsensitive(false);

    Can you make up your mind?

        treeItem.setState(true, false);

    The more the merrier?

        event.initKeyEvent("keypress", true, true, null, null,
            false, false, false, false, 9, 0);
  44. Nested Ternary Conditionals

    http://ariya.ofilabs.com/2012/10/detecting-nested-ternary-conditionals.html

        var str = (age < 1) ? "baby" :
            (age < 5) ? "toddler" :
            (age < 18) ? "child": "adult";
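
In the syntax tree a ternary is a `ConditionalExpression` node, so a nested ternary is simply such a node that has another one among its ancestors. The sketch below builds the tree for the slide's expression by hand (the builder functions and `countNestedTernaries` are hypothetical helpers for illustration) and counts the nested occurrences.

```javascript
// Small builders for the node shapes needed here (hypothetical helpers).
function lessThan(limit) {
  return {
    type: 'BinaryExpression',
    operator: '<',
    left: { type: 'Identifier', name: 'age' },
    right: { type: 'Literal', value: limit }
  };
}
function text(value) {
  return { type: 'Literal', value: value };
}
function ternary(test, consequent, alternate) {
  return { type: 'ConditionalExpression', test: test, consequent: consequent, alternate: alternate };
}

// (age < 1) ? "baby" : (age < 5) ? "toddler" : (age < 18) ? "child" : "adult"
const ageTree = ternary(lessThan(1), text('baby'),
  ternary(lessThan(5), text('toddler'),
    ternary(lessThan(18), text('child'), text('adult'))));

// Counts ConditionalExpression nodes that sit inside another one.
function countNestedTernaries(node, insideTernary) {
  if (node === null || typeof node !== 'object') {
    return 0;
  }
  const isTernary = node.type === 'ConditionalExpression';
  let count = (isTernary && insideTernary) ? 1 : 0;
  Object.keys(node).forEach(function (key) {
    count += countNestedTernaries(node[key], insideTernary || isTernary);
  });
  return count;
}

console.log(countNestedTernaries(ageTree, false));
```

A lint rule built this way could report any count above zero, or above a configurable depth.
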
  45. Strict Mode Check

    http://ariya.ofilabs.com/2012/10/validating-strict-mode.html

        'use strict';
        block = {
            color: 'blue',
            height: 20,
            width: 10,
            color: 'red'
        };

    Duplicate data property in object literal not allowed in strict mode
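
A check like this can be expressed as a scan over the `properties` of an object-literal node. The sketch below assumes the `ObjectExpression` / `Property` node shape of the Parser API format referenced earlier in the deck and hard-codes the tree for the slide's literal; `duplicateKeys` is a hypothetical helper. Engines that follow ECMAScript 2015 or later no longer reject this literal at parse time, which is one more reason to detect it statically.

```javascript
// Helper to keep the hand-written tree short (hypothetical).
function property(name, value) {
  return {
    type: 'Property',
    key: { type: 'Identifier', name: name },
    value: { type: 'Literal', value: value },
    kind: 'init'
  };
}

// Tree for: { color: 'blue', height: 20, width: 10, color: 'red' }
const blockLiteral = {
  type: 'ObjectExpression',
  properties: [
    property('color', 'blue'),
    property('height', 20),
    property('width', 10),
    property('color', 'red')
  ]
};

// Returns the names of data properties that appear more than once.
function duplicateKeys(node) {
  const seen = new Set();
  const duplicates = [];
  node.properties.forEach(function (prop) {
    const name = prop.key.type === 'Identifier' ? prop.key.name : String(prop.key.value);
    if (seen.has(name)) {
      duplicates.push(name);
    } else {
      seen.add(name);
    }
  });
  return duplicates;
}

console.log(duplicateKeys(blockLiteral));
```
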
  46. Dynamic Analysis

  47. Statement Coverage

    http://ariya.ofilabs.com/2012/03/javascript-code-coverage-and-esprima.html

        x = 42;
        if (false)
            x = -1;

    https://github.com/itay/node-cover
    https://github.com/coveraje/coveraje
    https://github.com/pmlopes/coberturajs
  48. Instrumentation for Coverage

    http://itay.github.com/snug_codecoverage_slides/

    Statement:

        var a = 5;
        { __statement_ZyyqFc(1); var a = 5; }

    Expression:

        foo();
        { __statement_ZyyqFc(2); __expression_kC$jur(3), foo(); }

    Block:

        function foo() { ... };
        function foo() { __block_n53cJc(1); ... }
  49. Unit Test + Statement Coverage = Latent Trap

    http://ariya.ofilabs.com/2012/09/the-hidden-trap-of-code-coverage.html

        function inc(p, q) {
            if (q == undefined) q = 1;
            return p + q/q;
        }
        assert("inc(4) must give 5", inc(4) == 5);

        function inc(p, q) {
            if (q == undefined) return p + 1;
            return p + q/q;
        }
        assert("inc(4) must give 5", inc(4) == 5);

    Does not catch the missing code sequence
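
The trap can be reproduced with hand-inserted markers standing in for what a coverage tool would inject. This is a sketch of the first `inc` variant only; `mark`, `executed`, and the statement numbering are assumptions made for illustration. A single test executes every statement, yet the path where the caller supplies `q` is never exercised.

```javascript
// Hypothetical instrumentation: mark() records which statements ran.
const executed = new Set();
const TOTAL_STATEMENTS = 3;
function mark(id) {
  executed.add(id);
}

function inc(p, q) {
  mark(1);                 // the `if` statement
  if (q == undefined) {
    mark(2);               // the assignment `q = 1`
    q = 1;
  }
  mark(3);                 // the `return` statement
  return p + q / q;
}

// The only unit test from the slide.
const testPassed = inc(4) === 5;
const coverage = executed.size / TOTAL_STATEMENTS;
console.log('test passed:', testPassed, 'statement coverage:', coverage);

// The path with an explicit q was never tested; q = 0 divides zero by zero.
const untested = inc(4, 0);
console.log('inc(4, 0) =', untested);
```

The test passes and statement coverage is complete, but `inc(4, 0)` evaluates `0/0` and the result is not a number.
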
  50. Branch Coverage

    https://github.com/yahoo/istanbul

        function inc(p, q) {
            if (q == undefined) q = 1;
            return p + q/q;
        }
        assert("inc(4) must give 5", inc(4) == 5);

    E = Else is not taken
  51. Execution Tracing

    http://ariya.ofilabs.com/2012/02/tracking-javascript-execution-during-startup.html
    https://gist.github.com/1823129

    jQuery Mobile startup log: 4640 function calls

        jquery.js 26 jQuery
        jquery.js 103 init undefined, undefined, [object Object]
        jquery.js 274 each (Function)
        jquery.js 631 each [object Object], (Function), undefined
        jquery.js 495 isFunction [object Object]
        jquery.js 512 type [object Object]
        jquery.mobile.js 1857 [Anonymous]
        jquery.mobile.js 642 [Anonymous]
        jquery.mobile.js 624 enableMouseBindings
        jquery.mobile.js 620 disableTouchBindings
  52. Tracking the Scalability

    http://ariya.ofilabs.com/2012/01/scalable-web-apps-the-complexity-issue.html

        Array.prototype.swap = function (i, j) {
            var k = this[i]; this[i] = this[j]; this[j] = k;
        }

        Array.prototype.swap = function (i, j) {
            Log({ name: 'Array.prototype.swap', lineNumber: 1, range: [23, 94] });
            var k = this[i]; this[i] = this[j]; this[j] = k;
        }
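
Once every call reports to a logger, counting calls for growing input sizes shows how an algorithm scales. The sketch below is an illustration under stated assumptions: `Log` here is a simple counter (the slide does not show its implementation), `swap` is a plain function rather than an `Array.prototype` extension, and bubble sort on reversed input is chosen as the workload because its swap count is easy to predict.

```javascript
// Hypothetical logger: counts calls per function name.
const callCounts = {};
function Log(info) {
  callCounts[info.name] = (callCounts[info.name] || 0) + 1;
}

// Instrumented swap, following the slide's pattern of logging on entry.
function swap(array, i, j) {
  Log({ name: 'swap' });
  const k = array[i];
  array[i] = array[j];
  array[j] = k;
}

function bubbleSort(array) {
  for (let end = array.length - 1; end > 0; end -= 1) {
    for (let i = 0; i < end; i += 1) {
      if (array[i] > array[i + 1]) {
        swap(array, i, i + 1);
      }
    }
  }
  return array;
}

// Sorts a reversed array of the given size and reports how many swaps ran.
function swapsForReversedInput(size) {
  callCounts.swap = 0;
  const input = Array.from({ length: size }, function (_, index) { return size - index; });
  bubbleSort(input);
  return callCounts.swap;
}

[10, 20, 40].forEach(function (size) {
  console.log(size, swapsForReversedInput(size));
});
```

Doubling the input roughly quadruples the number of swaps (45, 190, 780), which is the quadratic growth the logging is meant to expose.
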
  53. Transformation

  54. Non-Destructive Partial Source Modification

    [Diagram: a modified fragment, with the rest of the source intact.]

    Do not remove comments. Preserve indentation & other formatting. Add “contextual” information. Inject or change function invocation.
  55. String Literal Quotes

    http://ariya.ofilabs.com/2012/02/from-double-quotes-to-single-quotes.html

        console.log("Hello")

    List of tokens:

        [
            { type: "Identifier", value: "console", range: [0, 7] },
            { type: "Punctuator", value: ".", range: [7, 8] },
            { type: "Identifier", value: "log", range: [8, 11] },
            { type: "Punctuator", value: "(", range: [11, 12] },
            { type: "String", value: "\"Hello\"", range: [12, 19] },
            { type: "Punctuator", value: ")", range: [19, 20] }
        ]

        console.log('Hello')

    May need proper escaping
  56. Style Formatter

    CodePainter: https://github.com/fawek/codepainter

    Source + sample code -> CodePainter -> formatted source

    Infer coding styles: indentation, quote for string literal, whitespace.
  57. Rewrite and Regenerate

    https://github.com/Constellation/escodegen

        answer = 42;

        var syntax = esprima.parse('answer = 42;');
        syntax.body[0].expression.right.value = 1337;
        escodegen.generate(syntax)

        answer = 1337;
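
To show what regeneration involves without depending on either library, the sketch below hard-codes the tree for `answer = 42;` and uses a toy generator that handles only the five node types in that tree. `generate` here is a stand-in written for illustration; it is not Escodegen and covers none of its formatting options.

```javascript
// Tree for `answer = 42;`, in the format shown earlier in the deck.
const answerTree = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: { type: 'Identifier', name: 'answer' },
      right: { type: 'Literal', value: 42 }
    }
  }]
};

// Toy code generator: one case per supported node type.
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'ExpressionStatement':
      return generate(node.expression) + ';';
    case 'AssignmentExpression':
      return generate(node.left) + ' ' + node.operator + ' ' + generate(node.right);
    case 'Identifier':
      return node.name;
    case 'Literal':
      return JSON.stringify(node.value);
    default:
      throw new Error('Unsupported node type: ' + node.type);
  }
}

// Same rewrite as on the slide: change the literal, then regenerate.
answerTree.body[0].expression.right.value = 1337;
console.log(generate(answerTree));
```
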
  58. Minification & Obfuscation

    https://github.com/Constellation/esmangle

        Array.prototype.swap = function (first, second) {
            var temp = this[first];
            this[first] = this[second];
            this[second] = temp;
        }

        Array.prototype.swap=function(a,b){var c=this[a];this[a]=this[b],this[b]=c}

  59. Syntax Augmentation

    ES.Future. Exoskeleton.

  60. LLJS (Low-Level JavaScript)

    http://mbebenita.github.com/LLJS/

    Block scope:

        let x = 0;

    Data types:

        let u8 flag;
        let i32 position;
        struct Point {
            int x, y;
        };

    Pointers:

        let u16 *p = q;
  61. Sweet.js for Macro

    http://sweetjs.org

    Define def..

        macro def {
            case $name:ident $params $body => {
                function $name $params $body
            }
        }

    ..so that you can write

        def sweet(a) {
            console.log("Hello World");
        }
  62. Transpilation: Class

    http://ariya.ofilabs.com/2012/09/javascripts-future-class-syntax.html

    Harmony:

        // Vector in 3-D Cartesian coordinate
        class Vector3 {
            constructor (x, y, z) {
                this.x = x; this.y = y; this.z = z;
            }
        }

    ES 5.1, with comments and formatting left intact:

        // Vector in 3-D Cartesian coordinate
        var Vector3 = (function () {
            function Vector3 (x, y, z) {
                this.x = x; this.y = y; this.z = z;
            }
        ; return Vector3;})();
  63. Transpilation: Module

    https://github.com/jdiamond/harmonizr
    http://ariya.ofilabs.com/2012/06/esprima-and-harmony-module.html

    Harmony:

        module LinearAlgebra {
            // Create 2-D point.
            export function Point(x, y) {
                return { x, y };
            }
        }

    ES 5.1, with comments and formatting left intact:

        var LinearAlgebra = function() {
            // Create 2-D point.
            function Point(x, y) {
                return { x: x, y: y };
            }
            return { Point: Point };
        }();

  64. Future

  65. Assisted Code Review

    Should be automatic, based on predefined, historical, or heuristic patterns
  66. Syntax Query

        if (x = 0) { /* do Something */ }

    IfStatement.test AssignmentExpression[operator='=']

    Which syntax family should be the model? CSS selector? XPath? SQL?
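
Whatever surface syntax wins, a query like the one on this slide boils down to a structured description matched against the tree. The sketch below is one possible reading, with everything in it assumed for illustration: the selector is a plain object with hypothetical `parent`, `property`, `type`, and `attributes` fields, and the tree for `if (x = 0) { }` is written by hand.

```javascript
// Hand-written tree for: if (x = 0) { }
const queryTree = {
  type: 'Program',
  body: [{
    type: 'IfStatement',
    test: {
      type: 'AssignmentExpression',
      operator: '=',
      left: { type: 'Identifier', name: 'x' },
      right: { type: 'Literal', value: 0 }
    },
    consequent: { type: 'BlockStatement', body: [] },
    alternate: null
  }]
};

// Hypothetical selector: IfStatement.test AssignmentExpression[operator='=']
const assignmentInCondition = {
  parent: 'IfStatement',
  property: 'test',
  type: 'AssignmentExpression',
  attributes: { operator: '=' }
};

// Collects every node that matches the selector, searching the whole tree.
function query(node, selector, matches) {
  if (node === null || typeof node !== 'object') {
    return matches;
  }
  if (node.type === selector.parent) {
    const candidate = node[selector.property];
    const hit = Boolean(candidate) &&
      candidate.type === selector.type &&
      Object.keys(selector.attributes).every(function (name) {
        return candidate[name] === selector.attributes[name];
      });
    if (hit) {
      matches.push(candidate);
    }
  }
  Object.keys(node).forEach(function (key) {
    query(node[key], selector, matches);
  });
  return matches;
}

const found = query(queryTree, assignmentInCondition, []);
console.log(found.length);
```
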
  67. Copy Paste (Mistake) Detection

        function inside(point, rect) {
            return (point.x >= rect.x1) && (point.y >= rect.y1) &&
                   (point.x <= rect.x2) && (point.y <= rect.y1);
        }

    Forgotten change!
  68. Refactoring Helper

        // Add two numbers
        function add(firt, two) {
            return firt + two;
        }

        // Add two numbers
        function add(first, two) {
            return first + two;
        }
  69. And Many More...

    Semantic diff. Symbolic execution. Informative syntax error. Declarative transformation. Pattern matching.
  70. Parsing Infrastructure

    Smart editing. Source transformation. Minification & obfuscation. Instrumentation. Code coverage. Dependency analysis. Documentation generator. Conditional contracts.
  71. Next-Generation Code Quality Tools

    To boldly analyze what no man has analyzed before...

  72. Thank You

    ariya.hidayat@gmail.com, @AriyaHidayat, ariya.ofilabs.com