Slide 1

Slide 1 text

Bringing JavaScript Code Analysis to The Next Level
New York, October 22, 2012
Ariya Hidayat

Slide 2

Slide 2 text

whoami

Slide 3

Slide 3 text

From Spelling Checker to Grammar Enforcement
"Your so wrong, therefore you loose!"
No misspelled word. Wrong choice of words!

Slide 4

Slide 4 text

Quality: Practical Aspects
• Avoid silly mistakes
• Write readable code
• Do not provoke ambiguities
• Improve future maintenance
• Learn better code patterns

Slide 5

Slide 5 text

Multiple Layers of Defense

Slide 6

Slide 6 text

Tools and Mistakes Likelihood
S = Skill, C = Complexity
[Diagram: S and C; label: Average Engineer]

Slide 7

Slide 7 text

Better Use of the Tools
[Diagram: Engineer, Tools, Feedback Cycle]
Not everyone is a JavaScript ninja
• Boring
• Repetitive
• Time-consuming

Slide 8

Slide 8 text

Adaptive Quality Criteria
Explicit
• Customize analysis options
• Define new sets of rules
Implicit
• Infer from high-quality sample
• Observe the engineer’s behavior

Slide 9

Slide 9 text

Foundation

Slide 10

Slide 10 text

JavaScript in the Browser
[Diagram: User Interface, Browser Engine, Graphics Stack, Data Persistence, Render Engine, JavaScript Engine, Networking I/O]

Slide 11

Slide 11 text

JavaScript Engine Building Blocks
[Diagram: Source, Parser, Syntax Tree, Virtual Machine/Interpreter, Runtime]
Built-in objects, host objects, ...
Fast and conservative

Slide 12

Slide 12 text

Tokenization

    var answer = 42;

keyword: var
identifier: answer
equal sign: =
number: 42
semicolon: ;
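
The tokenization step on this slide can be sketched with a toy scanner. This is an illustration only, not how Esprima is implemented: the `tokenize` function and its short keyword list are hypothetical, and the scanner understands only identifiers, integers, and a handful of punctuators.

```javascript
// Toy tokenizer (illustrative assumption, not Esprima's scanner).
var KEYWORDS = ['var', 'function', 'if', 'else', 'return', 'while', 'for'];

function tokenize(source) {
    var tokens = [],
        rest = source.replace(/^\s+/, ''),
        match;

    while (rest.length > 0) {
        if ((match = /^[A-Za-z_$][A-Za-z0-9_$]*/.exec(rest))) {
            // A word is either a reserved keyword or an identifier.
            tokens.push({
                type: KEYWORDS.indexOf(match[0]) >= 0 ? 'Keyword' : 'Identifier',
                value: match[0]
            });
        } else if ((match = /^[0-9]+/.exec(rest))) {
            tokens.push({ type: 'Numeric', value: match[0] });
        } else if ((match = /^[=;(){}.,+\-*\/]/.exec(rest))) {
            tokens.push({ type: 'Punctuator', value: match[0] });
        } else {
            throw new Error('Unexpected character: ' + rest.charAt(0));
        }
        // Drop the consumed text and any whitespace that follows it.
        rest = rest.slice(match[0].length).replace(/^\s+/, '');
    }
    return tokens;
}

tokenize('var answer = 42;').forEach(function (token) {
    console.log(token.type, token.value);
});
```

The output lists one keyword, one identifier, the equal sign, the number, and the semicolon, matching the labels on the slide.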

Slide 13

Slide 13 text

Syntax Tree
Variable Declaration
    Identifier: answer
    Literal Constant: 42

Slide 14

Slide 14 text

JavaScript Parser Written in JavaScript
• UglifyJS
• ZeParser
• Esprima
• Traceur
• Narcissus
• Acorn

Slide 15

Slide 15 text

Specification Conformance
• ECMA-262 compliant
• Automatic semicolon insertion
• Strict Mode, e.g. “use strict”
• Unicode for identifiers

    'use strict';
    var 日本語 = 1
    return 日本語

Slide 16

Slide 16 text

Heavily Tested
• > 550 unit tests
• Compatibility tests
• 100% code coverage
• Performance tests
Enforced during development

Slide 17

Slide 17 text

Sensible Syntax Tree
https://developer.mozilla.org/en/SpiderMonkey/Parser_API
Try online! http://esprima.org/demo/parse.html

    answer = 42

    {
        type: "Program",
        body: [
            {
                type: "ExpressionStatement",
                expression: {
                    type: "AssignmentExpression",
                    operator: "=",
                    left: {
                        type: "Identifier",
                        name: "answer"
                    },
                    right: {
                        type: "Literal",
                        value: 42
                    }
                }
            }
        ]
    }

Slide 18

Slide 18 text

Specification, Parser Code, Syntax Tree
ECMA-262 Annex A.4:

    while ( Expression ) Statement

    function parseWhileStatement() {
        var test, body;

        expectKeyword('while');
        expect('(');
        test = parseExpression();
        expect(')');
        body = parseStatement();

        return {
            type: 'WhileStatement',
            test: test,
            body: body
        };
    }

Slide 19

Slide 19 text

High Performance
Speed Comparison: Esprima and UglifyJS V1, on Chrome 18 and Internet Explorer 9
Chart values: 922 ms, 567 ms, 620 ms, 233 ms
Benchmark corpus: jQuery, Prototype, MooTools, ExtCore, ...

Slide 20

Slide 20 text

Syntax Node Location

    answer = 42

    {
        type: "ExpressionStatement",
        expression: {
            type: "AssignmentExpression",
            operator: "=",
            left: {
                type: "Identifier",
                name: "answer",
                range: [0, 6]
            },
            right: {
                type: "Literal",
                value: 42,
                range: [9, 11]
            },
            range: [0, 11]
        },
        range: [0, 11]
    }
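
Each `range` in the tree on this slide is a pair of zero-based character offsets, with the end offset exclusive, so the original text of a node can be recovered with `String.prototype.slice`. A minimal sketch, using a hand-written copy of the slide's tree rather than real parser output; `textOf` is a hypothetical helper name:

```javascript
var source = 'answer = 42';

// Hand-copied from the slide; a real tree would come from the parser.
var assignment = {
    type: 'AssignmentExpression',
    operator: '=',
    left: { type: 'Identifier', name: 'answer', range: [0, 6] },
    right: { type: 'Literal', value: 42, range: [9, 11] },
    range: [0, 11]
};

// Return the exact source text covered by a node.
function textOf(source, node) {
    return source.slice(node.range[0], node.range[1]);
}

console.log(textOf(source, assignment.left));   // prints: answer
console.log(textOf(source, assignment.right));  // prints: 42
console.log(textOf(source, assignment));        // prints: answer = 42
```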

Slide 21

Slide 21 text

Error Tolerant
Useful for IDE, editors, ...

Mismatched quote:
    var msg = "Hello';
Too many dots:
    person..age = 18;
Incomplete, still typing:
    if (person.
Strict mode violation:
    'use strict';
    with (person) {
    }

Slide 22

Slide 22 text

Handle the Comments
Documentation tool: https://github.com/thejohnfreeman/jfdoc
Code annotation: https://github.com/goatslacker/node-typedjs

    // Life, Universe, and Everything
    answer = 42

    comments: [
        {
            range: [0, 34],
            type: "Line",
            value: " Life, Universe, and Everything"
        }
    ]

Slide 23

Slide 23 text

Forward Looking
Experimental ‘harmony’ branch

Block scope:
    {
        let x;
        const y = 0;
    }
Object initializer shorthand:
    var point = {x, y};
Module & class:
    module LinearAlgebra {
        export const System = 'Cartesian';
    }
    class Vector3 {
        constructor (x, y, z) {
            this.x = x; this.y = y; this.z = z;
        }
    }
Destructuring assignment:
    point = {14, 3, 77};
    {x, y, z} = point;

Slide 24

Slide 24 text

Code Regeneration
[Diagram: Source, Esprima, Syntax Tree, Syntax Transformation, Escodegen, Source]
Syntax Transformation
• Shorten variable name
• Inline short function
• Remove dead code
• Obfuscate

Slide 25

Slide 25 text

Tools

Slide 26

Slide 26 text

Brackets of Tools
• Static analysis
• Inspection
• Dynamic analysis
• Transformation

Slide 27

Slide 27 text

Inspection

Slide 28

Slide 28 text

Syntax Tree Visualization
http://esprima.org/demo/parse.html

    answer = 42

Slide 29

Slide 29 text

Syntax Demystifying Block statement

Slide 30

Slide 30 text

Syntax Validation
Try online! http://esprima.org/demo/validate.html

    esvalidate test.js

Slide 31

Slide 31 text

Part of Continuous Integration
JUnit XML + Jenkins

Slide 32

Slide 32 text

Linter vs Validator
Validator
• Looks for specification violations
• Does not care about coding style
• Works well on generated/minified code
Linter
• Searches for suspicious patterns
• Warns on style inconsistencies
• Works well on hand-written code

Slide 33

Slide 33 text

(Git) Precommit Hook

    files=$(git diff-index --name-only HEAD | grep -P '\.js$')
    for file in $files; do
        esvalidate $file
        if [ $? -eq 1 ]; then
            echo "Syntax error: $file"
            exit 1
        fi
    done

Slide 34

Slide 34 text

Code Outline Eclipse Functions Variables

Slide 35

Slide 35 text

Content Assist/Autocomplete/IntelliSense Scripted

Slide 36

Slide 36 text

Fragment Highlighting
http://esprima.org/demo/highlight.html

Slide 37

Slide 37 text

Most Popular Keywords
http://ariya.ofilabs.com/2012/03/most-popular-javascript-keywords.html

    Keyword       Count
    this          3229
    function      3108
    if            3063
    return        2878
    var           2116
    else           562
    for            436
    new            232
    in             225
    typeof         188
    while          143
    case           122
    break          115
    try             84
    catch           84
    delete          72
    throw           38
    switch          35
    continue        25
    default         15
    instanceof      14
    do              12
    void            10
    finally          4

    var fs = require('fs'),
        esprima = require('esprima'),
        files = process.argv.splice(2);

    files.forEach(function (filename) {
        var content = fs.readFileSync(filename, 'utf-8'),
            tokens = esprima.parse(content, { tokens: true }).tokens;

        tokens.forEach(function (token) {
            if (token.type === 'Keyword') {
                console.log(token.value);
            }
        });
    });

Slide 38

Slide 38 text

Most Popular Statements
http://ariya.ofilabs.com/2012/04/most-popular-javascript-statements.html

    Statement              Count
    ExpressionStatement    6728
    BlockStatement         6353
    IfStatement            3063
    ReturnStatement        2878
    VariableDeclaration    2116
    FunctionDeclaration     371
    ForStatement            293
    ForInStatement          143
    WhileStatement          131
    BreakStatement          115
    TryStatement             84
    EmptyStatement           66
    ThrowStatement           38
    SwitchStatement          35
    ContinueStatement        25
    DoWhileStatement         12
    LabeledStatement          6

    var fs = require('fs'),
        esprima = require('esprima'),
        files = process.argv.splice(2);

    files.forEach(function (filename) {
        var content = fs.readFileSync(filename, 'utf-8'),
            syntax = esprima.parse(content);

        JSON.stringify(syntax, function (key, value) {
            if (key === 'type') {
                if (value.match(/Declaration$/) ||
                    value.match(/Statement$/)) {
                    console.log(value);
                }
            }
            return value;
        });
    });

Slide 39

Slide 39 text

Identifier Length Distribution
http://ariya.ofilabs.com/2012/05/javascript-identifier-length-distribution.html
[Histogram: identifier length (0 to 45 characters) against number of occurrences (0 to 750)]
Mean of the identifier length is 8.27 characters
• prototype-1.7.0.0.js: SCRIPT_ELEMENT_REJECTS_TEXTNODE_APPENDING
• prototype-1.7.0.0.js: MOUSEENTER_MOUSELEAVE_EVENTS_SUPPORTED
• jquery-1.7.1.js: subtractsBorderForOverflowNotVisible
• jquery.mobile-1.0.js: getClosestElementWithVirtualBinding
• prototype-1.7.0.0.js: HAS_EXTENDED_CREATE_ELEMENT_SYNTAX

Slide 40

Slide 40 text

More Code Metrics
• Cyclomatic complexity
• Comment density
• Expression depth
• Duplicated/similar fragment
• Native objects/functions usage

Slide 41

Slide 41 text

Static Analysis

Slide 42

Slide 42 text

“Code Linting”
Not a strict equal:

    if (x == 9) {
        // do Something
    }

    var fs = require('fs'),
        esprima = require('./esprima'),
        files = process.argv.splice(2);

    files.forEach(function (filename) {
        var content = fs.readFileSync(filename, 'utf-8'),
            syntax = esprima.parse(content, { loc: true });

        JSON.stringify(syntax, function (key, value) {
            // value can be null, e.g. the missing test in for (;;)
            if (key === 'test' && value && value.operator === '==')
                console.log('Line', value.loc.start.line);
            return value;
        });
    });

Slide 43

Slide 43 text

“Boolean Trap” Finder
http://ariya.ofilabs.com/2012/06/detecting-boolean-traps-with-esprima.html

Obfuscated choice:
    var volumeSlider = new Slider(false);
Double-negative:
    component.setHidden(false);
    filter.setCaseInsensitive(false);
Can you make up your mind?
    treeItem.setState(true, false);
The more the merrier?
    event.initKeyEvent("keypress", true, true, null, null,
        false, false, false, false, 9, 0);
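
A boolean trap finder of this kind can be sketched as a tree walk that reports calls carrying literal `true` or `false` arguments. The sketch below is an assumption about one possible approach, not the code from the linked article: `traverse` and `findBooleanTraps` are hypothetical helpers, the tree for `treeItem.setState(true, false)` is written by hand in the parser's node format, and only `CallExpression` nodes are checked (so `new Slider(false)` would need an extra `NewExpression` case).

```javascript
// Call visitor on every object reachable from node.
function traverse(node, visitor) {
    var key, child;
    visitor(node);
    for (key in node) {
        if (Object.prototype.hasOwnProperty.call(node, key)) {
            child = node[key];
            if (child && typeof child === 'object') {
                traverse(child, visitor);
            }
        }
    }
}

// Report each call that passes at least one boolean literal.
function findBooleanTraps(tree) {
    var traps = [];
    traverse(tree, function (node) {
        var count;
        if (node.type === 'CallExpression') {
            count = node.arguments.filter(function (arg) {
                return arg.type === 'Literal' && typeof arg.value === 'boolean';
            }).length;
            if (count > 0) {
                traps.push({
                    callee: node.callee.property ?
                            node.callee.property.name : node.callee.name,
                    booleans: count
                });
            }
        }
    });
    return traps;
}

// Hand-written tree for: treeItem.setState(true, false);
var tree = {
    type: 'Program',
    body: [{
        type: 'ExpressionStatement',
        expression: {
            type: 'CallExpression',
            callee: {
                type: 'MemberExpression',
                computed: false,
                object: { type: 'Identifier', name: 'treeItem' },
                property: { type: 'Identifier', name: 'setState' }
            },
            arguments: [
                { type: 'Literal', value: true },
                { type: 'Literal', value: false }
            ]
        }
    }]
};

console.log(findBooleanTraps(tree));
```

For this tree the result contains a single entry naming `setState` with two boolean arguments.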

Slide 44

Slide 44 text

Nested Ternary Conditionals
http://ariya.ofilabs.com/2012/10/detecting-nested-ternary-conditionals.html

    var str = (age < 1) ? "baby" :
        (age < 5) ? "toddler" :
        (age < 18) ? "child": "adult";
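
Detecting this pattern amounts to finding a `ConditionalExpression` node that has another `ConditionalExpression` as its test, consequent, or alternate. A minimal sketch under that assumption; the helper names are hypothetical and the tree for the slide's expression is built by hand instead of being parsed:

```javascript
// Count conditional expressions that sit directly inside another one.
function countNestedTernaries(node) {
    var count = 0, key;
    if (!node || typeof node !== 'object') {
        return 0;
    }
    if (node.type === 'ConditionalExpression') {
        ['test', 'consequent', 'alternate'].forEach(function (part) {
            if (node[part] && node[part].type === 'ConditionalExpression') {
                count += 1;
            }
        });
    }
    for (key in node) {
        if (Object.prototype.hasOwnProperty.call(node, key)) {
            count += countNestedTernaries(node[key]);
        }
    }
    return count;
}

// Small builders for the hand-written tree (illustration only).
function ageBelow(limit) {
    return {
        type: 'BinaryExpression',
        operator: '<',
        left: { type: 'Identifier', name: 'age' },
        right: { type: 'Literal', value: limit }
    };
}
function text(value) {
    return { type: 'Literal', value: value };
}
function ternary(test, consequent, alternate) {
    return {
        type: 'ConditionalExpression',
        test: test,
        consequent: consequent,
        alternate: alternate
    };
}

// (age < 1) ? "baby" : (age < 5) ? "toddler" : (age < 18) ? "child" : "adult"
var tree = ternary(ageBelow(1), text('baby'),
    ternary(ageBelow(5), text('toddler'),
        ternary(ageBelow(18), text('child'), text('adult'))));

console.log(countNestedTernaries(tree)); // prints: 2
```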

Slide 45

Slide 45 text

Strict Mode Check
http://ariya.ofilabs.com/2012/10/validating-strict-mode.html

    'use strict';
    block = {
        color: 'blue',
        height: 20,
        width: 10,
        color: 'red'
    };

Duplicate data property in object literal not allowed in strict mode

Slide 46

Slide 46 text

Dynamic Analysis

Slide 47

Slide 47 text

Statement Coverage
http://ariya.ofilabs.com/2012/03/javascript-code-coverage-and-esprima.html

    x = 42;
    if (false)
        x = -1;

https://github.com/itay/node-cover
https://github.com/coveraje/coveraje
https://github.com/pmlopes/coberturajs

Slide 48

Slide 48 text

Instrumentation for Coverage
http://itay.github.com/snug_codecoverage_slides/

Statement:
    var a = 5;
becomes
    { __statement_ZyyqFc(1); var a = 5; }

Expression:
    foo();
becomes
    { __statement_ZyyqFc(2); __expression_kC$jur(3), foo(); }

Block:
    function foo() { ... };
becomes
    function foo() { __block_n53cJc(1); ... }

Slide 49

Slide 49 text

Unit Test + Statement Coverage = Latent Trap
http://ariya.ofilabs.com/2012/09/the-hidden-trap-of-code-coverage.html

    function inc(p, q) {
        if (q == undefined) q = 1;
        return p + q/q;
    }
    assert("inc(4) must give 5", inc(4) == 5);

    function inc(p, q) {
        if (q == undefined) return p + 1;
        return p + q/q;
    }
    assert("inc(4) must give 5", inc(4) == 5);

Does not catch the missing code sequence

Slide 50

Slide 50 text

Branch Coverage
https://github.com/yahoo/istanbul

    function inc(p, q) {
        if (q == undefined) q = 1;
        return p + q/q;
    }
    assert("inc(4) must give 5", inc(4) == 5);

E = Else is not taken
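
The difference between statement and branch coverage can be made concrete by instrumenting `inc` by hand. This is a sketch of the idea only, not what Istanbul actually generates; the `branches` counter object is a hypothetical name, and the explicit `else` stands in for the implicit path that a coverage tool would track.

```javascript
// Hand-written branch counters (an assumption, not Istanbul's output).
var branches = { ifTaken: 0, elseTaken: 0 };

function inc(p, q) {
    if (q == undefined) {
        branches.ifTaken += 1;
        q = 1;
    } else {
        // The implicit "else" path of the original function.
        branches.elseTaken += 1;
    }
    return p + q / q;
}

// The only test case from the slide.
console.log(inc(4) === 5);        // prints: true
console.log(branches.ifTaken);    // prints: 1
console.log(branches.elseTaken);  // prints: 0
```

Every statement of the original function has run, yet `elseTaken` stays at zero: no test passes a second argument, so the `q/q` expression is never exercised with a caller-supplied `q`.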

Slide 51

Slide 51 text

Execution Tracing
http://ariya.ofilabs.com/2012/02/tracking-javascript-execution-during-startup.html
https://gist.github.com/1823129
jQuery Mobile startup log: 4640 function calls

    jquery.js 26 jQuery
    jquery.js 103 init undefined, undefined, [object Object]
    jquery.js 274 each (Function)
    jquery.js 631 each [object Object], (Function), undefined
    jquery.js 495 isFunction [object Object]
    jquery.js 512 type [object Object]
    jquery.mobile.js 1857 [Anonymous]
    jquery.mobile.js 642 [Anonymous]
    jquery.mobile.js 624 enableMouseBindings
    jquery.mobile.js 620 disableTouchBindings

Slide 52

Slide 52 text

Tracking the Scalability
http://ariya.ofilabs.com/2012/01/scalable-web-apps-the-complexity-issue.html

    Array.prototype.swap = function (i, j) {
        var k = this[i]; this[i] = this[j]; this[j] = k;
    }

    Array.prototype.swap = function (i, j) {
        Log({ name: 'Array.prototype.swap', lineNumber: 1, range: [23, 94] });
        var k = this[i]; this[i] = this[j]; this[j] = k;
    }
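
Once `Log` is injected, counting how often it fires for different input sizes shows how the call count scales. A minimal sketch, assuming a `Log` that simply tallies calls by name; the bubble sort driver and the `countSwaps` helper are hypothetical additions for illustration, with the instrumented `swap` written out by hand.

```javascript
var callCounts = {};

// Assumed logger: tally calls per function name.
function Log(info) {
    callCounts[info.name] = (callCounts[info.name] || 0) + 1;
}

// Instrumented version from the slide.
Array.prototype.swap = function (i, j) {
    Log({ name: 'Array.prototype.swap', lineNumber: 1, range: [23, 94] });
    var k = this[i]; this[i] = this[j]; this[j] = k;
};

// Simple bubble sort that relies on swap (illustration only).
function bubbleSort(list) {
    var i, j;
    for (i = 0; i < list.length; i += 1) {
        for (j = 0; j < list.length - 1 - i; j += 1) {
            if (list[j] > list[j + 1]) {
                list.swap(j, j + 1);
            }
        }
    }
    return list;
}

// Sort a reverse-ordered list (worst case) and report the swap count.
function countSwaps(size) {
    var list = [], n;
    for (n = size; n > 0; n -= 1) {
        list.push(n);
    }
    callCounts = {};
    bubbleSort(list);
    return callCounts['Array.prototype.swap'] || 0;
}

console.log(countSwaps(10), countSwaps(20)); // prints: 45 190
```

Doubling the input size roughly quadruples the number of swaps, which is the kind of growth this tracking is meant to surface.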

Slide 53

Slide 53 text

Transformation

Slide 54

Slide 54 text

Non-Destructive Partial Source Modification
Intact
• Do not remove comments
• Preserve indentation & other formatting
Modified
• Add “contextual” information
• Inject or change function invocation

Slide 55

Slide 55 text

String Literal Quotes
http://ariya.ofilabs.com/2012/02/from-double-quotes-to-single-quotes.html

    console.log("Hello")

List of tokens:

    [
        { type: "Identifier", value: "console", range: [0, 7] },
        { type: "Punctuator", value: ".", range: [7, 8] },
        { type: "Identifier", value: "log", range: [8, 11] },
        { type: "Punctuator", value: "(", range: [11, 12] },
        { type: "String", value: "\"Hello\"", range: [12, 19] },
        { type: "Punctuator", value: ")", range: [19, 20] }
    ]

    console.log('Hello')

May need proper escaping
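
With the token list and its ranges, the quotes can be swapped without touching any other character of the source. The following is a sketch of that idea rather than the code from the linked article; `toSingleQuotes` is a hypothetical helper, and it assumes each `String` token's `value` holds the raw text including its quotes, as on the slide.

```javascript
// Rewrite double-quoted string tokens as single-quoted, leaving
// everything outside those token ranges untouched.
function toSingleQuotes(source, tokens) {
    var result = '', last = 0;

    tokens.forEach(function (token) {
        var body;
        if (token.type === 'String' && token.value.charAt(0) === '"') {
            // Treat each escape sequence as one unit so that an
            // existing backslash is never escaped twice.
            body = token.value.slice(1, -1).replace(/\\[\s\S]|'/g, function (m) {
                if (m === '\\"') { return '"'; }   // \" needs no escape now
                if (m === "'") { return "\\'"; }   // bare ' must be escaped
                return m;
            });
            result += source.slice(last, token.range[0]) + "'" + body + "'";
            last = token.range[1];
        }
    });
    return result + source.slice(last);
}

var source = 'console.log("Hello")';
var tokens = [
    { type: 'Identifier', value: 'console', range: [0, 7] },
    { type: 'Punctuator', value: '.', range: [7, 8] },
    { type: 'Identifier', value: 'log', range: [8, 11] },
    { type: 'Punctuator', value: '(', range: [11, 12] },
    { type: 'String', value: '"Hello"', range: [12, 19] },
    { type: 'Punctuator', value: ')', range: [19, 20] }
];

console.log(toSingleQuotes(source, tokens)); // prints: console.log('Hello')
```

Because only the string token's range is replaced, comments and whitespace elsewhere in the file survive unchanged.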

Slide 56

Slide 56 text

Style Formatter
https://github.com/fawek/codepainter
[Diagram: Source and Sample code go into CodePainter, which produces Formatted source]
Infer coding styles
• Indentation
• Quote for string literal
• Whitespace

Slide 57

Slide 57 text

Rewrite and Regenerate
https://github.com/Constellation/escodegen

    answer = 42;

    var syntax = esprima.parse('answer = 42;');
    syntax.body[0].expression.right.value = 1337;
    escodegen.generate(syntax)

    answer = 1337;

Slide 58

Slide 58 text

Minification & Obfuscation
https://github.com/Constellation/esmangle

    Array.prototype.swap = function (first, second) {
        var temp = this[first];
        this[first] = this[second];
        this[second] = temp;
    }

    Array.prototype.swap=function(a,b){var c=this[a];this[a]=this[b],this[b]=c}

Slide 59

Slide 59 text

Syntax Augmentation ES.Future Exoskeleton

Slide 60

Slide 60 text

LLJS (Low-Level JavaScript)
http://mbebenita.github.com/LLJS/

Block scope:
    let x = 0;
Data types:
    let u8 flag;
    let i32 position;
    struct Point {
        int x, y;
    };
Pointers:
    let u16 *p = q;

Slide 61

Slide 61 text

Sweet.js for Macro
http://sweetjs.org

Define def..

    macro def {
        case $name:ident $params $body => {
            function $name $params $body
        }
    }

..so that you can write

    def sweet(a) {
        console.log("Hello World");
    }

Slide 62

Slide 62 text

Transpilation: Class
http://ariya.ofilabs.com/2012/09/javascripts-future-class-syntax.html

Harmony:

    // Vector in 3-D Cartesian coordinate
    class Vector3 {
        constructor (x, y, z) {
            this.x = x; this.y = y; this.z = z;
        }
    }

ES 5.1 (comment and constructor body left intact):

    // Vector in 3-D Cartesian coordinate
    var Vector3 = (function () {
        function Vector3 (x, y, z) {
            this.x = x; this.y = y; this.z = z;
        }
    ; return Vector3;})();

Slide 63

Slide 63 text

Transpilation: Module
https://github.com/jdiamond/harmonizr
http://ariya.ofilabs.com/2012/06/esprima-and-harmony-module.html

Harmony:

    module LinearAlgebra {
        // Create 2-D point.
        export function Point(x, y) {
            return { x, y };
        }
    }

ES 5.1 (comment left intact):

    var LinearAlgebra = function() {
        // Create 2-D point.
        function Point(x, y) {
            return { x: x, y: y };
        }
        return { Point: Point };
    }();

Slide 64

Slide 64 text

Future

Slide 65

Slide 65 text

Assisted Code Review
Should be automatic, based on predefined, historical, or heuristic patterns

Slide 66

Slide 66 text

Syntax Query

    if (x = 0) { /* do Something */ }

    IfStatement.test AssignmentExpression[operator='=']

Which syntax family should be the model? CSS selector? XPath? SQL?
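
Until such a query language exists, the query on this slide can be approximated with a hand-written tree walk. A minimal sketch, assuming the parser's node format; `findAssignmentInCondition` is a hypothetical helper and the tree for `if (x = 0) { }` is typed in by hand.

```javascript
// Hard-coded equivalent of:
//   IfStatement.test AssignmentExpression[operator='=']
function findAssignmentInCondition(node, found) {
    var key;
    found = found || [];
    if (!node || typeof node !== 'object') {
        return found;
    }
    if (node.type === 'IfStatement' &&
            node.test.type === 'AssignmentExpression' &&
            node.test.operator === '=') {
        found.push(node);
    }
    for (key in node) {
        if (Object.prototype.hasOwnProperty.call(node, key)) {
            findAssignmentInCondition(node[key], found);
        }
    }
    return found;
}

// Hand-written tree for: if (x = 0) { /* do Something */ }
var tree = {
    type: 'Program',
    body: [{
        type: 'IfStatement',
        test: {
            type: 'AssignmentExpression',
            operator: '=',
            left: { type: 'Identifier', name: 'x' },
            right: { type: 'Literal', value: 0 }
        },
        consequent: { type: 'BlockStatement', body: [] },
        alternate: null
    }]
};

var matches = findAssignmentInCondition(tree);
console.log(matches.length, matches[0].test.left.name); // prints: 1 x
```

A declarative query syntax would replace the hard-coded condition inside the walk with a pattern supplied by the user.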

Slide 67

Slide 67 text

Copy Paste (Mistake) Detection

    function inside(point, rect) {
        return (point.x >= rect.x1) && (point.y >= rect.y1) &&
               (point.x <= rect.x2) && (point.y <= rect.y1);
    }

Forgotten change!

Slide 68

Slide 68 text

Refactoring Helper

    // Add two numbers
    function add(firt, two) {
        return firt + two;
    }

    // Add two numbers
    function add(first, two) {
        return first + two;
    }

Slide 69

Slide 69 text

And Many More...
• Semantic Diff
• Symbolic execution
• Informative syntax error
• Declarative transformation
• Pattern Matching

Slide 70

Slide 70 text

Parsing Infrastructure
• Smart editing
• Source transformation
• Minification & obfuscation
• Instrumentation
• Code coverage
• Dependency analysis
• Documentation generator
• Conditional contracts

Slide 71

Slide 71 text

Next-Generation Code Quality Tools
To boldly analyze what no man has analyzed before...

Slide 72

Slide 72 text

Thank You
[email protected]
@AriyaHidayat
ariya.ofilabs.com