Better Living Through Control Flow Graph Generation

Thunderplainsconf 2014

Chris Dickinson

October 09, 2014

Transcript

  1. THUNDER PLAINS CONF 2014 @isntitvacant better living through control flow graph generation
  2. OR: use JS to build a JS runtime to analyze JS
  3. hi, I’m chris.
  4. hi, I’m chris. I work for WM labs.
  5. hi, I’m chris. I work for WM labs. I’d like to talk to you about how we think about our code.
  6. what is a program?
  7. what is a program? is it text?
  8. what is a program? is it text? is it what runs on the CPU?
  9. what is a program? is it text? is it what runs on the CPU? is it an idea?
  10. the treachery of text: function compare(lhs, rhs) { return lhs - rhs > 0 }
  13. [Diagram, built up over slides 13-21: the text of compare is split into its tokens (function, compare, lhs, rhs, >, -, 0, return); the punctuation is then dropped and the remaining pieces are labeled name, parameters, and body.]
  22. Abstract Syntax Trees: one syntax tree can represent multiple different input texts.
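
A full parser is out of scope here, but the first step behind this claim can be sketched with a toy tokenizer. The `tokenize` function and its regular expression are illustrative assumptions, not a real JavaScript lexer and not anything from the talk. The sketch only shows that two differently formatted texts collapse to the same token sequence, which is what lets a parser build one tree for both.

```javascript
// Toy tokenizer (illustrative only, not a full JavaScript lexer):
// identifiers/keywords, integer literals, and single-character punctuation.
function tokenize(source) {
  return source.match(/[A-Za-z_$][\w$]*|\d+|[^\s\w]/g) || [];
}

const spacedOut = "function compare(lhs, rhs) {\n  return lhs - rhs > 0\n}";
const compact = "function compare(lhs,rhs){return lhs-rhs>0}";

const tokensA = tokenize(spacedOut);
const tokensB = tokenize(compact);

// Whitespace and line breaks are gone after tokenizing, so a parser that
// works from tokens sees identical input for both texts.
const sameTokens = JSON.stringify(tokensA) === JSON.stringify(tokensB);

console.log(tokensA.join(" "));
console.log(sameTokens);
```
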
  23. [Diagram: the tree for compare again, labeled name, parameters, and body.]
  24. [Diagram: the same tree with the identifiers renamed to a, b, and c.]
  25. control flow graphs
  26. control flow graphs: CFGs represent the flow between operations in a program.
  27. control flow graphs: CFGs are directed graphs. A directed graph has edges that go from one node to another. In a CFG's case, nodes are operations, and edges are flows between operations.
  28. control flow graphs: [Diagram: the syntax tree for the function, with identifiers shown as <id> and the constant shown as literal, labeled name, parameters, and body.]
  29. control flow graphs: [Diagram, repeated over slides 29-34: the nodes of the function body: <id>, <id>, -, literal, >, return.]
  35. control flow graphs: operation: "load <id>"
  36. control flow graphs: then transfer control to...
  37. control flow graphs: another load,
  38. control flow graphs: then transfer control to...
  39. control flow graphs: a subtraction operation!
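
The walk-through above (a load, a transfer of control, another load, another transfer, then a subtraction) can be sketched as a small directed graph. The node ids (`n1` to `n6`), the operation names, and the `[from, to]` edge representation are assumptions made for illustration; the slides do not show how escontrol stores its graph.

```javascript
// Hypothetical CFG for `return lhs - rhs > 0`.
// Each node is one operation; each edge is a [from, to] pair meaning
// "control transfers from the first node to the second".
const nodes = {
  n1: "load lhs",
  n2: "load rhs",
  n3: "subtract",
  n4: "load 0",
  n5: "greater-than",
  n6: "return",
};

const edges = [
  ["n1", "n2"],
  ["n2", "n3"],
  ["n3", "n4"],
  ["n4", "n5"],
  ["n5", "n6"],
];

// All nodes that control can flow to directly from `id`.
function successors(id) {
  return edges.filter(([from]) => from === id).map(([, to]) => to);
}

// Follow edges from `start`. This graph is straight-line code, so every
// node has at most one successor and the walk is unambiguous.
function walk(start) {
  const order = [];
  let current = start;
  while (current !== undefined) {
    order.push(nodes[current]);
    current = successors(current)[0];
  }
  return order;
}

const order = walk("n1");
console.log(order.join(" -> "));
```
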
  40. control flow graphs: importantly, one control flow graph can represent multiple different input syntax trees. for example...
  41. control flow graphs: while (x > 3) { x -= 2 } [Diagram: blocks B1, B2, B3]
  42. control flow graphs: for (; x > 3;) { x -= 2 } [Diagram: blocks B1, B2, B3]
  43. control flow graphs: function f(x) { if (x > 3) return f(x - 2); return x } x = f(x) [Diagram: blocks B1, B2, B3]
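
As a rough check on the three examples above, the sketch below runs a `while` form, a `for` form, and a recursive form on the same inputs and compares the results. The function names are hypothetical, and the recursive form returns its result so that it can be compared with the loops. Agreeing on outputs is weaker than sharing a graph, so this is only a sanity check, not a CFG construction. The slides do not say what B1, B2, and B3 contain; one plausible reading is test, body, and exit.

```javascript
// Three syntactically different ways to subtract 2 until x is at most 3.
function viaWhile(x) {
  while (x > 3) {
    x -= 2;
  }
  return x;
}

function viaFor(x) {
  for (; x > 3; ) {
    x -= 2;
  }
  return x;
}

function viaRecursion(x) {
  if (x > 3) return viaRecursion(x - 2);
  return x;
}

const inputs = [0, 3, 4, 9, 10];
const results = inputs.map((x) => [viaWhile(x), viaFor(x), viaRecursion(x)]);

// Every row should hold three identical numbers.
const allAgree = results.every(([a, b, c]) => a === b && b === c);
console.log(results, allAgree);
```
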
  44. goals and obstacles
  45. goals: if we can build CFGs, we open the door to neat optimizations:
    * dead code elimination
    * splitting polymorphic functions into multiple monomorphic functions
    * extreme minification
    * generating SSA graphs (like V8's Crankshaft!)
  46. goals (cont'd): and we can also do some neat mad-science-y stuff:
    * diffing CFGs: did that commit add a new exception edge to the program?
    * taint checking
    * if the IR is advanced enough, we could project JS into different languages.
  47. obstacles: BUT FIRST:
    * We need to accurately represent every (wonky) JS control structure
    * And for best results, we want to have as many "linear" blocks as possible.
  48. obstacles (wonky control structures): B1: { if (x > 2) break B1 }
  49. obstacles (wonky control structures): B2: { B1: { try { break B1 } finally { break B2 } } console.log('???') }
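
To see where control actually goes in the nested-label example, here is a runnable version that records a trace instead of logging. The `run` function and the `trace` array are additions for illustration. In JavaScript, an abrupt completion in a `finally` block replaces the pending one from the `try` block, so `break B2` wins over `break B1` and the `'???'` line is skipped.

```javascript
function run() {
  const trace = [];
  B2: {
    B1: {
      try {
        trace.push("try");
        break B1; // pending: leave B1 once finally has run
      } finally {
        trace.push("finally");
        break B2; // replaces the pending "break B1"
      }
    }
    trace.push("???"); // never reached
  }
  trace.push("after B2");
  return trace;
}

const trace = run();
console.log(trace.join(" -> "));
```

A CFG builder therefore needs an edge from the `finally` block straight to the code after B2, and no edge into the statement that follows B1.
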
  50. obstacles (linear blocks): what is a linear block? when a given operation has only one possible exit edge, and the next operation has one possible entry, we can say that:
  51. obstacles (linear blocks): what is a linear block? "If control ever enters this block, it must perform all of these operations before exiting this block"
  52. obstacles (linear blocks): what is a linear block? and thus we can simplify! [Diagram: the chain of separate nodes load x, load 3, less-than, test === one block containing load x, load 3, less-than, test.]
  53. obstacles (linear blocks): in other words: anything that conditionally executes code is not linear. that includes anything that throws an exception.
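
The rule above (one exit edge on this operation, one entry edge on the next) is enough to group operations into linear blocks. The following is a minimal sketch of that grouping, assuming a graph given as node ids plus `[from, to]` edges; `linearBlocks` and the example loop graph are hypothetical and are not taken from escontrol.

```javascript
// Group nodes into linear blocks. A node joins its predecessor's block
// when the predecessor has exactly one exit edge and the node has exactly
// one entry edge. Assumes the graph has an entry node with no predecessors.
function linearBlocks(nodeIds, edges) {
  const exits = new Map(nodeIds.map((id) => [id, []]));
  const entries = new Map(nodeIds.map((id) => [id, []]));
  for (const [from, to] of edges) {
    exits.get(from).push(to);
    entries.get(to).push(from);
  }

  const continuesPredecessor = (id) =>
    entries.get(id).length === 1 &&
    exits.get(entries.get(id)[0]).length === 1;

  const blocks = [];
  for (const id of nodeIds) {
    if (continuesPredecessor(id)) continue; // not the start of a block
    const block = [id];
    let last = id;
    while (
      exits.get(last).length === 1 &&
      continuesPredecessor(exits.get(last)[0])
    ) {
      last = exits.get(last)[0];
      block.push(last);
    }
    blocks.push(block);
  }
  return blocks;
}

// A loop: the test either enters the body or exits; the body jumps back.
const nodeIds = ["entry", "load x", "load 3", "less-than", "test", "body", "exit"];
const edges = [
  ["entry", "load x"],
  ["load x", "load 3"],
  ["load 3", "less-than"],
  ["less-than", "test"],
  ["test", "body"],
  ["test", "exit"],
  ["body", "load x"],
];

const blocks = linearBlocks(nodeIds, edges);
console.log(blocks);
```

Here `load x` starts its own block because it has two entry edges (from `entry` and from `body`), and the block ends at `test` because `test` has two exit edges.
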
  54. obstacles (linear blocks): pop quiz: what operators in JavaScript are guaranteed to not throw exceptions?
  55. obstacles (linear blocks): pop quiz (answered): well, there's void
  56. obstacles (linear blocks): pop quiz (answered):
    * delete, in, instanceof, and new can all throw
    * all mathematical operators can throw
    * all lookups can throw
  57. obstacles (linear blocks): pop quiz (really answered): no* operators are safe. it's only combinations of operators and operand types that are safe. (* except for void, but hey, it's kind of difficult to write a useful program with just void)
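
A few concrete cases behind the quiz answers, as a hedged sketch. The `thrownBy` helper and the `angry` object are hypothetical names introduced here; each entry pairs an operator with an operand that makes it throw, plus two combinations that cannot.

```javascript
// Returns the error's constructor name if fn throws, or null if it does not.
function thrownBy(fn) {
  try {
    fn();
    return null;
  } catch (err) {
    return err.constructor.name;
  }
}

// An object whose numeric conversion runs user code that throws.
const angry = {
  valueOf() {
    throw new RangeError("no number for you");
  },
};

const results = {
  "lookup": thrownBy(() => null.property), // property lookup on null
  "in": thrownBy(() => "key" in 42), // right-hand side is not an object
  "instanceof": thrownBy(() => ({}) instanceof 42), // right-hand side is not callable
  "new": thrownBy(() => new (42)()), // 42 is not a constructor
  "subtract": thrownBy(() => angry - 1), // conversion calls valueOf, which throws
  "delete": thrownBy(function () {
    "use strict";
    delete Object.prototype; // non-configurable property, strict mode
  }),
  "number subtract": thrownBy(() => 5 - 3), // number minus number is safe
  "void": thrownBy(() => void 0),
};

console.log(results);
```

The `subtract` and `number subtract` rows use the same operator; only the operand types differ, which is the point of the "really answered" slide.
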
  58. obstacles (linear blocks): solution:
    * track values through the program
  59. tracking values
  60. tracking values: How do we track values in a JS program? What should a value look like?
  61. tracking values: Values might be:
    * Undefined or null
    * Primitives (strings, booleans, or numbers)
    * Objects
    * Functions
    * Unknown
  62. tracking values (unknowns): Unknowns are values that are present in the program, but either aren't available from the code OR aren't deducible statically.
  63. tracking values (unknowns): For instance, if you use a variable declared globally in another script, we would mark that "Unknown".
  64. tracking values (unknowns): We can make deductions based on their use:
    * are they defined? (i.e., not null/undefined)
    * are they functions?
  65. tracking values (unknowns): We track values using a stack.
  66. tracking values: We track values using a stack. Operations: load x, load 3, subtract.
  67. tracking values: load x (push 1). Stack: Value<x>
  68. tracking values: load 3 (push 1). Stack: Value<x>, Value<3>
  69. tracking values: subtract (pop 2, push 1). Stack before: Value<x>, Value<3>
  70. tracking values: subtract (pop 2, push 1). Stack after: Value<x-3>
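
The stack walk-through above can be sketched as a tiny symbolic stack machine. The operation objects (`kind`, `what`) and the `trackStack` function are assumptions for illustration; values are tracked as descriptions ("x", "3", "x-3") rather than concrete numbers, which is how the slides label them.

```javascript
// Format a tracked expression the way the slides label it.
const show = (expr) => `Value<${expr}>`;

// Run the operations and record the stack after each one.
function trackStack(ops) {
  const stack = [];
  const snapshots = [];
  for (const op of ops) {
    if (op.kind === "load") {
      stack.push(op.what); // push 1
    } else if (op.kind === "subtract") {
      const rhs = stack.pop(); // pop 2 ...
      const lhs = stack.pop();
      stack.push(`${lhs}-${rhs}`); // ... push 1
    } else {
      throw new Error(`unknown operation: ${op.kind}`);
    }
    snapshots.push(stack.map(show).join(", "));
  }
  return snapshots;
}

const snapshots = trackStack([
  { kind: "load", what: "x" },
  { kind: "load", what: "3" },
  { kind: "subtract" },
]);

console.log(snapshots);
```
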
  71. tracking values (names): Values are only half of the battle. We also care about names.
  72. tracking values (names): Names are variable names and property names. An object is a collection of names. A scope is an object whose names represent variables.
  73. tracking values (names): Operations can determine whether a name or a value is pushed onto the stack. In x = x + 3, the x on the left of the assignment is pushed as a name. This is called an LValue.
  74. tracking values (names): x = x + 3
  75. tracking values (names): x = x + 3. get x. Stack: Name<x>
  76. tracking values (names): x = x + 3. load x. Stack: Name<x>, Value<x>
  77. tracking values (names): x = x + 3. load 3. Stack: Name<x>, Value<x>, Value<3>
  78. tracking values (names): x = x + 3. add. Stack: Name<x>, Value<x+3>
  79. tracking values (names): x = x + 3. store. Operations: get x, load x, load 3, add, store. Value<x+3>
  80. tracking values (names): Values only exist for as long as at least one name points to them. When we write JS, we grow and prune the object graph using names.
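
The x = x + 3 walk-through above (get, load, load, add, store) can be sketched with a stack that holds both names and values. This version uses concrete numbers and a plain object as the scope, which is a simplification; the operation shapes and the starting value of 4 are assumptions, and an analyzer would track symbolic values instead.

```javascript
// A scope is an object whose names represent variables.
const scope = { x: 4 };
const stack = [];

const ops = [
  { kind: "get", name: "x" }, // push Name<x>, the LValue to assign to
  { kind: "load", name: "x" }, // push the current value of x
  { kind: "load", value: 3 }, // push the literal 3
  { kind: "add" }, // pop 2 values, push their sum
  { kind: "store" }, // pop a value and a name, write value to name
];

for (const op of ops) {
  switch (op.kind) {
    case "get":
      stack.push({ name: op.name });
      break;
    case "load":
      stack.push({ value: "name" in op ? scope[op.name] : op.value });
      break;
    case "add": {
      const rhs = stack.pop();
      const lhs = stack.pop();
      stack.push({ value: lhs.value + rhs.value });
      break;
    }
    case "store": {
      const result = stack.pop();
      const target = stack.pop();
      scope[target.name] = result.value;
      break;
    }
    default:
      throw new Error(`unknown operation: ${op.kind}`);
  }
}

console.log(scope.x, stack.length);
```

The name is pushed first and stays at the bottom of the stack while the right-hand side is computed, so `store` finds the value on top and the target name directly beneath it.
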
  81. branches
  82. branches: Branching logic presents a problem: A value could be X or Y, depending on the branch taken!
  83. branches: x = 3; if (Math.random() > 0.5) { x = "hi"; y = x + ' world' } What is x's type? What is y's?
  84. branches: solution:
    * inject a proxy into the scope chain.
    * wrap names and values that are accessed from outside of scope
    * when a value changes, split it
  85. branches: after that point:
    * accesses inside the branch get the updated name+value.
    * accesses outside the branch get an Either object.
  86. branches: x = 3; if (Math.random() > 0.5) { x = "hi"; y = x + ' world' } Result: x = Either<Number, String>, y = Either<Undefined, String>
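
The Either results above can be reproduced with a small join over two views of the scope: the one where the branch was skipped and the one where it was taken. This sketch does not implement the scope-chain proxy from the slides; it only shows the merge at the point where the branches meet. Types are plain strings, and `join` and `joinScopes` are hypothetical names.

```javascript
// Combine the types a name can have on two incoming paths.
function join(a, b) {
  return a === b ? a : `Either<${a}, ${b}>`;
}

// Merge two scope views; a name missing from one view is Undefined there.
function joinScopes(skipped, taken) {
  const names = new Set([...Object.keys(skipped), ...Object.keys(taken)]);
  const merged = {};
  for (const name of names) {
    merged[name] = join(skipped[name] || "Undefined", taken[name] || "Undefined");
  }
  return merged;
}

const branchSkipped = { x: "Number" }; // x = 3
const branchTaken = { x: "String", y: "String" }; // x = "hi"; y = x + ' world'

const afterIf = joinScopes(branchSkipped, branchTaken);
console.log(afterIf);
```
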
  87. demo time! github.com/chrisdickinson/escontrol.git npm.im/escontrol