
Better Living Through Control Flow Graph Generation

Thunderplainsconf 2014

Chris Dickinson

October 09, 2014

Transcript

  1. THUNDER PLAINS CONF 2014 @isntitvacant

    hi, I’m chris. I’d like to talk to you about how we think about our code. I work for WM labs
  2. what is a program? is it text? is it what runs on the CPU? is it an idea?
  3. Abstract Syntax Trees

    one syntax tree can represent multiple different input texts.
  4. [diagram: syntax tree of a function; node labels: function, name (a), parameters (b, c), body, return, >, -, b, c, 0]
  5. control flow graphs

    CFGs are directed graphs.

    A directed graph has edges that go from one node to another.

    In a CFG's case, nodes are operations, and edges are flows between operations.
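
The definition above can be sketched as a small data structure. This is a minimal illustration, not the representation used in the talk: the `ControlFlowGraph` class, its method names, and the operation labels are all hypothetical.

```javascript
// Hypothetical minimal CFG: nodes are operations, edges are flows between them.
class ControlFlowGraph {
  constructor() {
    this.nodes = []; // operation labels, indexed by node id
    this.edges = []; // { from, to } records, one per directed edge
  }

  // Adds an operation and returns its node id.
  addNode(operation) {
    this.nodes.push(operation);
    return this.nodes.length - 1;
  }

  // Adds a directed edge: control may flow from `from` to `to`.
  addEdge(from, to) {
    this.edges.push({ from, to });
  }

  // Node ids that control can reach directly from `id`.
  successors(id) {
    return this.edges.filter(edge => edge.from === id).map(edge => edge.to);
  }
}

// `return x - 3`, flattened into operations (labels are illustrative).
const cfg = new ControlFlowGraph();
const ids = ['load x', 'load 3', 'subtract', 'return'].map(op => cfg.addNode(op));
for (let i = 0; i + 1 < ids.length; i++) {
  cfg.addEdge(ids[i], ids[i + 1]);
}

console.log(cfg.successors(0)); // [ 1 ]
console.log(cfg.successors(3)); // []
```

An edge list keeps the sketch short; a real tool would more likely index edges by node and tag them with a kind (normal, exception, and so on).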
  6. control flow graphs

    [diagram: syntax tree; node labels: function, name, parameters, body, return, >, -, literal, with identifiers shown as <id>]
  7. control flow graphs

    [diagram: node labels: <id>, <id>, >, -, literal, return; annotation: "then transfer control to.."]
  8. control flow graphs

    [diagram: node labels: <id>, <id>, >, -, literal, return; annotation: "then transfer control to.."]
  9. control flow graphs

    [diagram: node labels: <id>, <id>, >, -, literal, return; annotation: "a subtraction operation!"]
  10. control flow graphs

    importantly: one control flow graph can represent multiple different input syntax trees. for example...
  11. goals

    if we can build CFGs, we open the door to neat optimizations:

    * dead code elimination
    * splitting polymorphic functions into multiple monomorphic functions
    * extreme minification
    * generating SSA graphs (like V8's Crankshaft!)
  12. goals (cont'd)

    and we can also do some neat mad-science-y stuff:

    * diffing CFGs: did that commit add a new exception edge to the program?
    * taint checking
    * if the IR is advanced enough, we could project JS into different languages.
  13. obstacles

    BUT FIRST:

    * We need to accurately represent every (wonky) JS control structure
    * And for best results, we want to have as many "linear" blocks as possible.
  14. obstacles (wonky control structures)

    B2: {
      B1: {
        try {
          break B1
        } finally {
          break B2
        }
      }
      console.log('???')
    }
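
To see why this structure is awkward for a CFG builder, it helps to run it. The sketch below wraps the same labeled blocks in a hypothetical `labeledBreakTrace` function that records which statements execute. The `break B2` in the `finally` clause replaces the pending `break B1`, so the statement after block `B1` never runs.

```javascript
// Records the order in which the labeled try/finally blocks execute.
// `labeledBreakTrace` and the trace labels are illustrative names.
function labeledBreakTrace() {
  const trace = [];
  B2: {
    B1: {
      try {
        trace.push('try');
        break B1; // requests a jump to the end of B1...
      } finally {
        trace.push('finally');
        break B2; // ...but finally runs first, and its break wins
      }
    }
    trace.push('after B1'); // never reached
  }
  trace.push('after B2');
  return trace;
}

console.log(labeledBreakTrace()); // [ 'try', 'finally', 'after B2' ]
```

A graph for this code needs an edge from the `try` body into `finally`, and the edge leaving `finally` has to go to the end of `B2` rather than the end of `B1`.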
  15. obstacles (linear blocks)

    what is a linear block?

    when a given operation has only one possible exit edge, and the next operation has one possible entry, we can say that:
  16. obstacles (linear blocks)

    what is a linear block?

    "If control ever enters this block, it must perform all of these operations before exiting this block"
  17. obstacles (linear blocks)

    what is a linear block?

    and thus we can simplify!

    [diagram: four separate nodes (load x, load 3, less-than, test) === one block containing load x, load 3, less-than, test]
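
The simplification above can be sketched as a pass that merges chains of operations into blocks. This is one possible formulation, not the talk's implementation: the `linearBlocks` function and the `then`/`else` node labels are hypothetical, and the sketch assumes every node is reachable from a node that starts a block (an isolated cycle would be skipped).

```javascript
// Groups operations into linear blocks. A node joins its predecessor's block
// when the predecessor has exactly one exit edge and the node has exactly one
// entry edge, which is the rule stated on the slides.
function linearBlocks(nodes, edges) {
  const exits = new Map(nodes.map(node => [node, []]));
  const entries = new Map(nodes.map(node => [node, []]));
  for (const [from, to] of edges) {
    exits.get(from).push(to);
    entries.get(to).push(from);
  }

  const continuesBlock = node => {
    const preds = entries.get(node);
    return preds.length === 1 && preds[0] !== node && exits.get(preds[0]).length === 1;
  };

  const blocks = [];
  for (const node of nodes) {
    if (continuesBlock(node)) continue; // belongs to an earlier block
    const block = [node];
    let current = node;
    // Follow the single exit edge for as long as the chain stays linear.
    while (exits.get(current).length === 1 && continuesBlock(exits.get(current)[0])) {
      current = exits.get(current)[0];
      block.push(current);
    }
    blocks.push(block);
  }
  return blocks;
}

const blocks = linearBlocks(
  ['load x', 'load 3', 'less-than', 'test', 'then', 'else'],
  [
    ['load x', 'load 3'],
    ['load 3', 'less-than'],
    ['less-than', 'test'],
    ['test', 'then'], // `test` has two exits, so the block ends here
    ['test', 'else'],
  ]
);
console.log(blocks);
// [ [ 'load x', 'load 3', 'less-than', 'test' ], [ 'then' ], [ 'else' ] ]
```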
  18. obstacles (linear blocks)

    in other words:

    anything that conditionally executes code is not linear

    that includes anything that throws an exception
  19. obstacles (linear blocks)

    pop quiz:

    what operators in JavaScript are guaranteed to not throw exceptions?
  20. obstacles (linear blocks)

    pop quiz (answered):

    * delete, in, instanceof, and new can all throw
    * all mathematical operators can throw
    * all lookups can throw
  21. obstacles (linear blocks)

    pop quiz (really answered):

    no* operators are safe.

    it's only combinations of operators and operand types that are safe.

    * except for void, but hey, it's kind of difficult to write a useful program with just void
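
The quiz answers can be checked directly. The sketch below pairs each operator with an operand that makes it throw; the `throws` helper and the `hostile` object are illustrative names, and each case is just one example of an unsafe operand combination.

```javascript
// Returns true when calling fn throws, false otherwise.
function throws(fn) {
  try {
    fn();
    return false;
  } catch (err) {
    return true;
  }
}

// An object whose numeric conversion throws.
const hostile = {
  valueOf() {
    throw new Error('no number for you');
  },
};

const results = {
  'in': throws(() => 'length' in 'a string'),     // right-hand side is not an object
  'instanceof': throws(() => ({}) instanceof 42), // right-hand side is not callable
  'new': throws(() => new (void 0)()),            // undefined is not a constructor
  'delete': throws(() => delete null.x),          // null has no properties
  'subtract': throws(() => hostile - 3),          // converting the operand calls valueOf
  'lookup': throws(() => undefined.x),            // property lookup on undefined
  'void': throws(() => void hostile),             // void never converts its operand
};

console.log(results);
// every entry is true except 'void', which is false
```

For a CFG builder, this means each of these operations needs a possible exception edge unless the operand types are known to be safe.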
  22. tracking values

    How do we track values in a JS program? What should a value look like?
  23. tracking values

    Values might be:

    * Undefined or null
    * Primitives (strings, booleans, or numbers)
    * Objects
    * Functions
    * Unknown
  24. tracking values (unknowns)

    Unknowns are values that are present in the program, but either aren't available from the code OR aren't deducible statically.
  25. tracking values (unknowns)

    For instance, if you use a variable declared globally in another script, we would mark that "Unknown".
  26. tracking values (unknowns)

    We can make deductions based on their use:

    * are they defined? (i.e., not null/undefined)
    * are they functions?
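
One way to picture these deductions is an unknown value that accumulates facts as the analysis sees it being used. This is a rough sketch under assumed names (`Unknown`, `observeLookup`, `observeCall`, and the `jQuery` example are all hypothetical). It records what must hold on the path where the operation did not throw.

```javascript
// A value the analysis cannot see, plus what its uses imply about it.
class Unknown {
  constructor(name) {
    this.name = name;
    this.facts = new Set();
  }

  // If `value.prop` did not throw, the value is not null or undefined.
  observeLookup() {
    this.facts.add('defined');
    return this;
  }

  // If `value()` did not throw, the value is a function (and so defined).
  observeCall() {
    this.facts.add('defined');
    this.facts.add('function');
    return this;
  }
}

// A global declared in another script, then used as `jQuery.fn` and `jQuery()`.
const jq = new Unknown('jQuery');
jq.observeLookup();
console.log([...jq.facts]); // [ 'defined' ]
jq.observeCall();
console.log([...jq.facts]); // [ 'defined', 'function' ]
```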
  27. tracking values

    We track values using a stack.

    [diagram: operations load x, load 3, subtract; at load x: push 1; stack: Value<x>]
  28. tracking values

    [diagram: at load 3: push 1; stack: Value<x>, Value<3>]
  29. tracking values

    [diagram: at subtract: pop 2, push 1; stack: Value<x>, Value<3>]
  30. tracking values

    [diagram: after subtract: pop 2, push 1; stack: Value<x-3>]
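
The four steps above can be replayed with a tiny symbolic stack machine. This is a minimal sketch: the `Value` class, the `simulate` function, and the operation encoding are assumptions made for illustration, and values are tracked as descriptions rather than computed.

```javascript
// A symbolic value: we track what it is, not what it evaluates to.
class Value {
  constructor(description) {
    this.description = description;
  }
  toString() {
    return `Value<${this.description}>`;
  }
}

// Runs operations against a stack and returns the final stack.
function simulate(operations) {
  const stack = [];
  for (const [op, arg] of operations) {
    if (op === 'load') {
      stack.push(new Value(String(arg))); // push 1
    } else if (op === 'subtract') {
      const right = stack.pop(); // pop 2...
      const left = stack.pop();
      stack.push(new Value(`${left.description}-${right.description}`)); // ...push 1
    } else {
      throw new Error(`unknown operation: ${op}`);
    }
  }
  return stack;
}

const finalStack = simulate([['load', 'x'], ['load', 3], ['subtract']]);
console.log(finalStack.map(String)); // [ 'Value<x-3>' ]
```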
  31. tracking values names

    Values are only half of the battle.

    We also care about names.
  32. tracking values names

    Names are variable names and property names.

    An object is a collection of names. A scope is an object whose names represent variables.
  33. tracking values names

    Operations can determine whether a name is pushed onto the stack, or a value.

    x = x + 3

    The x on the left of the assignment is pushed as a name. This is called an LValue.
  34. tracking values names

    x = x + 3

    [diagram: operations get x, load x, load 3; stack: Name<x>, Value<x>, Value<3>]
  35. tracking values names

    x = x + 3

    [diagram: operations get x, load x, load 3, add; stack: Name<x>, Value<x+3>]
  36. tracking values names

    x = x + 3

    [diagram: operations get x, load x, load 3, add, store; Value<x+3>]
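
The assignment can be replayed the same way, with names on the stack alongside values. This sketch assumes a `Map` as the scope and plain objects tagged `name` or `value`. The `execute` function and the convention that a string argument to `load` means a variable are illustrative choices, not the talk's design.

```javascript
// Runs operations against a stack; `scope` maps variable names to values.
function execute(operations, scope) {
  const stack = [];
  for (const [op, arg] of operations) {
    if (op === 'get') {
      // Push the name itself (the LValue), not what it currently holds.
      stack.push({ kind: 'name', key: arg });
    } else if (op === 'load') {
      // Assumption: a string argument is a variable, anything else a literal.
      const isVariable = typeof arg === 'string';
      stack.push(isVariable ? scope.get(arg) : { kind: 'value', description: String(arg) });
    } else if (op === 'add') {
      const right = stack.pop();
      const left = stack.pop();
      stack.push({ kind: 'value', description: `${left.description}+${right.description}` });
    } else if (op === 'store') {
      const value = stack.pop();
      const name = stack.pop();
      scope.set(name.key, value); // the name now points at the new value
    } else {
      throw new Error(`unknown operation: ${op}`);
    }
  }
  return stack;
}

// x = x + 3
const scope = new Map([['x', { kind: 'value', description: 'x' }]]);
const leftover = execute(
  [['get', 'x'], ['load', 'x'], ['load', 3], ['add'], ['store']],
  scope
);
console.log(scope.get('x').description); // x+3
console.log(leftover.length); // 0
```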
  37. tracking values names

    Values only exist for as long as at least one name points to them.

    When we write JS, we grow and prune the object graph using names.
  38. branches

    Branching logic presents a problem:

    A value could be X or Y, depending on the branch taken!
  39. branches

    x = 3
    if (Math.random() > 0.5) {
      x = "hi"
      y = x + ' world'
    }

    What is x's type? What is y's?
  40. branches

    solution:

    * inject a proxy into the scope chain.
    * wrap names and values that are accessed from outside of scope
    * when a value changes, split it
  41. branches

    after that point:

    * accesses inside the branch get the updated name+value.
    * accesses outside the branch get an Either object.
  42. branches

    x = 3
    if (Math.random() > 0.5) {
      x = "hi"
      y = x + ' world'
    }

    x = Either<Number, String>
    y = Either<Undefined, String>
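
The result on this slide can be reproduced with a much simpler mechanism than the scope-chain proxy described earlier: compare the types known before the branch with the types assigned inside it. The sketch below is that simplification, not the talk's implementation. The `mergeAfterBranch` function and the use of type-name strings are assumptions.

```javascript
// Merges what a single `if` (no else) did to the scope.
// `before` and `insideBranch` map variable names to type names.
function mergeAfterBranch(before, insideBranch) {
  const merged = new Map();
  const names = new Set([...before.keys(), ...insideBranch.keys()]);
  for (const name of names) {
    const skipped = before.get(name) || 'Undefined'; // type if the branch is not taken
    const taken = insideBranch.get(name) || skipped; // type if the branch is taken
    merged.set(name, skipped === taken ? skipped : `Either<${skipped}, ${taken}>`);
  }
  return merged;
}

const before = new Map([['x', 'Number']]); // x = 3
const insideBranch = new Map([
  ['x', 'String'], // x = "hi"
  ['y', 'String'], // y = x + ' world'
]);

const after = mergeAfterBranch(before, insideBranch);
console.log(after.get('x')); // Either<Number, String>
console.log(after.get('y')); // Either<Undefined, String>
```

Inside the branch, `x` is simply a String, which is why `y` is a String there and not an Either.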