Compilers for Free

Tom Stuart
November 09, 2013

Partial evaluation is a powerful tool for timeshifting some aspects of a program's execution from the future into the present. Among other things, it gives us an automatic way to turn a general, abstract program into a faster, more specialized one.

This math-free talk uses Ruby to explain how partial evaluation works, how it can be used to make programs go faster, and how it compares to ideas like currying and partial application from the world of functional programming. It then investigates what happens when you run a partial evaluator on itself, and reveals some surprising results about how these techniques can be used to automatically generate compilers instead of writing them from scratch.

Given at RubyConf 2013 (http://lanyrd.com/2013/rubyconf). A video and transcript are available at http://codon.com/compilers-for-free.

Transcript

  1. 3.

    As Ruby programmers we already intuitively understand the power of metaprogramming: programs that write programs.
  2. 4.

    But there's another thing that I find even more interesting: programs that manipulate representations of other programs.
  3. 9.

    This only works if the program is written in a language that the machine understands.
  4. 13.

    • read some source code
    • build an AST by parsing the source code
    • evaluate the program by walking over the AST and performing instructions
  5. 14.

    SIMPLE (a simple imperative language)

    a = 19 + 23

    x = 2; y = x * 3

    if (a < 10) { a = 0; b = 0 } else { b = 10 }

    x = 1; while (x < 5) { x = x * 3 }
  6. 15.
  7. 16.

    grammar Simple
      rule statement
        sequence
      end

      rule sequence
        first:sequenced_statement '; ' second:sequence
        /
        sequenced_statement
      end

      rule sequenced_statement
        while / assign / if
      end

      rule while
        'while (' condition:expression ') { ' body:statement ' }'
      end

      rule assign
        name:[a-z]+ ' = ' expression
      end

      rule if
  8. 17.

      rule if
        'if (' condition:expression ') { ' consequence:statement ' } else { ' alternative:statement ' }'
      end

      rule expression
        less_than
      end

      rule less_than
        left:add ' < ' right:less_than
        /
        add
      end

      rule add
        left:multiply ' + ' right:add
        /
        multiply
      end

      rule multiply
        left:term ' * ' right:multiply
        /
  9. 18.

      rule multiply
        left:term ' * ' right:multiply
        /
        term
      end

      rule term
        number / boolean / variable
      end

      rule number
        [0-9]+
      end

      rule boolean
        ('true' / 'false') ![a-z]
      end

      rule variable
        [a-z]+
      end
    end
  10. 19.

    >> Treetop.load('simple.treetop')
    => SimpleParser
    >> SimpleParser.new.parse('x = 2; y = x * 3')
    => SyntaxNode+Sequence1+Sequence0 offset=0, "x = 2; y = x * 3" (first,second):
         SyntaxNode+Assign1+Assign0 offset=0, "x = 2" (name,expression):
           SyntaxNode offset=0, "x":
             SyntaxNode offset=0, "x"
           SyntaxNode offset=1, " = "
           SyntaxNode+Number0 offset=4, "2":
             SyntaxNode offset=4, "2"
         SyntaxNode offset=5, "; "
         SyntaxNode+Assign1+Assign0 offset=7, "y = x * 3" (name,expression):
           SyntaxNode offset=7, "y":
             SyntaxNode offset=7, "y"
           SyntaxNode offset=8, " = "
           SyntaxNode+Multiply1+Multiply0 offset=11, "x * 3" (left,right):
             SyntaxNode+Variable0 offset=11, "x":
               SyntaxNode offset=11, "x"
             SyntaxNode offset=12, " * "
             SyntaxNode+Number0 offset=15, "3":
               SyntaxNode offset=15, "3"
  11. 20.

    Number   = Struct.new :value
    Boolean  = Struct.new :value
    Variable = Struct.new :name

    Add      = Struct.new :left, :right
    Multiply = Struct.new :left, :right
    LessThan = Struct.new :left, :right

    Assign   = Struct.new :name, :expression
    If       = Struct.new :condition, :consequence, :alternative
    Sequence = Struct.new :first, :second
    While    = Struct.new :condition, :body
  12. 21.

      rule number
        [0-9]+ {
          def to_ast
            Number.new(text_value.to_i)
          end
        }
      end

      rule boolean
        ('true' / 'false') ![a-z] {
          def to_ast
            Boolean.new(text_value == 'true')
          end
        }
      end

      rule variable
        [a-z]+ {
          def to_ast
            Variable.new(text_value.to_sym)
          end
        }
      end
  13. 22.

      rule while
        'while (' condition:expression ') { ' body:statement ' }' {
          def to_ast
            While.new(condition.to_ast, body.to_ast)
          end
        }
      end

      rule assign
        name:[a-z]+ ' = ' expression {
          def to_ast
            Assign.new(name.text_value.to_sym, expression.to_ast)
          end
        }
      end

      rule if
        'if (' condition:expression ') { ' consequence:statement ' } else { ' alternative:statement ' }' {
          def to_ast
            If.new(condition.to_ast, consequence.to_ast, alternative.to_ast)
          end
        }
      end
  14. 23.

    >> SimpleParser.new.parse('x = 2; y = x * 3').to_ast
    => #<struct Sequence
         first=#<struct Assign
           name=:x,
           expression=#<struct Number value=2>
         >,
         second=#<struct Assign
           name=:y,
           expression=#<struct Multiply
             left=#<struct Variable name=:x>,
             right=#<struct Number value=3>
           >
         >
       >
  15. 25.

    class Number
      def evaluate(environment)
        value
      end
    end

    class Boolean
      def evaluate(environment)
        value
      end
    end

    class Variable
      def evaluate(environment)
        environment[name]
      end
    end
  16. 26.

    >> Number.new(3).evaluate({})
    => 3
    >> Boolean.new(false).evaluate({})
    => false
    >> Variable.new(:y).evaluate({ x: 7, y: 11 })
    => 11
    >> Variable.new(:y).evaluate({ x: 7, y: true })
    => true
  17. 27.

    class Add
      def evaluate(environment)
        left.evaluate(environment) + right.evaluate(environment)
      end
    end

    class Multiply
      def evaluate(environment)
        left.evaluate(environment) * right.evaluate(environment)
      end
    end

    class LessThan
      def evaluate(environment)
        left.evaluate(environment) < right.evaluate(environment)
      end
    end
  18. 28.

    >> Multiply.new(Variable.new(:x), Variable.new(:y)).
         evaluate({ x: 2, y: 3 })
    => 6
    >> LessThan.new(Variable.new(:x), Variable.new(:y)).
         evaluate({ x: 2, y: 3 })
    => true
  19. 29.

    class Assign
      def evaluate(environment)
        environment.merge({ name => expression.evaluate(environment) })
      end
    end

    class If
      def evaluate(environment)
        if condition.evaluate(environment)
          consequence.evaluate(environment)
        else
          alternative.evaluate(environment)
        end
      end
    end

    class Sequence
      def evaluate(environment)
        second.evaluate(first.evaluate(environment))
      end
    end

    class While
      def evaluate(environment)
        if condition.evaluate(environment)
          evaluate(body.evaluate(environment))
        else
          environment
        end
      end
    end
  20. 30.

    >> Assign.new(:x, Number.new(1)).evaluate({})
    => {:x=>1}
    >> Sequence.new(
         Assign.new(:x, Number.new(1)),
         Assign.new(:x, Number.new(2))
       ).evaluate({})
    => {:x=>2}
    >> SimpleParser.new.parse('x = 2; y = x * 3').
         to_ast.evaluate({})
    => {:x=>2, :y=>6}
    >> SimpleParser.new.parse('x = 1; while (x < 5) { x = x * 3 }').
         to_ast.evaluate({})
    => {:x=>9}
  21. 33.
  22. 35.

    • read some source code
    • build an AST by parsing the source code
    • generate target code by walking over the AST and emitting instructions
  23. 36.

    require 'json'

    class Number
      def to_javascript
        "function (e) { return #{JSON.dump(value)}; }"
      end
    end

    class Boolean
      def to_javascript
        "function (e) { return #{JSON.dump(value)}; }"
      end
    end

    class Variable
      def to_javascript
        "function (e) { return e[#{JSON.dump(name)}]; }"
      end
    end
  24. 37.

    class Add
      def to_javascript
        "function (e) { return #{left.to_javascript}(e) + #{right.to_javascript}(e); }"
      end
    end

    class Multiply
      def to_javascript
        "function (e) { return #{left.to_javascript}(e) * #{right.to_javascript}(e); }"
      end
    end

    class LessThan
      def to_javascript
        "function (e) { return #{left.to_javascript}(e) < #{right.to_javascript}(e); }"
      end
    end
  25. 38.

    class Assign
      def to_javascript
        "function (e) { e[#{JSON.dump(name)}] = #{expression.to_javascript}(e); return e; }"
      end
    end

    class If
      def to_javascript
        "function (e) { if (#{condition.to_javascript}(e))" +
          " { return #{consequence.to_javascript}(e); }" +
          " else { return #{alternative.to_javascript}(e); }" +
          ' }'
      end
    end

    class Sequence
      def to_javascript
        "function (e) { return #{second.to_javascript}(#{first.to_javascript}(e)); }"
      end
    end

    class While
      def to_javascript
        'function (e) {' +
          " while (#{condition.to_javascript}(e)) { e = #{body.to_javascript}(e); }" +
          ' return e;' +
          ' }'
      end
    end
  26. 39.

    >> SimpleParser.new.parse('x = 1; while (x < 5) { x = x * 3 }').
         to_ast.to_javascript
    => "function (e) { return function (e) { while (function (e) { return function (e) { return e[\"x\"]; }(e) < function (e) { return 5; }(e); }(e)) { e = function (e) { e[\"x\"] = function (e) { return function (e) { return e[\"x\"]; }(e) * function (e) { return 3; }(e); }(e); return e; }(e); } return e; }(function (e) { e[\"x\"] = function (e) { return 1; }(e); return e; }(e)); }"
  27. 40.

    > program = function (e) { return function (e) { while (function (e) { return function (e) { return e["x"]; }(e) < function (e) { return 5; }(e); }(e)) { e = function (e) { e["x"] = function (e) { return function (e) { return e["x"]; }(e) * function (e) { return 3; }(e); }(e); return e; }(e); } return e; }(function (e) { e["x"] = function (e) { return 1; }(e); return e; }(e)); }
    [Function]
    > program({})
    { x: 9 }
  28. 43.

    The good news: compiled programs are faster.
    • removes interpretive overhead at run time
    • other performance benefits e.g. clever data structures, clever optimizations
  29. 44.

    The bad news: compilation is harder.
    • have to think about two times instead of one
    • have to implement in two languages instead of one
    • compiling dynamic languages is challenging
  30. 48.
  31. 49.

    You give a partial evaluator your subject program and some of its inputs. It evaluates only the parts of the program that depend on those inputs. What’s left is called the residual program.
  32. 52.

    A partial evaluator splits single-stage execution into two stages by timeshifting the processing of some input from the future to the present.
  33. 54.

    • read some source code
    • build an AST by parsing the source code
    • read some of the program’s inputs
    • analyse the program by walking over the AST to find the places where known inputs are used
    • partially evaluate the program by walking over the AST, evaluating fragments of code where possible and generating new code to replace them
    • emit a residual program
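
The steps above can be sketched for SIMPLE expressions. This is a minimal illustration, not the talk's code: the method name `partially_evaluate` and its `known` hash of known inputs are assumptions, and only `Number`, `Variable`, `Add` and `Multiply` are covered. Fragments whose inputs are all known are evaluated now; everything else is rebuilt as new AST nodes, which form the residual program.

```ruby
# Hypothetical sketch: partial evaluation of SIMPLE expressions.
# `known` maps variable names to the inputs that are already available.
Number = Struct.new(:value) do
  def partially_evaluate(_known)
    self # a literal is already fully evaluated
  end
end

Variable = Struct.new(:name) do
  def partially_evaluate(known)
    # Known input: replace the variable with its value. Unknown: leave it.
    known.key?(name) ? Number.new(known[name]) : self
  end
end

Add = Struct.new(:left, :right) do
  def partially_evaluate(known)
    l = left.partially_evaluate(known)
    r = right.partially_evaluate(known)
    if l.is_a?(Number) && r.is_a?(Number)
      Number.new(l.value + r.value) # evaluate now
    else
      Add.new(l, r)                 # generate residual code
    end
  end
end

Multiply = Struct.new(:left, :right) do
  def partially_evaluate(known)
    l = left.partially_evaluate(known)
    r = right.partially_evaluate(known)
    if l.is_a?(Number) && r.is_a?(Number)
      Number.new(l.value * r.value)
    else
      Multiply.new(l, r)
    end
  end
end

# a * 3 + x, where only a is known
expression = Add.new(
  Multiply.new(Variable.new(:a), Number.new(3)),
  Variable.new(:x)
)
residual = expression.partially_evaluate({ a: 2 })
# residual is Add.new(Number.new(6), Variable.new(:x)), i.e. 6 + x
```

The `a * 3` fragment has been computed ahead of time, while `x` stays in the residual expression because its value is not yet available.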
  34. 55.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    power(5, …)
  35. 56.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    power(5, …)
  36. 57.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      if false
        1
      else
        x * power(5 - 1, x)
      end
    end

    power(5, …)
  37. 58.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * power(5 - 1, x)
    end

    power(5, …)
  38. 59.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * power(4, x)
    end

    power(5, …)
  39. 60.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x *
        if 4.zero?
          1
        else
          x * power(4 - 1, x)
        end
    end

    power(5, …)
  40. 61.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x *
        if false
          1
        else
          x * power(4 - 1, x)
        end
    end

    power(5, …)
  41. 62.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x *
        x * power(4 - 1, x)
    end

    power(5, …)
  42. 63.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * power(3, x)
    end

    power(5, …)
  43. 64.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * power(2, x)
    end

    power(5, …)
  44. 65.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * x * power(1, x)
    end

    power(5, …)
  45. 66.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * x * x * power(0, x)
    end

    power(5, …)
  46. 67.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * x * x *
        if 0.zero?
          1
        else
          x * power(0 - 1, x)
        end
    end

    power(5, …)
  47. 68.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * x * x *
        if true
          1
        else
          x * power(0 - 1, x)
        end
    end

    power(5, …)
  48. 69.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * x * x *
        1
    end

    power(5, …)
  49. 70.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * x * x * 1
    end

    power(5, …)
  50. 71.

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      x * x * x * x * x
    end

    power(5, …)
  51. 72.

    This isn’t the same as partial application:

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    def power_5(x)
      power(5, x)
    end

    >> power_5(2)
    => 32
  52. 73.

    This isn’t the same as partial application:

    def power(n, x)
      if n.zero?
        1
      else
        x * power(n - 1, x)
      end
    end

    power_5 = method(:power).
      to_proc.
      curry.
      call(5)

    >> power_5.call(2)
    => 32
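
To make the contrast concrete, here is a small sketch that puts the two approaches side by side. The helper `specialize_power` is hypothetical (it is not from the talk, and it is a hand-written specializer for this one function rather than a general partial evaluator): it unfolds the recursion for a known `n` and emits Ruby source, whereas partial application only remembers `n` and still runs the general code on every call.

```ruby
def power(n, x)
  if n.zero?
    1
  else
    x * power(n - 1, x)
  end
end

# Partial application: n = 5 is remembered, but each call still
# performs the zero? checks and the recursion inside power.
power_5_applied = method(:power).to_proc.curry.call(5)

# Hypothetical specializer: n is known now, so the conditional and the
# recursive calls are resolved ahead of time, leaving only source code
# that depends on x.
def specialize_power(n)
  n.zero? ? '1' : "x * #{specialize_power(n - 1)}"
end

residual_source = "->(x) { #{specialize_power(5)} }"
# residual_source is "->(x) { x * x * x * x * x * 1 }"

# Turn the generated source into a callable residual program.
power_5_specialized = eval(residual_source)

power_5_applied.call(2)     # => 32
power_5_specialized.call(2) # => 32
```

Both callables give the same answer, but only the second one is a new program: its body contains no mention of `n`, no conditional and no recursion.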
  53. 75.

    Partial evaluation can take a general program and specialize it for a fixed input.
    • specialize a web server to a particular config file
    • specialize a general ray tracer to a particular scene
    • specialize OS X OpenGL pipeline to a particular GPU
  54. 85.

    source, environment = 'x = 2; y = x * 3', read_environment

    Treetop.load('simple.treetop')
    ast = SimpleParser.new.parse(source).to_ast
    puts ast.evaluate(environment)
  55. 87.

    environment = read_environment

    ast = Sequence.new(
      Assign.new(
        :x,
        Number.new(2)
      ),
      Assign.new(
        :y,
        Multiply.new(
          Variable.new(:x),
          Number.new(3)
        )
      )
    )

    puts ast.evaluate(environment)
  56. 90.

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      second.evaluate(first.evaluate(environment))
    end

    Remaining AST labels: Number, Multiply, Variable, Number, :x, :y, :x, 3, 2
  57. 91.

    def evaluate(environment)
      left.evaluate(environment) * right.evaluate(environment)
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      second.evaluate(first.evaluate(environment))
    end

    Remaining AST labels: Number, Variable, Number, :x, :y, :x, 3, 2
  58. 92.

    def evaluate(environment)
      value
    end

    def evaluate(environment)
      value
    end

    def evaluate(environment)
      environment[name]
    end

    def evaluate(environment)
      left.evaluate(environment) * right.evaluate(environment)
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      second.evaluate(first.evaluate(environment))
    end

    Remaining AST labels: :x, :y, :x, 3, 2
  59. 93.

    def evaluate(environment)
      2
    end

    def evaluate(environment)
      3
    end

    def evaluate(environment)
      environment[:x]
    end

    def evaluate(environment)
      left.evaluate(environment) * right.evaluate(environment)
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      second.evaluate(first.evaluate(environment))
    end

    Remaining AST labels: :x, :y
  60. 94.

    def evaluate(environment)
      environment[:x] * 3
    end

    def evaluate(environment)
      environment.merge({ name => expression.evaluate(environment) })
    end

    def evaluate(environment)
      environment.merge({ name => 2 })
    end

    def evaluate(environment)
      second.evaluate(first.evaluate(environment))
    end

    Remaining AST labels: :x, :y
  61. 95.

    def evaluate(environment)
      environment.merge({ :y => environment[:x] * 3 })
    end

    def evaluate(environment)
      environment.merge({ :x => 2 })
    end

    def evaluate(environment)
      second.evaluate(first.evaluate(environment))
    end
  62. 96.

    def evaluate(environment)
      environment.merge({ :y => environment[:x] * 3 })
    end

    def evaluate(environment)
      second.evaluate(environment.merge({ :x => 2 }))
    end
  63. 99.

    environment = read_environment

    ast = Sequence.new(
      Assign.new(
        :x,
        Number.new(2)
      ),
      Assign.new(
        :y,
        Multiply.new(
          Variable.new(:x),
          Number.new(3)
        )
      )
    )

    puts ast.evaluate(environment)
  64. 100.

    environment = read_environment
    environment = environment.merge({ :x => 2 })
    puts environment.merge({ :y => environment[:x] * 3 })
  65. 101.

    environment = read_environment
    environment = environment.merge({ :x => 2 })
    puts environment.merge({ :y => environment[:x] * 3 })

    “x = 2; y = x * 3”
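
For comparison, the same residual code can be produced by hand. The sketch below is an assumption-laden illustration rather than the talk's method: `to_ruby` and `run_residual` are hypothetical names, only the node types used by `x = 2; y = x * 3` are handled, and the code generator is written manually, which is exactly the work a partial evaluator would do automatically from the interpreter.

```ruby
# Hypothetical hand-written generator that emits Ruby resembling the
# residual program above. Only the node types needed here are covered.
Number = Struct.new(:value) do
  def to_ruby
    value.inspect
  end
end

Variable = Struct.new(:name) do
  def to_ruby
    "environment[#{name.inspect}]"
  end
end

Multiply = Struct.new(:left, :right) do
  def to_ruby
    # Parentheses keep nested expressions unambiguous.
    "(#{left.to_ruby} * #{right.to_ruby})"
  end
end

Assign = Struct.new(:name, :expression) do
  def to_ruby
    "environment = environment.merge({ #{name.inspect} => #{expression.to_ruby} })"
  end
end

Sequence = Struct.new(:first, :second) do
  def to_ruby
    "#{first.to_ruby}\n#{second.to_ruby}"
  end
end

ast = Sequence.new(
  Assign.new(:x, Number.new(2)),
  Assign.new(:y, Multiply.new(Variable.new(:x), Number.new(3)))
)

residual_source = ast.to_ruby
# environment = environment.merge({ :x => 2 })
# environment = environment.merge({ :y => (environment[:x] * 3) })

# Run the generated code against a starting environment. The eval'd
# assignments update this method's `environment` parameter.
def run_residual(source, environment)
  eval(source)
  environment
end

result = run_residual(residual_source, {}) # => {:x=>2, :y=>6}
```

The generated text no longer mentions `Sequence`, `Assign` or any other AST class: the interpretive overhead is gone and only the work that depends on the environment remains.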
  66. 105.

    [Diagram. Labels: source program, interpreter, partial evaluator, residual program, target program, input, output.]
  67. 106.

    [Diagram. Labels: source program, interpreter, partial evaluator, residual program, target program, compiler, input, output.]
  68. 107.

    [Diagram. Labels: source program, interpreter, partial evaluator, residual program, target program, compiler, input, output. Caption: a compiler generator!]
  69. 109.

    [Diagram. Labels: interpreter, partial evaluator, compiler, source program, target program, input, output. Caption: a compiler generator!]
  70. 111.

    [Diagram. Labels: partial evaluator, residual program, interpreter, compiler, source program, target program, input, output.]
  71. 112.

    [Diagram. Labels: partial evaluator, residual program, interpreter, compiler, compiler generator, source program, target program, input, output.]
  72. 113.

    [Diagram. Labels: partial evaluator, residual program, interpreter, compiler, compiler generator, source program, target program, input, output. Caption: a compiler generator generator!]
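
The way these pieces plug together can be tried out with a stand-in. In the sketch below, `mix` is a deliberately trivial substitute for a partial evaluator: it only remembers its static input (plain partial application), so the programs it returns are correct but no faster. The toy `interpreter` and its `[operator, operand]` source format are also assumptions made for illustration. What the sketch does show is that feeding the partial evaluator an interpreter, then itself, then itself twice, yields things that behave like a target program, a compiler and a compiler generator.

```ruby
# Stand-in for a partial evaluator (assumption: no real specialization).
# Takes a two-input program and its first input, returns a one-input
# residual program.
mix = ->(program, static_input) do
  ->(dynamic_input) { program.call(static_input, dynamic_input) }
end

# Hypothetical toy interpreter: the source program is a list of
# [operator, operand] steps applied in order to the input.
interpreter = ->(source, input) do
  source.reduce(input) { |value, (operator, operand)| value.send(operator, operand) }
end

source = [[:*, 3], [:+, 1]] # "multiply by 3, then add 1"

# Interpreter + source program => target program
target = mix.call(interpreter, source)

# Partial evaluator + interpreter => compiler
compiler = mix.call(mix, interpreter)

# Partial evaluator + partial evaluator => compiler generator
compiler_generator = mix.call(mix, mix)

interpreter.call(source, 2)                                # => 7
target.call(2)                                             # => 7
compiler.call(source).call(2)                              # => 7
compiler_generator.call(interpreter).call(source).call(2)  # => 7
```

With a real partial evaluator in place of this `mix`, the same three calls would return specialized code instead of closures that merely defer to the interpreter.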
  73. 115.

    This is a fully general technique for generating compilers, so it only removes the interpretive overhead; it doesn’t invent new data structures or exploit anything language- or platform-specific.
  74. 116.
  75. 117.

    • “Partial Evaluation and Automatic Program Generation”: http://codon.io/pebook
    • LLVM’s JIT and the PyPy toolchain (formerly Psyco) use some of these program specialization techniques
    • Rubinius is built on top of LLVM
    • Topaz is a Ruby implementation built on top of the PyPy toolchain (RPython)
    • Rubinius and JRuby have interesting and accessible compilers
  76. 118.

    THE END

    Understanding Computation: From Simple Machines to Impossible Programs, by Tom Stuart
    http://computationbook.com/
    shop.oreilly.com discount code RUBYCONF (50% off ebook, 40% off print)
    @tomstuart / tom@codon.com