Slide 1

Slide 1 text

COMPILERS for FREE @tomstuart / RubyConf 2013 / 2013-11-09

Slide 2

Slide 2 text

I’m fascinated by metaprogramming. Let’s talk about metaprogramming.

Slide 3

Slide 3 text

As Ruby programmers we already intuitively understand the power of metaprogramming: programs that write programs.

Slide 4

Slide 4 text

But there's another thing that I find even more interesting: programs that manipulate representations of other programs.

Slide 5

Slide 5 text

I’d like to make you look at programs differently.

Slide 6

Slide 6 text

EXECUTING PROGRAMS

Slide 7

Slide 7 text

a program a machine

Slide 8

Slide 8 text

input 1 input 2 output input 3 ⋮

Slide 9

Slide 9 text

This only works if the program is written in a language that the machine understands.

Slide 10

Slide 10 text

If it’s written in an unfamiliar language, we need an interpreter or compiler.

Slide 11

Slide 11 text

INTERPRETERS

Slide 12

Slide 12 text

How do interpreters work?

Slide 13

Slide 13 text

• read some source code • build an AST by parsing the source code • evaluate the program by walking over the AST and performing instructions

Slide 14

Slide 14 text

SIMPLE (a simple imperative language) a = 19 + 23 x = 2; y = x * 3 if (a < 10) { a = 0; b = 0 } else { b = 10 } x = 1; while (x < 5) { x = x * 3 }

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

grammar Simple rule statement sequence end rule sequence first:sequenced_statement '; ' second:sequence / sequenced_statement end rule sequenced_statement while / assign / if end rule while 'while (' condition:expression ') { ' body:statement ' }' end rule assign name:[a-z]+ ' = ' expression end rule if

Slide 17

Slide 17 text

rule if 'if (' condition:expression ') { ' consequence:statement ' } else { ' alternative:statement ' }' end rule expression less_than end rule less_than left:add ' < ' right:less_than / add end rule add left:multiply ' + ' right:add / multiply end rule multiply left:term ' * ' right:multiply /

Slide 18

Slide 18 text

rule multiply left:term ' * ' right:multiply / term end rule term number / boolean / variable end rule number [0-9]+ end rule boolean ('true' / 'false') ![a-z] end rule variable [a-z]+ end end

Slide 19

Slide 19 text

>> Treetop.load('simple.treetop') => SimpleParser >> SimpleParser.new.parse('x = 2; y = x * 3') => SyntaxNode+Sequence1+Sequence0 offset=0, "x = 2; y = x * 3" (first,second): SyntaxNode+Assign1+Assign0 offset=0, "x = 2" (name,expression): SyntaxNode offset=0, "x": SyntaxNode offset=0, "x" SyntaxNode offset=1, " = " SyntaxNode+Number0 offset=4, "2": SyntaxNode offset=4, "2" SyntaxNode offset=5, "; " SyntaxNode+Assign1+Assign0 offset=7, "y = x * 3" (name,expression): SyntaxNode offset=7, "y": SyntaxNode offset=7, "y" SyntaxNode offset=8, " = " SyntaxNode+Multiply1+Multiply0 offset=11, "x * 3" (left,right): SyntaxNode+Variable0 offset=11, "x": SyntaxNode offset=11, "x" SyntaxNode offset=12, " * " SyntaxNode+Number0 offset=15, "3": SyntaxNode offset=15, "3"

Slide 20

Slide 20 text

Number = Struct.new :value Boolean = Struct.new :value Variable = Struct.new :name Add = Struct.new :left, :right Multiply = Struct.new :left, :right LessThan = Struct.new :left, :right Assign = Struct.new :name, :expression If = Struct.new :condition, :consequence, :alternative Sequence = Struct.new :first, :second While = Struct.new :condition, :body

Slide 21

Slide 21 text

rule number [0-9]+ { def to_ast Number.new(text_value.to_i) end } end rule boolean ('true' / 'false') ![a-z] { def to_ast Boolean.new(text_value == 'true') end } end rule variable [a-z]+ { def to_ast Variable.new(text_value.to_sym) end } end

Slide 22

Slide 22 text

rule while 'while (' condition:expression ') { ' body:statement ' }' { def to_ast While.new(condition.to_ast, body.to_ast) end } end rule assign name:[a-z]+ ' = ' expression { def to_ast Assign.new(name.text_value.to_sym, expression.to_ast) end } end rule if 'if (' condition:expression ') { ' consequence:statement ' } else { ' alternative:statement ' }' { def to_ast If.new(condition.to_ast, consequence.to_ast, alternative.to_ast) end } end

Slide 23

Slide 23 text

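>> SimpleParser.new.parse('x = 2; y = x * 3').to_ast => #<struct Sequence first=#<struct Assign name=:x, expression=#<struct Number value=2>>, second=#<struct Assign name=:y, expression=#<struct Multiply left=#<struct Variable name=:x>, right=#<struct Number value=3>>>>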

Slide 24

Slide 24 text

[AST diagram for “x = 2; y = x * 3”] Sequence(first: Assign(:x, Number(2)), second: Assign(:y, Multiply(Variable(:x), Number(3))))

Slide 25

Slide 25 text

class Number def evaluate(environment) value end end class Boolean def evaluate(environment) value end end class Variable def evaluate(environment) environment[name] end end

Slide 26

Slide 26 text

>> Number.new(3).evaluate({}) => 3 >> Boolean.new(false).evaluate({}) => false >> Variable.new(:y).evaluate({ x: 7, y: 11 }) => 11 >> Variable.new(:y).evaluate({ x: 7, y: true }) => true

Slide 27

Slide 27 text

class Add def evaluate(environment) left.evaluate(environment) + right.evaluate(environment) end end class Multiply def evaluate(environment) left.evaluate(environment) * right.evaluate(environment) end end class LessThan def evaluate(environment) left.evaluate(environment) < right.evaluate(environment) end end

Slide 28

Slide 28 text

>> Multiply.new(Variable.new(:x), Variable.new(:y)). evaluate({ x: 2, y: 3 }) => 6 >> LessThan.new(Variable.new(:x), Variable.new(:y)). evaluate({ x: 2, y: 3 }) => true

Slide 29

Slide 29 text

class Assign def evaluate(environment) environment.merge({ name => expression.evaluate(environment) }) end end class If def evaluate(environment) if condition.evaluate(environment) consequence.evaluate(environment) else alternative.evaluate(environment) end end end class Sequence def evaluate(environment) second.evaluate(first.evaluate(environment)) end end class While def evaluate(environment) if condition.evaluate(environment) evaluate(body.evaluate(environment)) else environment end end end

Slide 30

Slide 30 text

>> Assign.new(:x, Number.new(1)).evaluate({}) => {:x=>1} >> Sequence.new( Assign.new(:x, Number.new(1)), Assign.new(:x, Number.new(2)) ).evaluate({}) => {:x=>2} >> SimpleParser.new.parse('x = 2; y = x * 3'). to_ast.evaluate({}) => {:x=>2, :y=>6} >> SimpleParser.new.parse('x = 1; while (x < 5) { x = x * 3 }'). to_ast.evaluate({}) => {:x=>9}
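The evaluate methods from the preceding slides can also be exercised without Treetop by building the AST by hand. What follows is a minimal, self-contained sketch under that assumption: the `SimpleSketch` module and the `PROGRAM` constant are hypothetical names used only to keep the example isolated, and only the node types needed for the `while` example are included.

```ruby
# Hypothetical wrapper module; the talk defines these classes at top level.
module SimpleSketch
  Number = Struct.new(:value) do
    def evaluate(_environment)
      value
    end
  end

  Variable = Struct.new(:name) do
    def evaluate(environment)
      environment[name]
    end
  end

  Multiply = Struct.new(:left, :right) do
    def evaluate(environment)
      left.evaluate(environment) * right.evaluate(environment)
    end
  end

  LessThan = Struct.new(:left, :right) do
    def evaluate(environment)
      left.evaluate(environment) < right.evaluate(environment)
    end
  end

  # Statements take an environment and return a new one.
  Assign = Struct.new(:name, :expression) do
    def evaluate(environment)
      environment.merge(name => expression.evaluate(environment))
    end
  end

  Sequence = Struct.new(:first, :second) do
    def evaluate(environment)
      second.evaluate(first.evaluate(environment))
    end
  end

  While = Struct.new(:condition, :body) do
    def evaluate(environment)
      if condition.evaluate(environment)
        evaluate(body.evaluate(environment))
      else
        environment
      end
    end
  end

  # Hand-built AST for: x = 1; while (x < 5) { x = x * 3 }
  PROGRAM = Sequence.new(
    Assign.new(:x, Number.new(1)),
    While.new(
      LessThan.new(Variable.new(:x), Number.new(5)),
      Assign.new(:x, Multiply.new(Variable.new(:x), Number.new(3)))
    )
  )
end

# Prints the final environment, in which x is bound to 9.
p SimpleSketch::PROGRAM.evaluate({})
```

Because every statement returns a fresh hash from `merge`, the environment passed in is never mutated, which is what lets `Sequence` and `While` simply thread environments through.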

Slide 31

Slide 31 text

Interpreters provide single-stage execution.

Slide 32

Slide 32 text

source program input output interpreter “run time”

Slide 33

Slide 33 text

COMPILERS

Slide 34

Slide 34 text

How do compilers work?

Slide 35

Slide 35 text

• read some source code • build an AST by parsing the source code • generate target code by walking over the AST and emitting instructions

Slide 36

Slide 36 text

require 'json' class Number def to_javascript "function (e) { return #{JSON.dump(value)}; }" end end class Boolean def to_javascript "function (e) { return #{JSON.dump(value)}; }" end end class Variable def to_javascript "function (e) { return e[#{JSON.dump(name)}]; }" end end

Slide 37

Slide 37 text

class Add def to_javascript "function (e) { return #{left.to_javascript}(e) + #{right.to_javascript}(e); }" end end class Multiply def to_javascript "function (e) { return #{left.to_javascript}(e) * #{right.to_javascript}(e); }" end end class LessThan def to_javascript "function (e) { return #{left.to_javascript}(e) < #{right.to_javascript}(e); }" end end

Slide 38

Slide 38 text

class Assign def to_javascript "function (e) { e[#{JSON.dump(name)}] = #{expression.to_javascript}(e); return e; }" end end class If def to_javascript "function (e) { if (#{condition.to_javascript}(e))" + " { return #{consequence.to_javascript}(e); }" + " else { return #{alternative.to_javascript}(e); }" + ' }' end end class Sequence def to_javascript "function (e) { return #{second.to_javascript}(#{first.to_javascript}(e)); }" end end class While def to_javascript 'function (e) {' + " while (#{condition.to_javascript}(e)) { e = #{body.to_javascript}(e); }" + ' return e;' + ' }' end end

Slide 39

Slide 39 text

>> SimpleParser.new.parse('x = 1; while (x < 5) { x = x * 3 }'). to_ast.to_javascript => "function (e) { return function (e) { while (function (e) { return function (e) { return e[\"x\"]; }(e) < function (e) { return 5; }(e); }(e)) { e = function (e) { e[\"x\"] = function (e) { return function (e) { return e[\"x\"]; }(e) * function (e) { return 3; }(e); }(e); return e; }(e); } return e; }(function (e) { e[\"x\"] = function (e) { return 1; }(e); return e; }(e)); }"

Slide 40

Slide 40 text

> program = function (e) { return function (e) { while (function (e) { return function (e) { return e["x"]; }(e) < function (e) { return 5; }(e); }(e)) { e = function (e) { e["x"] = function (e) { return function (e) { return e["x"]; }(e) * function (e) { return 3; }(e); }(e); return e; }(e); } return e; }(function (e) { e["x"] = function (e) { return 1; }(e); return e; }(e)); } [Function] > program({}) { x: 9 }

Slide 41

Slide 41 text

Compilers provide two-stage execution.

Slide 42

Slide 42 text

output source program input target program compiler target program “compile time” “run time”
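The two stages can be seen in Ruby alone with a small substitution: instead of emitting JavaScript source, the sketch below "compiles" each AST node to a Ruby lambda. This is closure compilation, not the talk's `to_javascript` approach, and the `ClosureCompiler` module, the `compile` method, and the `TARGET` constant are hypothetical names. The point it illustrates is that the AST is walked once at compile time, and the resulting target program never touches the AST again at run time.

```ruby
# Hypothetical sketch: the target "language" here is Ruby lambdas, not JavaScript.
module ClosureCompiler
  Number = Struct.new(:value) do
    def compile
      constant = value              # AST is read now, at compile time
      ->(_e) { constant }
    end
  end

  Variable = Struct.new(:name) do
    def compile
      key = name
      ->(e) { e[key] }
    end
  end

  Multiply = Struct.new(:left, :right) do
    def compile
      l, r = left.compile, right.compile
      ->(e) { l.call(e) * r.call(e) }
    end
  end

  LessThan = Struct.new(:left, :right) do
    def compile
      l, r = left.compile, right.compile
      ->(e) { l.call(e) < r.call(e) }
    end
  end

  Assign = Struct.new(:name, :expression) do
    def compile
      key, rhs = name, expression.compile
      ->(e) { e.merge(key => rhs.call(e)) }
    end
  end

  Sequence = Struct.new(:first, :second) do
    def compile
      a, b = first.compile, second.compile
      ->(e) { b.call(a.call(e)) }
    end
  end

  While = Struct.new(:condition, :body) do
    def compile
      check, step = condition.compile, body.compile
      ->(e) do
        e = step.call(e) while check.call(e)
        e
      end
    end
  end

  # x = 1; while (x < 5) { x = x * 3 }
  AST = Sequence.new(
    Assign.new(:x, Number.new(1)),
    While.new(
      LessThan.new(Variable.new(:x), Number.new(5)),
      Assign.new(:x, Multiply.new(Variable.new(:x), Number.new(3)))
    )
  )

  TARGET = AST.compile              # "compile time": happens once
end

p ClosureCompiler::TARGET.call({})  # "run time": can happen many times
```

The lambdas still carry call overhead, so this is only a structural illustration of the compile-time and run-time split, not a claim about the speed-up a real compiler would give.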

Slide 43

Slide 43 text

The good news: compiled programs are faster. • removes interpretive overhead at run time • other performance benefits e.g. clever data structures, clever optimizations

Slide 44

Slide 44 text

The bad news: compilation is harder. • have to think about two times (compile time and run time) instead of one • have to implement in two languages instead of one • compiling dynamic languages is challenging

Slide 45

Slide 45 text

Writing an interpreter is easier, but interpreters are slower.

Slide 46

Slide 46 text

PARTIAL EVALUATORS

Slide 47

Slide 47 text

A partial evaluator is a cross between an interpreter and a compiler.

Slide 48

Slide 48 text

• interpreters: execute now • compilers: generate now, execute later • partial evaluators: execute some now, leave the rest to execute later

Slide 49

Slide 49 text

You give a partial evaluator your subject program and some of its inputs. It evaluates only the parts of the program that depend only on those inputs. What’s left is called the residual program.

Slide 50

Slide 50 text

input 1 input 2 output subject program input 3 ⋮

Slide 51

Slide 51 text

input 1 ⋮ output residual program partial evaluator residual program subject program input 2 input 3

Slide 52

Slide 52 text

A partial evaluator splits single-stage execution into two stages by timeshifting the processing of some input from the future to the present.

Slide 53

Slide 53 text

How does a partial evaluator work?

Slide 54

Slide 54 text

• read some source code • build an AST by parsing the source code • read some of the program’s inputs • analyse the program by walking over the AST to find the places where known inputs are used • partially evaluate the program by walking over the AST, evaluating fragments of code where possible and generating new code to replace them • emit a residual program
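The "evaluate where possible, otherwise generate new code" step can be sketched for SIMPLE-style arithmetic expressions. This is a minimal illustration rather than a full partial evaluator: it handles expressions only, it skips the separate analysis pass by deciding what is known while it walks the tree (often called online partial evaluation), and the `Specializer` module, the `Binary` mixin, and the `specialize` method are hypothetical names.

```ruby
# Hypothetical sketch of partial evaluation over expression ASTs.
module Specializer
  Number = Struct.new(:value) do
    def specialize(_known)
      self                                  # already fully evaluated
    end
  end

  Variable = Struct.new(:name) do
    def specialize(known)
      # A known input is replaced by its value; an unknown one stays as code.
      known.key?(name) ? Number.new(known[name]) : self
    end
  end

  # Shared rule for binary operators: compute now if both operands reduced
  # to numbers, otherwise emit a residual node built from the reduced parts.
  module Binary
    def specialize(known)
      l = left.specialize(known)
      r = right.specialize(known)
      if l.is_a?(Number) && r.is_a?(Number)
        Number.new(l.value.public_send(operator, r.value))
      else
        self.class.new(l, r)
      end
    end
  end

  Add = Struct.new(:left, :right) do
    include Binary
    def operator
      :+
    end
  end

  Multiply = Struct.new(:left, :right) do
    include Binary
    def operator
      :*
    end
  end

  # Subject program: x * (n + 1)
  SUBJECT = Multiply.new(Variable.new(:x), Add.new(Variable.new(:n), Number.new(1)))
end

# With only n known, the residual program is x * 5.
p Specializer::SUBJECT.specialize(n: 4)
```

Supplying every input (`specialize(n: 4, x: 2)`) reduces the whole tree to a single `Number`, which is the interpreter end of the spectrum; supplying none returns an equivalent tree, which is the compiler end.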

Slide 55

Slide 55 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end power(5, …)

Slide 56

Slide 56 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end power(5, …)

Slide 57

Slide 57 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) if false 1 else x * power(5 - 1, x) end end power(5, …)

Slide 58

Slide 58 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * power(5 - 1, x) end power(5, …)

Slide 59

Slide 59 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * power(4, x) end power(5, …)

Slide 60

Slide 60 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * if 4.zero? 1 else x * power(4 - 1, x) end end power(5, …)

Slide 61

Slide 61 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * if false 1 else x * power(4 - 1, x) end end power(5, …)

Slide 62

Slide 62 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * power(4 - 1, x) end power(5, …)

Slide 63

Slide 63 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * power(3, x) end power(5, …)

Slide 64

Slide 64 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * power(2, x) end power(5, …)

Slide 65

Slide 65 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * x * power(1, x) end power(5, …)

Slide 66

Slide 66 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * x * x * power(0, x) end power(5, …)

Slide 67

Slide 67 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * x * x * if 0.zero? 1 else x * power(0 - 1, x) end end power(5, …)

Slide 68

Slide 68 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * x * x * if true 1 else x * power(0 - 1, x) end end power(5, …)

Slide 69

Slide 69 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * x * x * 1 end power(5, …)

Slide 70

Slide 70 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * x * x * 1 end power(5, …)

Slide 71

Slide 71 text

def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) x * x * x * x * x end power(5, …)
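The unfolding walked through on the preceding slides can be mimicked by a tiny hand-written specializer for `power` alone. This is a sketch of the idea, not a general partial evaluator: it mirrors the shape of `power(n, x)`, decides the `if n.zero?` test immediately because `n` is known, and emits source text for the multiplications because `x` is not. The `PowerSpecializer` module and its method names are hypothetical, and the residual program is produced as a lambda rather than a `def` so that it is easy to evaluate and call.

```ruby
# Hypothetical sketch: specializes power(n, x) for a known n.
module PowerSpecializer
  module_function

  # Same recursion as power(n, x), but each step emits code instead of
  # multiplying. Like the original, it assumes n is a non-negative integer.
  def residual_body(n)
    if n.zero?
      '1'
    else
      "x * #{residual_body(n - 1)}"
    end
  end

  # Wraps the residual expression as Ruby source for a one-argument lambda.
  def residual_source(n)
    "->(x) { #{residual_body(n)} }"
  end
end

source = PowerSpecializer.residual_source(5)
puts source                # prints ->(x) { x * x * x * x * x * 1 }
power_5 = eval(source)
puts power_5.call(2)       # prints 32
```

The generated body matches the `x * x * x * x * x * 1` stage from the slides; dropping the trailing `* 1` is a further algebraic simplification that this sketch does not attempt.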

Slide 72

Slide 72 text

This isn’t the same as partial application: def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end def power_5(x) power(5, x) end >> power_5(2) => 32

Slide 73

Slide 73 text

This isn’t the same as partial application: def power(n, x) if n.zero? 1 else x * power(n - 1, x) end end power_5 = method(:power).to_proc.curry.call(5) >> power_5.call(2) => 32

Slide 74

Slide 74 text

APPLICATIONS

Slide 75

Slide 75 text

Partial evaluation can take a general program and specialize it for a fixed input. • specialize a web server to a particular config file • specialize a general ray tracer to a particular scene • specialize the OS X OpenGL pipeline to a particular GPU

Slide 76

Slide 76 text

COOL STORY

Slide 77

Slide 77 text

In 1971, Yoshihiko Futamura realised something cool.

Slide 78

Slide 78 text

input 1 input 2 output subject program input 3 ⋮

Slide 79

Slide 79 text

input 1 ⋮ output residual program partial evaluator residual program subject program input 2 input 3

Slide 80

Slide 80 text

source program input output interpreter

Slide 81

Slide 81 text

source program output residual program partial evaluator residual program interpreter input

Slide 82

Slide 82 text

source program output residual program partial evaluator residual program interpreter input target program target program

Slide 83

Slide 83 text

source program output residual program partial evaluator residual program interpreter input target program target program a compiler!

Slide 84

Slide 84 text

source, environment = read_source, read_environment Treetop.load('simple.treetop') ast = SimpleParser.new.parse(source).to_ast puts ast.evaluate(environment)

Slide 85

Slide 85 text

source, environment = 'x = 2; y = x * 3', read_environment Treetop.load('simple.treetop') ast = SimpleParser.new.parse(source).to_ast puts ast.evaluate(environment)

Slide 86

Slide 86 text

environment = read_environment Treetop.load('simple.treetop') ast = SimpleParser.new.parse('x = 2; y = x * 3').to_ast puts ast.evaluate(environment)

Slide 87

Slide 87 text

environment = read_environment ast = Sequence.new( Assign.new( :x, Number.new(2) ), Assign.new( :y, Multiply.new( Variable.new(:x), Number.new(3) ) ) ) puts ast.evaluate(environment)

Slide 88

Slide 88 text

[AST diagram] Sequence(first: Assign(:x, Number(2)), second: Assign(:y, Multiply(Variable(:x), Number(3))))

Slide 89

Slide 89 text

def evaluate(environment) second.evaluate( first.evaluate(environment) ) end Assign Assign Number Multiply Variable Number :x :y :x 3 2

Slide 90

Slide 90 text

def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) second.evaluate( first.evaluate(environment) ) end Number Multiply Variable Number :x :y :x 3 2

Slide 91

Slide 91 text

def evaluate(environment) left.evaluate(environment) * right.evaluate(environment) end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) second.evaluate( first.evaluate(environment) ) end Number Variable Number :x :y :x 3 2

Slide 92

Slide 92 text

def evaluate(environment) value end def evaluate(environment) value end def evaluate(environment) environment[name] end def evaluate(environment) left.evaluate(environment) * right.evaluate(environment) end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) second.evaluate( first.evaluate(environment) ) end :x :y :x 3 2

Slide 93

Slide 93 text

def evaluate(environment) 2 end def evaluate(environment) 3 end def evaluate(environment) environment[:x] end def evaluate(environment) left.evaluate(environment) * right.evaluate(environment) end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) second.evaluate( first.evaluate(environment) ) end :x :y

Slide 94

Slide 94 text

def evaluate(environment) environment[:x] * 3 end def evaluate(environment) environment.merge({ name => expression. evaluate(environment) }) end def evaluate(environment) environment.merge({ name => 2 }) end def evaluate(environment) second.evaluate( first.evaluate(environment) ) end :x :y

Slide 95

Slide 95 text

def evaluate(environment) environment.merge({ :y => environment[:x] * 3 }) end def evaluate(environment) environment.merge({ :x => 2 }) end def evaluate(environment) second.evaluate( first.evaluate(environment) ) end

Slide 96

Slide 96 text

def evaluate(environment) environment.merge({ :y => environment[:x] * 3 }) end def evaluate(environment) second.evaluate( environment.merge({ :x => 2 }) ) end

Slide 97

Slide 97 text

def evaluate(environment) environment = environment.merge({ :x => 2 }) environment.merge({ :y => environment[:x] * 3 }) end

Slide 98

Slide 98 text

def evaluate(environment) environment = environment.merge({ :x => 2 }) environment.merge({ :y => environment[:x] * 3 }) end

Slide 99

Slide 99 text

environment = read_environment ast = Sequence.new( Assign.new( :x, Number.new(2) ), Assign.new( :y, Multiply.new( Variable.new(:x), Number.new(3) ) ) ) puts ast.evaluate(environment)

Slide 100

Slide 100 text

environment = read_environment environment = environment.merge({ :x => 2 }) puts environment.merge({ :y => environment[:x] * 3 })

Slide 101

Slide 101 text

environment = read_environment environment = environment.merge({ :x => 2 }) puts environment.merge({ :y => environment[:x] * 3 }) “x = 2; y = x * 3”

Slide 102

Slide 102 text

target = partial_evaluator(interpreter, source) FIRST FUTAMURA PROJECTION

Slide 103

Slide 103 text

source program output partial evaluator interpreter input target program target program a compiler!

Slide 104

Slide 104 text

source program partial evaluator interpreter target program

Slide 105

Slide 105 text

source program interpreter residual program target program output input partial evaluator residual program target program partial evaluator

Slide 106

Slide 106 text

source program interpreter residual program target program output input partial evaluator residual program target program partial evaluator compiler compiler

Slide 107

Slide 107 text

source program interpreter residual program target program output input partial evaluator residual program target program partial evaluator compiler compiler a compiler generator!

Slide 108

Slide 108 text

compiler = partial_evaluator(partial_evaluator, interpreter) SECOND FUTAMURA PROJECTION

Slide 109

Slide 109 text

interpreter partial evaluator partial evaluator compiler source program target program output input target program compiler a compiler generator!

Slide 110

Slide 110 text

interpreter partial evaluator partial evaluator compiler

Slide 111

Slide 111 text

partial evaluator partial evaluator partial evaluator residual program residual program interpreter target program compiler source program output target program input compiler

Slide 112

Slide 112 text

partial evaluator partial evaluator partial evaluator residual program residual program interpreter target program compiler source program output target program input compiler compiler generator compiler generator

Slide 113

Slide 113 text

partial evaluator partial evaluator partial evaluator residual program residual program interpreter target program compiler source program output target program input compiler compiler generator compiler generator a compiler generator generator!

Slide 114

Slide 114 text

compiler_generator = partial_evaluator(partial_evaluator, partial_evaluator) THIRD FUTAMURA PROJECTION
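The shape of all three projections can be checked in runnable Ruby with a stand-in partial evaluator. An important caveat: the `MIX` lambda below is deliberately trivial. It only remembers its static argument and does no specialization at all, which makes it partial application, exactly what the talk says partial evaluation is not. It removes no interpretive overhead; it only shows that the three equations compose and agree with running the interpreter directly. The `Futamura` module, the toy list-based source format, and every constant name are hypothetical.

```ruby
# Hypothetical sketch: a do-nothing specializer that still satisfies the
# defining equation mix(program, static).call(dynamic) == program.call(static, dynamic).
module Futamura
  MIX = ->(program, static) { ->(dynamic) { program.call(static, dynamic) } }

  # Toy expressions: an Integer, a Symbol (variable), or a pair [a, b] meaning a * b.
  EVALUATE = lambda do |expr, env|
    case expr
    when Integer then expr
    when Symbol  then env.fetch(expr)
    else
      left, right = expr
      EVALUATE.call(left, env) * EVALUATE.call(right, env)
    end
  end

  # Toy interpreter: source is a list of [name, expression] assignments.
  INTERPRETER = lambda do |source, input|
    source.reduce(input) do |env, (name, expr)|
      env.merge(name => EVALUATE.call(expr, env))
    end
  end

  SOURCE = [[:x, 2], [:y, [:x, 3]]]          # x = 2; y = x * 3

  TARGET             = MIX.call(INTERPRETER, SOURCE)  # first projection
  COMPILER           = MIX.call(MIX, INTERPRETER)     # second projection
  COMPILER_GENERATOR = MIX.call(MIX, MIX)             # third projection
end

direct = Futamura::INTERPRETER.call(Futamura::SOURCE, {})
p direct == Futamura::TARGET.call({})
p direct == Futamura::COMPILER.call(Futamura::SOURCE).call({})
p direct == Futamura::COMPILER_GENERATOR.call(Futamura::INTERPRETER).call(Futamura::SOURCE).call({})
```

All three comparisons print `true`. With a real partial evaluator in place of `MIX`, the same calls would return residual programs with the interpreter's work on the source already done.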

Slide 115

Slide 115 text

This is a fully general technique for generating compilers, so it only removes the interpretive overhead — it doesn’t invent new data structures or exploit anything language- or platform-specific.

Slide 116

Slide 116 text

MORE‽

Slide 117

Slide 117 text

• “Partial Evaluation and Automatic Program Generation”: http://codon.io/pebook • LLVM’s JIT and the PyPy toolchain (formerly Psyco) use some of these program specialization techniques • Rubinius is built on top of LLVM • Topaz is a Ruby implementation built on top of the PyPy toolchain (RPython) • Rubinius and JRuby have interesting and accessible compilers

Slide 118

Slide 118 text

THE END Understanding Computation: From Simple Machines to Impossible Programs, Tom Stuart http://computationbook.com/ @tomstuart / [email protected] shop.oreilly.com discount code RUBYCONF (50% off ebook, 40% off print)