fancycompleter, ...
Consultant, trainer
You can hire me :-)
http://antocuni.eu

antocuni (PyCon UK 2012) PyPy JIT under the hood September 28, 2012 2 / 29

ideal for writing VMs
JIT & GC for free
Python interpreter written in RPython
Whatever (dynamic) language you want
    smalltalk, prolog, javascript, ...

accounts for 80% of the runtime
    hot-spots
Fast Path principle
    optimize only what is necessary
    fall back for uncommon cases
Most of the runtime is spent in loops
Always the same code paths (likely)

Tracing phase
    linear trace
Compiling
Execute
    guards to ensure correctness
Profit :-)
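
The phases above can be sketched in a few lines of Python. This is only a toy model, not PyPy's implementation: `HOT_THRESHOLD`, `run_loop` and the closure standing in for the compiled trace are all hypothetical names, and "compiling" is reduced to building a Python function with an explicit guard.

```python
HOT_THRESHOLD = 3  # assumed value for the demo; real JITs use a larger counter


def run_loop(n):
    """Sum 0..n-1, switching to a 'compiled' trace once the loop is hot."""
    counter = 0       # how many times the loop header was reached
    compiled = None   # stands in for the compiled trace
    phases = []       # which phase handled each iteration
    i = total = 0
    while i < n:
        if compiled is not None:
            # Execute phase: run the trace, which checks its own guard.
            i, total = compiled(i, total)
            phases.append("compiled")
            continue
        counter += 1
        if counter == HOT_THRESHOLD:
            # Tracing + compiling, collapsed into defining a closure.
            phases.append("tracing")

            def compiled(i, total):
                if type(i) is not int:  # guard: same type as when traced
                    raise TypeError("guard failed")
                return i + 1, total + i
        else:
            phases.append("interpreting")
        # The interpreter still executes this iteration itself.
        total += i
        i += 1
    return total, phases
```

Calling `run_loop(5)` goes through two interpreted iterations, one traced iteration, and two iterations in the "compiled" closure.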

[Diagram: control flow between the phases of the tracing JIT. Recoverable edge labels: "cold guard failed", "entering compiled loop", "guard failure → hot", "hot guard failed"]

    // The start of this listing was cut off; the Operation interface
    // below is reconstructed from how it is used and is an assumption.
    interface Operation {
        int DoSomething(int x);
    }

    class IncrOrDecr implements Operation {
        public int DoSomething(int x) {
            if (x < 0) return x-1;
            else return x+1;
        }
    }

    class tracing {
        public static void main(String argv[]) {
            int N = 100;
            int i = 0;
            Operation op = new IncrOrDecr();
            while (i < N) {
                i = op.DoSomething(i);
            }
            System.out.println(i);
        }
    }

Instruction added to the trace but not executed

Method       Java code                Trace                      Value
Main         while (i < N) {          ILOAD 2                    3
                                      ILOAD 1                    100
                                      IF_ICMPGE LABEL_1          false
                                      GUARD_ICMPLT
             i = op.DoSomething(i);   ALOAD 3                    IncrOrDecr obj
                                      ILOAD 2                    3
                                      INVOKEINTERFACE ...
                                      GUARD_CLASS(IncrOrDecr)
DoSomething  if (x < 0)               ILOAD 1                    3
                                      IFGE LABEL_0               true
                                      GUARD_GE
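
A rough Python sketch of how such a trace could be recorded for one iteration of the loop. The guard names follow the table above; `record_iteration`, the tuple format and the final `IADD`/`ISUB` entries are assumptions made for illustration (the table stops at `GUARD_GE`).

```python
class IncrOrDecr:
    def do_something(self, x):
        return x - 1 if x < 0 else x + 1


def record_iteration(op, i, n):
    """Run one loop iteration and return (new_i, trace).

    Branches and the interface call are not recorded as such:
    each one is replaced by a guard on the outcome seen while tracing.
    """
    if not i < n:
        raise ValueError("tracing must start inside the loop")
    trace = [("GUARD_ICMPLT", "i", "N")]        # while (i < N) was true
    # The receiver's class is pinned by a guard, so the call can be inlined.
    trace.append(("GUARD_CLASS", "op", type(op).__name__))
    if i < 0:
        trace.append(("GUARD_LT", "x", 0))      # if (x < 0) was true
        trace.append(("ISUB", "x", 1))
    else:
        trace.append(("GUARD_GE", "x", 0))      # if (x < 0) was false
        trace.append(("IADD", "x", 1))
    return op.do_something(i), trace
```

With `i = 3` and `N = 100`, as in the Value column, the recorded trace is linear: three guards followed by a single addition.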

        a = 0;
        int i = 0;
        int N = 100;
        while(i < N) {
            if (i%2 == 0)
                a++;
            else
                a*=2;
            i++;
        }
    }
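
The same loop in Python, instrumented to count how often a trace recorded on the `i % 2 == 0` path would hit a failing guard. This is a sketch under the assumption that the slide is about guards that fail often; `run` and `guard_failures` are hypothetical names.

```python
def run(n=100):
    """Return (a, guard_failures) for a trace recorded on the even path."""
    a = 0
    guard_failures = 0
    for i in range(n):
        if i % 2 == 0:
            a += 1               # guard holds: stay on the traced path
        else:
            guard_failures += 1  # guard fails: leave the trace
            a *= 2
    return a, guard_failures
```

Half of the iterations leave the trace, so this guard quickly becomes hot.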

A   LOAD_GLOBAL   B   BINARY_ADD   STORE_FAST   C

    ...
    p0 = getfield_gc(p0, 'func_globals')
    p2 = getfield_gc(p1, 'strval')
    call(dict_lookup, p0, p2)
    ...

    i19 = int_add_ovf(i15, 2)
    guard_no_overflow()
    ...

optimized

    ...
    i17 = int_lt(i15, 5000)
    guard_true(i17)
    i19 = int_add(i15, 2)
    ...

It works often
array bound checking
intbound info propagates all over the trace
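
A minimal sketch of the intbound idea, assuming traces are lists of tuples shaped like `(opname, result, arg, const)`. The function name, the tuple format and `MAXINT` are hypothetical; only upper bounds learned from `int_lt` + `guard_true` are tracked.

```python
MAXINT = 2**63 - 1  # assumption: 64-bit signed machine integers


def optimize_intbounds(ops):
    """Drop overflow checks that a previously seen bound makes redundant."""
    upper = {}        # variable -> exclusive upper bound known to hold
    pending_lt = {}   # result of int_lt -> (variable, constant)
    drop_guard = False
    out = []
    for op in ops:
        name = op[0]
        if name == "int_lt":
            pending_lt[op[1]] = (op[2], op[3])
            out.append(op)
        elif name == "guard_true" and op[1] in pending_lt:
            var, const = pending_lt[op[1]]
            upper[var] = const  # past this guard, var < const
            out.append(op)
        elif name == "int_add_ovf":
            _, res, var, const = op
            safe = (var in upper and const >= 0
                    and upper[var] - 1 + const <= MAXINT)
            if safe:
                out.append(("int_add", res, var, const))
                drop_guard = True
            else:
                out.append(op)
        elif name == "guard_no_overflow" and drop_guard:
            drop_guard = False  # provably redundant, not emitted
        else:
            out.append(op)
    return out
```

Fed the four operations from the slide, it returns the three-operation "optimized" version; without the `int_lt`/`guard_true` pair the overflow check stays.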

    i2 = int_add(i1, 2)
    p3 = new(W_IntObject)
    setfield_gc(p3, i2, 'intval')
    ...

optimized

    ...
    i2 = int_add(i1, 2)
    ...

The most important optimization (TM)
It works both inside the trace and across the loop
It works for tons of cases
    e.g. function frames
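
The allocation removal above can be sketched as a small escape check over a trace. This is a simplification under stated assumptions: operations are tuples `(opname, ...)`, `setfield_gc` is written as `(op, target, field, value)`, and any other use of an allocated object counts as an escape. `remove_virtuals` is a hypothetical name.

```python
def remove_virtuals(ops):
    """Remove allocations (and their field writes) that never escape."""
    allocated = {op[1] for op in ops if op[0] == "new"}
    escaped = set()
    for op in ops:
        if op[0] == "new":
            continue
        # Writing INTO an object does not make it escape;
        # storing it as a value somewhere else does.
        args = op[3:] if op[0] == "setfield_gc" else op[1:]
        escaped.update(a for a in args if a in allocated)
    virtual = allocated - escaped
    return [op for op in ops
            if not (op[0] in ("new", "setfield_gc") and op[1] in virtual)]
```

A real optimizer would also forward field reads of a virtual to the stored value; here such a read is conservatively treated as an escape.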

    ... = getfield_pure(<W_Int(2)>, 'intval')
    i3 = int_add(i1, i2)
    ...

optimized

    ...
    i1 = getfield_pure(p0, 'intval')
    i3 = int_add(i1, 2)
    ...

It "finishes the job"
Works well together with other optimizations (e.g. virtuals)
It also does "normal, boring, static" constant-folding
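
A small sketch of constant folding over a trace, assuming every operation is a tuple `(opname, result, arg0, arg1)`. `Const` and `fold_constants` are hypothetical names; `Const` stands in for a constant object such as `<W_Int(2)>`, whose pure fields can be read at optimization time.

```python
class Const:
    """A constant object known to the optimizer (hypothetical helper)."""

    def __init__(self, **fields):
        self.fields = fields


def fold_constants(ops):
    known = {}  # variable -> constant value computed at optimization time
    out = []
    for name, res, a, b in ops:
        if name == "getfield_pure":
            if isinstance(a, Const):
                known[res] = a.fields[b]  # read once, now
                continue
        elif name == "int_add":
            # Substitute known constants for variables.
            a, b = known.get(a, a), known.get(b, b)
            if isinstance(a, int) and isinstance(b, int):
                known[res] = a + b        # plain static folding
                continue
        out.append((name, res, a, b))
    return out
```

On the slide's example the read from the constant object disappears and the addition takes the literal `2`.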

    p0 = getfield_gc(<Cell>, 'val')
    ...
    i2 = getfield_pure(p0, 'intval')
    i3 = int_add(i1, i2)

optimized

    ...
    guard_not_invalidated()
    ...
    i3 = int_add(i1, 2)
    ...

Python is too dynamic, but we don't care :-)
No overhead in assembler code
Used a bit "everywhere"
Credits to Mark Shannon for the name :-)
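
A toy model of the idea behind `guard_not_invalidated`: the check is moved from the loop to the (rare) write that breaks the assumption. `Cell`, `CompiledLoop` and their methods are hypothetical names. The explicit `valid` test in `run` exists only because this is plain Python; the slide states that the real guard costs nothing in the assembler code.

```python
class Cell:
    """Holds a value that compiled loops assume to be constant."""

    def __init__(self, val):
        self.val = val
        self.dependents = []  # loops compiled under that assumption

    def write(self, val):
        self.val = val
        # The work happens here, out of line, not inside the loop.
        for loop in self.dependents:
            loop.valid = False
        self.dependents.clear()


class CompiledLoop:
    def __init__(self, cell):
        self.constant = cell.val  # value baked into the fast path
        self.valid = True
        cell.dependents.append(self)

    def run(self, i1):
        if not self.valid:  # stands in for guard_not_invalidated()
            raise RuntimeError("invalidated: fall back to the interpreter")
        return i1 + self.constant  # i3 = int_add(i1, 2)
```

As long as nobody writes to the cell, `run` only does the addition; a single `write` invalidates every loop that depended on the old value.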