Slide 1

Slide 1 text

PyPy JIT under the hood
Antonio Cuni
Armin Rigo (guest star)
EuroPython 2012
July 4 2012
antocuni, arigo (EuroPython 2012) PyPy JIT under the hood July 4 2012 1 / 29

Slide 2

Slide 2 text

About me
  PyPy core dev
  PyPy py3k tech leader
  pdb++, fancycompleter, ...
  Consultant, trainer
  http://antocuni.eu

Slide 3

Slide 3 text

About this talk
  What is PyPy? (in 30 seconds)
    (for those who missed the keynote :-))
  Overview of tracing JITs
  The PyPy JIT generator
  Just In Time talk
    last-modified: July, 4th, 12:06

Slide 4

Slide 4 text

Part 0: What is PyPy?
  RPython toolchain
    subset of Python
    ideal for writing VMs
    JIT & GC for free
  Python interpreter
    written in RPython
  Whatever (dynamic) language you want
    smalltalk, prolog, javascript, ...

Slide 5

Slide 5 text

Part 1
Overview of tracing JITs

Slide 6

Slide 6 text

Compilers
  When?
    Batch or Ahead Of Time
    Just In Time
  How?
    Static
    Dynamic or Adaptive
  What?
    Method-based compiler
    Tracing compiler
  PyPy: JIT, Dynamic, Tracing

Slide 10

Slide 10 text

Assumptions
  Pareto Principle (80-20 rule)
    20% of the program accounts for 80% of the runtime
    hot-spots
  Fast Path principle
    optimize only what is necessary
    fall back for uncommon cases
  Most of the runtime is spent in loops
  Always the same code paths (likely)

Slide 12

Slide 12 text

Tracing JIT
  Interpret the program as usual
  Detect hot loops
  Tracing phase
    linear trace
  Compiling
  Execute
    guards to ensure correctness
  Profit :-)

Slide 20

Slide 20 text

Tracing JIT phases
  Phases: Interpretation, Tracing, Compilation, Running
  Transitions:
    Interpretation → Tracing: hot loop detected
    Tracing → Compilation → Running
    Interpretation → Running: entering compiled loop
    Running → Interpretation: cold guard failed
    Running → Tracing: hot guard failed (guard failure → hot)
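
The phase transitions above can be modelled with two counters, one per loop header and one per guard. This is a toy sketch, not PyPy code: the class name, the method names and both thresholds are made up for illustration, and tracing plus compilation are collapsed into a single step.

```python
from collections import Counter

HOT_LOOP_THRESHOLD = 3   # hypothetical value, chosen small for the example
HOT_GUARD_THRESHOLD = 2  # hypothetical value


class PhaseTracker:
    """Toy model of the phase transitions; it does not compile anything."""

    def __init__(self):
        self.loop_counts = Counter()     # loop id -> times seen while interpreting
        self.guard_failures = Counter()  # guard id -> times it has failed
        self.compiled = set()            # loops that already have a compiled trace

    def on_loop_header(self, loop_id):
        """Called each time the interpreter reaches a loop header."""
        if loop_id in self.compiled:
            return "running"             # entering compiled loop
        self.loop_counts[loop_id] += 1
        if self.loop_counts[loop_id] >= HOT_LOOP_THRESHOLD:
            self.compiled.add(loop_id)   # tracing, then compilation
            return "tracing"             # hot loop detected
        return "interpretation"

    def on_guard_failure(self, guard_id):
        """Called each time a guard in compiled code fails."""
        self.guard_failures[guard_id] += 1
        if self.guard_failures[guard_id] >= HOT_GUARD_THRESHOLD:
            return "tracing"             # hot guard failed
        return "interpretation"          # cold guard failed
```

With these thresholds, the third visit to a loop header triggers tracing and the fourth one runs compiled code.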

Slide 21

Slide 21 text

Tracing Example (1)
java
    interface Operation {
        int DoSomething(int x);
    }

    class IncrOrDecr implements Operation {
        public int DoSomething(int x) {
            if (x < 0) return x-1;
            else return x+1;
        }
    }

    class tracing {
        public static void main(String argv[]) {
            int N = 100;
            int i = 0;
            Operation op = new IncrOrDecr();
            while (i < N) {
                i = op.DoSomething(i);
            }
            System.out.println(i);
        }
    }

Slide 22

Slide 22 text

Tracing Example (2)
Java bytecode
    class IncrOrDecr {
      ...
      public DoSomething(I)I
        ILOAD 1
        IFGE LABEL_0
        ILOAD 1
        ICONST_1
        ISUB
        IRETURN
       LABEL_0
        ILOAD 1
        ICONST_1
        IADD
        IRETURN
    }
Java bytecode
    class tracing {
      ...
      public static main([Ljava/lang/String;)V
        ...
       LABEL_0
        ILOAD 2
        ILOAD 1
        IF_ICMPGE LABEL_1
        ALOAD 3
        ILOAD 2
        INVOKEINTERFACE Operation.DoSomething (I)I
        ISTORE 2
        GOTO LABEL_0
       LABEL_1
        ...
    }

Slide 29

Slide 29 text

Tracing example (3)
  Legend (two INSTR styles): instruction executed but not recorded; instruction added to the trace but not executed

  Method       Java code                 Trace                      Value
  Main         while (i < N) {           ILOAD 2                    3
                                         ILOAD 1                    100
                                         IF_ICMPGE LABEL_1          false
                                         GUARD_ICMPLT
               i = op.DoSomething(i);    ALOAD 3                    IncrOrDecr obj
                                         ILOAD 2                    3
                                         INVOKEINTERFACE ...
                                         GUARD_CLASS(IncrOrDecr)
  DoSomething  if (x < 0)                ILOAD 1                    3
                                         IFGE LABEL_0               true
                                         GUARD_GE
               return x+1;               ILOAD 1                    3
                                         ICONST_1                   1
                                         IADD                       4
                                         IRETURN
  Main         i = op.DoSomething(i);    ISTORE 2                   4
               }                         GOTO LABEL_0
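
The table can be mimicked with a hand-written recorder for this one loop. It is only a sketch: the function `record_one_iteration` and the tuple-based trace format are invented for illustration, and a real tracer records the interpreter's operations generically rather than being written per program. The point it shows is that branches and the call are executed but leave only guards in the trace.

```python
class IncrOrDecr:
    """Python counterpart of the Java class in the example."""

    def do_something(self, x):
        return x - 1 if x < 0 else x + 1


def record_one_iteration(i, n, op):
    """Run one loop iteration and return (trace, new_i).

    Branches and the method call are executed but not recorded;
    a guard describing the observed outcome is recorded instead.
    """
    trace = []
    # while (i < N): record only the guard for the path actually taken
    trace.append(("guard_lt" if i < n else "guard_ge", "i", "N"))
    if not i < n:
        return trace, i
    # op.DoSomething(i): the call is inlined, protected by a class guard
    trace.append(("guard_class", "op", type(op).__name__))
    # if (x < 0): again, only the guard for the observed outcome
    if i < 0:
        trace.append(("guard_lt", "x", 0))
        trace.append(("int_sub", "x", 1))
    else:
        trace.append(("guard_ge", "x", 0))
        trace.append(("int_add", "x", 1))
    return trace, op.do_something(i)
```

Calling `record_one_iteration(3, 100, IncrOrDecr())` follows the same path as the table: the loop guard, the class guard, the `x >= 0` guard, then the addition.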

Slide 30

Slide 30 text

Trace trees (1)
tracetree.java
    public static void trace_trees() {
        int a = 0;
        int i = 0;
        int N = 100;
        while(i < N) {
            if (i%2 == 0)
                a++;
            else
                a*=2;
            i++;
        }
    }

Slide 35

Slide 35 text

Trace trees (2)
  Trace:
    ILOAD 1
    ILOAD 2
    GUARD_ICMPLT
    ILOAD 1
    ICONST_2
    IREM
    GUARD_NE
    ILOAD 0
    ICONST_2
    IMUL
    ISTORE 0
    IINC 1 1
  Guard failure: BLACKHOLE INTERPRETER

Slide 40

Slide 40 text

Trace trees (2)
  Trace:
    ILOAD 1
    ILOAD 2
    GUARD_ICMPLT
    ILOAD 1
    ICONST_2
    IREM
    GUARD_NE
    ILOAD 0
    ICONST_2
    IMUL
    ISTORE 0
    IINC 1 1
  Second trace (taken when GUARD_NE fails):
    IINC 0 1
    IINC 1 1
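
How a failing guard grows a second trace can be simulated on the `trace_trees()` loop itself. This is a simulation only: `run_trace_trees`, the `bridge_threshold` parameter and the event names are assumptions made for the example, and nothing is actually compiled.

```python
def run_trace_trees(n, bridge_threshold=2):
    """Simulate the trace_trees() loop with one main trace and an optional second trace.

    Returns (a, events), where events logs how each guard failure was handled.
    """
    a, i = 0, 0
    failures = 0        # how often GUARD_NE (i % 2 != 0) has failed so far
    has_bridge = False  # whether the second trace exists yet
    events = []
    while i < n:                   # GUARD_ICMPLT
        if i % 2 != 0:             # GUARD_NE holds: stay on the main trace
            a *= 2
            i += 1
        elif has_bridge:           # guard fails, but a second trace is attached
            a += 1
            i += 1
            events.append("bridge")
        else:                      # guard fails: fall back to the interpreter
            failures += 1
            a += 1
            i += 1
            events.append("fallback")
            if failures >= bridge_threshold:
                has_bridge = True  # the guard became hot: trace the other path
    return a, events
```

The result is the same as running the plain loop; only the route taken after each guard failure changes once the guard has become hot.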

Slide 41

Slide 41 text

Part 2
The PyPy JIT generator

Slide 49

Slide 49 text

General architecture
  compile-time
    RPYTHON
        def LOAD_GLOBAL(self):
            ...
        def STORE_FAST(self):
            ...
        def BINARY_ADD(self):
            ...
    CODEWRITER
    JITCODE
        ...
        p0 = getfield_gc(p0, 'func_globals')
        p2 = getfield_gc(p1, 'strval')
        call(dict_lookup, p0, p2)
        ....

        ...
        p0 = getfield_gc(p0, 'locals_w')
        setarrayitem_gc(p0, i0, p1)
        ....

        ...
        promote_class(p0)
        i0 = getfield_gc(p0, 'intval')
        promote_class(p1)
        i1 = getfield_gc(p1, 'intval')
        i2 = int_add(i0, i1)
        if (overflowed) goto ...
        p2 = new_with_vtable('W_IntObject')
        setfield_gc(p2, i2, 'intval')
        ....
  runtime
    META-TRACER
    OPTIMIZER
    BACKEND
    ASSEMBLER

Slide 55

Slide 55 text

PyPy trace example
    def fn():
        c = a+b
        ...

  LOAD_GLOBAL A
    ...
    p0 = getfield_gc(p0, 'func_globals')
    p2 = getfield_gc(p1, 'strval')
    call(dict_lookup, p0, p2)
    ...
  LOAD_GLOBAL B
    ...
    p0 = getfield_gc(p0, 'func_globals')
    p2 = getfield_gc(p1, 'strval')
    call(dict_lookup, p0, p2)
    ...
  BINARY_ADD
    ...
    guard_class(p0, W_IntObject)
    i0 = getfield_gc(p0, 'intval')
    guard_class(p1, W_IntObject)
    i1 = getfield_gc(p1, 'intval')
    i2 = int_add(i0, i1)
    guard_not_overflow()
    p2 = new_with_vtable('W_IntObject')
    setfield_gc(p2, i2, 'intval')
    ...
  STORE_FAST C
    ...
    p0 = getfield_gc(p0, 'locals_w')
    setarrayitem_gc(p0, i0, p1)
    ....
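
To relate the BINARY_ADD fragment to interpreter code, here is a rough sketch of the int/int fast path that such a trace could come from. It is plain Python, not the real RPython source: `binary_add` is a made-up helper, and because Python integers do not overflow there is no counterpart to `guard_not_overflow()`.

```python
class W_IntObject:
    """Boxed integer, as in the trace (field name taken from the slides)."""

    def __init__(self, intval):
        self.intval = intval


def binary_add(w_a, w_b):
    """Hypothetical fast path for adding two boxed integers."""
    # corresponds to guard_class(p0, W_IntObject) and guard_class(p1, W_IntObject)
    if type(w_a) is W_IntObject and type(w_b) is W_IntObject:
        # the two getfield_gc(..., 'intval') reads followed by int_add
        result = w_a.intval + w_b.intval
        # new_with_vtable('W_IntObject') followed by setfield_gc(..., 'intval')
        return W_IntObject(result)
    raise TypeError("only the int/int fast path is sketched here")
```

Each commented step maps to one or two operations in the trace; the type checks become guards because the tracer saw them succeed.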

Slide 56

Slide 56 text

PyPy optimizer
  intbounds
  constant folding / pure operations
  virtuals
  string optimizations
  heap (multiple get/setfield, etc)
  ffi
  unroll

Slide 57

Slide 57 text

Intbound optimization (1)
intbound.py
    def fn():
        i = 0
        while i < 5000:
            i += 2
        return i

Slide 58

Slide 58 text

Intbound optimization (2)
unoptimized
    ...
    i17 = int_lt(i15, 5000)
    guard_true(i17)
    i19 = int_add_ovf(i15, 2)
    guard_no_overflow()
    ...
optimized
    ...
    i17 = int_lt(i15, 5000)
    guard_true(i17)
    i19 = int_add(i15, 2)
    ...
  It works often
  array bound checking
  intbound info propagates all over the trace
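
The rewrite from `int_add_ovf` to `int_add` can be reproduced with a small pass that tracks upper bounds. This is a sketch under simplifying assumptions: the trace is a list of `(result, opname, *args)` tuples invented for the example, only `int_lt` against a constant and addition of a non-negative constant are handled, and `sys.maxsize` stands in for the machine word limit.

```python
import sys

MAXINT = sys.maxsize  # stand-in for the largest machine integer


def optimize_intbounds(trace):
    """Return a copy of the trace with provably safe overflow checks removed."""
    upper = {}        # variable -> known inclusive upper bound
    comparisons = {}  # result of int_lt -> (variable, constant)
    out = []
    skip_overflow_guard = False
    for op in trace:
        res, name, *args = op
        if name == "int_lt" and isinstance(args[1], int):
            comparisons[res] = (args[0], args[1])
        elif name == "guard_true" and args[0] in comparisons:
            # after the guard we know var < const, i.e. var <= const - 1
            var, const = comparisons[args[0]]
            upper[var] = min(upper.get(var, MAXINT), const - 1)
        elif name == "int_add_ovf" and isinstance(args[1], int):
            bound = upper.get(args[0])
            if bound is not None and args[1] >= 0 and bound + args[1] <= MAXINT:
                op = (res, "int_add", *args)    # cannot overflow: plain add
                upper[res] = bound + args[1]    # the bound propagates to the result
                skip_overflow_guard = True
        elif name == "guard_no_overflow" and skip_overflow_guard:
            skip_overflow_guard = False
            continue                            # drop the now useless guard
        out.append(op)
    return out
```

Feeding it the unoptimized fragment from the slide yields the optimized one, and the bound recorded for `i19` is available to later operations in the same trace.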

Slide 61

Slide 61 text

Virtuals (1)
virtuals.py
    def fn():
        i = 0
        while i < 5000:
            i += 2
        return i

Slide 62

Slide 62 text

Virtuals (2)
unoptimized
    ...
    guard_class(p0, W_IntObject)
    i1 = getfield_pure(p0, 'intval')
    i2 = int_add(i1, 2)
    p3 = new(W_IntObject)
    setfield_gc(p3, i2, 'intval')
    ...
optimized
    ...
    i2 = int_add(i1, 2)
    ...
  The most important optimization (TM)
  It works both inside the trace and across the loop
  It works for tons of cases
    e.g. function frames
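
A stripped-down version of the idea, covering only the "inside the trace" case: allocations are delayed, and reads from a not-yet-allocated object are answered from the recorded field values. The tuple-based trace format and the function name are assumptions for the example; the cross-loop case mentioned on the slide is not modelled.

```python
def optimize_virtuals(trace):
    """Remove allocations of objects that never escape the trace.

    trace is a list of (result, opname, *args) tuples. Objects created by
    'new' stay virtual until an operation other than a field access or a
    class guard uses them.
    """
    virtuals = {}  # variable -> {"cls": class name, "fields": {field: value}}
    alias = {}     # variable -> variable or constant it is known to equal
    out = []

    def force(var):
        # The object escapes: emit the allocation and its field writes after all.
        info = virtuals.pop(var)
        out.append((var, "new", info["cls"]))
        for field, value in info["fields"].items():
            out.append((None, "setfield_gc", var, value, field))

    for res, name, *args in trace:
        args = [alias.get(a, a) for a in args]
        if name == "new":
            virtuals[res] = {"cls": args[0], "fields": {}}
        elif name == "setfield_gc" and args[0] in virtuals:
            virtuals[args[0]]["fields"][args[2]] = args[1]
        elif name in ("getfield_gc", "getfield_pure") and args[0] in virtuals:
            alias[res] = virtuals[args[0]]["fields"][args[1]]
        elif name == "guard_class" and args[0] in virtuals:
            pass  # the class of a virtual is known, so the guard is redundant
        else:
            for a in args:
                if a in virtuals:
                    force(a)
            out.append((res, name, *args))
    return out
```

On two chained additions, the intermediate `W_IntObject` disappears and the second addition reads the unboxed `i2` directly.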

Slide 65

Slide 65 text

Constant folding (1)
constfold.py
    def fn():
        i = 0
        while i < 5000:
            i += 2
        return i

Slide 66

Slide 66 text

Constant folding (2)
unoptimized
    ...
    i1 = getfield_pure(p0, 'intval')
    i2 = getfield_pure(, 'intval')
    i3 = int_add(i1, i2)
    ...
optimized
    ...
    i1 = getfield_pure(p0, 'intval')
    i3 = int_add(i1, 2)
    ...
  It “finishes the job”
  Works well together with other optimizations (e.g. virtuals)
  It also does “normal, boring, static” constant-folding
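
A minimal folding pass over the same kind of trace might look as follows. It is a sketch: `ConstObj` is a hypothetical stand-in for a prebuilt constant object, the trace format is invented, and only three arithmetic operations are treated as pure.

```python
import operator

PURE_OPS = {"int_add": operator.add, "int_sub": operator.sub, "int_mul": operator.mul}


class ConstObj:
    """Hypothetical stand-in for a constant object known at trace time."""

    def __init__(self, **fields):
        self.fields = fields


def fold_constants(trace):
    """Fold pure operations whose inputs are all known constants."""
    constants = {}  # variable -> known constant value
    out = []
    for res, name, *args in trace:
        # Replace variables whose value is already known by that value.
        args = [constants.get(a, a) if isinstance(a, str) else a for a in args]
        if name == "getfield_pure" and isinstance(args[0], ConstObj):
            constants[res] = args[0].fields[args[1]]         # pure read of a constant
        elif name in PURE_OPS and all(isinstance(a, int) for a in args):
            constants[res] = PURE_OPS[name](*args)           # ordinary static folding
        else:
            out.append((res, name, *args))
    return out
```

Reading `intval` from a constant object folds to its value, which then flows into `int_add` just as in the optimized fragment.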

Slide 69

Slide 69 text

Out of line guards (1)
outoflineguards.py
    N = 2
    def fn():
        i = 0
        while i < 5000:
            i += N
        return i

Slide 70

Slide 70 text

Out of line guards (2)
unoptimized
    ...
    quasiimmut_field(, 'val')
    guard_not_invalidated()
    p0 = getfield_gc(, 'val')
    ...
    i2 = getfield_pure(p0, 'intval')
    i3 = int_add(i1, i2)
optimized
    ...
    guard_not_invalidated()
    ...
    i3 = int_add(i1, 2)
    ...
  Python is too dynamic, but we don’t care :-)
  No overhead in assembler code
  Used a bit “everywhere”
  Credits to Mark Shannon for the name :-)
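
The mechanism can be pictured as a cell that remembers which compiled loops assumed its value, so the check happens on the rare write instead of inside the loop. This is an illustrative model only: `QuasiImmutableCell`, `CompiledLoop` and their methods are invented names, not PyPy's API.

```python
class CompiledLoop:
    """Placeholder for a piece of compiled code that can be invalidated."""

    def __init__(self, name):
        self.name = name
        self.valid = True


class QuasiImmutableCell:
    """A value that almost never changes, e.g. a module-level global like N."""

    def __init__(self, value):
        self._value = value
        self._dependents = []

    def read_as_constant(self, loop):
        # At trace time: treat the value as a constant and remember who relied on it.
        self._dependents.append(loop)
        return self._value

    def write(self, value):
        # The rare slow path: invalidate every loop that assumed the old value.
        self._value = value
        for loop in self._dependents:
            loop.valid = False
        self._dependents.clear()
```

The compiled loop itself contains no check on the cell, which matches the "no overhead in assembler code" point; the cost is paid by whoever rebinds the global.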

Slide 73

Slide 73 text

Guards
  guard_true
  guard_false
  guard_class
  guard_no_overflow
  guard_value

Slide 74

Slide 74 text

Promotion
  guard_value
  specialize code
  make sure not to overspecialize
  example: type of objects
  example: function code objects, ...
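
Promotion can be approximated in plain Python as a cache of specialized versions keyed by the promoted value, with a cap to avoid overspecializing. The helper `make_promoting`, its parameters and the `max_versions` limit are assumptions made for this sketch; in the JIT the dictionary lookup corresponds to a chain of `guard_value` checks.

```python
def make_promoting(specialize, generic, max_versions=4):
    """Build a function that promotes its first argument to a constant.

    specialize(value) returns code specialized for that value;
    generic(value, *args) is the unspecialized fallback.
    """
    versions = {}  # promoted value -> specialized code

    def run(value, *args):
        if value not in versions:
            if len(versions) >= max_versions:
                # Too many distinct values seen: do not overspecialize.
                return generic(value, *args)
            versions[value] = specialize(value)
        return versions[value](*args)

    run.versions = versions  # exposed for inspection in the example
    return run
```

A value that repeats, such as the type of an object or a function's code object, pays for specialization once; a value that keeps changing hits the cap and stays on the generic path.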

Slide 75

Slide 75 text

Conclusion
  PyPy is cool :-)
  Any questions?