PyPy JIT under the hood

EuroPython 2012, Firenze, Italy

Antonio Cuni

July 04, 2012

Transcript

  1. PyPy JIT under the hood Antonio Cuni Armin Rigo (guest

    star) EuroPython 2012 July 4 2012 antocuni, arigo (EuroPython 2012) PyPy JIT under the hood July 4 2012 1 / 29
  2. About me

    PyPy core dev; PyPy py3k tech leader; pdb++, fancycompleter, ...; consultant, trainer; http://antocuni.eu
  3. About this talk

    What is PyPy? (in 30 seconds, for those who missed the keynote :-)); Overview of tracing JITs; The PyPy JIT generator; Just In Time talk (last-modified: July 4th, 12:06)
  4. Part 0: What is PyPy?

    RPython toolchain: subset of Python, ideal for writing VMs, JIT & GC for free. Python interpreter: written in RPython. Whatever (dynamic) language you want: smalltalk, prolog, javascript, ...
  5. Part 1: Overview of tracing JITs
  6. Compilers

    When? Batch or Ahead Of Time; Just In Time. How? Static; Dynamic or Adaptive. What? Method-based compiler; Tracing compiler. PyPy: JIT, Dynamic, Tracing.
  10. Assumptions

    Pareto Principle (80-20 rule): 20% of the program accounts for 80% of the runtime (hot-spots). Fast Path principle: optimize only what is necessary; fall back for uncommon cases. Most of the runtime is spent in loops. Always the same code paths (likely).
  12. Tracing JIT

    Interpret the program as usual; Detect hot loops; Tracing phase (linear trace); Compiling; Execute (guards to ensure correctness); Profit :-)
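
The "detect hot loops" step is often implemented by counting how many times a loop's back-edge is taken. The following is a minimal sketch under that assumption; `HotLoopDetector`, `on_back_edge` and the threshold value are hypothetical names chosen for illustration, not PyPy's actual code.

```python
from collections import defaultdict

HOT_THRESHOLD = 3  # hypothetical value, chosen small for the demo


class HotLoopDetector:
    """Counts back-edges per loop and reports when a loop becomes hot."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counters = defaultdict(int)

    def on_back_edge(self, loop_id):
        # Returns True exactly once: when the counter reaches the threshold.
        self.counters[loop_id] += 1
        return self.counters[loop_id] == self.threshold


def run_demo(iterations=5):
    """Return the iteration numbers at which tracing would start."""
    detector = HotLoopDetector()
    started_tracing_at = []
    for iteration in range(iterations):
        if detector.on_back_edge("loop@main"):
            started_tracing_at.append(iteration)
    return started_tracing_at
```

With the demo values, the loop is interpreted for its first iterations and tracing would start only once, when the counter reaches the threshold.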
  20. Tracing JIT phases

    Diagram. Phases: Interpretation, Tracing, Compilation, Running. Edge labels: hot loop detected; cold guard failed; entering compiled loop; guard failure → hot; hot guard failed.
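
One plausible reading of the phase diagram is a small state machine. The sketch below assumes which phases each edge label connects (the transcript only lists the labels), and the labels "trace finished" and "code emitted" are hypothetical, added only to connect Tracing, Compilation and Running.

```python
# (phase, event) -> next phase; the wiring is an assumed reading of the diagram
TRANSITIONS = {
    ("interpretation", "hot loop detected"): "tracing",
    ("tracing", "trace finished"): "compilation",    # hypothetical label
    ("compilation", "code emitted"): "running",      # hypothetical label
    ("running", "cold guard failed"): "interpretation",
    ("interpretation", "entering compiled loop"): "running",
    ("running", "hot guard failed"): "tracing",
}


def step(phase, event):
    """Return the phase reached from `phase` when `event` happens."""
    return TRANSITIONS[(phase, event)]


def replay(events, phase="interpretation"):
    """Replay a list of events and return every phase visited, in order."""
    history = [phase]
    for event in events:
        phase = step(phase, event)
        history.append(phase)
    return history
```

Under this reading, a guard that fails rarely sends execution back to the interpreter, while a guard that fails often enough to become hot triggers a new tracing phase.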
  21. Tracing Example (1)

    Java:

        interface Operation {
            int DoSomething(int x);
        }

        class IncrOrDecr implements Operation {
            public int DoSomething(int x) {
                if (x < 0) return x-1;
                else return x+1;
            }
        }

        class tracing {
            public static void main(String argv[]) {
                int N = 100;
                int i = 0;
                Operation op = new IncrOrDecr();
                while (i < N) {
                    i = op.DoSomething(i);
                }
                System.out.println(i);
            }
        }
  22. Tracing Example (2)

    Java bytecode:

        class IncrOrDecr {
          ...
          public DoSomething(I)I
            ILOAD 1
            IFGE LABEL_0
            ILOAD 1
            ICONST_1
            ISUB
            IRETURN
           LABEL_0
            ILOAD 1
            ICONST_1
            IADD
            IRETURN
        }

    Java bytecode:

        class tracing {
          ...
          public static main([Ljava/lang/String;)V
            ...
           LABEL_0
            ILOAD 2
            ILOAD 1
            IF_ICMPGE LABEL_1
            ALOAD 3
            ILOAD 2
            INVOKEINTERFACE Operation.DoSomething (I)I
            ISTORE 2
            GOTO LABEL_0
           LABEL_1
            ...
        }
  29. Tracing example (3)

    Legend: some instructions are executed but not recorded; others (the GUARD_* entries) are added to the trace but not executed.

        Method       Java code                 Trace                     Value
        Main         while (i < N) {           ILOAD 2                   3
                                               ILOAD 1                   100
                                               IF_ICMPGE LABEL_1         false
                                               GUARD_ICMPLT
                     i = op.DoSomething(i);    ALOAD 3                   IncrOrDecr obj
                                               ILOAD 2                   3
                                               INVOKEINTERFACE ...
                                               GUARD_CLASS(IncrOrDecr)
        DoSomething  if (x < 0)                ILOAD 1                   3
                                               IFGE LABEL_0              true
                                               GUARD_GE
                     return x+1;               ILOAD 1                   3
                                               ICONST_1                  1
                                               IADD                      4
                                               IRETURN
        Main         i = op.DoSomething(i);    ISTORE 2                  4
                     }                         GOTO LABEL_0
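
The idea behind the table can be sketched in Python: while one loop iteration runs, each branch taken is recorded as a guard and the interface call is replaced by a class guard plus the inlined callee. This is only an illustration; `record_iteration` and the tuple-based trace format are hypothetical, not the format of any real JIT.

```python
class IncrOrDecr:
    def do_something(self, x):
        return x - 1 if x < 0 else x + 1


def record_iteration(i, n, op):
    """Execute one loop iteration and return (new_i, recorded_trace)."""
    trace = []
    # while (i < N): the branch itself is not recorded, a guard is
    assert i < n, "the traced iteration must stay inside the loop"
    trace.append(("guard_lt", "i", "N"))
    # interface call: guard on the receiver's class, then inline the callee
    trace.append(("guard_class", "op", type(op).__name__))
    if i < 0:
        trace.append(("guard_lt", "x", 0))
        trace.append(("sub", "x", 1))
    else:
        trace.append(("guard_ge", "x", 0))
        trace.append(("add", "x", 1))
    new_i = op.do_something(i)
    trace.append(("store", "i"))
    return new_i, trace
```

Called with `i = 3` and `N = 100`, as in the table, the recorded trace is linear: it contains only the `x >= 0` path, protected by guards.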
  30. Trace trees (1)

    tracetree.java:

        public static void trace_trees() {
            int a = 0;
            int i = 0;
            int N = 100;
            while(i < N) {
                if (i%2 == 0)
                    a++;
                else
                    a*=2;
                i++;
            }
        }
  31. Trace trees (2)

    Trace: ILOAD 1, ILOAD 2, GUARD_ICMPLT, ILOAD 1, ICONST_2, IREM, GUARD_NE, ILOAD 0, ICONST_2, IMUL, ISTORE 0, IINC 1 1
  36. Trace trees (2)

    Trace: ILOAD 1, ILOAD 2, GUARD_ICMPLT, ILOAD 1, ICONST_2, IREM, GUARD_NE, ILOAD 0, ICONST_2, IMUL, ISTORE 0, IINC 1 1. Additional diagram boxes: BLACKHOLE, INTERPRETER.
  40. Trace trees (2)

    Trace: ILOAD 1, ILOAD 2, GUARD_ICMPLT, ILOAD 1, ICONST_2, IREM, GUARD_NE, ILOAD 0, ICONST_2, IMUL, ISTORE 0, IINC 1 1. Additional branch: IINC 0 1, IINC 1 1.
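
A rough Python simulation of the trace-tree idea for `trace_trees()`: the main trace covers the odd branch (`a *= 2`), and each failure of the `i % 2 != 0` guard falls back until the guard is considered hot, after which a second branch (`a += 1`) is attached. The threshold and the function name are hypothetical, and the fallback is only counted, not modelled.

```python
def run_trace_tree(n=100, bridge_threshold=2):
    """Return (a, fallbacks, bridge_compiled) for the trace_trees() loop."""
    a = i = 0
    guard_failures = 0
    fallbacks = 0
    bridge_compiled = False
    while i < n:                  # GUARD_ICMPLT
        if i % 2 != 0:            # GUARD_NE holds: stay on the main trace
            a *= 2
        else:                     # GUARD_NE fails
            if not bridge_compiled:
                guard_failures += 1
                fallbacks += 1    # leave compiled code for this iteration
                if guard_failures >= bridge_threshold:
                    bridge_compiled = True   # attach the second branch
            a += 1                # same result on either path
        i += 1
    return a, fallbacks, bridge_compiled
```

Once the second branch is attached, even iterations no longer leave compiled code, so the number of fallbacks stops growing with `n`.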
  41. Part 2: The PyPy JIT generator
  49. General architecture

    Diagram. Compile-time: RPYTHON, CODEWRITER, JITCODE. Runtime: META-TRACER, OPTIMIZER, BACKEND, ASSEMBLER.

    RPYTHON:

        def LOAD_GLOBAL(self): ...
        def STORE_FAST(self): ...
        def BINARY_ADD(self): ...

    JITCODE:

        ...
        p0 = getfield_gc(p0, 'func_globals')
        p2 = getfield_gc(p1, 'strval')
        call(dict_lookup, p0, p2)
        ....

        ...
        p0 = getfield_gc(p0, 'locals_w')
        setarrayitem_gc(p0, i0, p1)
        ....

        ...
        promote_class(p0)
        i0 = getfield_gc(p0, 'intval')
        promote_class(p1)
        i1 = getfield_gc(p1, 'intval')
        i2 = int_add(i0, i1)
        if (overflowed) goto ...
        p2 = new_with_vtable('W_IntObject')
        setfield_gc(p2, i2, 'intval')
        ....
  55. PyPy trace example

        def fn():
            c = a+b
            ...

    Bytecode: LOAD_GLOBAL A, LOAD_GLOBAL B, BINARY_ADD, STORE_FAST C

    LOAD_GLOBAL A:

        ...
        p0 = getfield_gc(p0, 'func_globals')
        p2 = getfield_gc(p1, 'strval')
        call(dict_lookup, p0, p2)
        ...

    LOAD_GLOBAL B:

        ...
        p0 = getfield_gc(p0, 'func_globals')
        p2 = getfield_gc(p1, 'strval')
        call(dict_lookup, p0, p2)
        ...

    BINARY_ADD:

        ...
        guard_class(p0, W_IntObject)
        i0 = getfield_gc(p0, 'intval')
        guard_class(p1, W_IntObject)
        i1 = getfield_gc(p1, 'intval')
        i2 = int_add(i0, i1)
        guard_not_overflow()
        p2 = new_with_vtable('W_IntObject')
        setfield_gc(p2, i2, 'intval')
        ...

    STORE_FAST C:

        ...
        p0 = getfield_gc(p0, 'locals_w')
        setarrayitem_gc(p0, i0, p1)
        ....
  56. PyPy optimizer

    intbounds; constant folding / pure operations; virtuals; string optimizations; heap (multiple get/setfield, etc); ffi; unroll
  57. Intbound optimization (1)

    intbound.py:

        def fn():
            i = 0
            while i < 5000:
                i += 2
            return i
  58. Intbound optimization (2)

    unoptimized:

        ...
        i17 = int_lt(i15, 5000)
        guard_true(i17)
        i19 = int_add_ovf(i15, 2)
        guard_no_overflow()
        ...

    optimized:

        ...
        i17 = int_lt(i15, 5000)
        guard_true(i17)
        i19 = int_add(i15, 2)
        ...

    It works often; array bound checking; intbound info propagates all over the trace.
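
A toy version of the intbound pass, to make the rewrite concrete: after `guard_true(int_lt(x, C))` the optimizer knows `x <= C - 1`, so an `int_add_ovf` of a small non-negative constant cannot overflow and its `guard_no_overflow` can be dropped. This is a sketch under stated assumptions: the tuple encoding, the function name and the 64-bit `MAXINT` are hypothetical, and only upper bounds are tracked.

```python
MAXINT = 2**63 - 1  # assumed machine word size


def optimize_intbounds(trace):
    """Replace int_add_ovf by int_add when the known bound rules out overflow."""
    upper = {}      # variable -> known inclusive upper bound
    pending = {}    # result of int_lt -> (variable, constant)
    out = []
    drop_next_overflow_guard = False
    for op in trace:
        name = op[0]
        if name == "int_lt":                      # ("int_lt", res, var, const)
            pending[op[1]] = (op[2], op[3])
            out.append(op)
        elif name == "guard_true":                # ("guard_true", res)
            if op[1] in pending:
                var, const = pending[op[1]]
                upper[var] = const - 1
            out.append(op)
        elif name == "int_add_ovf":               # ("int_add_ovf", res, var, const)
            _, res, var, const = op
            if var in upper and const >= 0 and upper[var] + const <= MAXINT:
                out.append(("int_add", res, var, const))
                upper[res] = upper[var] + const   # the bound propagates
                drop_next_overflow_guard = True
            else:
                out.append(op)
        elif name == "guard_no_overflow" and drop_next_overflow_guard:
            drop_next_overflow_guard = False      # guard is now redundant
        else:
            out.append(op)
    return out
```

Feeding it the unoptimized trace from the slide yields the optimized one; without the preceding `int_lt` guard nothing is known about `i15` and the overflow check stays.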
  61. Virtuals (1)

    virtuals.py:

        def fn():
            i = 0
            while i < 5000:
                i += 2
            return i
  62. Virtuals (2)

    unoptimized:

        ...
        guard_class(p0, W_IntObject)
        i1 = getfield_pure(p0, 'intval')
        i2 = int_add(i1, 2)
        p3 = new(W_IntObject)
        setfield_gc(p3, i2, 'intval')
        ...

    optimized:

        ...
        i2 = int_add(i1, 2)
        ...

    The most important optimization (TM). It works both inside the trace and across the loop. It works for tons of cases, e.g. function frames.
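
The effect of virtuals can be approximated by a small escape-analysis pass: an allocation is kept "virtual" (not emitted) as long as the object is only written to and read from inside the trace, and it is materialized only if it escapes into some other operation. The sketch below is a simplified illustration; the tuple encoding and `optimize_virtuals` are hypothetical, and it handles only the inside-the-trace case, not the across-the-loop case mentioned on the slide.

```python
def optimize_virtuals(trace):
    """Remove allocations whose objects never escape the trace.

    Ops: ("new", ptr, cls), ("setfield", ptr, value, field),
         ("getfield", result, ptr, field), anything else is opaque.
    """
    virtuals = {}   # pointer -> (class name, {field: value})
    alias = {}      # removed getfield result -> value it stands for
    out = []
    for op in trace:
        name = op[0]
        args = [alias.get(a, a) for a in op[1:]]
        if name == "new":
            virtuals[args[0]] = (args[1], {})
        elif name == "setfield" and args[0] in virtuals:
            virtuals[args[0]][1][args[2]] = args[1]
        elif name == "getfield" and args[1] in virtuals:
            alias[args[0]] = virtuals[args[1]][1][args[2]]
        else:
            for a in args:
                if a in virtuals:             # the object escapes: allocate it now
                    cls, fields = virtuals.pop(a)
                    out.append(("new", a, cls))
                    out.extend(("setfield", a, v, f) for f, v in fields.items())
            out.append((name, *args))
    return out
```

When the boxed integer is only created, written and read back, just the arithmetic survives; if it is passed to an opaque operation, the allocation reappears right before that use.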
  65. Constant folding (1)

    constfold.py:

        def fn():
            i = 0
            while i < 5000:
                i += 2
            return i
  66. Constant folding (2)

    unoptimized:

        ...
        i1 = getfield_pure(p0, 'intval')
        i2 = getfield_pure(<W_Int(2)>, 'intval')
        i3 = int_add(i1, i2)
        ...

    optimized:

        ...
        i1 = getfield_pure(p0, 'intval')
        i3 = int_add(i1, 2)
        ...

    It "finishes the job". Works well together with other optimizations (e.g. virtuals). It also does "normal, boring, static" constant-folding.
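
A minimal sketch of the folding step shown on the slide: a pure field read from a constant object is replaced by its value, and that value is substituted into later operations; an `int_add` whose arguments are all constants is folded too. The tuple encoding, the `constants` mapping and the function name are assumptions made for the example.

```python
def constant_fold(trace, constants):
    """Fold pure reads of constant objects and additions of constants.

    `constants` maps a constant object's name to its immutable fields.
    Ops have the shape (name, result, *args).
    """
    known = {}   # variable -> constant value
    out = []
    for name, res, *args in trace:
        # substitute variables already known to be constant
        args = [known.get(a, a) if isinstance(a, str) else a for a in args]
        if name == "getfield_pure" and args[0] in constants:
            known[res] = constants[args[0]][args[1]]
        elif name == "int_add" and all(isinstance(a, int) for a in args):
            known[res] = args[0] + args[1]
        else:
            out.append((name, res, *args))
    return out
```

The read from `p0` stays because `p0` is not a constant, while the read from the constant `<W_Int(2)>` disappears and its value ends up inlined in the addition.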
  69. Out of line guards (1)

    outoflineguards.py:

        N = 2
        def fn():
            i = 0
            while i < 5000:
                i += N
            return i
  70. Out of line guards (2)

    unoptimized:

        ...
        quasiimmut_field(<Cell>, 'val')
        guard_not_invalidated()
        p0 = getfield_gc(<Cell>, 'val')
        ...
        i2 = getfield_pure(p0, 'intval')
        i3 = int_add(i1, i2)

    optimized:

        ...
        guard_not_invalidated()
        ...
        i3 = int_add(i1, 2)
        ...

    Python is too dynamic, but we don't care :-) No overhead in assembler code. Used a bit "everywhere". Credits to Mark Shannon for the name :-)
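
The idea of an out-of-line guard can be illustrated with a cell that remembers which compiled loops baked its value in as a constant, and invalidates them when the value is written. All names below (`Cell`, `CompiledLoop`, `GuardInvalidated`) are hypothetical. Note one deliberate simplification: the slide says there is no overhead in the assembler code, whereas this Python sketch uses a flag check at loop entry as a stand-in for `guard_not_invalidated()`.

```python
class GuardInvalidated(Exception):
    """Raised when compiled code relied on a value that has since changed."""


class Cell:
    """Holds a rarely-changing global and the loops that depend on it."""

    def __init__(self, val):
        self.val = val
        self.dependent_loops = []

    def write(self, val):
        self.val = val
        # the check lives here, on the (rare) write path, not in the loop
        for loop in self.dependent_loops:
            loop.valid = False
        self.dependent_loops = []


class CompiledLoop:
    """Stand-in for the compiled version of `while i < limit: i += N`."""

    def __init__(self, cell):
        self.step = cell.val          # N is baked in as a constant
        self.valid = True
        cell.dependent_loops.append(self)

    def run(self, i, limit):
        if not self.valid:            # stand-in for guard_not_invalidated()
            raise GuardInvalidated("global changed, loop must be recompiled")
        while i < limit:
            i += self.step            # no read of the cell inside the loop
        return i
```

As long as nobody rebinds `N`, the loop runs with the constant inlined; a single write to the cell is enough to mark every dependent loop as stale.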
  73. Guards

    guard_true; guard_false; guard_class; guard_no_overflow; guard_value
  74. Promotion

    guard_value; specialize code; make sure not to overspecialize; example: type of objects; example: function code objects, ...
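
Promotion can be pictured as turning a runtime value into a compile-time constant: a value guard checks that the value is the one seen before, and the code behind the guard is specialized for it. The sketch below models this with one specialized function per promoted value; `promote_and_specialize` and `compile_for` are hypothetical names, and the growing cache shows why overspecializing is a risk.

```python
def promote_and_specialize(compile_for):
    """Return a dispatcher that keeps one specialized function per value.

    `compile_for(value)` must return a function specialized for `value`.
    """
    cache = {}

    def dispatch(value, *args):
        if value not in cache:                   # the value guard failed
            cache[value] = compile_for(value)    # specialize for the new value
        return cache[value](*args)

    dispatch.cache = cache
    return dispatch


def make_multiplier(factor):
    # `factor` is a constant inside the specialized function
    return lambda x: x * factor


multiply = promote_and_specialize(make_multiplier)
```

Promoting a value that takes few distinct values (such as the type of an object) keeps the cache small; promoting something that changes on every call would create a new specialization each time.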
  75. Conclusion

    PyPy is cool :-) Any questions?