
PyPy JIT Under the Hood

Antonio Cuni
September 28, 2012

Transcript

  1. PyPy JIT under the hood

    Antonio Cuni
    PyCon UK 2012
    September 28, 2012

    antocuni (PyCon UK 2012) PyPy JIT under the hood September 28, 2012 1 / 29
  2. About me

    - PyPy core dev
    - PyPy py3k tech leader
    - pdb++, fancycompleter, ...
    - Consultant, trainer
    - You can hire me :-)
    - http://antocuni.eu
  3. About this talk

    - What is PyPy? (in 30 seconds)
    - Overview of tracing JITs
    - The PyPy JIT generator
  4. Part 0: What is PyPy?

    - RPython toolchain
        - subset of Python
        - ideal for writing VMs
        - JIT & GC for free
    - Python interpreter
        - written in RPython
    - Whatever (dynamic) language you want
        - smalltalk, prolog, javascript, ...
  5. Part 1: Overview of tracing JITs
  6. Compilers

    - When?
        - Batch or Ahead Of Time
        - Just In Time
    - How?
        - Static
        - Dynamic or Adaptive
    - What?
        - Method-based compiler
        - Tracing compiler
    - PyPy: JIT, Dynamic, Tracing
  10. Assumptions

    - Pareto Principle (80-20 rule)
        - 20% of the program accounts for 80% of the runtime
        - hot-spots
    - Fast Path principle
        - optimize only what is necessary
        - fall back for uncommon cases
    - Most of the runtime is spent in loops
    - Always the same code paths (likely)
  12. Tracing JIT

    - Interpret the program as usual
    - Detect hot loops
    - Tracing phase
        - linear trace
    - Compiling
    - Execute
        - guards to ensure correctness
    - Profit :-)
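The "detect hot loops" step can be illustrated with a counter per loop header. This is a minimal sketch under stated assumptions, not PyPy's actual mechanism: the class `HotLoopDetector`, the method `on_loop_header`, the loop identifier string, and the threshold value are all hypothetical names chosen for the example.

```python
# Toy hot-loop detector: count how often each loop header is reached.
HOT_THRESHOLD = 3  # hypothetical value chosen for the example


class HotLoopDetector:
    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counters = {}  # loop id -> number of times the header was reached

    def on_loop_header(self, loop_id):
        """Return True exactly once: on the visit that makes the loop hot."""
        count = self.counters.get(loop_id, 0) + 1
        self.counters[loop_id] = count
        return count == self.threshold


detector = HotLoopDetector()
# Simulate five iterations of the same loop; tracing would start on the third.
events = [detector.on_loop_header("loop@line3") for _ in range(5)]
print(events)
```

An interpreter would call `on_loop_header` at every backward jump and switch to the tracing phase when it returns True.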
  19. Tracing JIT phases

    States: Interpretation, Tracing, Compilation, Running

    Transition labels in the diagram:
    - hot loop detected
    - cold guard failed
    - entering compiled loop
    - guard failure → hot
    - hot guard failed
  20. Tracing Example (1)

    java

        interface Operation {
            int DoSomething(int x);
        }

        class IncrOrDecr implements Operation {
            public int DoSomething(int x) {
                if (x < 0) return x-1;
                else return x+1;
            }
        }

        class tracing {
            public static void main(String argv[]) {
                int N = 100;
                int i = 0;
                Operation op = new IncrOrDecr();
                while (i < N) {
                    i = op.DoSomething(i);
                }
                System.out.println(i);
            }
        }
  21. Tracing Example (2)

    Java bytecode

        class IncrOrDecr {
          ...
          public DoSomething(I)I
            ILOAD 1
            IFGE LABEL_0
            ILOAD 1
            ICONST_1
            ISUB
            IRETURN
           LABEL_0
            ILOAD 1
            ICONST_1
            IADD
            IRETURN
        }

    Java bytecode

        class tracing {
          ...
          public static main(
            [Ljava/lang/String;)V
            ...
           LABEL_0
            ILOAD 2
            ILOAD 1
            IF_ICMPGE LABEL_1
            ALOAD 3
            ILOAD 2
            INVOKEINTERFACE
              Operation.DoSomething (I)I
            ISTORE 2
            GOTO LABEL_0
           LABEL_1
            ...
        }
  28. Tracing example (3)

    Legend (two text styles on the slide):
      INSTR: instruction executed but not recorded
      INSTR: instruction added to the trace but not executed

    Method       Java code                 Trace                     Value
    -----------  ------------------------  ------------------------  --------------
    Main         while (i < N) {           ILOAD 2                   3
                                           ILOAD 1                   100
                                           IF_ICMPGE LABEL_1         false
                                           GUARD_ICMPLT
                 i = op.DoSomething(i);    ALOAD 3                   IncrOrDecr obj
                                           ILOAD 2                   3
                                           INVOKEINTERFACE ...
                                           GUARD_CLASS(IncrOrDecr)
    DoSomething  if (x < 0)                ILOAD 1                   3
                                           IFGE LABEL_0              true
                                           GUARD_GE
                 return x+1;               ILOAD 1                   3
                                           ICONST_1                  1
                                           IADD                      4
                                           IRETURN
    Main         i = op.DoSomething(i);    ISTORE 2                  4
                 }                         GOTO LABEL_0
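The same recording idea can be written out in Python. This is a hand-written sketch, assuming a toy trace format of tuples; `record_one_iteration`, the operation names (`guard_lt`, `guard_class`, `int_add`, `jump`) and the snake_case `do_something` are hypothetical and only mirror the Java example. Branches and the call are executed but replaced in the trace by guards, and the callee body is inlined.

```python
class IncrOrDecr:
    def do_something(self, x):
        if x < 0:
            return x - 1
        return x + 1


def record_one_iteration(i, n, op):
    """Run one loop iteration and record the operations actually taken."""
    trace = []
    # while (i < N): the branch is executed, a guard is recorded instead.
    if not i < n:
        return None, i  # the loop exits; nothing to trace
    trace.append(("guard_lt", "i", "N"))
    # Interface call: guard on the receiver's class, then inline the callee.
    trace.append(("guard_class", "op", type(op).__name__))
    if i < 0:
        trace.append(("guard_lt", "i", 0))
        trace.append(("int_sub", "i", 1))
    else:
        trace.append(("guard_ge", "i", 0))
        trace.append(("int_add", "i", 1))
    i = op.do_something(i)
    # The backward jump closes the linear trace.
    trace.append(("jump", "loop_start"))
    return trace, i


trace, new_i = record_one_iteration(3, 100, IncrOrDecr())
for operation in trace:
    print(operation)
```

Only the path taken for `i = 3` ends up in the trace; the `x < 0` branch is represented by the guard alone.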
  29. Trace trees (1)

    tracetree.java

        public static void trace_trees() {
            int a = 0;
            int i = 0;
            int N = 100;
            while(i < N) {
                if (i%2 == 0)
                    a++;
                else
                    a*=2;
                i++;
            }
        }
  30. Trace trees (2)

    Root trace (the a*=2 path):
        ILOAD 1
        ILOAD 2
        GUARD_ICMPLT
        ILOAD 1
        ICONST_2
        IREM
        GUARD_NE
        ILOAD 0
        ICONST_2
        IMUL
        ISTORE 0
        IINC 1 1

    Intermediate builds of the slide add the labels: BLACKHOLE, INTERPRETER

    Final builds add a second branch (the a++ path):
        IINC 0 1
        IINC 1 1
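A rough simulation of how a trace tree grows for this loop, assuming that a guard which fails often enough gets its own compiled branch. It is a sketch, not PyPy code: `run_trace_tree`, `plain_loop`, `BRIDGE_THRESHOLD` and the counters are hypothetical, and "falling back to the interpreter" is modelled by simply executing the other branch in Python.

```python
BRIDGE_THRESHOLD = 2  # hypothetical: guard failures needed before compiling a branch


def plain_loop(n):
    """Reference semantics of trace_trees() from the previous slide."""
    a, i = 0, 0
    while i < n:
        if i % 2 == 0:
            a += 1
        else:
            a *= 2
        i += 1
    return a


def run_trace_tree(n=100):
    a, i = 0, 0
    guard_failures = 0
    bridge_compiled = False
    fallbacks = 0  # iterations finished by the (simulated) interpreter
    while i < n:                 # GUARD_ICMPLT
        if i % 2 != 0:           # GUARD_NE holds: stay on the root trace
            a *= 2
        elif bridge_compiled:    # guard fails, but a compiled branch exists
            a += 1
        else:                    # guard fails and is still cold
            guard_failures += 1
            fallbacks += 1
            a += 1               # the interpreter finishes the iteration
            if guard_failures >= BRIDGE_THRESHOLD:
                bridge_compiled = True
        i += 1
    return a, fallbacks, bridge_compiled


result = run_trace_tree(100)
print(result[1], result[2])
```

With the threshold used here, only the first two even iterations leave compiled code; afterwards both paths of the `if` run inside the tree.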
  40. Part 2: The PyPy JIT generator
  48. General architecture

    compile-time: RPYTHON, CODEWRITER, JITCODE
    runtime: META-TRACER, OPTIMIZER, BACKEND, ASSEMBLER

    RPYTHON
        def LOAD_GLOBAL(self):
            ...
        def STORE_FAST(self):
            ...
        def BINARY_ADD(self):
            ...

    JITCODE
        ...
        p0 = getfield_gc(p0, 'func_globals')
        p2 = getfield_gc(p1, 'strval')
        call(dict_lookup, p0, p2)
        ....

        ...
        p0 = getfield_gc(p0, 'locals_w')
        setarrayitem_gc(p0, i0, p1)
        ....

        ...
        promote_class(p0)
        i0 = getfield_gc(p0, 'intval')
        promote_class(p1)
        i1 = getfield_gc(p1, 'intval')
        i2 = int_add(i0, i1)
        if (overflowed) goto ...
        p2 = new_with_vtable('W_IntObject')
        setfield_gc(p2, i2, 'intval')
        ....
  54. PyPy trace example

    Python source
        def fn():
            c = a+b
            ...

    Bytecode
        LOAD_GLOBAL A
        LOAD_GLOBAL B
        BINARY_ADD
        STORE_FAST C

    Trace for LOAD_GLOBAL A
        ...
        p0 = getfield_gc(p0, 'func_globals')
        p2 = getfield_gc(p1, 'strval')
        call(dict_lookup, p0, p2)
        ...

    Trace for LOAD_GLOBAL B
        ...
        p0 = getfield_gc(p0, 'func_globals')
        p2 = getfield_gc(p1, 'strval')
        call(dict_lookup, p0, p2)
        ...

    Trace for BINARY_ADD
        ...
        guard_class(p0, W_IntObject)
        i0 = getfield_gc(p0, 'intval')
        guard_class(p1, W_IntObject)
        i1 = getfield_gc(p1, 'intval')
        i2 = int_add(i0, i1)
        guard_not_overflow()
        p2 = new_with_vtable('W_IntObject')
        setfield_gc(p2, i2, 'intval')
        ...

    Trace for STORE_FAST C
        ...
        p0 = getfield_gc(p0, 'locals_w')
        setarrayitem_gc(p0, i0, p1)
        ....
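To relate the BINARY_ADD trace to interpreter-level code, here is a sketch of what the int + int fast path might look like in plain Python. `W_IntObject` and `intval` come from the slide; the function `binary_add`, the exceptions, and the simulated machine-word overflow check are assumptions made for the example, not the real interpreter source.

```python
import sys


class W_IntObject:
    """Boxed integer, as named in the trace above."""

    def __init__(self, intval):
        self.intval = intval


def binary_add(w_a, w_b):
    # Corresponds to guard_class(p0, W_IntObject) / guard_class(p1, W_IntObject).
    if type(w_a) is not W_IntObject or type(w_b) is not W_IntObject:
        raise TypeError("only the int + int fast path is modelled here")
    i0 = w_a.intval               # getfield_gc(p0, 'intval')
    i1 = w_b.intval               # getfield_gc(p1, 'intval')
    i2 = i0 + i1                  # int_add(i0, i1)
    # Corresponds to guard_not_overflow(); Python ints never overflow, so the
    # machine-word range is checked by hand (an assumption of this sketch).
    if not (-sys.maxsize - 1 <= i2 <= sys.maxsize):
        raise OverflowError("result does not fit in a machine word")
    return W_IntObject(i2)        # new_with_vtable + setfield_gc


w_c = binary_add(W_IntObject(2), W_IntObject(3))
print(w_c.intval)
```

Each `if` that the common case passes straight through is what shows up as a guard in the trace.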
  55. PyPy optimizer

    - intbounds
    - constant folding / pure operations
    - virtuals
    - string optimizations
    - heap (multiple get/setfield, etc)
    - ffi
    - unroll
  56. Intbound optimization (1)

    intbound.py

        def fn():
            i = 0
            while i < 5000:
                i += 2
            return i
  57. Intbound optimization (2)

    unoptimized
        ...
        i17 = int_lt(i15, 5000)
        guard_true(i17)
        i19 = int_add_ovf(i15, 2)
        guard_no_overflow()
        ...

    optimized
        ...
        i17 = int_lt(i15, 5000)
        guard_true(i17)
        i19 = int_add(i15, 2)
        ...

    - It works often
    - array bound checking
    - intbound info propagates all over the trace
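The rewrite above can be reproduced with a very small bounds-tracking pass. This is a sketch assuming a toy tuple encoding of the trace; `optimize_intbound` and the encoding are hypothetical, and only upper bounds learned from `int_lt` followed by `guard_true` are tracked.

```python
import sys

MAXINT = sys.maxsize  # stands in for the largest machine integer


def optimize_intbound(trace):
    comparisons = {}  # result var -> (compared var, upper bound if it is true)
    upper = {}        # var -> known inclusive upper bound
    optimized = []
    drop_overflow_guard = False
    for op in trace:
        kind = op[0]
        if kind == "int_lt":                      # ("int_lt", res, var, const)
            _, res, var, const = op
            comparisons[res] = (var, const - 1)
        elif kind == "guard_true" and op[1] in comparisons:
            var, bound = comparisons[op[1]]
            upper[var] = bound                    # past the guard, var <= bound
        elif kind == "int_add_ovf":               # ("int_add_ovf", res, var, const)
            _, res, var, const = op
            if var in upper and const >= 0 and upper[var] + const <= MAXINT:
                # The sum provably fits: use the cheaper non-checking add.
                optimized.append(("int_add", res, var, const))
                drop_overflow_guard = True
                continue
            drop_overflow_guard = False
        elif kind == "guard_no_overflow" and drop_overflow_guard:
            drop_overflow_guard = False
            continue                              # guard is no longer needed
        optimized.append(op)
    return optimized


unoptimized = [
    ("int_lt", "i17", "i15", 5000),
    ("guard_true", "i17"),
    ("int_add_ovf", "i19", "i15", 2),
    ("guard_no_overflow",),
]
optimized = optimize_intbound(unoptimized)
for operation in optimized:
    print(operation)
```

After `guard_true(i17)` the pass knows `i15 <= 4999`, so `i15 + 2` cannot overflow and both the checked add and its guard disappear.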
  60. Virtuals (1)

    virtuals.py

        def fn():
            i = 0
            while i < 5000:
                i += 2
            return i
  61. Virtuals (2)

    unoptimized
        ...
        guard_class(p0, W_IntObject)
        i1 = getfield_pure(p0, 'intval')
        i2 = int_add(i1, 2)
        p3 = new(W_IntObject)
        setfield_gc(p3, i2, 'intval')
        ...

    optimized
        ...
        i2 = int_add(i1, 2)
        ...

    - The most important optimization (TM)
    - It works both inside the trace and across the loop
    - It works for tons of cases
        - e.g. function frames
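A sketch of the "inside the trace" case: an object allocated in the trace and only read back later never needs to exist. The pass below is an illustration under stated assumptions, not PyPy's optimizer: `optimize_virtuals` and the tuple encoding are hypothetical, and objects that escape (and would have to be allocated after all) are not handled.

```python
def optimize_virtuals(trace):
    virtuals = {}  # pointer var -> {"cls": class name, "fields": {field: value}}
    alias = {}     # var -> var or constant it is known to be equal to
    out = []

    def resolve(x):
        return alias.get(x, x)

    for op in trace:
        kind = op[0]
        if kind == "new":                                   # ("new", ptr, cls)
            virtuals[op[1]] = {"cls": op[2], "fields": {}}
        elif kind == "setfield" and op[1] in virtuals:      # ("setfield", ptr, field, value)
            virtuals[op[1]]["fields"][op[2]] = resolve(op[3])
        elif kind == "getfield" and op[2] in virtuals:      # ("getfield", res, ptr, field)
            alias[op[1]] = virtuals[op[2]]["fields"][op[3]]
        elif kind == "guard_class" and op[1] in virtuals:   # ("guard_class", ptr, cls)
            # The class of a virtual is known statically; the guard is redundant.
            assert virtuals[op[1]]["cls"] == op[2]
        else:
            out.append((kind,) + tuple(resolve(a) for a in op[1:]))
    return out


unoptimized = [
    ("guard_class", "p0", "W_IntObject"),
    ("getfield", "i1", "p0", "intval"),
    ("int_add", "i2", "i1", 2),
    ("new", "p3", "W_IntObject"),
    ("setfield", "p3", "intval", "i2"),
    # A second addition that consumes the object just created.
    ("guard_class", "p3", "W_IntObject"),
    ("getfield", "i4", "p3", "intval"),
    ("int_add", "i5", "i4", 2),
]
optimized = optimize_virtuals(unoptimized)
for operation in optimized:
    print(operation)
```

The allocation, the field write, the second class guard and the second field read all vanish; the second add reads `i2` directly.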
  64. Constant folding (1)

    constfold.py

        def fn():
            i = 0
            while i < 5000:
                i += 2
            return i
  65. Constant folding (2)

    unoptimized
        ...
        i1 = getfield_pure(p0, 'intval')
        i2 = getfield_pure(<W_Int(2)>, 'intval')
        i3 = int_add(i1, i2)
        ...

    optimized
        ...
        i1 = getfield_pure(p0, 'intval')
        i3 = int_add(i1, 2)
        ...

    - It "finishes the job"
    - Works well together with other optimizations (e.g. virtuals)
    - It also does "normal, boring, static" constant-folding
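The folding of a pure field read on a constant object can be sketched as follows. `W_Int` is taken from the `<W_Int(2)>` notation on the slide; `fold_constants` and the tuple encoding of operations are hypothetical and exist only for this example.

```python
class W_Int:
    """Constant boxed integer, standing in for <W_Int(2)> in the trace."""

    def __init__(self, intval):
        self.intval = intval


def fold_constants(trace):
    known = {}  # var -> constant value computed at optimization time
    out = []
    for op in trace:
        kind, res, *args = op
        # Substitute variables whose value is already known.
        args = [known.get(a, a) if isinstance(a, str) else a for a in args]
        if kind == "getfield_pure" and isinstance(args[0], W_Int):
            # Pure read from a constant object: evaluate it right now.
            known[res] = getattr(args[0], args[1])
        elif kind == "int_add" and all(isinstance(a, int) for a in args):
            known[res] = args[0] + args[1]  # ordinary static folding
        else:
            out.append((kind, res, *args))
    return out


unoptimized = [
    ("getfield_pure", "i1", "p0", "intval"),
    ("getfield_pure", "i2", W_Int(2), "intval"),
    ("int_add", "i3", "i1", "i2"),
]
optimized = fold_constants(unoptimized)
for operation in optimized:
    print(operation)
```

The read from `p0` stays because `p0` is only known at run time, while `i2` is replaced by the literal 2.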
  68. Out of line guards (1)

    outoflineguards.py

        N = 2
        def fn():
            i = 0
            while i < 5000:
                i += N
            return i
  69. Out of line guards (2)

    unoptimized
        ...
        quasiimmut_field(<Cell>, 'val')
        guard_not_invalidated()
        p0 = getfield_gc(<Cell>, 'val')
        ...
        i2 = getfield_pure(p0, 'intval')
        i3 = int_add(i1, i2)

    optimized
        ...
        guard_not_invalidated()
        ...
        i3 = int_add(i1, 2)
        ...

    - Python is too dynamic, but we don't care :-)
    - No overhead in assembler code
    - Used a bit "everywhere"
    - Credits to Mark Shannon for the name :-)
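One way to picture an out-of-line guard: the check is not executed inside the loop at all; instead, whoever writes to the rarely-changing cell invalidates the code that relied on it. The classes below are a sketch under that assumption; `CompiledLoop`, `QuasiImmutableCell`, `read_for_jit` and `write` are hypothetical names, not PyPy APIs.

```python
class CompiledLoop:
    def __init__(self, name):
        self.name = name
        self.valid = True  # what guard_not_invalidated() conceptually checks


class QuasiImmutableCell:
    """A value that is expected to change rarely, such as the global N."""

    def __init__(self, val):
        self._val = val
        self._dependents = []

    def read_for_jit(self, loop):
        """Register `loop` as depending on the value and return it as a constant."""
        self._dependents.append(loop)
        return self._val

    def write(self, val):
        # The cost is paid here, on the rare write, not in the hot loop.
        self._val = val
        for loop in self._dependents:
            loop.valid = False
        self._dependents = []


cell_N = QuasiImmutableCell(2)
loop = CompiledLoop("fn loop")
constant_N = cell_N.read_for_jit(loop)  # the trace can now use the literal 2
before = loop.valid
cell_N.write(3)                         # someone rebinds N
after = loop.valid
print(constant_N, before, after)
```

Rebinding `N` marks the loop invalid, so the next entry would go back to the interpreter and eventually retrace with the new value.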
  72. Promotion

    - guard_value
    - specialize code
    - make sure not to overspecialize
    - example: type of objects
    - example: function code objects, ...
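Promotion can be pictured as "record a guard_value, then treat the value as a constant for the rest of the trace". The sketch below assumes a toy tracer; `Tracer`, `promote`, `trace_method_lookup`, the `call_const` operation and the `Point` class are all hypothetical and only illustrate the "type of objects" example.

```python
class Tracer:
    def __init__(self):
        self.trace = []

    def promote(self, name, value):
        """Record a guard on the current value and return it as a constant."""
        self.trace.append(("guard_value", name, value))
        return value


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm1(self):
        return abs(self.x) + abs(self.y)


def trace_method_lookup(tracer, obj, attr):
    cls = tracer.promote("type(obj)", type(obj))
    # cls is now a trace-time constant, so the lookup happens once, during
    # tracing, and its result is baked into the trace.
    method = getattr(cls, attr)
    tracer.trace.append(("call_const", method.__name__))
    return method


tracer = Tracer()
method = trace_method_lookup(tracer, Point(3, -4), "norm1")
value = method(Point(3, -4))
print(tracer.trace, value)
```

Promoting a value that changes on every iteration would produce a new specialized trace each time, which is the overspecialization the slide warns about.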
  73. Conclusion

    - PyPy is cool :-)
    - Any questions?