Let's Talk about PyPy

Invited Keynote at PyCon 2012. Santa Clara. Conference video at https://www.youtube.com/watch?v=l_HBRhcgeuQ

David Beazley

October 03, 2012

Transcript

  1. Let's Talk About
    David Beazley
    @dabeaz

  2. PyPy Project
    • Perhaps you've heard about PyPy
    • Python implemented in Python
    • It is apparently quite a bit faster
    • How is that possible? Magic???

  3. It's Not Slow
    Draw the Mandelbrot set
    Credit: Jeff Preshing
    CPython 2.7: 502s
    _ = (
    255,
    lambda
    V ,B,c
    :c and Y(V*V+B,B, c
    -1)if(abs(V)<6)else
    ( 2+c-4*abs(V)**-0.4)/i
    ) ;v, x=1500,1000;C=range(v*x
    );import struct;P=struct.pack;M,\
    j ='for X in j('BM'+P(M,v*x*3+26,26,12,v,x,1,24))or C:
    i ,Y=_;j(P('BBB',*(lambda T:(T*80+T**9
    *i-950*T **99,T*70-880*T**18+701*
    T **9 ,T*i**(1-T**45*2)))(sum(
    [ Y(0,(A%3/3.+X%v+(X/v+
    A/3/3.-x/2)/1j)*2.5
    /x -2.7,i)**2 for \
    A in C
    [:9]])
    /9)
    ) )

  4. It's Not Slow
    Draw the Mandelbrot set
    Credit: Jeff Preshing
    CPython 2.7: 502s
    _ = (
    255,
    lambda
    V ,B,c
    :c and Y(V*V+B,B, c
    -1)if(abs(V)<6)else
    ( 2+c-4*abs(V)**-0.4)/i
    ) ;v, x=1500,1000;C=range(v*x
    );import struct;P=struct.pack;M,\
    j ='for X in j('BM'+P(M,v*x*3+26,26,12,v,x,1,24))or C:
    i ,Y=_;j(P('BBB',*(lambda T:(T*80+T**9
    *i-950*T **99,T*70-880*T**18+701*
    T **9 ,T*i**(1-T**45*2)))(sum(
    [ Y(0,(A%3/3.+X%v+(X/v+
    A/3/3.-x/2)/1j)*2.5
    /x -2.7,i)**2 for \
    A in C
    [:9]])
    /9)
    ) )
    PyPy-1.8: 203s
    Speedup of ~2.5x

  5. It's Not Slow
    Draw the Mandelbrot set (non-obfuscated)
    Python 2.7.2 : 14.5s
    Python 2.7.2+ctypes : 0.95s
    PyPy-1.8 : 0.42s
    Yow! That's 34x faster!
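
The benchmark program itself is not part of this transcript. As a stand-in, here is a minimal pure-Python Mandelbrot sketch. The function names, image size, and iteration limit are all assumptions for illustration, and this is not the script behind the timings above. It is simply the kind of tight numeric loop where running the same file under `python` and under `pypy` tends to show a gap.

```python
# Illustrative only: NOT the benchmark used for the timings on the slide.
def mandelbrot_iterations(c, max_iter=100):
    """Return how many iterations z = z*z + c stays within radius 2."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter


def render(width=60, height=24, max_iter=100):
    """Return the set as a list of text rows ('#' = point never escaped)."""
    rows = []
    for row in range(height):
        y = -1.2 + 2.4 * row / float(height - 1)
        line = []
        for col in range(width):
            x = -2.0 + 3.0 * col / float(width - 1)
            inside = mandelbrot_iterations(complex(x, y), max_iter) == max_iter
            line.append("#" if inside else ".")
        rows.append("".join(line))
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

To make the loop heavy enough to time, raise `width`, `height`, and `max_iter`; the defaults only draw a small ASCII picture.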

  6. I LIKE IT!
    "Laziness is the first great virtue of a programmer"
    -- Larry Wall

  7. CPython PyPy
    One is clearly faster than the other...

  8. CPython PyPy
    Just in time compilation
    Translation to C
    Optimization
    One is clearly faster than the other...

  9. CPython PyPy
    Just in time compilation
    Translation to C
    Optimization
    ... but performance is not what this talk is about.
    One is clearly faster than the other...

  10. CPython PyPy
    • Which one can you adjust with a pocketknife?

  11. CPython PyPy
    • Which one can you adjust with a pocketknife?
    ... in the dark

  12. CPython PyPy
    • Which one can you adjust with a pocketknife?
    ... in the dark
    ... under a pressing deadline

  13. CPython PyPy
    • Which one can you adjust with a pocketknife?
    ... in the dark
    ... under a pressing deadline
    I speak from some experience...

  14. View Slide

  15. Thinking about Tinkering
    (with PyPy)

  16. Tinkering Matters!
    CPython

  17. Tinkering Matters!
    CPython
    Patches

  18. Tinkering Matters!
    CPython
    Patches
    PEPs

  19. Tinkering Matters!
    CPython
    Patches
    PEPs Extensions
    python-ideas

  20. "Oh, Interesting..."

  21. Exploring New Ideas
    ported to
    An "afternoon hack," with a big impact
    parallel Python

  22. Exploring New Ideas
    ported to
    An "afternoon hack," with a big impact
    parallel Python
    ... we didn't choose Python for performance.

  23. Tinkering Creates Cool Stuff

  24. CPython PyPy
    An honest question
    • Is PyPy something that YOU can tinker with?
    • As in YOU... sitting in this room.
    • Or is it for "evil geniuses only?"

  25. Armin
    Maciej
    Alex
    You?

  26. A Confession
    • PyPy scares me
    • It's fast. I get that.
    • A lot of moving parts
    • A lot of advanced computer science inside

  27. Tinker Away!
    I Worry About Complexity ...
    Abandon all hope

  28. CPython PyPy
    An honest question
    • Is PyPy something that YOU can tinker with?
    Honest answer: I don't know

  29. An Experiment:
    • See if I could teach myself to tinker with PyPy
    • From scratch (I'm not a PyPy developer)
    • Use nothing but the source, online docs, etc.

  30. Constraints
    A part-time project

  31. Tinkering with PyPy != Using PyPy
    • If you want to use it, just run it
    • It's Python.
    • Not so interesting (not as much as tinkering)
    bash % pypy gofaster.py

  32. Tinkering with PyPy != Creating PyPy
    • Submit a useful bug report (or patch)
    • Make extensions
    • Study parts of the implementation (GIL, etc.)
    • Post messages on the pypy-dev mailing list

  33. Where To Start?
    • Tinkerers use source
    • You build it yourself!
    • You read instructions
    http://pypy.org
    Go Download it. Now!

  34. Running py.py
    • PyPy is written in "Python"... you can run it
    bash % python pypy-1.8/pypy/bin/py.py
    [platform:execute] gcc-4.0 -c -arch x86_64 -O
    frame-pointer - \
    ...
    PyPy 1.8.0 in StdObjSpace on top of Python 2.
    (startuptime: 23.23 secs)
    >>>>
    • Performance is terrible!
    • You wouldn't do it except for debugging

  35. Translating PyPy
    • To get the "real" version, you translate it
    • Huh? No makefile? No setup.py?
    • Already, I'm getting nervous.
    bash % cd pypy/translator/goal
    bash % python translate.py -Ojit

  36. Demo

  37. Building PyPy
    Some Facts:
    • Movie is @ 64x speed
    • Takes a few hours
    Contrast: Configure and build CPython-3.2.2
    • ./configure; make -j8
    • Takes about 90 seconds
    Immediate Problem:
    • Finding enough RAM
    • It takes >4GB
    4 cores, 8 GB RAM EC2, m2.xlarge (17GB)
    What's Actually Happening
    • Translation of PyPy to C
    • Creates ~800 C files
    • ~10.4 million lines!
    • 350 Mbytes
    It might kill the C compiler (or your machine)
    • Example: gcc-4.2
    This is Amazing!
    • Dare I say "diabolical"
    • If not intimidating
    One of the most daunting parts of PyPy
    • Must redo the process if you make any tweak
    • An obvious barrier to casual tinkering

  38. RPython
    • PyPy is actually implemented in "RPython"
    • RPython is not an "interpreter", but a
    restricted subset of the Python language
    Python
    rpython
    • It can run as valid Python code, but that's
    about the only similarity

  39. RPython
    • Formal specification (in their own words):
    "RPython is everything that our translation
    toolchain can accept"

  40. RPython
    • Formal specification (in their own words):
    "RPython is everything that our translation
    toolchain can accept"
    • An analogy
    "Python is everything that runs without
    generating a traceback."

  41. RPython
    • Formal specification (in their own words):
    "RPython is everything that our translation
    toolchain can accept"
    • An analogy
    "Python is everything that runs without
    generating a traceback."
    • Okay, let's go reading...

  42. Documentation

  43. High-level Docs

  44. Detailed Tech Reports

  45. Detailed Tech Reports
    To be fair, it was a funded
    academic project in PL.
    (They had to write like this)

  46. Source Code
    By The Numbers:
    • 454 directories
    • 5534 files (4513 .py source files)
    • ~1.25 million non-blank source lines (.py)
    It's not so easy to just jump in and make sense of it
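
The slide does not say how these figures were measured. One way to get comparable counts for any source tree is sketched below; `count_python_source` and the keys of the dictionary it returns are hypothetical names chosen for this example.

```python
import os


def count_python_source(root):
    """Walk `root` and tally directories, files, and non-blank .py lines."""
    totals = {"directories": 0, "files": 0, "py_files": 0, "py_lines": 0}
    for dirpath, dirnames, filenames in os.walk(root):
        totals["directories"] += 1          # includes `root` itself
        for name in filenames:
            totals["files"] += 1
            if name.endswith(".py"):
                totals["py_files"] += 1
                # Read as bytes so odd encodings cannot break the count.
                with open(os.path.join(dirpath, name), "rb") as f:
                    totals["py_lines"] += sum(1 for line in f if line.strip())
    return totals
```

Pointing it at an unpacked PyPy source directory gives a rough sense of the scale involved, though the totals will vary by release.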

  47. Reading Blogs
    • Recommended start: Andrew Brown
    "Tutorial: Writing an Interpreter with PyPy"
    • Laurence Tratt
    "Fast Enough VMs in Fast Enough Time"
    http://bit.ly/fmV2wx
    http://bit.ly/y8GLqf

  48. Just Do It
    (Live RPython Coding Demo)

  49. RPython in a Nutshell
    • RPython is a completely different language
    • Python syntax, yes.
    • Must be compiled (like C, C++, etc.)
    • Static typing via type inference
    • Limited set of libraries
    • If you love Python, you will hate RPython

  50. Type Inference Illustrated
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    def main(argv):
        print fib(int(argv[1]))
        return 0

  51. Type Inference Illustrated
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    def main(argv):
        print fib(int(argv[1]))
        return 0
    int

  52. Type Inference Illustrated
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    def main(argv):
        print fib(int(argv[1]))
        return 0
    int
    int

  53. Type Inference Illustrated
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    def main(argv):
        print fib(int(argv[1]))
        return 0
    int
    int
    int

  54. Type Inference Illustrated
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    def main(argv):
        print fib(int(argv[1]))
        return 0
    int
    int
    int
    int

  55. Type Inference Illustrated
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    def main(argv):
        print fib(int(argv[1]))
        return 0
    int
    int
    int
    int
    int
    Key Point: Think static typing (like C)
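
To make the "think static typing" point concrete, here is a small sketch that runs as ordinary Python. The `target` function follows the convention described in the Andrew Brown tutorial for telling the translator where the entry point is; treat that, and the name `not_rpython`, as assumptions for illustration rather than a checked translation target.

```python
def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)


def entry_point(argv):
    # int(argv[1]) is an int, so a type inferencer can conclude that fib's
    # argument and return value are ints everywhere they flow.
    print(fib(int(argv[1])))
    return 0


def target(*args):
    # Assumed convention: the translator asks the target file for its
    # entry point through a function like this one.
    return entry_point, None


def not_rpython(flag):
    # Fine in Python, but `x` is an int on one path and a str on the other,
    # so no single static type can be inferred for it.
    if flag:
        x = 1
    else:
        x = "one"
    return x
```

The first three functions are the kind of code whole-program inference can handle; `not_rpython` shows a pattern that dynamic Python accepts happily and a statically typed view of the program cannot.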

  56. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    Now think about the
    whole program
    Type inference +
    restrictions

  57. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    There is a single spark...
    Entry point

  58. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    Entry point
    Exploration Begins

  59. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    Exploration Begins
    Entry point

  60. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    Entry point
    Exploration Begins
    Whole program
    type annotation!

  61. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    All reachable control-
    flow paths are followed
    Entry point
    Whole program
    type annotation!

  62. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    Entry point
    Imagine this diagram,
    but with tens of
    thousands of functions
    Whole program
    type annotation!

  63. # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    Key Insight:
    Entry point
    All of the reachable
    code is RPython

  64. RPython
    def name1(args):
    statement
    statement
    statement
    # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    # file2.py
    def name1(args):
    statement
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    PyPy
    Source
    def name1(args):
    statement
    statement
    # file3.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    def name4(args):
    statement
    statement
    class B(object):
    def method1(self,args):
    statement
    statement
    statement
    def method2(self,args):
    statement
    statement
    def name1(args):
    statement
    statement
    def name2(args):
    statement
    statement
    def name4(args):
    statement
    statement
    Translation
    C
    Compile
    pypy-c
    Entry point

  65. View Slide

  66. Understanding Translation
    • The translation process will blow your mind
    • Full understanding by mortals is probably futile
    • Snakes + Souls of Ph.D. students inside?
    • Let's look at a small taste...

  67. A Function
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    Obvious question: How does it translate to C?

  68. Traditional Compiler
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    Lexer Parser IR C
    You might think it's like a traditional compiler.

  69. Traditional Compiler
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    Lexer Parser IR C
    You might think it's like a traditional compiler.
    (and you would be wrong)

  70. Traditional Compiler
    IR
    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n-1) + fib(n-2)
    Lexer Parser C
    Insight: Python already parsed the code!
    ... so don't do it again.
    ?????

  71. RPython Translation
    IR C
    Translation occurs directly from Python code objects
    >>> fib.__code__.co_code
    '|\x00\x00d\x01\x00k\x00\x00r\x10\x00
    d\x02\x00St\x00\x00|\x00\x00d\x02\x00
    \x18\x83\x01\x00t\x00\x00|\x00\x00d
    \x01\x00\x18\x83\x01\x00\x17Sd\x00\x00S'
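
The code object shown above can be inspected from any Python prompt with the standard `dis` module. A minimal sketch follows; the exact opcodes and byte values differ between Python versions, so the output will not match the Python 2.7 string on the slide.

```python
import dis


def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)


code = fib.__code__
print(code.co_varnames)      # names of local variables, here ('n',)
print(len(code.co_code))     # size of the raw bytecode string in bytes
dis.dis(fib)                 # human-readable listing of the same bytecode
```

The point is that the compiled form already exists as a plain Python object, so a tool written in Python can read it without lexing or parsing the source again.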

  72. Bytecode Interpretation
    CPython
    • Python has a bytecode interpreter
    • Core of the eval loop (written in C).
    bytecode
    interpreter
    runtime
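
For readers who have not seen an eval loop before, here is a toy one. It is a sketch of the general idea only: the opcode names are borrowed from CPython's listings for familiarity, but the `(opname, argument)` tuple format and the `run` function are assumptions for this example and have nothing to do with how CPython or PyPy actually encode or dispatch bytecode.

```python
def run(program):
    """Execute a list of (opname, argument) pairs on a value stack."""
    stack = []
    pc = 0                                  # index of the next instruction
    while True:
        opname, arg = program[pc]
        pc += 1
        if opname == "LOAD_CONST":
            stack.append(arg)
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "BINARY_SUBTRACT":
            right = stack.pop()
            left = stack.pop()
            stack.append(left - right)
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opname)


program = [
    ("LOAD_CONST", 40),
    ("LOAD_CONST", 2),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
result = run(program)   # 42
```

A real interpreter adds frames, names, jumps, and exceptions, but the fetch, dispatch, and stack manipulation cycle is the same shape.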

  73. Bytecode Interpretation
    CPython
    • It executes the bytecode
    bytecode
    interpreter
    runtime
    >>> fib.__code__.co_code
    '|\x00\x00d\x01\x00k\x00\x00r\x1
    d\x02\x00St\x00\x00|\x00\x00d\x
    \x18\x83\x01\x00t\x00\x00|\x00\
    \x01\x00\x18\x83\x01\x00\x17Sd\

  74. Bytecode Interpretation
    PyPy
    • PyPy has a bytecode interpreter too
    • Written in pure Python (that's the whole idea)
    bytecode
    interpreter
    runtime
    >>> fib.__code__.co_code
    '|\x00\x00d\x01\x00k\x00\x00r\x1
    d\x02\x00St\x00\x00|\x00\x00d\x
    \x18\x83\x01\x00t\x00\x00|\x00\
    \x01\x00\x18\x83\x01\x00\x17Sd\

  75. Bytecode Interpretation
    PyPy
    runtime
    bytecode
    interpreter
    • Bytecode interpreter is modular
    • Also used by the translate.py program
    translate.py
    bytecode
    interpreter
    abstract runtime

  76. Just to be clear...
    PyPy translates itself using its
    own bytecode interpreter

  77. Abstract Interpretation
    translate.py
    bytecode
    interpreter
    abstract runtime
      2           0 LOAD_FAST                0 (n)
                  3 LOAD_CONST               1 (2)
                  6 COMPARE_OP               0 (<)
                  9 POP_JUMP_IF_FALSE       16
      3          12 LOAD_CONST               2 (1)
                 15 RETURN_VALUE
      5     >>   16 LOAD_GLOBAL              0 (fib)
                 19 LOAD_FAST                0 (n)
                 22 LOAD_CONST               2 (1)
                 25 BINARY_SUBTRACT
                 26 CALL_FUNCTION            1
                 29 LOAD_GLOBAL              0 (fib)
                 32 LOAD_FAST                0 (n)
                 35 LOAD_CONST               1 (2)
                 38 BINARY_SUBTRACT
                 39 CALL_FUNCTION            1
                 42 BINARY_ADD
                 43 RETURN_VALUE
                 44 LOAD_CONST               0 (None)
                 47 RETURN_VALUE
    Translator runs the
    code "in the abstract"

  78. Abstract Interpretation
    Translator runs the
    code "in the abstract"
      2           0 LOAD_FAST                0 (n)
                  3 LOAD_CONST               1 (2)
                  6 COMPARE_OP               0 (<)
                  9 POP_JUMP_IF_FALSE       16
      3          12 LOAD_CONST               2 (1)
                 15 RETURN_VALUE
      5     >>   16 LOAD_GLOBAL              0 (fib)
                 19 LOAD_FAST                0 (n)
                 22 LOAD_CONST               2 (1)
                 25 BINARY_SUBTRACT
                 26 CALL_FUNCTION            1
                 29 LOAD_GLOBAL              0 (fib)
                 32 LOAD_FAST                0 (n)
                 35 LOAD_CONST               1 (2)
                 38 BINARY_SUBTRACT
                 39 CALL_FUNCTION            1
                 42 BINARY_ADD
                 43 RETURN_VALUE
                 44 LOAD_CONST               0 (None)
                 47 RETURN_VALUE
    translate.py
    bytecode
    interpreter
    abstract runtime

  79. Abstract Interpretation
    Translator runs the
    code "in the abstract"
      2           0 LOAD_FAST                0 (n)
                  3 LOAD_CONST               1 (2)
                  6 COMPARE_OP               0 (<)
                  9 POP_JUMP_IF_FALSE       16
      3          12 LOAD_CONST               2 (1)
                 15 RETURN_VALUE
      5     >>   16 LOAD_GLOBAL              0 (fib)
                 19 LOAD_FAST                0 (n)
                 22 LOAD_CONST               2 (1)
                 25 BINARY_SUBTRACT
                 26 CALL_FUNCTION            1
                 29 LOAD_GLOBAL              0 (fib)
                 32 LOAD_FAST                0 (n)
                 35 LOAD_CONST               1 (2)
                 38 BINARY_SUBTRACT
                 39 CALL_FUNCTION            1
                 42 BINARY_ADD
                 43 RETURN_VALUE
                 44 LOAD_CONST               0 (None)
                 47 RETURN_VALUE
    translate.py
    bytecode
    interpreter
    abstract runtime

  86. Abstract Interpretation
    A Control flow graph is
    constructed
    translate.py
    bytecode
    interpreter
    abstract runtime
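
Running code "in the abstract" can be imitated in a few lines of ordinary Python. The sketch below is a toy, not PyPy's implementation: `Trace`, `Symbol`, and `record_operations` are hypothetical names, and it handles only straight-line arithmetic, with no branches and therefore no real control flow graph. It works at the level of operator overloading rather than bytecode, but it shows the core trick: call the function with placeholder values and write down each operation instead of computing it.

```python
import itertools


class Trace(object):
    """Collects the operations seen while running a function abstractly."""

    def __init__(self):
        self.ops = []
        self._ids = itertools.count()

    def fresh(self):
        return Symbol("v%d" % next(self._ids), self)


class Symbol(object):
    """Placeholder for a value that is unknown at translation time."""

    def __init__(self, name, trace):
        self.name = name
        self.trace = trace

    def _record(self, opname, other):
        result = self.trace.fresh()
        operand = other.name if isinstance(other, Symbol) else repr(other)
        self.trace.ops.append((result.name, opname, self.name, operand))
        return result

    def __add__(self, other):
        return self._record("add", other)

    def __sub__(self, other):
        return self._record("sub", other)

    def __mul__(self, other):
        return self._record("mul", other)


def record_operations(func):
    """Call func with a symbolic argument; return (operations, result name)."""
    trace = Trace()
    argument = trace.fresh()          # becomes "v0"
    result = func(argument)
    return trace.ops, result.name


def poly(n):
    return (n - 1) * (n + 2)


ops, result = record_operations(poly)
for op in ops:
    print(op)
```

The recorded list is a tiny intermediate representation of `poly`, obtained without ever parsing its source text.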

  87. Abstract Interpretation
    C Code
    The full details are "hairy"

  88. "You are in a maze of twisty
    little passages, all alike."
    (and a huge green fierce snake bars the way)

  89. Understanding the Source
    • Two different languages co-exist (same syntax)
    # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    Full Python????
    RPython????
    Which is it?

  90. Understanding the Source
    • Two different languages co-exist (same syntax)
    # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    Full Python????
    RPython????
    Which is it?
    (You can't look in isolation)

  91. Understanding the Source
    def cast_object_to_ptr(PTR, object):
        """NOT_RPYTHON: hack. The object may
        Limited to casting a given object to
        """
        if isinstance(PTR, lltype.Ptr):
            TO = PTR.TO
        else:
            TO = PTR
        ...
    • PyPy uses doc strings to help you sort it out
    • It is enforced by the translator (an assertion)
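
The transcript does not show how the translator performs that check. A rough sketch of the idea, using a hypothetical helper named `assert_translatable` (not a PyPy function), might look like this:

```python
def assert_translatable(func):
    """Hypothetical check: refuse functions whose docstring opts out."""
    doc = func.__doc__ or ""
    assert not doc.lstrip().startswith("NOT_RPYTHON"), (
        "%s is marked NOT_RPYTHON but was reached during translation"
        % func.__name__)
    return func


def helper(x):
    """NOT_RPYTHON: uses full-Python tricks, import time only."""
    return x


def core(x):
    return x + 1
```

Calling `assert_translatable(core)` passes, while `assert_translatable(helper)` raises `AssertionError`, which is roughly the failure you would expect if marked code became reachable from the entry point.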

  92. Understanding the Source
    • Deeper question: Why would you have mixed code?
    # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    RPython
    Python
    • Head throbbing...

  93. Execution Contexts
    # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    Translation (Python) Executable (C)

  94. Execution Contexts
    # file1.py
    def name2(args):
    statement
    statement
    def name3(args):
    statement
    statement
    statement
    statement
    def name1(args):
    statement
    statement
    statement
    Translation (Python) Executable (C)
    • At translation, the code separates
    Metaprogramming Implementation
    • decorators
    • metaclasses
    • exec()

  95. Example
    def decorator(func):
        statements
        ...
        def wrapper(*args,**kwargs):
            statements
            ...
            return func(*args,**kwargs)
        return wrapper
    @decorator
    def func(args):
        statements
        ...

  96. Example
    def decorator(func):
        statements
        ...
        def wrapper(*args,**kwargs):
            statements
            ...
            return func(*args,**kwargs)
        return wrapper
    @decorator
    def func(args):
        statements
        ...
    Python
    RPython
    Python
    RPython

  97. Rules of Thumb
    • Code that executes at import time can
    make use of all Python features
    • Code reachable through the entry point
    (target) is RPython
    • Keeping it straight is hard (for me anyways)
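
A small sketch of the two phases, runnable as plain Python. The names `OPCODES`, `opcode`, `op_inc`, and `op_double` are made up for this example, and the code has not been checked against the translator, so read it as an illustration of where the line falls rather than as verified RPython.

```python
OPCODES = {}


def opcode(name):
    """Import-time decorator: runs once, may use any Python feature."""
    def register(func):
        OPCODES[name] = func      # side effect happens while the module loads
        return func               # the function itself is left unwrapped
    return register


@opcode("inc")
def op_inc(value):
    return value + 1


@opcode("double")
def op_double(value):
    return value * 2


def entry_point(argv):
    # Everything reachable from here would have to obey RPython's rules.
    value = int(argv[1])
    for name in argv[2:]:
        value = OPCODES[name](value)
    print(value)
    return 0
```

By the time `entry_point` could be analyzed, the decorator has already finished its work and only the finished table and the plain functions remain. For instance, `entry_point(["prog", "3", "inc", "double"])` prints 8.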

  98. But Wait, There's More!

  99. Foreign Code
    • PyPy is written in "Python", but can access
    external C code and libraries
    • os, math, time, threads, etc.
    • There is a highly developed FFI mechanism
    • Plus a configuration system (think autoconf)

  100. Example
    (Accessing Foreign Functions)

  101. A Quandary
    • How do I end this talk?
    • I've only talked about RPython
    • When do we get to the PyPy?

  102. A Realization
    I still don't know
    how PyPy works!
    Score: PyPy: 1 Dave: 0

  103. A Deeper Realization
    I don't even know how
    CPython works!

  104. WAT!?!
    WAT!?!

  105. A Clarification
    I do know how to use the
    tools that make CPython

  106. A Clarification
    I do know how to use the
    tools that make CPython
    • ANSI C
    • Makefiles
    • Algorithms
    • Data Structures

  107. The Challenge
    PyPy has a different set of tools
    • RPython
    • translate.py
    • Metaprogramming
    • Foreign Functions

  108. So how to end this talk?

  109. Wait! I used to be an evil professor!

  110. Figuring out how PyPy works
    is left as an exercise!
    (You'll learn a lot)

  111. Postscript

  112. Postscript
    Let's talk about Ruby!
    threads.each { |aThread| aThread.join }

  113. Breaking GILs
    • As you know, I like breaking GILs
    • You know, global interpreter locks
    • As in threads and stuff...
    • I love it!

  114. A Benchmark
    • Message-passing with a CPU-bound thread
    C : 1.11s
    Python 2.7 : 1.60s
    Ruby 1.9 : 5839.4s
    • Don't concern yourself with the details
    • Ruby 3600x slower than Python?
    • What's that all about? Let's go tinker!
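
The benchmark code is not in the transcript. For anyone who wants to poke at the same kind of workload, here is a hedged sketch of message passing between two threads with an optional CPU-bound thread competing for the interpreter. The function name `time_round_trips` and the whole structure are assumptions for illustration; this is not the program that produced the numbers above.

```python
import threading
import time

try:
    import queue                  # Python 3
except ImportError:
    import Queue as queue         # Python 2


def time_round_trips(n_messages, with_cpu_thread):
    """Time n_messages request/reply round trips through an echo thread."""
    requests = queue.Queue()
    replies = queue.Queue()
    stop = threading.Event()

    def echo():
        while True:
            message = requests.get()
            if message is None:       # sentinel: shut down
                return
            replies.put(message)

    def spin():
        while not stop.is_set():      # burn CPU until told to stop
            pass

    workers = [threading.Thread(target=echo)]
    if with_cpu_thread:
        workers.append(threading.Thread(target=spin))
    for worker in workers:
        worker.start()
    start = time.time()
    try:
        for i in range(n_messages):
            requests.put(i)
            replies.get()
    finally:
        elapsed = time.time() - start
        requests.put(None)
        stop.set()
        for worker in workers:
            worker.join()
    return elapsed
```

Comparing `time_round_trips(1000, False)` with `time_round_trips(1000, True)` on different interpreters shows how much a busy thread delays the threads that only want to exchange messages.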

  115. Tinkering with Ruby
    • It was pretty straightforward
    • Finding the GIL didn't take long
    • An afternoon of fiddling around
    (Search for my talk at RuPy 2011)
    • Caused by a more extreme case of the
    thread priority inversion that's in Python 3.3

  116. Just to be clear...
    I couldn't write a real Ruby
    program to save my life right now.

  117. A PyPy Benchmark
    • A similar message-passing benchmark
    Python 2.7 : 15.6s
    PyPy-1.6 : 6689.2s (428x slower)
    • Huh? What's that all about?
    • No idea! Or even how to look.
    • That is the whole reason for this talk

  118. Parting Words
    • Can you tinker with PyPy?
    • Honest answer: I still don't know
    • Should you try to go tinker with it anyways?
    • YES!
    • You will find interesting things inside

  119. "My God, it's full of stars!"

  120. "My God, it's full of stars!"
    (or VMs?)

  121. Thanks!
    • Hope you learned at least one new thing
    • Special thanks:
    • Alex Gaynor
    • Maciej Fijalkowski
    • Chipy
    • Twitter: @dabeaz
