David Beazley
October 03, 2012
Invited Keynote at PyCon 2012. Santa Clara. Conference video at https://www.youtube.com/watch?v=l_HBRhcgeuQ

## Transcript

David Beazley
@dabeaz

2. PyPy Project
• Perhaps you've heard about PyPy
• Python implemented in Python
• It is apparently quite a bit faster
• How is that possible? Magic???

3. It's Not Slow
Draw the Mandelbrot set
Credit: Jeff Preshing
CPython 2.7: 502s
_ = (
255,
lambda
V ,B,c
:c and Y(V*V+B,B, c
-1)if(abs(V)<6)else
( 2+c-4*abs(V)**-0.4)/i
) ;v, x=1500,1000;C=range(v*x
);import struct;P=struct.pack;M,\
j ='for X in j('BM'+P(M,v*x*3+26,26,12,v,x,1,24))or C:
i ,Y=_;j(P('BBB',*(lambda T:(T*80+T**9
*i-950*T **99,T*70-880*T**18+701*
T **9 ,T*i**(1-T**45*2)))(sum(
[ Y(0,(A%3/3.+X%v+(X/v+
A/3/3.-x/2)/1j)*2.5
/x -2.7,i)**2 for \
A in C
[:9]])
/9)
) )

5. It's Not Slow
Draw the Mandelbrot set (non-obfuscated)
Python 2.7.2 : 14.5s
Python 2.7.2+ctypes : 0.95s
PyPy-1.8 : 0.42s
Yow! That's 34x faster!
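For orientation, a non-obfuscated Mandelbrot renderer comes down to an escape-time loop like the sketch below. This is not the benchmark program from the talk (the slides do not reproduce it); the function names, grid size, viewport, and iteration limit are all assumptions chosen to keep the example small.

```python
from __future__ import division  # keep '/' as true division on Python 2 as well


def escape_time(c, max_iter=50):
    """Count iterations of z -> z*z + c before |z| exceeds 2 (capped at max_iter)."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def render(width=60, height=24, max_iter=50):
    """Return rows of text: '#' where the point never escaped, '.' elsewhere."""
    rows = []
    for y in range(height):
        im = (y / (height - 1)) * 2.4 - 1.2      # imaginary axis: -1.2 .. 1.2
        row = []
        for x in range(width):
            re = (x / (width - 1)) * 3.5 - 2.5   # real axis: -2.5 .. 1.0
            inside = escape_time(complex(re, im), max_iter) == max_iter
            row.append("#" if inside else ".")
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

The "34x" on the slide is the ratio of the two timings: 14.5 s / 0.42 s ≈ 34.5.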

6. I LIKE IT!
"Laziness is the first great virtue of a programmer"
-- Larry Wall

18. Tinkering Matters!
CPython
Patches
PEPs Extensions
python-ideas

19. "Oh, Interesting..."

21. Exploring New Ideas
ported to
An "afternoon hack," with a big impact
parallel Python
... we didn't choose Python for performance.

22. Tinkering Creates Cool Stuff

23. CPython PyPy
An honest question
• Is PyPy something that YOU can tinker with?
• As in YOU... sitting in this room.
• Or is it for "evil geniuses only?"

24. Armin
Maciej
Alex
You?

25. A Confession
• PyPy scares me
• It's fast. I get that.
• A lot of moving parts
• A lot of advanced computer science inside

26. Tinker Away!
Abandon all hope

27. CPython PyPy
An honest question
• Is PyPy something that YOU can tinker with?

28. An Experiment:
• See if I could teach myself to tinker with PyPy
• From scratch (I'm not a PyPy developer)
• Use nothing but the source, online docs, etc.

29. Constraints
A part-time project

30. Tinkering with PyPy != Using PyPy
• If you want to use it, just run it
• It's Python.
• Not so interesting (not as much as tinkering)
bash % pypy gofaster.py

31. Tinkering with PyPy != Creating PyPy
• Submit a useful bug report (or patch)
• Make extensions
• Study parts of the implementation (GIL, etc.)
• Post messages on [email protected]

32. Where To Start?
• Tinkerers use source
• You build it yourself!
http://pypy.org

33. Running py.py
• PyPy is written in "Python"... you can run it
bash % python pypy-1.8/pypy/bin/py.py
[platform:execute] gcc-4.0 -c -arch x86_64 -O
frame-pointer - \
...
PyPy 1.8.0 in StdObjSpace on top of Python 2.
(startuptime: 23.23 secs)
>>>>
• Performance is terrible!
• You wouldn't do it except for debugging

34. Translating PyPy
• To get the "real" version, you translate it
• Huh? No makefile? No setup.py?
bash % cd pypy/translator/goal
bash % python translate.py -Ojit

35. Demo

36. Building PyPy
Some Facts:
• Movie is @ 64x speed
• Takes a few hours
Contrast: Configure and build CPython-3.2.2
• ./configure; make -j8
Immediate Problem:
• Finding enough RAM
• It takes >4GB
4 cores, 8 GB RAM EC2, m2.xlarge (17GB)
What's Actually Happening
• Translation of PyPy to C
• Creates ~800 C files
• ~10.4 million lines!
• 350 Mbytes
It might kill the C compiler (or your machine)
• Example: gcc-4.2
This is Amazing!
• Dare I say "diabolical"
• If not intimidating
One of the most daunting parts of PyPy
• Must redo the process if you make any tweak
• An obvious barrier to casual tinkering

37. RPython
• PyPy is actually implemented in "RPython"
• RPython is not an "interpreter", but a
restricted subset of the Python language
Python
rpython
• It can run as valid Python code, but that's

40. RPython
• Formal specification (in their own words):
"RPython is everything that our translation
toolchain can accept"
• An analogy
"Python is everything that runs without
generating a traceback."

41. Documentation

42. High-level Docs

43. Detailed Tech Reports

44. Detailed Tech Reports
To be fair, it was a funded
(They had to write like this)

45. Source Code
By The Numbers:
• 454 directories
• 5534 files (4513 .py source files)
• ~1.25 million non-blank source lines (.py)
It's not so easy to just jump in and make sense of it
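Counts like these can be reproduced on any source tree with a short walk over the directories. The sketch below is an assumption about how one might measure them, not the script used for the slide; `count_source` is a hypothetical helper name.

```python
import os


def count_source(root):
    """Return (directories, files, py_files, nonblank_py_lines) found under root."""
    n_dirs = n_files = n_py = n_lines = 0
    for dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        for name in filenames:
            n_files += 1
            if name.endswith(".py"):
                n_py += 1
                path = os.path.join(dirpath, name)
                # Read as bytes so odd encodings cannot break the count.
                with open(path, "rb") as f:
                    n_lines += sum(1 for line in f if line.strip())
    return n_dirs, n_files, n_py, n_lines
```

Calling `count_source("pypy-1.8")` on an unpacked source tree (the path is an example) would give figures comparable to those above.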

• Recommended start: Andrew Brown, "Tutorial: Writing an Interpreter with PyPy"
• Laurence Tratt, "Fast Enough VMs in Fast Enough Time"
http://bit.ly/fmV2wx
http://bit.ly/y8GLqf

47. Just Do It
(Live RPython Coding Demo)

48. RPython in a Nutshell
• RPython is a completely different language
• Python syntax, yes.
• Must be compiled (like C, C++, etc.)
• Static typing via type inference
• Limited set of libraries
• If you love Python, you will hate RPython

49. Type Inference Illustrated
def fib(n):
    if n < 2:
        return n
    else:
        return fib(n-1) + fib(n-2)
def main(argv):
    print fib(int(argv[1]))
    return 0
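To hand a program like this to translate.py, the file also needs to tell the translator where execution starts. The sketch below follows the convention used in Andrew Brown's tutorial (a `target` function returning the entry point); treat the details as an assumption rather than a full description of the toolchain. `print` is written with parentheses so the file also runs under Python 3.

```python
def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)


def main(argv):
    # int(argv[1]) pins down the argument type, so inference can conclude
    # that fib takes an int and returns an int.
    print(fib(int(argv[1])))
    return 0


def target(*args):
    # Hook the translation script is assumed to look for in the target file:
    # it returns the entry point plus None (no extra argument annotations).
    return main, None
```

Saved under a hypothetical name such as fibtarget.py, the file would be passed to translate.py as its target; it can also be imported and called from ordinary Python for testing.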

55. [Diagram: PyPy source spread across file1.py, file2.py, and file3.py (functions name1 to name4, class B with method1 and method2), analyzed as a whole program: type inference + restrictions]

64. Understanding Translation
• The translation process will blow your mind
• Full understanding by mortals is probably futile
• Snakes + Souls of Ph.D. students inside?
• Let's look at a small taste...

65. A Function
def fib(n):
    if n < 2:
        return n
    else:
        return fib(n-1) + fib(n-2)
Obvious question: How does it translate to C?

def fib(n):
    if n < 2:
        return n
    else:
        return fib(n-1) + fib(n-2)
Lexer Parser IR C
You might think it's like a traditional compiler.
(and you would be wrong)

IR
def fib(n):
    if n < 2:
        return n
    else:
        return fib(n-1) + fib(n-2)
Lexer Parser C
Insight: Python already parsed the code!
... so don't do it again.
?????

69. RPython Translation
IR C
Translation occurs directly from Python code objects
>>> fib.__code__.co_code
'|\x00\x00d\x01\x00k\x00\x00r\x10\x00
d\x02\x00St\x00\x00|\x00\x00d\x02\x00
\x18\x83\x01\x00t\x00\x00|\x00\x00d
\x01\x00\x18\x83\x01\x00\x17Sd\x00\x00S'
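The raw bytes above are easier to read through the standard `dis` module. The sketch below only uses CPython's own introspection; the exact opcodes printed vary between Python versions, so no output is shown.

```python
import dis


def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)


code = fib.__code__
print(code.co_varnames)   # local variable names used by the function
print(code.co_consts)     # constants referenced by the bytecode
print(len(code.co_code))  # size of the raw bytecode string shown on the slide
dis.dis(fib)              # human-readable listing of the same bytecode
```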

70. Bytecode Interpretation
CPython
• Python has a bytecode interpreter
• Core of the eval loop (written in C).
bytecode
interpreter
runtime

71. Bytecode Interpretation
CPython
• It executes the bytecode
bytecode
interpreter
runtime
>>> fib.__code__.co_code
'|\x00\x00d\x01\x00k\x00\x00r\x1
d\x02\x00St\x00\x00|\x00\x00d\x
\x18\x83\x01\x00t\x00\x00|\x00\
\x01\x00\x18\x83\x01\x00\x17Sd\
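The idea of a bytecode eval loop can be shown with a toy stack machine. The instruction names below are borrowed loosely from CPython for familiarity, but the instruction set, the `run` function, and the tuple encoding are all invented for this sketch; it is not how CPython or PyPy encode or dispatch bytecode.

```python
def run(program, env):
    """Execute a list of (opcode, argument) pairs using a value stack."""
    stack = []
    pc = 0  # index of the next instruction
    while True:
        op, arg = program[pc]
        pc += 1
        if op == "LOAD_CONST":
            stack.append(arg)
        elif op == "LOAD_NAME":
            stack.append(env[arg])
        elif op == "BINARY_SUB":
            right = stack.pop()
            left = stack.pop()
            stack.append(left - right)
        elif op == "COMPARE_LT":
            right = stack.pop()
            left = stack.pop()
            stack.append(left < right)
        elif op == "JUMP_IF_FALSE":
            if not stack.pop():
                pc = arg
        elif op == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % (op,))


# Toy program: return n if n < 2, otherwise return n - 1.
PROGRAM = [
    ("LOAD_NAME", "n"),      # 0
    ("LOAD_CONST", 2),       # 1
    ("COMPARE_LT", None),    # 2
    ("JUMP_IF_FALSE", 6),    # 3
    ("LOAD_NAME", "n"),      # 4
    ("RETURN_VALUE", None),  # 5
    ("LOAD_NAME", "n"),      # 6
    ("LOAD_CONST", 1),       # 7
    ("BINARY_SUB", None),    # 8
    ("RETURN_VALUE", None),  # 9
]
```

`run(PROGRAM, {"n": 5})` walks the loop once per instruction, which is the same shape as the eval loop described on the slide, only much smaller.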

86. "You are in a maze of twisty
little passages, all alike."
(and a huge green fierce snake bars the way)

88. Understanding the Source
• Two different languages co-exist (same syntax)
# file1.py
def name2(args):
    statement
    statement
def name3(args):
    statement
    statement
    statement
    statement
def name1(args):
    statement
    statement
    statement
Full Python????
RPython????
Which is it?
(You can't look in isolation)

89. Understanding the Source
def cast_object_to_ptr(PTR, object):
    """NOT_RPYTHON: hack. The object may
    Limited to casting a given object to
    """
    if isinstance(PTR, lltype.Ptr):
        TO = PTR.TO
    else:
        TO = PTR
    ...
• It is enforced by the translator (an assertion)
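The kind of check the slide describes can be sketched in a few lines: look at a function's docstring and assert that it is not marked NOT_RPYTHON before accepting it for translation. Both helper names below are hypothetical, and this is not the translator's actual code.

```python
def is_marked_not_rpython(func):
    """True if the function's docstring starts with the NOT_RPYTHON marker."""
    doc = func.__doc__ or ""
    return doc.lstrip().startswith("NOT_RPYTHON")


def check_translatable(func):
    """Assert that func is not marked NOT_RPYTHON, then hand it back."""
    assert not is_marked_not_rpython(func), (
        "%s is marked NOT_RPYTHON" % func.__name__)
    return func


def helper():
    """NOT_RPYTHON: only meant to run at import time."""
    return 42


def kernel(x):
    """Ordinary docstring, no marker."""
    return x + 1
```

With these definitions, `check_translatable(kernel)` returns `kernel`, while `check_translatable(helper)` fails with an AssertionError.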

90. Understanding the Source
• Deeper question: Why would you have mixed code?
# file1.py
def name2(args):
    statement
    statement
def name3(args):
    statement
    statement
    statement
    statement
def name1(args):
    statement
    statement
    statement
RPython
Python

92. Execution Contexts
# file1.py
def name2(args):
    statement
    statement
def name3(args):
    statement
    statement
    statement
    statement
def name1(args):
    statement
    statement
    statement
Translation (Python) Executable (C)
• At translation, the code separates
Metaprogramming Implementation
• decorators
• metaclasses
• exec()

94. Example
def decorator(func):
    statements
    ...
    def wrapper(*args,**kwargs):
        statements
        ...
        return func(*args,**kwargs)
    return wrapper
@decorator
def func(args):
    statements
    ...
Python
RPython
Python
RPython

95. Rules of Thumb
• Code that executes at import time can
make use of all Python features
• Code reachable through the entry point
(target) is RPython
• Keeping it straight is hard (for me anyways)
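The split between the two contexts can be sketched as follows. The code-generating part runs once at import time and is free to use `exec`, string formatting, and dictionaries of any shape; only the small generated functions and the entry point would have to satisfy RPython's restrictions. All names here are hypothetical, and whether a given construct really passes the translator is an assumption that only the toolchain can confirm.

```python
def make_binary_op(name, symbol):
    # Import-time metaprogramming (full Python): build source text, exec it,
    # and hand back the freshly created function.
    source = "def %s(a, b):\n    return a %s b\n" % (name, symbol)
    namespace = {}
    exec(source, namespace)
    return namespace[name]


# Executed once when the module is imported, before any translation starts.
OPS = {}
for _name, _symbol in [("add", "+"), ("sub", "-"), ("mul", "*")]:
    OPS[_name] = make_binary_op(_name, _symbol)

add = OPS["add"]


def entry_point(argv):
    # Everything reachable from here is what the translator would analyze:
    # plain integer arithmetic with consistent types.
    return add(int(argv[1]), int(argv[2]))
```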

96. But Wait, There's More!

97. Foreign Code
• PyPy is written in "Python", but can access
external C code and libraries
• os, math, time, threads, etc.
• There is a highly developed FFI mechanism
• Plus a configuration system (think autoconf)
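PyPy's own FFI layer is not shown on these slides, so the sketch below uses CPython's standard `ctypes` module as an analogy only: declare the C signature, then call the function. It assumes a POSIX system where the C library can be located (or is already loaded into the process); `c_strlen` is a made-up wrapper name.

```python
import ctypes
import ctypes.util

# find_library may return None; on POSIX, CDLL(None) then exposes the symbols
# already loaded into the running process, which include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C prototype: size_t strlen(const char *s);
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the byte length of text as computed by the C library."""
    return strlen(text.encode("utf-8"))
```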

98. Example
(Accessing Foreign Functions)

99. A Quandary
• How do I end this talk?
• I've only talked about RPython
• When do we get to the PyPy?

100. A Realization
I still don't know
how PyPy works!
Score: PyPy: 1 Dave: 0

101. A Deeper Realization
I don't even know how
CPython works!

102. WAT!?!

104. A Clarification
I do know how to use the
tools that make CPython
• ANSI C
• Makefiles
• Algorithms
• Data Structures

105. The Challenge
PyPy has a different set of tools
• RPython
• translate.py
• Metaprogramming
• Foreign Functions

106. So how to end this talk?

107. Wait! I used to be an evil professor!

108. Figuring out how PyPy works
is left as an exercise!
(You'll learn a lot)

110. Postscript

111. Breaking GILs
• As you know, I like breaking GILs
• You know, global interpreter locks
• As in threads and stuff...
• I love it!

112. A Benchmark
• Message-passing with a CPU-bound thread
C : 1.11s
Python 2.7 : 1.60s
Ruby 1.9 : 5839.4s
• Don't concern yourself with the details
• Ruby 3600x slower than Python?
• What's that all about? Let's go tinker!
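The slide gives no details of the benchmark, so the following is only a guess at its general shape: one thread echoes messages while another spins on the CPU and competes for the interpreter lock. The queue-based transport, the function names, and the message counts are assumptions, and the timings it produces are not comparable to the numbers above.

```python
import threading
import time

try:
    import queue             # Python 3
except ImportError:
    import Queue as queue    # Python 2


def spin(stop):
    """CPU-bound loop that keeps competing for the interpreter lock."""
    while not stop.is_set():
        pass


def echo(inbox, outbox):
    """Send every message straight back until a None sentinel arrives."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        outbox.put(msg)


def run_benchmark(n_messages, with_cpu_thread):
    """Time n_messages round trips, optionally next to a CPU-bound thread."""
    inbox, outbox = queue.Queue(), queue.Queue()
    stop = threading.Event()
    echo_thread = threading.Thread(target=echo, args=(inbox, outbox))
    echo_thread.start()
    cpu_thread = None
    if with_cpu_thread:
        cpu_thread = threading.Thread(target=spin, args=(stop,))
        cpu_thread.start()
    start = time.time()
    for i in range(n_messages):
        inbox.put(i)
        outbox.get()          # wait for the echo before sending the next one
    elapsed = time.time() - start
    stop.set()                # shut both helper threads down
    inbox.put(None)
    echo_thread.join()
    if cpu_thread is not None:
        cpu_thread.join()
    return elapsed
```

Comparing `run_benchmark(1000, False)` with `run_benchmark(1000, True)` on different interpreters is one way to start looking for the kind of slowdown described here.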

113. Tinkering with Ruby
• It was pretty straightforward
• Finding the GIL didn't take long
• An afternoon of fiddling around
(Search for my talk at RuPy 2011)
• Caused by a more extreme case of the
thread priority inversion that's in Python 3.3

114. Just to be clear...
I couldn't write a real Ruby
program to save my life right now.

115. A PyPy Benchmark
• A similar message-passing benchmark
Python 2.7 : 15.6s
PyPy-1.6 : 6689.2s (428x slower)
• Huh? What's that all about?
• No idea! Or even how to look.
• That is the whole reason for this talk

116. Parting Words
• Can you tinker with PyPy?
• Honest answer: I still don't know
• Should you try to go tinker with it anyways?
• YES!
• You will find interesting things inside

118. "My God, it's full of stars!"
(or VMs?)

119. Thanks!
• Hope you learned at least one new thing
• Special thanks:
• Alex Gaynor
• Maciej Fijalkowski
• Chipy