Slide 1

Low Level RPython
David Beazley
http://www.dabeaz.com
@dabeaz
February 9, 2012

Slide 2

PyPy Overview
• PyPy is Python implemented in Python
[Diagram: under CPython, a Python program runs on an interpreter written in ANSI C; under PyPy, a Python program runs on an interpreter written in Python]
• Take the C version of the interpreter and rewrite it as a Python program.

Slide 3

rpython
• PyPy is actually implemented in "rpython"
• rpython is not an "interpreter", but a restricted subset of the Python language
[Diagram: rpython shown as a subset of Python]
• It can run as valid Python code, but that's about the only similarity

Slide 4

rpython
• rpython is a completely different language
• Python syntax, yes.
• Must be compiled (like C, C++, etc.)
• Static typing via type inference
• Very different than anything you're used to

Slide 5

A Simple Example
• Example rpython code:

    # fib.py
    def fib(n):
        if n < 2:
            return 1
        else:
            return fib(n-1) + fib(n-2)

    # entry point. Like C main()
    def main(argv):
        print fib(int(argv[1]))
        return 0

    def target(*args):
        return main, None

Slide 6

Translation (Compilation)
• rpython translation

    bash % pypy/translator/goal/translate.py fib.py
    [platform:msg] Setting platform to 'host' cc=None
    [translation:info] Translating target as defined by hello
    [platform:execute] gcc-4.0 -c -arch x86_64 -O3 -fomit-frame-pointer -mdynamic-no-pic /var/folders/- \
    ... lots of additional output ...

• Creates and compiles a C program into an exe

    bash % ./fib-c 38
    63245986
    bash %
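
The number printed by ./fib-c 38 can be cross-checked without translating anything. What follows is a minimal sketch in plain Python 3 (not rpython). The helper fib_iter is a hypothetical name introduced here; it uses the same convention as the slide's fib(), where both base cases return 1.

```python
def fib(n):
    # Same definition as the slide: both base cases return 1.
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def fib_iter(n):
    # Iterative equivalent, used only to avoid the exponential recursion.
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    # The two versions agree on small inputs ...
    assert all(fib(i) == fib_iter(i) for i in range(20))
    # ... and the iterative one can be compared with the ./fib-c 38 output.
    print(fib_iter(38))
```

With this convention fib_iter(38) evaluates to 63245986, which matches the output shown on the slide.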

Slide 7

Performance
• It runs pretty fast

    CPython 2.7      95.4s
    pypy             17.0s
    rpython           2.6s
    ANSI C (-O2)      2.1s

• Almost as fast as ANSI C
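
To put "almost as fast" into numbers, the timings from the slide can be divided by the ANSI C time. This is only arithmetic on the four figures quoted above; the variable names are made up for the sketch.

```python
# Timings copied from the slide, in seconds.
timings = {
    "CPython 2.7": 95.4,
    "pypy": 17.0,
    "rpython": 2.6,
    "ANSI C (-O2)": 2.1,
}

baseline = timings["ANSI C (-O2)"]

# How many times slower each run is than the ANSI C build.
slowdown = {name: round(t / baseline, 1) for name, t in timings.items()}

if __name__ == "__main__":
    for name, factor in slowdown.items():
        print("%-14s %5.1fx" % (name, factor))
```

By this measure the translated rpython program is about 1.2 times slower than the C version, while CPython 2.7 is about 45 times slower.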

Slide 8

R is for Restricted
• rpython allows no dynamic typing

    def add(x,y):
        return x+y

    def main(argv):
        r1 = add(2,3)                # Ok
        r2 = add("Hello","World")    # Error
        return 0

• Functions can only have one type signature
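
The "one type signature" rule can be imitated in ordinary Python to get a feel for it. The decorator below is a toy written for this note: single_signature is a hypothetical name, and it checks types at run time, whereas rpython reaches its verdict at translation time through type inference.

```python
import functools


def single_signature(func):
    """Toy run-time check: remember the argument types of the first call."""
    seen = []

    @functools.wraps(func)
    def wrapper(*args):
        types = tuple(type(a) for a in args)
        if not seen:
            seen.append(types)          # first use fixes the signature
        elif seen[0] != types:
            raise TypeError("%s already used with %r, got %r"
                            % (func.__name__, seen[0], types))
        return func(*args)

    return wrapper


@single_signature
def add(x, y):
    return x + y


if __name__ == "__main__":
    print(add(2, 3))                    # accepted: (int, int)
    try:
        add("Hello", "World")           # rejected: signature already fixed
    except TypeError as exc:
        print("rejected:", exc)
```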

Slide 9

R is for Restricted
• Containers can only have a single type

    numbers = [1,2,3,4,5]            # Ok
    items = [1, "Hello", 3.5]        # Error

    names = {                        # Ok
        'dabeaz' : 'David Beazley',
        'gaynor' : 'Alex Gaynor',
    }

    record = {                       # Error
        'name' : 'ACME',
        'shares' : 100
    }

• Think C, not Python.
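
"Think C" suggests a struct-like class for a record whose fields differ in type, which is the shape the Stock example on Slide 19 takes. The sketch below is plain Python 3; Holding and is_homogeneous are hypothetical names, and the helper is only a run-time stand-in for what rpython decides during translation.

```python
class Holding(object):
    """Struct-like record: each attribute keeps one type."""

    def __init__(self, name, shares):
        self.name = name        # always a str
        self.shares = shares    # always an int


def is_homogeneous(values):
    """Toy check: True if every element has the same type."""
    kinds = set(type(v) for v in values)
    return len(kinds) <= 1


if __name__ == "__main__":
    print(is_homogeneous([1, 2, 3, 4, 5]))       # single element type
    print(is_homogeneous([1, "Hello", 3.5]))     # mixed element types
    record = Holding("ACME", 100)
    print(record.name, record.shares)
```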

Slide 10

R is for Restricted
• Attributes can only be a single type

    class Pair(object):
        def __init__(self,x,y):
            self.x = x
            self.y = y

    a = Pair(2,3)                # OK (first use)
    b = Pair("Hello","World")    # Error

• Again, think C
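
One way to keep both kinds of pair is to declare two classes, much as a C program would declare two structs. The sketch below is plain Python 3 with hypothetical class names; it assumes that attribute types are inferred separately for each class, which the slides do not state explicitly.

```python
class IntPair(object):
    def __init__(self, x, y):
        self.x = x      # int in every instance
        self.y = y      # int in every instance


class StrPair(object):
    def __init__(self, x, y):
        self.x = x      # str in every instance
        self.y = y      # str in every instance


if __name__ == "__main__":
    a = IntPair(2, 3)
    b = StrPair("Hello", "World")
    print(a.x + a.y)
    print(b.x + b.y)
```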

Slide 11

Today's Topic
• Going deeper into the generated C
• Looking at the code
• Studying efficiency
• Accessing C libraries
• ???

Slide 12

Advance Head Explosion
This is a pretty dark and poorly documented corner of rpython.

Slide 13

Looking at the C Code
• rpython generates C code and places it into a temporary directory

    ...
    [translation:info] usession directory: /var/folders/M7/M7Q2OurGGbezUFSLGEgQZ++++TI/-Tmp-/usession-unknown-0
    [translation:info] created: /Users/beazley/Desktop/PyPyResearch/fib-c
    [Timer] Timings:
    [Timer] annotate                       ---  2.2 s
    [Timer] rtype_lltype                   ---  1.8 s
    [Timer] backendopt_lltype              ---  1.2 s
    [Timer] stackcheckinsertion_lltype     ---  0.0 s
    [Timer] database_c                     --- 16.7 s
    [Timer] source_c                       ---  2.8 s
    [Timer] compile_c                      ---  2.2 s
    [Timer] =========================================
    [Timer] Total:                         --- 26.8 s
    bash %

Slide 14

Looking at the C Code
• Go look for the "testing_1" directory

    bash % cd /var/folders/M7/M7Q2OurGGbezUFSLGEgQZ++++TI/-Tmp-/usession-unknown-0
    bash % cd testing_1
    bash % ls *.c
    data_objspace_flow_specialcase.c
    data_rlib_rdtoa.c
    data_rlib_rposix.c
    data_rlib_rstack.c
    data_rlib_rstack_1.c
    data_rpython_lltypesystem_rffi.c
    data_rpython_lltypesystem_rlist.c
    data_rpython_memory_gc_env.c
    data_rpython_memory_gc_minimark.c
    data_rpython_memory_gc_minimark_1.c
    data_rpython_memory_gctransform_framework.c
    debug_print.c
    implement.c
    nonfuncnodes.c
    objspace_flow_specialcase.c
    profiling.c
    ...

Slide 15

Essential Files
• Here's where most of the generated code from your program gets placed
• implement.c (functions)
• nonfuncnodes.c (globals)
• structdef.h (data structures)
• Look at them if you dare... yes.

Slide 16

Example Code

    long pypy_g_fib(long l_n_0) {
        bool_t l_v280; bool_t l_v283; bool_t l_v286; bool_t l_v289;
        long l_v278; long l_v279; long l_v284; long l_v287; long l_v290;
        ...
        goto block0;

        block0:
        OP_INT_LT(l_n_0, 2L, l_v280);
        if (l_v280) {
            l_v294 = 1L;
            goto block5;
        }
        goto block1;

        block1:
        pypy_g_stack_check___();
        l_v282 = (&pypy_g_ExcData)->ed_exc_type;
        l_v283 = (l_v282 == NULL);
        if (!l_v283) {
            goto block8;
        }
        goto block2;

        block2:
        OP_INT_SUB(l_n_0, 1L, l_v284);
        l_v278 = pypy_g_fib(l_v284);
        PYPY_INHIBIT_TAIL_CALL();
        l_v285 = (&pypy_g_ExcData)->ed_exc_type;

Slide 17

Example Code

    long pypy_g_fib(long l_n_0) {
        ...
        l_v285 = (&pypy_g_ExcData)->ed_exc_type;
        l_v286 = (l_v285 == NULL);

It's a literal translation of flow-graphs into C (rather cryptic)
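
The block-and-goto shape of the generated function is easier to follow when written as a small state machine. The sketch below is plain Python 3 and purely illustrative: fib_blocks is a hypothetical name, the block numbers after block2 are guesses because the listing is truncated, and the stack check and exception checks from the C code are left out.

```python
def fib_blocks(n):
    """fib() written as numbered blocks joined by jumps, like the C output."""
    block = 0
    result = left = right = 0
    while True:
        if block == 0:              # block0: test n < 2
            if n < 2:
                result = 1
                block = 5           # "goto block5"
            else:
                block = 2
        elif block == 2:            # block2: first recursive call
            left = fib_blocks(n - 1)
            block = 3
        elif block == 3:            # second recursive call (numbering assumed)
            right = fib_blocks(n - 2)
            block = 4
        elif block == 4:            # add the two results (numbering assumed)
            result = left + right
            block = 5
        else:                       # block5: single exit point
            return result


if __name__ == "__main__":
    print([fib_blocks(i) for i in range(10)])
```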

Slide 18

Experimental Coding
• You can try different things in rpython and go look at the output C code
• A bit of a challenge
• But interesting to study what happens

Slide 19

Example: Objects
rpython:

    class Stock(object):
        def __init__(self,name,shares,price):
            self.name = name
            self.shares = shares
            self.price = price

    s = Stock('ACME',50,123.45)

    // structdef.h
    ...
    struct pypy_cls_Stock0 {
        struct pypy_object0 s_super;
        struct pypy_rpy_string0 *s_inst_name;
        double s_inst_price;
        long s_inst_shares;
    };
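
For readers more at home in Python than in C, the struct above can be approximated with ctypes. This is a rough analogue written for this note, not generated code: StockStruct is a hypothetical name, the s_super header is omitted, and c_char_p stands in for the pointer to an rpython string structure.

```python
import ctypes


class StockStruct(ctypes.Structure):
    # Field order follows the generated struct: name, price, shares.
    _fields_ = [
        ("name", ctypes.c_char_p),
        ("price", ctypes.c_double),
        ("shares", ctypes.c_long),
    ]


s = StockStruct(b"ACME", 123.45, 50)

if __name__ == "__main__":
    print(s.name, s.price, s.shares)
    try:
        s.shares = "fifty"          # each field has exactly one C type
    except TypeError as exc:
        print("rejected:", exc)
```

As in the generated C, every field has a fixed type, so assigning a string to shares is refused.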

Slide 20

Accessing C Code
• If everything in PyPy is written in rpython, how does it access low-level C libraries?
• os modules
• time functions
• math functions
• General question: How would you access C code from any Python program?

Slide 21

ctypes (CPython)
• Perhaps you've used ctypes before...

    import ctypes
    mlib = ctypes.cdll.LoadLibrary("libm.dylib")
    sin = mlib.sin
    sin.argtypes = (ctypes.c_double,)
    sin.restype = ctypes.c_double
    ...
    x = sin(2)

• rpython is kind of similar
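
The slide hard-codes "libm.dylib", which is the macOS file name. A slightly more portable variant of the same idea is sketched below for Python 3 on a POSIX system; load_libm is a hypothetical helper, and the fallback to ctypes.CDLL(None) assumes the math symbols are already loaded into the running process.

```python
import ctypes
import ctypes.util
import math


def load_libm():
    """Locate the C math library without hard-coding a file name."""
    name = ctypes.util.find_library("m")
    # CDLL(None) exposes symbols already loaded into the process (POSIX only).
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libm = load_libm()
sin = libm.sin
sin.argtypes = (ctypes.c_double,)   # declare the C parameter types
sin.restype = ctypes.c_double       # declare the C return type

# The int 2 is converted to a C double because argtypes is set.
x = sin(2)

if __name__ == "__main__":
    print(x, math.sin(2))
```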

Slide 22

rpython rffi
• Foreign Function Interface

    from pypy.rpython.lltypesystem import rffi

    sin = rffi.llexternal("sin", [rffi.DOUBLE], rffi.DOUBLE)
    cos = rffi.llexternal("cos", [rffi.DOUBLE], rffi.DOUBLE)
    ...

• Declares external C functions with types
• Can use in your rpython program

    y = sin(x) + cos(x)
    ...

• Instructive to look at low-level C code

Slide 23

rffi Commentary
• The rpython rffi is highly developed
• Most C primitive datatypes
• Arrays
• Structures
• Pointers
• Memory management
• (More advanced example shortly)

Slide 24

Configuration System
• There is a C compilation/configuration system

    from pypy.translator.tool.cbuild import ExternalCompilationInfo
    from pypy.rpython.tool import rffi_platform as platform

    class CConfig:
        _compilation_info_ = ExternalCompilationInfo(
            includes = ['math.h'],
            libraries = ['m'],
        )
        M_PI = platform.DefinedConstantDouble('M_PI')
        M_E = platform.DefinedConstantDouble('M_E')

    config = platform.configure(CConfig)
    M_PI = config['M_PI']
    M_E = config['M_E']

Slide 25

Configuration System
• There is a C compilation/configuration system

    _compilation_info_ = ExternalCompilationInfo(
        includes = ['math.h'],
        libraries = ['m'],
    )

C compiler specification (includes, libraries, paths, etc.)

Slide 26

Configuration System
• There is a C compilation/configuration system

    M_PI = platform.DefinedConstantDouble('M_PI')
    M_E = platform.DefinedConstantDouble('M_E')

Some "queries" for things you want to know from C

Slide 27

Configuration System
• There is a C compilation/configuration system

    config = platform.configure(CConfig)
    M_PI = config['M_PI']
    M_E = config['M_E']

Run the C compiler and get results back

Slide 28

Configuration Comments
• There is no centralized "configuration"
• Individual program modules simply request information from the C compilation environment whenever they need it
• Somehow (magically), the system will invoke the C compiler as needed.
• It hurts my head...
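
The "magic" is less mysterious once the general trick is spelled out: write a tiny C program that prints the value in question, compile it, run it, and parse the output. The sketch below shows that trick in plain Python 3. It is not the rffi_platform implementation; query_c_double is a hypothetical helper, and it assumes a compiler reachable as cc or gcc (it returns None when none is found or the build fails).

```python
import math
import os
import shutil
import subprocess
import tempfile


def query_c_double(name, header="math.h"):
    """Ask the C compiler for the value of a double constant such as M_PI."""
    cc = shutil.which("cc") or shutil.which("gcc")
    if cc is None:
        return None                                  # no compiler available
    source = (
        "#include <stdio.h>\n"
        "#include <%s>\n"
        "int main(void) { printf(\"%%.17g\\n\", (double)(%s)); return 0; }\n"
        % (header, name)
    )
    try:
        with tempfile.TemporaryDirectory() as tmp:
            c_path = os.path.join(tmp, "query.c")
            exe_path = os.path.join(tmp, "query")
            with open(c_path, "w") as f:
                f.write(source)
            # Compile the probe program, then run it and capture its output.
            subprocess.check_call([cc, c_path, "-o", exe_path, "-lm"])
            out = subprocess.check_output([exe_path])
    except (subprocess.CalledProcessError, OSError):
        return None                                  # build or run failed
    return float(out.decode().strip())


if __name__ == "__main__":
    print(query_c_double("M_PI"), math.pi)
```

Each module that needs a constant could call such a helper on its own, which mirrors the decentralized style described in the slide.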

Slide 29

Final Demo

Slide 30

Now What?!?
• We haven't even talked about PyPy yet!
• ... or the JIT
• Implemented in rpython
• So, all of this is just a starting point
• More talks? (maybe)