Low Level RPython

Talk at Chicago Python User's Group. February 9, 2012. Video at https://www.youtube.com/watch?v=kkt_BtR9Kzk

David Beazley

February 09, 2012

Transcript

  1. PyPy Overview
    • PyPy is Python implemented in Python
    • [Diagram: CPython = Python Program on Interpreter (ANSI C); PyPy = Python Program on Interpreter (Python)]
    • Take the C version of the interpreter and rewrite it as a Python program.
  2. rpython
    • PyPy is actually implemented in "rpython"
    • rpython is not an "interpreter", but a restricted subset of the Python language
    • [Diagram labels: Python, rpython]
    • It can run as valid Python code, but that's about the only similarity
  3. rpython
    • rpython is a completely different language
    • Python syntax, yes.
    • Must be compiled (like C, C++, etc.)
    • Static typing via type inference
    • Very different than anything you're used to
  4. A Simple Example
    • Example rpython code:

        # fib.py
        def fib(n):
            if n < 2:
                return 1
            else:
                return fib(n-1) + fib(n-2)

        # entry point. Like C main()
        def main(argv):
            print fib(int(argv[1]))
            return 0

        def target(*args):
            return main, None
  5. Translation (Compilation)
    • rpython translation

        bash % pypy/translator/goal/translate.py fib.py
        [platform:msg] Setting platform to 'host' cc=None
        [translation:info] Translating target as defined by hello
        [platform:execute] gcc-4.0 -c -arch x86_64 -O3 -fomit-frame-pointer -mdynamic-no-pic /var/folders/- \
        ... lots of additional output ...

    • Creates and compiles a C program into an exe

        bash % ./fib-c 38
        63245986
        bash %
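The fib.py source above is also valid ordinary Python (Python 2, given the print statement). As a cross-check of the translated program's output, here is a minimal Python 3 sketch of the same recurrence. `fib_iter` is a helper name introduced here for illustration, not something from the talk; it computes the same sequence in a loop so the check finishes instantly.

```python
def fib(n):
    # Same definition as the slide's fib.py (note fib(0) == fib(1) == 1).
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def fib_iter(n):
    # Hypothetical helper: same sequence as fib(), computed iteratively.
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    # The two definitions agree on small inputs.
    assert all(fib(i) == fib_iter(i) for i in range(20))
    print(fib_iter(38))  # 63245986, matching the ./fib-c 38 output above
```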
  6. Performance
    • It runs pretty fast

        CPython 2.7     95.4s
        pypy            17.0s
        rpython          2.6s
        ANSI C (-O2)     2.1s

    • Almost as fast as ANSI C
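To put numbers on "almost as fast", the short sketch below computes ratios from the four timings on the slide. Only the timings come from the talk; the `speedup` helper is introduced here. On these figures the rpython build is roughly 37 times faster than CPython 2.7, about 6.5 times faster than pypy, and takes about 1.24 times as long as the C version.

```python
# Timings copied from the slide, in seconds.
timings = {
    "CPython 2.7": 95.4,
    "pypy": 17.0,
    "rpython": 2.6,
    "ANSI C (-O2)": 2.1,
}


def speedup(slow, fast):
    """Return how many times longer `slow` takes than `fast`."""
    return timings[slow] / timings[fast]


if __name__ == "__main__":
    for name in ("CPython 2.7", "pypy"):
        print("rpython is %.1fx faster than %s" % (speedup(name, "rpython"), name))
    print("rpython takes %.2fx the time of ANSI C" % speedup("rpython", "ANSI C (-O2)"))
```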
  7. R is for Restricted
    • rpython allows no dynamic typing

        def add(x,y):
            return x+y

        def main(argv):
            r1 = add(2,3)                # Ok
            r2 = add("Hello","World")    # Error
            return 0

    • Functions can only have one type signature
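The "one type signature" rule can be imitated in plain Python. The sketch below is a toy run-time check, not how rpython works: rpython rejects the program during translation through type inference, whereas this hypothetical `single_signature` decorator only remembers the argument types of the first call and complains when a later call differs.

```python
import functools


def single_signature(func):
    """Toy check: the first call fixes the argument types; any later call
    with different argument types raises TypeError."""
    seen = []

    @functools.wraps(func)
    def wrapper(*args):
        types = tuple(type(a) for a in args)
        if not seen:
            seen.append(types)
        elif seen[0] != types:
            raise TypeError("%s called with %r, previously %r"
                            % (func.__name__, types, seen[0]))
        return func(*args)

    return wrapper


@single_signature
def add(x, y):
    return x + y


if __name__ == "__main__":
    print(add(2, 3))                 # first use: (int, int)
    try:
        add("Hello", "World")        # second signature: rejected
    except TypeError as exc:
        print("rejected:", exc)
```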
  8. R is for Restricted
    • Containers can only have a single type

        numbers = [1,2,3,4,5]          # Ok
        items = [1, "Hello", 3.5]      # Error
        names = {                      # Ok
            'dabeaz' : 'David Beazley',
            'gaynor' : 'Alex Gaynor',
        }
        record = {                     # Error
            'name' : 'ACME',
            'shares' : 100
        }

    • Think C, not Python.
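A rough run-time analogue of the container rule, applied to the four containers from the slide. This is only a sketch: `is_homogeneous` is a name introduced here, it looks at list items and dict values only, and it does not reproduce the rules rpython's type inference actually applies.

```python
def element_types(container):
    """Return the set of types found in a list, or in a dict's values."""
    values = container.values() if isinstance(container, dict) else container
    return {type(v) for v in values}


def is_homogeneous(container):
    """True when every element (or dict value) has the same type."""
    return len(element_types(container)) <= 1


numbers = [1, 2, 3, 4, 5]
items = [1, "Hello", 3.5]
names = {"dabeaz": "David Beazley", "gaynor": "Alex Gaynor"}
record = {"name": "ACME", "shares": 100}

if __name__ == "__main__":
    for label, c in [("numbers", numbers), ("items", items),
                     ("names", names), ("record", record)]:
        print(label, "Ok" if is_homogeneous(c) else "Error")
```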
  9. R is for Restricted
    • Attributes can only be a single type

        class Pair(object):
            def __init__(self,x,y):
                self.x = x
                self.y = y

        a = Pair(2,3)                # OK (first use)
        b = Pair("Hello","World")    # Error

    • Again, think C
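The "first use fixes the type" behaviour for attributes can be mimicked with a small base class. As before, this is a run-time toy under stated assumptions: `TypedAttributes` is hypothetical, and rpython makes this decision at translation time for the whole program rather than on assignment.

```python
class TypedAttributes(object):
    """Toy base class: the first assignment to an attribute fixes its type
    for every instance of the same class."""

    def __setattr__(self, name, value):
        cls = type(self)
        # Keep one attribute-type table per class, created on first use.
        if cls.__dict__.get("_attr_types") is None:
            cls._attr_types = {}
        expected = cls._attr_types.setdefault(name, type(value))
        if expected is not type(value):
            raise TypeError("%s.%s is %s, got %s"
                            % (cls.__name__, name, expected.__name__,
                               type(value).__name__))
        object.__setattr__(self, name, value)


class Pair(TypedAttributes):
    def __init__(self, x, y):
        self.x = x
        self.y = y


if __name__ == "__main__":
    a = Pair(2, 3)                   # first use: x and y become int
    try:
        b = Pair("Hello", "World")   # str where int was recorded
    except TypeError as exc:
        print("rejected:", exc)
```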
  10. Today's Topic
    • Going deeper into the generated C
    • Looking at the code
    • Studying efficiency
    • Accessing C libraries
    • ???
  11. Looking at the C Code
    • rpython generates C code and places it into a temporary directory

        ...
        [translation:info] usession directory: /var/folders/M7/M7Q2OurGGbezUFSLGEgQZ++++TI/-Tmp-/usession-unknown-0
        [translation:info] created: /Users/beazley/Desktop/PyPyResearch/fib-c
        [Timer] Timings:
        [Timer] annotate                    --- 2.2 s
        [Timer] rtype_lltype                --- 1.8 s
        [Timer] backendopt_lltype           --- 1.2 s
        [Timer] stackcheckinsertion_lltype  --- 0.0 s
        [Timer] database_c                  --- 16.7 s
        [Timer] source_c                    --- 2.8 s
        [Timer] compile_c                   --- 2.2 s
        [Timer] =========================================
        [Timer] Total:                      --- 26.8 s
        bash %
  12. Looking at the C Code
    • Go look for the "testing_1" directory

        bash % cd /var/folders/M7/M7Q2OurGGbezUFSLGEgQZ++++TI/-Tmp-/usession-unknown-0
        bash % cd testing_1
        bash % ls *.c
        data_objspace_flow_specialcase.c
        data_rlib_rdtoa.c
        data_rlib_rposix.c
        data_rlib_rstack.c
        data_rlib_rstack_1.c
        data_rpython_lltypesystem_rffi.c
        data_rpython_lltypesystem_rlist.c
        data_rpython_memory_gc_env.c
        data_rpython_memory_gc_minimark.c
        data_rpython_memory_gc_minimark_1.c
        data_rpython_memory_gctransform_framework.c
        debug_print.c
        implement.c
        nonfuncnodes.c
        objspace_flow_specialcase.c
        profiling.c
        ...
  13. Essential Files
    • Here's where most of the generated code from your program gets placed
    • implement.c (functions)
    • nonfuncnodes.c (globals)
    • structdef.h (data structures)
    • Look at them if you dare... yes.
  14. Example Code

        long pypy_g_fib(long l_n_0) {
            bool_t l_v280; bool_t l_v283; bool_t l_v286; bool_t l_v289;
            long l_v278; long l_v279; long l_v284; long l_v287; long l_v290;
            ...
            goto block0;

            block0:
            OP_INT_LT(l_n_0, 2L, l_v280);
            if (l_v280) {
                l_v294 = 1L;
                goto block5;
            }
            goto block1;

            block1:
            pypy_g_stack_check___();
            l_v282 = (&pypy_g_ExcData)->ed_exc_type;
            l_v283 = (l_v282 == NULL);
            if (!l_v283) {
                goto block8;
            }
            goto block2;

            block2:
            OP_INT_SUB(l_n_0, 1L, l_v284);
            l_v278 = pypy_g_fib(l_v284);
            PYPY_INHIBIT_TAIL_CALL();
            l_v285 = (&pypy_g_ExcData)->ed_exc_type;
  15. Example Code (same listing as slide 14, extended by one line)

        l_v286 = (l_v285 == NULL);

    • It's a literal translation of flow-graphs into C (rather cryptic)
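To make the "flow-graph" shape easier to see, here is fib() rewritten in Python as explicit basic blocks joined by jumps, in the spirit of the generated C. This is a sketch, not the translator's output: the block names are invented here, and the stack check and the exception checks that the C code performs after each call are left out.

```python
def fib_blocks(n):
    """fib() as a small state machine: one branch per basic block."""
    block = "entry"
    while True:
        if block == "entry":
            # Like OP_INT_LT(n, 2, flag) followed by a conditional goto.
            is_small = n < 2
            if is_small:
                result = 1
                block = "return"
            else:
                block = "call_left"
        elif block == "call_left":
            arg = n - 1                  # like OP_INT_SUB(n, 1, arg)
            left = fib_blocks(arg)
            block = "call_right"
        elif block == "call_right":
            arg = n - 2
            right = fib_blocks(arg)
            block = "add"
        elif block == "add":
            result = left + right
            block = "return"
        elif block == "return":
            return result


if __name__ == "__main__":
    print([fib_blocks(i) for i in range(8)])
```

Every temporary gets its own name and every control transfer is an explicit jump, which is why the generated C reads so mechanically.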
  16. Experimental Coding
    • You can try different things in rpython and go look at the output C code
    • A bit of a challenge
    • But interesting to study what happens
  17. Example: Objects
    • rpython:

        class Stock(object):
            def __init__(self,name,shares,price):
                self.name = name
                self.shares = shares
                self.price = price

        s = Stock('ACME',50,123.45)

    • Generated C:

        // structdef.h
        ...
        struct pypy_cls_Stock0 {
            struct pypy_object0 s_super;
            struct pypy_rpy_string0 *s_inst_name;
            double s_inst_price;
            long s_inst_shares;
        }
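The struct shows each attribute pinned to one C type: the string becomes a pointer, the float a double, the integer a long. A comparable fixed layout can be declared from ordinary Python with ctypes. This is only an approximation for illustration: the `s_super` header is omitted and rpython's own string struct is replaced here by a plain C string pointer.

```python
import ctypes


class Stock(ctypes.Structure):
    # Hypothetical stand-in for struct pypy_cls_Stock0 (no s_super header;
    # c_char_p is used in place of struct pypy_rpy_string0 *).
    _fields_ = [
        ("name", ctypes.c_char_p),
        ("price", ctypes.c_double),
        ("shares", ctypes.c_long),
    ]


s = Stock(b"ACME", 123.45, 50)

if __name__ == "__main__":
    print(s.name, s.price, s.shares)
    try:
        s.shares = "fifty"           # wrong type for a C long
    except TypeError as exc:
        print("rejected:", exc)
```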
  18. Accessing C Code
    • If everything in PyPy is written in rpython, how does it access low-level C libraries?
    • os modules
    • time functions
    • math functions
    • General question: How would you access C code from any Python program?
  19. ctypes (CPython)
    • Perhaps you've used ctypes before...

        import ctypes
        mlib = ctypes.cdll.LoadLibrary("libm.dylib")
        sin = mlib.sin
        sin.argtypes = (ctypes.c_double,)
        sin.restype = ctypes.c_double
        ...
        x = sin(2)

    • rpython is kind of similar
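The ctypes snippet above hard-codes "libm.dylib", which is the macOS name of the math library. A slightly more portable sketch asks `ctypes.util.find_library` for the library instead. The fallback to `ctypes.CDLL(None)` is an assumption added here: it works on many Unix systems but is not guaranteed everywhere.

```python
import ctypes
import ctypes.util
import math


def load_libm():
    """Locate the C math library instead of hard-coding its file name."""
    path = ctypes.util.find_library("m")
    # Assumption: if no separate libm is found, try the symbols already
    # loaded into the current process.
    return ctypes.CDLL(path) if path else ctypes.CDLL(None)


mlib = load_libm()
sin = mlib.sin
sin.argtypes = (ctypes.c_double,)    # declare the C signature, as on the slide
sin.restype = ctypes.c_double

if __name__ == "__main__":
    x = sin(2)
    print(x, math.sin(2))            # the two values should agree
```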
  20. rpython rffi
    • Foreign Function Interface

        from pypy.rpython.lltypesystem import rffi
        sin = rffi.llexternal("sin", [rffi.DOUBLE], rffi.DOUBLE)
        cos = rffi.llexternal("cos", [rffi.DOUBLE], rffi.DOUBLE)
        ...

    • Declares external C functions with types
    • Can use in your rpython program

        y = sin(x) + cos(x)
        ...

    • Instructive to look at low-level C code
  21. rffi Commentary
    • The rpython rffi is highly developed
    • Most C primitive datatypes
    • Arrays
    • Structures
    • Pointers
    • Memory management
    • (More advanced example shortly)
  22. Configuration System
    • There is a C compilation/configuration system

        from pypy.translator.tool.cbuild import ExternalCompilationInfo
        from pypy.rpython.tool import rffi_platform as platform

        class CConfig:
            _compilation_info_ = ExternalCompilationInfo(
                includes = ['math.h'],
                libraries = ['m'],
            )
            M_PI = platform.DefinedConstantDouble('M_PI')
            M_E = platform.DefinedConstantDouble('M_E')

        config = platform.configure(CConfig)
        M_PI = config['M_PI']
        M_E = config['M_E']
  23. Configuration System (same code as slide 22)
    • Annotation: C compiler specification (includes, libraries, paths, etc.)
  24. Configuration System (same code as slide 22)
    • Annotation: Some "queries" for things you want to know from C
  25. Configuration System (same code as slide 22)
    • Annotation: Run the C compiler and get results back
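The idea of "run the C compiler and get results back" can be illustrated without PyPy at all. The sketch below is not how rffi_platform is implemented; it is a hypothetical stand-in that writes a tiny C program printing a constant, compiles it with whatever `cc` or `gcc` is on the PATH, runs it, and parses the output. The function name `probe_double_constant` and the choice to return `None` on any failure are assumptions made here.

```python
import os
import shutil
import subprocess
import tempfile


def probe_double_constant(name, header="math.h"):
    """Compile and run a tiny C program that prints the value of `name`.

    Returns a float, or None if no C compiler is found or a step fails.
    """
    cc = shutil.which("cc") or shutil.which("gcc")
    if cc is None:
        return None
    source = (
        "#include <stdio.h>\n"
        "#include <%s>\n"
        "int main(void) { printf(\"%%.17g\\n\", (double)(%s)); return 0; }\n"
        % (header, name)
    )
    with tempfile.TemporaryDirectory() as tmp:
        c_path = os.path.join(tmp, "probe.c")
        exe_path = os.path.join(tmp, "probe")
        with open(c_path, "w") as f:
            f.write(source)
        try:
            subprocess.check_call([cc, c_path, "-o", exe_path, "-lm"],
                                  stdout=subprocess.DEVNULL,
                                  stderr=subprocess.DEVNULL)
            out = subprocess.check_output([exe_path])
        except (OSError, subprocess.CalledProcessError):
            return None
    return float(out.decode().strip())


if __name__ == "__main__":
    print("M_PI =", probe_double_constant("M_PI"))
    print("M_E  =", probe_double_constant("M_E"))
```

Each "query" costs one compile-and-run cycle in this sketch, which hints at why a real configuration system would want to batch and cache its probes.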
  26. Configuration Comments
    • There is no centralized "configuration"
    • Individual program modules simply request information from the C compilation environment whenever they need it
    • Somehow (magically), the system will invoke the C compiler as needed.
    • It hurts my head...
  27. Now What?!?
    • We haven't even talked about PyPy yet!
    • .... or the JIT
    • Implemented in rpython
    • So, all of this is just a starting point
    • More talks? (maybe)