Exploring Python Code Objects

Python is an interpreted language, right? Wrong! In this talk, dive deep into Python bytecode, and learn what actually happens in everyone's favorite Python program, 'print "Hello world"'. Learn to use the compile() builtin and its best friend the exec statement, understand what your Python code is doing with the dis and compiler modules, and discover new ways to explore and enjoy Python at a low level.

PyGotham II, June 8, 2012.

"The Known Universe" video from American Museum of Natural History and the Hayden Planetarium. See http://www.youtube.com/watch?v=17jymDn0W6U.

dcrosta

June 09, 2012

Transcript

  1. "I've been given to understand that Python is an interpreted language..."
    "This popular meme is incorrect, or, rather, constructed upon a misunderstanding of (natural) language levels: a similar mistake would be to say 'the Bible is a hardcover book'."
    Alex Martelli, http://stackoverflow.com/questions/2998215
  2. CPYTHON
    • Compiles Python source to "bytecode"
    • On demand, when modules are loaded
    • Virtual machine for this bytecode
  3. BYTECODE?
    • About 150 primitive instructions
    • Associates data with those operations
    • CPython implements a virtual processor
    • Stored in .pyc files
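To make the ".pyc files" bullet concrete, here is a minimal sketch that byte-compiles a small module and reads the code object back out of the resulting file. The slides target CPython 2.x; this sketch instead assumes CPython 3.7 or later, where a .pyc file starts with a 16-byte header followed by a marshalled code object. The module name `hello_mod` and the helper `load_code_from_pyc` are made up for illustration.

```python
import marshal
import os
import py_compile
import tempfile
import types


def load_code_from_pyc(pyc_path):
    """Return the code object stored in a .pyc file.

    Assumption: the 16-byte header layout used by CPython 3.7+.
    """
    with open(pyc_path, "rb") as f:
        f.read(16)  # skip the header (magic number, flags, source mtime/size or hash)
        return marshal.load(f)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello_mod.py")  # hypothetical module name
    with open(src, "w") as f:
        f.write("greeting = 'Hello, world'\n")
    # py_compile.compile() writes the bytecode file and returns its path.
    pyc_path = py_compile.compile(src, cfile=os.path.join(tmp, "hello_mod.pyc"))
    code_obj = load_code_from_pyc(pyc_path)

# The loaded object is an ordinary code object and can be executed.
namespace = {}
exec(code_obj, namespace)
print(isinstance(code_obj, types.CodeType), namespace["greeting"])
```

The header size differs in older interpreters, so treat the constant 16 as version-specific rather than as a general rule.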
  4. MAKE YOUR OWN
    • compile() (we will use this)
    • Many other ways:
      • compiler
      • parser
      • compileall
      • py_compile
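The third argument to compile() selects a mode. As a small sketch (written for Python 3, not the 2.x interpreter used in the slides), the following contrasts 'exec', which compiles statements, with 'eval', which compiles a single expression. The variable names are illustrative only.

```python
# 'exec' mode compiles a sequence of statements; running it produces no value.
stmt_code = compile("total = 2 + 3", "<string>", "exec")
ns = {}
exec(stmt_code, ns)

# 'eval' mode compiles exactly one expression; eval() returns its value.
expr_code = compile("total * 10", "<string>", "eval")
result = eval(expr_code, ns)

print(ns["total"], result)  # 5 50
```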
  6. MAKE YOUR OWN
    >>> code_str = """
    ... print "Hello, world"
    ... """
    >>> code_obj = compile(
    ...     code_str, '<string>', 'exec')
    >>> code_obj
    <code object <module> at 0x1054c74b0, file "<string>", line 2>
  7. LOOK INSIDE
    >>> code_obj.co_filename
    '<string>'
    >>> code_obj.co_name
    '<module>'
    >>> dir(code_obj)
    ['co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
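For readers following along on Python 3, a rough equivalent of the two slides above might look like this. It assumes Python 3, where print is a function; the set of co_* attributes differs between interpreter versions, so the sketch lists whatever is present instead of hard-coding the names.

```python
import types

code_obj = compile('print("Hello, world")', "<string>", "exec")

print(type(code_obj) is types.CodeType)  # True
print(code_obj.co_filename)              # <string>
print(code_obj.co_name)                  # <module>

# Collect the co_* attributes this interpreter exposes; the exact list varies by version.
co_attrs = [name for name in dir(code_obj) if name.startswith("co_")]
print(co_attrs)
```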
  9. MAKE IT GO
    >>> code_obj()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'code' object is not callable
    >>> exec code_obj
    Hello, world
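The exec statement shown here is Python 2 syntax. As a hedged Python 3 sketch of the same experiment, exec() is called as a builtin function, and standard output is captured so the result can be inspected. The names `error_message`, `buffer`, and `output` are illustrative.

```python
import contextlib
import io

code_obj = compile('print("Hello, world")', "<string>", "exec")

# Calling a code object directly still fails: it has no __call__.
try:
    code_obj()
except TypeError as exc:
    error_message = str(exc)

# exec() runs the code object; redirect stdout to see what it printed.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(code_obj)
output = buffer.getvalue()

print(error_message)
print(output, end="")
```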
  12. MAKE IT GO
    >>> code_str = """
    ... x = 1
    ... y = x + 1
    ... z = x + y
    ... """
    >>> code_obj = compile(
    ...     code_str, '<string>', 'exec')
    >>> myglobals = {}
    >>> mylocals = {}
    >>> exec code_obj in myglobals, mylocals
    >>> mylocals['z']
    3
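In Python 3 the `exec ... in globals, locals` form becomes a function call with the two dictionaries as arguments. A minimal sketch, assuming Python 3:

```python
code_str = """
x = 1
y = x + 1
z = x + y
"""
code_obj = compile(code_str, "<string>", "exec")

myglobals = {}
mylocals = {}
exec(code_obj, myglobals, mylocals)

print(mylocals)           # {'x': 1, 'y': 2, 'z': 3}
print(sorted(myglobals))  # ['__builtins__']; exec() inserts this key itself
```

The assignments land in the locals dictionary because module-level code stores names with STORE_NAME, which writes to the frame's local namespace.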
  13. UNDER THE HOOD
    >>> code_str = """
    ... x = 1
    ... y = x + 1
    ... z = x + y
    ... """
    >>> code_obj = compile(
    ...     code_str, '<string>', 'exec')
    >>> code_obj.co_consts
    (1, None)
    >>> code_obj.co_names
    ('x', 'y', 'z')
    >>> code_obj.co_stacksize
    2
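The same attributes can be read on Python 3, though the values are not guaranteed to match the slide: the constants table and stack size are compiler details that change between CPython releases. A small sketch, assuming Python 3:

```python
code_obj = compile("x = 1\ny = x + 1\nz = x + y\n", "<string>", "exec")

print(code_obj.co_names)      # ('x', 'y', 'z')
print(code_obj.co_consts)     # constants table; contents depend on the CPython version
print(code_obj.co_stacksize)  # maximum value-stack depth computed by the compiler
print(len(code_obj.co_code))  # length in bytes of the raw bytecode string
```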
  14. UNDER THE HOOD
    >>> import dis
    >>> dis.dis(code_obj)
      2           0 LOAD_CONST               0 (1)
                  3 STORE_NAME               0 (x)
      3           6 LOAD_NAME                0 (x)
                  9 LOAD_CONST               0 (1)
                 12 BINARY_ADD
                 13 STORE_NAME               1 (y)
      4          16 LOAD_NAME                0 (x)
                 19 LOAD_NAME                1 (y)
                 22 BINARY_ADD
                 23 STORE_NAME               2 (z)
                 26 LOAD_CONST               1 (None)
                 29 RETURN_VALUE
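On Python 3 the disassembly can also be walked programmatically. The sketch below uses dis.get_instructions(), which is available from Python 3.4; opcode names and offsets differ from the Python 2 listing above and vary between releases, so only the STORE_NAME targets are treated as stable here.

```python
import dis

code_obj = compile("x = 1\ny = x + 1\nz = x + y\n", "<string>", "exec")

# dis.get_instructions() yields one Instruction per bytecode operation.
instructions = list(dis.get_instructions(code_obj))
for instr in instructions:
    print(instr.offset, instr.opname, instr.argrepr)

# Names written by STORE_NAME, in execution order.
stored = [instr.argval for instr in instructions if instr.opname == "STORE_NAME"]
print(stored)  # ['x', 'y', 'z']
```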
  15-39. UNDER THE HOOD (animation: the same listing stays on screen while the locals and the value stack change)
               0 LOAD_CONST               0 (1)
               3 STORE_NAME               0 (x)
               6 LOAD_NAME                0 (x)
               9 LOAD_CONST               0 (1)
              12 BINARY_ADD
              13 STORE_NAME               1 (y)
              16 LOAD_NAME                0 (x)
              19 LOAD_NAME                1 (y)
              22 BINARY_ADD
              23 STORE_NAME               2 (z)
              26 LOAD_CONST               1 (None)
              29 RETURN_VALUE
    Slides 15-16: locals: 'x' => undefined, 'y' => undefined, 'z' => undefined; stack: empty
    Slides 17-18: locals: 'x' => undefined, 'y' => undefined, 'z' => undefined; stack: 1
    Slides 19-20: locals: 'x' => 1, 'y' => undefined, 'z' => undefined; stack: empty
    Slides 21-22: locals: 'x' => 1, 'y' => undefined, 'z' => undefined; stack: 1
    Slides 23-24: locals: 'x' => 1, 'y' => undefined, 'z' => undefined; stack: 1 1
    Slides 25-26: locals: 'x' => 1, 'y' => undefined, 'z' => undefined; stack: 2
    Slides 27-28: locals: 'x' => 1, 'y' => 2, 'z' => undefined; stack: empty
    Slides 29-30: locals: 'x' => 1, 'y' => 2, 'z' => undefined; stack: 1
    Slides 31-32: locals: 'x' => 1, 'y' => 2, 'z' => undefined; stack: 1 2
    Slides 33-34: locals: 'x' => 1, 'y' => 2, 'z' => undefined; stack: 3
    Slides 35-36: locals: 'x' => 1, 'y' => 2, 'z' => 3; stack: empty
    Slides 37-38: locals: 'x' => 1, 'y' => 2, 'z' => 3; stack: None
    Slide 39: locals: 'x' => 1, 'y' => 2, 'z' => 3; stack: empty
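The animation in slides 15 to 39 can be replayed with a toy stack machine. This is only a sketch of the idea, not CPython's interpreter: the instruction list is a hand-written model of the listing (operands are given as values and names rather than table indices), and `PROGRAM`, `run`, and `trace` are names invented for this example.

```python
# The listing from the slides as (opname, argument) pairs; a simplified model, not real bytecode.
PROGRAM = [
    ("LOAD_CONST", 1), ("STORE_NAME", "x"),
    ("LOAD_NAME", "x"), ("LOAD_CONST", 1), ("BINARY_ADD", None), ("STORE_NAME", "y"),
    ("LOAD_NAME", "x"), ("LOAD_NAME", "y"), ("BINARY_ADD", None), ("STORE_NAME", "z"),
    ("LOAD_CONST", None), ("RETURN_VALUE", None),
]


def run(program):
    """Execute the toy program, recording (opname, stack, locals) after every step."""
    stack, names, trace = [], {}, []
    retval = None
    for opname, arg in program:
        if opname == "LOAD_CONST":
            stack.append(arg)              # push a constant
        elif opname == "LOAD_NAME":
            stack.append(names[arg])       # push the value bound to a name
        elif opname == "STORE_NAME":
            names[arg] = stack.pop()       # pop the top of stack into a name
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)     # replace two operands with their sum
        elif opname == "RETURN_VALUE":
            retval = stack.pop()           # pop the result and finish
        else:
            raise ValueError("unsupported opcode: " + opname)
        trace.append((opname, list(stack), dict(names)))
    return retval, names, trace


retval, names, trace = run(PROGRAM)
for opname, stack, snapshot in trace:
    print(f"{opname:<13} stack={stack} locals={snapshot}")
```

The stack never holds more than two values during the run, which agrees with the co_stacksize of 2 reported on slide 13.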
  40. CEVAL.C
    PyObject *
    PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
    {
        PyObject *retval = NULL;
        PyCodeObject *co;
        co = f->f_code;
        for (;;) {
            opcode = NEXTOP();
            switch (opcode) {
                /* eventually retval is set */
            }
        }
        return retval;
    }
  41. CEVAL.C
    case BINARY_ADD:
        w = POP();
        v = TOP();
        if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
            long a, b, i;
            a = PyInt_AS_LONG(v);
            b = PyInt_AS_LONG(w);
            /* cast to avoid undefined behaviour on overflow */
            i = (long)((unsigned long)a + b);
            x = PyInt_FromLong(i);
        }
        Py_DECREF(v);
        Py_DECREF(w);
        SET_TOP(x);
  42. CLASSES ARE CODE, TOO
    >>> def factory():
    ...     class MyClass(object):
    ...         def method(self, arg):
    ...             return arg
    ...     return MyClass
    ...
    >>> dis.dis(factory)
      2           0 LOAD_CONST               1 ('MyClass')
                  3 LOAD_GLOBAL              0 (object)
                  6 BUILD_TUPLE              1
                  9 LOAD_CONST               2 (<code object MyClass>)
                 12 MAKE_FUNCTION            0
                 15 CALL_FUNCTION            0
                 18 BUILD_CLASS
                 19 STORE_FAST               0 (MyClass)
      5          22 LOAD_FAST                0 (MyClass)
                 25 RETURN_VALUE
  43. CLASSES ARE CODE, TOO
    >>> dis.dis(factory.func_code.co_consts[2])
      2           0 LOAD_NAME                0 (__name__)
                  3 STORE_NAME               1 (__module__)
      3           6 LOAD_CONST               0 (<code object method>)
                  9 MAKE_FUNCTION            0
                 12 STORE_NAME               2 (method)
                 15 LOAD_LOCALS
                 16 RETURN_VALUE
    >>> dis.dis(factory.func_code.co_consts[2].co_consts[0])
      4           0 LOAD_FAST                1 (arg)
                  3 RETURN_VALUE
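The nesting shown on these two slides (a function's code object holds the class body's code object, which holds the method's) can be walked without hard-coding co_consts indices. This sketch assumes Python 3, where func_code is spelled __code__; the helper `nested_code_names` is hypothetical.

```python
import types


def factory():
    class MyClass(object):
        def method(self, arg):
            return arg
    return MyClass


def nested_code_names(code, depth=0):
    """Yield (depth, co_name) for a code object and every code object inside its co_consts."""
    yield depth, code.co_name
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from nested_code_names(const, depth + 1)


tree = list(nested_code_names(factory.__code__))
for depth, name in tree:
    print("    " * depth + name)
```

Searching co_consts by type avoids relying on positions such as co_consts[2], which are a compiler detail and may shift between versions.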
  44. CLOSURES
    >>> def outer(x):
    ...     def inner(y):
    ...         return x + y
    ...     return inner
    ...
    >>> dis.dis(outer)
      2           0 LOAD_CLOSURE             0 (x)
                  3 BUILD_TUPLE              1
                  6 LOAD_CONST               1 (<code object inner>)
                  9 MAKE_CLOSURE             0
                 12 STORE_FAST               1 (inner)
      4          15 LOAD_FAST                1 (inner)
                 18 RETURN_VALUE
  46. CLOSURES
    >>> inner = outer(1)
    >>> dis.dis(inner)
      3           0 LOAD_DEREF               0 (x)
                  3 LOAD_FAST                0 (y)
                  6 BINARY_ADD
                  7 RETURN_VALUE
    >>> inner.func_closure
    (<cell at 0x1050ef210: int object at 0x104f131b8>,)
    >>> inner.func_closure[0].cell_contents
    1
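The closure machinery is also visible from the code objects themselves. A short sketch, assuming Python 3, where func_closure is spelled __closure__ and func_code is spelled __code__:

```python
def outer(x):
    def inner(y):
        return x + y
    return inner


inner = outer(1)

print(outer.__code__.co_cellvars)          # ('x',): names outer shares with nested functions
print(inner.__code__.co_freevars)          # ('x',): names inner takes from its enclosing scope
print(inner.__closure__[0].cell_contents)  # 1
print(inner(41))                           # 42
```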
  47. ZOOMING OUT
    • I hope this was interesting
    • This is all only for CPython (2.x)!
    • Go exploring, share what you find
    • Questions?