note35
October 07, 2025

Beyond the Code: Manipulating Bytecode and Building Community

Explore Python bytecode's evolution, its impact on tools like pytype through PEP 709, and the vital role of OSS collaboration in shaping Python's future.


Transcript

  1. Timeline: PEP 729 (2023-11-20), Google lays off its Python team (2024-04), EuroPython 2024 (2024-07-10).

    * All information is publicly available.
  2. Pytype's future? Can anyone maintain it?

    Timeline: PEP 729 (2023-11-20), Google lays off its Python team (2024-04), EuroPython 2024 (2024-07-10), EuroPython 2025 (2025-07-14).
  3. >>> python_code = "x=1" >>> import dis # disassembler >>>

    dis.dis(python_code) 8 Code in CPython 3.13.2 Source code
  4. >>> python_code = "x=1" >>> import dis # disassembler >>>

    dis.dis(python_code) 0 RESUME 0 1 LOAD_CONST 0 (1) STORE_NAME 0 (x) RETURN_CONST 1 (None) 9 Code in CPython 3.13.2 Bytecode operations Source code
  5. >>> python_code = "x=1" >>> import dis # disassembler >>>

    dis.dis(python_code) 0 RESUME 0 1 LOAD_CONST 0 (1) STORE_NAME 0 (x) RETURN_CONST 1 (None) >>> code_object = compile(python_code, '<string>', 'exec') 10 Code in CPython 3.13.2 Code object Bytecode operations Source code
  6. Code in CPython 3.13.2: source code, bytecode operations, code object, bytecode

    >>> python_code = "x=1"
    >>> import dis  # disassembler
    >>> dis.dis(python_code)
      0  RESUME        0
      1  LOAD_CONST    0 (1)
         STORE_NAME    0 (x)
         RETURN_CONST  1 (None)
    >>> code_object = compile(python_code, '<string>', 'exec')
    >>> bytecode = code_object.co_code
    >>> [opc for opc in bytecode]
    [149, 0, 83, 0, 114, 0, 103, 1]
    >>> [dis.opname[opc] for opc in bytecode]
    ['RESUME', 'CACHE', 'LOAD_CONST', 'CACHE', 'STORE_NAME', 'CACHE', 'RETURN_CONST', 'BEFORE_ASYNC_WITH']

    Note: every second byte is an oparg, not an opcode, so the names at the odd positions ('CACHE', 'BEFORE_ASYNC_WITH') are artifacts of this byte-by-byte lookup.
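The byte-by-byte lookup above treats opargs as opcodes. A minimal sketch of pairing each opcode with its oparg instead, using the standard `dis.get_instructions`. The helper names `decode` and `raw_pairs` are hypothetical, and the exact opcode names and numbers vary between CPython versions.

```python
import dis

def decode(source: str):
    """Return (offset, opname, arg) triples for module-level source code."""
    code_object = compile(source, "<string>", "exec")
    # get_instructions pairs each opcode with its oparg, unlike a flat
    # lookup of every byte in dis.opname.
    return [(ins.offset, ins.opname, ins.arg)
            for ins in dis.get_instructions(code_object)]

def raw_pairs(source: str):
    """Split co_code into (opcode byte, oparg byte) pairs (CPython >= 3.6)."""
    raw = compile(source, "<string>", "exec").co_code
    return [(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]

decoded = decode("x=1")
pairs = raw_pairs("x=1")
for triple in decoded:
    print(triple)
```

`raw_pairs` keeps the raw numbers, while `decode` gives the symbolic view that `dis.dis` prints.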
  7. x=1 12 [149, 0, 83, 0, 114, 0, 103, 1]

    compile? how? Code in CPython 3.13.2
  8. 13 >>> import ast >>> print(ast.dump(ast.parse(py_code), indent=2)) Module( body=[ Assign(

    targets=[ Name(id='x', ctx=Store())], value=Constant(value=1))]) Lexer/Parser: AST (AST = Abstract Syntax Tree) x=1 Code in CPython 3.13.2
  9. 14 ast Module( body=[ Assign( targets=[ Name(id='x', ctx=Store())], value=Constant(value=1))]) Compiler:

    Bytecode x=1 dis 0 RESUME 0 1 LOAD_CONST 0 (1) STORE_NAME 0 (x) RETURN_CONST 1 (None) [149, 0, 83, 0, 114, 0, 103, 1] Code in CPython 3.13.2
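The two stages on this slide can be driven by hand from Python. The sketch below is illustrative only and assumes nothing beyond the standard `ast` module and the built-ins `compile` and `exec`. It parses the source into an AST, compiles the AST into a code object, and then runs the resulting bytecode.

```python
import ast
import types

source = "x=1"

# Lexer/Parser: source text -> AST
tree = ast.parse(source)

# Compiler: AST -> code object (compile() accepts an AST as well as a string)
code_object = compile(tree, "<string>", "exec")

# The raw bytecode is stored on the code object
bytecode = code_object.co_code

# Executing the code object runs the bytecode on the virtual machine
namespace = {}
exec(code_object, namespace)

print(type(tree).__name__, type(code_object).__name__, len(bytecode), namespace["x"])
```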
  10. x=1 15 [149, 0, 83, 0, 114, 0, 103, 1]

    compile, but Why? Code in CPython 3.13.2
  11. Why does Python need bytecode?

    General compiler: source code → Frontend → IR (Intermediate Representation) → Backend → Machine code
  12. Why does Python need bytecode?

    General compiler: source code → Frontend → IR (Intermediate Representation) → Backend → Machine code
    Python: source code → Lexer / Parser → AST (Abstract syntax tree) → Compiler → Bytecode
  13. Why does Python need bytecode? (Lexer / Parser → Python AST → Compiler → Bytecode)

    1. Instruction set for a Virtual Machine (VM) to manage complexity: run the bytecode under a VM that supports different platforms (Windows, Linux, …).
  14. Why does Python need bytecode? (Lexer / Parser → Python AST → Compiler → Bytecode)

    2. An Intermediate Representation (IR) is not specific to any source language or target machine. It eases the effort of implementing AST-based tools such as transpilers.
  15. Why does Python need bytecode? (Lexer / Parser → Python AST → Compiler → Bytecode)

    3. Optimization clarity.
  16. AST optimization: n = 2**32 → n = 4294967296 (Code in CPython 3.13.2)

    >>> py_code = "n=2**32"
    >>> print(ast.dump(ast.parse(py_code), indent=2))
    Module(
      body=[
        Assign(
          targets=[
            Name(id='n', ctx=Store())],
          value=BinOp(
            left=Constant(value=2),
            op=Pow(),
            right=Constant(value=32)))])
    >>> print(ast.dump(ast.parse(py_code, optimize=1), indent=2))
    Module(
      body=[
        Assign(
          targets=[
            Name(id='n', ctx=Store())],
          value=Constant(value=4294967296))])
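The effect of this folding can also be observed on the compiled code object, without relying on the `optimize` argument of `ast.parse`. The sketch below only assumes that CPython folds `2**32` somewhere before emitting bytecode; which stage performs the folding differs between versions.

```python
import dis

code_object = compile("n = 2**32", "<string>", "exec")

# If the power was folded at compile time, the result is stored as a constant
# and no power instruction is left in the bytecode.
folded = 4294967296 in code_object.co_consts
opnames = [ins.opname for ins in dis.get_instructions(code_object)]
has_runtime_power = any(name in ("BINARY_OP", "BINARY_POWER") for name in opnames)

print(folded, has_runtime_power)
```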
  17. Bytecode optimization: def f(): return; x = 1 → def f(): return (Code in CPython 3.13.2)

    >>> py_code = "def f(): return; x = 1"
    >>> print(ast.dump(ast.parse(py_code, optimize=1), indent=2))
    Module(
      body=[
        FunctionDef(
          name='f',
          args=arguments(),
          body=[
            Return(),
            Assign(
              targets=[
                Name(id='x', ctx=Store())],
              value=Constant(value=1))])])
    >>> dis.dis(py_code)
      0  RESUME         0
      1  LOAD_CONST     0 (<code object f at 0x...)
         MAKE_FUNCTION
         STORE_NAME     0 (f)
         RETURN_CONST   1 (None)
    Disassembly of <code object f at 0x...:
      1  RESUME         0
         RETURN_CONST   0 (None)

    * AST optimization may not always work due to the limitations of static analysis.
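A small check of the same behaviour on a live function: the unreachable assignment produces no store instruction, although `x` is still registered as a local name. This is a sketch that relies only on `dis.get_instructions`; the exact instruction names depend on the CPython version.

```python
import dis

def f():
    return
    x = 1  # unreachable: never compiled into bytecode

opnames = [ins.opname for ins in dis.get_instructions(f)]
stores = [name for name in opnames if name.startswith("STORE")]

# The symbol table is built from the AST, so x is still listed as a local.
print(opnames, f.__code__.co_varnames)
```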
  18. Case study: PEP 709 - Inlined comprehensions (Code in PEP 709)

    >>> def f(lst):
    ...     return [x for x in lst]

    The bytecode for this comprehension was changed to improve performance!
  19. Bytecode before PEP 709 (Code in PEP 709)

    >>> def f(lst):
    ...     return [x for x in lst]
    ...
    >>> dis.dis(f)
      1           0 RESUME            0
      2           2 LOAD_CONST        1 (<code object <listcomp> at 0x...)
                  4 MAKE_FUNCTION     0
                  6 LOAD_FAST         0 (lst)
                  8 GET_ITER
                 10 CALL              0
                 20 RETURN_VALUE
    Disassembly of <code object <listcomp> at 0x...>:
      2           0 RESUME            0
                  2 BUILD_LIST        0
                  4 LOAD_FAST         0 (.0)
            >>    6 FOR_ITER          4 (to 18)
                 10 STORE_FAST        1 (x)
                 12 LOAD_FAST         1 (x)
                 14 LIST_APPEND       2
                 16 JUMP_BACKWARD     6 (to 6)
            >>   18 END_FOR
                 20 RETURN_VALUE

    Before PEP 709, the compiler emitted MAKE_FUNCTION for every comprehension, creating and calling a nested function, because that kept the compiler simple.
  20. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 29 co_consts = { } co_varnames = { } Code in PEP 709
  21. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 30 <listcomp> co_consts = { # 0: None 1: <listcomp> } co_varnames = { } Code in PEP 709
  22. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 31 <listcomp> co_consts = { 1: <listcomp> } co_varnames = { } Code in PEP 709
  23. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 32 co_consts = { 1: <listcomp> } co_varnames = { 0: lst } <listcomp> lst Code in PEP 709 <listcomp>
  24. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 33 co_consts = { 1: <listcomp> } co_varnames = { 0: lst } <listcomp> iter(lst) Code in PEP 709
  25. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 34 co_consts = { 1: <listcomp> } co_varnames = { 0: lst } <listcomp>(iter(lst)) Code in PEP 709 <listcomp> iter(lst)
  26. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 35 Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) }
  27. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 36 [] Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) }
  28. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 37 [] iter(lst) Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) }
  29. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 38 co_varnames = { 0: .0 (= iter(lst)) } nxt = next(iter(lst)) [] iter(lst) nxt Code in PEP 709 co_consts = { }
  30. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 39 [] iter(lst) Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) 1: nxt (x) }
  31. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 40 [] iter(lst) nxt Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) 1: nxt (x) }
  32. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 41 [nxt] iter(lst) Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) 1: nxt (x) }
  33. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 42 [nxt, ...] iter(lst) Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) 1: nxt (x) }
  34. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 43 [nxt, ...] Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) 1: nxt (x) }
  35. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 44 Code in PEP 709 co_consts = { } co_varnames = { 0: .0 (= iter(lst)) 1: nxt (x) }
  36. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 45 co_consts = { 1: <listcomp> } [nxt, ...] Code in PEP 709
  37. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 0x...) 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE Stack-based VM dive deep 46 co_consts = { 1: <listcomp> } Code in PEP 709
  38. Bytecode after PEP 709 (Code in PEP 709)

    >>> def f(lst):
    ...     return [x for x in lst]
    ...
    >>> dis.dis(f)

    Before:
      1           0 RESUME            0
      2           2 LOAD_CONST        1 (<code object <listcomp> at
                  4 MAKE_FUNCTION     0
                  6 LOAD_FAST         0 (lst)
                  8 GET_ITER
                 10 CALL              0
                 20 RETURN_VALUE
    Disassembly of <code object <listcomp> at 0x...>:
      2           0 RESUME            0
                  2 BUILD_LIST        0
                  4 LOAD_FAST         0 (.0)
            >>    6 FOR_ITER          4 (to 18)
                 10 STORE_FAST        1 (x)
                 12 LOAD_FAST         1 (x)
                 14 LIST_APPEND       2
                 16 JUMP_BACKWARD     6 (to 6)
            >>   18 END_FOR
                 20 RETURN_VALUE

    After PEP 709:
      1           0 RESUME                0
      2           2 LOAD_FAST             0 (lst)
                  4 GET_ITER
                  6 LOAD_FAST_AND_CLEAR   1 (x)
                  8 SWAP                  2
                 10 BUILD_LIST            0
                 12 SWAP                  2
            >>   14 FOR_ITER              4 (to 26)
                 18 STORE_FAST            1 (x)
                 20 LOAD_FAST             1 (x)
                 22 LIST_APPEND           2
                 24 JUMP_BACKWARD         6 (to 14)
            >>   26 END_FOR
                 28 SWAP                  2
                 30 STORE_FAST            1 (x)
                 32 RETURN_VALUE

    New bytecode: LOAD_FAST_AND_CLEAR. The MAKE_FUNCTION-based call sequence is replaced by LOAD_FAST_AND_CLEAR + SWAP.
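One way to see which of the two shapes a given interpreter produces is to look for a nested `<listcomp>` code object in the function's constants. This is a minimal sketch. The helper name `nested_code_names` is hypothetical, and it assumes that PEP 709 inlining applies from CPython 3.12 onwards.

```python
import sys
import types

def f(lst):
    return [x for x in lst]

def nested_code_names(func):
    """Names of the code objects stored in func's co_consts."""
    return [const.co_name
            for const in func.__code__.co_consts
            if isinstance(const, types.CodeType)]

nested = nested_code_names(f)
# Before PEP 709 the comprehension is a separate <listcomp> code object;
# after PEP 709 it is inlined into f and no nested code object exists.
inlined = "<listcomp>" not in nested

print(sys.version_info[:2], nested, inlined)
```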
  39. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 48 Code in PEP 709 co_consts = { } co_varnames = { 0: lst 1: local_val (x) } iter(lst) local_val 1. Pushes local variables to the stack from co_varnames.
  40. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 49 Code in PEP 709 co_consts = { } co_varnames = { 0: lst 1: local_val (x) } iter(lst) local_val 1. Pushes local variables to the stack from co_varnames.
  41. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 50 local_val iter(lst) Code in PEP 709 co_consts = { } co_varnames = { 0: lst 1: local_val (x) }
  42. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 51 local_val iter(lst) [] Code in PEP 709 co_consts = { } co_varnames = { 0: lst 1: local_val (x) }
  43. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 52 local_val [] iter(lst) Code in PEP 709 The stack is ready to perform operations in the loop! co_consts = { } co_varnames = { 0: lst 1: local_val (x) }
  44. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 54 local_val [nxt, ...] Code in PEP 709 co_consts = { } co_varnames = { 0: lst 1: nxt }
  45. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 55 [nxt, ...] local_val Code in PEP 709 co_consts = { } co_varnames = { 0: lst 1: nxt }
  46. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE Stack-based VM dive deep 56 [nxt, ...] Code in PEP 709 co_consts = { } co_varnames = { 0: lst 1: local_val } 2. Pops local variables back to co_varnames.
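The stack walk-through in these slides can be replayed with an ordinary Python list standing in for the value stack. The sketch below is a model of the instruction sequence shown above, not CPython's implementation. The names `simulate_inlined_listcomp`, `swap2` and `_UNBOUND` are hypothetical, and a dict stands in for the frame's local variable slots.

```python
_UNBOUND = object()  # model-only marker for "x currently has no value"

def swap2(stack):
    """SWAP 2: exchange the two topmost stack entries."""
    stack[-1], stack[-2] = stack[-2], stack[-1]

def simulate_inlined_listcomp(frame_locals):
    """Model of `[x for x in lst]` after PEP 709, with a list as the stack."""
    stack = []
    stack.append(iter(frame_locals["lst"]))        # LOAD_FAST lst; GET_ITER
    stack.append(frame_locals.pop("x", _UNBOUND))  # LOAD_FAST_AND_CLEAR x
    swap2(stack)                                   # SWAP 2
    stack.append([])                               # BUILD_LIST 0
    swap2(stack)                                   # SWAP 2
    # The stack is now [saved x, result list, iterator]: ready for the loop.
    for item in stack[-1]:                         # FOR_ITER
        frame_locals["x"] = item                   # STORE_FAST x
        # LOAD_FAST x; LIST_APPEND 2 (the model appends directly instead of
        # pushing x and popping it again)
        stack[-2].append(frame_locals["x"])
    stack.pop()                                    # END_FOR: drop the iterator
    swap2(stack)                                   # SWAP 2
    saved = stack.pop()                            # STORE_FAST x (restore)
    if saved is _UNBOUND:
        frame_locals.pop("x", None)
    else:
        frame_locals["x"] = saved
    return stack.pop()                             # RETURN_VALUE

frame = {"lst": [1, 2, 3], "x": "outer"}
result = simulate_inlined_listcomp(frame)
print(result, frame["x"])
```

The saved value of `x` sits at the bottom of the stack for the whole loop, which is how the inlined comprehension avoids leaking its loop variable into the enclosing function.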
  47. >>> def f(lst): ... return [x for x in lst]

    ... >>> dis.dis(f) 1 0 RESUME 0 2 2 LOAD_CONST 1 (<code object <listcomp> at 4 MAKE_FUNCTION 0 6 LOAD_FAST 0 (lst) 8 GET_ITER 10 CALL 0 20 RETURN_VALUE Disassembly of <code object <listcomp> at 0x...>: 2 0 RESUME 0 2 BUILD_LIST 0 4 LOAD_FAST 0 (.0) >> 6 FOR_ITER 4 (to 18) 10 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 14 LIST_APPEND 2 16 JUMP_BACKWARD 6 (to 6) >> 18 END_FOR 20 RETURN_VALUE 1 0 RESUME 0 2 2 LOAD_FAST 0 (lst) 4 GET_ITER 6 LOAD_FAST_AND_CLEAR 1 (x) 8 SWAP 2 10 BUILD_LIST 0 12 SWAP 2 >> 14 FOR_ITER 4 (to 26) 18 STORE_FAST 1 (x) 20 LOAD_FAST 1 (x) 22 LIST_APPEND 2 24 JUMP_BACKWARD 6 (to 14) >> 26 END_FOR 28 SWAP 2 30 STORE_FAST 1 (x) 32 RETURN_VALUE 57 After PEP 709 New Bytecode: LOAD_FAST_AND_CLEAR Before Code in PEP 709 Recap
  48. #1: Bytecode is difficult.

    1) Bytecode requires deep CPython knowledge.
    2) Bytecode changes can cause unexpected impacts on CPython itself and other projects.
  49. 66 Case 1 Case 2 Case 3 def f1(x: str):

    pass f1(1) // type error! def f2(x): def g(x: str): pass g(x) f2(1) // type error! def f3(x: int): def g(x: str): pass g(x) f3(1) // type error! AST-based type checker Bytecode-based type checker Type Check: AST vs Bytecode ⏰
  50. 67 Case 1 Case 2 Case 3 def f1(x: str):

    pass f1(1) // type error! def f2(x): def g(x: str): pass g(x) f2(1) // type error! def f3(x: str): def g(x: str): pass g(x) f3(1) // type error! AST-based type checker ✅ ❌ ✅ Bytecode-based type checker Type Check: AST vs Bytecode
  51. Type Check: AST vs Bytecode

    Case 1:
        def f1(x: str): pass
        f1(1)  # type error!
    Case 2:
        def f2(x):
            def g(x: str): pass
            g(x)
        f2(1)  # type error!
    Case 3:
        def f3(x: str):
            def g(x: str): pass
            g(x)
        f3(1)  # type error!

    AST-based type checker:      Case 1 ✅, Case 2 ❌, Case 3 ✅
    Bytecode-based type checker: Case 1 ✅, Case 2 ✅, Case 3 ✅
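To make the Case 2 gap concrete, here is a deliberately naive AST-only check. It is a toy, not how any real type checker works, and the function name `literal_arg_errors` is hypothetical. It compares literal call arguments with simple name annotations and never follows a value from one function into another, which is why it cannot see Case 2.

```python
import ast

def literal_arg_errors(source: str) -> list:
    """Report calls whose literal argument disagrees with a simple annotation."""
    tree = ast.parse(source)
    # Collect (parameter name, annotation name) for every def, nested or not.
    signatures = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            signatures[node.name] = [
                (a.arg, a.annotation.id if isinstance(a.annotation, ast.Name) else None)
                for a in node.args.args
            ]
    errors = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in signatures):
            for (param, annotation), arg in zip(signatures[node.func.id], node.args):
                # Only literal arguments are checked; a variable such as `x`
                # would require tracking which values flow into it.
                if annotation is not None and isinstance(arg, ast.Constant):
                    if type(arg.value).__name__ != annotation:
                        errors.append(f"{node.func.id}({arg.value!r}): "
                                      f"{param} expects {annotation}")
    return errors

case1 = "def f1(x: str): pass\nf1(1)\n"
case2 = "def f2(x):\n    def g(x: str): pass\n    g(x)\nf2(1)\n"
case3 = "def f3(x: str):\n    def g(x: str): pass\n    g(x)\nf3(1)\n"

print(literal_arg_errors(case1))
print(literal_arg_errors(case2))
print(literal_arg_errors(case3))
```

Catching Case 2 requires knowing that the `1` passed to `f2` reaches `g(x)`, which is the kind of information a checker gets by simulating execution.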
  52. Why is a bytecode-based type checker not popular?

    1. Maintainability: The developer needs to be very passionate about Python.
    2. Performance: Deep check ∝ Performance cost (Skipped check for optimization's example.)
  53. Pytype developers fight with new bytecode

    • PR: Minimal changes for Python 3.12
    • LOAD_FAST_AND_CLEAR opcode support in Pytype's VM - 1 2
  54. Other Python implementations and specialized tools

    • python-xdis - disassembles bytecode from different versions of Python
    • GraalPython - Python language for the JVM built on GraalVM
      ◦ GitHub issue: Support for Python 3.12?
        ▪ Unresolved as of 2025/Apr
      ◦ Bytecode in graalpython/lib-python/3/opcode.py
    • PyPy - JIT compiler
      ◦ Release tagged with supported Python versions: https://github.com/pypy/pypy/tags
        ▪ CPython 3.11 as of 2025/Apr
      ◦ Bytecode in rpython/tool/stdlib_opcode.py
      ◦ Bytecode in rpython/rlib/rsre/rsre_constants.py
    • … and more
  55. #1: Bytecode is difficult.

    1) Bytecode requires deep CPython knowledge.
    2) Bytecode changes can cause unexpected impacts on CPython itself and other projects.

    #2: Manipulating Python bytecode is expensive.

    1) It depends on developers' passion for CPython itself.
    2) It is hard to keep up to date with the newest CPython.
    3) It is hard to connect directly to business value.
  56. 76

  57. Community Engagement: Today's Python → Future's Python

    Channels: PyCon, PEP, discuss.python.org (social media, and issue tracker…), Blog, Podcast, YouTube. Deepen your understanding of CPython?
  58. def f(): y = 2 x = 1 return x

    + y 82 [149, 0, 83, 0, 26, 0, 114, 0, 103, 1] 3 Code in CPython 3.13.2
  59. Create your own bytecode parser (disassembler)? (Code in CPython 3.11.11)

    import dis

    def my_dis(bytecodes: bytes, show_caches: bool = False):
        # Disassemble the bytecode like dis.dis, CPython >= 3.6 & < 3.13.
        # Each instruction is two bytes: the opcode, then its oparg.
        i = 0
        while i < len(bytecodes):
            opcode, oparg = dis.opname[bytecodes[i]], bytecodes[i+1]
            if show_caches or opcode != 'CACHE':
                print(i, opcode, oparg)
            i += 2

    def f():
        y = 2
        x = 1
        return x + y

    # my_dis prints as it goes and returns None, so it is called directly.
    # show_caches=True matches the output below, which includes the CACHE entry.
    my_dis(f.__code__.co_code, show_caches=True)

    Output:
    0 RESUME 0
    2 LOAD_CONST 1
    4 STORE_FAST 0
    6 LOAD_CONST 2
    8 STORE_FAST 1
    10 LOAD_FAST 1
    12 LOAD_FAST 0
    14 BINARY_OP 0
    16 CACHE 0
    18 RETURN_VALUE 0
  60. Create your own bytecode runner (virtual machine)? (Code in CPython 3.11.11)

    import dis

    def my_byterun(func):
        # Run the bytecode, CPython < 3.13.
        # As written, `match` needs 3.10+ and the BINARY_OP opcode needs 3.11+.
        code = func.__code__
        co_code, co_varnames, co_consts = code.co_code, code.co_varnames, code.co_consts
        value_stack, varname_to_value = [], {}
        i = 0
        while i < len(co_code):
            opcode, oparg = dis.opname[co_code[i]], co_code[i+1]
            # print(i, opcode, oparg)
            match opcode:
                case 'CACHE' | 'RESUME':
                    pass
                case 'LOAD_CONST':
                    value_stack.append(co_consts[oparg])
                case 'LOAD_FAST':
                    value_stack.append(varname_to_value[co_varnames[oparg]])
                case 'STORE_FAST':
                    varname_to_value[co_varnames[oparg]] = value_stack.pop()
                case 'RETURN_VALUE':
                    # Stop executing as soon as the function returns.
                    return value_stack.pop()
                case 'BINARY_OP':
                    # The right operand is on top of the stack, so it is popped first.
                    right, left = value_stack.pop(), value_stack.pop()
                    # Skip mapping BINARY_OP to + (requires a hardcoded map w/ oparg).
                    value_stack.append(left + right)
            i += 2
        return None

    >>> def f(): y = 2; x = 1; return x + y
    >>> print(my_byterun(f))
    3
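The runner above hardcodes `+` for every BINARY_OP. One possible way to fill in the missing oparg map, sketched under the assumption that probing tiny expressions with `dis` is acceptable: compile `a <op> b` for each operator and record the oparg that BINARY_OP carries. The names `_CANDIDATES`, `build_binary_op_table` and `BINARY_OP_TABLE` are hypothetical. On interpreters without a BINARY_OP opcode (before CPython 3.11) the table stays empty.

```python
import dis
import operator
import sys

# Operators to probe; the functions come from the stdlib `operator` module.
_CANDIDATES = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.truediv, "//": operator.floordiv, "%": operator.mod,
}

def build_binary_op_table():
    """Map BINARY_OP opargs to Python functions by disassembling probes."""
    table = {}
    for symbol, func in _CANDIDATES.items():
        code = compile(f"a {symbol} b", "<probe>", "eval")
        for ins in dis.get_instructions(code):
            if ins.opname == "BINARY_OP":
                table[ins.arg] = func
    return table

BINARY_OP_TABLE = build_binary_op_table()
print(sys.version_info[:2], sorted(BINARY_OP_TABLE))
```

With such a table, the BINARY_OP case could call `BINARY_OP_TABLE[oparg](left, right)` instead of always adding.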
  61. Code to Bytecode (Code in CPython 3.13.2)

    Python code
    >>> a = f'''
    if x > 42:
      y = True
    else:
      y = '42'
    '''

    AST
    >>> print(ast.dump(ast.parse(a), indent=2))
    Module(
      body=[
        If(
          test=Compare(
            left=Name(id='x', ctx=Load()),
            ops=[
              Gt()],
            comparators=[
              Constant(value=42)]),
          body=[
            Assign(
              targets=[
                Name(id='y', ctx=Store())],
              value=Constant(value=True))],
          orelse=[
            Assign(
              targets=[
                Name(id='y', ctx=Store())],
              value=Constant(value='42'))])],
      type_ignores=[])

    Bytecode operations
    >>> dis.dis(a)
      0      RESUME             0
      2      LOAD_NAME          0 (x)
             LOAD_CONST         0 (42)
             COMPARE_OP         148 (bool(>))
             POP_JUMP_IF_FALSE  3 (to L1)
      3      LOAD_CONST         1 (True)
             STORE_NAME         1 (y)
             RETURN_CONST       3 (None)
      5  L1: LOAD_CONST         2 ('42')
             STORE_NAME         1 (y)
             RETURN_CONST       3 (None)
  62. Code to Bytecode - AST nodes Bytecode operations >>> dis.dis(a)

    0 RESUME 0 2 LOAD_NAME 0 (x) LOAD_CONST 0 (42) COMPARE_OP 148 (bool(>)) POP_JUMP_IF_FALSE 3 (to L1) 3 LOAD_CONST 1 (True) STORE_NAME 1 (y) RETURN_CONST 3 (None) 5 L1: LOAD_CONST 2 ('42') STORE_NAME 1 (y) RETURN_CONST 3 (None) ... 87 One AST node Python code >>> a = f''' if x > 42: y = True else: y = '42' ... ''' Other AST nodes AST >>> print(ast.dump(ast.parse(a), indent=2)) Module( body=[ If( test=Compare( left=Name(id='x', ctx=Load()), ops=[ Gt()], comparators=[ Constant(value=42)]), body=[ Assign( targets=[ Name(id='y', ctx=Store())], value=Constant(value=True))], orelse=[ Assign( targets=[ Name(id='y', ctx=Store())], value=Constant(value='42'))]), ...], type_ignores=[]) Code in CPython 3.13.2
  63. CFG Control flow graph (of AST nodes) 88 Module( body=[

    Node B, Node C, Node D, Node A]) AST nodes Node A (entry) Node D Node B Node C 0 RESUME 0 1 LOAD_CONST 0 (<code object f a MAKE_FUNCTION STORE_NAME 0 (f) 4 LOAD_CONST 1 (<code object g a MAKE_FUNCTION STORE_NAME 1 (g) 7 LOAD_CONST 2 (<code object h a MAKE_FUNCTION STORE_NAME 2 (h) 10 LOAD_NAME 2 (h) PUSH_NULL CALL 0 POP_TOP RETURN_CONST 3 (None) Disassembly of <code object f at 0x102cc1550, file "", lin 1 RESUME 0 2 RETURN_CONST 1 (1) Disassembly of <code object g at 0x102bcd450, file "", lin 4 RESUME 0 5 LOAD_GLOBAL 1 (f + NULL) CALL 0 RETURN_VALUE Disassembly of <code object h at 0x102b96c30, file "", lin 7 RESUME 0 8 LOAD_GLOBAL 1 (f + NULL) CALL 0 LOAD_GLOBAL 3 (g + NULL) CALL 0 BINARY_OP 0 (+) RETURN_VALUE Node A (entry) Node C Node D Node B Code to Bytecode - AST nodes AST nodes to CFG Code in CPython 3.13.2
  64. Code to Bytecode - from AST nodes to i_opcode 0

    RESUME 0 2 LOAD_NAME 0 (x) LOAD_CONST 0 (42) COMPARE_OP 148 (bool(>)) POP_JUMP_IF_FALSE 3 (to L1) 3 LOAD_CONST 1 (True) STORE_NAME 1 (y) RETURN_CONST 3 (None) 5 L1: LOAD_CONST 2 ('42') STORE_NAME 1 (y) RETURN_CONST 3 (None) 89 Basic block AST node Instruction 0 2 3 5 AST node(s) i_opcode i_oparg Other AST nodes … Code in CPython 3.13.2
  65. 90 Code to Bytecode - symtable [...] compile? py_code =

    ''' global_var = 1 def outer_f(): outer_var = 2 def inner_f(): global global_var nonlocal outer_var local_var = 3 ''' CFG Control flow graph (of AST nodes) AST node (each node has their own symtable entry) Compiler unit Created for the first node of a new CFG Compiler Global state of compilation (symtable) Code object Bytecode Python code Code in CPython 3.13.2
  66. Code to Bytecode - symtable: local, global, free variables (Code in CPython 3.13.2)

    py_code = '''
    global_var = 1
    def outer_f():
      outer_var = 2
      def inner_f():
        global global_var
        nonlocal outer_var
        local_var = 3
    '''

    AST
    Module(
      body=[
        Assign(
          targets=[
            Name(id='global_var', ctx=Store())],
          value=Constant(value=1)),
        FunctionDef(
          name='outer_f',
          args=arguments(),
          body=[
            Assign(
              targets=[
                Name(id='outer_var', ctx=Store())],
              value=Constant(value=2)),
            FunctionDef(
              name='inner_f',
              args=arguments(),
              body=[
                Global(
                  names=[
                    'global_var']),
                Nonlocal(
                  names=[
                    'outer_var']),
                Assign(
                  targets=[
                    Name(id='local_var', ctx=Store())],
                  value=Constant(value=3))])])])

    symtable
    from symtable import symtable
    table = symtable(py_code, "<string>", "exec")
    outer_table = table.get_children()[0]
    inner_table = outer_table.get_children()[0]
    # Print name: is_local, is_global, is_free for each symbol in inner_f.
    for s in inner_table.get_symbols():
        print(f"{s.get_name()}: {s.is_local()}, {s.is_global()}, {s.is_free()}")

    Output:
    global_var: False, True, False
    outer_var: False, False, True
    local_var: True, False, False
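The scope that the symbol table assigns to a name decides which store instruction the compiler emits for it. The sketch below uses a hypothetical variant of the slide's example: `inner_f` also assigns to the global and nonlocal names, and `outer_f` returns `inner_f`, so that all three kinds of store show up in one disassembly.

```python
import dis

# Hypothetical variant of the slide's py_code, extended so inner_f assigns
# to a global, a nonlocal (free) and a local name.
src = '''
global_var = 1
def outer_f():
    outer_var = 2
    def inner_f():
        global global_var
        nonlocal outer_var
        global_var = 10
        outer_var = 20
        local_var = 3
    return inner_f
'''

namespace = {}
exec(src, namespace)
inner_f = namespace["outer_f"]()

# Map each assigned name to the store opcode chosen for it.
stores = {ins.argval: ins.opname
          for ins in dis.get_instructions(inner_f)
          if ins.opname.startswith("STORE_")}
print(stores)
```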
  67. Examples that a bytecode-based type checker can handle

    Example 1:
        from typing import Optional
        def f(x: Optional[str]):
          if x is not None:
            g(x)
        def g(x: str): pass

    Example 2:
        from random import randint
        def f(x: int): pass
        y = 42 if randint(1, 2) else "foo"
        f(y)

    Example 3:
        def f(x: int): pass
        y = 42 if True else "foo"
        f(y)
  68. Appendix

    • Bytecode optimization
      ◦ Copy-And-Patch: generating fast code + generating code faster
      ◦ 2024 Brandt Bucher: Building a JIT compiler for CPython
      ◦ 2023 Brandt Bucher: Inside CPython 3.11's new specializing, adaptive interpreter
    • Rust-based bytecode type checker
      ◦ Ruff: https://github.com/astral-sh/ruff/issues/3893 (AST-based)
      ◦ pylyzer: https://github.com/mtshiba/pylyzer (AST-based)
    • CPython - ceval.c
      ◦ https://github.com/python/cpython/blob/main/Python/ceval.c
      ◦ https://github.com/python/cpython/blob/3.9/Python/ceval.c
      ◦ https://github.com/python/cpython/blob/main/Tools/cases_generator/interpreter_definition.md
    • Ten Thousand Meters
      ◦ Python behind the scenes #2: how the CPython compiler works