Liran Haimovitch - Understanding Python’s Debugging Internals

Knowing your enemies is as important as knowing your friends. Understanding your debugger is a little of both. Have you ever wondered how Python debugging looks on the inside? On our journey to building a Python debugger, we learned a lot about its internals, quirks and more.

During this session, we’ll share how debugging actually works in Python. We’ll discuss the differences between CPython and PyPy interpreters, explain the underlying debugging mechanism and show you how to utilize this knowledge at work and up your watercooler talk game.

https://us.pycon.org/2019/schedule/presentation/181/

PyCon 2019

May 03, 2019

Transcript

  1. Who am I?
    • Co-Founder and CTO of Rookout
    • An advocate of modern software methodologies
    • My passion is to understand how software ACTUALLY works
  2. About Rookout
    • Rookout is a platform for live-data collection and delivery
    • It slashes debugging time, saving hours and days of writing and deploying logs, then waiting for data to arrive
    • Rookout’s non-breaking breakpoints enable dev teams to collect data on the fly with no restarts, extra coding, or redeployments
  3. Python Debuggers Out There
    • Python standard library pdb
    • PyDev Debugger
    • Others: IDLE and more...
  4. What does sys.settrace do?
    • sys.settrace registers a callback for the Python interpreter
    • This callback is invoked on interpreter events:
      • Function call
      • Line execution
      • Function return
      • Exception raised
  5. Example Callback
    def simple_tracer(frame, event, arg):
        co = frame.f_code
        func_name = co.co_name
        line_no = frame.f_lineno
        print("{e} {f} {l}".format(
            e=event, f=func_name, l=line_no))
        return simple_tracer
  6. Flow example side-by-side
    1) def a():
    2)     return b() * 2
    3)
    4) def b():
    5)     return 'response_from_b'
    6)
    7) sys.settrace(simple_tracer)
    8) a()
    Output:
    ➔ call a 1
    ➔ line a 2
    ➔ call b 4
    ➔ line b 5
    ➔ return b 5
    ➔ return a 2
  7. Example Callback
    def simple_tracer(frame, event, arg):
        co = frame.f_code
        func_name = co.co_name
        line_no = frame.f_lineno
        print("{e} {f} {l}".format(
            e=event, f=func_name, l=line_no))
        return simple_tracer
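  8. What’s happening here?
    • sys.settrace registers the global trace callback
    • The interpreter invokes the global trace on every function call
    • The global trace returns a local trace for the new frame
    • The local trace receives that frame's line, return and exception events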
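The split between the global and the local trace can be sketched as follows. This is a minimal illustration, not code from the talk: the names `global_trace`, `local_trace`, `traced` and `untraced` are made up for the example. The global trace sees only `call` events and decides, per frame, whether to hand back a local trace.

```python
import sys

events = []

def local_trace(frame, event, arg):
    # Receives 'line', 'return' and 'exception' events for a single frame.
    events.append((event, frame.f_code.co_name))
    return local_trace

def global_trace(frame, event, arg):
    # Invoked with a 'call' event each time a new frame is entered.
    if frame.f_code.co_name == "traced":
        return local_trace   # trace this frame line by line
    return None              # no local tracing for any other frame

def untraced():
    return 1

def traced():
    x = untraced()
    return x + 1

sys.settrace(global_trace)
try:
    traced()
finally:
    sys.settrace(None)       # always switch tracing off again

print(events)
```

Returning `None` from the global trace means the interpreter does not deliver line events for that frame, which is why `untraced` never shows up in `events`.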
  9. Looking back
    1) def a():
    2)     return b() * 2
    3)
    4) def b():
    5)     return 'response_from_b'
    6)
    7) sys.settrace(simple_tracer)
    8) a()
    Trace function in effect for each scope:
    • Global: simple_tracer
    • A: simple_tracer
    • B: simple_tracer
  10. How do you handle multithreading?
    • threading.settrace()
      • Must be called as early as possible - or you’ll miss threads
      • Doesn’t cover the underlying `thread` module and other low-level implementations
    • gevent/eventlet
      • Global tracing function will be shared among greenlets
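A small sketch of the `threading.settrace()` point, assuming a hypothetical `worker` function and thread name. The hook only applies to threads started through the `threading` module after the call, which is why it has to be installed early.

```python
import threading

seen = []

def tracer(frame, event, arg):
    # Record which thread entered which function; skip line-level tracing.
    if event == "call":
        seen.append((threading.current_thread().name, frame.f_code.co_name))
    return None

def worker():
    pass

# Install the hook BEFORE the thread starts; threads that are already
# running are not affected.
threading.settrace(tracer)
t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()
threading.settrace(None)   # stop tracing threads created from here on

print(seen)
```

The calling thread is not traced by this hook; it would need its own `sys.settrace()` call.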
  11. Add set_breakpoint
    def set_breakpoint(self, filename, lineno, method):
        self.set_break(filename, lineno)
        try:
            # Callbacks are kept in a list, so append (lists have no .add)
            self.breakpoints[(filename, lineno)].append(method)
        except KeyError:
            self.breakpoints[(filename, lineno)] = [method]
  12. Override user_line
    def user_line(self, frame):
        if not self.break_here(frame):
            return
        # Get filename and lineno from frame
        (filename, lineno, _, _, _) = inspect.getframeinfo(frame)
        methods = self.breakpoints[(filename, lineno)]
        for method in methods:
            method(frame)
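Put together, the two methods above could live in a `bdb.Bdb` subclass along these lines. This is a sketch under assumptions, not the speaker's implementation: the class name `CallbackDebugger`, the `callbacks` dictionary and the temporary target module are invented for the example, and the keys are normalised with `Bdb.canonic` so that registration and lookup agree on the filename.

```python
import bdb
import os
import tempfile

class CallbackDebugger(bdb.Bdb):
    """Hypothetical Bdb subclass: runs callbacks instead of stopping."""

    def __init__(self):
        super().__init__()
        self.callbacks = {}   # (canonical filename, lineno) -> [callables]

    def set_breakpoint(self, filename, lineno, method):
        error = self.set_break(filename, lineno)   # returns a message on failure
        if error:
            raise ValueError(error)
        key = (self.canonic(filename), lineno)
        self.callbacks.setdefault(key, []).append(method)

    def user_line(self, frame):
        if not self.break_here(frame):
            return
        key = (self.canonic(frame.f_code.co_filename), frame.f_lineno)
        for method in self.callbacks.get(key, []):
            method(frame)

SOURCE = (
    "def target():\n"
    "    value = 41\n"
    "    return value + 1\n"
    "\n"
    "result = target()\n"
)

hits = []
namespace = {}
with tempfile.TemporaryDirectory() as tmp:
    # bdb validates breakpoints against a real file, so write one to disk.
    path = os.path.join(tmp, "target_module.py")
    with open(path, "w") as fh:
        fh.write(SOURCE)
    dbg = CallbackDebugger()
    dbg.set_breakpoint(path, 3, lambda frame: hits.append(dict(frame.f_locals)))
    dbg.run(compile(SOURCE, path, "exec"), namespace)

print(hits, namespace["result"])
```

The callback receives the live frame, so it can read `f_locals` at the breakpoint without pausing the program.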
  13. CPython Performance Benchmarks
    def empty_method():
        pass

    def simple_method():
        a = 1
        b = 2
        c = 3
        d = 4
        e = 5
        f = 6
        g = 7
        h = 8
        i = 9
        j = 10

    We’ll test multiple scenarios with each implementation:
    • Test without debugger
    • Test with debugger but no breakpoints
    • Test with a breakpoint in a different file
    • Test with a breakpoint in the same file
  14. Performance Optimizations
    • Avoid local tracing
    • Optimize “call” events
    • Optimize “line” events
  15. Performance Insights
    • Python bdb is naive
    • Performance may be improved by a significant margin
    • Performance becomes gradually harder to improve
    • What happens if we set an empty tracer?
    def pass_tracer(frame, event, arg):
        return pass_tracer
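One way to explore the empty-tracer question is a rough timing comparison like the one below. It is only a sketch: the helper `measure`, the shortened `simple_method` and the iteration count are assumptions for the example, and the numbers depend heavily on the machine and the Python version.

```python
import sys
import timeit

def pass_tracer(frame, event, arg):
    # Does nothing, but keeps local tracing enabled for every frame.
    return pass_tracer

def simple_method():
    a = 1
    b = 2
    c = 3
    return a + b + c

def measure(tracer, number=20000):
    # Note: this replaces any trace function that was already installed.
    sys.settrace(tracer)
    try:
        return timeit.timeit(simple_method, number=number)
    finally:
        sys.settrace(None)

baseline = measure(None)
traced = measure(pass_tracer)
print("no tracer:    {:.4f}s".format(baseline))
print("empty tracer: {:.4f}s".format(traced))
```

Even a tracer that does no work is called for every call, line and return event, so any slowdown measured here is the floor that a `sys.settrace`-based debugger starts from.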
  16. Diving Into CPython
    • Turning on tracing sets up CPython for extra work
      • Some in Python
      • Some in C - maybe_call_line_trace (ceval.c:4384)
  17. Python Bytecode
    Bytecode is a form of instruction set designed for efficient execution by a software interpreter (Wikipedia).
    Python compiles our sources into its bytecode.
  18. Python Bytecode
    def multiply(a, b):
        result = a * b
        return result

    4     0 LOAD_FAST        0 (a)
          3 LOAD_FAST        1 (b)
          6 BINARY_MULTIPLY
          7 STORE_FAST       2 (result)
    5    10 LOAD_FAST        2 (result)
         13 RETURN_VALUE

    '|\x00\x00|\x01\x00\x14}\x02\x00|\x02\x00S'
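A listing like the one above can be reproduced with the standard-library `dis` module. The sketch below is illustrative; opcode names and offsets vary between Python versions, so the output on a current interpreter will not match the slide byte for byte.

```python
import dis

def multiply(a, b):
    result = a * b
    return result

# Human-readable view: one entry per bytecode instruction.
opnames = []
for instruction in dis.get_instructions(multiply):
    opnames.append(instruction.opname)
    print(instruction.offset, instruction.opname, instruction.argrepr)

# Raw view: the bytes object the interpreter actually executes.
raw_bytecode = multiply.__code__.co_code
print(raw_bytecode)
```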
  19. The Rookout Way
    1. Go to the bytecode
    2. Find the line of code
    3. Insert our breakpoint
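The second step, finding where a source line starts in the bytecode, can be sketched with `dis.findlinestarts`. This covers only the read-only lookup; actually inserting a breakpoint into bytecode is not shown, and the helper name `offsets_for_line` is an assumption made for the example.

```python
import dis

def multiply(a, b):
    result = a * b
    return result

def offsets_for_line(func, lineno):
    # findlinestarts yields (bytecode offset, source line number) pairs.
    return [offset
            for offset, line in dis.findlinestarts(func.__code__)
            if line == lineno]

# The `return result` statement sits two lines below the `def` line.
target_line = multiply.__code__.co_firstlineno + 2
offsets = offsets_for_line(multiply, target_line)
print(target_line, offsets)
```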
  20. Tools for Bytecode Manipulation
    1. Python Standard Library (read-only):
      a. inspect
      b. dis
    2. Google:
      a. cloud-debug-python
  21. What do you do once you have stopped at a breakpoint?
    1. frame - see into the interpreter
    2. The standard-library inspect module is well documented and easy to use
    3. Performance is surprisingly awesome, as this is similar to how the interpreter actually works
  22. Getting Frame Information
    def test_frameinfo():
        frame = inspect.currentframe()
        print(inspect.getframeinfo(frame))

    Output:
    Traceback(
        filename='frame_inspect.py',
        lineno=16,
        function='test_frameinfo',
        code_context=[' print(inspect.getframeinfo(frame)) \n'],
        index=0
    )
  23. Getting Local Variables
    def test_vars():
        mystr = "mystr"
        mydict = {"mykey": "myvalue"}
        mylist = [1, 2, 3]
        print(inspect.currentframe().f_locals)

    Output:
    {
        'mydict': { 'mykey':'myvalue' },
        'mystr': 'mystr',
        'mylist': [ 1, 2, 3 ]
    }
  24. Use cases
    1. Show off your Python skills :)
    2. Get source information (logging module)
    3. Walk up the stack
    4. Build a debugger
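Walking up the stack, the third use case, comes down to following `f_back` from the current frame. The sketch below uses invented function names (`caller_names`, `outer`, `inner`) and assumes CPython, where `inspect.currentframe()` returns a real frame object.

```python
import inspect

def caller_names():
    # Collect function names from the current frame up to the outermost one.
    names = []
    frame = inspect.currentframe()
    try:
        while frame is not None:
            names.append(frame.f_code.co_name)
            frame = frame.f_back
    finally:
        del frame   # avoid keeping a reference cycle through the frame
    return names

def inner():
    return caller_names()

def outer():
    return inner()

stack = outer()
print(stack[:3])
```

The same walk, reading `f_code.co_filename` and `f_lineno` on the way up, is roughly what the logging module needs to report where a log call came from.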