DIY Python Debugger

Debuggers are indispensable tools for all Python developers, empowering them to conquer bugs and unravel complex systems. But have you ever wondered how they work? Curious about the implementation of features like conditional breakpoints and single stepping? Join me for a talk in which we create our own debugger with conditional breakpoints, single stepping, and a Python-based debugging shell, and learn a lot about debuggers along the way.

Recording at https://youtu.be/zCWjj98Wvg0?t=735
Blog series at https://mostlynerdless.de/blog/tag/lets-create-a-debugger-together/

Johannes Bechberger

November 17, 2023

Transcript

  1. DIY Python Debugger
    Johannes Bechberger
    mostlynerdless.de

  2. If debugging is the process of removing software bugs, then
    programming must be the process of putting them in.
    — Edsger Dijkstra

  3. ➜ python3 counter.py \
    lines counter.py
    0

  4. ➜ python3 counter.py \
    lines counter.py
    26

  5. Let’s look at the code

  6. def main():
        match cmd := sys.argv[1]:
            case "lines":
                count = count_code_lines(Path(sys.argv[2]))
                print(count)
            case "help":
                print_help()
            case _:
                raise ValueError(f"Unknown operation {cmd}")

  7. def is_code_line(line: str) -> bool:
        return line.isspace() and line.strip().startswith("#")
    def count_code_lines(file: Path) -> int:
        count = 0
        with file.open('r') as f:
            for line in f:
                if is_code_line(line):
                    count += 1
        return count
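
A quick check, not part of the slides, of why the counter can print 0: a line made up only of whitespace cannot also start with "#" once stripped, so the `and` on the slide never holds. The `is_code_line_fixed` variant is only an assumption about the intended behaviour (count lines that are neither blank nor comments); the sample lines are made up for illustration.

```python
def is_code_line(line: str) -> bool:
    # Predicate exactly as shown on the slide.
    return line.isspace() and line.strip().startswith("#")


def is_code_line_fixed(line: str) -> bool:
    # Assumed intent: a code line is neither blank nor a comment line.
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith("#")


# Hypothetical sample lines: code, blank, comment, code with trailing comment.
samples = ["import sys\n", "   \n", "# comment\n", "    x = 1  # trailing\n"]

slide_results = [is_code_line(s) for s in samples]
fixed_results = [is_code_line_fixed(s) for s in samples]

print(slide_results)
print(fixed_results)
```

With the slide's predicate no sample counts as code, which matches the `0` printed on slide 3.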

  8. Debuggers are your friend

  9. Who of you used
    a debugger before?

  10. Debugging
    Testing
    Profiling
    Toolbox

  11. We could use an existing
    debugger...

  12. Debuggers are no rocket
    science, so ...

  13. Does the interpreter
    "know" breakpoints?

  14. def is_code_line(line: str) -> bool:
        dbg()
        return line.isspace() and line.strip().startswith("#")
    def count_code_lines(file: Path) -> int:
        dbg()
        count = 0
        dbg()
        with file.open('r') as f:
            dbg()
            for line in f:
                dbg()
                if is_code_line(line):
                    dbg()
                    count += 1
        dbg()
        return count

  15. def dbg():
        if at_breakpoint(file, line):
            dbg_shell()
    ?
    dbg(); line

  16. sys._getframe

  18. sys._getframe
    CPython implementation detail
    locals(), globals(), sys._getframe(), sys.exc_info(), and
    sys.settrace work in PyPy, but they incur a performance
    penalty that can be huge by disabling the JIT over the
    enclosing JIT scope.

    – https://www.pypy.org/performance.html

  19. main
    count_code_lines
    is_code_line      <- sys._getframe(1)
    dbg               <- sys._getframe(0)

  20. main
    count_code_lines
    is_code_line      <- sys._getframe(1)
    dbg               <- sys._getframe(0)
    Attributes of the sys._getframe(1) frame:
    f_back
    f_lineno            6
    f_globals           ...
    f_locals            {'line': 'import sys\n'}
    f_code.co_filename  counter.py
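
The frame attributes listed on this slide can be explored with a few lines of Python. This is a minimal sketch; `describe_caller`, `inner` and `outer` are hypothetical names chosen for the illustration, and `sys._getframe` is a CPython implementation detail.

```python
import sys
from pathlib import Path


def describe_caller() -> dict:
    # Frame 0 would be describe_caller itself; frame 1 is whoever called it.
    frame = sys._getframe(1)
    return {
        "function": frame.f_code.co_name,
        "file": Path(frame.f_code.co_filename).name,
        "line": frame.f_lineno,
        "locals": dict(frame.f_locals),
        # f_back links to the frame one level further up the call stack.
        "caller_of_caller": frame.f_back.f_code.co_name if frame.f_back else None,
    }


def inner(line: str) -> dict:
    return describe_caller()


def outer() -> dict:
    return inner("import sys\n")


info = outer()
print(info["function"], info["locals"], info["caller_of_caller"])
```

The returned dictionary describes `inner`, the direct caller, and `f_back` leads from there to `outer`.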

  21. def dbg():
        frame = sys._getframe(1)
        line = frame.f_lineno
        file = Path(frame.f_code.co_filename).stem
        if at_breakpoint(file, line):
            dbg_shell()
    dbg(); line

  22. def dbg():
        frame = sys._getframe(1)
        line = frame.f_lineno
        file = Path(frame.f_code.co_filename).stem
        if at_breakpoint(file, line):
            dbg_shell(frame)
    dbg(); line

  24. def dbg():
        frame = sys._getframe(1)
        line = frame.f_lineno
        file = Path(frame.f_code.co_filename).stem
        if at_breakpoint(file, line):
            dbg_shell(frame)
    def at_breakpoint(file: str, line: int) -> bool:
        return file == "counter" and line == 6
    dbg(); line
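
A runnable variant of the manual `dbg()` approach might look as follows. It is a sketch under a few assumptions: `dbg_shell` only records where execution stopped instead of opening an interactive shell, and the breakpoint is looked up in a hypothetical `breakpoints` set rather than being hard-coded to `counter` line 6, so the snippet works regardless of the file it is loaded from.

```python
import sys
from pathlib import Path
from types import FrameType

hits = []            # (function name, locals) for every stop
breakpoints = set()  # hypothetical registry of (file stem, line) pairs


def dbg_shell(frame: FrameType) -> None:
    # Stand-in for an interactive shell: just record the stop.
    hits.append((frame.f_code.co_name, dict(frame.f_locals)))


def at_breakpoint(file: str, line: int) -> bool:
    return (file, line) in breakpoints


def dbg() -> None:
    frame = sys._getframe(1)  # the frame that called dbg()
    line = frame.f_lineno
    file = Path(frame.f_code.co_filename).stem
    if at_breakpoint(file, line):
        dbg_shell(frame)


def is_code_line(line: str) -> bool:
    dbg(); return line.isspace() and line.strip().startswith("#")


# Break on the body line of is_code_line (one line below its def).
code = is_code_line.__code__
breakpoints.add((Path(code.co_filename).stem, code.co_firstlineno + 1))

is_code_line("import sys\n")
print(hits)
```

An interactive `dbg_shell` could instead hand `frame.f_globals` and `frame.f_locals` to something like `code.interact`, which is left out here so the snippet runs unattended.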

  25. Demo: counter0.py

  26. But how do we
    automate this?

  27. sys.settrace

  28. Event = Literal['call', 'line', 'return', 'exception', 'opcode']
    def handler(frame: FrameType, event: Event, arg):
        pass
    sys.settrace(handler)
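
To see the 'call' events in practice, a small self-contained sketch can install a global trace function and record which functions are entered. The `calls` list and the in-memory list of lines (instead of a file) are assumptions made to keep the example short.

```python
import sys
from types import FrameType
from typing import Any

calls = []  # names of all functions entered while tracing was active


def handler(frame: FrameType, event: str, arg: Any):
    # The global trace function is invoked with 'call' for every new frame.
    if event == "call":
        calls.append(frame.f_code.co_name)
    return None  # no local trace function, so no 'line' events for this frame


def is_code_line(line: str) -> bool:
    return line.isspace() and line.strip().startswith("#")


def count_code_lines(lines: list) -> int:
    count = 0
    for line in lines:
        if is_code_line(line):
            count += 1
    return count


previous = sys.gettrace()
sys.settrace(handler)
try:
    count_code_lines(["import sys\n", "\n"])
finally:
    sys.settrace(previous)  # always restore the previous trace function

print(calls)
```

Restoring the previous trace function in `finally` keeps the sketch from interfering with a debugger or coverage tool that may already be active.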

  29. def is_code_line(line: str) -> bool:
        return line.isspace() and line.strip().startswith("#")
    def count_code_lines(file: Path) -> int:
        count = 0
        with file.open('r') as f:
            for line in f:
                if is_code_line(line):
                    count += 1
        return count
    handler(frame, 'call', None)
    handler(frame, 'call', None)

  30. Demo settrace1.py
    event: call main
    event: call count_code_lines
    event: call is_code_line
    event: call is_code_line
    event: call is_code_line
    event: call is_code_line
    ...

  31. def inner_handler(frame: FrameType, event: str, arg):
        pass
    def handler(frame: FrameType, event: Event, arg) \
            -> Optional[Callable[[FrameType, Event, Any], None]]:
        return inner_handler
    sys.settrace(handler)

  32. def inner_handler(frame: FrameType, event: Event, arg):
        pass
    def handler(frame: FrameType, event: Event, arg) \
            -> Optional[Callable[[FrameType, Event, Any], None]]:
        return inner_handler
    sys.settrace(handler)
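
The split between the global handler and the per-frame inner handler can be observed with a short sketch. Line numbers are recorded relative to the `def` line so the output does not depend on where the snippet lives; the two-line body of `is_code_line` is an assumption made so that more than one 'line' event shows up.

```python
import sys
from types import FrameType
from typing import Any

calls = []         # functions seen by the global handler
inner_events = []  # (event, line offset from the def line) seen per frame


def inner_handler(frame: FrameType, event: str, arg: Any):
    offset = frame.f_lineno - frame.f_code.co_firstlineno
    inner_events.append((event, offset))
    # The documented contract: return the local trace function to keep tracing.
    return inner_handler


def handler(frame: FrameType, event: str, arg: Any):
    calls.append(frame.f_code.co_name)
    return inner_handler  # receives 'line' and 'return' events for this frame


def is_code_line(line: str) -> bool:
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith("#")


previous = sys.gettrace()
sys.settrace(handler)
try:
    is_code_line("import sys\n")
finally:
    sys.settrace(previous)

print(calls)
print(inner_events)
```

The global handler sees only the call, while the inner handler sees one 'line' event per executed line followed by the 'return' event.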

  33. Demo settrace2.py
    event: call is_code_line
    inner: line 6
    inner: return 6
    inner: line 12
    inner: line 13
    ...

  34. def dbg():
        frame = sys._getframe(1)
        line = frame.f_lineno
        file = Path(frame.f_code.co_filename).stem
        if at_breakpoint(file, line):
            dbg_shell(frame)
    def at_breakpoint(file: str, line: int) -> bool:
        return file == "counter" and line == 6
    dbg(); line

  35. def inner_handler(frame: FrameType, event: str, arg):
        if event != 'line':
            return
        line = frame.f_lineno
        file = Path(frame.f_code.co_filename).stem
        if at_breakpoint(file, line):
            dbg_shell(frame)
    def at_breakpoint(file: str, line: int) -> bool:
        return file == "counter" and line == 6
    dbg(); line

  37. Demo settrace3.py
    event: call main
    event: call count_code_lines
    event: call is_code_line
    in break point at line 6
    >>> line
    'import sys\n'

  38. def inner_handler(frame: FrameType, event: str, arg):
        if event != 'line':
            return
        line = frame.f_lineno
        file = Path(frame.f_code.co_filename).stem
        if at_breakpoint(file, line):
            dbg_shell(frame)
    def at_breakpoint(file: str, line: int) -> bool:
        return file == "counter" and line == 6
    dbg(); line
    make configurable: first_line or Breakpoint(file, line) in current_breakpoints
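
The configurable check hinted at on this slide could be sketched as below. Only `Breakpoint`, `current_breakpoints`, `first_line` and `br` appear in the slides; the `DebuggerState` container and the rule that `first_line` is cleared after the first stop are assumptions for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Breakpoint:
    file: str
    line: int


@dataclass
class DebuggerState:
    # Hypothetical container for the debugger's mutable state.
    first_line: bool = True  # stop once at the start so breakpoints can be set
    current_breakpoints: set = field(default_factory=set)


state = DebuggerState()


def br(file: str, line: int) -> None:
    # Shell helper: register a breakpoint, as in br('counter', 6).
    state.current_breakpoints.add(Breakpoint(file, line))


def at_breakpoint(file: str, line: int) -> bool:
    if state.first_line:
        state.first_line = False  # assumed: the initial stop happens only once
        return True
    return Breakpoint(file, line) in state.current_breakpoints


results = [at_breakpoint("counter", 23)]     # initial stop on the first line
br("counter", 6)
results.append(at_breakpoint("counter", 7))  # no breakpoint registered here
results.append(at_breakpoint("counter", 6))  # registered via br()
print(results)
```

Because `Breakpoint` is a frozen dataclass it is hashable, so membership in the set is a constant-time lookup on every 'line' event.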

  39. Demo settrace4.py
    event: call main
    in break point at line 23
    >>> br('counter', 6)
    >>>
    event: call count_code_lines
    event: call is_code_line
    in break point at line 6
    >>> line
    'import sys\n'

  40. Single Stepping

  41. step out
    before: main:25 > count_code_lines:13 > is_code_line:6
    after:  main:25 > count_code_lines:12
    Just extend at_breakpoint

  42. step
    before: main:25 > count_code_lines:12
    after:  main:25 > count_code_lines:13
    Just extend at_breakpoint

  43. step into
    before: main:25 > count_code_lines:12
    after:  main:25 > count_code_lines:13 > is_code_line:6
    Just extend at_breakpoint
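
One possible way to "extend at_breakpoint" for stepping is to compare call-stack depths. The sketch below is an assumption about how this could be done, not the talk's implementation: `StepMode` and `Stepper` are hypothetical names, and depths are passed in as plain integers (main = 1, count_code_lines = 2, is_code_line = 3) so the logic can be tried without a running tracer.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StepMode(Enum):
    INTO = "step into"  # stop at the next line, wherever it is
    OVER = "step"       # stop at the next line in the same or an outer frame
    OUT = "step out"    # stop at the next line in an outer frame


@dataclass
class Stepper:
    mode: Optional[StepMode] = None
    origin_depth: int = 0

    def request(self, mode: StepMode, current_depth: int) -> None:
        # Called from the debugging shell while stopped at current_depth.
        self.mode = mode
        self.origin_depth = current_depth

    def should_stop(self, current_depth: int) -> bool:
        # Called for every 'line' event with the depth of the reporting frame.
        if self.mode is None:
            return False
        if self.mode is StepMode.INTO:
            stop = True
        elif self.mode is StepMode.OVER:
            stop = current_depth <= self.origin_depth
        else:
            stop = current_depth < self.origin_depth
        if stop:
            self.mode = None  # a step request is consumed by the first stop
        return stop


stepper = Stepper()

stepper.request(StepMode.OUT, 3)   # stopped in is_code_line
step_out = [stepper.should_stop(3), stepper.should_stop(2)]

stepper.request(StepMode.OVER, 2)  # stopped in count_code_lines
step_over = [stepper.should_stop(3), stepper.should_stop(2)]

stepper.request(StepMode.INTO, 2)  # stopped in count_code_lines
step_into = [stepper.should_stop(3)]

print(step_out, step_over, step_into)
```

In a tracer, the depth could be obtained by following `frame.f_back` until it is `None`, and `at_breakpoint` would return true if either `should_stop` fires or a regular breakpoint matches.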

  44. Do we get line events
    for every function?

  45. def is_code_line(line: str) -> bool:
        return line.isspace() and line.strip().startswith("#")
    def count_code_lines(file: Path) -> int:
        count = 0
        with file.open('r') as f:
            for line in f:
                if is_code_line(line):
                    count += 1
        return count
    handler(frame, …, None)
    add breakpoint
    handler(frame, 'call', None)
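
Whether a frame receives 'line' events depends on what the global handler returns for its 'call' event. The sketch below illustrates one reading of this slide: the handler returns the inner handler only for functions that contain a breakpoint, so all other functions run without per-line callbacks. Keying breakpoints by function name in `functions_with_breakpoints` is a simplification assumed for this example.

```python
import sys
from types import FrameType
from typing import Any

functions_with_breakpoints = set()  # hypothetical: names of functions to trace
line_events = []                    # (function name, line offset from def)


def inner_handler(frame: FrameType, event: str, arg: Any):
    if event == "line":
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        line_events.append((frame.f_code.co_name, offset))
    return inner_handler


def handler(frame: FrameType, event: str, arg: Any):
    # Returning None means this frame never receives 'line' events.
    if frame.f_code.co_name in functions_with_breakpoints:
        return inner_handler
    return None


def is_code_line(line: str) -> bool:
    return line.isspace() and line.strip().startswith("#")


def count_code_lines(lines: list) -> int:
    count = 0
    for line in lines:
        if is_code_line(line):
            count += 1
    return count


functions_with_breakpoints.add("is_code_line")

previous = sys.gettrace()
sys.settrace(handler)
try:
    count_code_lines(["import sys\n"])
finally:
    sys.settrace(previous)

print(line_events)
```

The trade-off is that a breakpoint added later in a function that is already running is not picked up by this frame-by-frame decision; handling that case would need extra work, such as assigning `frame.f_trace` on the frames that are already on the stack.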

  46. Performance?

  47. Any ideas to improve it?

  48. Make at_breakpoint faster

  49. Add a new API
    Python 3.12
    and PEP 669

  50. # some aliases and constants
    mon = sys.monitoring
    E = mon.events
    TOOL_ID = mon.DEBUGGER_ID
    # register the tool
    mon.use_tool_id(TOOL_ID, "dbg")

  51. # some aliases and constants
    mon = sys.monitoring
    E = mon.events
    TOOL_ID = mon.DEBUGGER_ID
    # register the tool
    mon.use_tool_id(TOOL_ID, "dbg")
    # callbacks for the events we are interested in
    def start_handler(code: CodeType, offset: int):
        pass
    def line_handler(code: CodeType, line: int):
        pass
    # register the callbacks
    mon.register_callback(TOOL_ID, E.LINE, line_handler)
    mon.register_callback(TOOL_ID, E.PY_START, start_handler)

  52. # some aliases and constants
    mon = sys.monitoring
    E = mon.events
    TOOL_ID = mon.DEBUGGER_ID
    # register the tool
    mon.use_tool_id(TOOL_ID, "dbg")
    # register callbacks for the events we are interested in
    mon.register_callback(TOOL_ID, E.LINE, line_handler)
    mon.register_callback(TOOL_ID, E.PY_START, start_handler)
    # enable PY_START event globally
    mon.set_events(TOOL_ID, E.PY_START)
    # Later
    mon.set_local_events(TOOL_ID, code, E.LINE)
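
Put together, the calls on this slide can be tried in a small script. This is a sketch that assumes Python 3.12 or newer (it does nothing on older versions) and that the debugger tool id is not already taken by another tool; enabling LINE events only for `is_code_line` from inside the start handler, and the `hits` list, are choices made for the illustration.

```python
import sys
from types import CodeType

hits = []  # recorded monitoring events


def is_code_line(line: str) -> bool:
    return line.isspace() and line.strip().startswith("#")


def count_code_lines(lines: list) -> int:
    count = 0
    for line in lines:
        if is_code_line(line):
            count += 1
    return count


if hasattr(sys, "monitoring"):  # sys.monitoring exists since Python 3.12
    mon = sys.monitoring
    E = mon.events
    TOOL_ID = mon.DEBUGGER_ID

    def line_handler(code: CodeType, line: int):
        hits.append(("line", code.co_name, line - code.co_firstlineno))

    def start_handler(code: CodeType, offset: int):
        hits.append(("start", code.co_name))
        if code.co_name == "is_code_line":
            # Enable LINE events only for the code object we care about.
            mon.set_local_events(TOOL_ID, code, E.LINE)

    mon.use_tool_id(TOOL_ID, "dbg")
    try:
        mon.register_callback(TOOL_ID, E.LINE, line_handler)
        mon.register_callback(TOOL_ID, E.PY_START, start_handler)
        mon.set_events(TOOL_ID, E.PY_START)  # PY_START globally
        count_code_lines(["import sys\n", "\n"])
    finally:
        # Undo everything so later code runs unmonitored.
        mon.set_events(TOOL_ID, E.NO_EVENTS)
        mon.set_local_events(TOOL_ID, is_code_line.__code__, E.NO_EVENTS)
        mon.register_callback(TOOL_ID, E.LINE, None)
        mon.register_callback(TOOL_ID, E.PY_START, None)
        mon.free_tool_id(TOOL_ID)

print(hits)
```

Unlike the `sys.settrace` variant, `count_code_lines` never triggers a line callback here, because LINE events are enabled only on the code object of `is_code_line`.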

  53. Globally
    set events
    Locally
    set events
    Enabled events per function

  54. What's fast?

  55. register_callback
    get_tool
    set_local_events
    use_tool_id
    set_events
    Fast
    Rather fast
    Slow
    The earlier the faster

  56. Demo sys_mon1.py
    start main
    start count_code_lines
    start is_code_line
    hit line 6
    start is_code_line
    hit line 6
    ...

  57. Putting it all together

  58. Debugging
    Testing
    Profiling
    Toolbox

  59. @parttimen3rd on Twitter
    parttimenerd on GitHub
    mostlynerdless.de
    @SweetSapMachine
    sapmachine.io
