2017 - Python Debugging with PUDB

PyBay
August 21, 2017

Description

When tracking down a tricky bug, tools are everything. I'll demonstrate three useful debugging tools and we'll see how we can use them to find bugs, whether they are in networking, logic, or performance.

Abstract

Stop using print statements forever! You'll learn how to use these tools:

- PUDB: an interactive, ncurses debugger
- Charles: a web debugging proxy
- cProfile: Python's built-in profiling library
- RunSnakeRun and SnakeViz: tools for visualizing profile output

I'll also talk about the process of debugging and profiling, common error patterns, and how to use your time most efficiently.

Bio

Chris Beacham (aka Lady Red) is a Python developer and Senior Software Engineer at Hipmunk. She also does performance, sewing, sculpture and painting in her free time, and is a frequent sight at the Noisebridge Hackerspace, where this talk was first delivered.

https://www.youtube.com/watch?v=mbdYATn7h6Q

Transcript

  1. Python Debugging
    with PuDB, Charles, and cProfile
    Christopher Beacham / Lady Red,
    Senior Engineer @ Hipmunk
    Hipmunk: [email protected]
    Personal: [email protected]

  2. I write code, and this is a little bit embarrassing, but…

  3. I write code, and this is a little bit embarrassing, but…
    My Code Has Bugs

  5. The faster we can find bugs,
    the faster we can fix them

  6. The faster we can find bugs,
    the faster we can fix them
    It’s hard to see what a program is doing!

  7. Debugging tools can make the invisible visible.

  8. Tools we’ll cover:
    PuDB - Interactive visual debugger
    Charles Proxy - Web debugging proxy
    cProfile + others - Python profilers and visualization tools

  9. Stop using Print Statements
    One is never enough
    You’ll never find a problem you aren’t specifically looking for
    You have to remember to delete them

  10. Debugging Process
    Systematically check your assumptions in a binary search
    First Test

  11. Debugging Process
    Systematically check your assumptions in a binary search
    First Test
    Second
    Test

  12. Debugging Process
    Systematically check your assumptions in a binary search
    First Test
    Second
    Test
    Third Test

  13. Debugging Process
    Systematically check your assumptions in a binary search
    First Test
    Second
    Test
    Third Test
    FOUND
    THE
    BUG
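
The binary search on this slide can be sketched in code. This is only an illustration, not something from the talk: `first_failing_step` and the list of checks are hypothetical names, and the sketch assumes the checks are ordered along the program's flow and that once one assumption breaks, every later check fails too.

```python
from typing import Callable, Sequence


def first_failing_step(checks: Sequence[Callable[[], bool]]) -> int:
    """Return the index of the first check that fails, or -1 if all pass.

    Assumes the checks are ordered and monotonic: every check before the
    bug passes, and every check at or after the bug fails.
    """
    lo, hi = 0, len(checks)      # the first failure, if any, lies in [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if checks[mid]():
            lo = mid + 1         # assumptions up to mid hold; look later
        else:
            hi = mid             # something is already wrong at mid
    return lo if lo < len(checks) else -1


# Hypothetical example: 16 checkpoints, with the bug introduced at step 11.
calls = []


def make_check(index: int, broken_from: int) -> Callable[[], bool]:
    def check() -> bool:
        calls.append(index)      # record which checkpoints were tested
        return index < broken_from
    return check


checks = [make_check(i, broken_from=11) for i in range(16)]
bug_index = first_failing_step(checks)
print(bug_index, calls)
```

Each test halves the region where the bug can hide, so a handful of well-placed checks replaces an inspection of every step.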

  14. How many of you have used a debugger before?

  15. PuDB
    PuDB is my favorite debugger.
    It’s very similar to pdb and ipdb, but it’s VISUAL.
    It shows you everything in scope, your code, and a terminal.

  26. Let’s USE it!

  27. We’re going to use PuDB to debug a fortune web server written in Python (Flask)

  28. PuDB: Extra Credit
    - You can use PuDB for unit tests:
      - in the test,
      - in the code under test,
      - and you can step smoothly between them
    - If you use nose for testing, you can drop into PuDB on failures with the nose-pudb package
    - PuDB will catch exceptions, giving you a chance to inspect the entire scope.
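
One way to get the "inspect the scope on an exception" behaviour in your own code is a small wrapper that opens the debugger when a function raises. This is a sketch, not the talk's method: `debug_on_exception` and its `hook` parameter are hypothetical, and it assumes `pudb.post_mortem()` called without arguments inside an `except` block picks up the exception being handled.

```python
import functools
from typing import Any, Callable, Optional


def _pudb_post_mortem() -> None:
    """Open PuDB on the exception currently being handled."""
    import pudb  # imported lazily so this module loads without PuDB installed
    pudb.post_mortem()


def debug_on_exception(
    func: Callable[..., Any],
    hook: Optional[Callable[[], None]] = None,
) -> Callable[..., Any]:
    """Wrap func so that an uncaught exception triggers a debugger hook.

    hook defaults to PuDB's post-mortem mode; passing another callable
    (hypothetical parameter) makes the wrapper usable without a terminal.
    """
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return func(*args, **kwargs)
        except Exception:
            (hook or _pudb_post_mortem)()  # inspect the failing frame first
            raise                          # then let the failure propagate
    return wrapper
```

Wrapping a test function or a request handler this way drops you into the debugger at the point of failure instead of leaving you with only a traceback.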

  30. How many of you have used one of these tools?
    (Dev tools)
    nettop
    curl

  31. What is Charles?
    ● Charles is a web debugging proxy
    ● It shows the content of, and statistics about, any HTTP/HTTPS traffic that passes through it.
    ● You can repeat, modify, and intercept requests: all the forms of beautiful meddling.

  33. Breaking SSL
    Charles is the Man in the Middle
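
For Charles to sit in the middle, the client has to send its traffic through the proxy and, for HTTPS, trust the proxy's root certificate. A minimal sketch using only the standard library follows. It assumes Charles is listening on its default local port 8888; `build_proxied_opener` and the `ca_file` path are hypothetical names for illustration.

```python
import ssl
import urllib.request
from typing import List, Optional

# Assumption: Charles is running locally on its default port.
CHARLES_PROXY = "http://127.0.0.1:8888"


def build_proxied_opener(
    proxy_url: str = CHARLES_PROXY,
    ca_file: Optional[str] = None,
) -> urllib.request.OpenerDirector:
    """Return an opener that routes HTTP and HTTPS requests through a proxy.

    ca_file is a hypothetical path to the proxy's exported root certificate
    in PEM format. Without it, HTTPS requests through an intercepting proxy
    fail certificate verification, which is the correct default.
    """
    handlers: List[urllib.request.BaseHandler] = [
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    ]
    if ca_file is not None:
        # Trust the proxy's certificate authority for this opener only.
        context = ssl.create_default_context(cafile=ca_file)
        handlers.append(urllib.request.HTTPSHandler(context=context))
    return urllib.request.build_opener(*handlers)


opener = build_proxied_opener()
```

Calling `opener.open(url)` with the proxy running should then make each request show up in the Charles session. Scoping the trusted certificate to one opener keeps the man-in-the-middle setup out of the rest of the program.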

  34. Fortune - now with distributed architecture?

  43. (make slide showing editing a request)

  53. Let’s USE it!

  54. Charles - Extra Credit
    ● It can publish a request/response to GitHub as a gist
    ● You can view XML/JSON/protocol buffers in a structured way
    ● You can get the curl command for a given request
    ● Use breakpoints or the Rewrite tool to modify any request in real time

  55. Charles extra credit - continued
    ● You can throttle to mimic low-speed connections
    ● You can blacklist hosts to block connections to them
    ● It can serve local files from a local folder in response to requests to a server
    ● You can use Charles’s “Mirror” tool to record a mirror of a site to disk, and then use Map Local to serve it back up
    ● … there’s a ton more

  56. Charles is nagware
    After you install it, it’ll keep bugging you until you buy a license.
    I think it’s just one developer building it, so yeah, if you find it useful, it’s a good cause to buy a license and support his work.
    (I’m not being compensated for this talk - I just like this software)

  58. How many of you have had code that was slow?

  59. How many of you have had code that was slow?
    How many have used profiling to identify why?

  60. Is anyone not sure what a profiler is or how it works?

  61. Profiling
    Profiling is a different hippopotamus from debugging.

  62. Profiling
    - Interpreting a profiler output is almost an art.

  63. Profiling
    - Interpreting a profiler output is almost an art.
    - You can’t recognize wrong unless you know what right looks like.

  64. Profiling
    - Interpreting a profiler output is almost an art.
    - You can’t recognize wrong unless you know what right looks like.
    - A lot of profiling is reading between the lines

  65. Profiling
    - Always profile before making performance-related improvements.

  66. Profiling
    - Always profile before making performance-related improvements.
    - Profiling before making performance-related improvements can keep
    you from wasting your time

  67. Profiling
    - Always profile before making performance-related improvements.
    - Profiling before making performance-related improvements can keep you
    from wasting your time
    - Your assumptions about what is taking the most time are often wrong.

  68. Profiling
    - Always profile before making performance-related improvements.
    - Profiling before making performance-related improvements can keep you
    from wasting your time
    - Your assumptions about what is taking the most time are often wrong.
    - Speeding up code that is already fast is useless!

  69. Profiling
    - Always profile before making performance-related improvements.
    - Profiling before making performance-related improvements can keep you
    from wasting your time
    - Your assumptions about what is taking the most time are often wrong.
    - Speeding up code that is already fast is useless!
    - You must profile with realistic load / realistic data

  70. The Sad State of Python Profilers
    I wanted to tell you that there was an awesome profiler tool you should use...

  71. The Sad State of Python Profilers
    I wanted to tell you that there was an awesome profiler tool you should use...

  72. cProfile!
    Python has 3 built-in profilers, but cProfile is the most commonly used one.
    It’s in the standard library.
    python -m cProfile [-o output_file] [-s sort_order] myscript.py
    Two output formats - binary and human readable.
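
The same two output formats are available from inside a program. The sketch below saves the binary stats to a file and then renders a human-readable report sorted by cumulative time. `busy_work` and the `myscript.prof` file name are hypothetical stand-ins for the code being profiled.

```python
import cProfile
import io
import os
import pstats
import tempfile


def busy_work(n: int) -> int:
    """Hypothetical workload standing in for the script being profiled."""
    return sum(i * i for i in range(n))


def profile_to_file(path: str) -> None:
    """Run the workload under cProfile and save the binary stats to path."""
    profiler = cProfile.Profile()
    profiler.enable()
    busy_work(50_000)
    profiler.disable()
    profiler.dump_stats(path)  # same format as `python -m cProfile -o path`


def report(path: str, limit: int = 5) -> str:
    """Load saved stats and return the top entries by cumulative time."""
    stream = io.StringIO()
    stats = pstats.Stats(path, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    stats_path = os.path.join(tmp, "myscript.prof")
    profile_to_file(stats_path)
    text = report(stats_path)

print(text)
```

Keeping the binary file around is useful because the same data can be re-sorted or handed to a visualizer later without re-running the program.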

  73. “Human readable” output
    I find it’s best to sort by cumtime (cumulative time)
    Difficult to distinguish which times are additive, and which times are nested.
    Gets confusing for anything more than the simplest program

        47371645 function calls in 20.013 seconds

        Ordered by: cumulative time

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                1    0.001    0.001   20.013   20.013 simulation.py:3(<module>)
                2    9.524    4.762   20.006   10.003 simulation.py:24(run_simulation)
          2441124    1.842    0.000    2.324    0.000 queue.py:82(enqueue)
          2297290    1.804    0.000    2.320    0.000 queue.py:39(enqueue)
          2294714    1.369    0.000    1.369    0.000 queue.py:52(dequeue)
         18940222    1.260    0.000    1.260    0.000 {method 'random' of '_random.Random' objects}
          2438880    1.160    0.000    1.160    0.000 queue.py:95(dequeue)
          4738410    0.750    0.000    0.750    0.000 {max}
          2297290    0.517    0.000    0.517    0.000 queue.py:28(__init__)
          2441124    0.482    0.000    0.482    0.000 queue.py:69(__init__)
          2439596    0.347    0.000    0.347    0.000 queue.py:105(is_empty)
          2295021    0.342    0.000    0.342    0.000 queue.py:62(is_empty)
          2441122    0.300    0.000    0.300    0.000 queue.py:102(size)
          2297288    0.282    0.000    0.282    0.000 queue.py:59(size)
             9471    0.028    0.000    0.028    0.000 {built-in method now}
                1    0.001    0.001    0.004    0.004 random.py:40(<module>)
                1    0.002    0.002    0.002    0.002 hashlib.py:56(<module>)
                1    0.001    0.001    0.002    0.002 queue.py:1(<module>)
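
The additive-versus-nested confusion comes from the two time columns: tottime counts only time spent in a function's own body, while cumtime also includes everything it calls. A small sketch with hypothetical `outer` and `inner` functions makes the relationship visible. It reads the `stats` dictionary of `pstats.Stats`, which is an implementation detail rather than a documented interface, so treat that part as an assumption.

```python
import cProfile
import pstats


def inner(n: int) -> int:
    """Hypothetical leaf function that does the actual work."""
    return sum(range(n))


def outer() -> int:
    """Hypothetical caller: little work of its own, many calls to inner."""
    total = 0
    for _ in range(200):
        total += inner(10_000)
    return total


profiler = cProfile.Profile()
profiler.runcall(outer)
stats = pstats.Stats(profiler)

# stats.stats maps (filename, lineno, function) to
# (primitive calls, total calls, tottime, cumtime, callers).
timings = {key[2]: value for key, value in stats.stats.items()}
_, outer_calls, outer_tottime, outer_cumtime, _ = timings["outer"]
_, inner_calls, inner_tottime, inner_cumtime, _ = timings["inner"]

print(f"outer: calls={outer_calls} tottime={outer_tottime:.4f} cumtime={outer_cumtime:.4f}")
print(f"inner: calls={inner_calls} tottime={inner_tottime:.4f} cumtime={inner_cumtime:.4f}")
```

Here the cumtime of `outer` already contains the cumtime of `inner`, so adding the two cumtime values would double-count. Only the tottime column can be summed across rows.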

  74. 47371645 function calls in 20.013 seconds
    Ordered by: cumulative time

           ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                1    0.001    0.001   20.013   20.013 simulation.py:3(<module>)
                2    9.524    4.762   20.006   10.003 simulation.py:24(run_simulation)
          2441124    1.842    0.000    2.324    0.000 queue.py:82(enqueue)
          2297290    1.804    0.000    2.320    0.000 queue.py:39(enqueue)
          2294714    1.369    0.000    1.369    0.000 queue.py:52(dequeue)
         18940222    1.260    0.000    1.260    0.000 {method 'random' of '_random.Random' objects}
          2438880    1.160    0.000    1.160    0.000 queue.py:95(dequeue)
          4738410    0.750    0.000    0.750    0.000 {max}

  75. cProfile Visualizers
    cProfile can make binary output that can
    be read by several tools.

  76. pyprof2calltree
    + qcachegrind

  78. while True:
          now = datetime.now()
          ...

  80. After - datetime.now is no more!
    Before:
        while True:
            now = datetime.now()
            ...
    After:
        while True:
            if count % 1000 == 0:
                now = datetime.now()
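
The slide elides how `count` is maintained, so here is a runnable sketch of the same idea under stated assumptions: a bounded loop instead of `while True`, a hypothetical `run_loop` function, and an injectable `clock` so the number of clock reads can be counted.

```python
from datetime import datetime
from typing import Callable


def run_loop(
    iterations: int,
    refresh_every: int = 1000,
    clock: Callable[[], datetime] = datetime.now,
) -> int:
    """Run a loop that re-reads the clock only every refresh_every iterations.

    Returns how many times the clock was read, to show the saving.
    """
    now = clock()            # initial reading so `now` is always defined
    clock_reads = 1
    for count in range(1, iterations + 1):
        if count % refresh_every == 0:
            now = clock()    # refresh the cached timestamp
            clock_reads += 1
        _ = now              # placeholder for the real work that uses `now`
    return clock_reads


print(run_loop(10_000))
```

The trade-off is staleness: `now` can lag the real time by up to `refresh_every` iterations, which is acceptable only when the loop body does not need a precise timestamp.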

  81. Simulation complete, ran 287,768.6 operations per second
    Simulation complete, ran 419,200.0 operations per second

  84. PuDB
    Install: pip install pudb
    Invoke: import pudb; pudb.set_trace()
    PUDB https://documen.tician.de/pudb/
    Nose-PUDB (for nose testing) - https://pypi.python.org/pypi/nose-pudb

  85. Charles
    https://www.charlesproxy.com

  86. Profiling
    cProfile https://docs.python.org/3/library/profile.html
    Using cProfile w/ pyprof2calltree and kcachegrind/qcachegrind:
    https://julien.danjou.info/blog/2015/guide-to-python-profiling-cprofile-concrete-case-carbonara

  87. Happy Hunting!!
