Who am I? • Yusuke Miyazaki / @ymyzk • Site Reliability Engineer @ Indeed • Personal projects • wsgi_lineprof: line-by-line profiler • mypy Playground: mypy on web browsers • tox-gh-actions: smooth integration of tox with GitHub Actions • See ymyzk.com for more details
Agenda • Introduction • What is wsgi_lineprof? • Features of wsgi_lineprof • How does wsgi_lineprof work? • Technologies for implementing a profiler • Conclusion
How to Use
Add a few lines to the existing WSGI application:

    # Your existing WSGI application
    app: WSGIApplication

    # Enable profiler
    from wsgi_lineprof.middleware import LineProfilerMiddleware
    app = LineProfilerMiddleware(app)
Features of wsgi_lineprof • Line-by-line profiling • Finding a bottleneck quickly • WSGI middleware • Integration with many Python web applications • Easily pluggable • All configuration for profiling in one place
Predecessors wsgi_lineprof is inspired by its predecessors • Line-by-line profiler for Python • rkern/line_profiler • Line-by-line profiler for Ruby web applications • kainosnoema/rack-lineprof
Architecture [Diagram: Python, Web Server, Web App, Profiler, Timer] 1. Capture request/response 2. Get information on execution 3. Measure execution time per line precisely
WSGI Middleware wsgi_lineprof is implemented as a WSGI middleware • What is WSGI middleware? • Wrapper for WSGI app to add features • Why WSGI middleware? • Keep track of HTTP requests • Support multiple web application frameworks • Easily enable/disable a middleware
Implementing WSGI Middleware • A WSGI application is a callable (e.g., a function) • WSGI middleware can be implemented as a higher-order function

    WSGIApplication = Callable[
        [WSGIEnvironment, StartResponse], Iterable[bytes]]
    WSGIMiddleware = Callable[
        [WSGIApplication], WSGIApplication]
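The higher-order-function idea can be sketched in a few lines. This is a minimal illustration, not code from wsgi_lineprof: `hello_app`, `add_header_middleware`, and `call_app` are hypothetical names, and the type aliases are simplified stand-ins for the ones on the slide.

```python
from typing import Any, Callable, Dict, Iterable

# Simplified stand-ins for the aliases on the slide (assumption).
WSGIEnvironment = Dict[str, Any]
StartResponse = Callable[..., Any]
WSGIApplication = Callable[[WSGIEnvironment, StartResponse], Iterable[bytes]]
WSGIMiddleware = Callable[[WSGIApplication], WSGIApplication]


def hello_app(environ, start_response):
    """A tiny WSGI application: a plain callable."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world!\n"]


def add_header_middleware(app: WSGIApplication) -> WSGIApplication:
    """Hypothetical middleware: takes an app, returns a wrapped app."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Add one extra header, then delegate to the real start_response.
            extra = [("X-Wrapped", "1")]
            return start_response(status, list(headers) + extra, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def call_app(app: WSGIApplication):
    """Invoke a WSGI app without a web server and capture its output."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Because the wrapped result is itself a WSGI application, middleware can be stacked or removed without touching the application code.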
Tracing Line-by-Line Execution wsgi_lineprof uses the PyEval_SetTrace Python/C API via a Cython extension to track line-by-line execution • What is tracing? 1. Register a callback function with CPython 2. CPython calls the callback function on specific events such as executing a line, calling a function, raising an exception…
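The same callback mechanism is exposed at the Python level as `sys.settrace`. The sketch below uses it instead of the C-level `PyEval_SetTrace` purely for illustration; `record_lines` and `sample` are hypothetical names, and this is not how wsgi_lineprof itself registers its callback.

```python
import sys


def record_lines(func, *args):
    """Run func(*args) and return (result, executed line offsets in func)."""
    executed = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # skip line-level tracing of other functions
        if event == "line":
            # Store the offset from the "def" line to keep output stable.
            executed.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)  # 1. register the callback
    try:
        result = func(*args)  # 2. CPython calls tracer on each event
    finally:
        sys.settrace(previous)
    return result, executed


def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total
```

Calling `record_lines(sample, 2)` returns the function result together with one entry per executed line, so the loop body appears once per iteration.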
How CPython Evaluates Source Code 1. Read source code 2. Convert to parse tree 3. Convert to AST 4. Convert to CFG 5. Convert to byte code • Byte code retains the corresponding line number 6. Evaluate byte code • Call the tracing function when evaluating byte code See the Python Developer's Guide
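The link between byte code and line numbers can be inspected with the standard `dis` module. This is a small sketch; `line_table` and `example` are hypothetical names chosen for illustration.

```python
import dis


def line_table(func):
    """Return (bytecode offset, source line) pairs for a function."""
    return [
        (offset, line)
        for offset, line in dis.findlinestarts(func.__code__)
        if line is not None  # some versions may report offsets without a line
    ]


def example(x):
    y = x + 1
    return y * 2
```

Each pair marks the byte code offset at which a new source line begins, which is the information the interpreter relies on to report line events to a tracing function.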
Measuring Time wsgi_lineprof calls a different timer via a C extension depending on the platform • Why is measuring time not simple? • A monotonic timer is required • time.time() / datetime.now() are not monotonic • The overhead of getting the time must be kept low • Calling time.monotonic() / time.perf_counter() from Python incurs additional overhead
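To get a feel for the cost of reading a clock from Python, the following sketch times a call with a monotonic clock and estimates the per-call overhead of the timer itself. `time_call` and `timer_overhead_ns` are hypothetical helper names, and the numbers depend entirely on the machine and Python version.

```python
import time


def time_call(func, *args):
    """Return (result, elapsed nanoseconds) using a monotonic clock."""
    start = time.perf_counter_ns()
    result = func(*args)
    elapsed_ns = time.perf_counter_ns() - start
    return result, elapsed_ns


def timer_overhead_ns(samples=100_000):
    """Rough average cost of one time.perf_counter_ns() call from Python."""
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()
    # The loop overhead is included, so this is an upper-bound estimate.
    return (time.perf_counter_ns() - start) / samples
```

A line-by-line profiler reads the clock on every executed line, so even a small per-call cost adds up quickly; this is one motivation for calling the timer from a compiled extension.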
Measuring Time on Various Platforms APIs used by wsgi_lineprof: • POSIX: clock_gettime(CLOCK_MONOTONIC) • macOS: mach_absolute_time() • Windows: QueryPerformanceCounter() • Fallback: gettimeofday() See extensions/timer.c
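CPython's own clocks are built on similar OS facilities, and `time.get_clock_info` reports which one is in use. The sketch below only inspects CPython's clocks; it does not call wsgi_lineprof's timer, and `describe_clock` is a hypothetical helper name.

```python
import time


def describe_clock(name: str) -> dict:
    """Summarize one of CPython's clocks, e.g. "monotonic" or "time"."""
    info = time.get_clock_info(name)
    return {
        "implementation": info.implementation,  # underlying OS API name
        "monotonic": info.monotonic,            # True if it never goes backward
        "adjustable": info.adjustable,          # True if it can be changed
        "resolution": info.resolution,          # resolution in seconds
    }
```

The reported implementation differs by operating system and Python version, so comparing `describe_clock("monotonic")` with `describe_clock("time")` on a given machine shows which clock is safe for measuring elapsed time there.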
Other Technologies • Cython: for implementing the core logic • linecache: for accessing source code efficiently • threading/queue: for writing results efficiently • wheel: for packaging with compiled extensions • Airspeed Velocity: for measuring overhead
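As a rough illustration of how linecache and threading/queue can work together, the sketch below formats profiling results on a background thread and looks up source lines through linecache. `format_results` and the `(filename, lineno, elapsed_ns)` tuple layout are assumptions made for this example, not wsgi_lineprof's actual data structures.

```python
import linecache
import queue
import threading


def format_results(results):
    """Format (filename, lineno, elapsed_ns) tuples on a background thread."""
    pending = queue.Queue()
    formatted = []

    def writer():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more results
                break
            filename, lineno, elapsed_ns = item
            # linecache caches file contents, so repeated lookups are cheap.
            source = linecache.getline(filename, lineno).rstrip()
            formatted.append(f"{lineno:>5} {elapsed_ns:>10} {source}")

    worker = threading.Thread(target=writer)
    worker.start()
    for result in results:
        pending.put(result)  # the producer never waits on formatting
    pending.put(None)
    worker.join()
    return formatted
```

Handing results to a queue keeps slow work, such as reading source files and writing output, off the thread that serves the request.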
Architecture of wsgi_lineprof [Diagram: Python, Web Server, Web App, Profiler, Timer] 1. WSGI middleware to capture request/response 2. Tracing API to track line-by-line execution 3. Monotonic timer to measure time
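The three pieces can be combined into a toy profiler in pure Python. This is a deliberately simplified sketch under stated assumptions: it uses `sys.settrace` and `time.perf_counter_ns` rather than the C-level APIs, `TinyLineProfilerMiddleware` is a hypothetical name, and its overhead is far higher than that of a real profiler such as wsgi_lineprof.

```python
import sys
import time
from collections import defaultdict


class TinyLineProfilerMiddleware:
    """Toy line profiler; a simplified stand-in, not wsgi_lineprof's code."""

    def __init__(self, app):
        self.app = app
        # (filename, lineno) -> [hits, total nanoseconds]
        self.stats = defaultdict(lambda: [0, 0])

    def __call__(self, environ, start_response):
        last = {}  # frame -> (lineno, start time) of the line being executed

        def flush(frame, now):
            # Charge the time since the previous event to the previous line.
            if frame in last:
                lineno, started = last.pop(frame)
                entry = self.stats[(frame.f_code.co_filename, lineno)]
                entry[0] += 1
                entry[1] += now - started

        def tracer(frame, event, arg):
            now = time.perf_counter_ns()  # 3. monotonic timer
            if event in ("line", "return"):
                flush(frame, now)
            if event == "line":
                last[frame] = (frame.f_lineno, time.perf_counter_ns())
            return tracer

        previous = sys.gettrace()
        sys.settrace(tracer)  # 2. tracing API
        try:
            # 1. WSGI middleware: run the wrapped app for this request.
            # The body is materialized so lazy iterables run while traced.
            return list(self.app(environ, start_response))
        finally:
            sys.settrace(previous)
```

After a request, `stats` maps each executed line to a hit count and the total time spent on it, including time spent in functions called from that line.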
Conclusion We can build a profiler by combining a few basic technologies, though making it efficient requires additional effort Future plans for wsgi_lineprof: • Show results via an HTTP endpoint • Support ASGI (asyncio) / multithreading • Reduce overhead wsgi_lineprof is available on GitHub