Slide 1

Slide 1 text

Implementation of a profiler for Python web applications
Yusuke Miyazaki
PyCon JP 2019 / 2019-09-16

Slide 2

Slide 2 text

Who am I?
• Yusuke Miyazaki / @ymyzk
• Site Reliability Engineer @ Indeed
• Personal projects
  • wsgi_lineprof: line-by-line profiler
  • mypy Playground: mypy on web browsers
  • tox-gh-actions: smooth integration of tox with GitHub Actions
• See ymyzk.com for more details

Slide 3

Slide 3 text

Have you ever used a profiler?
 
 Result at PyCon JP: about 50%

Slide 4

Slide 4 text

Have you ever built a profiler?

Result at PyCon JP: only 1 person

Slide 5

Slide 5 text

Goal of the Talk
“Implementing a profiler is not so difficult”
by understanding the internals of a profiler

Slide 6

Slide 6 text

Agenda
• Introduction
• What is wsgi_lineprof?
• Features of wsgi_lineprof
• How does wsgi_lineprof work?
• Technologies for implementing a profiler
• Conclusion

Slide 7

Slide 7 text

What is wsgi_lineprof?

Slide 8

Slide 8 text

Line-by-Line Profiler for Web Apps
Shows line-by-line execution time for each request

Slide 9

Slide 9 text

How to Use
Add a few lines to the existing WSGI application

# Your existing WSGI application
app: WSGIApplication

# Enable profiler
from wsgi_lineprof.middleware import LineProfilerMiddleware
app = LineProfilerMiddleware(app)

Slide 10

Slide 10 text

Features of wsgi_lineprof
• Line-by-line profiling
  • Finding a bottleneck quickly
• WSGI middleware
  • Integration with many Python web applications
• Easily pluggable
  • All configuration for profiling in one place

Slide 11

Slide 11 text

Predecessors
wsgi_lineprof is inspired by its predecessors
• Line-by-line profiler for Python
  • rkern/line_profiler
• Line-by-line profiler for Ruby web applications
  • kainosnoema/rack-lineprof

Slide 12

Slide 12 text

How does wsgi_lineprof work?

Slide 13

Slide 13 text

Architecture
[Diagram labels: Python, Web Server, Web App]

Slide 14

Slide 14 text

Architecture
[Diagram labels: Python, Web Server, Web App, Profiler]
1. Capture request/response

Slide 15

Slide 15 text

Architecture
[Diagram labels: Python, Web Server, Web App, Profiler]
1. Capture request/response
2. Get information on execution

Slide 16

Slide 16 text

Architecture
[Diagram labels: Python, Web Server, Web App, Profiler, Timer]
1. Capture request/response
2. Get information on execution
3. Measure execution time per line precisely

Slide 17

Slide 17 text

WSGI Middleware
wsgi_lineprof is implemented as a WSGI middleware
• What is WSGI middleware?
  • Wrapper for a WSGI app to add features
• Why WSGI middleware?
  • Keep track of HTTP requests
  • Support multiple web application frameworks
  • Easily enable/disable a middleware

Slide 18

Slide 18 text

Implementing WSGI Middleware
• A WSGI application is a callable (e.g., a function)
• WSGI middleware can be implemented as a higher-order function

WSGIApplication = Callable[
    [WSGIEnvironment, StartResponse],
    Iterable[bytes]]
WSGIMiddleware = Callable[
    [WSGIApplication], WSGIApplication]
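
A minimal sketch of the higher-order-function form described above. The type aliases are simplified, and `hello_app`, `logging_middleware`, and `seen_paths` are hypothetical names used only for illustration; they are not part of wsgi_lineprof.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Simplified aliases mirroring the slide (assumption: real WSGI typing is richer).
WSGIEnvironment = Dict[str, object]
StartResponse = Callable[[str, List[Tuple[str, str]]], object]
WSGIApplication = Callable[[WSGIEnvironment, StartResponse], Iterable[bytes]]
WSGIMiddleware = Callable[[WSGIApplication], WSGIApplication]

# Paths observed by the middleware, kept at module level for easy inspection.
seen_paths: List[object] = []


def hello_app(environ: WSGIEnvironment, start_response: StartResponse) -> Iterable[bytes]:
    """A tiny WSGI application used only for this illustration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def logging_middleware(app: WSGIApplication) -> WSGIApplication:
    """Take a WSGI app and return a new WSGI app that records request paths."""

    def wrapped(environ: WSGIEnvironment, start_response: StartResponse) -> Iterable[bytes]:
        seen_paths.append(environ.get("PATH_INFO"))
        return app(environ, start_response)

    return wrapped


# Wrap the app and call it directly with a fake environ and start_response.
app = logging_middleware(hello_app)
statuses: List[str] = []
body = b"".join(app({"PATH_INFO": "/"}, lambda status, headers: statuses.append(status)))
```

Because the wrapped object is still a plain WSGI application, it can be handed to any WSGI server or wrapped again by another middleware.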

Slide 19

Slide 19 text

Example of WSGI Middleware

class MyMiddleware:
    def __init__(self, app):
        self.app = app

    def __call__(self, env, start_response):
        print("start profiling")
        res = self.app(env, start_response)
        print("end profiling")
        return res

See wsgi_lineprof/middleware.py

Slide 20

Slide 20 text

Tracing Line-by-Line Execution
wsgi_lineprof uses the PyEval_SetTrace function of the Python/C API, via a Cython extension, to track line-by-line execution
• What is tracing?
  1. Register a callback function with CPython
  2. CPython calls the callback function on specific events such as executing a line, calling a function, or raising an exception

Slide 21

Slide 21 text

How to Use Tracing on CPython
CPython provides various levels of APIs for tracing
• trace module
• setprofile/settrace functions in sys module

  import sys
  sys.settrace(lambda frame, event, arg: ...)

• PyEval_SetTrace/PyEval_SetProfile functions in Python/C API

See extensions/extensions.pyx
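
A minimal sketch of the pure-Python level of this API, using `sys.settrace` to record which lines of one function run. wsgi_lineprof itself uses PyEval_SetTrace through Cython, so this only illustrates the callback mechanism; `collect_line_events` and `sample` are hypothetical names.

```python
import sys


def collect_line_events(func):
    """Run func() and return the executed line offsets relative to its def line."""
    events = []

    def tracer(frame, event, arg):
        # Ignore frames that belong to other functions.
        if frame.f_code is not func.__code__:
            return None
        if event == "line":
            events.append(frame.f_lineno - frame.f_code.co_firstlineno)
        # Returning the tracer keeps line events coming for this frame.
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        # Always restore whatever trace function was installed before.
        sys.settrace(previous)
    return events


def sample():
    total = 0
    total += 1
    return total


offsets = collect_line_events(sample)
```

The callback receives "call", "line", "return", and "exception" events; a profiler mostly cares about the "line" events and the time elapsed between them.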

Slide 22

Slide 22 text

How CPython Evaluates Source Code
1. Read source code
2. Convert to parse tree
3. Convert to AST
4. Convert to CFG
5. Convert to byte code
  • Byte code retains the corresponding line numbers
6. Evaluate byte code
  • Call the tracing function when evaluating byte code

See the Python Developer’s Guide
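
The claim that byte code retains line numbers can be checked from Python with the standard `dis` module. This is a small sketch; the `source` string is an arbitrary example, and the exact instructions differ between CPython versions.

```python
import dis

# Three lines of source compiled to a code object (steps 1 to 5 above).
source = "x = 1\ny = x + 1\nprint(y)\n"
code = compile(source, "<example>", "exec")

# findlinestarts yields (bytecode offset, line number) pairs.
# Entries without a real source line (None or 0 on some versions) are skipped.
lines = sorted({line for _, line in dis.findlinestarts(code) if line})
```

This mapping is what lets the interpreter report a "line" event to the tracing function whenever execution moves to byte code belonging to a new source line.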

Slide 23

Slide 23 text

Measuring Time
wsgi_lineprof calls a platform-specific timer via a C extension to measure time
• Why is measuring time not simple?
  • Needs to use a monotonic timer
    • time.time() / datetime.now() are not monotonic
  • Needs to reduce the overhead of getting the time
    • Using time.monotonic()/perf_counter() incurs additional overhead
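
A short sketch using only the standard `time` module to show the difference between the wall clock and monotonic clocks. It does not reproduce the C-level timer in wsgi_lineprof; it only illustrates why `time.time()` is a poor choice for measuring intervals.

```python
import time

# Clock metadata reported by CPython for the current platform.
wall = time.get_clock_info("time")
mono = time.get_clock_info("monotonic")
perf = time.get_clock_info("perf_counter")

# The wall clock can be adjusted (e.g. by NTP), so it may jump backwards;
# monotonic clocks cannot, which makes differences between readings safe.
start = time.perf_counter()
sum(range(100_000))
elapsed = time.perf_counter() - start
```

Each call to `time.perf_counter()` goes through the Python function-call machinery, which is the kind of per-line overhead a C-level timer avoids.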

Slide 24

Slide 24 text

Measuring Time on Various Platforms
APIs used by wsgi_lineprof:
• POSIX: clock_gettime(CLOCK_MONOTONIC)
• macOS: mach_absolute_time()
• Windows: QueryPerformanceCounter()
• Fallback: gettimeofday()

See extensions/timer.c

Slide 25

Slide 25 text

Other Technologies
• Cython: for implementing the core logic
• linecache: for accessing source code efficiently
• threading/queue: for writing results efficiently
• wheel: for packaging with compiled extensions
• Airspeed Velocity: for measuring overhead
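
The slides do not show how wsgi_lineprof uses threading and queue, so the following is only a generic sketch of the pattern: request handling puts results on a queue and a background thread writes them out. The names `results`, `written`, and `writer` are hypothetical.

```python
import queue
import threading

results = queue.Queue()
written = []  # stands in for a file or stream in this sketch


def writer():
    """Consume results until the None sentinel arrives."""
    while True:
        item = results.get()
        if item is None:
            break
        written.append(item)


thread = threading.Thread(target=writer, daemon=True)
thread.start()

# The request-handling side only enqueues, so it never blocks on output.
for i in range(3):
    results.put(f"request {i}: profile data")

results.put(None)  # ask the writer to stop
thread.join()
```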

Slide 26

Slide 26 text

Conclusion

Slide 27

Slide 27 text

Architecture of wsgi_lineprof
[Diagram labels: Python, Web Server, Web App, Profiler, Timer]
1. WSGI middleware to capture request/response
2. Tracing API to track line-by-line execution
3. Monotonic timer to measure time
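
The three pieces above can be combined into a toy line profiler in pure Python. This is a sketch under simplifying assumptions, not the wsgi_lineprof implementation: it traces a single target function, uses `sys.settrace` and `time.perf_counter` instead of the C-level APIs, and `ToyLineProfilerMiddleware` and `slow_app` are hypothetical names.

```python
import sys
import time
from collections import defaultdict


class ToyLineProfilerMiddleware:
    """Toy WSGI middleware that times each line of one target function."""

    def __init__(self, app, target):
        self.app = app
        self.target_code = target.__code__
        self.timings = defaultdict(float)  # line number -> total seconds
        self.hits = defaultdict(int)       # line number -> execution count
        self._last = None                  # (line number, start time) of current line

    def _flush(self, now):
        # Attribute the elapsed time to the line that was running until now.
        if self._last is not None:
            line, started = self._last
            self.timings[line] += now - started
            self._last = None

    def _trace(self, frame, event, arg):
        if frame.f_code is not self.target_code:
            return None  # do not trace other functions
        now = time.perf_counter()
        self._flush(now)
        if event == "line":
            self.hits[frame.f_lineno] += 1
            self._last = (frame.f_lineno, now)
        return self._trace

    def __call__(self, environ, start_response):
        # 1. Middleware captures the request; 2. tracing is enabled around it.
        previous = sys.gettrace()
        sys.settrace(self._trace)
        try:
            return self.app(environ, start_response)
        finally:
            sys.settrace(previous)
            self._flush(time.perf_counter())


def slow_app(environ, start_response):
    time.sleep(0.01)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done"]


profiled = ToyLineProfilerMiddleware(slow_app, target=slow_app)
body = profiled({}, lambda status, headers: None)
sleep_line = slow_app.__code__.co_firstlineno + 1
```

After the call, `profiled.timings[sleep_line]` holds roughly the time spent in `time.sleep`, and the other lines show much smaller values.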

Slide 28

Slide 28 text

Conclusion
We can build a profiler by combining some basic technologies, though making an efficient profiler requires additional effort

Future plans for wsgi_lineprof:
• Show results via an HTTP endpoint
• Support ASGI (asyncio) / multithreading
• Reduce overhead

wsgi_lineprof is available on GitHub