Implementation of a profiler for Python web applications

Slide 1

Slide 1 text

Implementation of a profiler for Python web applications
Yusuke Miyazaki
PyCon JP 2019 / 2019-09-16

Slide 2

Slide 2 text

Who am I?
• Yusuke Miyazaki / @ymyzk
• Site Reliability Engineer @ Indeed
• Personal projects
  • wsgi_lineprof: line-by-line profiler
  • mypy Playground: mypy on web browsers
  • tox-gh-actions: smooth integration of tox with GitHub Actions
• See ymyzk.com for more details

Slide 3

Slide 3 text

Have you ever used a profiler?
Result at PyCon JP: about 50%

Slide 4

Slide 4 text

Have you ever built a profiler?
Result at PyCon JP: only 1 person

Slide 5

Slide 5 text

Goal of the Talk
Show that "implementing a profiler is not so difficult" by understanding how a profiler works inside

Slide 6

Slide 6 text

Agenda
• Introduction
• What is wsgi_lineprof?
• Features of wsgi_lineprof
• How does wsgi_lineprof work?
• Technologies for implementing a profiler
• Conclusion

Slide 7

Slide 7 text

What is wsgi_lineprof?

Slide 8

Slide 8 text

Line-by-Line Profiler for Web Apps
Shows line-by-line execution time for each request

Slide 9

Slide 9 text

How to Use
Add a few lines to the existing WSGI application:

    # Your existing WSGI application
    app: WSGIApplication

    # Enable profiler
    from wsgi_lineprof.middleware import LineProfilerMiddleware
    app = LineProfilerMiddleware(app)

Slide 10

Slide 10 text

Features of wsgi_lineprof
• Line-by-line profiling
  • Finding a bottleneck quickly
• WSGI middleware
  • Integration with many Python web applications
• Easily pluggable
  • All configuration for profiling in one place

Slide 11

Slide 11 text

Predecessors
wsgi_lineprof is inspired by two predecessors:
• Line-by-line profiler for Python
  • rkern/line_profiler
• Line-by-line profiler for Ruby web applications
  • kainosnoema/rack-lineprof

Slide 12

Slide 12 text

How does wsgi_lineprof work?

Slide 13

Slide 13 text

Architecture
[Diagram: a box labeled Python containing Web Server and Web App]

Slide 14

Slide 14 text

Architecture
[Diagram: a box labeled Python containing Web Server, Web App, and Profiler]
1. Capture request/response

Slide 15

Slide 15 text

Architecture
[Diagram: a box labeled Python containing Web Server, Web App, and Profiler]
1. Capture request/response
2. Get information on execution

Slide 16

Slide 16 text

Architecture
[Diagram: a box labeled Python containing Web Server, Web App, Profiler, and Timer]
1. Capture request/response
2. Get information on execution
3. Measure execution time per line precisely

Slide 17

Slide 17 text

WSGI Middleware
wsgi_lineprof is implemented as a WSGI middleware
• What is WSGI middleware?
  • A wrapper around a WSGI app that adds features
• Why WSGI middleware?
  • Keep track of HTTP requests
  • Support multiple web application frameworks
  • Easily enable/disable the middleware

Slide 18

Slide 18 text

Implementing WSGI Middleware
• A WSGI application is a callable (e.g., a function)
• A WSGI middleware can be implemented as a higher-order function

    WSGIApplication = Callable[
        [WSGIEnvironment, StartResponse],
        Iterable[bytes]]
    WSGIMiddleware = Callable[
        [WSGIApplication], WSGIApplication]
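To make the "higher-order function" point concrete, here is a minimal sketch. The names `add_header`, `hello_app`, and the `X-Wrapped` header are hypothetical and only serve as illustration; the `WSGIEnvironment` and `StartResponse` aliases are simplified assumptions, not the definitions used by wsgi_lineprof.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Simplified aliases (assumptions for this sketch).
WSGIEnvironment = Dict[str, Any]
StartResponse = Callable[..., Any]
WSGIApplication = Callable[[WSGIEnvironment, StartResponse], Iterable[bytes]]
WSGIMiddleware = Callable[[WSGIApplication], WSGIApplication]


def add_header(app: WSGIApplication) -> WSGIApplication:
    """Hypothetical middleware: takes an app and returns a wrapped app."""

    def wrapped(environ: WSGIEnvironment,
                start_response: StartResponse) -> Iterable[bytes]:
        def patched_start_response(status: str,
                                   headers: List[Tuple[str, str]],
                                   exc_info: Any = None) -> Any:
            # Add one extra header, then delegate to the server's callable.
            return start_response(
                status, headers + [("X-Wrapped", "1")], exc_info)

        return app(environ, patched_start_response)

    return wrapped


def hello_app(environ, start_response):
    """A tiny WSGI application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


wrapped_app = add_header(hello_app)
```

Because `add_header` both accepts and returns a WSGI application, it can be stacked with other middlewares in any order.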

Slide 19

Slide 19 text

Example of WSGI Middleware

    class MyMiddleware:
        def __init__(self, app):
            self.app = app

        def __call__(self, env, start_response):
            print("start profiling")
            res = self.app(env, start_response)
            print("end profiling")
            return res

See wsgi_lineprof/middleware.py
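As a small step from the print-only example toward a profiler, the sketch below records how long each request spends inside the wrapped application. `TimingMiddleware` and `slow_app` are hypothetical names for illustration; this is not how wsgi_lineprof itself is implemented.

```python
import time


class TimingMiddleware:
    """Hypothetical middleware that records per-request wall time."""

    def __init__(self, app):
        self.app = app
        self.timings = []  # list of (path, seconds)

    def __call__(self, env, start_response):
        start = time.perf_counter()
        try:
            # Note: this measures only the call to the app, not the time
            # the server later spends iterating over the response body.
            return self.app(env, start_response)
        finally:
            elapsed = time.perf_counter() - start
            self.timings.append((env.get("PATH_INFO", ""), elapsed))


def slow_app(env, start_response):
    """Example app that does a little work before responding."""
    sum(range(10_000))
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done"]


timed_app = TimingMiddleware(slow_app)
```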

Slide 20

Slide 20 text

Tracing Line-by-Line Execution
wsgi_lineprof uses the PyEval_SetTrace function of the Python/C API, via a Cython extension, to track line-by-line execution
• What is tracing?
  1. Register a callback function with CPython
  2. CPython calls the callback function on specific events such as executing a line, calling a function, or raising an exception

Slide 21

Slide 21 text

How to Use Tracing on CPython
CPython provides various levels of APIs for tracing
• trace module
• setprofile/settrace functions in the sys module

    import sys
    sys.settrace(lambda frame, event, arg: ...)

• PyEval_SetTrace/PyEval_SetProfile functions in the Python/C API

See extensions/extensions.pyx
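The following sketch uses the pure-Python `sys.settrace` level of the API to record which lines of one function are executed. The helper `trace_lines` and the function `sample` are hypothetical; wsgi_lineprof itself uses the C-level `PyEval_SetTrace`, which this example does not reproduce.

```python
import sys


def trace_lines(func, *args, **kwargs):
    """Run func and return (result, executed line offsets).

    Offsets are relative to the line of the function's `def` statement.
    """
    code = func.__code__
    executed = []

    def local_tracer(frame, event, arg):
        # Called for "line", "return", and "exception" events in the frame.
        if event == "line":
            executed.append(frame.f_lineno - code.co_firstlineno)
        return local_tracer

    def global_tracer(frame, event, arg):
        # Called once per new frame; trace only the target function.
        if frame.f_code is code:
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous tracer
    return result, executed


def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total


result, executed = trace_lines(sample, 2)
```

Lines inside the loop body appear in `executed` once per iteration, which is the raw information a line-by-line profiler needs for its hit counts.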

Slide 22

Slide 22 text

How CPython Evaluates Source Code
1. Read source code
2. Convert to parse tree
3. Convert to AST
4. Convert to CFG
5. Convert to byte code
   • Byte code retains the corresponding line numbers
6. Evaluate byte code
   • Call the tracing function when evaluating byte code

See the Python Developer's Guide
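The claim that byte code retains line numbers can be checked with the standard `dis` module. This is a small sketch; the source string and the `<example>` file name are made up for illustration.

```python
import dis

# Three statements on three separate lines.
source = "x = 1\ny = x + 1\nz = y * 2\n"
code = compile(source, "<example>", "exec")

# findlinestarts yields (byte code offset, line number) pairs.
line_starts = list(dis.findlinestarts(code))

# Some Python versions report None (or line 0) for bookkeeping
# instructions, so drop None before sorting.
lines = sorted({lineno for _, lineno in line_starts if lineno is not None})
```

Running `dis.dis(code)` prints the same mapping together with the instructions themselves.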

Slide 23

Slide 23 text

Measuring Time
wsgi_lineprof calls a platform-specific timer via a C extension to measure time
• Why is measuring time not simple?
  • Needs a monotonic timer
    • time.time() / datetime.now() are not monotonic
  • Needs low overhead when getting the time
    • Calling time.monotonic() / time.perf_counter() from Python incurs additional overhead
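The difference between the clocks can be inspected from Python with `time.get_clock_info`. The helper `elapsed_ns` is a hypothetical name used only in this sketch.

```python
import time

# Report whether each clock is monotonic and whether it can be adjusted
# (for example by NTP or by the system administrator).
for name in ("time", "monotonic", "perf_counter"):
    info = time.get_clock_info(name)
    print(f"{name}: monotonic={info.monotonic}, "
          f"adjustable={info.adjustable}, resolution={info.resolution}")


def elapsed_ns(func):
    """Measure func with a monotonic, high-resolution clock."""
    start = time.perf_counter_ns()
    func()
    return time.perf_counter_ns() - start


duration = elapsed_ns(lambda: sum(range(1000)))
```

A clock that can jump backwards could produce negative durations for a line, which is why the wall clock is unsuitable for a profiler.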

Slide 24

Slide 24 text

Measuring Time on Various Platforms
APIs used by wsgi_lineprof:
• POSIX: clock_gettime(CLOCK_MONOTONIC)
• macOS: mach_absolute_time()
• Windows: QueryPerformanceCounter()
• Fallback: gettimeofday()

See extensions/timer.c
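The idea of choosing a timer per platform with a fallback can be sketched in Python as well. This is only an analogue of the approach, not the C code in extensions/timer.c; `pick_monotonic_ns` is a hypothetical helper, and the fallback used here is `time.perf_counter_ns` rather than `gettimeofday()`.

```python
import time


def pick_monotonic_ns():
    """Return (description, zero-argument function returning nanoseconds)."""
    if hasattr(time, "clock_gettime_ns") and hasattr(time, "CLOCK_MONOTONIC"):
        # Unix-like platforms expose clock_gettime through the time module.
        return (
            "clock_gettime(CLOCK_MONOTONIC)",
            lambda: time.clock_gettime_ns(time.CLOCK_MONOTONIC),
        )
    # Platforms without clock_gettime (e.g. Windows) use the portable clock.
    return "perf_counter_ns", time.perf_counter_ns


timer_name, now_ns = pick_monotonic_ns()
first = now_ns()
second = now_ns()
```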

Slide 25

Slide 25 text

Other Technologies
• Cython: for implementing the core logic
• linecache: for accessing source code efficiently
• threading/queue: for writing results efficiently
• wheel: for packaging with compiled extensions
• Airspeed Velocity: for measuring overhead
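One way threading and queue can keep result writing off the request path is a background writer thread. The class `BackgroundWriter` below is a hypothetical sketch of that pattern and is not taken from wsgi_lineprof.

```python
import io
import queue
import threading


class BackgroundWriter:
    """Hypothetical writer: callers enqueue text, a thread writes it out."""

    _STOP = object()  # sentinel that tells the worker thread to exit

    def __init__(self, stream):
        self._stream = stream
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write(self, text):
        # Cheap for the caller: only puts the text on the queue.
        self._queue.put(text)

    def close(self):
        # Items are processed in FIFO order, so everything queued before
        # the sentinel is written before the thread exits.
        self._queue.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._stream.write(item)


buffer = io.StringIO()
writer = BackgroundWriter(buffer)
writer.write("request 1 profile\n")
writer.write("request 2 profile\n")
writer.close()
```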

Slide 26

Slide 26 text

Conclusion

Slide 27

Slide 27 text

Architecture of wsgi_lineprof
[Diagram: a box labeled Python containing Web Server, Web App, Profiler, and Timer]
1. WSGI middleware to capture request/response
2. Tracing API to track line-by-line execution
3. Monotonic timer to measure time
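The three pieces can be combined into a toy line profiler in pure Python. This is a deliberately simplified sketch, not the wsgi_lineprof implementation: `MiniLineProfilerMiddleware` and `demo_app` are hypothetical names, only one code object is profiled, recursion is not handled, and `sys.settrace` plus `time.perf_counter_ns` stand in for the C-level tracing and timer APIs.

```python
import linecache
import sys
import time
from collections import defaultdict


class MiniLineProfilerMiddleware:
    """Toy profiler: WSGI middleware + tracing + monotonic timer."""

    def __init__(self, app, target_code):
        self.app = app
        self.target_code = target_code  # code object to profile
        self.stats = defaultdict(lambda: [0, 0])  # lineno -> [hits, total_ns]

    def __call__(self, env, start_response):
        last = None  # (lineno, start_ns) of the line currently executing

        def local_tracer(frame, event, arg):
            nonlocal last
            now = time.perf_counter_ns()
            # A new line or a return ends the interval of the previous line.
            if event in ("line", "return") and last is not None:
                entry = self.stats[last[0]]
                entry[0] += 1
                entry[1] += now - last[1]
                last = None
            if event == "line":
                last = (frame.f_lineno, time.perf_counter_ns())
            return local_tracer

        def global_tracer(frame, event, arg):
            # Trace only frames that run the target code object.
            return local_tracer if frame.f_code is self.target_code else None

        previous = sys.gettrace()
        sys.settrace(global_tracer)
        try:
            return self.app(env, start_response)
        finally:
            sys.settrace(previous)

    def report(self):
        """Return rows of (lineno, hits, total_ns, source text)."""
        filename = self.target_code.co_filename
        return [
            (lineno, hits, total_ns,
             linecache.getline(filename, lineno).rstrip())
            for lineno, (hits, total_ns) in sorted(self.stats.items())
        ]


def demo_app(env, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    body = b"".join(str(i).encode() for i in range(100))
    return [body]


profiled_app = MiniLineProfilerMiddleware(demo_app, demo_app.__code__)
```

After a request has been served, `profiled_app.report()` returns one row per executed line of `demo_app`. The overhead of doing this work in a Python-level trace function is one reason an efficient profiler moves it into a compiled extension.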

Slide 28

Slide 28 text

Conclusion
We can build a profiler by combining some basic technologies, though making an efficient profiler requires additional effort

Future plans for wsgi_lineprof:
• Show results via an HTTP endpoint
• Support ASGI (asyncio) / multithreading
• Reduce overhead

wsgi_lineprof is available on GitHub