PyBay
August 21, 2016

2016 - Mahmoud Hashemi - Python Profiling and Performance: Elementary to Enterprise

Description
This talk provides an end-to-end introduction and overview of Python performance practices, from fundamentals to functional industry practices to the future of performant Python. If you've ever felt lost in or out of touch with the constant whirl of Python performance advancements, this practical talk will put it back into perspective.

Abstract
Performance is a complex topic. It means a lot of things to a lot of people. Python gives us a great starting point: strong primitives and the "good enough" philosophy. But is Python actually good enough for performance-critical applications?

This talk defines different kinds of performance, covers basic principles, and dives right into measurement. With those foundations laid, it outlines eight approaches to scaling Python, four of which are stack-agnostic and four of which are Python-specific. It outlines many examples from industry to promote a holistic view of performance as a practical process, not a large-scale benchmarking competition.

Bio
Mahmoud Hashemi is Lead Developer of Python Infrastructure at PayPal, where he focuses on distributed systems, API design, and application security. He presented O'Reilly's Enterprise Software with Python, as well as several guides to topics from DNS to software versioning to statistics. An avid Wikipedian, Mahmoud is half of Hatnote, creators of Listen to Wikipedia and other fine wiki-based software.

https://youtu.be/Dgnp28Ijm_M

Transcript

  1. About Me
    Lead Developer, PayPal Python Infrastructure. Python programmer since 2009. Built services taking 10^1 to 10^9 requests per day.
    Enjoys: coding open-source Python, reading Wikipedia, occu.py-ing PayPal.
  2. Python Profiling and Performance - Mahmoud Hashemi
    Profiling and performance: the engineer’s delight and/or doom.
    Types of performance; ground rules; profiling tools; scaling strategies.
    GOTTA GO FAST
  3. Performance types: what is fast?
    1. Latency - Response time. “200 millisecond client roundtrip”
    2. Throughput - Successful traffic flow. “200 requests per second”
    3. Efficiency - Utilization and return on investment. “2000 users per 4-core VM at 50% CPU”
    X. Scalability - Not a type of performance. Complex, and often more about scaling people than software.
  4. Performance ground rules
    Safety first! Always wear your seatbelt, helmet, and check your mirrors:
    1. Predictability is power
    2. Good work takes cycles
    3. Abide by Amdahl’s Law
  5. Predictability is power (Performance ground rule #1)
    Automated tests first, then benchmarks. Test after every optimization.
    Remember, optimized code is: harder to write and read; less maintainable; buggier, more brittle.
    See CPython’s consistency: fast startup; no JIT, no warmup; no complex GC.
  6. Good work takes cycles (Performance ground rule #2)
    Healthy enterprise applications have big bones.
    Gather requirements: security, instrumentation, compatibility.
    Establish percentile SLAs: 50, 95, 99, max.
    Stick to the budget and put down the ping pong paddles.
    Good enough is good enough!
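
The percentile SLAs named on this slide can be computed from raw latency samples. The following is a minimal sketch, assuming latencies are collected in milliseconds into a list; the function name `latency_summary` and the sample data are hypothetical.

```python
import statistics


def latency_summary(samples_ms):
    """Return the p50, p95, p99, and max of a list of latency samples."""
    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(samples_ms),
    }


# Hypothetical latencies of 1..100 ms, one sample each.
samples = list(range(1, 101))
summary = latency_summary(samples)
print(summary)
```

Reporting the maximum alongside the percentiles keeps the worst request visible, which an average would hide.
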
  7. Abide by Amdahl’s Law (Performance ground rule #3)
    Speedups are relative to task significance. Keep perspective, maximize impact.
    Focus on one part at a time. Recheck proportions after every optimization.
    https://commons.wikimedia.org/wiki/File:Optimizing-different-parts.svg
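
As a rough worked example of Amdahl's Law, the overall speedup is 1 / ((1 - p) + p / s), where p is the fraction of runtime being optimized and s is the speedup of that part. The fractions and speedups below are made up for illustration.

```python
def amdahl_speedup(fraction, local_speedup):
    """Overall speedup when `fraction` of the runtime gets `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


small_part = amdahl_speedup(0.20, 5.0)  # 5x faster on 20% of the runtime
big_part = amdahl_speedup(0.80, 2.0)    # 2x faster on 80% of the runtime
print(f"{small_part:.2f}x vs {big_part:.2f}x")
```

A modest 2x gain on the dominant part beats a 5x gain on a minor part, which is why proportions should be rechecked after each optimization.
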
  8. Performance ground rules (Recap)
    Always wear your seatbelt, helmet, and check your mirrors:
    1. Predictability is power
    2. Good work takes cycles
    3. Abide by Amdahl’s Law
  9. Python profiling tools
    1. Casual profiling: time.time(), timeit module
    2. Offline profiling: cProfile, other advanced options
    3. Online profiling
  10. time.time() (Casual Python profiling)
    Trusty, print-based debugging will never die:
        import time
        start = time.time()
        do_work()
        print time.time() - start, 'seconds'
    Easy to use and explain. But: single measurement might misrepresent; measurement expense could exceed operation; must switch to time.clock() on Windows for better resolution; gets tedious.
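
The snippet on this slide uses Python 2 syntax. A rough Python 3 equivalent is sketched below, assuming `time.perf_counter()` as the clock (`time.clock()` was removed in Python 3.8); `do_work` is a hypothetical stand-in for the code being measured.

```python
import time


def do_work():
    # Hypothetical stand-in for the code being measured.
    return sum(i * i for i in range(100_000))


start = time.perf_counter()
do_work()
elapsed = time.perf_counter() - start
print(f"{elapsed:.6f} seconds")
```
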
  11. The timeit module (Casual Python profiling)
    Built-in, and fixes the main problems of time.time()-based profiling:
    Does multiple runs to offset system variability.
    Each run does thousands or millions of repetitions, compensating for very fast operations.
    Automatically uses the platform’s best timing function.
    Multiple easy usage patterns: command line and Python.
    Command line:
        $ python -m timeit -s "import json" "json.dumps({})"
        1000000 loops, best of 3: 1.73 usec per loop
    Other timeit facts: it disables garbage collection for even more consistency; it represents one of the few valid uses of Python code generation; Jupyter notebook has built-in support (%%timeit); less than 300 lines of code.
    Notable successor: perf. Runs separate processes; somewhat more statistical; still easy to use.
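
The command-line run on this slide can be approximated from Python with `timeit.repeat`. This is a sketch only; the repeat and loop counts are arbitrary choices, not the values the command-line tool would pick automatically.

```python
import timeit

LOOPS = 100_000

# Three runs of LOOPS repetitions each; `setup` runs once per run and is not timed.
timings = timeit.repeat(
    stmt="json.dumps({})",
    setup="import json",
    repeat=3,
    number=LOOPS,
)

# Take the minimum, as the command-line tool does ("best of 3").
best_usec = min(timings) / LOOPS * 1e6
print(f"{LOOPS} loops, best of 3: {best_usec:.2f} usec per loop")
```
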
  12. The cProfile module (Offline Python profiling)
    Simpler tools assume you know which code you want to measure, whereas the built-in cProfile module: measures a whole thread of execution; identifies time-consuming functions; has command-line and programmatic interfaces.
    Command line:
        $ python -m cProfile target_code.py
    Python:
        import cProfile, pstats
        pr = cProfile.Profile()
        pr.enable()
        # call the code you want to measure
        pr.disable()
    Sample output:
        624865 function calls (517453 primitive calls) in 0.289 seconds
        Ordered by: internal time
        List reduced from 44 to 10 due to restriction <10>
        ncalls      tottime  percall  cumtime  percall  filename:lineno(function)
        2989/7        0.054    0.000    0.186    0.027  serdes.py:365(vo_tree)
        12618/45      0.040    0.000    0.106    0.002  serdes_bin.py:78(append_field)
        16393         0.037    0.000    0.044    0.000  serdes.py:315(_uniq_field)
        84450/48      0.028    0.000    0.186    0.004  serdes.py:348(field_tree)
        84450         0.023    0.000    0.037    0.000  {getattr}
        68389         0.014    0.000    0.014    0.000  field.py:116(__get__)
        26182         0.012    0.000    0.017    0.000  serdes_bin.py:128(append_int)
        2757          0.009    0.000    0.010    0.000  serdes.py:336(_uniq_vo)
        2489/43       0.008    0.000    0.105    0.002  serdes_bin.py:205(append_vo)
        5103/94       0.007    0.000    0.103    0.001  serdes_bin.py:192(append_list)
    The cProfile + pstats text output is utilitarian, but interesting visualizations can be created with RunSnakeRun, SnakeViz, Gprof2dot, pyprof2calltree, and pyinstrument.
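
The programmatic snippet on this slide stops before any report is printed. One way to finish it, sketched here with a hypothetical recursive `fib` as the workload, is to hand the profiler to `pstats.Stats`, sort by internal time, and limit the listing to ten rows.

```python
import cProfile
import io
import pstats


def fib(n):
    # Hypothetical workload: deliberately naive recursion.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


pr = cProfile.Profile()
pr.enable()
fib(18)
pr.disable()

# Write the report to a string buffer so it can be printed or logged.
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf)
stats.sort_stats("tottime").print_stats(10)  # internal time, top 10 rows
report = buf.getvalue()
print(report)
```
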
  13. Advanced tooling (Offline Python profiling)
    line_profiler: line-by-line profiling inside of functions.
    yep: function-oriented profiler that crosses the Python-C boundary.
    GreenletProfiler: a concurrency-friendly profiler for applications using greenlet.
    memory_profiler: process and line-by-line memory consumption measurement.
    A variety of measurement and tuning tools can be built out of the following functions and modules: sys.setprofile(), sys.settrace(), sys._getframe(), gc, resource, psutil (not built-in).
    Just run a search for “python <module>” and check out the docs. Also, the built-in dis module works well for getting the most out of line_profiler.
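
To give a feel for building a tool from these hooks, here is a minimal call counter based on `sys.setprofile()`. It is an illustrative sketch, not how any of the profilers listed above are implemented; `helper` and `workload` are hypothetical.

```python
import sys
from collections import Counter

call_counts = Counter()


def count_calls(frame, event, arg):
    # "call" fires when a Python function is entered; C functions report "c_call".
    if event == "call":
        call_counts[frame.f_code.co_name] += 1


def helper():
    return 1


def workload():
    return sum(helper() for _ in range(5))


sys.setprofile(count_calls)
try:
    workload()
finally:
    # Always remove the hook, even if the workload raises.
    sys.setprofile(None)

print(call_counts.most_common())
```
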
  14. Online profiling
    Offline profiling involves hypothetical scenarios, historical data, and performance sacrifice. Online profiling offers: imperceptible performance impact; live data collection, in real time.
    Online profiling requires more time to be accurate. Still, ending speculation is worth the work.
    At PayPal we use sampro for its lightweight online collection process. We also use lithoxyl for semantic instrumentation, including performance measurements.
    github.com/doublereedkurt/sampro
    github.com/mahmoud/lithoxyl
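
The general idea behind a sampling profiler can be sketched with a background thread that periodically inspects `sys._current_frames()`. This is not sampro's API or implementation, only an assumption-laden illustration; the `Sampler` class and `busy` workload are hypothetical.

```python
import collections
import sys
import threading
import time


class Sampler:
    """Periodically records which function each other thread is executing."""

    def __init__(self, interval=0.005):
        self.interval = interval
        self.counts = collections.Counter()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        own_id = threading.get_ident()
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.interval):
            for thread_id, frame in sys._current_frames().items():
                if thread_id != own_id:
                    self.counts[frame.f_code.co_name] += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


def busy(seconds):
    # Hypothetical CPU-bound workload to sample.
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


sampler = Sampler()
sampler.start()
busy(0.3)
sampler.stop()
print(sampler.counts.most_common(3))
```

Because the sampler only wakes up every few milliseconds, its cost stays small, but it needs a longer observation window before the counts become representative.
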
  15. Scaling strategies
    There are eight ways to scale Python projects.
    1. Add more hardware
    2. Rearchitect to divide work
    3. Adopt the asynchronous approach
    4. Use a smarter algorithm
    5. Write faster Python
    6. Build native Python extensions
    7. Use a library with a faster implementation
    8. Use a different Python runtime
    At PayPal we’ve used all 8 of these, and most multi-engineer projects will use at least four or five.
  16. Add more hardware (Python scaling strategy #1)
    Adding more hardware is something enterprises often do best. It’s the obvious choice, and is either a “happy path” solution or a last-ditch desperation move.
    The Good: solves certain problems; easy to explain; the essence of scalability.
    The Bad: only solves certain problems; provisioning & deployment must scale; budget limits.
    If adding more machines solves all your problems in a reliable and cost-effective manner, congratulations on achieving scalability!
  17. Rearchitect to divide work (Python scaling strategy #2)
    Whether split across machines or CPUs, reworking your application to redistribute the work can make sense technically and organizationally.
    The Good: easy-to-explain principle; many SOA technologies.
    The Bad: easy to mess up; short-term correctness; long-term extensibility; “Now you have n^2 problems”.
    Scaling through abstractions like services and queues is such a common enterprise practice that there are bound to be replicable success stories within your organization.
  18. Adopt the asynchronous approach (Python scaling strategy #3)
    In IO-bound scenarios, this is the go-to method for getting more done.
    The Good: many libraries exist; great way to learn about operating systems and Python.
    The Bad: drastically changes application; complicates code, especially debugging; limits “off-the-shelf” library usage.
    Make sure you need it. Ask: What are the utilization requirements? How many requests per process are being handled now?
    Applications of substance are rarely only IO-bound, and the limitations brought by async can complicate feature development.
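
A small sketch of the IO-bound case, assuming the standard library's asyncio as the async framework (the slide does not name one). `asyncio.sleep` stands in for a network call, and `fetch` is a hypothetical helper.

```python
import asyncio
import time


async def fetch(name, delay):
    # asyncio.sleep stands in for a network call; it yields to the event loop.
    await asyncio.sleep(delay)
    return name


async def main():
    # The three waits overlap instead of running back to back.
    return await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.2),
        fetch("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

The three simulated requests finish in roughly the time of one, but every function on the call path has to become a coroutine, which is the "drastically changes application" cost noted on the slide.
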
  19. Use a smarter algorithm (Python scaling strategy #4)
    The ideal way to reduce work and speed up projects in CPU-bound scenarios.
    The Good: Python is runnable pseudocode! Many books and Wikipedia articles; good for those interview questions.
    The Bad: might be hard to find; often takes time to understand; language-specific algorithms.
    Python has many built-in examples of this approach: bisect and heapq for list operations, itertools for iterators, re for string operations, among others. Even naive implementations can be made smarter through careful caching. Always make sure your queues and caches are bounded!
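
A few of the built-ins named on this slide, sketched together: `bisect` for membership tests on a sorted list, `heapq` for top-N selection, and a bounded cache via `functools.lru_cache`. The data and the `contains` and `fib` helpers are hypothetical.

```python
import bisect
import heapq
from functools import lru_cache

sorted_ids = list(range(0, 1_000_000, 2))  # sorted even numbers


def contains(sorted_list, value):
    """Binary search: O(log n) instead of scanning the whole list."""
    i = bisect.bisect_left(sorted_list, value)
    return i < len(sorted_list) and sorted_list[i] == value


# Top-N without sorting the whole input.
top3 = heapq.nlargest(3, [5, 1, 9, 3, 7])


@lru_cache(maxsize=1024)  # bounded: old entries are evicted past 1024
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(contains(sorted_ids, 123_456), contains(sorted_ids, 123_457))
print(top3, fib(30))
```
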
  20. Write faster Python (Python scaling strategy #5)
    Python has obvious ways of doing things, but they aren't always the fastest.
    The Good: packaging & deployment stay the same; easy to measure and iterate; same debugging and profiling tools; Python builtins are fast!
    The Bad: gains can be small; minimal parallelization; code can be less clear.
    Relatively low-risk, high-reward, and easy to get started using aforementioned profiling techniques. Becomes second nature pretty readily.
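
One way to check the "builtins are fast" claim on a given machine is to time a hand-written loop against the equivalent builtin. This is a sketch with an arbitrary workload; actual timings vary by interpreter and hardware, so none are quoted here.

```python
import timeit


def total_loop(values):
    total = 0
    for v in values:
        total += v
    return total


def total_builtin(values):
    return sum(values)


values = list(range(10_000))

# Confirm both versions agree before comparing their speed.
same_result = total_loop(values) == total_builtin(values)

loop_time = min(timeit.repeat(lambda: total_loop(values), repeat=3, number=200))
builtin_time = min(timeit.repeat(lambda: total_builtin(values), repeat=3, number=200))
print(f"loop: {loop_time:.4f}s  builtin: {builtin_time:.4f}s  same result: {same_result}")
```
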
  21. Build a native Python extension (Python scaling strategy #6)
    Python’s close relationship with C is one of its greatest strengths.
    The Good: C has 2-10x less overhead; integrate with C open-source ecosystem; CPython is clean, idiomatic C; Cython is mature and greatly helpful.
    The Bad: C has riskier bugs; takes longer to get right; complicates build and deployment.
    Not for the faint of heart, but the partnership of Python and C is a key to performance and scale found in almost every Python enterprise.
  22. Use a faster implementation (Python scaling strategy #7)
    Python has a huge community, so let’s leverage those open-source libraries.
    The Good: Python standards; often “just works”; many libraries are drop-in.
    The Bad: availability and maturity; architectural compatibility; build and deployment constraints.
    There are so many success stories here: built-in (ElementTree vs cElementTree) and external (json vs ujson). Not to mention all the libraries based on NumPy for fast math. See the “Choosing Dependencies” segment for more specific guidance.
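
A common pattern for the json vs ujson case is an import with a fallback, sketched below. It assumes ujson may or may not be installed and relies only on the `dumps` and `loads` functions that both modules provide. (On current Python versions, `xml.etree.ElementTree` already uses the C accelerator when available, and `cElementTree` was removed in Python 3.9.)

```python
try:
    import ujson as json  # third-party; used only if it is installed
except ImportError:
    import json  # standard library fallback

payload = {"user": "example", "scores": [1, 2, 3]}  # hypothetical data
encoded = json.dumps(payload)
decoded = json.loads(encoded)
print(decoded)
```

Drop-in swaps like this are worth re-measuring with the profiling tools covered earlier, since output formatting and edge-case behavior can differ between implementations.
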
  23. Use a different runtime (Python scaling strategy #8)
    While CPython reigns supreme, Python does have alternate implementations, like PyPy and Jython, which can offer some scalability benefits in specific cases.
    The Good: 2-10x speedups, depending; easy to test for simple projects; work with cool new features; great for blog posts.
    The Bad: drastically changes deployment; architecture and environment limits; libraries are often incompatible; all large projects will need code rework.
    While some experimentation is fine, this type of change almost never happens, making it a very risky scaling strategy for substantial projects.
  24. Quick recap
    Remember and reference: how to define “performant”; how to stay safe when optimizing; how to measure; how to scale 8 ways from Sunday.
    Profiling and performance: GOTTA GO FAST
  25. Thanks!
    @mhashemi
    github.com/mahmoud
    sedimental.org
    paypal-engineering.com/tag/python
    O’Reilly’s Enterprise Software with Python