Thinking Outside the GIL

Have you ever written a small, elegant application that couldn't keep up with the growth of your data or user demand? Did your beautiful design end up buried in threads and locks? Did Python's very special Global Interpreter Lock make all of this an exercise in futility?

This talk is for you! With the combined powers of AsyncIO and multiprocessing, we'll redesign an old multithreaded application limited by the GIL into a modern solution that scales with the demand using only the standard library. No prior AsyncIO or multiprocessing experience required.

Presented at PyCon US 2018 in Cleveland: https://youtu.be/0kXaLh8Fz3k

John Reese

May 11, 2018

Transcript

  1. Thinking Outside the GIL
    With AsyncIO and Multiprocessing
    John Reese
    Production Engineer, Facebook
    @n7cmdr

    github.com/jreese

  2. What’s the GIL?
    • Global Interpreter Lock
    • One VM thread at a time
    • No concurrent memory access
    • I/O wait releases lock
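
The bullets above can be observed with a small timing experiment. This is a minimal sketch, assuming a standard CPython build that has a GIL; `io_bound`, `cpu_bound`, and `timed_threads` are hypothetical names made up for illustration, and `time.sleep` stands in for real network I/O.

```python
import threading
import time


def io_bound():
    # Stand-in for a blocking network call; sleeping releases the GIL.
    time.sleep(0.2)


def cpu_bound():
    # Pure Python bytecode; the running thread holds the GIL while it computes.
    sum(i * i for i in range(200_000))


def timed_threads(target, count):
    """Run `target` in `count` threads and return the wall-clock seconds taken."""
    threads = [threading.Thread(target=target) for _ in range(count)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Four sleeps overlap, so this takes roughly one sleep, not four.
    print(f"4 I/O threads: {timed_threads(io_bound, 4):.2f}s")
    # CPU-bound threads take turns holding the GIL, so adding threads
    # is not expected to shorten the per-task time.
    print(f"1 CPU thread:  {timed_threads(cpu_bound, 1):.2f}s")
    print(f"4 CPU threads: {timed_threads(cpu_bound, 4):.2f}s")
```

On a typical machine the I/O case finishes in about the time of one sleep, while the four CPU-bound threads take about four times as long as one.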

  3. Stateful monitoring
    • Gather ~100M data points
    • Process and aggregate anomalies
    • Easy to add new checks
    • Simple deployment
    • Few dependencies

  4. #impact
    • One binary
    • Fetch the world
    • Process everything
    • Aggregate results
    • Thread pool for I/O
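
The shape described above (fetch with a thread pool, then process and aggregate in one place) might look roughly like the sketch below. It is not the talk's actual code: `fetch`, `check`, and `run` are hypothetical, and the fetch is simulated with a short sleep.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fetch(host):
    """Hypothetical stand-in for a network fetch of one data point."""
    time.sleep(0.01)  # blocking I/O wait; other threads can run meanwhile
    return {"host": host, "load": len(host) % 5}


def check(point):
    """Hypothetical check that classifies one data point."""
    return "high_load" if point["load"] >= 3 else "ok"


def run(hosts, max_threads=8):
    # The thread pool overlaps the I/O waits of the fetches.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        points = list(pool.map(fetch, hosts))
    # Processing and aggregation run in a single thread, so this part
    # is limited to one core's worth of Python execution.
    return Counter(check(point) for point in points)


if __name__ == "__main__":
    print(run(["web1", "db-primary", "cache"]))
```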

  5. Not aging well
    • Scales in time and memory
    • Runtime now too slow
    • Underutilizing hardware
    • Ultimately limited by the GIL

  6. Give me options

  7. Switch to py3
    • ~45% memory savings
    • ~20% runtime reduction

  8. Sharding
    • Technically correct
    • Scales with number of workers
    • Complicated deployments
    • Communication overhead

  9. Multiprocessing
    • Scales with CPU cores
    • Automatic IPC
    • Pool.map is really useful

  10. Multiprocessing
    • Scales with CPU cores
    • Automatic IPC
    • Pool.map is really useful
    • One task per process
    • Beware forking, pickling
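
As a rough illustration of the `Pool.map` bullet, here is a minimal sketch using only the standard library. `is_anomaly` and `count_anomalies` are hypothetical names; the check itself is a placeholder for real CPU-bound work.

```python
import multiprocessing


def is_anomaly(value):
    """Hypothetical CPU-bound check.

    Defined at module top level so it can be pickled and sent to workers.
    """
    return value % 7 == 0


def count_anomalies(values, processes=None):
    # processes=None lets the pool pick one worker per CPU core.
    with multiprocessing.Pool(processes=processes) as pool:
        # Arguments and results are pickled and passed between processes
        # automatically; chunksize reduces the number of IPC round trips.
        flags = pool.map(is_anomaly, values, chunksize=1000)
    return sum(flags)


if __name__ == "__main__":
    # The __main__ guard matters when child processes re-import this module.
    print(count_anomalies(range(100_000)))
```

Each worker handles one task at a time, so this helps with CPU-bound work but does nothing for I/O concurrency within a worker.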

  11. AsyncIO
    • Based on futures
    • Faster than threads
    • Massive I/O concurrency

  12. AsyncIO
    • Based on futures
    • Faster than threads
    • Massive I/O concurrency
    • Processing still limited by GIL
    • Beware timeouts and queue length
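
One way to get high I/O concurrency while respecting the warning about timeouts and queue length is to bound the number of in-flight coroutines. The sketch below assumes Python 3.7+ for `asyncio.run`; `fetch` and `fetch_all` are hypothetical, and the network call is simulated with `asyncio.sleep`.

```python
import asyncio


async def fetch(item):
    """Hypothetical stand-in for an async network request."""
    await asyncio.sleep(0.01)
    return item * 2


async def fetch_all(items, limit=100, timeout=5.0):
    # The semaphore caps how many fetches are in flight at once.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(item):
        async with semaphore:
            # The timeout clock starts only once a slot is acquired, so
            # time spent waiting in line does not count against the fetch.
            return await asyncio.wait_for(fetch(item), timeout)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(item) for item in items))


if __name__ == "__main__":
    print(asyncio.run(fetch_all(range(10), limit=3)))
```

Everything still runs in one thread, so any CPU-heavy processing of the results remains limited by the GIL.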

  13. Why not both?

  14. Multiprocessing + AsyncIO
    • Use multiprocessing primitives
    • Event loop per process
    • Queues for work/results
    • Highly parallel workload
    • Need to do some plumbing
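
The plumbing mentioned above could be sketched as follows: each child process runs its own event loop, pulls items from a shared work queue, and pushes results to a result queue. This is an illustrative sketch, not the talk's implementation; `handle`, `worker_loop`, `worker`, `run`, and the `STOP` sentinel are all hypothetical.

```python
import asyncio
import multiprocessing
import queue

STOP = None  # sentinel telling a worker to stop reading the work queue


async def handle(item):
    """Hypothetical coroutine for one work item (simulated I/O)."""
    await asyncio.sleep(0.01)
    return item * item


async def worker_loop(work_queue, result_queue, concurrency=16):
    """Drain work_queue, keeping up to `concurrency` coroutines in flight."""
    pending = set()
    done_reading = False
    while not done_reading or pending:
        # Top up the set of running tasks without blocking the event loop.
        while not done_reading and len(pending) < concurrency:
            try:
                item = work_queue.get_nowait()
            except queue.Empty:
                break
            if item is STOP:
                done_reading = True
            else:
                pending.add(asyncio.create_task(handle(item)))
        if not pending:
            await asyncio.sleep(0.005)  # nothing to do yet; poll again shortly
            continue
        finished, pending = await asyncio.wait(
            pending, timeout=0.05, return_when=asyncio.FIRST_COMPLETED
        )
        for task in finished:
            result_queue.put(task.result())


def worker(work_queue, result_queue):
    # One event loop per process.
    asyncio.run(worker_loop(work_queue, result_queue))


def run(items, processes=2):
    items = list(items)
    ctx = multiprocessing.get_context("spawn")
    work_queue, result_queue = ctx.Queue(), ctx.Queue()
    workers = [
        ctx.Process(target=worker, args=(work_queue, result_queue))
        for _ in range(processes)
    ]
    for process in workers:
        process.start()
    for item in items:
        work_queue.put(item)
    for _ in workers:
        work_queue.put(STOP)  # one sentinel per worker
    # Collect results before joining so the queues can drain.
    results = [result_queue.get() for _ in items]
    for process in workers:
        process.join()
    return results


if __name__ == "__main__":
    print(sorted(run(range(10))))
```

Results arrive in completion order, not input order, which is why the usage line sorts them.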

  15. Optimizations
    • Multiple work queues
    • Combine tasks into batches
    • Use spawned processes

  16. Considerations
    • Minimize what you pickle
    • Prechunk work items
    • Aggregate results in the child
    • Use map/reduce
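
These considerations can be combined in a small map/reduce sketch: the parent sends pre-built chunks, each child returns one small summary per chunk, and the parent merges the summaries. The names `chunked`, `classify`, `summarize`, `merge`, and `map_reduce` are hypothetical, and the check is a placeholder.

```python
import multiprocessing
from collections import Counter
from functools import reduce


def chunked(items, size):
    """Split items into lists of at most `size` elements."""
    items = list(items)
    return [items[i:i + size] for i in range(0, len(items), size)]


def classify(value):
    """Hypothetical check applied to a single data point."""
    return "anomaly" if value % 10 == 0 else "ok"


def summarize(chunk):
    """Map step, run in the child: return one small Counter per chunk.

    Only the summary is pickled back, not one result per item.
    """
    return Counter(classify(value) for value in chunk)


def merge(left, right):
    """Reduce step, run in the parent."""
    return left + right


def map_reduce(items, chunk_size=1000, processes=None):
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=processes) as pool:
        partials = pool.map(summarize, chunked(items, chunk_size))
    return reduce(merge, partials, Counter())


if __name__ == "__main__":
    print(map_reduce(range(10_000)))
```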

  17. Performance comparison
    [Chart: items processed per second for Threads, AsyncIO, Multiprocessing, Multi/Async Naive, Multi/Async Tuned, Multi/Async Map/Reduce]

  18. $ pip install aiomultiprocess

  19. aiomultiprocess
    • Simple implementation
    • Emulates multiprocessing API
    • One shot or process pool
    • Supports map/reduce workloads
    github.com/jreese/aiomultiprocess
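
A minimal usage sketch of the process pool, based on the library's documented `Pool` and `pool.map` interface; check the project README for the current API. The `square` coroutine is a hypothetical placeholder for real async work.

```python
import asyncio


async def square(value):
    """Hypothetical coroutine; each call runs on an event loop in a child process."""
    await asyncio.sleep(0.01)  # stand-in for async I/O
    return value * value


async def main():
    # Third-party import kept local to main(); requires
    # `pip install aiomultiprocess`.
    from aiomultiprocess import Pool

    async with Pool() as pool:
        # Mirrors multiprocessing.Pool.map, but maps a coroutine function.
        return await pool.map(square, range(10))


if __name__ == "__main__":
    print(asyncio.run(main()))
```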

  20. Python is slow

  21. Python is ~~slow~~ powerful

  22. Great tools make
    complex tasks simple

  23. John Reese
    Production Engineer, Facebook
    @n7cmdr

    github.com/jreese