Thinking Outside the GIL

Have you ever written a small, elegant application that couldn't keep up with the growth of your data or user demand? Did your beautiful design end up buried in threads and locks? Did Python's very special Global Interpreter Lock make all of this an exercise in futility?

This talk is for you! With the combined powers of AsyncIO and multiprocessing, we'll redesign an old multithreaded application limited by the GIL into a modern solution that scales with the demand using only the standard library. No prior AsyncIO or multiprocessing experience required.

Presented at PyCon US 2018 in Cleveland: https://youtu.be/0kXaLh8Fz3k

John Reese

May 11, 2018

Transcript

  1. Thinking Outside the GIL With AsyncIO and Multiprocessing
     John Reese
     Production Engineer, Facebook
     @n7cmdr
     github.com/jreese
  2. What’s the GIL?
     • Global Interpreter Lock
     • One VM thread at a time
     • No concurrent memory access
     • I/O wait releases lock
  3. Stateful monitoring
     • Gather ~100M data points
     • Process and aggregate anomalies
     • Easy to add new checks
     • Simple deployment
     • Few dependencies
  4. #impact
     • One binary
     • Fetch the world
     • Process everything
     • Aggregate results
     • Thread pool for I/O
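The original design on this slide (fetch everything through a thread pool, then process and aggregate) might look roughly like the sketch below. The `fetch`, `process`, and `run` names and the fake data are hypothetical stand-ins, not code from the talk.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(source):
    """Hypothetical blocking fetch; the GIL is released while it waits."""
    time.sleep(0.01)  # stand-in for a network call
    return [len(source), len(source) * 2]


def process(points):
    """Hypothetical CPU-bound step; runs under the GIL."""
    return sum(points)


def run(sources, threads=4):
    # "Fetch the world": threads overlap the I/O waits.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        fetched = list(pool.map(fetch, sources))
    # "Process everything" and "aggregate results" in one interpreter.
    return sum(process(points) for points in fetched)


if __name__ == "__main__":
    print(run(["host-a", "host-b", "host-c"]))
```

Threads help only with the fetch step here; the processing step still runs one thread at a time, which is the limit the next slide describes.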
  5. Not aging well
     • Scales in time and memory
     • Runtime now too slow
     • Underutilizing hardware
     • Ultimately limited by the GIL
  6. Sharding
     • Technically correct
     • Scales with number of workers
     • Complicated deployments
     • Communication overhead
  7. Multiprocessing
     • Scales with CPU cores
     • Automatic IPC
     • Pool.map is really useful
     • One task per process
     • Beware forking, pickling
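As a minimal sketch of the `Pool.map` pattern named above: the `score` function is a hypothetical CPU-bound check invented for illustration. It is defined at module level so it can be pickled and sent to the worker processes.

```python
import multiprocessing


def score(value):
    """Hypothetical CPU-bound check: sum of the squared digits."""
    return sum(int(digit) ** 2 for digit in str(value))


def score_all(values, processes=2):
    # Pool.map pickles each argument, runs score() in a child process,
    # and returns the results in input order.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(score, values)


if __name__ == "__main__":
    # The guard matters when the start method is "spawn" or "forkserver":
    # children re-import this module and must not start pools of their own.
    print(score_all([12, 345, 6789]))
```

Each child process has its own interpreter and its own GIL, so the checks can run on several cores at once. The cost is that every argument and result crosses a process boundary through pickling.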
  8. AsyncIO
     • Based on futures
     • Faster than threads
     • Massive I/O concurrency
     • Processing still limited by GIL
     • Beware timeouts and queue length
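One way to act on the "timeouts and queue length" warning is to cap the number of requests in flight and put a timeout on each one. The sketch below assumes a hypothetical `fetch` coroutine that simulates I/O with `asyncio.sleep`; the limit and timeout values are arbitrary.

```python
import asyncio


async def fetch(host, delay):
    """Hypothetical request; asyncio.sleep stands in for network I/O."""
    await asyncio.sleep(delay)
    return f"{host}: ok"


async def fetch_bounded(host, delay, limit, timeout):
    # The semaphore caps how many requests run at once. The timeout clock
    # starts only after a slot is acquired, so time spent queued is not
    # counted against the request.
    async with limit:
        try:
            return await asyncio.wait_for(fetch(host, delay), timeout)
        except asyncio.TimeoutError:
            return f"{host}: timed out"


async def fetch_all(hosts, max_in_flight=100, timeout=0.5):
    limit = asyncio.Semaphore(max_in_flight)
    tasks = (fetch_bounded(host, delay, limit, timeout) for host, delay in hosts)
    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(fetch_all([("a", 0.01), ("b", 1.0)], timeout=0.1)))
```

Without a cap, scheduling every request at once can leave late requests waiting so long that they time out before doing any work.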
  9. Multiprocessing + AsyncIO
     • Use multiprocessing primitives
     • Event loop per process
     • Queues for work/results
     • Highly parallel workload
     • Need to do some plumbing
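The plumbing described above could be sketched as follows. This is an illustration under assumptions, not the talk's implementation: `fetch` is a hypothetical coroutine, a `None` item is used as a shutdown signal, and error handling is omitted.

```python
import asyncio
import multiprocessing


async def fetch(item):
    """Hypothetical I/O-bound coroutine."""
    await asyncio.sleep(0.01)
    return item * 2


async def process_chunk(chunk):
    # Run every item in the chunk concurrently on this process's loop.
    return await asyncio.gather(*(fetch(item) for item in chunk))


async def worker_main(work_queue, result_queue):
    loop = asyncio.get_running_loop()
    while True:
        # Queue.get() blocks, so run it in a thread to keep the loop free.
        chunk = await loop.run_in_executor(None, work_queue.get)
        if chunk is None:  # shutdown signal
            break
        result_queue.put(await process_chunk(chunk))


def worker(work_queue, result_queue):
    # One event loop per child process.
    asyncio.run(worker_main(work_queue, result_queue))


def run(chunks, workers=2):
    work_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=worker, args=(work_queue, result_queue))
        for _ in range(workers)
    ]
    for proc in procs:
        proc.start()
    for chunk in chunks:
        work_queue.put(chunk)
    for _ in procs:
        work_queue.put(None)
    # Drain results before joining so no child blocks on a full queue.
    results = [result_queue.get() for _ in chunks]
    for proc in procs:
        proc.join()
    return results


if __name__ == "__main__":
    # Results arrive in completion order, so sort for stable output.
    print(sorted(run([[1, 2], [3, 4], [5, 6]])))
```

Each process gets I/O concurrency from its own event loop, and the set of processes provides CPU parallelism across cores.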
  10. Considerations
     • Minimize what you pickle
     • Prechunk work items
     • Aggregate results in the child
     • Use map/reduce
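These considerations can be combined in one small map/reduce sketch: work is chunked before it is sent, each child returns a compact summary instead of raw data points, and the parent merges the summaries. The threshold of 100 and the `summarize` and `merge` names are made up for the example.

```python
import multiprocessing
from collections import Counter
from functools import reduce


def chunked(items, size):
    """Split work up front so each task pickles one list, not many items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def summarize(chunk):
    """Map step, run in the child: return a small Counter, not raw points."""
    return Counter("anomaly" if value > 100 else "normal" for value in chunk)


def merge(left, right):
    """Reduce step, run in the parent: combine two partial summaries."""
    return left + right


if __name__ == "__main__":
    data = list(range(200))
    with multiprocessing.Pool(processes=2) as pool:
        partials = pool.map(summarize, chunked(data, 50))
    print(reduce(merge, partials, Counter()))
```

Only the chunks going out and the small counters coming back are pickled, which keeps the communication overhead low relative to the work done in each child.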
  11. aiomultiprocess
     • Simple implementation
     • Emulates multiprocessing API
     • One shot or process pool
     • Supports map/reduce workloads
     github.com/jreese/aiomultiprocess