PyPy: Python without the GIL by Armin Rigo and Maciej Fijałkowski

PyCon 2013

March 15, 2013

Transcript

  1. 1.

    Python without the GIL
    Armin Rigo, Maciej Fijałkowski
    PyCon US 2013, March 15 2013
    arigo, fijal (PyCon US 2013) Python without the GIL March 15 2013
  2. 2.

    Intro
    - PyPy is a Python interpreter with stuff
    - No general PyPy talk this year, find us around, come to the BoF (tomorrow 2pm)
  3. 3.

    This is about...
    - This talk is about using multiple cores to achieve better performance in Python (or any other existing, non-trivial, non-functional, non-designed-for-this-problem language)
  4. 4.

    Problem
    - An existing complex, large program that does stuff
    - “stuff” consists of bits that are mostly independent from each other
    - ... but not quite
  6. 6.

    Problem
    - We want to parallelize the program to use all these cores
    - We have some shared mutable state
    - Not too much of it, otherwise no chance for parallelism
    - But still some
  8. 8.

    Classic solutions
    Bare-metal multi-threading:
    - large shared state
    - needs careful usage of locks
    - mostly hindered by the GIL in CPython (but not in Jython or IronPython)
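As an illustration of the "careful usage of locks" point, here is a minimal sketch in plain Python threading. The `SharedCounter` class and `run_workers` helper are hypothetical names chosen for this example; they are not from the talk.

```python
import threading


class SharedCounter:
    """Shared mutable state guarded by an explicit lock (hypothetical example)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below must not interleave with other threads,
        # so every access to self.value goes through the lock.
        with self._lock:
            self.value += 1


def run_workers(num_threads, increments_per_thread):
    counter = SharedCounter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Forgetting the lock in just one code path is enough to reintroduce a race, which is what makes this style hard to scale to a large shared state.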
  9. 9.

    Classic solutions
    Multi-processing:
    - no shared mutable state at all
    - copying, keeping in sync is your problem
    - serializing and deserializing is expensive and hard
    - memory usage is often multiplied (unless you’re lucky with fork, but not on Python)
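The "no shared mutable state" point can be sketched without spawning a process: data crossing a process boundary is serialized and rebuilt, so the receiver works on a copy. The `send_to_worker` helper below is a hypothetical stand-in for that boundary, using only the standard `pickle` module.

```python
import pickle


def send_to_worker(state):
    """Simulate handing state to another process: serialize, then rebuild."""
    return pickle.loads(pickle.dumps(state))


parent_state = {"hits": 0, "users": ["arigo", "fijal"]}
worker_state = send_to_worker(parent_state)

# The worker updates its private copy; the parent never sees the change
# unless the program explicitly sends it back and merges it.
worker_state["hits"] += 1
```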
  10. 10.

    Classic solutions
    A range of intermediate solutions:
    - MPI: message passing, with limited shared state
    - etc.: tons of experiments that never caught on in the mainstream
  11. 11.

    Classic solutions
    The typical solution for web servers:
    - run independent processes
    - share data only via the database
    - the database itself handles concurrency with transactions
  12. 12.

    Demo
    - pypy-stm
    - internally based on “transactions” (STM, HTM)
  13. 14.

    Status of the implementation
    - mostly works
    - major GC collections missing (leaks slowly)
    - JIT integration is not done
    - tons of optimizations possible
  14. 15.

    How do I use it?
    - just like with the GIL
    - __pypy__.thread.atomic
        with atomic:
            print "hello", username
    - the illusion of serialization
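A minimal sketch of how the `with atomic:` block might be used from ordinary threads. The import path follows the slide and is expected to exist only on a pypy-stm build; the fallback to a single re-entrant lock is an assumption added here so the example runs on other interpreters (same serial semantics, no parallelism). The `greet` and `run_greeters` names are hypothetical.

```python
import threading

try:
    # Per the slide: provided by pypy-stm.
    from __pypy__.thread import atomic
except ImportError:
    # Assumption for this sketch: emulate "atomic" with one global lock.
    atomic = threading.RLock()

log = []


def greet(username):
    with atomic:
        # Both appends appear to other threads as one indivisible step,
        # so "hello" is always immediately followed by its username.
        log.append("hello")
        log.append(username)


def run_greeters(usernames):
    del log[:]
    threads = [threading.Thread(target=greet, args=(u,)) for u in usernames]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(log)
```

The order in which users are greeted is not fixed, but the two entries written inside one atomic block are never separated.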
  15. 16.

    STM - low level
    - STM = Software Transactional Memory
    - Basic idea: each thread runs in parallel, but everything it does is in a series of “transactions”
    - A transaction keeps all changes to pre-existing memory “local”
    - The changes are made visible only when the transaction “commits”
  16. 17.

    STM - low level (2)
    - The transaction will “abort” if a conflict is detected, and it will be transparently retried
    - Non-reversible operations like I/O turn the transaction “inevitable” and stop progress in the other threads
    - __pypy__.thread.last_abort_info() -> traceback-like information
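The commit/abort/retry cycle described on the two STM slides can be sketched as a toy in pure Python. This is not how pypy-stm is implemented; it is a simplified optimistic scheme with per-variable version numbers, and every name in it (`TVar`, `Transaction`, `Conflict`, `atomically`, `demo`) is hypothetical.

```python
import threading


class Conflict(Exception):
    """Raised when a transaction has to abort."""


class TVar:
    """A shared variable with a version number bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version seen on first read
        self.writes = {}  # TVar -> pending value, kept local until commit

    def read(self, tvar):
        if tvar in self.writes:
            return self.writes[tvar]
        with _commit_lock:  # take a consistent (value, version) snapshot
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()  # someone committed since our first read
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value  # invisible to other threads for now

    def commit(self):
        with _commit_lock:
            for tvar, version in self.reads.items():
                if tvar.version != version:
                    raise Conflict()
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1


def atomically(func):
    """Run func(tx) as a transaction, retrying transparently on conflict."""
    while True:
        tx = Transaction()
        try:
            result = func(tx)
            tx.commit()
            return result
        except Conflict:
            continue  # abort: drop local changes and start over


def demo(num_threads=4, increments=200):
    counter = TVar(0)

    def work():
        for _ in range(increments):
            atomically(lambda tx: tx.write(counter, tx.read(counter) + 1))

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

A conflict here only costs a retry; the final value is the same as if the increments had run one after another.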
  17. 18.

    Alternative - HTM
    - Intel Haswell (released soon) has got HTM
    - great for the “remove the GIL” part
    - not so great for large transactions, at least for now
  18. 19.

    Higher level: Threads Are Bad
    - based on (and fully compatible with) threads
    - existing multithreaded programs work
    - but opens up unexpected alternatives
  19. 20.

    Higher level: Atomic
    - we can run multiple threads but at the same time use atomic
    - with the GIL-based idea of atomic it wouldn’t make sense
        - multiple threads
        - but they’re all using atomic
        - i.e. only one at a time will ever run
    - ...except no :-)
  20. 21.

    Transactions
    - transaction.py: example of wrapper hiding threads
    - illusion of serial execution: can “sanely” reason about algorithms
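To make "a wrapper hiding threads" concrete, here is a rough sketch of what such a wrapper could look like. It is not the actual transaction.py; the `TransactionQueue` class and its `add`/`run` methods are hypothetical, and a plain lock stands in for pypy-stm's atomic, so this version gives the serial illusion without real parallelism.

```python
import threading
from collections import deque


class TransactionQueue:
    """Queue callables, then run each one as an atomic step on a hidden pool."""

    def __init__(self, num_threads=4):
        self._pending = deque()
        self._num_threads = num_threads
        self._atomic = threading.RLock()  # stand-in for pypy-stm's atomic
        self._cond = threading.Condition()
        self._running = 0

    def add(self, func, *args):
        # May be called before run() or from inside a running task.
        with self._cond:
            self._pending.append((func, args))
            self._cond.notify()

    def _worker(self):
        while True:
            with self._cond:
                # Wait while the queue is empty but a task might still add work.
                while not self._pending and self._running > 0:
                    self._cond.wait()
                if not self._pending:
                    self._cond.notify_all()
                    return
                func, args = self._pending.popleft()
                self._running += 1
            try:
                with self._atomic:
                    func(*args)
            finally:
                with self._cond:
                    self._running -= 1
                    self._cond.notify_all()

    def run(self):
        threads = [threading.Thread(target=self._worker)
                   for _ in range(self._num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


def count_words(words):
    totals = {}

    def count(word):
        # Written as if serial: no locks in user code.
        totals[word] = totals.get(word, 0) + 1

    tq = TransactionQueue()
    for word in words:
        tq.add(count, word)
    tq.run()
    return totals
```

The user code in `count` never mentions threads or locks, which is the point of the wrapper: the program can be reasoned about as if the queued calls ran one at a time.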
  21. 22.

    Transaction conflicts
    - “Conflict tracebacks”
    - Might need to debug them, but they are performance bugs, not correctness bugs
    - The idea is that we fix only XX% of the bugs and we are done
    - Maybe some new “performance debugger” tools might help too
  22. 23.

    We’re not the only ones
    - TigerQuoll, 1st March: same idea with JavaScript (for servers)
    - Various models possible:
        - event dispatchers
        - futures
        - map/reduce, scatter/gather
  23. 24.

    Event dispatchers
    - twisted, tulip, etc.
    - run the event dispatcher in one thread (e.g. the main thread), and schedule the actual events to run on a different thread from a pool
    - the events are run with atomic, so that they appear to run serially
    - does not rely on any change to the user program
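The dispatcher scheme above can be sketched with the standard library alone. This is not Twisted or tulip code; `dispatch`, `run_atomically`, and `process_deposits` are hypothetical names, and a lock again stands in for pypy-stm's atomic.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

atomic = threading.RLock()  # stand-in for pypy-stm's atomic (assumption)


def run_atomically(handler, event):
    # Each event handler runs as one atomic step, so handlers appear serial.
    with atomic:
        return handler(event)


def dispatch(events, handler, workers=4):
    """Dispatcher loop: stays in the calling thread, hands events to a pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        while True:
            event = events.get()
            if event is None:  # sentinel: no more events
                break
            futures.append(pool.submit(run_atomically, handler, event))
        return [f.result() for f in futures]


def process_deposits(amounts):
    account = {"balance": 0}

    def on_deposit(amount):
        # Unmodified "user program" code: no locking of its own.
        account["balance"] += amount
        return account["balance"]

    events = queue.Queue()
    for amount in amounts:
        events.put(amount)
    events.put(None)
    dispatch(events, on_deposit)
    return account["balance"]
```

Only the dispatcher knows about the pool and the atomic wrapper; the handler is the same function one would write for a single-threaded event loop.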
  24. 25.

    Donate!
    - STM is primarily funded by donations
    - We got quite far with $22k USD
    - Thanks to the PSF and all others!
    - We need your help too