Parallelism in Python - Kiwi PyCon X

Python has a terrible reputation when it comes to its parallel processing capabilities.

Ignoring the standard arguments about its threads and the GIL (which are mostly valid), the real problem with parallelism in Python isn't a technical one, but a pedagogical one. The common tutorials surrounding Threading and Multiprocessing in Python, while generally excellent, are pretty “heavy.” They start with the intense stuff and stop before they get to the really good, day-to-day useful parts.

The talk will highlight the impact of the Global Interpreter Lock (GIL) on multithreaded programs and include code snippets that compare single-threaded and multithreaded programs using a real-life example.
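The kind of single-threaded versus multithreaded comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from the talk: the names `countdown`, `time_single` and `time_threaded` and the loop size are assumptions. On a standard CPython build with the GIL, the threaded run of a CPU-bound loop is typically no faster than the single-threaded one, though exact timings depend on the machine and interpreter.

```python
import threading
import time


def countdown(n):
    """CPU-bound busy loop: decrement n until it reaches zero."""
    while n > 0:
        n -= 1


def time_single(n):
    """Run the whole countdown in the calling thread; return elapsed seconds."""
    start = time.perf_counter()
    countdown(n)
    return time.perf_counter() - start


def time_threaded(n, workers=2):
    """Split the countdown across `workers` threads; return elapsed seconds."""
    threads = [
        threading.Thread(target=countdown, args=(n // workers,))
        for _ in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N = 5_000_000  # illustrative size, chosen to finish quickly
    print(f"single thread: {time_single(N):.2f} s")
    print(f"two threads:   {time_threaded(N):.2f} s")
```

Both functions do the same total amount of work, so any difference between the two printed numbers comes from how the interpreter schedules the threads rather than from the workload.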

Then, we move on to the concept of multiprocessing, address its benefits over multithreading, and mention some examples of it.
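As a rough sketch of the multiprocessing approach, the example below spreads a CPU-bound function over a pool of worker processes using the standard library's `multiprocessing.Pool`. The workload (`count_primes`) and the helper `parallel_count` are hypothetical stand-ins chosen for illustration; they are not taken from the talk.

```python
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_count(limits, workers=None):
    """Evaluate count_primes for each limit in a separate worker process."""
    # Each worker is its own Python process with its own interpreter,
    # so the workers do not contend for a single GIL.
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The __main__ guard matters: platforms that spawn fresh processes
    # re-import this module in every worker.
    print(parallel_count([10, 100, 1000], workers=2))
```

The trade-off is that arguments and results are pickled and sent between processes, so this pays off mainly when each task does enough computation to outweigh that cost.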

The talk will end with an application of multiprocessing in the area of Computer Vision and how it speeds up the completion of a task by 82%.

Moreover, this talk aims to inspire both novices and experts to look more into this subject and create something that could possibly be of benefit to the entire Python community around the world.

Rounak Vyas

August 24, 2019

Transcript

  1. About Me • Final year CS student • Using Python for the last 3 years. • Student Researcher at Next Tech Lab
  2. Overview • Python has enjoyed a decade of usage in industry and academia. • Popular abstractions for scientific computing, AI/ML, etc. • Yet, has a bad rep for its parallel processing capabilities.
  3. How the interpreter works: Python uses reference counting for memory management. The reference count variable needs protection from race conditions. Source: Real Python
  4. How the interpreter works: This count variable can be kept safe by adding locks to all data structures. But adding a lock to each object means multiple locks, which can lead to deadlocks and decreased performance.
  5. Global Interpreter Lock (GIL) • A mutex (or a lock). • Allows only one thread at a time to hold control of the Python interpreter.
  6. Impact of GIL on multi-threaded programs: Multiple threads ~ 3.66 secs (Overhead). Source: Real Python
  7. When is the GIL not a problem? I/O-bound tasks: anything that blocks the current thread without consuming much CPU.
  8. When is it a problem? CPU-bound tasks: tasks that mostly consume CPU time, like heavy computations or moving lots of data around in memory (sorting, shuffling).
  9. Introducing: Multi-Processing. Each Python process gets its own Python interpreter and memory space, so the GIL won’t be a problem.
  10. Real Life Example: Thumbnailing thousands of images. A common CPU-bound task for someone working on vision, image processing, etc.
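The thumbnailing example from the last slide might look something like the sketch below. The transcript does not show the speaker's code or name a library, so everything here is an assumption: it uses the third-party Pillow package, a hypothetical 128x128 bounding box, a `_thumb` file-name suffix, and only picks up `*.jpg` files.

```python
import sys
from functools import partial
from multiprocessing import Pool
from pathlib import Path

SIZE = (128, 128)  # hypothetical thumbnail bounding box


def thumbnail_path(src, out_dir):
    """Return the output path for `src`, e.g. cat.jpg -> <out_dir>/cat_thumb.jpg."""
    src = Path(src)
    return Path(out_dir) / f"{src.stem}_thumb{src.suffix}"


def make_thumbnail(src, out_dir):
    """Resize one image and save it; returns the path that was written."""
    # Pillow is an assumed third-party dependency. It is imported here so
    # the module still loads on systems where Pillow is not installed.
    from PIL import Image

    dst = thumbnail_path(src, out_dir)
    with Image.open(src) as img:
        img.thumbnail(SIZE)  # resizes in place, preserving aspect ratio
        img.save(dst)
    return dst


def make_thumbnails(src_dir, out_dir, workers=None):
    """Thumbnail every .jpg in src_dir using a pool of worker processes."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    images = sorted(Path(src_dir).glob("*.jpg"))
    with Pool(processes=workers) as pool:
        return pool.map(partial(make_thumbnail, out_dir=out_dir), images)


if __name__ == "__main__":
    # Usage: python thumbs.py <source_dir> <output_dir>
    if len(sys.argv) == 3:
        written = make_thumbnails(sys.argv[1], sys.argv[2])
        print(f"wrote {len(written)} thumbnails")
```

Each image is independent of the others, which is what makes this task a natural fit for a process pool: the work can be divided with no shared state between workers.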