Goran Peretin
@gperetin
Freelancer
Hi!
Sunday, October 13, 13
Slide 4
Couple of notes...
• Python -> CPython (the reference implementation)
• Unix/Linux systems
Slide 5
There will be...
• explanations of how something works
• understand terms and concepts
Slide 6
There won’t be...
• turnkey solutions
• pip install concurrency (?)
• inner workings of these libraries
Slide 7
What is concurrency?
Executing multiple tasks in the same time frame.
Slide 8
Concurrency != Parallelism
Parallelism means executing multiple tasks at the same instant.
We need multiple CPUs (or cores) for that.
Slide 9
Embarrassingly parallel problem
• trivial to run in parallel
• no communication or synchronization needed
• your typical web app
Slide 10
Types of concurrency
Slide 11
Process based concurrency
if task == process:
Slide 12
Process based concurrency
• OS is the scheduler
• preemptive multitasking
• pros: simple
• cons: takes a lot of RAM, you can’t really have “a lot” of processes
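A minimal sketch of "one task per process", assuming a Unix system as noted at the start of the talk. The helper names (start_task, collect, run_concurrently, square) are hypothetical and only illustrate the idea; each task runs in its own forked process and the OS schedules them.

```python
import os
import pickle


def start_task(func, arg):
    """Fork a child process that computes func(arg) and writes it to a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: compute, send the pickled result, then exit.
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as pipe_out:
                pickle.dump(func(arg), pipe_out)
        finally:
            os._exit(0)  # never fall back into the parent's code path
    # Parent process: keep only the read end of the pipe.
    os.close(write_fd)
    return pid, read_fd


def collect(pid, read_fd):
    """Read the child's result and reap the child process."""
    with os.fdopen(read_fd, "rb") as pipe_in:
        result = pickle.load(pipe_in)
    os.waitpid(pid, 0)
    return result


def run_concurrently(func, args):
    """Start one process per argument, then gather results in order."""
    tasks = [start_task(func, arg) for arg in args]
    return [collect(pid, read_fd) for pid, read_fd in tasks]


def square(n):
    return n * n
```

Calling run_concurrently(square, [1, 2, 3]) starts three processes before waiting on any of them. Every child is a full copy of the interpreter, which is where the RAM cost on the slide comes from.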
Slide 13
Thread based concurrency
if task == thread:
Slide 14
Thread based concurrency
• OS is the scheduler
• pros: takes less RAM, still simple
• cons: well...
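A minimal sketch of "one task per thread" using the standard threading module. The helper name run_in_threads is hypothetical; the threads share memory, so results can be written straight into a shared list.

```python
import threading


def run_in_threads(func, args):
    """Run func(arg) for every arg, each in its own OS thread."""
    results = [None] * len(args)

    def worker(index, arg):
        # Each thread writes to its own slot, so no lock is needed here.
        results[index] = func(arg)

    threads = [
        threading.Thread(target=worker, args=(index, arg))
        for index, arg in enumerate(args)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Usage: run_in_threads(len, ["a", "bb"]) runs both calls in separate threads and returns the results in argument order.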
Slide 15
When we talk about threads and Python...
Slide 16
GIL
Slide 17
Global Interpreter Lock
Only one thread can execute Python bytecode at any point in time.
Threads can’t run Python code in parallel.
Slide 18
When can a thread run Python? When both of these happen:
• OS scheduler schedules that thread
• Thread manages to acquire the GIL
Slide 19
CPU bound vs IO bound task
CPU bound - uses a lot of CPU
IO bound - makes a lot of I/O requests
Slide 20
CPU bound vs IO bound task
Slide 21
How is this related to GIL?
Slide 22
CPU bound program
        1 Thread      2 Threads
        fib(35)       Thread(fib(25))
        fib(35)       Thread(fib(25))
real    0m5.044s      0m7.462s
user    0m5.040s      0m8.980s
sys     0m0.000s      0m2.728s
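A sketch of how a comparison like this could be reproduced. The slide does not show the benchmark source, so the fib implementation and the helper names (time_sequential, time_threaded) are assumptions; the timings you get depend on the machine and the CPython version.

```python
import threading
import time


def fib(n):
    """Naive recursive Fibonacci: pure CPU work, no I/O."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def time_sequential(n, repeat=2):
    """Call fib(n) `repeat` times, one after another, in the main thread."""
    start = time.perf_counter()
    for _ in range(repeat):
        fib(n)
    return time.perf_counter() - start


def time_threaded(n, count=2):
    """Call fib(n) once in each of `count` threads and wait for all of them."""
    threads = [threading.Thread(target=fib, args=(n,)) for _ in range(count)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start
```

Comparing time_sequential(30) with time_threaded(30) does the same amount of work both ways. Under the GIL the threaded run is not expected to be faster, because only one thread executes Python bytecode at a time.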
Slide 25
Yes, GIL is a problem for CPU bound tasks.
Slide 26
Let’s focus on I/O bound tasks
• if we have a CPU bound task, we have to use multiple processes
Slide 27
I/O bound program
        1 Thread      3 Threads
        urlopen()     Thread(urlopen())
        urlopen()     Thread(urlopen())
        urlopen()     Thread(urlopen())
real    0m0.375s      0m0.145s
user    0m0.028s      0m0.036s
sys     0m0.004s      0m0.000s
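A sketch of the same experiment without a network dependency. As an assumption, time.sleep stands in for urlopen(): like a blocking network call, it releases the GIL while it waits. The names fake_request, sequential_io_time and threaded_io_time are hypothetical.

```python
import threading
import time


def fake_request(delay):
    """Stand-in for urlopen(): blocks for `delay` seconds without using CPU."""
    time.sleep(delay)


def sequential_io_time(count=3, delay=0.1):
    """Make `count` blocking calls one after another."""
    start = time.perf_counter()
    for _ in range(count):
        fake_request(delay)
    return time.perf_counter() - start


def threaded_io_time(count=3, delay=0.1):
    """Make `count` blocking calls, each in its own thread."""
    threads = [
        threading.Thread(target=fake_request, args=(delay,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start
```

The sequential version waits for each call in turn, while the threaded version overlaps the waits, so its total time is close to a single delay.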
Slide 28
When a thread makes an I/O call...
• Python releases the GIL
• OS makes the request and suspends the thread until the response arrives
Slide 29
Blocking I/O
Slide 30
GIL might not be a problem for I/O bound tasks.
Slide 31
Might not?
• thread per request
The problem is the blocking part. When our program makes a
blocking I/O call, the OS scheduler suspends that thread and
we can’t run anything else in that thread.
Slide 32
Problem?
*A lot* of requests/tasks.
Slide 33
We would like to be able to run multiple things inside a
single OS thread (handle multiple requests). When one
request makes a blocking I/O call, continue processing
another request.
Slide 34
Non-blocking I/O
Slide 35
Non-blocking I/O
Let’s do an I/O call that returns immediately.
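A minimal sketch of an I/O call that returns immediately, assuming a Unix system and using a local socket pair instead of a real network connection. The helper names try_recv and demo are hypothetical.

```python
import socket


def try_recv(sock, size=1024):
    """Return available data, or None right away if nothing has arrived."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        # A blocking socket would wait here; a non-blocking one raises instead.
        return None


def demo():
    sender, receiver = socket.socketpair()
    try:
        receiver.setblocking(False)
        before = try_recv(receiver)   # nothing sent yet, returns immediately
        sender.sendall(b"ping")
        after = try_recv(receiver)    # data is waiting in the local buffer
        return before, after
    finally:
        sender.close()
        receiver.close()
```

The first try_recv call returns without waiting, which leaves the thread free to do other work. The open question is how the program finds out when data is ready, and that is what event-driven designs address.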
Slide 36
Non-blocking I/O gives us...
Event-driven concurrency
Slide 37
Event-driven concurrency - Python
Callbacks (Twisted, Tornado)
vs
Coroutines (Gevent, Eventlet)
Slide 38
Callback-based
• pass in a callback function with the I/O call
• pros: it’s not a hack
• cons: it’s callback based
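A toy sketch of the callback style, built on the standard selectors module rather than Twisted or Tornado (whose APIs are not shown on the slides). The names run_loop, on_readable and demo are hypothetical; the loop waits for readiness events and invokes the callback registered for each socket.

```python
import selectors
import socket


def run_loop(selector, pending):
    """Dispatch callbacks until `pending` registered sockets have fired."""
    while pending > 0:
        for key, _events in selector.select(timeout=1):
            callback = key.data          # the callback stored at registration
            callback(key.fileobj)
            selector.unregister(key.fileobj)
            pending -= 1


def demo():
    received = []
    selector = selectors.DefaultSelector()
    writer, reader = socket.socketpair()
    reader.setblocking(False)

    def on_readable(sock):
        # Runs only once the socket has data, so recv() will not block.
        received.append(sock.recv(1024))

    # "Pass in a callback with the I/O call": register interest plus callback.
    selector.register(reader, selectors.EVENT_READ, data=on_readable)
    writer.sendall(b"hello")
    run_loop(selector, pending=1)

    selector.close()
    writer.close()
    reader.close()
    return received
```

The program's logic ends up split between the code that registers interest and the callback that handles the result, which is the "it's callback based" downside from the slide.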
Slide 39
Coroutine-based
• use coroutines as microthreads
• pros: it’s not callback based
• cons: it’s a hack
Slide 40
Coroutine
A function that can suspend its execution and later
resume where it was suspended.
Greenlets.
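A minimal sketch of suspend and resume using plain Python generators. This is a swapped-in technique for illustration: greenlets (used by Gevent and Eventlet) are a different mechanism, but the idea of pausing a function and continuing it later is the same. The names task and round_robin are hypothetical.

```python
def task(name, steps, log):
    """A generator-based coroutine: records a step, then suspends itself."""
    for step in range(steps):
        log.append((name, step))
        yield  # suspend here; execution resumes on the next next() call


def round_robin(tasks):
    """A tiny scheduler: resume each task in turn until all are finished."""
    queue = list(tasks)
    while queue:
        current = queue.pop(0)
        try:
            next(current)          # resume where the task last suspended
        except StopIteration:
            continue               # task finished, drop it
        queue.append(current)
```

With log = [] and round_robin([task("a", 2, log), task("b", 2, log)]), the two tasks interleave inside a single OS thread: log ends up as [("a", 0), ("b", 0), ("a", 1), ("b", 1)].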
Slide 41
Coroutine pros & cons (seriously)
• pros: you don’t have to change the flow of your program
• cons:
  • because of the way it works, your program can’t make regular (blocking) I/O calls
  • also hard to debug
Slide 42
Callback or greenlets?
Slide 43
Future?
PEP 3156 - new async specification
Tulip - reference implementation of the PEP
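Tulip later landed in the standard library as asyncio (Python 3.4). A minimal sketch of what that style looks like, assuming Python 3.7 or newer for asyncio.run and using asyncio.sleep as a stand-in for a real network request. The names fake_request, fetch_all and demo are hypothetical.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Stand-in for a network call: suspends without blocking the thread."""
    await asyncio.sleep(delay)
    return name


async def fetch_all(names, delay=0.1):
    """Run all requests concurrently on one thread and collect the results."""
    return await asyncio.gather(*(fake_request(name, delay) for name in names))


def demo():
    start = time.perf_counter()
    results = asyncio.run(fetch_all(["a", "b", "c"]))
    return results, time.perf_counter() - start
```

The three waits overlap inside a single OS thread, so the total time is close to one delay rather than three, and the code still reads top to bottom without callbacks.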
Slide 44
So...
Just tell me what I should pip install to make it work...
Slide 45
If you really need *high* concurrency, Python probably
isn’t the tool for the job.