Slide 1

Slide 1 text

A TALE OF CONCURRENCY THROUGH CREATIVITY IN PYTHON: A DEEP DIVE INTO HOW GEVENT WORKS

Slide 2

Slide 2 text

KAVYA

Slide 3

Slide 3 text

GEVENT

Slide 4

Slide 4 text

What is asynchronous I/O? What is gevent?

Slide 5

Slide 5 text

download_photos network

Slide 6

Slide 6 text

def download_photos(user):
    # Open a connection to the server
    conn = get_authenticated_connection(user)
    # Download all photos
    photos = get_photos(conn)
    # Save for later display
    save_photos(user, photos)

Slide 7

Slide 7 text

def downloader():
    users = get_users()
    for user in users:
        download_photos(user)

network I/O

Slide 8

Slide 8 text

import multiprocessing threading twisted green_thread ?

Slide 9

Slide 9 text

import multiprocessing

Slide 10

Slide 10 text

import multiprocessing as mp

def downloader():
    pool = []
    for user in users:
        # Process takes the callable as target= and its arguments as args=
        p = mp.Process(target=download_photos, args=(user,))
        pool.append(p)
        p.start()
    for p in pool:
        p.join()

Slide 11

Slide 11 text

import threading

Slide 12

Slide 12 text

import threading

def downloader():
    pool = []
    for user in users:
        # Thread takes the callable as target= and its arguments as args=
        t = threading.Thread(target=download_photos, args=(user,))
        pool.append(t)
        t.start()
    for t in pool:
        t.join()

Slide 13

Slide 13 text

import twisted

Slide 14

Slide 14 text

import twisted

def download_photos():
    # Modify this to add callbacks

def downloader():
    # Something something
    loop.run()

Slide 15

Slide 15 text

green threads
    user space: the OS does not create or manage them
    cooperatively scheduled: the OS does not schedule or preempt them
    lightweight

Slide 16

Slide 16 text

import gevent

Slide 17

Slide 17 text

import gevent
from gevent import monkey; monkey.patch_all()

def downloader():
    pool = []
    for user in users:
        g = gevent.Greenlet(download_photos, user)
        g.start()
        pool.append(g)
    gevent.joinall(pool)

Slide 18

Slide 18 text

THE BUILDING BLOCKS PUTTING IT TOGETHER WRAP-UP/ Q&A

Slide 19

Slide 19 text

THE BUILDING BLOCKS

Slide 20

Slide 20 text

g = gevent.Greenlet(download_photos, user)    ?

from greenlet import greenlet
...
class Greenlet(greenlet):
    """
    A light-weight cooperatively-scheduled execution unit.
    """
    ...

Slide 21

Slide 21 text

from greenlet import greenlet

def print_red():
    print('red')
    gr2.switch()
    print('red done!')

def print_blue():
    print('blue')
    gr1.switch()
    print('blue done!')

gr1 = greenlet(print_red)
gr2 = greenlet(print_blue)
gr1.switch()

Output:
red
blue
red done!

Slide 22

Slide 22 text

.switch() pause current + yield control flow resume next.switch() coroutine
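The pause-and-resume behaviour of .switch() can be approximated with plain Python generators. This is only an analogy, not how greenlet is implemented: a generator can yield control only back to its caller, whereas a greenlet can switch directly to any other greenlet. The function names mirror the slide; the log list and run_demo() are assumptions added for illustration.

```python
def print_red(log):
    log.append("red")
    yield                      # pause here and hand control back
    log.append("red done!")


def print_blue(log):
    log.append("blue")
    yield                      # pause here; never resumed in this demo
    log.append("blue done!")


def run_demo():
    log = []
    red, blue = print_red(log), print_blue(log)
    next(red)                  # run red until its first pause
    next(blue)                 # run blue until its first pause
    next(red, None)            # resume red where it left off
    return log


if __name__ == "__main__":
    print("\n".join(run_demo()))
```

As on the slide, "blue done!" never appears, because nothing resumes print_blue after print_red finishes.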

Slide 23

Slide 23 text

gr1 = greenlet(run_fn) { } run_fn parent …

Slide 24

Slide 24 text

{ base = SP1 } SP1 SP2 { base = SP1 start = SP2 } { base = SP2 } gr1.switch() gr2.switch() SP3 gr1.switch() } C STACK

Slide 25

Slide 25 text

} start SP3 SP4 = HEAP C STACK

Slide 26

Slide 26 text

greenlets for coroutines via assembly-based stack-slicing

Slide 27

Slide 27 text

import gevent
from gevent import monkey; monkey.patch_all()

def downloader():
    pool = []
    for user in users:
        g = gevent.Greenlet(download_photos, user)
        g.start()
        pool.append(g)
    gevent.joinall(pool)

Slide 28

Slide 28 text

g.start()

def start(self):
    """
    Schedule the greenlet to run in this loop iteration.
    """
    if self._start_event is None:
        self._start_event = \
            ...loop.run_callback(self.switch)

Slide 29

Slide 29 text

libev
    API to register event_handler callbacks
    watches for events
    calls registered callbacks

Slide 30

Slide 30 text

“Hey loop, Wait for a write on this socket and call parse_recv() when that happens.”

Slide 31

Slide 31 text

fd = make_nonblocking(socket_fd)
loop.io_watch(fd, write, callback_fn)
loop.run()
    while True:
        call all pre_block_watchers
        block for I/O
        call all post_block_watchers
        call pending io_watchers

Slide 32

Slide 32 text

always call pre_block_watchers
Hook to integrate other event mechanisms into the loop.
“Hey loop, If there are coroutines ready to run, run them. Then, block for a write on...”
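A minimal sketch of a loop with this shape, using only the standard library's selectors module rather than libev. ToyLoop, run_callback, io_watch and demo are illustrative names, not the libev or gevent API, and the watcher here waits for a read rather than a write so that the demo can finish on a local socket pair.

```python
import selectors
import socket


class ToyLoop:
    """Toy event loop: run ready callbacks, then block for I/O (not libev)."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.pre_block = []   # callbacks to run before blocking
        self.watching = 0     # number of registered I/O watchers

    def run_callback(self, fn):
        self.pre_block.append(fn)

    def io_watch(self, sock, event, callback):
        self.selector.register(sock, event, callback)
        self.watching += 1

    def run(self):
        while self.pre_block or self.watching:
            # 1. Run everything that is ready before blocking.
            ready, self.pre_block = self.pre_block, []
            for fn in ready:
                fn()
            if not self.watching:
                continue
            # 2. Block for I/O, unless new callbacks are already pending.
            timeout = 0 if self.pre_block else None
            for key, _ in self.selector.select(timeout):
                # 3. One-shot watcher: unregister, then call its callback.
                self.selector.unregister(key.fileobj)
                self.watching -= 1
                key.data()
        self.selector.close()


def demo():
    log = []
    loop = ToyLoop()
    a, b = socket.socketpair()

    def on_readable():
        log.append("received " + b.recv(1024).decode())

    def ready():
        log.append("ready callback")
        a.sendall(b"ping")

    loop.io_watch(b, selectors.EVENT_READ, on_readable)
    loop.run_callback(ready)
    loop.run()
    a.close()
    b.close()
    return log
```

The ready callback runs before the loop blocks, which is what lets it produce the very event the I/O watcher is waiting for.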

Slide 33

Slide 33 text

libev for an event loop

Slide 34

Slide 34 text

PUTTING IT TOGETHER

Slide 35

Slide 35 text

import gevent
from gevent import monkey; monkey.patch_all()

def downloader():
    pool = []
    for user in users:
        g = gevent.Greenlet(download_photos, user)
        g.start()
        pool.append(g)
    gevent.joinall(pool)

Slide 36

Slide 36 text

for user in users:
    g = gevent.Greenlet(download_photos, user)

Slide 37

Slide 37 text

g = gevent.Greenlet(download_photos, user)

class Greenlet(greenlet):
    def __init__(self, run=None, ...):
        greenlet.__init__(self, None, get_hub())

g.parent = Hub

Slide 38

Slide 38 text

class Greenlet(greenlet):
        greenlet.__init__(self, None, get_hub())

g.parent = Hub

class Hub(greenlet):
    def __init__(self):
        greenlet.__init__(self)
        self.loop = ...

Slide 39

Slide 39 text

Greenlet()
    a greenlet, to run download_photos()
    .parent: the event loop, i.e. the Hub

Slide 40

Slide 40 text

for user in users:
    g = gevent.Greenlet(download_photos, user)
    g.start()

Slide 41

Slide 41 text

self.parent.loop.run_callback(self.switch) g.start() Hub pre_block_watcher

Slide 42

Slide 42 text

loop.run()
    while True:
        call all pre_block_watchers = g.switch
        block for I/O
        ...

Slide 43

Slide 43 text

.start() “Hey loop, This coroutine is ready to run. Run it before you block...”

Slide 44

Slide 44 text

for user in users:
    g = gevent.Greenlet(download_photos, user)
    g.start()
    pool.append(g)
gevent.joinall(pool)

Slide 45

Slide 45 text

gevent.joinall()
    g.join()
        result = self.parent.switch()    (switches to the Hub)

class Hub(greenlet):
    def run(self):
        while True:
            self.loop.run()

Slide 46

Slide 46 text

.join() runs the loop

Slide 47

Slide 47 text

.join() = loop.run() while True: ... call pre_block_watchers = g.switch download_photos()

Slide 48

Slide 48 text

HUB

Slide 49

Slide 49 text

loop.run() g.switch() download_photos() network I/O

Slide 50

Slide 50 text

import gevent
from gevent import monkey; monkey.patch_all()

def downloader():
    ...

Slide 51

Slide 51 text

socket gevent.socket import

Slide 52

Slide 52 text

fd = make_nonblocking(socket_fd) loop.io_watch(fd, write, callback_fn) loop.run() g.switch Hub.switch create: send:

Slide 53

Slide 53 text

network I/O

Slide 54

Slide 54 text

for user in users:
    g = gevent.Greenlet(download_photos, user)
    g.start()

pre_block_watchers = [g1.switch, g2.switch]

Slide 55

Slide 55 text

gevent.joinall()
Hub: loop.run()
    call pre_block_watchers = [g1.switch, ...]
    g1.switch()
g1: download_photos(user1)
    network_request
    io_watchers = [g1.switch]
    Hub.switch()

Slide 56

Slide 56 text

Hub: loop.run()
    call pre_block_watchers = [g2.switch]
    g2.switch()
g2: download_photos(user2)
    network_request
    io_watchers = [g2.switch, g1.switch]
    Hub.switch()

Slide 57

Slide 57 text

Hub: loop.run()
    call pre_block_watchers = []
    block for I/O
    call pending io_watchers = [g1.switch]
    g1.switch()
g1: resumes download_photos(user1)
    ...
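The hand-off between the Hub and the two greenlets can be traced with a toy scheduler. This is a sketch built on generators and two queues, not gevent's Hub: the ready queue stands in for pre_block_watchers, the waiting queue for io_watchers, and "blocking for I/O" is simulated by assuming the longest-waiting task's I/O has completed. hub_run, the log list and the body of download_photos are illustrative assumptions.

```python
from collections import deque


def download_photos(user, log):
    log.append(f"{user}: send request")
    yield "network_request"    # stand-in for a socket call that would block
    log.append(f"{user}: got photos")


def hub_run(tasks):
    ready = deque(tasks)       # plays the role of pre_block_watchers
    waiting = deque()          # plays the role of io_watchers
    while ready or waiting:
        # Run every task that is ready before "blocking".
        while ready:
            task = ready.popleft()
            try:
                next(task)             # like g.switch()
                waiting.append(task)   # task paused on I/O, back in the hub
            except StopIteration:
                pass                   # task finished
        # "Block for I/O": simulated, the first waiter is assumed complete.
        if waiting:
            ready.append(waiting.popleft())


def demo():
    log = []
    hub_run([download_photos("user1", log), download_photos("user2", log)])
    return log
```

Both requests are sent before either response is handled, which is the point of the design: one thread overlaps the network waits of many downloads.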

Slide 58

Slide 58 text

WRAP-UP

Slide 59

Slide 59 text

minuses
    no parallelism
    non-cooperative code will block the entire process:
        C-extensions -> use pure Python libraries
        compute-bound greenlets -> use gevent.sleep(0)
        -> use greenlet blocking detection
    monkey-patch may have confusing implications
    order of imports matters
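Why import order matters can be shown without gevent at all. The sketch below builds two throwaway module objects (toy_socket and toy_green_socket are hypothetical stand-ins for socket and gevent.socket) and a simplified patch_module helper; it is not gevent's monkey.patch_all(), only the same idea of overwriting a module's attributes in place.

```python
import types


def make_toy_modules():
    # Hypothetical stand-ins for a blocking module and its cooperative twin.
    blocking = types.ModuleType("toy_socket")
    blocking.recv = lambda: "blocking recv"
    cooperative = types.ModuleType("toy_green_socket")
    cooperative.recv = lambda: "cooperative recv"
    return blocking, cooperative


def patch_module(target, replacement):
    # Overwrite the target's public attributes with the replacement's.
    for name, value in vars(replacement).items():
        if not name.startswith("_"):
            setattr(target, name, value)


def demo():
    toy_socket, toy_green_socket = make_toy_modules()
    # Like `from toy_socket import recv` executed before patching:
    early_ref = toy_socket.recv
    patch_module(toy_socket, toy_green_socket)
    # Like looking up `toy_socket.recv` after patching:
    late_lookup = toy_socket.recv
    return early_ref(), late_lookup()
```

A reference captured before the patch keeps pointing at the blocking function, which is why patching is normally done before anything else imports the affected modules.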

Slide 60

Slide 60 text

…but excellent for workloads that are:
    I/O bound, highly concurrent -> 20-30k concurrent connections!
Used at “web scale” at:
    Pinterest, Facebook, Mixpanel, PayPal, Disqus, Nylas…

Slide 61

Slide 61 text

greenlet libev Hub monkeypatching

Slide 62

Slide 62 text

KAVYA @KAVYA719

Slide 63

Slide 63 text

greenlet libev Hub monkeypatching