Slide 1

No content

Slide 2

https://goo.gl/LJ0hsy

Slide 3

Concurrency in Python with gevent and asyncio
Asynchronous Python

Slide 4

whoami github.com/ianjuma @IanJuma medium.com/@IanJuma

Slide 5

Why asyncio/gevent?

Slide 6

Why asyncio/gevent?
The need for reactive programming
Callback hell
The need to write async code in a sync manner
Non-blocking, event-based programming

Slide 7

Asynchronous control flow
1. Callbacks
2. Futures/Promises
3. Coroutines (asyncio, gevent)
4. Generators (asyncio)
5. Deferreds (Twisted) - callback based
6. Continuations (interrupt, save state and continue)
7. async/await (C#, ES7, Scala - SIP 22)
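To make the first and last items concrete, here is a minimal illustrative sketch (not from the deck) of the same delayed greeting written once as a callback registered on the loop and once as a native coroutine:

import asyncio

# 1. Callback style: register a function to be invoked later
def greet_later_cb(loop):
    loop.call_later(0.5, lambda: print('hello from a callback'))

# 7. async/await style: the same flow reads top to bottom
async def greet_later():
    await asyncio.sleep(1.0)
    print('hello from a coroutine')

loop = asyncio.get_event_loop()
greet_later_cb(loop)
loop.run_until_complete(greet_later())
loop.close()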

Slide 8

What’s the event loop anyway?
The need for event-driven programming
A boon for heavily concurrent I/O applications and services
The Reactive Manifesto
Register an event and react to it as it arrives - seems natural
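As a small illustration (an assumed demo, not taken from the slides), the asyncio event loop lets you register callbacks and reacts to each one when its moment arrives:

import asyncio

loop = asyncio.get_event_loop()

# register "events": run as soon as possible, and after a delay
loop.call_soon(print, 'reacting immediately')
loop.call_later(1.0, print, 'reacting one second later')

# keep the loop running long enough for both callbacks to fire
loop.run_until_complete(asyncio.sleep(1.5))
loop.close()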

Slide 9

Reviewing PEP 492 - Coroutines with async and await
Generators and coroutines in Python; what’s the difference?
Prior to Python 3.5 a coroutine was really just a generator
The aim of the PEP is to separate the two and make native coroutines a stand-alone concept in Python
The PEP also adds asynchronous iterables and context managers (async for, async with)
The ultimate goal is to make async programming in Python easy with async/await
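A minimal sketch of the new protocols, assuming the hypothetical Session and Ticker classes below: __aenter__/__aexit__ drive async with, and __aiter__/__anext__ drive async for:

import asyncio

class Session:
    # hypothetical async context manager
    async def __aenter__(self):
        await asyncio.sleep(0.1)   # pretend to open a connection
        return self
    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0.1)   # pretend to close it

class Ticker:
    # hypothetical async iterator yielding n ticks
    def __init__(self, n):
        self.n = n
    def __aiter__(self):
        return self
    async def __anext__(self):
        if self.n <= 0:
            raise StopAsyncIteration
        self.n -= 1
        await asyncio.sleep(0.1)
        return self.n

async def main():
    async with Session():
        async for tick in Ticker(3):
            print('tick', tick)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()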

Slide 10

Hello world
Native coroutine example
async/await is really just about making async code feel more sync to the developer
Turn the generator into a coroutine

import asyncio

async def foo():
    await asyncio.sleep(1.0)
    print('foo coroutine')

Slide 11

Generator-based coroutine
What’s a generator-based coroutine?
Meant to be backwards compatible with Python versions without async/await syntax (< 3.5)
Generator based (before 3.5 there was no real difference between coroutines and generators)

import asyncio

@asyncio.coroutine
def foo():
    yield from asyncio.sleep(1.0)
    print('foo generator')

Slide 12

Event loop?
Creating and passing tasks to the event loop

import asyncio

@asyncio.coroutine
def foo():
    yield from asyncio.sleep(1.0)
    print('foo generator')

# create the event loop and run the coroutine to completion
loop = asyncio.get_event_loop()
loop.run_until_complete(foo())
loop.close()

Slide 13

“...It is proposed to make coroutines a proper standalone concept in Python, and introduce new supporting syntax. The ultimate goal is to help establish a common, easily approachable, mental model of asynchronous programming in Python and make it as close to synchronous programming as possible.” - PEP 492

Slide 14

Back to the Future

Slide 15

Future
A read-only reference to a value that may not exist yet (a “future” value)
Non-blocking by default; they make use of callbacks
This allows for fast asynchronous code
PEP 3148 - futures in the stdlib (Accepted)
They revolve around some execution context (thread pool, process pool, main thread**)
A future can be in two states, incomplete or complete - onComplete / fail / success

from concurrent import futures

tasks = [67, 45, 67, 78, 56]

def factorial(number):
    ...

executor = futures.ThreadPoolExecutor(max_workers=10)

# map each submitted future to the task it was created for
future_to_task = dict((executor.submit(factorial, number), number) for number in tasks)

for future in futures.as_completed(future_to_task):
    task = future_to_task[future]
    if future.exception() is not None:
        print('%r generated an exception: %s' % (task, future.exception()))
    else:
        print('task fact({}) result is {}'.format(task, future.result()))

Slide 16

Which Future? concurrent.futures.Future vs asyncio.Future

Slide 17

If you're using ThreadPoolExecutor or ProcessPoolExecutor, or want to use a Future directly for thread-based or process-based concurrency, use concurrent.futures.Future
If you're using asyncio, use asyncio.Future
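The two can also be bridged. A minimal sketch (assuming the hypothetical blocking_add helper) that wraps a concurrent.futures.Future with asyncio.wrap_future so a coroutine can await it:

import asyncio
from concurrent import futures

executor = futures.ThreadPoolExecutor(max_workers=2)

def blocking_add(a, b):
    return a + b  # stands in for blocking work

async def main():
    cf = executor.submit(blocking_add, 2, 3)   # concurrent.futures.Future
    result = await asyncio.wrap_future(cf)     # awaitable asyncio.Future
    print('result:', result)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()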

Slide 18

Blocking future?

import asyncio
from concurrent import futures

# create an executor context
executor = futures.ThreadPoolExecutor(max_workers=10)
…
loop = asyncio.get_event_loop()
loop.run_until_complete(foo())
loop.close()

No blocking construct - for long-running, non-I/O tasks
Wrap the work in a future running in a different execution context
They revolve around some execution context (thread pool, process pool, main thread**)
A future can be in two states, incomplete or complete - onComplete / fail / success
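One way to do this wrapping, sketched under the assumption of a hypothetical cpu_bound helper: loop.run_in_executor hands the blocking call to the thread pool and returns an awaitable future, so the event loop itself never blocks:

import asyncio
from concurrent import futures

executor = futures.ThreadPoolExecutor(max_workers=10)

def cpu_bound(n):
    # hypothetical long-running, blocking task
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_event_loop()
    result = await loop.run_in_executor(executor, cpu_bound, 1000000)
    print('result:', result)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()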

Slide 19

Side-stepping the GIL
Avoid the GIL with the multiprocessing module - with futures
Deadlocks - a deadlock can occur when the callable behind a future waits on the result of another future
Achieving true concurrency
Limiting concurrency with max_workers

from concurrent import futures

tasks = [67, 45, 67, 78, 56]

# some heavy, blocking task (CPU bound)
def heavy_task(number):
    …
    return res

executor = futures.ProcessPoolExecutor(max_workers=10)

future_to_task = dict((executor.submit(heavy_task, number), number) for number in tasks)

for future in futures.as_completed(future_to_task):
    task = future_to_task[future]
    if future.exception() is not None:
        print('%r generated an exception: %s' % (task, future.exception()))
    else:
        print('task heavy_task({}) result is {}'.format(task, future.result()))

Slide 20

Combining coroutines and futures
Why? Chaining futures - some file operation, then some operation after a DB call
Chain when Task B depends on the result of Task A

import asyncio

@asyncio.coroutine
def create():
    yield from asyncio.sleep(2.0)
    print('created file')

@asyncio.coroutine
def write():
    yield from asyncio.sleep(1.0)
    print('write file')

# close() was referenced on the slide but not defined; minimal stub
@asyncio.coroutine
def close():
    yield from asyncio.sleep(0.5)
    print('close file')

@asyncio.coroutine
def test():
    yield from asyncio.ensure_future(create())
    yield from asyncio.ensure_future(write())
    yield from asyncio.ensure_future(close())
    yield from asyncio.sleep(2.0)
    loop.stop()

loop = asyncio.get_event_loop()
asyncio.ensure_future(test())
loop.run_forever()
print("Pending exit: {}".format(asyncio.Task.all_tasks(loop)))
loop.close()

Slide 21

Asyncio event loop
Pluggable event loop; the default implementation is built on the selectors module
Co-operative multi-tasking
Interleaved tasks run, handing over control when they hit a blocking (awaited) operation
Enable easy debugging with asyncio's debug mode

import asyncio

@asyncio.coroutine
def task():
    yield from asyncio.sleep(1.0)
    print('foo generator')

# single-threaded event loop
loop = asyncio.get_event_loop()
loop.run_until_complete(task())
loop.close()

# debug the event loop with PYTHONASYNCIODEBUG=1

Slide 22

No content

Slide 23

Gevent
Coroutine based
Co-operative multi-tasking; interleaving tasks - when a task blocks it yields control, suspends execution and hands over
Based on libev
Lightweight green threads called greenlets
Access libev event loop internals through gevent.core

import gevent

def hello():
    print('Running in hello')
    gevent.sleep(0)  # yield control to the hub (context switch)
    print('Explicit context switch to world again')

def world():
    print('Running in world')
    gevent.sleep(0)  # yield control to the hub (context switch)
    print('Explicit context switch to hello again')

gevent.joinall([
    gevent.spawn(hello),
    gevent.spawn(world),
])

Slide 24

Gevent API
Very thread-like - spawn, like POSIX threads
Monkey-patching non-cooperating libraries
Patterns to spawn greenlets
Deterministic - given the same input, greenlets will produce the same output
Limiting concurrency with a Pool (Python < 2.7)
Actor model in gevent - using a Queue
gevent / asyncio

import gevent
from gevent import Greenlet

def foo(message, n):
    gevent.sleep(n)
    print(message)

# Spawn directly from the Greenlet class
gthread1 = Greenlet.spawn(foo, "Hello", 1)

# Wrapper for creating and running a Greenlet
gthread2 = gevent.spawn(foo, "I live!", 2)

# Lambda expressions
gthread3 = gevent.spawn(lambda x: (x + 1), 2)

gthreads = [gthread1, gthread2, gthread3]

# Block until all greenlets complete
gevent.joinall(gthreads)

Slide 25

Monkey patching
Monkey-patching non-cooperating libraries
Place gevent-friendly functions in place of stdlib functions so they can work better with gevent
Patch the sockets as early as possible

from gevent import monkey
monkey.patch_all()

# full signature of patch_all
patch_all(
    socket=True, dns=True, thread=True, os=True, ssl=True,
    httplib=False, subprocess=True, sys=False, aggressive=True,
    Event=False, builtins=True, signal=True
)
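A minimal usage sketch, assuming the placeholder URLs and the hypothetical fetch helper below: patch first, and ordinary blocking stdlib socket code then cooperates with gevent's greenlets:

from gevent import monkey
monkey.patch_all()  # patch before the blocking libraries are used

import gevent
from urllib.request import urlopen  # uses the now gevent-cooperative sockets

def fetch(url):
    # hypothetical helper: fetch a URL and report the response size
    body = urlopen(url).read()
    print(url, len(body))

urls = ['http://example.com', 'http://example.org']  # placeholder URLs
gevent.joinall([gevent.spawn(fetch, u) for u in urls])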

Slide 26

Greenlet states
1. started() - bool
2. ready() - bool
3. successful() - bool
4. value
5. exception
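A minimal sketch (assuming the trivial work function below) that spawns a greenlet and inspects these fields once it has finished:

import gevent

def work():
    gevent.sleep(0.1)
    return 42

g = gevent.spawn(work)
g.join()

print(g.ready())       # True once the greenlet has finished
print(g.successful())  # True if it finished without raising
print(g.value)         # the return value (42 here)
print(g.exception)     # the exception instance, or None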

Slide 27

Async execution in gevent
Using gevent for co-operative multi-tasking
Yielding control when blocking
Great for I/O

import gevent
import random

def task(pid):
    gevent.sleep(random.randint(0, 2) * 0.001)
    print('Task %s done' % pid)

def synchronous():
    for i in range(1, 10):
        task(i)

def asynchronous():
    # using gevent greenlets
    threads = [gevent.spawn(task, i) for i in range(10)]
    gevent.joinall(threads)

print('Synchronous:')
synchronous()

print('Asynchronous:')
asynchronous()

Slide 28

Program shutdown
Avoiding zombie processes by killing the gevent greenlet system on exit
Using gevent.signal to kill greenlets when a kill signal is received

import gevent
import signal

def run_app():
    gevent.sleep(1000)

if __name__ == '__main__':
    gevent.signal(signal.SIGQUIT, gevent.kill)
    thread = gevent.spawn(run_app)
    thread.join()

Slide 29

Greenlets and timeouts
Timing a greenlet - place a time constraint on a greenlet or block of code
Future timeout

import gevent
from gevent import Timeout

seconds = 5
timeout = Timeout(seconds)
timeout.start()

def wait():
    gevent.sleep(5)

try:
    gevent.spawn(wait).join()
except Timeout:
    print('Could not complete')
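The same constraint can wrap an arbitrary block of code by using Timeout as a context manager; a minimal sketch, where the 10-second sleep stands in for a slow operation:

import gevent
from gevent import Timeout

try:
    with Timeout(3):       # raises Timeout if the block runs longer than 3 seconds
        gevent.sleep(10)   # stands in for a slow operation
except Timeout:
    print('Block timed out')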

Slide 30

Group
Gevent has a feature that allows us to group greenlets so they can be scheduled together
A group also acts as a kind of parallel dispatcher
Useful for managing similar groups of tasks (some API call)
Allows us to order execution of tasks (chaining)

import gevent
from gevent.pool import Group

def talk(msg):
    for i in range(3):
        print(msg)

g1 = gevent.spawn(talk, 'bar')
g2 = gevent.spawn(talk, 'foo')
g3 = gevent.spawn(talk, 'foobar')

group = Group()
group.add(g1)
group.add(g2)
group.join()

group.add(g3)
group.join()

Slide 31

Pool
Creating a concurrency limit with Pool; useful for I/O tasks

import gevent
from gevent.pool import Pool

pool = Pool(2)

def hello_from(n):
    print('Size of pool %s' % len(pool))

pool.map(hello_from, range(3))

Slide 32

The Actor model with gevent
A model that lets you design and build concurrency at a higher level, through message passing
An actor is a thing that receives a message and acts on it
Actors have an address so they can receive messages, and a mailbox to store incoming messages
Actors almost behave like threads, but they don’t map one-to-one onto threads; that depends on the dispatcher / execution context

Slide 33

The Actor model with gevent
Asynchronous message passing to actors

Slide 34

The Actor model with gevent

import gevent
from gevent import Greenlet
from gevent.queue import Queue

class Actor(gevent.Greenlet):

    def __init__(self):
        self.inbox = Queue()  # mailbox
        Greenlet.__init__(self)

    def receive(self, message):
        # override with the action to take on a received message
        raise NotImplementedError()

    def _run(self):
        self.running = True
        while self.running:
            message = self.inbox.get()
            self.receive(message)

Asynchronous message passing to actors
Inbox with a Queue
Override receive with the action to process a received message
Send messages by calling inbox.put on an actor
We violate the first policy - don’t change the internal state of an actor (we do just that)
No supervision?

Slide 35

actors ...

import gevent
from gevent import Greenlet
from gevent.queue import Queue

class Actor(gevent.Greenlet):

    def __init__(self):
        self.inbox = Queue()  # mailbox
        Greenlet.__init__(self)

    def receive(self, message):
        # override with the action to take on a received message
        raise NotImplementedError()

    def _run(self):
        self.running = True
        while self.running:
            message = self.inbox.get()
            self.receive(message)

Asynchronous message passing to actors
Inbox with a Queue
Override receive with the action to process a received message
Send messages by calling inbox.put on an Actor instance
We violate the first policy - don’t change the internal state of an actor (we do just that)
No supervision?

Slide 36

actors ... class AskGoogle(Actor): def receive(self, message): # do something service.inbox.put("message") google.inbox.put("poll_self") # self send message gevent.sleep(0) class SendMessageService(Actor): def receive(self, message): # send msg google.inbox.put("google") gevent.sleep(0) sendMessageService = SendMessageService() googleService = AskGoogle() service.start() google.start() service.inbox.put("start") gevent.joinall([ service, google ]) Actor to fetch stats off a web service Subclass actor to create an Actor service Implement receive method with desired action An actor ref here is an instance of the class Send a message to the Actor by adding an element to its queue. (violation) let’s just assume it’s an actor reference Start the actor instances

Slide 37

Thanks! AfricasTalking Galaza plaza, 7th floor Off Galana rd Nairobi, Kenya [email protected] [email protected] www.africastalking.com