Asynchronous Python with gevent and asyncIO

Ian Juma
September 30, 2016

In this talk, we look at how to build asynchronous applications in Python with gevent and asyncio.


Transcript

  1. Why asyncIO/gevent?

    - The need for reactive programming
    - Callback hell
    - The need to write async code in a sync manner
    - Non-blocking, event-based programming
  2. Asynchronous control flow

    1. Callbacks
    2. Futures/Promises
    3. Coroutines (asyncio, gevent)
    4. Generators (asyncio)
    5. Deferreds (Twisted) - callback based
    6. Continuations - (interrupt, save state and continue)
    7. async/await (C#, ES7, Scala - SIP 22)
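To make the first three styles in the list concrete, here is a minimal asyncio sketch that obtains a value through a callback, through a bare future, and through a coroutine. The `fetch` helper is hypothetical and only stands in for a non-blocking I/O call; the delays are arbitrary.

```python
import asyncio


async def fetch(value):
    # Hypothetical stand-in for a non-blocking I/O call.
    await asyncio.sleep(0.01)
    return value * 2


async def main():
    results = {}
    loop = asyncio.get_running_loop()

    # 1. Callback style: register a function that runs once the task is done.
    task = asyncio.ensure_future(fetch(1))
    task.add_done_callback(lambda fut: results.update(callback=fut.result()))
    await task
    await asyncio.sleep(0)  # give the loop a turn to run pending callbacks

    # 2. Future style: a placeholder that some other code fills in later.
    future = loop.create_future()
    loop.call_later(0.01, future.set_result, 4)
    results["future"] = await future

    # 3. Coroutine style: reads like sequential code.
    results["coroutine"] = await fetch(3)
    return results


results = asyncio.run(main())
print(results)
```

All three end up with a value in `results`; the coroutine form is the one that hides the plumbing.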
  3. What’s the event loop anyway?

    - The need for event-driven programming
    - A boon for heavily concurrent I/O applications and services
    - Reactive Manifesto
    - Register an event and react to it as it arrives - seems natural
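The "register an event and react to it as it arrives" idea can be sketched with the asyncio loop's own scheduling calls. The handler name `on_event` and the event labels are made up for illustration; timers stand in for real I/O events.

```python
import asyncio

events = []


def on_event(name):
    # Reaction: the loop calls this when the registered event is due.
    events.append(name)


async def main():
    loop = asyncio.get_running_loop()
    # Register interest first; the loop decides when each handler runs.
    loop.call_later(0.05, on_event, "timer-late")
    loop.call_later(0.01, on_event, "timer-early")
    loop.call_soon(on_event, "soon")
    await asyncio.sleep(0.2)  # keep the loop alive long enough for both timers


asyncio.run(main())
print(events)
```

Handlers fire in the order their events become due, not in the order they were registered.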
  4. Reviewing PEP 492: Coroutines with async and await syntax

    - Generators and coroutines in Python; what’s the difference?
    - Prior to Python 3.5, really the same thing
    - The aim of the PEP is to separate the two and make native coroutines a stand-alone concept in Python
    - The PEP also lets coroutines drive iteration and context managers, through async for and async with
    - The ultimate goal is to make async programming in Python easy with async/await
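As a rough illustration of the iteration and context-manager support that PEP 492 introduced, the sketch below defines one asynchronous context manager and one asynchronous iterator. `Connection` and `Countdown` are hypothetical classes invented for this example.

```python
import asyncio


class Connection:
    """Hypothetical resource usable with `async with`."""

    def __init__(self):
        self.open = False

    async def __aenter__(self):
        await asyncio.sleep(0)  # pretend to connect without blocking
        self.open = True
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)  # pretend to disconnect
        self.open = False


class Countdown:
    """Hypothetical asynchronous iterator usable with `async for`."""

    def __init__(self, start):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # each step may suspend
        self.current -= 1
        return self.current + 1


async def main():
    seen = []
    async with Connection() as conn:
        was_open = conn.open
        async for n in Countdown(3):
            seen.append(n)
    return was_open, conn.open, seen


was_open, still_open, seen = asyncio.run(main())
print(was_open, still_open, seen)
```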
  5. Hello world

    - Native coroutine example
    - async/await is really just about making async code feel more sync to the developer
    - Turn the generator into a coroutine

        import asyncio

        async def foo():
            await asyncio.sleep(1.0)
            print('foo generator')
  6. Generator-based coroutine

    - What’s a generator-based coroutine?
    - Meant to be backwards compatible with Python < 3.5
    - Generator based (before 3.5, there was no real difference between coroutines and generators)

        import asyncio

        # @asyncio.coroutine was removed in Python 3.11; this form runs on 3.4 to 3.10
        @asyncio.coroutine
        def foo():
            yield from asyncio.sleep(1.0)
            print('foo generator')
  7. Event loop?

    - Creating and passing tasks to the event loop

        import asyncio

        async def foo():
            await asyncio.sleep(1.0)
            print('foo generator')

        # create event loop
        loop = asyncio.new_event_loop()
        loop.run_until_complete(foo())
        loop.close()
  8. “...It is proposed to make coroutines a proper standalone concept in Python, and introduce new supporting syntax. The ultimate goal is to help establish a common, easily approachable, mental model of asynchronous programming in Python and make it as close to synchronous programming as possible”

    (PEP 492)
  9. Future

    - A read-only reference to a value that may not exist yet (a “future” value)
    - Non-blocking by default; makes use of callbacks
    - This allows for fast asynchronous code
    - PEP 3148 - futures in stdlib (Accepted)
    - They revolve around some execution context (thread pool, process pool, main thread**)
    - A Future can be in two states, complete or incomplete - onComplete / fail / success

        from concurrent import futures

        tasks = [67, 45, 67, 78, 56]

        def factorial(number):
            ...

        executor = futures.ThreadPoolExecutor(max_workers=10)

        # map each future to the number it was submitted with
        future_to_task = {executor.submit(factorial, number): number
                          for number in tasks}

        for future in futures.as_completed(future_to_task):
            task = future_to_task[future]
            if future.exception() is not None:
                print('%r generated an exception: %s' % (task, future.exception()))
            else:
                print('task fact({}) result is {}'.format(task, future.result()))
  10. If you're using ThreadPoolExecutor or ProcessPoolExecutor, or want to use a Future directly for thread-based or process-based concurrency, use concurrent.futures.Future.

    If you're using asyncio, use asyncio.Future.
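The two Future types are distinct classes, and asyncio provides `asyncio.wrap_future` to bridge from one to the other. The sketch below is only an illustration of that bridge; the `square` function and the worker count are arbitrary.

```python
import asyncio
from concurrent import futures


def square(n):
    # Arbitrary work executed on a pool thread.
    return n * n


async def main(executor):
    thread_future = executor.submit(square, 7)        # concurrent.futures.Future
    loop_future = asyncio.wrap_future(thread_future)  # asyncio.Future, awaitable
    kinds = (
        isinstance(thread_future, futures.Future),
        isinstance(loop_future, asyncio.Future),
    )
    return kinds, await loop_future


with futures.ThreadPoolExecutor(max_workers=2) as executor:
    kinds, value = asyncio.run(main(executor))

print(kinds, value)
```

Only the asyncio future can be awaited; the thread-pool future is read with `result()` from ordinary blocking code.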
  11. Blocking future?

    - No blocking construct - for long-running non-I/O tasks
    - Wrap a future in a different execution context
    - They revolve around some execution context (thread pool, process pool, main thread**)
    - A Future can be in two states, complete or incomplete - onComplete / fail / success

        import asyncio
        from concurrent import futures

        # create executor context
        executor = futures.ThreadPoolExecutor(max_workers=10)
        …
        loop = asyncio.get_event_loop()
        loop.run_until_complete(foo())
        loop.close()
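One common way to run a blocking call in a different execution context is `loop.run_in_executor`. The following is a sketch under that assumption, not a reconstruction of the elided slide code; `blocking_task` and `ticker` are hypothetical names, and `time.sleep` stands in for any blocking work.

```python
import asyncio
import time
from concurrent import futures


def blocking_task(seconds):
    # Blocks its thread; calling this directly in a coroutine would stall the loop.
    time.sleep(seconds)
    return seconds


async def ticker(ticks):
    # Runs on the event loop while the blocking task occupies a pool thread.
    for _ in range(3):
        await asyncio.sleep(0.01)
        ticks.append("tick")


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    with futures.ThreadPoolExecutor(max_workers=2) as executor:
        blocking = loop.run_in_executor(executor, blocking_task, 0.1)
        result, _ = await asyncio.gather(blocking, ticker(ticks))
    return result, ticks


result, ticks = asyncio.run(main())
print(result, ticks)
```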
  12. Side-stepping the GIL

    - Avoid the GIL with the multiprocessing module - with Futures
    - Deadlocks - a deadlock can occur when the callable associated with one Future waits on the result of another Future
    - Achieving true concurrency
    - Limiting concurrency with max_workers

        from concurrent import futures

        tasks = [67, 45, 67, 78, 56]

        # some heavy, blocking task (CPU bound)
        def heavy_task(number):
            …
            return res

        if __name__ == '__main__':  # required where worker processes are spawned
            executor = futures.ProcessPoolExecutor(max_workers=10)
            future_to_task = {executor.submit(heavy_task, number): number
                              for number in tasks}

            for future in futures.as_completed(future_to_task):
                task = future_to_task[future]
                if future.exception() is not None:
                    print('%r generated an exception: %s' % (task, future.exception()))
                else:
                    print('task fact({}) result is {}'.format(task, future.result()))
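The deadlock and max_workers points can be seen together in a small sketch. It uses a thread pool rather than a process pool so it stays self-contained, and a timeout so the stuck case reports instead of hanging; `run_nested`, `inner` and `outer` are hypothetical names.

```python
from concurrent import futures


def run_nested(max_workers):
    """Submit a task that waits on a second task from the same pool."""
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:

        def inner():
            return "inner done"

        def outer():
            inner_future = executor.submit(inner)
            try:
                # Without the timeout, a single-worker pool would wait forever:
                # the only worker is busy running outer().
                return inner_future.result(timeout=0.5)
            except futures.TimeoutError:
                return "stuck"

        return executor.submit(outer).result()


print(run_nested(1))  # outer() holds the only worker, so inner() never starts in time
print(run_nested(2))  # a second worker is free to run inner()
```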
  13. Combining coroutines and Futures

    - Why? Chaining Futures - some file operation, some operation after a DB call
    - Chain when Task B depends on the result of Task A

        import asyncio

        async def create():
            await asyncio.sleep(2.0)
            print('created file')

        async def write():
            await asyncio.sleep(1.0)
            print('write file')

        # close() is called on the slide but not shown; assumed to mirror write()
        async def close():
            await asyncio.sleep(1.0)
            print('close file')

        async def test():
            await asyncio.ensure_future(create())
            await asyncio.ensure_future(write())
            await asyncio.ensure_future(close())
            await asyncio.sleep(2.0)
            loop.stop()

        loop = asyncio.new_event_loop()
        loop.create_task(test())
        loop.run_forever()
        print("Pending exit: {}".format(asyncio.all_tasks(loop)))
        loop.close()
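When Task B really does consume the result of Task A, the chain can pass values along explicitly. This is a sketch with invented data: the dictionary "file handle", the file name and the short delays are all assumptions made for illustration.

```python
import asyncio


async def create(name):
    await asyncio.sleep(0.01)  # stand-in for the real file creation
    return {"name": name, "lines": [], "closed": False}


async def write(handle, line):
    await asyncio.sleep(0.01)  # stand-in for the real write
    handle["lines"].append(line)
    return handle


async def close(handle):
    await asyncio.sleep(0.01)  # stand-in for the real close
    handle["closed"] = True
    return handle


async def pipeline():
    # Each step starts only after the step it depends on has produced a result.
    handle = await create("report.txt")
    handle = await write(handle, "first line")
    return await close(handle)


result = asyncio.run(pipeline())
print(result)
```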
  14. Asyncio event loop

    - Built on the operating system's I/O multiplexing through the standard library (select, epoll and kqueue via selectors; IOCP on Windows), not on libev
    - Co-operative multi-tasking
    - Interleaving tasks run, each handing over control when it would otherwise block
    - Enable easy debugging with asyncio

        import asyncio

        async def task():
            await asyncio.sleep(1.0)
            print('foo generator')

        # single threaded event loop
        loop = asyncio.new_event_loop()
        loop.run_until_complete(task())
        loop.close()

        # debug the event loop with PYTHONASYNCIODEBUG=1
  15. Gevent

    - Coroutine based
    - Co-operative multi-tasking; tasks interleave, and a task that would block yields control, suspends execution and hands over control
    - Based on libev
    - Ability to use lightweight green threads called greenlets
    - Access libev event loop internals through gevent.core

        import gevent

        def hello():
            print('Running in hello')
            gevent.sleep(0)  # yield control to the other greenlet
            print('Explicit context switch to hello again')

        def world():
            print('Running in world')
            gevent.sleep(0)  # yield control back
            print('Explicit context switch to world again')

        gevent.joinall([
            gevent.spawn(hello),
            gevent.spawn(world)
        ])
  16. Gevent API

    - Very thread-like - spawn, like POSIX threads
    - Monkey-patching non-cooperating libraries
    - Patterns to spawn greenlets
    - Deterministic - given the same input, greenlets will produce the same output
    - Limiting concurrency with a Pool
    - Python < 2.7
    - Actor model in Gevent - using Queue
    - asyncIO
    - gevent / asyncIO

        import gevent
        from gevent import Greenlet

        def foo(message, n):
            gevent.sleep(n)
            print(message)

        gthread1 = Greenlet.spawn(foo, "Hello", 1)

        # Wrapper for creating and running a Greenlet
        gthread2 = gevent.spawn(foo, "I live!", 2)

        # Lambda expressions
        gthread3 = gevent.spawn(lambda x: (x + 1), 2)

        gthreads = [gthread1, gthread2, gthread3]

        # Block until all threads complete
        gevent.joinall(gthreads)
  17. Monkey patching

    - Monkey-patching non-cooperating libraries
    - Place gevent-friendly functions in place of standard library functions so they work better with gevent
    - Patch the sockets as early as possible

        from gevent import monkey
        monkey.patch_all()

        # the same call with the keyword arguments from the slide spelled out
        monkey.patch_all(
            socket=True, dns=True, thread=True, os=True, ssl=True,
            httplib=False, subprocess=True, sys=False, aggressive=True,
            Event=False, builtins=True, signal=True
        )
  18. Greenlet states

    1. started - bool
    2. ready() - bool
    3. successful() - bool
    4. value
    5. exception
  19. Async execution in Gevent

    - Using gevent for co-operative multi-tasking
    - Yielding control when blocking
    - Great for I/O

        import gevent
        import random

        def task(pid):
            gevent.sleep(random.randint(0, 2) * 0.001)
            print('Task %s done' % pid)

        def synchronous():
            for i in range(1, 10):
                task(i)

        def asynchronous():
            # using gevent
            threads = [gevent.spawn(task, i) for i in range(10)]
            gevent.joinall(threads)

        print('Synchronous:')
        synchronous()

        print('Asynchronous:')
        asynchronous()
  20. Program shutdown

    - Avoiding zombie processes by killing the gevent greenlet system on exit
    - Using gevent's signal handling to kill greenlets when a kill signal is received

        import gevent
        import signal

        def run_app():
            gevent.sleep(1000)

        if __name__ == '__main__':
            thread = gevent.spawn(run_app)
            # kill the greenlet on SIGQUIT; older gevent releases spell this gevent.signal
            gevent.signal_handler(signal.SIGQUIT, gevent.kill, thread)
            thread.join()
  21. Greenlets and timeouts

    - Timing a greenlet - create a time constraint on a Greenlet or block of code
    - Future timeout

        import gevent
        from gevent import Timeout

        seconds = 5

        timeout = Timeout(seconds)
        timeout.start()

        def wait():
            gevent.sleep(5)

        try:
            gevent.spawn(wait).join()
        except Timeout:
            print('Could not complete')
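For comparison, the asyncio side expresses the same time constraint with `asyncio.wait_for`. This is a sketch of the asyncio counterpart rather than gevent code; the `slow` coroutine and the short timeout are arbitrary.

```python
import asyncio


async def slow():
    await asyncio.sleep(1.0)  # takes longer than the allowed time
    return "finished"


async def main():
    try:
        # wait_for cancels slow() once the timeout expires.
        return await asyncio.wait_for(slow(), timeout=0.05)
    except asyncio.TimeoutError:
        return "Could not complete"


outcome = asyncio.run(main())
print(outcome)
```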
  22. Group

    - Gevent has a feature that allows us to group threads that can be scheduled together
    - A group is also a kind of parallel dispatcher
    - Useful for managing groups of similar tasks (some API call)
    - Allows us to order the execution of tasks (chain)

        import gevent
        from gevent.pool import Group

        def talk(msg):
            for i in range(3):
                print(msg)

        g1 = gevent.spawn(talk, 'bar')
        g2 = gevent.spawn(talk, 'foo')
        g3 = gevent.spawn(talk, 'foobar')

        group = Group()
        group.add(g1)
        group.add(g2)
        group.join()

        group.add(g3)
        group.join()
  23. Pool

    - Creating a concurrency limit with Pool; for I/O tasks

        import gevent
        from gevent.pool import Pool

        pool = Pool(2)

        def hello_from(n):
            print('Size of pool %s' % len(pool))

        pool.map(hello_from, range(3))
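The closest standard-library analogue to a bounded pool on the asyncio side is a semaphore. The sketch below swaps gevent's Pool for `asyncio.Semaphore` to show the same limit of two tasks at a time; the `worker` coroutine and the bookkeeping dictionary are invented for the example.

```python
import asyncio


async def worker(n, limit, state):
    async with limit:  # at most two workers pass this point at once
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)  # stand-in for an I/O call
        state["running"] -= 1
    return n


async def main():
    limit = asyncio.Semaphore(2)
    state = {"running": 0, "peak": 0}
    results = await asyncio.gather(*(worker(n, limit, state) for n in range(5)))
    return results, state["peak"]


results, peak = asyncio.run(main())
print(results, peak)
```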
  24. The Actor model with gevent

    - A model that lets you design and build concurrency at a higher level, through message passing
    - An actor is a thing that receives a message and acts on it
    - Actors have an address so they can receive messages, and a mailbox to store the incoming messages
    - Actors almost behave like threads, but they don’t map one-to-one onto threads; that depends on the dispatcher/execution context
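The mailbox idea described above can be sketched with only the standard library. This version uses `asyncio.Queue` instead of gevent so it runs without extra packages; the `Collector` subclass and the `None` stop message are assumptions made for the example, not part of the talk.

```python
import asyncio


class Actor:
    def __init__(self):
        self.inbox = asyncio.Queue()  # mailbox

    async def receive(self, message):
        raise NotImplementedError  # subclasses decide how to act on a message

    async def run(self):
        while True:
            message = await self.inbox.get()
            if message is None:  # hypothetical stop signal
                break
            await self.receive(message)


class Collector(Actor):
    """Hypothetical actor that just records what it receives."""

    def __init__(self):
        super().__init__()
        self.seen = []

    async def receive(self, message):
        self.seen.append(message)


async def main():
    actor = Collector()
    runner = asyncio.ensure_future(actor.run())
    for message in ("hello", "world", None):
        await actor.inbox.put(message)  # the only way we talk to the actor
    await runner
    return actor.seen


seen = asyncio.run(main())
print(seen)
```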
  25. The Actor model with gevent

        import gevent
        from gevent.queue import Queue

        class Actor(gevent.Greenlet):

            def __init__(self):
                self.inbox = Queue()  # mailbox
                gevent.Greenlet.__init__(self)

            def receive(self, message):
                raise NotImplementedError()  # override and implement

            def _run(self):
                self.running = True
                while self.running:
                    message = self.inbox.get()
                    self.receive(message)

    - Asynchronous message passing to actors
    - Inbox with a Queue
    - Override receive with the action that processes a received message
    - Send messages by calling inbox.put on an Actor instance
    - We violate the first policy - don’t change the internal state of an actor. (We do just that)
    - No supervision?
  27. actors ...

        # Actor is the base class defined on the earlier slide
        class AskGoogle(Actor):
            def receive(self, message):
                # do something
                service.inbox.put("message")
                google.inbox.put("poll_self")  # self send message
                gevent.sleep(0)

        class SendMessageService(Actor):
            def receive(self, message):
                # send msg
                google.inbox.put("google")
                gevent.sleep(0)

        service = SendMessageService()
        google = AskGoogle()

        service.start()
        google.start()

        service.inbox.put("start")
        gevent.joinall([
            service,
            google
        ])

    - Actor to fetch stats off a web service
    - Subclass Actor to create an actor service
    - Implement the receive method with the desired action
    - An actor ref here is an instance of the class
    - Send a message to the actor by adding an element to its queue (a violation; let’s just assume it’s an actor reference)
    - Start the actor instances