Pycon ZA
October 11, 2018

An introduction to concurrent programming with asyncio by Bruce Merry

Concurrent programming is useful any time one needs to deal with multiple concurrent tasks: a server answering requests from multiple clients, a client scraping data from multiple servers, a workflow manager running external processes in a pipeline, and more.

While there are many concurrent programming frameworks for Python, there is one that is included out of the box: asyncio. I will introduce the framework and explain the syntax and APIs. Perhaps more importantly, I will offer practical tips on development with asyncio, such as exception handling, testing, debugging, and integration with existing code.

Attendees will come away with an understanding of why they will want to use asyncio instead of multi-threading, an understanding of the basic concepts, and knowledge of some additional libraries that will help them be productive with asyncio.

Transcript

  1. www.ska.ac.za Outline

    • Concurrent programming
    • First look at asyncio
    • Futures and Tasks
    • Cancellation
    • Blocking code
    • Testing
    • Debugging
  4. The trouble with threads

    What does this code output?

    import concurrent.futures

    a = 0

    def incr():
        global a
        for i in range(1000000):
            a += 1  # read-modify-write; not guaranteed to be atomic across threads

    executor = concurrent.futures.ThreadPoolExecutor()
    futures = [executor.submit(incr) for i in range(4)]
    concurrent.futures.wait(futures)
    print(a)

    688
  6. More trouble with threads

    While connection is open:
    1. Read a request
    2. Process request
    3. Send reply

    Stop the world, I want to get off!
  12. Concurrent programming

    Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once. (Rob Pike)

    • Multiple tasks active
    • Only progress one at a time
    • Only switch tasks at known points (typically I/O)
    • Better scalability
    • Easier to reason about
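    As a rough illustration of the "only switch tasks at known points" bullet, the sketch below repeats a shared-counter experiment with coroutines instead of threads. The names `incr` and `main` and the iteration count are chosen for illustration, and `asyncio.run` assumes Python 3.7 or newer. Because the loop body contains no `await`, no other task can run in the middle of an increment, so the total should come out exact.

```python
import asyncio

counter = 0


async def incr(n):
    """Increment the shared counter n times without ever awaiting."""
    global counter
    for _ in range(n):
        counter += 1  # no await in this loop, so no task switch can happen here


async def main():
    # Four coroutines share one thread; each runs its loop without interruption.
    await asyncio.gather(*(incr(100_000) for _ in range(4)))
    return counter


result = asyncio.run(main())
print(result)
```

    Adding an `await` inside the loop would reintroduce switch points, but only at that visible line.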
  14. Example time

    import asyncio
    import aiohttp

    async def grab(session, url):
        # Fetch one URL and return the first line of the response body
        resp = await session.get(url)
        text = await resp.text()
        return text.splitlines()[0]

    async def grab_urls(urls):
        session = aiohttp.ClientSession()
        coros = [grab(session, url) for url in urls]
        # Run all the fetches concurrently; results come back in input order
        result = await asyncio.gather(*coros)
        await session.close()
        return result

    loop = asyncio.get_event_loop()
    work = grab_urls(['https://www.python.org/', 'http://www.ska.ac.za'])
    print(loop.run_until_complete(work))
  15. Support in Python versions

    Active development: use the newest Python you can

    3.4  Provisional support, based on generators, no async / await
    3.5  Adds async and await
    3.6  API stable, numerous small additions (with backports to 3.5)
    3.7  More small improvements, async / await are now true keywords
  16. About this talk

    × Every API call
    × All the syntax
    Concepts
    Practical tips
  18. Event loop

    Minimal low-level API
    • call_soon, call_soon_threadsafe
    • call_later, call_at
    • add_reader, add_writer
    • add_signal_handler
    • run_forever, run_until_complete
    • Assorted networking/socket functions

    Can be replaced by an alternative implementation
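    To make the low-level API a little more concrete, here is a minimal sketch that drives an event loop directly with `call_soon`, `call_later` and `run_forever`. The helper `record` and the 20 ms delay are invented for this example; it is one possible way to exercise these calls, not a recommended application structure.

```python
import asyncio

events = []


def record(label, loop=None):
    """Plain callback: note that it ran, and optionally stop the loop."""
    events.append(label)
    if loop is not None:
        loop.stop()


loop = asyncio.new_event_loop()
# Scheduled first, but only due about 20 ms from now; it also stops the loop.
loop.call_later(0.02, record, "later", loop)
# Scheduled second, but runs on the next loop iteration.
loop.call_soon(record, "soon")
loop.run_forever()
loop.close()
print(events)
```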
  20. Continuation style

    Painful way to code

    def step1(loop):
        do_stuff()
        # Pass loop along so step2 receives the argument it expects
        loop.call_later(1, step2, loop)

    def step2(loop):
        do_more_stuff()

    versus

    async def run():
        do_stuff()
        await asyncio.sleep(1)
        do_more_stuff()
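    The contrast can be tried out with a small runnable sketch. Everything here (the `log` list, the `done` future used to signal that the callback chain has finished, the 10 ms delays) is an assumption made for illustration; `asyncio.get_running_loop` and `asyncio.run` need Python 3.7 or newer.

```python
import asyncio

log = []


def step1(loop, done):
    log.append("callback: step 1")
    # The rest of the work has to be handed to the loop as another callback.
    loop.call_later(0.01, step2, done)


def step2(done):
    log.append("callback: step 2")
    done.set_result(None)  # signal that the chain has finished


async def run():
    # The same two steps, written top to bottom.
    log.append("coroutine: step 1")
    await asyncio.sleep(0.01)
    log.append("coroutine: step 2")


async def main():
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    loop.call_soon(step1, loop, done)
    await done   # wait for the callback chain
    await run()  # then run the coroutine version


asyncio.run(main())
print(log)
```

    Both versions do the same work, but the callback version needs an extra future just to report completion, and its control flow is spread over two functions.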
  23. Futures

    The future is already here - it’s just not very evenly distributed. (William Gibson)

    When an expression is given to the evaluator by the user, a future for that expression is returned which is a promise to deliver the value of that expression at some later time... (Baker and Hewitt, The Incremental Garbage Collection of Processes)

    A future
    • Provides storage for a result (or exception)
    • Allows the result to be waited for with await
  24. Future example

    A simple implementation of asyncio.sleep

    async def my_sleep(delay):
        loop = asyncio.get_event_loop()
        future = loop.create_future()
        # Ask the event loop to resolve the future after `delay` seconds
        loop.call_later(delay, future.set_result, None)
        await future

    # Usage:
    await my_sleep(1)
  25. Tasks

    Tasks are coroutines executing concurrently

    async def my_coroutine():
        ...

    async def run_concurrent():
        loop = asyncio.get_event_loop()
        # The task starts running in the background as soon as control returns to the loop
        task = loop.create_task(my_coroutine())
        ...
        result = await task
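    A small timing experiment can show what "executing concurrently" means in practice. This is only a sketch: `work`, the 0.2 s delays and the use of `asyncio.create_task` (Python 3.7+, equivalent here to `loop.create_task`) are choices made for the example.

```python
import asyncio
import time


async def work(name, delay):
    await asyncio.sleep(delay)  # stand-in for real I/O
    return name


async def main():
    start = time.monotonic()
    # Both tasks are scheduled before either is awaited, so their sleeps overlap.
    t1 = asyncio.create_task(work("a", 0.2))
    t2 = asyncio.create_task(work("b", 0.2))
    results = [await t1, await t2]
    elapsed = time.monotonic() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))
```

    The elapsed time should be close to 0.2 s rather than 0.4 s, since the two sleeps run side by side.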
  26. What the What is a What?

    [Diagram relating Coroutine func, Awaitable, Coroutine, Future and Task; arrow labels: (), create_task, ensure_future, ensure_future, subclass, create]
  29. Simple server example

    async def do_connection(reader, writer):
        try:
            while True:
                line = await reader.readline()
                if not line:
                    break  # Connection was closed
                writer.write(await process(line))
        finally:
            writer.close()

    loop = asyncio.get_event_loop()
    task = loop.create_task(do_connection(reader, writer))
    ...
    await task
  31. Server shutdown

    async def do_connection(reader, writer):
        try:
            while True:
                line = await reader.readline()
                if not line:
                    break  # Connection was closed
                writer.write(await process(line))
        finally:
            writer.close()

    task.cancel()

    CancelledError
  32. Catching cancellation

    async def do_connection(reader, writer):
        try:
            while True:
                line = await reader.readline()
                if not line:
                    break  # Connection was closed
                writer.write(await process(line))
        except asyncio.CancelledError:
            writer.write(b'Server shutting down\n')
            raise  # Re-raise so the task still ends up cancelled
        finally:
            writer.close()
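    The order in which the `except` and `finally` blocks run on cancellation can be checked with a self-contained sketch. The reader and writer are replaced by a long `asyncio.sleep` and an `events` list, both of which are stand-ins invented for this example.

```python
import asyncio

events = []


async def do_connection():
    try:
        while True:
            await asyncio.sleep(3600)  # stand-in for reader.readline()
    except asyncio.CancelledError:
        events.append("notify client")  # stand-in for the goodbye message
        raise                           # keep the task cancelled
    finally:
        events.append("close writer")   # stand-in for writer.close()


async def main():
    task = asyncio.create_task(do_connection())
    await asyncio.sleep(0)  # let the task start and reach its first await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        events.append("task cancelled")
    return task.cancelled()


was_cancelled = asyncio.run(main())
print(events, was_cancelled)
```

    If the handler swallowed the exception instead of re-raising it, `task.cancelled()` would be false and the caller could not tell that shutdown had been requested.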
  35. Anti-pattern: Futures as Events

    async def eat_sandwich():
        await lunchtime_future
        mouth.insert("sandwich")

    async def drink_coffee():
        await lunchtime_future
        mouth.insert("coffee")

    def finish_talk():
        lunchtime_future.set_result(None)

    What if eat_sandwich is cancelled?

    Use an asyncio.Event instead.
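    A sketch of the `asyncio.Event` version follows. Cancelling a task that is awaiting a shared future also cancels that future, which is what makes the pattern above fragile; with an event, each waiter is independent. Here `mouth` is a plain list and the event is passed in as a parameter, both assumptions made to keep the example self-contained.

```python
import asyncio

mouth = []


async def eat_sandwich(lunchtime):
    await lunchtime.wait()
    mouth.append("sandwich")


async def drink_coffee(lunchtime):
    await lunchtime.wait()
    mouth.append("coffee")


async def main():
    lunchtime = asyncio.Event()
    eater = asyncio.create_task(eat_sandwich(lunchtime))
    drinker = asyncio.create_task(drink_coffee(lunchtime))
    await asyncio.sleep(0)  # let both tasks start waiting
    eater.cancel()          # cancelling one waiter leaves the event usable
    lunchtime.set()         # plays the role of finish_talk()
    await drinker
    return eater.cancelled()


eater_cancelled = asyncio.run(main())
print(mouth, eater_cancelled)
```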
  37. Timeouts

    Don’t add timeout arguments to your async APIs. The caller can cancel.

    try:
        await asyncio.wait_for(fetch(), timeout=10)
    except asyncio.TimeoutError:
        ...

    Or, with async_timeout library,

    try:
        with async_timeout.timeout(10):
            await fetch('thing1')
            await fetch('thing2')
    except asyncio.TimeoutError:
        ...
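    The `asyncio.wait_for` variant can be run as is with a stand-in for the network call. In this sketch `fetch` is a hypothetical coroutine that just sleeps, and the delays and timeouts are arbitrary; the point is that `fetch` itself takes no timeout argument.

```python
import asyncio


async def fetch(delay):
    """Hypothetical slow operation; note that it has no timeout parameter."""
    await asyncio.sleep(delay)
    return "payload"


async def main():
    # Finishes well within the limit, so the result is passed through.
    fast = await asyncio.wait_for(fetch(0.01), timeout=1)
    try:
        # Exceeds the limit: wait_for cancels fetch() and raises TimeoutError.
        await asyncio.wait_for(fetch(10), timeout=0.05)
        slow = "finished"
    except asyncio.TimeoutError:
        slow = "timed out"
    return fast, slow


outcome = asyncio.run(main())
print(outcome)
```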
  39. Blocking calls

    Main disadvantage of asyncio versus threads
    • Blocking I/O blocks the entire event loop...
    • ...even if it drops the GIL
  40. Solutions

    • Just accept it
    • Run blocking code on another thread
    • Migrate to an async library
  42. Dispatching to a thread

    asyncio makes it easy:

    await loop.run_in_executor(executor, func, arg)

    But there are caveats
    • Cancellation won’t interrupt execution
    • func can’t directly use the event loop
    • All the usual thread safety pitfalls
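    A minimal runnable sketch of dispatching to a thread is shown below. The blocking function is simulated with `time.sleep`, and the `ticker` coroutine is an invented helper used to show that the event loop keeps running other work while the thread is busy.

```python
import asyncio
import concurrent.futures
import time


def blocking_io(seconds):
    """Stand-in for a blocking library call."""
    time.sleep(seconds)
    return seconds


async def ticker(ticks):
    # Runs on the event loop while the executor thread is blocked.
    for _ in range(3):
        await asyncio.sleep(0.02)
        ticks.append(time.monotonic())


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        tick_task = asyncio.create_task(ticker(ticks))
        result = await loop.run_in_executor(executor, blocking_io, 0.2)
        await tick_task
    return result, len(ticks)


result, tick_count = asyncio.run(main())
print(result, tick_count)
```

    Note that `blocking_io` runs in another thread, so it must not touch the event loop or unprotected shared state.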
  43. Async libraries

    aioamqp, aiobotocore, aiodns, aiodocker, aioes / aioelasticsearch, aioetcd, aiofiles, aiohttp, aiokafka, aiomcache / aiomemcache, aiomysql, aioodbc, aiopg, aioprocessing, aiopyramid, aioredis / asyncio-redis, aiosqlite, aiozmq, aiozipkin, asyncio-mongo
  45. unittest/nosetests

    With asyncio plus asynctest

    import asynctest

    class MyTest(asynctest.TestCase):
        async def setUp(self):
            ...

        async def tearDown(self):
            ...

        async def testThing(self):
            ...
  46. asynctest.ClockedTestCase

    Test time-dependent code e.g. timeouts

    import asyncio, async_timeout, asynctest

    async def times_out():
        with async_timeout.timeout(10):
            await asyncio.sleep(20)

    class TestTimeout(asynctest.ClockedTestCase):
        async def test_times_out(self):
            task = self.loop.create_task(times_out())
            await self.advance(11)
            self.assertTrue(task.done())  # Avoid hanging forever
            with self.assertRaises(asyncio.TimeoutError):
                await task
  47. pytest

    Without asyncio

    import subprocess

    def test_it():
        proc = subprocess.Popen(['/bin/false'])
        returncode = proc.wait()
        assert returncode == 1
  48. pytest

    With asyncio and pytest-asyncio

    import asyncio, pytest

    @pytest.mark.asyncio
    async def test_it():
        proc = await asyncio.create_subprocess_exec('/bin/false')
        returncode = await proc.wait()
        assert returncode == 1
  51. An easy mistake

    async def double(x):
        print('double called with', x)
        return 2 * x

    async def main():
        y = double(2)  # Oops!
        print('Double 2 is', y)

    Double 2 is <coroutine object double at 0x7f5db3696308>
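    For comparison, a sketch of the corrected call is given here. It first shows that calling `double(2)` without `await` only creates a coroutine object, then awaits a fresh call to get the value. The use of `inspect.iscoroutine` and the explicit `close()` (to avoid a "never awaited" warning) are additions for this example.

```python
import asyncio
import inspect


async def double(x):
    return 2 * x


async def main():
    forgotten = double(2)  # no await: just a coroutine object, nothing has run
    is_coro = inspect.iscoroutine(forgotten)
    forgotten.close()      # discard it cleanly instead of leaving it un-awaited
    y = await double(2)    # with await: the coroutine runs and returns 4
    return is_coro, y


is_coro, y = asyncio.run(main())
print(is_coro, y)
```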
  52. An easy mistake

    Now with PYTHONASYNCIODEBUG=1:

    Double 2 is <CoroWrapper double() running at code/debugmode.py:1, created at /usr/lib/python3.5/asyncio/coroutines.py:80>
    <CoroWrapper double() running at code/debugmode.py:1, created at /usr/lib/python3.5/asyncio/coroutines.py:80> was never yielded from
    Coroutine object created at (most recent call last):
      File "code/debugmode.py", line 10, in <module>
        asyncio.get_event_loop().run_until_complete(main())
      ...
      File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
        result = coro.send(None)
      File "code/debugmode.py", line 6, in main
        y = double(2) # Oops!
      File "/usr/lib/python3.5/asyncio/coroutines.py", line 80, in debug_wrapper
        return CoroWrapper(gen, None)
  53. Blocking the event loop

    import time

    async def sleepy():
        print(time.strftime("%H:%M:%S"))
        time.sleep(2)  # blocks the whole event loop, unlike asyncio.sleep
        print(time.strftime("%H:%M:%S"))

    19:50:45
    19:50:47
  54. Blocking the event loop

    With PYTHONASYNCIODEBUG=1:

    19:50:45
    19:50:47
    Executing <Task finished coro=<sleepy() done, defined at ./debugmode-block.py:3> result=None created at /usr/lib/python3.5/asyncio/base_events.py:367> took 2.002 seconds
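    Debug mode can also be switched on from code rather than through the environment variable. The sketch below assumes Python 3.7 or newer (`asyncio.run(..., debug=True)`) and captures the slow-callback warning from the `asyncio` logger with a small custom handler, `ListHandler`, which is invented for this example. The 0.3 s block is chosen to exceed the default slow-callback threshold of 0.1 s.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects formatted log messages in a list for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logging.getLogger("asyncio").addHandler(handler)


async def sleepy():
    time.sleep(0.3)  # blocking call: the event loop cannot run anything else


asyncio.run(sleepy(), debug=True)

# In debug mode the loop reports callbacks that ran longer than the threshold.
slow = [m for m in handler.messages if "took" in m]
print(slow)
```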
  55. Fire-and-forget tasks

    import signal, asyncio

    async def run_server():
        raise NotImplementedError  # oops

    def shutdown():
        task.cancel()
        loop.stop()

    loop = asyncio.get_event_loop()
    task = loop.create_task(run_server())
    loop.add_signal_handler(signal.SIGINT, shutdown)
    loop.run_forever()
    loop.close()
  56. Uncaught exception

    Exception seen during shutdown (even without debug mode)

    ^CTask exception was never retrieved
    future: <Task finished coro=<run_server() done, defined at code/debugmode-nocatch.py:3> exception=NotImplementedError()>
    Traceback (most recent call last):
      File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
        result = coro.send(None)
      File "code/debugmode-nocatch.py", line 4, in run_server
        raise NotImplementedError # oops
    NotImplementedError
  57. Catch and log exceptions

    import signal, asyncio, logging

    async def run_server():
        try:
            raise NotImplementedError  # oops
        except Exception:
            logging.exception('Server failed')

    def shutdown():
        task.cancel()
        loop.stop()

    loop = asyncio.get_event_loop()
    task = loop.create_task(run_server())
    loop.add_signal_handler(signal.SIGINT, shutdown)
    loop.run_forever()
    loop.close()
  58. aiomonitor

    TCP server that lets you inspect running tasks

    monitor >>> help
    Commands:
      ps              : Show task table
      where taskid    : Show stack frames for a task
      cancel taskid   : Cancel an indicated task
      signal signame  : Send a Unix signal
      stacktrace      : Print a stack trace from the event loop thread
      console         : Switch to async Python REPL
      quit            : Leave the monitor
  59. aiomonitor

    TCP server that lets you inspect running tasks

    monitor >>> ps
    Task ID          State     Task
    140592037503664  PENDING   <Task pending coro=<Connection._run() running at /home/kat/ve3/lib/python3.5/site-packages/aiokatcp/connection.py:183> wait_for=<Future pending cb=[Task._wakeup()]> cb=[Connection._done_callback()]>
    140592037504504  PENDING   <Task pending coro=<DeviceServer.join() running at /home/kat/ve3/lib/python3.5/site-packages/aiokatcp/server.py:465> wait_for=<Future pending cb=[Task._wakeup()]> cb=[_run_until_complete_cb() at /usr/lib/python3.5/asyncio/base_events.py:164]>
    140592037504784  FINISHED  <Task finished coro=<start_interactive_server() done, defined at /home/kat/ve3/lib/python3.5/site-packages/aioconsole/server.py:17> result=<Server socke....1', 31007)>]>>
    140592037591416  PENDING   <Task pending coro=<Receiver._read_stream() running at /home/kat/ve3/lib/python3.5/site-packages/katsdpingest/receiver.py:327> wait_for=<Future pending cb=[Task._wakeup()]>>
    140592037759128  FINISHED  <Task finished coro=<CBFIngest._frame_job() done, defined at /home/kat/ve3/lib/python3.5/site-packages/katsdpingest/ingest_session.py:1241> result=None>
    140592038199872  PENDING   <Task pending coro=<CBFIngest.run() running at /home/kat/ve3/lib/python3.5/site-packages/katsdpingest/ingest_session.py:1367> wait_for=<Future pending cb=[Task._wakeup()]>>
    140592038202784  PENDING   <Task pending coro=<DeviceServer._client_connected_cb.<locals>.cleanup() running at /home/kat/ve3/lib/python3.5/site-packages/aiokatcp/server.py:494> wait_for=<Future pending cb=[Task._wakeup()]>>
  60. aiomonitor

    TCP server that lets you inspect running tasks

    monitor >>> where 140592038199872
    Stack for <Task pending coro=<CBFIngest.run() running at /home/kat/ve3/lib/python3
      File "/home/kat/ve3/lib/python3.5/site-packages/katsdpingest/ingest_session.py",
        await self._run()
      File "/home/kat/ve3/lib/python3.5/site-packages/katsdpingest/ingest_session.py",
        await self._get_data()
      File "/home/kat/ve3/lib/python3.5/site-packages/katsdpingest/ingest_session.py",
        frame = await self.rx.get()
      File "/home/kat/ve3/lib/python3.5/site-packages/katsdpingest/receiver.py", line
        frame = await self._frames_complete.get()
      File "/usr/lib/python3.5/asyncio/queues.py", line 168, in get
        yield from getter
      File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
        yield self # This tells Task to wait for completion.
  61. aioconsole

    TCP server running an async REPL:

    Python 3.5.2 (default, Nov 23 2017, 16:37:01)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    ---
    This console is running in an asyncio event loop.
    It allows you to wait for coroutines using the 'await' syntax.
    Try: await asyncio.sleep(1, result=3)
    ---
    >>>
  62. SARAO, a business unit of the National Research Foundation.

    The South African Radio Astronomy Observatory (SARAO) spearheads South Africa’s activities in the Square Kilometre Array Radio Telescope, commonly known as the SKA, in engineering, science and construction. SARAO is a National Facility managed by the National Research Foundation and incorporates radio astronomy instruments and programmes such as the MeerKAT and KAT-7 telescopes in the Karoo, the Hartebeesthoek Radio Astronomy Observatory (HartRAO) in Gauteng, the African Very Long Baseline Interferometry (AVN) programme in nine African countries as well as the associated human capital development and commercialisation endeavours.

    Contact information
    Bruce Merry
    Senior Science Processing Developer
    Email: [email protected]