
Juggling GPU tasks with asyncio by Bruce Merry

Pycon ZA
October 07, 2016


Getting peak performance with a GPU requires juggling concurrent tasks: copying data to the GPU, processing data, and copying results back off can all happen in parallel. In a distributed system, data arrives from the network and results are sent back over the network. Python's asyncio module is a great way to manage all these concurrent tasks while avoiding many of the hazards of multiple threads.

This talk will describe how I've used asyncio (actually trollius, the Python 2 backport) to make this all work for GPU-accelerated real-time processing in the MeerKAT radio telescope. I'll cover some helper classes I've written for ensuring that operations happen in the right order, and talk about how changing from a threaded model to trollius has simplified the code.

No experience with GPU programming or asyncio/trollius is required or expected. Some prior exposure to event-driven programming or coroutines in Python would be useful.


Transcript

  1. Juggling GPU Tasks with Asyncio: Look Ma, No Threads!
    Bruce Merry, PyConZA 2016
  2. Outline
    1. Background: Who Am I?; GPUs; Asynchronous I/O
    2. GPUs with Asyncio: Motivation; Solution; On-GPU Synchronisation
    3. Misc Asyncio Advice: Python Versions; Exceptions; Tornado
  8. [Background / Who Am I?] Who Am I?
    • Software developer on MeerKAT project (image © SKA South Africa)
    • I know very little about astronomy
    • Background is CS, graphics and GPU programming
  11. [Background / Who Am I?] What Do I Do?
    Radio astronomy needs a lot of high-speed processing:
    • 2.2 Tbit/s of digitised data from the dishes
    • 57 Gbit/s of correlations from FPGAs
    GPUs are high-performance parallel processors.
  16. [Background / GPUs] The Short, Short Version
    (Image from https://www.pgroup.com/lit/articles/insider/v1n1a1.htm)
    • Separate processor with its own DRAM
    • DMA engines to transfer to/from system DRAM
    • CPU submits kernels to execute
    • CPU, GPU compute and GPU transfers are asynchronous
  20. [Background / Asynchronous I/O] To Thread Or Not To Thread
    Threads are cool, but they have issues:
    • Don’t scale well to 100 000 connections
    • Concurrent state changes are hard to reason about
    • Difficult to react to out-of-band events
  24. [Background / Asynchronous I/O] Asynchronous I/O Paradigm
    Asynchronous I/O avoids threading:
    • In-process non-preemptive scheduler
    • Uses select/epoll/kqueue/etc
    • Context can switch only on blocking operations
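The "context can switch only on blocking operations" point can be illustrated with a small sketch. It is not from the talk: the `worker` coroutine and the `log` list are made up for illustration, and it uses the `async def`/`await` syntax (Python 3.5+) so that it runs on current interpreters. The two appends inside `worker` have no `await` between them, so another coroutine cannot run between them.

```python
import asyncio

log = []

async def worker(name, steps):
    for i in range(steps):
        # No await between these two appends, so the scheduler cannot
        # switch to another coroutine in the middle of the pair.
        log.append((name, i, "start"))
        log.append((name, i, "end"))
        # Control returns to the event loop only here.
        await asyncio.sleep(0)

async def main():
    # Run two workers concurrently on one thread.
    await asyncio.gather(worker("a", 2), worker("b", 2))
    return log

events = asyncio.run(main())
print(events)
```

Every "start" entry is immediately followed by its matching "end" entry, even though both workers are live at the same time. With threads, that guarantee would need a lock.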
  25. [Background / Asynchronous I/O] Callback Style
    Example from twisted documentation:

        d = pod_bay_doors.open()
        d.addCallback(lambda ignored: pod.launch())
  26. [Background / Asynchronous I/O] Coroutine Style
    Example from asyncio documentation:

        @asyncio.coroutine
        def display_date(loop):
            end_time = loop.time() + 5.0
            while True:
                print(datetime.datetime.now())
                if (loop.time() + 1.0) >= end_time:
                    break
                yield from asyncio.sleep(1)

    • yield from used to yield control on a blocking operation
    • Utilises Python’s generator machinery to preserve the stack
  28. [GPUs with Asyncio / Motivation] Fully Synchronous Version
    [Timeline diagram with rows NIC, CPU, GPU; stages per batch: receive, upload, process, download, send]
  29. [GPUs with Asyncio / Motivation] A Simplified Example
    [Diagram: host buffers h_in and h_out on the CPU, device buffers d_in and d_out on the GPU; steps: 1 fill, 2 copy, 3 ×3, 4 copy]
  30. [GPUs with Asyncio / Motivation] Sample Code

        class Processor(object):
            def process_one(self):
                self.h_in[:] = np.random.standard_normal(SIZE)
                self.d_in.set(self.h_in, self.cq)
                self.kernel(self.cq, (SIZE,), (256,),
                            self.d_out.data, self.d_in.data)
                self.d_out.get(self.cq, self.h_out)
                print(np.dot(self.h_out, self.h_out) / len(self.h_out))

            def run(self):
                while True:
                    self.process_one()
                    time.sleep(0.5)
  31. [GPUs with Asyncio / Motivation] Asynchronous GPU
    [Timeline diagram with rows NIC, CPU, GPU; stages per batch: receive, upload, process, download, wait, send]
  32. [GPUs with Asyncio / Motivation] Sample Code

        class Processor(object):
            def process_one(self):
                self.h_in[:] = np.random.standard_normal(SIZE)
                self.d_in.set(self.h_in, self.cq, async=True)
                self.kernel(self.cq, (SIZE,), (256,),
                            self.d_out.data, self.d_in.data)
                self.d_out.get(self.cq, self.h_out, async=True)
                self.cq.finish()
                print(np.dot(self.h_out, self.h_out) / len(self.h_out))

            def run(self):
                while True:
                    self.process_one()
                    time.sleep(0.5)
  33. [GPUs with Asyncio / Motivation] And Now With asyncio

        class Processor(object):
            @asyncio.coroutine
            def process_one(self):
                self.h_in[:] = np.random.standard_normal(SIZE)
                self.d_in.set(self.h_in, self.cq, async=True)
                self.kernel(self.cq, (SIZE,), (256,),
                            self.d_out.data, self.d_in.data)
                self.d_out.get(self.cq, self.h_out, async=True)
                yield from loop.run_in_executor(None, self.cq.finish)
                print(np.dot(self.h_out, self.h_out) / len(self.h_out))

            @asyncio.coroutine
            def run(self):
                while True:
                    yield from self.process_one()
                    yield from asyncio.sleep(0.5)
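The effect of wrapping a blocking call with `run_in_executor` can be tried without a GPU. The sketch below makes several assumptions: `FakeQueue` is a hypothetical stand-in for an OpenCL command queue whose `finish()` simply sleeps, `ticker` stands in for other work the event loop should keep servicing, and the code uses `async def`/`await` rather than the `@asyncio.coroutine` style on the slides.

```python
import asyncio
import time

class FakeQueue:
    """Hypothetical stand-in for an OpenCL command queue."""
    def finish(self):
        # Blocks the calling thread, as a real cq.finish() would.
        time.sleep(0.2)

async def ticker(ticks):
    # Other work that should keep running while we wait on the "GPU".
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)

async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    task = asyncio.ensure_future(ticker(ticks))
    # The blocking call runs on a worker thread from the default executor.
    # This coroutine is suspended until it returns, but the event loop
    # itself stays free to run ticker().
    await loop.run_in_executor(None, FakeQueue().finish)
    task.cancel()
    return len(ticks)

tick_count = asyncio.run(main())
print(tick_count)
```

If `FakeQueue().finish()` were called directly inside `main`, the ticker would get no chance to run until the call returned.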
  34. [GPUs with Asyncio / Motivation] Fully Asynchronous
    [Timeline diagram with rows NIC, CPU, GPU; stages per batch: receive, upload, process, download, wait, send]
  35. [GPUs with Asyncio / Motivation] Code, With Race Conditions

        @asyncio.coroutine
        def process_one(self):
            cq = cl.CommandQueue(self.ctx)
            self.h_in[:] = np.random.standard_normal(SIZE)
            self.d_in.set(self.h_in, cq, async=True)
            self.kernel(cq, (SIZE,), (256,),
                        self.d_out.data, self.d_in.data)
            self.d_out.get(cq, self.h_out, async=True)
            yield from self.loop.run_in_executor(None, cq.finish)
            print(np.dot(self.h_out, self.h_out) / len(self.h_out))

        @asyncio.coroutine
        def run(self):
            while True:
                asyncio.async(self.process_one())
                yield from asyncio.sleep(0.5)
  38. [GPUs with Asyncio / Motivation] Shared Resources
    • This is fine if each stream uses its own memory
    • Not so good if memory is allocated only once
    • More generally: serialise access to any resource
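The hazard of memory that is "allocated only once" can be reproduced in miniature. This is an illustrative sketch, not code from the talk: `shared` is a hypothetical one-slot buffer standing in for a reused device array, and `asyncio.sleep` stands in for waiting on the GPU.

```python
import asyncio

shared = {"value": None}  # one buffer, allocated once and reused by every job

async def job(n, results):
    shared["value"] = n           # "upload" input n into the shared buffer
    await asyncio.sleep(0.01)     # stand-in for waiting on the GPU
    results[n] = shared["value"]  # "download": may see another job's data

async def main():
    results = {}
    # Job 1 starts before job 0 has finished, so the two overlap.
    await asyncio.gather(job(0, results), job(1, results))
    return results

results = asyncio.run(main())
print(results)
```

Job 0 reads back the value written by job 1, because the second "upload" overwrote the buffer while the first job was still waiting. Giving each job its own buffer, or serialising access to the shared one, removes the problem.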
  42. [GPUs with Asyncio / Solution] Locks
    asyncio has a Lock class, but
    • Inconvenient to enforce ordering
    • Not easy to extend in the way we want
  44. [GPUs with Asyncio / Solution] Resources And Allocations
    Futures!
    [Diagram: a Resource with futures F0, F1, F2, F3 and allocations A0, A1, A2]
  45. [GPUs with Asyncio / Solution] Resources and Allocations: Code

        class Resource(object):
            def __init__(self, value):
                self._future = asyncio.Future()
                self._future.set_result(None)
                self.value = value

            def acquire(self):
                old = self._future
                self._future = asyncio.Future()
                return ResourceAllocation(old, self._future, self.value)
  46. [GPUs with Asyncio / Solution] Resources and Allocations: Code

        class ResourceAllocation(object):
            def __init__(self, start, end, value):
                self._start = start
                self._end = end
                self.value = value

            def wait(self):
                return self._start

            def ready(self):
                self._end.set_result(None)
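A small, GPU-free experiment shows how the two classes enforce ordering. The `Resource` and `ResourceAllocation` classes below follow the slides; everything else (the `user` coroutine, the delays, the list used as the resource's value) is invented for illustration, and the code uses `async def`/`await` so it runs on current Python. The resource is created inside a running coroutine so that `asyncio.Future()` picks up the active event loop.

```python
import asyncio

class Resource:
    def __init__(self, value):
        self._future = asyncio.Future()
        self._future.set_result(None)  # the first user need not wait
        self.value = value

    def acquire(self):
        # Chain a new future: the previous user's "end" is our "start".
        old = self._future
        self._future = asyncio.Future()
        return ResourceAllocation(old, self._future, self.value)

class ResourceAllocation:
    def __init__(self, start, end, value):
        self._start, self._end, self.value = start, end, value

    def wait(self):
        return self._start

    def ready(self):
        self._end.set_result(None)

async def user(name, alloc, delay, order):
    await asyncio.sleep(delay)  # users become runnable out of order...
    await alloc.wait()          # ...but each waits for the previous user
    alloc.value.append(name)
    order.append(name)
    alloc.ready()               # hand the resource to the next user

async def main():
    res = Resource([])
    order = []
    # Allocations are acquired in the order the work must happen.
    allocs = [res.acquire() for _ in range(3)]
    await asyncio.gather(
        user("first", allocs[0], 0.03, order),
        user("second", allocs[1], 0.02, order),
        user("third", allocs[2], 0.0, order),
    )
    return order

order = asyncio.run(main())
print(order)
```

The order of access is fixed at `acquire()` time, not by which coroutine happens to reach `wait()` first. That is the ordering property the talk says is inconvenient to get from a plain lock.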
  47. [GPUs with Asyncio / Solution] Resources and Allocations: Usage

        def __init__(self, ctx):
            self.rd_in = Resource(Array(self.cq, SIZE, np.float32))
            self.rd_out = Resource(Array(self.cq, SIZE, np.float32))
            self.rh_in = Resource(np.empty(SIZE, np.float32))
            self.rh_out = Resource(np.empty(SIZE, np.float32))

        @asyncio.coroutine
        def run(self):
            while True:
                ah_in = self.rh_in.acquire()
                ah_out = self.rh_out.acquire()
                ad_in = self.rd_in.acquire()
                ad_out = self.rd_out.acquire()
                asyncio.async(self.process_one(
                    ah_in, ah_out, ad_in, ad_out))
                yield from asyncio.sleep(0.5)
  48. [GPUs with Asyncio / Solution] Resources and Allocations: Usage

        @asyncio.coroutine
        def process_one(self, ah_in, ah_out, ad_in, ad_out):
            cq = cl.CommandQueue(self.ctx)
            yield from ah_in.wait()
            ah_in.value[:] = np.random.standard_normal(SIZE)
            yield from ad_in.wait()
            ad_in.value.set(ah_in.value, cq, async=True)
            yield from async_finish(cq)
            ah_in.ready(); yield from ad_out.wait()
            self.kernel(cq, (SIZE,), (256,),
                        ad_out.value.data, ad_in.value.data)
            yield from async_finish(cq)
            ad_in.ready(); yield from ah_out.wait()
            ad_out.value.get(cq, ah_out.value, async=True)
            yield from async_finish(cq)
            ad_out.ready()
            print(np.dot(ah_out.value, ah_out.value) / len(ah_out.value))
            ah_out.ready()
  54. [GPUs with Asyncio / Solution] Are We There Yet?
    Good News Everyone!
    • Non-blocking: can service other work
    • Can overlap network, transfer, compute
    • Safe access to shared resources
    But still some issues:
    • All synchronisation via CPU
    • Unbounded growth of tasks
  57. [GPUs with Asyncio / On-GPU Synchronisation] Cross-Queue Dependencies
    OpenCL has a mechanism to specify dependencies:

        event = cl.enqueue_marker(cq1)
        cl.enqueue_barrier(cq2, [event])

    Can also wait for events on the host:

        event.wait()

    This is blocking, so wrap with run_in_executor.
  59. [GPUs with Asyncio / On-GPU Synchronisation] Back To The Future
    The event object is only known once the previous user is done with enqueueing. How can we safely obtain the value?
    [Diagram: a Resource with futures F0, F1, F2, F3 and allocations A0, A1, A2]
  60. [GPUs with Asyncio / On-GPU Synchronisation] More Code

        class ResourceAllocation(object):
            @asyncio.coroutine
            def wait_events(self):
                """Wait until device events are complete"""
                events = yield from self._start
                yield from async_wait_for_events(events)

            @asyncio.coroutine
            def wait_device(self, cq):
                """Make command-queue wait for device events"""
                events = yield from self._start
                cl.enqueue_barrier(cq, events)

            def ready(self, events=None):
                self._end.set_result(events or [])
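The key idea here is that the future's result now carries the list of events the next user must wait for. It can be exercised with stand-ins for the OpenCL objects. In this sketch `FakeEvent` and `FakeQueue` are hypothetical placeholders: appending to `FakeQueue.barriers` merely records what a real `cl.enqueue_barrier(cq, events)` call would have been given. The initial future is resolved with an empty list, which is an assumption on my part since the slides do not show that change.

```python
import asyncio

class FakeEvent:
    """Hypothetical stand-in for an OpenCL event."""
    def __init__(self, label):
        self.label = label

class FakeQueue:
    """Hypothetical stand-in for a command queue; records barrier requests."""
    def __init__(self):
        self.barriers = []

class Resource:
    def __init__(self, value):
        self._future = asyncio.Future()
        self._future.set_result([])  # assumption: nothing to wait for at first
        self.value = value

    def acquire(self):
        old, self._future = self._future, asyncio.Future()
        return ResourceAllocation(old, self._future, self.value)

class ResourceAllocation:
    def __init__(self, start, end, value):
        self._start, self._end, self.value = start, end, value

    async def wait_device(self, queue):
        # The previous user's events arrive as the future's result.
        events = await self._start
        queue.barriers.append(list(events))  # stands in for enqueue_barrier

    def ready(self, events=None):
        self._end.set_result(events or [])

async def main():
    res = Resource("buffer")
    a0, a1 = res.acquire(), res.acquire()
    q0, q1 = FakeQueue(), FakeQueue()
    await a0.wait_device(q0)
    # Hand over as soon as the work is enqueued, without waiting for it
    # to complete: the marker event tells the next user what to wait on.
    a0.ready([FakeEvent("upload")])
    await a1.wait_device(q1)
    a1.ready()
    return q0.barriers, q1.barriers

barriers0, barriers1 = asyncio.run(main())
print([e.label for e in barriers1[0]])
```

The second queue receives the first user's "upload" event as its dependency, so the wait happens on the device side rather than by blocking on the host.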
  61. [GPUs with Asyncio / On-GPU Synchronisation] Usage

        @asyncio.coroutine
        def process_one(self, ah_in, ah_out, ad_in, ad_out):
            cq = cl.CommandQueue(self.ctx)
            yield from ah_in.wait_events()
            ah_in.value[:] = np.random.standard_normal(SIZE)
            yield from ad_in.wait_device(cq)
            ad_in.value.set(ah_in.value, cq, async=True)
            ah_in.ready([cl.enqueue_marker(cq)])
            yield from ad_out.wait_device(cq)
            self.kernel(cq, (SIZE,), (256,),
                        ad_out.value.data, ad_in.value.data)
            ad_in.ready([cl.enqueue_marker(cq)])
            ...
  65. [Misc Asyncio Advice / Python Versions] What About Python 2?
    Python 2 is supported by trollius
    • Replace asyncio with trollius
    • yield from ... becomes yield From(...)
    • return ... becomes raise Return(...)
  68. [Misc Asyncio Advice / Python Versions] Python 3.5
    Python 3.5 adds new syntactic sugar
    • async def, async for, await, ...
    • Asynchronous iterables
    • More robust error checking
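For comparison, here is roughly how the `display_date` coroutine from the asyncio documentation looks with the Python 3.5 syntax. This is a sketch rather than the talk's code: the `duration` and `interval` parameters are additions so that the example finishes quickly, and it collects timestamps in a list instead of printing each one.

```python
import asyncio
import datetime

async def display_date(duration, interval):
    # "async def" replaces the @asyncio.coroutine decorator.
    loop = asyncio.get_running_loop()
    end_time = loop.time() + duration
    stamps = []
    while True:
        stamps.append(datetime.datetime.now())
        if loop.time() + interval >= end_time:
            break
        # "await" replaces "yield from".
        await asyncio.sleep(interval)
    return stamps

stamps = asyncio.run(display_date(0.05, 0.01))
print(len(stamps))
```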
  70. [Misc Asyncio Advice / Exceptions] Exceptions
    Where does the exception go?

        @asyncio.coroutine
        def func():
            raise RuntimeError('I fell over')

        @asyncio.coroutine
        def main():
            asyncio.async(func())
            yield from asyncio.sleep(1)
  71. [Misc Asyncio Advice / Exceptions] Unretrieved Exceptions (Python 3, asyncio)

        Task exception was never retrieved
        future: <Task finished coro=<coro() done, defined at /usr/lib/pyth
        Traceback (most recent call last):
          File "/usr/lib/python3.4/asyncio/tasks.py", line 238, in _step
            result = next(coro)
          File "/usr/lib/python3.4/asyncio/coroutines.py", line 141, in co
            res = func(*args, **kw)
          File "./exc_demo.py", line 8, in func
            raise RuntimeError('I fell over')
        RuntimeError: I fell over
  72. [Misc Asyncio Advice / Exceptions] Unretrieved Exceptions (Python 2, trollius)

        ERROR:trollius:Future/Task exception was never retrieved
        RuntimeError: I fell over
  73. [Misc Asyncio Advice / Exceptions] Solution 1: Join

        @asyncio.coroutine
        def func():
            raise RuntimeError('I fell over')

        @asyncio.coroutine
        def main():
            future = asyncio.async(func())
            yield from asyncio.sleep(1)
            yield from future
  74. [Misc Asyncio Advice / Exceptions] Solution 2: Log It

        @asyncio.coroutine
        def func():
            try:
                raise RuntimeError('I fell over')
            except Exception:
                logging.error('func failed', exc_info=True)

        @asyncio.coroutine
        def main():
            asyncio.async(func())
            yield from asyncio.sleep(1)
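The "join" approach works because a task stores the exception raised by its coroutine and re-raises it in whoever waits on the task. A minimal sketch, using `async def`/`await` and `asyncio.ensure_future` (the current spelling of task creation) so that it runs on modern Python:

```python
import asyncio

async def func():
    raise RuntimeError("I fell over")

async def main():
    task = asyncio.ensure_future(func())
    await asyncio.sleep(0.01)  # func() has already failed by now
    try:
        # Joining the task re-raises the stored exception here, where
        # the caller can handle it like any other exception.
        await task
    except RuntimeError as exc:
        return str(exc)
    return None

message = asyncio.run(main())
print(message)
```

Because the exception is retrieved by the `await`, asyncio does not log the "exception was never retrieved" message for this task.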
  76. [Misc Asyncio Advice / Exceptions] Cancellation
    Tasks created with asyncio.async can be cancelled.
    • Throws a CancelledError inside the coroutine
    • Makes graceful abort possible, but still needs careful thought
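The cancellation behaviour can be seen in a few lines. This is an illustrative sketch, not code from the talk: the `worker` coroutine and its `log` list are made up, and the "cleanup" step is a placeholder for whatever a real coroutine would need to release. In a scheme like the resource allocations shown earlier, that might plausibly include marking held allocations as ready so that later users are not left waiting.

```python
import asyncio

async def worker(log):
    try:
        await asyncio.sleep(10)  # stand-in for a long-running operation
    except asyncio.CancelledError:
        # The CancelledError is thrown in at the await above.
        log.append("cleanup")    # placeholder for releasing resources
        raise                    # re-raise so the task ends up cancelled

async def main():
    log = []
    task = asyncio.ensure_future(worker(log))
    await asyncio.sleep(0.01)    # let the worker reach its await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return log, task.cancelled()

log, was_cancelled = asyncio.run(main())
print(log, was_cancelled)
```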
  78. [Misc Asyncio Advice / Tornado] Tornado Integration
    Can use both on the same event loop:

        ioloop = tornado.platform.asyncio.AsyncIOMainLoop()
        ioloop.install()
  79. [Misc Asyncio Advice / Tornado] Mixing Futures
    Can convert between Tornado and asyncio futures:
    • tornado.platform.asyncio.to_asyncio_future
    • tornado.platform.asyncio.to_tornado_future
  80. [Misc Asyncio Advice / Tornado] Watch Out

        @asyncio.coroutine
        def main():
            yield from asyncio.sleep(1)
            print("Hello world")

        ioloop = asyncio.get_event_loop()
        ioloop.run_until_complete(main())

        @tornado.gen.coroutine
        def main():
            yield from tornado.gen.sleep(1)
            print("Hello world")

        ioloop = tornado.ioloop.IOLoop.current()
        ioloop.add_callback(main)
        ioloop.start()