Python async/await unraveled

Python coroutines in detail -- understand the magical await keyword and how it manages to transform callback hell into coroutine bliss.

If you have ever wondered what await actually does, explained in terms of ordinary code, this talk is for you.

I’ll introduce coroutines in async programming and briefly compare them to callbacks and promises, showing examples of their use in asyncio alongside equivalent blocking code. Then I’ll use these examples to delve further into the seemingly magical await keyword and the programming model it brings about. After the talk you should be able to understand what await X expands to, how a coroutine could be desugared into an ordinary function, and why you’re happy you don’t have to do it yourself.

Hrvoje Nikšić

October 12, 2019

Transcript

  1. Synchronous IO
     • Synchronous IO: API calls block.
       – return after the operation completes or fails
       – traditional IO: Python file objects, C stdio
     • Parallelism only possible with threads or processes.
     • No cancellation.
     • Easy to use, scales poorly.
  2. Asynchronous IO
     • Asynchronous IO: API calls don’t block.
       – return immediately, complete or fail with EAGAIN
       – explicit poll tells when to retry
     • Parallelism within a single thread.
       – event loop for central polling and dispatch
       – application logic in callbacks
     • Harder to use, scales well.
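
The "fail with EAGAIN, then poll" cycle can be sketched in Python with the standard `socket` and `selectors` modules. This is only an illustration under assumptions: the `demo` function and the use of a local socket pair are not from the talk. Python surfaces EAGAIN as `BlockingIOError`.

```python
import selectors
import socket

def demo():
    """Show a non-blocking read failing, then succeeding after an explicit poll."""
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    events = []

    # Nothing has been sent yet, so the non-blocking read fails immediately.
    try:
        a.recv(100)
    except BlockingIOError:
        events.append("would block")

    # Register interest in readability; the selector is the "explicit poll".
    sel = selectors.DefaultSelector()
    sel.register(a, selectors.EVENT_READ)
    b.send(b"ping")

    # select() reports which registered sockets are worth retrying.
    for key, _mask in sel.select(timeout=1):
        events.append(key.fileobj.recv(100))

    sel.unregister(a)
    sel.close()
    a.close()
    b.close()
    return events

print(demo())
```

An event loop is essentially this poll-and-retry step run repeatedly, with the retry dispatched to whatever code was waiting on the socket.
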
  6. Callbacks
     • JavaScript: all-async, event loop invisible.
     • Async calls accept continuation callbacks.

     Sync:

         function greet() {
           console.log("hello");
           sleep(1000); // XXX
           console.log("wide");
           sleep(1000); // XXX
           console.log("world");
         }

     Async:

         function greet() {
           console.log("hello");
           setTimeout(() => {
             console.log("wide");
             setTimeout(() => {
               console.log("world");
             }, 1000);
           }, 1000);
         }
  8. Callback Chaining

     Sync:

         function verifyUser(username, password) {
           let userInfo = database.checkUser(username, password);
           let rolesInfo = database.getRoles(userInfo);
           database.logAccess(rolesInfo);
           return userInfo;
         }

     Async (callback hell):

         function verifyUser(username, password, callback) {
           database.checkUser(username, password, (error, userInfo) => {
             if (error) {
               callback(error);
             } else {
               database.getRoles(userInfo, (error, rolesInfo) => {
                 if (error) {
                   callback(error);
                 } else {
                   database.logAccess(rolesInfo, (error) => {
                     if (error) {
                       callback(error);
                     } else {
                       callback(null, userInfo);
                     }
                   });
                 }
               });
             }
           });
         }
  10. Promises

      Async (promises):

          function verifyUser(username, password) {
            let retUserInfo;
            return database.checkUser(username, password)
              .then(userInfo => {
                retUserInfo = userInfo;
                return database.getRoles(userInfo);
              })
              .then(rolesInfo => database.logAccess(rolesInfo))
              .then(_ignore => retUserInfo);
          }

      • Async functions return promises.
        – no callback parameter
      • Better than callback hell, but...
        – still based on callbacks
        – reinvents control flow, feels like a different language
  12. JavaScript async/await

      Async (async/await):

          async function verifyUser(username, password) {
            const userInfo = await database.checkUser(username, password);
            const rolesInfo = await database.getRoles(userInfo);
            await database.logAccess(rolesInfo);
            return userInfo;
          }

      • Automatically returns a promise.
      • Synchronous look and feel, async execution.
      • Native control flow.
      • How?
  13. Python Async
      • asyncore: stdlib, deprecated since Python 3.6.
      • Twisted, Tornado, gevent: separate and incompatible event loops.
      • asyncio: new async for the standard library.
        – pluggable event loop
        – futures, coroutines
      • uvloop: drop-in replacement for the asyncio event loop.
      • curio, trio: coroutine-first.
  15. Coroutine Intro

      Async function:

          def verify_user(username, password):
              user_info = [-] database.check_user(username, password)
              roles_info = [-] database.get_roles(user_info)
              [-] database.log_access(roles_info)
              return user_info

      • Goal: use normal syntax to write async functions.
      • Idea: suspend execution at [-], resume when ready.
      • If we only had a way to magically suspend a function...
  17. Coroutine Intro

      Async function:

          def verify_user(username, password):
              user_info = yield database.check_user(username, password)
              roles_info = yield database.get_roles(user_info)
              yield database.log_access(roles_info)
              return user_info

      • Let’s just use generators!
      • Problem: inner async call may need to suspend more than once, or not at all.
      • Problem: classic generators don’t return values.
  21. Generators Rebooted

      yield from ITERABLE is roughly equivalent to:

          for _x in ITERABLE:
              yield _x

      – splitting up plain generators
      – multiple suspensions in coroutines

      VAR = yield from ITERABLE is roughly equivalent to:

          _it = iter(ITERABLE)
          while True:
              try:
                  _x = next(_it)
              except StopIteration as e:
                  VAR = e.value
                  break
              else:
                  yield _x

      • return v in a generator stops iteration and stores v in the StopIteration exception.
      • yield from gen retrieves the value returned by gen.
      • Actual desugaring is more complex to support send, throw, and close.
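
A small runnable illustration of the two properties above: the delegating generator passes the inner generator's values through, and `yield from` evaluates to the inner generator's return value. The names `inner` and `outer` are made up for this sketch.

```python
def inner():
    """Suspends twice, then returns a value via StopIteration."""
    yield 1
    yield 2
    return "done"

def outer():
    # Every value yielded by inner() passes straight through to our caller;
    # the expression itself evaluates to inner()'s return value.
    result = yield from inner()
    yield f"inner returned {result!r}"

print(list(outer()))
```

The consumer of `outer()` cannot tell which values came from `inner()`, which is what lets a coroutine suspend transparently through any number of nested calls.
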
  23. Generator Based Coroutines

      High-level:

          def verify_user(username, password):
              user_info = yield from database.check_user(username, password)
              roles_info = yield from database.get_roles(user_info)
              yield from database.log_access(roles_info)
              return user_info

      • Coroutines yield from other coroutines.
      • When the inner coroutine suspends, so does the calling one, transparently.
      • yield from can be used anywhere inside the coroutine.
  26. Generator Based Coroutines

      Low-level:

          class Database:
              def get_roles(self, user_info):
                  request = self._make_get_roles_request(user_info)
                  yield from write(self._sock, request)
                  resp = yield from readexactly(self._sock, _RESP_SIZE)
                  roles_info = self._parse_get_roles_response(resp)
                  return roles_info

      OS-level:

          def write(fd, data):
              nwritten = 0
              while nwritten < len(data):
                  try:
                      nwritten += os.write(fd, data[nwritten:])
                  except BlockingIOError:
                      yield WantWrite(fd)

      • OS-level coroutines suspend with yield.
      • Yielded object visible only to the event loop.
      • Entire call chain must be async.
      • Async coroutines do not use extended generators.
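
To make the role of the yielded object concrete, here is a toy event loop that drives generator-based coroutines over a local socket pair. It is a sketch under assumptions, not the loop used in the talk: the `WantRead`/`WantWrite` token classes, `read_some`, `write_all`, `run`, and `demo` are all hypothetical names, and the loop assumes each coroutine waits on its own socket.

```python
import selectors
import socket

class WantRead:
    def __init__(self, sock):
        self.sock = sock

class WantWrite:
    def __init__(self, sock):
        self.sock = sock

def read_some(sock, n):
    # OS-level coroutine: may suspend several times, or not at all.
    while True:
        try:
            return sock.recv(n)
        except BlockingIOError:
            yield WantRead(sock)

def write_all(sock, data):
    nwritten = 0
    while nwritten < len(data):
        try:
            nwritten += sock.send(data[nwritten:])
        except BlockingIOError:
            yield WantWrite(sock)

def server(sock, log):
    data = yield from read_some(sock, 100)
    log.append(("server got", data))
    yield from write_all(sock, data.upper())

def client(sock, log):
    yield from write_all(sock, b"hello")
    reply = yield from read_some(sock, 100)
    log.append(("client got", reply))

def run(coros):
    """Resume each coroutine until it yields a token, then wait for that event."""
    sel = selectors.DefaultSelector()
    ready, waiting = list(coros), 0
    while ready or waiting:
        for coro in ready:
            try:
                token = next(coro)
            except StopIteration:
                continue  # coroutine finished
            event = (selectors.EVENT_READ if isinstance(token, WantRead)
                     else selectors.EVENT_WRITE)
            sel.register(token.sock, event, coro)
            waiting += 1
        ready = []
        if waiting:
            for key, _mask in sel.select():
                sel.unregister(key.fileobj)
                waiting -= 1
                ready.append(key.data)
    sel.close()

def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    log = []
    run([server(a, log), client(b, log)])
    a.close()
    b.close()
    return log

print(demo())
```

Only `run` ever looks at the tokens; `server` and `client` just `yield from` and never see them, which matches the "visible only to the event loop" point.
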
  28. async/await

      Coroutine function:

          async def verify_user(username, password):
              user_info = await database.check_user(username, password)
              roles_info = await database.get_roles(user_info)
              await database.log_access(roles_info)
              return user_info

      • async def defines a coroutine function.
      • await synchronizes with another async operation.
        – only allowed in coroutines
      • await requires an awaitable, not a generator.
        – uses yield from under the hood, but generators now invisible
        – @types.coroutine makes a generator awaitable
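
The last point can be tried without any event loop by driving a coroutine by hand. In this sketch `suspend`, `ask`, and `drive` are illustrative names; `types.coroutine` is the standard-library decorator the slide refers to.

```python
import types

@types.coroutine
def suspend(token):
    # A plain generator made awaitable: it hands `token` to whoever drives
    # the coroutine and returns whatever is sent back in.
    reply = yield token
    return reply

async def ask():
    answer = await suspend("what is 6 * 7?")
    return answer + 1

def drive():
    """Play the event loop's part manually with send()."""
    coro = ask()
    question = coro.send(None)   # run until the first suspension
    try:
        coro.send(42)            # resume with a reply
    except StopIteration as e:
        return question, e.value # the coroutine's return value

print(drive())
```

Without the decorator, `await suspend(...)` would raise `TypeError`, because a bare generator object is not awaitable.
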
  30. Example: Parallel Download

      Threads:

          import concurrent.futures
          import requests

          URLS = [
              "http://www.cnn.com/",
              "http://www.huffpost.com/",
              "http://europe.wsj.com/",
              "http://www.bbc.co.uk/",
              "http://failfailfail.com/",
          ]

          def load_url(url):
              try:
                  with requests.get(url) as resp:
                      content = resp.content
                  print(f"{url!r} is {len(content)} bytes")
              except IOError:
                  print(f"failed to load {url}")

          def main():
              with concurrent.futures.ThreadPoolExecutor() as executor:
                  futures = [executor.submit(load_url, url) for url in URLS]
                  concurrent.futures.wait(futures)

          if __name__ == "__main__":
              main()

      Async:

          import asyncio
          import aiohttp

          URLS = [
              "http://www.cnn.com/",
              "http://www.huffpost.com/",
              "http://europe.wsj.com/",
              "http://www.bbc.co.uk/",
              "http://failfailfail.com/",
          ]

          async def load_url(url, session):
              try:
                  async with session.get(url) as resp:
                      content = await resp.read()
                  print(f"{url!r} is {len(content)} bytes")
              except IOError:
                  print(f"failed to load {url}")

          async def main():
              async with aiohttp.ClientSession() as session:
                  tasks = [load_url(url, session) for url in URLS]
                  # gather() accepts coroutine objects directly;
                  # asyncio.wait() rejects them on Python 3.11+.
                  await asyncio.gather(*tasks)

          if __name__ == "__main__":
              asyncio.run(main())
  31. Caveats
      • Calling a coroutine function just constructs an awaitable coroutine object.
        – await or nothing happens!
      • await doesn’t introduce parallelism.
      • No blocking or long-running code allowed.
        – but off-thread is ok
      • All coroutines run in a single thread.
        – caution with classic “async” APIs
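
The first three caveats can be observed with asyncio alone. This is a minimal sketch: `step`, `sequential`, `interleaved`, and `off_thread` are names invented for the example, and `sum(range(1000))` merely stands in for blocking work.

```python
import asyncio

async def step(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0)       # suspend once, letting other tasks run
    log.append(f"{name} end")

async def sequential():
    log = []
    coro = step("a", log)        # only constructs the coroutine object
    assert log == []             # nothing has run yet
    await coro                   # now it runs to completion
    await step("b", log)         # plain await: one after the other
    return log

async def interleaved():
    log = []
    # gather() wraps the coroutines in tasks, so they take turns in one thread.
    await asyncio.gather(step("a", log), step("b", log))
    return log

async def off_thread():
    # Blocking work is handed to the default thread pool so the loop stays free.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, sum, range(1000))

print(asyncio.run(sequential()))
print(asyncio.run(interleaved()))
print(asyncio.run(off_thread()))
```

The two logs differ only in ordering: awaiting coroutines one by one keeps them strictly sequential, while `gather()` lets them interleave at each suspension point.
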
  34. Awaitable From the Ground Up

      Coroutine awaitable:

          async def print_len(url, resp):
              content = await resp.read()
              print(f"{url!r} is {len(content)} bytes")

      CHALLENGE: write print_len without using await or yield from.

      • Learning exercise: write an awaitable!
      • print_len(...) must return an object that defines __await__.
      • __await__ must return an iterator, for example a generator object.
  37. Awaitable With Generator

      Generator using yield from:

          class print_len:
              def __init__(self, url, resp):
                  self._url, self._resp = url, resp

              def __await__(self):
                  content = yield from self._resp.read().__await__()
                  print(f"{self._url!r} is {len(content)} bytes")

      • Constructor only stores data.
      • awaitable.__await__() requests the generator.
  40. Awaitable With Plain Generator

      Plain generator:

          class print_len:
              def __init__(self, url, resp):
                  self._url, self._resp = url, resp

              def __await__(self):
                  # content = yield from self._resp.read().__await__()
                  it = iter(self._resp.read().__await__())
                  while True:
                      try:
                          _token = next(it)
                      except StopIteration as e:
                          content = e.value
                          break
                      else:
                          yield _token
                  print(f"{self._url!r} is {len(content)} bytes")

      • A for loop is not allowed: it would swallow StopIteration along with the returned value.
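
Why the `for` loop is ruled out can be seen with any generator that returns a value. In this sketch `gen`, `with_for`, and `with_next` are illustrative names.

```python
def gen():
    yield "token"
    return "result"

def with_for():
    tokens = []
    for t in gen():          # the loop catches StopIteration internally...
        tokens.append(t)
    return tokens            # ...so "result" is never seen

def with_next():
    it = gen()
    tokens = []
    while True:
        try:
            tokens.append(next(it))
        except StopIteration as e:
            return tokens, e.value   # explicit next() exposes the return value

print(with_for())
print(with_next())
```
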
  46. Awaitable With Iterator

      Awaitable class:

          class print_len:
              def __init__(self, url, resp):
                  self._url, self._resp = url, resp

              def __await__(self):
                  return _print_len_iter(self._url, self._resp)

      Iterator class:

          class _print_len_iter:
              def __init__(self, url, resp):
                  self._url, self._resp = url, resp
                  self._state = 0

              def __iter__(self):
                  return self

      Iterator class, cont.:

              def __next__(self):
                  if self._state == 0:
                      self._it = iter(self._resp.read().__await__())
                      self._state = 1
                  if self._state == 1:
                      try:
                          _token = next(self._it)
                      except StopIteration as e:
                          content = e.value
                      else:
                          return _token
                      print(f"{self._url!r} is {len(content)} bytes")
                      self._state = 2
                  if self._state == 2:
                      raise StopIteration
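
A condensed variant of the same idea can be run under asyncio to confirm that a hand-written iterator is accepted by `await`. Everything here is an assumption made for the sketch: `FakeResponse` is a hypothetical stand-in for an HTTP response, and `body_len` returns the length instead of printing it so the result can be checked.

```python
import asyncio

class FakeResponse:
    """Hypothetical stand-in for a response object with an async read()."""
    def __init__(self, body):
        self._body = body

    async def read(self):
        await asyncio.sleep(0)   # suspend once, as real IO would
        return self._body

class body_len:
    def __init__(self, resp):
        self._resp = resp

    def __await__(self):
        return _body_len_iter(self._resp)

class _body_len_iter:
    def __init__(self, resp):
        self._resp = resp
        self._it = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._it is None:
            # First call: start the inner awaitable.
            self._it = self._resp.read().__await__()
        try:
            # Forward each suspension token up to the event loop.
            return next(self._it)
        except StopIteration as e:
            # Inner awaitable finished: finish too, carrying our own result.
            raise StopIteration(len(e.value)) from None

async def main():
    return await body_len(FakeResponse(b"hello"))

print(asyncio.run(main()))
```

Raising `StopIteration` with a value from `__next__` is how a non-generator iterator "returns" a result to the awaiting coroutine, mirroring what `return` does inside a generator.
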
  47. Further reading
      • Python Concurrency From the Ground Up
        https://www.youtube.com/watch?v=MCs5OvhV9S4
      • Notes on structured concurrency, or: Go statement considered harmful
        https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/