Python async/await unraveled

Python coroutines in detail -- understand the magical await keyword and how it manages to transform callback hell into coroutine bliss.

If you have ever wondered what await actually does, explained in terms of ordinary code, this talk is for you.

I’ll introduce coroutines in async programming and briefly compare them to callbacks and promises, showing examples of their use in asyncio alongside equivalent blocking code. Then I’ll use these examples to delve further into the seemingly magical await keyword and the programming model it brings about. After the talk you should be able to understand what await X expands to, how a coroutine could be desugared into an ordinary function, and why you’re happy you don’t have to do it yourself.

Hrvoje Nikšić

October 12, 2019

Transcript

  1. python async/await unraveled
    Hrvoje Nikšić
    WebCamp Zagreb 2019
    AVL-AST Croatia

  2. Synchronous IO
    • Synchronous IO: API calls block.
    – return after the operation completes or fails
    – traditional IO: Python file objects, C stdio
    • Parallelism only possible with threads or processes.
    • No cancellation.
    • Easy to use, scales poorly.

  3. Asynchronous IO
    • Asynchronous IO: API calls don’t block.
    – return immediately, complete or fail with EAGAIN
    – explicit poll tells when to retry
    • Parallelism within a single thread.
    – event loop for central polling and dispatch
    – application logic in callbacks
    • Harder to use, scales well.
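The non-blocking pattern on this slide can be sketched with Python's standard socket and selectors modules. This is a minimal illustration, not code from the talk: the socket pair and the helper names read_when_ready and demo are assumptions made for the example.

```python
import selectors
import socket

def read_when_ready(sock, nbytes):
    """Read from a non-blocking socket, polling until data is available."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    try:
        while True:
            try:
                # Returns immediately: either data or BlockingIOError (EAGAIN).
                return sock.recv(nbytes)
            except BlockingIOError:
                # Explicit poll: wait here until the socket becomes readable.
                sel.select()
    finally:
        sel.close()

def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    try:
        try:
            a.recv(16)
            first_attempt = "data"
        except BlockingIOError:
            first_attempt = "EAGAIN"  # nothing has been sent yet
        b.send(b"hello")
        return first_attempt, read_when_ready(a, 16)
    finally:
        a.close()
        b.close()

print(demo())
```

A real event loop registers many sockets with one selector and dispatches to whichever is ready, instead of polling one socket at a time as this sketch does.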

  8. Callbacks
    • JavaScript: all-async, event loop invisible.
    • Async calls accept continuation callbacks.
    Sync
    function greet() {
      console.log("hello");
      sleep(1000); // XXX
      console.log("wide");
      sleep(1000); // XXX
      console.log("world");
    }
    Async
    function greet() {
      console.log("hello");
      setTimeout(() => {
        console.log("wide");
        setTimeout(() => {
          console.log("world");
        }, 1000);
      }, 1000);
    }
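For comparison, the same callback style can be sketched in Python, where the event loop is an explicit object. The example assumes asyncio's loop.call_later as the counterpart of setTimeout; the log list, the done future, and the short delay are illustrative choices, not part of the talk.

```python
import asyncio

def greet(loop, log, done, delay=0.01):
    """Callback-style greet: each step schedules the next one."""
    log.append("hello")

    def say_wide():
        log.append("wide")
        loop.call_later(delay, say_world)

    def say_world():
        log.append("world")
        done.set_result(None)  # tell the caller that the chain has finished

    loop.call_later(delay, say_wide)

def run_greet():
    loop = asyncio.new_event_loop()
    try:
        log = []
        done = loop.create_future()
        greet(loop, log, done)
        log.append("greet returned")  # greet() returns before "wide" is logged
        loop.run_until_complete(done)
        return log
    finally:
        loop.close()

print(run_greet())
```

As in the JavaScript version, greet returns immediately and the remaining steps run later from the event loop.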

  10. Callback Chaining
    Sync
    function verifyUser(username, password) {
      let userInfo = database.checkUser(username, password);
      let rolesInfo = database.getRoles(userInfo);
      database.logAccess(rolesInfo);
      return userInfo;
    }
    Async: callback hell
    function verifyUser(username, password, callback) {
      database.checkUser(username, password, (error, userInfo) => {
        if (error) {
          callback(error);
        } else {
          database.getRoles(userInfo, (error, rolesInfo) => {
            if (error) {
              callback(error);
            } else {
              database.logAccess(rolesInfo, (error) => {
                if (error) {
                  callback(error);
                } else {
                  callback(null, userInfo);
                }
              });
            }
          });
        }
      });
    }

  12. Promises
    Async: promises
    function verifyUser(username, password) {
      let retUserInfo;
      return database.checkUser(username, password)
        .then(userInfo => {
          retUserInfo = userInfo;
          return database.getRoles(userInfo);
        })
        .then(rolesInfo => database.logAccess(rolesInfo))
        .then(_ignore => retUserInfo);
    }
    • Async functions return promises.
    – no callback parameter
    • Better than callback hell, but...
    – still based on callbacks
    – reinvents control flow, feels like different language

  14. JavaScript async/await
    Async: async/await
    async function verifyUser(username, password) {
      const userInfo = await database.checkUser(username, password);
      const rolesInfo = await database.getRoles(userInfo);
      await database.logAccess(rolesInfo);
      return userInfo;
    }
    • Automatically returns a promise.
    • Synchronous look and feel, async execution.
    • Native control flow.
    • How?

  15. Python Async
    • asyncore: stdlib, deprecated since Python 3.6.
    • Twisted, Tornado, gevent: separate and incompatible event loops.
    • asyncio: new async for the standard library.
    – pluggable event loop
    – futures, coroutines
    • uvloop: drop-in replacement for the asyncio event loop.
    • curio, trio: coroutine-first.

  17. Coroutine Intro
    Async function
    def verify_user(username, password):
        user_info = [-] database.check_user(username, password)
        roles_info = [-] database.get_roles(user_info)
        [-] database.log_access(roles_info)
        return user_info
    • Goal: use normal syntax to write async functions.
    • Idea: suspend execution at [-], resume when ready.
    • If we only had a way to magically suspend a function...

  19. Coroutine Intro
    Async function
    def verify_user(username, password):
        user_info = yield database.check_user(username, password)
        roles_info = yield database.get_roles(user_info)
        yield database.log_access(roles_info)
        return user_info
    • Goal: use normal syntax to write async functions.
    • Idea: suspend execution at [-], resume when ready.
    • If we only had a way to magically suspend a function...
    • Let’s just use generators!
    • Problem: inner async call may need to suspend more than once, or not at all.
    • Problem: classic generators don’t return values.

  24. Generators Rebooted
    yield from ITERABLE
    is roughly equivalent to:
    for _x in ITERABLE:
        yield _x
    – splitting up plain generators
    – multiple suspensions in coroutines
    VAR = yield from ITERABLE
    is roughly equivalent to:
    _it = iter(ITERABLE)
    while True:
        try:
            _x = next(_it)
        except StopIteration as e:
            VAR = e.value
            break
        else:
            yield _x
    • return v in a generator stops iteration and stores v in the StopIteration exception.
    • yield from gen retrieves the value returned by gen.
    • Actual desugaring more complex to support send, throw, and close.
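The two bullet points about return values can be checked with a few lines of plain Python. This is a small sketch; the generator names inner and outer and the helper returned_value are made up for the illustration.

```python
def inner():
    yield 1
    yield 2
    return "inner result"  # carried out of the generator in StopIteration.value

def outer():
    # Re-yields 1 and 2, then receives the value returned by inner().
    result = yield from inner()
    yield result.upper()

def returned_value(gen):
    """Exhaust a generator and return the value attached to StopIteration."""
    try:
        while True:
            next(gen)
    except StopIteration as e:
        return e.value

print(list(inner()))            # only the yielded values are visible here
print(returned_value(inner()))  # the return value, read from StopIteration
print(list(outer()))            # yield from forwards the yields and picks up the result
```

Iterating with list() or a for loop sees only the yielded values; the returned value is reachable through StopIteration.value or yield from.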

  26. Generator Based Coroutines
    High-level
    def verify_user(username, password):
        user_info = yield from database.check_user(username, password)
        roles_info = yield from database.get_roles(user_info)
        yield from database.log_access(roles_info)
        return user_info
    • Coroutines yield from other coroutines.
    • When the inner coroutine suspends, so does the calling one, transparently.
    • yield from can be used anywhere inside the coroutine.

  29. Generator Based Coroutines
    Low-level
    class Database:
        def get_roles(self, user_info):
            request = self._make_get_roles_request(user_info)
            yield from write(self._sock, request)
            resp = yield from readexactly(self._sock, _RESP_SIZE)
            roles_info = self._parse_get_roles_response(resp)
            return roles_info
    OS-level
    def write(fd, data):
        nwritten = 0
        while nwritten < len(data):
            try:
                nwritten += os.write(fd, data[nwritten:])
            except BlockingIOError:
                yield WantWrite(fd)
    • OS-level coroutines suspend with yield.
    • Yielded object visible only to the event loop.
    • Entire call chain must be async.
    • Async coroutines are built on extended generators (yield from, return values).
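To show who receives the yielded object, here is a toy event loop that drives generator-based coroutines like the ones on this slide. It is a sketch under assumptions: WantRead, readexactly, echo_upper, client, and run_all are illustrative names written for this example, each coroutine waits on its own file descriptor, and os.read/os.write on socket descriptors limits it to Unix-like systems.

```python
import os
import selectors
import socket

class WantRead:
    def __init__(self, fd):
        self.fd = fd

class WantWrite:
    def __init__(self, fd):
        self.fd = fd

def write(fd, data):
    nwritten = 0
    while nwritten < len(data):
        try:
            nwritten += os.write(fd, data[nwritten:])
        except BlockingIOError:
            yield WantWrite(fd)  # suspend until fd is writable

def readexactly(fd, n):
    chunks = []
    while n > 0:
        try:
            chunk = os.read(fd, n)
        except BlockingIOError:
            yield WantRead(fd)  # suspend until fd is readable
            continue
        if not chunk:
            raise EOFError("connection closed early")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def echo_upper(fd, n):
    data = yield from readexactly(fd, n)
    yield from write(fd, data.upper())

def client(fd, msg):
    yield from write(fd, msg)
    return (yield from readexactly(fd, len(msg)))

def run_all(coros):
    """Toy event loop: resume each coroutine once its fd is ready."""
    sel = selectors.DefaultSelector()
    results = {}
    ready = list(enumerate(coros))
    while ready or sel.get_map():
        for idx, coro in ready:
            try:
                token = next(coro)  # run until the next suspension
            except StopIteration as e:
                results[idx] = e.value
            else:
                # Only the loop inspects the yielded token.
                if isinstance(token, WantRead):
                    event = selectors.EVENT_READ
                else:
                    event = selectors.EVENT_WRITE
                sel.register(token.fd, event, (idx, coro))
        ready = []
        if sel.get_map():
            for key, _ in sel.select():
                sel.unregister(key.fd)
                ready.append(key.data)
    sel.close()
    return [results[i] for i in range(len(coros))]

def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    try:
        return run_all([client(a.fileno(), b"hello"),
                        echo_upper(b.fileno(), 5)])
    finally:
        a.close()
        b.close()

print(demo())
```

The intermediate coroutines never look at WantRead or WantWrite; yield from passes the tokens straight through to run_all.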

  31. async/await
    Coroutine function
    async def verify_user(username, password):
        user_info = await database.check_user(username, password)
        roles_info = await database.get_roles(user_info)
        await database.log_access(roles_info)
        return user_info
    • async def defines a coroutine function.
    • await synchronizes with another async operation.
    – only allowed in coroutines
    • await requires an awaitable, not a generator.
    – uses yield from under the hood, but generators now invisible
    – @types.coroutine makes a generator awaitable
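The last point can be tried without any event loop. In this sketch a generator decorated with @types.coroutine is awaited from an async def function, and a hand-written driver plays the role of the loop; suspend, ask_twice, and drive are hypothetical names chosen for the example.

```python
import types

@types.coroutine
def suspend(token):
    """Generator-based awaitable: hands token to whoever drives the coroutine."""
    reply = yield token
    return reply

async def ask_twice():
    first = await suspend("first question")
    second = await suspend("second question")
    return first + second

def drive(coro, answers):
    """Stand-in for an event loop: answer each yielded token in turn."""
    tokens = [coro.send(None)]  # start the coroutine, get the first token
    for answer in answers:
        try:
            tokens.append(coro.send(answer))
        except StopIteration as e:
            return tokens, e.value
    coro.close()
    raise RuntimeError("coroutine wanted more answers")

print(drive(ask_twice(), [1, 2]))
```

Without the decorator, awaiting the plain generator would fail, since a generator object is not an awaitable.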

  33. Example: Parallel Download
    Threads
    import concurrent.futures
    import requests
    URLS = ["http://www.cnn.com/",
            "http://www.huffpost.com/",
            "http://europe.wsj.com/",
            "http://www.bbc.co.uk/",
            "http://failfailfail.com/"]
    def load_url(url):
        try:
            with requests.get(url) as resp:
                content = resp.content
            print(f"{url!r} is {len(content)} bytes")
        except IOError:
            print(f"failed to load {url}")
    def main():
        with concurrent.futures.ThreadPoolExecutor() as executor:
            futures = [executor.submit(load_url, url) for url in URLS]
            concurrent.futures.wait(futures)
    if __name__ == "__main__":
        main()
    Async
    import asyncio
    import aiohttp
    URLS = ["http://www.cnn.com/",
            "http://www.huffpost.com/",
            "http://europe.wsj.com/",
            "http://www.bbc.co.uk/",
            "http://failfailfail.com/"]
    async def load_url(url, session):
        try:
            async with session.get(url) as resp:
                content = await resp.read()
            print(f"{url!r} is {len(content)} bytes")
        except IOError:
            print(f"failed to load {url}")
    async def main():
        async with aiohttp.ClientSession() as session:
            tasks = [load_url(url, session) for url in URLS]
            # gather() wraps the coroutines in tasks; asyncio.wait()
            # rejects bare coroutine objects since Python 3.11.
            await asyncio.gather(*tasks)
    if __name__ == "__main__":
        asyncio.run(main())

  34. Caveats
    • Calling a coroutine function just constructs an awaitable coroutine object.
    – await or nothing happens!
    • await doesn’t introduce parallelism.
    • No blocking or long-running code allowed.
    – but off-thread is ok
    • All coroutines run in a single thread.
    – caution with classic “async” APIs
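These caveats can be observed with asyncio alone. The sketch below assumes a hypothetical work coroutine that records when it starts and ends; the delays are arbitrary, and run_in_executor with the default executor stands in for "off-thread".

```python
import asyncio
import time

async def work(log, name, delay):
    log.append(f"{name} start")
    await asyncio.sleep(delay)
    log.append(f"{name} end")
    return name

def nothing_happens():
    log = []
    coro = work(log, "a", 0)   # only constructs the coroutine object
    untouched = (log == [])    # the body has not started running
    coro.close()               # avoids the "never awaited" warning
    return untouched

async def sequential():
    log = []
    await work(log, "a", 0.05)  # b does not start until a has finished
    await work(log, "b", 0.01)
    return log

async def concurrently():
    log = []
    # gather() schedules both coroutines, so their waits overlap.
    await asyncio.gather(work(log, "a", 0.05), work(log, "b", 0.01))
    return log

async def off_thread():
    loop = asyncio.get_running_loop()
    # The blocking call runs in a worker thread; the event loop stays free.
    return await loop.run_in_executor(None, time.sleep, 0.01)

print(nothing_happens())
print(asyncio.run(sequential()))
print(asyncio.run(concurrently()))
```

Awaiting the coroutines one after another runs them strictly in sequence; the overlap comes from gather (or tasks), not from await itself.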

  37. Awaitable From the Ground Up
    Coroutine awaitable
    # CHALLENGE: write print_len without using await or yield from
    async def print_len(url, resp):
        content = await resp.read()
        print(f"{url!r} is {len(content)} bytes")
    • Learning exercise: write an awaitable!
    • print_len(...) must return an object that defines __await__.
    • __await__ must return an iterator, such as a generator.

  40. Awaitable With Generator
    Generator using yield from
    class print_len:
        def __init__(self, url, resp):
            self._url, self._resp = url, resp
        def __await__(self):
            content = yield from self._resp.read().__await__()
            print(f"{self._url!r} is {len(content)} bytes")
    • Constructor only stores data.
    • awaitable.__await__() requests generator.

  43. Awaitable With Plain Generator
    Plain generator
    class print_len:
        def __init__(self, url, resp):
            self._url, self._resp = url, resp
        def __await__(self):
            # content = yield from self._resp.read().__await__()
            it = iter(self._resp.read().__await__())
            while True:
                try:
                    _token = next(it)
                except StopIteration as e:
                    content = e.value
                    break
                else:
                    yield _token
            print(f"{self._url!r} is {len(content)} bytes")
    • A for loop is not allowed: it would swallow StopIteration and discard the returned value.

  48. Awaitable With Iterator
    Awaitable class
    class print_len:
        def __init__(self, url, resp):
            self._url, self._resp = url, resp
        def __await__(self):
            return _print_len_iter(self._url, self._resp)
    Iterator class
    class _print_len_iter:
        def __init__(self, url, resp):
            self._url, self._resp = url, resp
            self._state = 0
        def __iter__(self):
            return self
    Iterator class, cont.
        def __next__(self):
            if self._state == 0:
                self._it = iter(self._resp.read().__await__())
                self._state = 1
            if self._state == 1:
                try:
                    _token = next(self._it)
                except StopIteration as e:
                    content = e.value
                else:
                    return _token
                print(f"{self._url!r} is {len(content)} bytes")
                self._state = 2
            if self._state == 2:
                raise StopIteration
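An iterator-based awaitable of this kind can be exercised under asyncio. The sketch below is a variant, not the slide's code: FakeResponse is a hypothetical stand-in for an HTTP response, and describe_len returns its message through StopIteration instead of printing it, so the result can be inspected.

```python
import asyncio

class FakeResponse:
    """Hypothetical stand-in for an HTTP response with an async read()."""
    def __init__(self, body):
        self._body = body

    async def read(self):
        await asyncio.sleep(0)  # suspend once, as real network IO would
        return self._body

class describe_len:
    """Awaitable written without async def, await, yield, or yield from."""
    def __init__(self, url, resp):
        self._url, self._resp = url, resp

    def __await__(self):
        return _DescribeLenIter(self._url, self._resp)

class _DescribeLenIter:
    def __init__(self, url, resp):
        self._url, self._resp = url, resp
        self._it = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._it is None:
            # First resumption: start the inner awaitable.
            self._it = self._resp.read().__await__()
        try:
            # Pass the inner suspension token up to the event loop.
            return next(self._it)
        except StopIteration as e:
            content = e.value
        # Finishing with a result is spelled "raise StopIteration(value)".
        raise StopIteration(f"{self._url!r} is {len(content)} bytes")

async def main():
    resp = FakeResponse(b"x" * 1234)
    return await describe_len("http://example.com/", resp)

print(asyncio.run(main()))
```

Like the slide's version, this iterator has no send or throw methods, so it covers only the plain resume path; a production awaitable would also need to handle cancellation.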

  50. Further reading
    • Python Concurrency From the Ground Up
    https://www.youtube.com/watch?v=MCs5OvhV9S4
    • Notes on structured concurrency, or: Go statement considered harmful
    https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/