
Just Add Await: Retrofitting Async Into Django


A talk I gave at PyCon Australia 2019, and later gave in modified form at DjangoCon US 2019.

Andrew Godwin

August 02, 2019

Transcript

  1. JUST ADD AWAIT
    ANDREW GODWIN // @andrewgodwin
    RETROFITTING ASYNC INTO DJANGO


  2. Hi, I’m
    Andrew Godwin
    • Django contributor (Migrations/Channels)
    • Principal Engineer at
    • I see Python threads in my sleep now



  4. async def view(request):
        return TemplateResponse(
            request,
            "template.html",
            {"article": await api_call(pk=5)}
        )


  5. 1. Async In Brief
        Threading, cooperation, and intrigue
    2. Big Framework Problems
        Spanning two worlds with one vision
    3. Django In Depth
        Handlers, Requests, Middleware & Views


  6. 1. ASYNC IN BRIEF


  7. You don't know when it'll switch!
    Threads are preemptive



  9. You don't know when it'll switch!
    Threads are preemptive
    Coroutines are cooperative
    They only yield at an await.



  12. Coroutines need an event loop
    It's where the program idles between tasks


  13. An event loop runs in a single thread
    Yes, you can have threads and coroutines!

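The cooperative behaviour described on the slides above can be illustrated with a small sketch. It is not from the talk; `worker`, `main`, and `run_demo` are hypothetical names, and `asyncio.sleep(0)` stands in for any real awaitable I/O. Each coroutine runs uninterrupted until it reaches an `await`, and both share one event loop in one thread.

```python
import asyncio


async def worker(name, log):
    # No await between these two appends, so nothing else can run in between.
    log.append(f"{name}: step 1")
    log.append(f"{name}: step 2")
    # The await is the only place this coroutine hands control back to the loop.
    await asyncio.sleep(0)
    log.append(f"{name}: step 3")


async def main():
    log = []
    # Both coroutines are driven by the same event loop in a single thread.
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    for entry in run_demo():
        print(entry)
```

Steps 1 and 2 of each worker always appear back to back in the log, because there is no `await` between them for the loop to switch on.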

  14. Sync Thread
    Sync Thread
    Async Thread


  15. Threads are slow!
    The more you add, the worse it gets.


  16. Async is fast...
    As long as you are I/O bound!

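A rough way to see the I/O-bound case is to overlap several waits on one event loop. The sketch below is an illustration only: `fake_io`, `fetch_all`, and `timed_run` are hypothetical names, and `asyncio.sleep` is a stand-in for a network or database call.

```python
import asyncio
import time


async def fake_io(delay):
    # Stand-in for an I/O wait; the event loop is free while this sleeps.
    await asyncio.sleep(delay)
    return delay


async def fetch_all(delays):
    # All waits overlap, so total time is close to the longest single delay.
    return await asyncio.gather(*(fake_io(d) for d in delays))


def timed_run(delays):
    start = time.perf_counter()
    results = asyncio.run(fetch_all(delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run([0.1] * 5)
    print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

If `fake_io` did CPU work instead of awaiting, the calls would run one after another and the overlap would disappear, which is the caveat on the slide.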

  17. Async functions are different to sync
    They are not cross-compatible!


  18. # Call sync from sync
    result = function()
    # Call async from async
    result = await function()


  19. # Call async from sync in Python 3.7
    result = asyncio.run(function())
    # Python 3.6
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    result = loop.run_until_complete(function())

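As a runnable version of the "async from sync" case, here is a minimal sketch for Python 3.7 or later. The coroutine body and the helper names `call_from_sync` and `call_from_running_loop` are made up for illustration. It also shows one limit of `asyncio.run`: it refuses to start when an event loop is already running in the current thread.

```python
import asyncio


async def function():
    await asyncio.sleep(0)
    return 42


def call_from_sync():
    # asyncio.run creates an event loop, runs the coroutine to completion,
    # and closes the loop again.
    return asyncio.run(function())


async def call_from_running_loop():
    # Inside a running loop, asyncio.run raises RuntimeError.
    coro = function()
    try:
        asyncio.run(coro)
    except RuntimeError:
        coro.close()  # Avoid a "coroutine was never awaited" warning.
        return "RuntimeError"
    return "no error"


if __name__ == "__main__":
    print(call_from_sync())
    print(asyncio.run(call_from_running_loop()))
```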

  20. # Call sync from async
    import asyncio
    from concurrent.futures import ThreadPoolExecutor
    executor = ThreadPoolExecutor(max_workers=3)
    loop = asyncio.get_event_loop()
    result = await loop.run_in_executor(
        executor,
        function,
        *args,
    )


  21. Async calling Sync is dangerous
    It has to be in a separate thread or it'll block the event loop

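To make the "separate thread" point concrete, the sketch below runs a blocking call through `loop.run_in_executor` while a second coroutine keeps ticking on the event loop. The names `blocking_function`, `heartbeat`, and `main` are hypothetical, and `time.sleep` stands in for any blocking library call.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_function(seconds):
    # Ordinary sync code: this would freeze the event loop if called directly.
    time.sleep(seconds)
    return time.perf_counter()


async def heartbeat(ticks):
    # Records timestamps; it can only do so while the loop is not blocked.
    for _ in range(3):
        await asyncio.sleep(0.01)
        ticks.append(time.perf_counter())


async def main():
    ticks = []
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=3) as executor:
        finished_at, _ = await asyncio.gather(
            loop.run_in_executor(executor, blocking_function, 0.2),
            heartbeat(ticks),
        )
    # True if every heartbeat tick landed before the blocking call returned.
    return len(ticks) == 3 and ticks[-1] < finished_at


if __name__ == "__main__":
    print("loop stayed responsive:", asyncio.run(main()))
```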

  22. Sync Thread
    Async Thread



  24. It's complicated
    This is why I encourage writing sync code at first!


  25. 2. BIG FRAMEWORK PROBLEMS


  26. Backwards compatibility is crucial
    Throw it away, and nobody will adopt your new thing


  27. A function cannot be both sync and async
    You have to pick one. I've tried.


  28. # You can have this...
    result = cache.get("my-key")
    # Or this. Not both.
    result = await cache.get("my-key")

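The "pick one" problem shows up as soon as an async method is called without `await`. In this sketch, `AsyncCache` is a made-up class, not Django's cache API. The same call expression yields a coroutine object in sync code and the actual value only when awaited.

```python
import asyncio
import inspect


class AsyncCache:
    """Hypothetical async-only cache, used purely for illustration."""

    def __init__(self):
        self._data = {"my-key": "value"}

    async def get(self, key):
        await asyncio.sleep(0)  # Stand-in for a network round trip.
        return self._data.get(key)


def call_without_await(cache):
    # In sync code the call only builds a coroutine object; nothing runs.
    coro = cache.get("my-key")
    is_coroutine = inspect.iscoroutine(coro)
    coro.close()  # Avoid a "coroutine was never awaited" warning.
    return is_coroutine


async def call_with_await(cache):
    return await cache.get("my-key")


if __name__ == "__main__":
    cache = AsyncCache()
    print(call_without_await(cache))
    print(asyncio.run(call_with_await(cache)))
```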

  29. Totally different libraries
    And there aren't even standards like DBAPI2


  30. # Even sleep is different!
    time.sleep(0.01)
    await asyncio.sleep(0.1)


  31. Lack of standards
    It's just too early on for things to coalesce.


  32. async def application(scope, receive, send):
        await receive()
        ...
        await send({"type": "http.response.start", ...})

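For readers who have not seen ASGI before, here is a minimal sketch of a complete HTTP application plus a tiny in-memory driver. The driver `call_app` and its fake `receive`/`send` callables are assumptions made for illustration; a real ASGI server supplies those. The message types `http.request`, `http.response.start`, and `http.response.body` come from the ASGI HTTP specification.

```python
import asyncio


async def application(scope, receive, send):
    # One call per connection; `scope` describes the request.
    assert scope["type"] == "http"
    await receive()  # The first message is the http.request event.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, ASGI"})


async def call_app(app):
    # Hypothetical in-memory stand-in for an ASGI server.
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": []}
    await app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_app(application)):
        print(message["type"])
```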

  33. Different language features
    No more attribute access


  34. # While this can have an async version...
    instance = await Model.objects.get()
    # There's no async way of doing this
    print(instance.foreign_key.name)


  35. Threads matter!
    Sync code all wants to run in the same thread still.


  36. Async has to add, not replace
    Sync Django is still important
    Things need to look familiar
    We don't want a wildly different API feel
    Things need to be safe
    Deadlocking or blocking is easier than ever


  37. 3. DJANGO IN DEPTH


  38. Handler
    ASGI / WSGI
    Server
    Middleware
    View
    ORM
    Template
    URL Router Forms


  39. Outside-in approach
    Async outside, sync inside


  40. Handler
    ASGI / WSGI
    Server
    Middleware
    View
    ORM
    Template
    URL Router Forms
    Phase One
    Phase Two Phase Three


  41. Phase One: ASGI Support
    Allowing Django to be async at all
    Phase Two: Async Views
    Unlocking async use in normal apps
    Phase Three: The ORM
    High-level async use for the most common case


  42. Phase One: Django 3.0
    It's already committed!
    Phase Two: Django 3.1
    Unless things really do turn out nicely...
    Phase Three: Django 3.2/4.0
    There's a lot of work here.


  43. Each phase brings concrete benefits
    Even if we stop!


  44. Phase One: ASGI


  45. Django predates WSGI
    Which turns out to actually help, in the end





  47. "Just so you know, Django is a smug, arrogant framework that doesn’t play nice with others. [...] Or at least, that’s the impression you’d get from reading the rants..."
    James Bennett, "Django and NIH", 2006


  48. Custom request/response objects
    Most other frameworks did this too
    Custom "handler" classes
    Abstracts away WSGI
    Custom middleware
    Wow, was this contentious at the time!


  49. WSGIHandler
    __call__
    WSGI Server
    WSGIRequest
    BaseHandler
    get_response
    URLs Middleware
    View
    __call__
    HTTP protocol
    Socket handling
    Transfer encodings
    Headers-to-META
    Upload file wrapping
    GET/POST parsing
    Exception catching
    Atomic view wrapper


  50. WSGIHandler
    __call__
    WSGI Server
    WSGIRequest
    BaseHandler
    get_response
    URLs Middleware
    View
    __call__
    ASGIHandler
    __call__
    ASGI Server
    ASGIRequest


  51. WSGIHandler
    __call__
    WSGI Server
    WSGIRequest
    BaseHandler
    get_response
    URLs Middleware
    View
    __call__
    ASGIHandler
    __call__
    ASGI Server
    ASGIRequest
    Asynchronous


  52. ASGI is mostly WSGI-compatible
    With better definitions of bytes versus unicode


  53. if self.scope.get('client'):
        self.META['REMOTE_ADDR'] = self.scope['client'][0]
        self.META['REMOTE_HOST'] = self.META['REMOTE_ADDR']
        self.META['REMOTE_PORT'] = self.scope['client'][1]

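The scope-to-META translation can be tried outside Django with a plain function. This is a simplified sketch, not Django's `ASGIRequest` code: `build_meta` is a hypothetical helper, and the header handling assumes the ASGI convention of lower-cased byte-string header names, which is where the bytes-versus-unicode distinction becomes visible.

```python
def build_meta(scope):
    """Build a WSGI-style META dict from an ASGI HTTP scope (sketch)."""
    meta = {}
    # ASGI gives the client as a (host, port) pair, or None if unknown.
    if scope.get("client"):
        host, port = scope["client"]
        meta["REMOTE_ADDR"] = host
        meta["REMOTE_HOST"] = host
        meta["REMOTE_PORT"] = port
    # ASGI headers are pairs of byte strings; WSGI-style keys are text.
    for name, value in scope.get("headers", []):
        key = "HTTP_" + name.decode("latin1").upper().replace("-", "_")
        meta[key] = value.decode("latin1")
    return meta


if __name__ == "__main__":
    example_scope = {
        "type": "http",
        "client": ("203.0.113.7", 54321),
        "headers": [(b"user-agent", b"demo")],
    }
    print(build_meta(example_scope))
```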

  54. body_file = tempfile.SpooledTemporaryFile(max_size=..., mode='w+b')
    while True:
        message = await receive()
        if message['type'] == 'http.disconnect':
            # Early client disconnect.
            raise RequestAborted()
        # Add a body chunk from the message, if provided.
        if 'body' in message:
            body_file.write(message['body'])
        # Quit out if that's the end.
        if not message.get('more_body', False):
            break
    body_file.seek(0)
    return body_file

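The body-reading loop above can be exercised without a server by feeding it canned messages. In this sketch, `read_body` mirrors the slide's logic, while `RequestAborted` is redefined locally and `make_receive` is a hypothetical stand-in for the server's `receive` callable. The `max_size` default is an arbitrary choice for the demo.

```python
import asyncio
import tempfile


class RequestAborted(Exception):
    """Local stand-in for the exception raised on early disconnect."""


async def read_body(receive, max_size=1024):
    # Small bodies stay in memory; larger ones spill to a temporary file.
    body_file = tempfile.SpooledTemporaryFile(max_size=max_size, mode="w+b")
    while True:
        message = await receive()
        if message["type"] == "http.disconnect":
            body_file.close()
            raise RequestAborted()
        if "body" in message:
            body_file.write(message["body"])
        if not message.get("more_body", False):
            break
    body_file.seek(0)
    return body_file


def make_receive(messages):
    # Returns a fake `receive` that replays the given messages in order.
    queue = list(messages)

    async def receive():
        return queue.pop(0)

    return receive


def read_all(messages):
    with asyncio.run(read_body(make_receive(messages))) as body_file:
        return body_file.read()


def disconnect_is_reported():
    try:
        read_all([{"type": "http.disconnect"}])
    except RequestAborted:
        return True
    return False
```

For example, `read_all` given two `http.request` messages with bodies `b"hello "` (with `more_body` set) and `b"world"` returns the joined bytes.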

  55. if response.streaming:
        # Access `__iter__` and not `streaming_content` directly in case
        # it has been overridden in a subclass.
        for part in response:
            for chunk, _ in self.chunk_bytes(part):
                await send({
                    'type': 'http.response.body',
                    'body': chunk,
                    # Ignore "more" as there may be more parts; instead,
                    # use an empty final closing message with False.
                    'more_body': True,
                })
        # Final closing message.
        await send({'type': 'http.response.body'})

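The streaming pattern, where every chunk is sent with `more_body` set and the stream is closed by one empty final message, can be sketched in isolation. `send_streaming` and `collect_messages` are hypothetical helpers, and the fixed-size slicing is a simplification, not Django's `chunk_bytes`.

```python
import asyncio


async def send_streaming(parts, send, chunk_size=4):
    for part in parts:
        # Split each part into fixed-size chunks (simplified chunking).
        for start in range(0, len(part), chunk_size):
            await send({
                "type": "http.response.body",
                "body": part[start:start + chunk_size],
                # Always True: we cannot know yet whether more parts follow.
                "more_body": True,
            })
    # An empty message without more_body tells the server the body is done.
    await send({"type": "http.response.body"})


def collect_messages(parts):
    sent = []

    async def send(message):
        sent.append(message)

    asyncio.run(send_streaming(parts, send))
    return sent


if __name__ == "__main__":
    for message in collect_messages([b"abcdef", b"gh"]):
        print(message)
```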



  57. "Async calling sync is dangerous!"
    Me, earlier in this talk


  58. from asgiref.sync import sync_to_async
    result = await sync_to_async(callable)(arg1, name=arg2)


  59. Propagates exceptions nicely
    Really helps with debugging!
    Proxies threadlocals down correctly
    Because people really love threadlocals.
    Stickies sync code into one thread
    We'll get back to this. It's nasty.

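Two of the properties listed above, exception propagation and keeping sync code in one thread, can be imitated with the standard library alone. The sketch below is a toy stand-in and not how asgiref implements `sync_to_async`: `toy_sync_to_async` is a hypothetical name, and it pins all sync calls to a single worker thread by using an executor with one worker. It does not handle threadlocals.

```python
import asyncio
import functools
import threading
from concurrent.futures import ThreadPoolExecutor

# One worker means every wrapped sync call lands in the same thread.
_single_thread = ThreadPoolExecutor(max_workers=1)


def toy_sync_to_async(func):
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_running_loop()
        call = functools.partial(func, *args, **kwargs)
        # Exceptions raised in the thread are re-raised at this await.
        return await loop.run_in_executor(_single_thread, call)
    return wrapper


def thread_name():
    return threading.current_thread().name


def fails():
    raise ValueError("boom")


async def main():
    names = await asyncio.gather(
        *(toy_sync_to_async(thread_name)() for _ in range(5))
    )
    try:
        await toy_sync_to_async(fails)()
    except ValueError as exc:
        error = str(exc)
    return set(names), error


if __name__ == "__main__":
    print(asyncio.run(main()))
```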

  60. Result: Django 3.0 can speak ASGI
    But it can't do much else async... yet.


  61. Phase Two: Views



  63. WSGIHandler
    __call__
    WSGI Server
    WSGIRequest
    BaseHandler
    get_response
    URLs Middleware
    Async View
    __call__
    ASGIHandler
    __call__
    ASGI Server
    ASGIRequest
    Asynchronous
    Sync View
    __call__


  64. WSGIHandler
    __call__
    BaseHandler
    get_response
    URLs Middleware
    Async View
    __call__
    ASGIHandler
    __call__
    Sync View
    __call__
    TestClient
    get/post


  65. BaseHandler
    get_response
    Sync View
    __call__
    TestClient
    get/post
    Main Thread Event Loop Sub Thread
    asyncio.run ThreadPool


  66. SQLite hates this
    Try This One Weird Trick To Help Thread-Sensitive Libraries

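The thread sensitivity is easy to reproduce with the standard library's `sqlite3` module, which by default rejects use of a connection from any thread other than the one that created it. The helper names `query`, `query_in_worker_thread`, and `demo` are hypothetical. The sketch mirrors the situation where a sync view ends up in a thread pool while the connection was opened in the main thread.

```python
import asyncio
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def query(connection):
    return connection.execute("SELECT 1").fetchone()[0]


async def query_in_worker_thread(connection):
    # Naive "run sync code in a pool" approach: a different thread each time.
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as executor:
        return await loop.run_in_executor(executor, query, connection)


def demo():
    connection = sqlite3.connect(":memory:")  # Created in the calling thread.
    same_thread = query(connection)
    try:
        asyncio.run(query_in_worker_thread(connection))
        other_thread = "ok"
    except sqlite3.ProgrammingError:
        # sqlite3 checks the thread by default (check_same_thread=True).
        other_thread = "ProgrammingError"
    connection.close()
    return same_thread, other_thread


if __name__ == "__main__":
    print(demo())
```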

  67. result = async_to_sync(awaitable)(arg1, name=arg2)
    result = await sync_to_async(callable)(arg1, name=arg2)


  68. BaseHandler
    get_response
    Sync View
    __call__
    TestClient
    get/post
    Main Thread Event Loop Main Thread
    async_to_sync sync_to_async


  69. There's a whole talk in how this works!
    Also, it's not pretty or nice and it really shouldn't be necessary.


  70. MIDDLEWARE


  71. get_response
    Middleware 1
    Middleware 2
    View


  72. get_response
    Sync
    Middleware
    Async
    Middleware
    Async View
    async_to_sync
    sync_to_async

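The mixed chain on the slide above (sync middleware, then async middleware, then an async view) can be approximated with standard-library adapters. This is a toy sketch under stated assumptions, not Django's or asgiref's implementation: `to_async` and `to_sync` are hypothetical stand-ins for `sync_to_async` and `async_to_sync`, built on `run_in_executor` and `asyncio.run_coroutine_threadsafe`.

```python
import asyncio
import functools


def to_async(sync_func):
    # Run sync code in a worker thread so the event loop stays free.
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_running_loop()
        call = functools.partial(sync_func, *args, **kwargs)
        return await loop.run_in_executor(None, call)
    return wrapper


def to_sync(async_func, loop):
    # Called from a worker thread: hand the coroutine back to the loop
    # and block this thread until the result arrives.
    def wrapper(*args, **kwargs):
        future = asyncio.run_coroutine_threadsafe(async_func(*args, **kwargs), loop)
        return future.result()
    return wrapper


async def async_view(request):
    await asyncio.sleep(0)
    return f"response for {request}"


def async_middleware(get_response):
    async def middleware(request):
        return await get_response(request) + " [async middleware]"
    return middleware


def sync_middleware(get_response):
    def middleware(request):
        return get_response(request) + " [sync middleware]"
    return middleware


async def handle(request):
    loop = asyncio.get_running_loop()
    async_part = async_middleware(async_view)
    # The sync middleware needs a sync get_response, so adapt the async part.
    sync_part = sync_middleware(to_sync(async_part, loop))
    # The async handler needs something awaitable, so adapt the sync part.
    return await to_async(sync_part)(request)


if __name__ == "__main__":
    print(asyncio.run(handle("/home")))
```

Every boundary crossing adds a thread hop or a scheduling hop, which hints at why the tracebacks get long and why each switch has a cost.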

  73. Transactions
    Views are auto-wrapped in them with ATOMIC_REQUESTS
    Templates
    Direct calls from error handlers
    Tracebacks
    They're really long with all the switch functions


  74. Goal: Django 3.1 has async def views
    They already work on the branch right now!


  75. Phase Three: ORM


  76. API Design is crucial
    It must be familiar, yet safe.


  77. # Iteration is the one transparent thing
    for result in Model.objects.filter(name="Andrew"):
    >>> QuerySet.__iter__
    # This can work in the same codebase!
    async for result in Model.objects.filter(name="Andrew"):
    >>> QuerySet.__aiter__

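Iteration can be transparent because Python looks up different special methods for `for` and `async for`. The sketch below uses a made-up `ToyQuerySet` class rather than Django's `QuerySet`, with `asyncio.sleep(0)` standing in for an async database fetch.

```python
import asyncio


class ToyQuerySet:
    """Hypothetical container supporting both iteration protocols."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        # Used by a plain `for` loop in sync code.
        return iter(self._rows)

    async def __aiter__(self):
        # Used by `async for`; written as an async generator.
        for row in self._rows:
            await asyncio.sleep(0)  # Stand-in for awaiting the database.
            yield row


def collect_sync(queryset):
    return [row for row in queryset]


async def collect_async(queryset):
    return [row async for row in queryset]


if __name__ == "__main__":
    qs = ToyQuerySet(["Andrew", "Andrea"])
    print(collect_sync(qs))
    print(asyncio.run(collect_async(qs)))
```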

  78. # But some things will never work -
    # we'll need to force select_related
    result = instance.foreign_key.name


  79. QuerySet Query Compiler Connection



  82. In the meantime, async-safety
    You just try calling the ORM from async code in 3.0!


  83. async def random_code():
        result = Model.objects.get(pk=5)
    >>> SynchronousOnlyOperation("You cannot call this from an async context - use a thread or sync_to_async.")

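One way such a guard could work is to check whether an event loop is running in the current thread before executing sync-only code. The decorator below is a simplified sketch with hypothetical names (`async_unsafe_sketch`, `get_object`), and the exception class is redefined locally; it is not Django's actual implementation.

```python
import asyncio
import functools


class SynchronousOnlyOperation(Exception):
    """Local stand-in for the exception shown on the slide."""


def async_unsafe_sketch(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop is running in this thread: safe to proceed.
            return func(*args, **kwargs)
        raise SynchronousOnlyOperation(
            "You cannot call this from an async context - "
            "use a thread or sync_to_async."
        )
    return wrapper


@async_unsafe_sketch
def get_object(pk):
    return {"pk": pk}  # Stand-in for a blocking ORM query.


async def call_directly():
    try:
        get_object(5)
    except SynchronousOnlyOperation:
        return "blocked"
    return "allowed"


async def call_in_thread():
    # In a worker thread there is no running loop, so the call goes through.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, get_object, 5)


if __name__ == "__main__":
    print(get_object(5))
    print(asyncio.run(call_directly()))
    print(asyncio.run(call_in_thread()))
```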

  84. This needs a lot more research
    It's also not going to happen straight away.


  85. 4. LOOKING AHEAD


  86. Cache? Templates? Forms?
    Some will benefit from async, some will not


  87. Some things don't need to be async
    URL routing is just fine as it is.


  88. Async views are the cornerstone
    Once we get those working, all other paths open up


  89. Being careful about performance
    Things could easily slow down for synchronous applications


  90. Being careful about people
    We need to bring on new faces, and not burn out others


  91. Documentation
    Async needs to be clear, safe, and clearly optional


  92. Funding
    Async expertise is rare. We need to pay people for their knowledge.


  93. Organisation
    One of the largest changes in Django's history.


  94. aeracode.org/2018/02/19/python-async-simplified/
    A deeper dive into async vs. sync functions
    github.com/django/deps/blob/master/accepted/0009-async.rst
    DEP 0009, the proposal for async in Django
    code.djangoproject.com/wiki/AsyncProject
    Where to go to help


  95. Thanks.
    Andrew Godwin
    @andrewgodwin // aeracode.org
