Just Add Await: Retrofitting Async Into Django

A talk I gave at PyCon Australia 2019, and later gave in modified form at DjangoCon US 2019.

Andrew Godwin

August 02, 2019

Transcript

  1. JUST ADD AWAIT: RETROFITTING ASYNC INTO DJANGO • ANDREW GODWIN // @andrewgodwin
  2. Hi, I’m Andrew Godwin • Django contributor (Migrations/Channels) • Principal Engineer at • I see Python threads in my sleep now
  4. async def view(request):
         return TemplateResponse(
             request,
             "template.html",
             {"article": await api_call(pk=5)},
         )
  5. (1) Async In Brief: Threading, cooperation, and intrigue • (2) Big Framework Problems: Spanning two worlds with one vision • (3) Django In Depth: Handlers, Requests, Middleware & Views
  6. ASYNC IN BRIEF 1.

  7. Threads are preemptive: You don't know when it'll switch!

  9. Threads are preemptive: You don't know when it'll switch! • Coroutines are cooperative: They only yield at an await.
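
A minimal sketch of the cooperative half of this slide, using only the standard library. Two coroutines share one event loop, and control changes hands only at the `await`. The `worker` function and the `log` list are illustrative names, not from the talk.

```python
import asyncio

log = []


async def worker(name, steps):
    for step in range(steps):
        # Nothing else can run between these two appends,
        # because there is no await between them.
        log.append(f"{name} start {step}")
        log.append(f"{name} end {step}")
        # The only point where the event loop may switch to another task.
        await asyncio.sleep(0)


async def main():
    await asyncio.gather(worker("a", 2), worker("b", 2))


asyncio.run(main())
print(log)
```

Every "start" entry is immediately followed by its matching "end" entry. With preemptive threads there would be no such guarantee without a lock.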
  12. Coroutines need an event loop: It's where the program idles between tasks
  13. An event loop runs in a single thread: Yes, you can have threads and coroutines!
  14. [Diagram: Sync Thread, Sync Thread, Async Thread]

  15. Threads are slow! The more you add, the worse it gets.
  16. Async is fast... As long as you are I/O bound!
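
A rough illustration of the I/O-bound case, assuming `asyncio.sleep` as a stand-in for a network wait. The `fake_io` helper and the count of 50 are made up for the example. Fifty waits of 0.1 seconds overlap on one thread instead of adding up to five seconds.

```python
import asyncio
import time


async def fake_io(delay):
    # Stands in for a network call; the coroutine is idle while it waits.
    await asyncio.sleep(delay)
    return delay


async def main():
    started = time.perf_counter()
    results = await asyncio.gather(*(fake_io(0.1) for _ in range(50)))
    return results, time.perf_counter() - started


results, elapsed = asyncio.run(main())
print(f"{len(results)} waits finished in {elapsed:.2f}s")
```

The same trick does nothing for CPU-bound work: a coroutine that computes without awaiting keeps the single thread busy and everything else waits.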

  17. Async functions are different to sync: They are not cross-compatible!

  18. # Call sync from sync
      result = function()
      # Call async from async
      result = await function()
  19. # Call async from sync in Python 3.7
      result = asyncio.run(function())
      # Python 3.6
      loop = asyncio.new_event_loop()
      asyncio.set_event_loop(loop)
      result = loop.run_until_complete(function())
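
A runnable version of the Python 3.7 form, as a sketch. `fetch_answer` and `sync_entry_point` are placeholder names. The second half shows the main limitation: `asyncio.run` refuses to start when an event loop is already running in the same thread.

```python
import asyncio


async def fetch_answer():
    await asyncio.sleep(0.01)  # stands in for real async I/O
    return 42


def sync_entry_point():
    # Creates a fresh event loop, runs the coroutine to completion,
    # then closes the loop again.
    return asyncio.run(fetch_answer())


async def call_from_inside_loop():
    coro = fetch_answer()
    try:
        asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "never awaited" warning
        return "refused"
    return "allowed"


result = sync_entry_point()
nested = asyncio.run(call_from_inside_loop())
print(result, nested)
```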
  20. # Call sync from async
      executor = ThreadPoolExecutor(max_workers=3)
      loop = asyncio.get_event_loop()
      result = await loop.run_in_executor(
          executor,
          function,
          *args,
      )
  21. Async calling Sync is dangerous: It has to be in a separate thread or it'll block the event loop
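
A small experiment that makes the danger visible, assuming `time.sleep` as a stand-in for blocking sync I/O. A ticker coroutine records timestamps every 0.05 seconds. Calling the blocking function directly stalls the ticker, while running it through `run_in_executor` does not. All names here are illustrative.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call():
    # Stands in for sync I/O such as a database driver call.
    time.sleep(0.2)
    return "done"


async def ticker(ticks):
    # Records a timestamp whenever the event loop gets around to it.
    for _ in range(4):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def run_case(use_thread):
    ticks = []
    loop = asyncio.get_running_loop()
    tick_task = asyncio.ensure_future(ticker(ticks))
    await asyncio.sleep(0)  # let the ticker record its first tick
    if use_thread:
        with ThreadPoolExecutor(max_workers=1) as executor:
            await loop.run_in_executor(executor, blocking_call)
    else:
        blocking_call()  # blocks the whole event loop
    await tick_task
    # Largest gap between consecutive ticks.
    return max(b - a for a, b in zip(ticks, ticks[1:]))


blocked_gap = asyncio.run(run_case(use_thread=False))
threaded_gap = asyncio.run(run_case(use_thread=True))
print(f"direct call: {blocked_gap:.2f}s gap, in a thread: {threaded_gap:.2f}s gap")
```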
  22. [Diagram: Sync Thread, Async Thread]

  24. It's complicated: This is why I encourage writing sync code at first!
  25. BIG FRAMEWORK PROBLEMS 2.

  26. Backwards compatibility is crucial: Throw it away, and nobody will adopt your new thing
  27. A function cannot be both sync and async: You have to pick one. I've tried.
  28. # You can have this...
      result = cache.get("my-key")
      # Or this. Not both.
      result = await cache.get("my-key")
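
A sketch of why one name cannot serve both styles, using two hypothetical in-memory cache classes (neither is Django's cache API). Calling the async `get` without `await` does not fail; it hands back a coroutine object that has not run yet, which is not what sync callers expect.

```python
import asyncio
import inspect


class SyncCache:
    def __init__(self):
        self._data = {"my-key": "value"}

    def get(self, key):
        return self._data.get(key)


class AsyncCache:
    def __init__(self):
        self._data = {"my-key": "value"}

    async def get(self, key):
        await asyncio.sleep(0)  # stands in for async network I/O
        return self._data.get(key)


sync_result = SyncCache().get("my-key")

# Without await, the async version returns a coroutine, not the value.
pending = AsyncCache().get("my-key")
is_coroutine = inspect.iscoroutine(pending)

# Only running the coroutine on an event loop produces the value.
async_result = asyncio.run(pending)
print(sync_result, is_coroutine, async_result)
```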
  29. Totally different libraries: And there aren't even standards like DBAPI2

  30. # Even sleep is different!
      time.sleep(0.01)
      await asyncio.sleep(0.1)

  31. Lack of standards: It's just too early on for things to coalesce.
  32. async def application(scope, receive, send):
          await receive()
          ...
          await send({"type": "http.response", ...})
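
A fuller sketch of the same shape, filled in with the two HTTP response message types from the ASGI specification (`http.response.start` and `http.response.body`). The `drive` function is a hand-rolled stand-in for an ASGI server, written only so the example runs on its own.

```python
import asyncio


async def application(scope, receive, send):
    # Minimal ASGI HTTP application: read the request, send one response.
    assert scope["type"] == "http"
    await receive()  # the first http.request event
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, ASGI"})


async def drive():
    # Illustrative stand-in for a server: feeds one request, records replies.
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/"}
    await application(scope, receive, send)
    return sent


sent = asyncio.run(drive())
print(sent)
```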
  33. Different language features: No more attribute access

  34. # While this can have an async version...
      instance = await Model.objects.get()
      # There's no async way of doing this
      print(instance.foreign_key.name)
  35. Threads matter! Sync code all wants to run in the same thread still.
  36. Async has to add, not replace: Sync Django is still important • Things need to look familiar: We don't want a wildly different API feel • Things need to be safe: Deadlocking or blocking is easier than ever
  37. DJANGO IN DEPTH 3.

  38. [Diagram: Handler, ASGI / WSGI Server, Middleware, View, ORM, Template, URL Router, Forms]
  39. Outside-in approach: Async outside, sync inside

  40. [Diagram: Handler, ASGI / WSGI Server, Middleware, View, ORM, Template, URL Router, Forms, grouped into Phase One, Phase Two, Phase Three]
  41. Phase One: ASGI Support (Allowing Django to be async at all) • Phase Two: Async Views (Unlocking async use in normal apps) • Phase Three: The ORM (High-level async use for the most common case)
  42. Phase One: Django 3.0 (It's already committed!) • Phase Two: Django 3.1 (Unless things really do turn out nicely...) • Phase Three: Django 3.2/4.0 (There's a lot of work here.)
  43. Each phase brings concrete benefits, even if we stop!

  44. Phase One: ASGI

  45. Django predates WSGI: Which turns out to actually help, in the end
  47. “Just so you know, Django is a smug, arrogant framework that doesn’t play nice with others. [...] Or at least, that’s the impression you’d get from reading the rants...” James Bennett, "Django and NIH", 2006
  48. Custom request/response objects: Most other frameworks did this too • Custom "handler" classes: Abstracts away WSGI • Custom middleware: Wow, was this contentious at the time!
  49. [Diagram of the request path: WSGI Server, WSGIHandler __call__, WSGIRequest, BaseHandler get_response, URLs, Middleware, View __call__. Labelled responsibilities: HTTP protocol, Socket handling, Transfer encodings, Headers-to-META, Upload file wrapping, GET/POST parsing, Exception catching, Atomic view wrapper]
  50. [Diagram of the request path with a second entry point: WSGI Server, WSGIHandler __call__, WSGIRequest and ASGI Server, ASGIHandler __call__, ASGIRequest, both leading into BaseHandler get_response, URLs, Middleware, View __call__]
  51. [Same diagram, with the ASGI entry point labelled "Asynchronous"]
  52. ASGI is mostly WSGI-compatible: With better definitions of bytes versus unicode
  53. if self.scope.get('client'):
          self.META['REMOTE_ADDR'] = self.scope['client'][0]
          self.META['REMOTE_HOST'] = self.META['REMOTE_ADDR']
          self.META['REMOTE_PORT'] = self.scope['client'][1]
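
A standalone sketch of the same translation, extended to headers. This is a simplified illustration of the "Headers-to-META" idea, not Django's actual `ASGIRequest` code. The function name `scope_to_meta` and the sample scope are assumptions made for the example.

```python
def scope_to_meta(scope):
    """Build a WSGI-style META dict from an ASGI HTTP scope (simplified)."""
    meta = {"REQUEST_METHOD": scope["method"], "PATH_INFO": scope["path"]}
    if scope.get("client"):
        host, port = scope["client"]
        meta["REMOTE_ADDR"] = host
        meta["REMOTE_HOST"] = host
        meta["REMOTE_PORT"] = port
    # ASGI headers are (name, value) byte pairs with lowercased names.
    for name, value in scope.get("headers", []):
        key = name.decode("latin1").upper().replace("-", "_")
        # WSGI keeps these two headers without the HTTP_ prefix.
        if key not in ("CONTENT_LENGTH", "CONTENT_TYPE"):
            key = "HTTP_" + key
        meta[key] = value.decode("latin1")
    return meta


meta = scope_to_meta({
    "type": "http",
    "method": "GET",
    "path": "/articles/",
    "client": ("203.0.113.7", 51234),
    "headers": [(b"host", b"example.com"), (b"content-type", b"text/plain")],
})
print(meta)
```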
  54. body_file = tempfile.SpooledTemporaryFile(max_size=..., mode='w+b')
      while True:
          message = await receive()
          if message['type'] == 'http.disconnect':
              # Early client disconnect.
              raise RequestAborted()
          # Add a body chunk from the message, if provided.
          if 'body' in message:
              body_file.write(message['body'])
          # Quit out if that's the end.
          if not message.get('more_body', False):
              break
      body_file.seek(0)
      return body_file
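
The loop above can be exercised without a server. In this sketch `RequestAborted` is a local stand-in class, `max_size` is an assumed value (the slide elides it), and `make_receive` is a hypothetical helper that replays a fixed list of ASGI messages.

```python
import asyncio
import tempfile


class RequestAborted(Exception):
    """Local stand-in for the exception named on the slide."""


async def read_body(receive, max_size=1024 * 1024):
    # max_size is an assumption for this example.
    body_file = tempfile.SpooledTemporaryFile(max_size=max_size, mode="w+b")
    while True:
        message = await receive()
        if message["type"] == "http.disconnect":
            body_file.close()
            raise RequestAborted()
        if "body" in message:
            body_file.write(message["body"])
        if not message.get("more_body", False):
            break
    body_file.seek(0)
    return body_file


def make_receive(messages):
    # Replays canned messages in order, like a server delivering chunks.
    queue = list(messages)

    async def receive():
        return queue.pop(0)

    return receive


async def main():
    receive = make_receive([
        {"type": "http.request", "body": b"hello ", "more_body": True},
        {"type": "http.request", "body": b"world", "more_body": False},
    ])
    with await read_body(receive) as body_file:
        return body_file.read()


body = asyncio.run(main())
print(body)
```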
  55. if response.streaming:
          # Access `__iter__` and not `streaming_content` directly in case
          # it has been overridden in a subclass.
          for part in response:
              for chunk, _ in self.chunk_bytes(part):
                  await send({
                      'type': 'http.response.body',
                      'body': chunk,
                      # Ignore "more" as there may be more parts; instead,
                      # use an empty final closing message with False.
                      'more_body': True,
                  })
          # Final closing message.
          await send({'type': 'http.response.body'})
  57. “Async calling sync is dangerous!” Me, earlier in this talk
  58. from asgiref.sync import sync_to_async
      result = await sync_to_async(callable)(arg1, name=arg2)

  59. Propagates exceptions nicely: Really helps with debugging! • Proxies threadlocals down correctly: Because people really love threadlocals. • Stickies sync code into one thread: We'll get back to this. It's nasty.
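
To make two of these properties concrete without installing asgiref, here is a much-simplified stand-in built on the standard library. It is not asgiref's implementation and it does not proxy threadlocals. The name `to_async` and the single-worker executor are assumptions for the sketch. It shows exceptions re-raising at the `await`, and every sync call landing on the same worker thread.

```python
import asyncio
import functools
import threading
from concurrent.futures import ThreadPoolExecutor

# One worker means every wrapped call runs on the same thread.
_sync_executor = ThreadPoolExecutor(max_workers=1)


def to_async(func):
    """Simplified stand-in for sync_to_async (not asgiref's real code)."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_running_loop()
        call = functools.partial(func, *args, **kwargs)
        # An exception raised in the worker thread re-raises at this await.
        return await loop.run_in_executor(_sync_executor, call)
    return wrapper


def current_thread_name():
    return threading.current_thread().name


def fail():
    raise ValueError("raised in the worker thread")


async def main():
    names = await asyncio.gather(
        *(to_async(current_thread_name)() for _ in range(5))
    )
    try:
        await to_async(fail)()
    except ValueError as exc:
        error = str(exc)
    return set(names), error


thread_names, error = asyncio.run(main())
_sync_executor.shutdown()
print(thread_names, error)
```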
  60. Result: Django 3.0 can speak ASGI. But it can't do much else async... yet.
  61. Phase Two: Views

  63. [Diagram of the request path: WSGI Server, WSGIHandler __call__, WSGIRequest and ASGI Server, ASGIHandler __call__, ASGIRequest (Asynchronous), leading into BaseHandler get_response, URLs, Middleware, then either Async View __call__ or Sync View __call__]
  64. [Diagram: WSGIHandler __call__, ASGIHandler __call__, and TestClient get/post all leading into BaseHandler get_response, URLs, Middleware, then Async View __call__ or Sync View __call__]
  65. [Diagram: TestClient get/post, BaseHandler get_response, Sync View __call__. Labels: Main Thread, Event Loop, Sub Thread, asyncio.run, ThreadPool]
  66. SQLite hates this: Try This One Weird Trick To Help Thread-Sensitive Libraries
  67. result = async_to_sync(awaitable)(arg1, name=arg2)
      result = await sync_to_async(callable)(arg1, name=arg2)

  68. [Diagram: TestClient get/post, BaseHandler get_response, Sync View __call__. Labels: Main Thread, Event Loop, Main Thread, async_to_sync, sync_to_async]
  69. There's a whole talk in how this works! Also, it's not pretty or nice and it really shouldn't be necessary.
  70. MIDDLEWARE

  71. [Diagram: get_response, Middleware 1, Middleware 2, View]
  72. [Diagram: get_response, Sync Middleware, Async Middleware, Async View, connected by async_to_sync and sync_to_async]

  73. Transactions: Views are auto-wrapped in them with ATOMIC_REQUESTS • Templates: Direct calls from error handlers • Tracebacks: They're really long with all the switch functions
  74. Goal: Django 3.1 has async def views. They already work on the branch right now!
  75. Phase Three: ORM

  76. API Design is crucial: It must be familiar, yet safe.

  77. # Iteration is the one transparent thing
      for result in Model.objects.filter(name="Andrew"):  # calls QuerySet.__iter__
          ...
      # This can work in the same codebase!
      async for result in Model.objects.filter(name="Andrew"):  # calls QuerySet.__aiter__
          ...
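
A self-contained sketch of how one object can serve both loops. `Results` is a hypothetical container, not Django's `QuerySet`. Python looks up `__iter__` for a plain `for` and `__aiter__` for `async for`, so the two protocols do not collide.

```python
import asyncio


class Results:
    """Hypothetical container supporting both `for` and `async for`."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        # Used by a plain `for` loop.
        return iter(self._rows)

    def __aiter__(self):
        # Used by `async for`; an async generator is an async iterator.
        return self._agen()

    async def _agen(self):
        for row in self._rows:
            await asyncio.sleep(0)  # stands in for an async database fetch
            yield row


results = Results(["first", "second"])
sync_rows = [row for row in results]


async def collect():
    return [row async for row in results]


async_rows = asyncio.run(collect())
print(sync_rows, async_rows)
```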
  78. # But some things will never work -
      # we'll need to force select_related
      result = instance.foreign_key.name
  79. [Diagram: QuerySet, Query, Compiler, Connection]

  82. In the meantime, async-safety: You just try calling the ORM from async code in 3.0!
  83. async def random_code():
          result = Model.objects.get(pk=5)
      >>> SynchronousOnlyOperation("You cannot call this from an async context - use a thread or sync_to_async.")
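
One way such a guard could work, sketched with the standard library only. This is an illustration of the idea, not Django's implementation. `async_unsafe_guard` and `get_object` are hypothetical names and `SynchronousOnlyOperation` is redefined locally. The guard checks whether an event loop is running in the current thread and refuses to run the sync function if so.

```python
import asyncio
import functools


class SynchronousOnlyOperation(Exception):
    """Local stand-in named after the exception on the slide."""


def async_unsafe_guard(func):
    """Hypothetical guard: refuse to run under a running event loop."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No running loop in this thread: plain sync code, go ahead.
            return func(*args, **kwargs)
        raise SynchronousOnlyOperation(
            "You cannot call this from an async context - "
            "use a thread or sync_to_async."
        )
    return wrapper


@async_unsafe_guard
def get_object(pk):
    return {"pk": pk}  # stands in for a blocking ORM query


sync_result = get_object(5)


async def random_code():
    try:
        get_object(5)
    except SynchronousOnlyOperation as exc:
        return str(exc)


async_error = asyncio.run(random_code())
print(sync_result, async_error)
```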
  84. This needs a lot more research. It's also not going to happen straight away.
  85. LOOKING AHEAD 4.

  86. Cache? Templates? Forms? Some will benefit from async, some will not
  87. Some things don't need to be async: URL routing is just fine as it is.
  88. Async views are the cornerstone: Once we get those working, all other paths open up
  89. Being careful about performance: Things could easily slow down for synchronous applications
  90. Being careful about people: We need to bring on new faces, and not burn out others
  91. Documentation: Async needs to be clear, safe, and clearly optional
  92. Funding: Async expertise is rare. We need to pay people for their knowledge.
  93. Organisation: One of the largest changes in Django's history.

  94. A deeper dive into async vs. sync functions: aeracode.org/2018/02/19/python-async-simplified/
      DEP 0009, the proposal for async in Django: github.com/django/deps/blob/master/accepted/0009-async.rst
      Where to go to help: code.djangoproject.com/wiki/AsyncProject
  95. Thanks. Andrew Godwin @andrewgodwin // aeracode.org