
Django & Twisted (Django Under The Hood 2015)


Amber Brown (HawkOwl)

November 06, 2015


  1. Django & Twisted Django Under The Hood, 2015

  2. Hello, I’m Amber Brown (HawkOwl)

  3. I live in Perth, Western Australia

  4. I organise Django Girls events!

  5. omg it’s russ I organise Django Girls events!

  6. I serve on the Django Code of Conduct Committee.

  7. I’m a Twisted core developer …and release manager (get hype for 15.5!)

  8. (image by isometri.cc)

  10. Paraphrasing the DjangoCon AU 2014 Keynote

  11. I’m an invited speaker

  12. It’s expected that I have something of use to tell

  13. Talks are only worthwhile if they educate or entertain

  14. So I’m going to say this upfront, with no ambiguity:

  15. This talk’s conclusion is NOT “Django sucks”.

  16. This talk’s conclusion is NOT “using Twisted makes you a better programmer”.

  17. This talk’s conclusion is that the future of Python web development is working together.

  18. Any interpretation drawing a different conclusion is incorrect

  19. >>> Django == good
      True
      >>> Twisted == good
      True

  20. WARNING This talk is full of spiders

  21. No question is stupid

  22. Concepts

  23. For the purposes of this talk, synchronous code returns in-line

  24. def sync(): return 1

  25. …and asynchronous code calls another function with a result at some later time

  26. def async(func): func(1)

  27. However, this is also asynchronous

  28. def asyncyieldfrom():
          a = yield from somefunc()
          return a

  29. Contrary to how it looks, yield-from—using functions do not “return

  30. Python suspends at the yield point and can run other things; the syntax is purely syntactic sugar

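The suspension described on this slide can be observed with plain generators. The following is a minimal sketch, assuming a hypothetical `somefunc()` that suspends once before producing its value; the log messages and `run_demo()` are illustrative names, not part of the talk.

```python
def somefunc():
    # Hypothetical stand-in for an asynchronous operation: it suspends once,
    # handing control back to whoever is driving it, then produces 1.
    yield "waiting"
    return 1


def asyncyieldfrom(log):
    log.append("before yield from")
    a = yield from somefunc()  # execution is suspended inside this expression
    log.append("after yield from")
    return a


def run_demo():
    log = []
    gen = asyncyieldfrom(log)
    next(gen)  # runs up to the suspension point inside somefunc()
    log.append("other work runs while the generator is suspended")
    try:
        next(gen)  # resume; the generator now runs to completion
    except StopIteration as stop:
        # The generator's return value travels on the StopIteration exception.
        log.append("result: %r" % stop.value)
    return log
```

The line between the two `next()` calls is the point where an event loop would be free to run something else.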
  31. Blocking, for the purposes of this talk, means that Python cannot run anything else during that period because of I/O operations
  32. Short, CPU-bound tasks are not considered “blocking”

  33. Long CPU-bound, or “short”/long I/O bound operations are “blocking”

  34. “Short” I/O still takes a long time

  35. PING google.com ( 56 data bytes
      64 bytes: icmp_seq=0 ttl=60 time=13.217 ms
      64 bytes: icmp_seq=1 ttl=60 time=18.227 ms
      64 bytes: icmp_seq=2 ttl=60 time=13.117 ms

  36. 13ms in computer time is an eternity
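To put a rough number on "an eternity", here is a back-of-the-envelope calculation. The 3 GHz clock speed is an assumption chosen for illustration; it does not come from the talk.

```python
PING_SECONDS = 13.217e-3   # first round trip from the ping output on the slide
ASSUMED_CLOCK_HZ = 3e9     # assumption: a 3 GHz core, not a figure from the talk

# Clock cycles that pass while a blocked process waits for one round trip.
cycles_spent_waiting = PING_SECONDS * ASSUMED_CLOCK_HZ
```

Under that assumption, one round trip costs roughly 40 million clock cycles during which blocking code does nothing.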

  37. What is Twisted?

  38. Asynchronous networking framework

  39. At least a decade old

  40. Stable & Mature (thanks to a robust Compatibility Policy)

  41. Many protocol implementations (HTTP/1.0+1.1, SMTP, IMAP, DNS, SSH, many many

  42. Python 2.7/3.3+ (Python 3.3+ port is incomplete, 50%+ there)

  43. Time-based versioning: 15.0 == 1st release in ’15, 15.5 == 6th release in ’15

  44. How Twisted’s Reactor Works

  45. Sockets usually block until the data is sent

  46. Twisted configures the sockets to be non-blocking

  47. Given a non-blocking socket, socket.write() will write to the send buffer and return immediately

  48. If the send buffer is full, it raises EWOULDBLOCK
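The behaviour on the last few slides can be reproduced with the standard library alone. This is a minimal sketch, assuming a local `socket.socketpair()` in place of a real network connection; `fill_send_buffer()` is a hypothetical helper name. In Python the call is `send()` rather than `write()`, and EWOULDBLOCK surfaces as `BlockingIOError`.

```python
import socket


def fill_send_buffer():
    # A connected pair of local sockets; the reader never reads, so the
    # writer's send buffer eventually fills up.
    writer, reader = socket.socketpair()
    writer.setblocking(False)  # what a reactor does to every socket it manages
    chunk = b"x" * 65536
    sent = 0
    would_block = False
    try:
        for _ in range(100000):  # safety cap so the sketch always terminates
            # Returns immediately with the number of bytes actually accepted.
            sent += writer.send(chunk)
    except BlockingIOError:
        # EWOULDBLOCK/EAGAIN: the kernel buffer is full. A blocking socket
        # would have stalled here instead of raising.
        would_block = True
    finally:
        writer.close()
        reader.close()
    return sent, would_block
```

A framework has to keep any bytes that did not fit somewhere until the socket is writable again, which is the job of the secondary send buffer.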

  49. Further socket.write() calls are put in a secondary send buffer by Twisted

  50. This secondary send buffering is taken care of by the Twisted Protocol class (socket.write() is never called directly by user code)
  51. socket.read() is also called automatically by Protocol

  52. Twisted’s reactor then alerts Protocol when there is more data to be read, or more data can be written
  53. select, poll, epoll, kqueue

  54. Takes a list of file descriptors (e.g. sockets) and returns the ones that can have further data written/read

  55. If more data can be written, Protocol tries to empty its secondary send buffer

  56. If more data can be read, Protocol reads it and gives it to user code with the overridden dataReceived method

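The reactor steps above can be sketched with `select.select()` from the standard library. `ToyProtocol` and `run_once()` are hypothetical, heavily simplified stand-ins written for this illustration; they are not Twisted's classes and omit error handling, connection loss and timers.

```python
import select
import socket


class ToyProtocol:
    """Hypothetical, simplified stand-in for a Twisted Protocol."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.send_buffer = b""  # the "secondary send buffer"
        self.received = []

    def write(self, data):
        # User code never touches the socket; it only queues bytes.
        self.send_buffer += data

    def dataReceived(self, data):
        # User code would override this to consume incoming data.
        self.received.append(data)


def run_once(protocols, timeout=1.0):
    """One pass of a toy reactor loop."""
    by_socket = {p.sock: p for p in protocols}
    want_write = [p.sock for p in protocols if p.send_buffer]
    # Ask the OS which sockets can be read from or written to right now.
    readable, writable, _ = select.select(list(by_socket), want_write, [], timeout)
    for sock in writable:
        proto = by_socket[sock]
        sent = sock.send(proto.send_buffer)
        proto.send_buffer = proto.send_buffer[sent:]  # keep whatever did not fit
    for sock in readable:
        data = sock.recv(4096)
        if data:
            by_socket[sock].dataReceived(data)


def demo():
    left_sock, right_sock = socket.socketpair()
    left, right = ToyProtocol(left_sock), ToyProtocol(right_sock)
    left.write(b"hello")
    for _ in range(10):  # a real reactor loops until it is stopped
        run_once([left, right])
        if right.received:
            break
    left_sock.close()
    right_sock.close()
    return b"".join(right.received)
```

Swapping `select` for `poll`, `epoll` or `kqueue` changes how readiness is reported, not the shape of the loop.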
  57. That handles sending/receiving data, but we operate on a higher level
  58. Each Protocol implements something — WebSockets, SMTP, et al

  59. The Protocol is asynchronous, so the consumption of its data must also be asynchronous
  60. Deferreds

  61. <anger>

  62. “If you don’t understand Deferreds, you’re too stupid for Twisted”

  63. That belief has no place in any Twisted I’m a part of
  64. If you don’t “get” Deferreds, that is OUR failure.

  65. We need better documentation

  66. We need better examples

  67. We need to adopt syntactic changes that make it easier

  68. </anger>

  69. A Deferred is an object which holds a result that becomes available at some point in time

  70. Callbacks mean ‘when you have a result, call this function with the result’
  71. Deferreds have a “callback chain”, where the result is passed

  72. from twisted.internet.defer import Deferred

      d = Deferred()
      d.addCallback(lambda t: t + 1)
      d.addCallback(lambda t: print(t))
      d.callback(12)
  73. >>> d = Deferred()
      >>> d.addCallback(lambda t: t + 1)
      <Deferred at 0x100a03c50>
      >>> d.addCallback(lambda t: print(t))
      <Deferred at 0x100a03c50>
      >>> d.callback(12)
      13
  74. addCallback returns a Deferred, so you can chain it

  75. Deferred() \
          .addCallback(lambda t: t + 1) \
          .addCallback(lambda t: print(t)) \
          .callback(12)

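To make the callback chain less mysterious, here is a toy version of the idea. `ToyDeferred` is a hypothetical stand-in written for this illustration, not Twisted's `Deferred`: it has no errbacks and does not handle callbacks that return further Deferreds.

```python
class ToyDeferred:
    """Simplified stand-in for illustration; not Twisted's Deferred."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self.result = None

    def addCallback(self, func):
        if self._fired:
            # A result already exists, so run the callback straight away.
            self.result = func(self.result)
        else:
            self._callbacks.append(func)
        return self  # returning self is what allows chaining

    def callback(self, result):
        self._fired = True
        self.result = result
        # Each callback receives the previous callback's return value.
        while self._callbacks:
            self.result = self._callbacks.pop(0)(self.result)


seen = []
d = ToyDeferred()
d.addCallback(lambda t: t + 1).addCallback(seen.append)
d.callback(12)  # 12 -> 13, then 13 is appended to `seen`
```

The second callback receives 13 rather than 12 because the result is threaded through the chain one callback at a time.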
  76. Callbacks can be synchronous (although they should not block) or return more Deferreds
  77. Many things return Deferreds

  78. >>> import treq
      >>> treq.get("https://google.com")
      <Deferred at 0x10d6db5c0>

  79. import treq
      from twisted.internet.task import react

      def get(reactor):
          d = treq.get("http://atleastfornow.net")
          d.addCallback(treq.content)
          d.addCallback(lambda _: print(_))
          return d

      react(get)
  80. @inlineCallbacks

  81. inlineCallbacks makes Deferreds act like Futures/coroutines

  82. import treq
      from twisted.internet.task import react
      from twisted.internet.defer import inlineCallbacks

      @inlineCallbacks
      def get(reactor):
          request = yield treq.get("http://atleastfornow.net")
          content = yield treq.content(request)
          print(content)

      react(get)
  83. Supported in Twisted since generators were introduced

  84. Return a value with defer.returnValue()

  85. Works with regular Deferreds: a function wrapped with inlineCallbacks returns a Deferred automatically

  86. To wait for a Deferred to fire, use yield in the function

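How a decorator can turn `yield` into "wait for this Deferred" is easier to see in miniature. The sketch below uses hypothetical `ToyDeferred` and `toy_inline_callbacks` names; it is not Twisted's implementation, and it ignores errors entirely. It relies on Python 3.3+, where a generator can `return` a value directly instead of calling `defer.returnValue()`.

```python
import functools


class ToyDeferred:
    """Simplified stand-in for illustration; not Twisted's Deferred."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self.result = None

    def addCallback(self, func):
        if self._fired:
            self.result = func(self.result)
        else:
            self._callbacks.append(func)
        return self

    def callback(self, result):
        self._fired = True
        self.result = result
        while self._callbacks:
            self.result = self._callbacks.pop(0)(self.result)


def toy_inline_callbacks(gen_func):
    """Drive a generator, resuming it each time a yielded ToyDeferred fires."""

    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        finished = ToyDeferred()  # what the caller gets back immediately

        def step(value):
            try:
                waiting_on = gen.send(value)  # run until the next yield
            except StopIteration as stop:
                finished.callback(stop.value)  # the generator returned
            else:
                waiting_on.addCallback(step)  # resume when it fires
            return value  # leave the yielded object's result unchanged

        step(None)
        return finished

    return wrapper


pending = ToyDeferred()


@toy_inline_callbacks
def add_one_later():
    value = yield pending  # suspended here until `pending` fires
    return value + 1


results = []
add_one_later().addCallback(results.append)
results_before_firing = list(results)
pending.callback(12)
```

Nothing is appended to `results` until `pending` fires, at which point the generator resumes and its return value fires the outer object.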
  87. Making Django Asynchronous

  88. Django is synchronous at its core

  89. WSGI relies on what it calls being synchronous

  90. Django’s ORM does blocking I/O

  91. Making either of these asynchronous is complex

  92. Asynchrony can’t be bolted on

  93. Everything has to cooperate or everything falls apart

  94. “Common Sense”: async == hard, sync == easy

  95. In reality, each approach has tradeoffs

  96. Synchronous Upsides
      • Code flow is easier to understand: do x, then y
      • Only one “thread” of execution, for simplicity
      • Many libraries are synchronous

  97. Synchronous Downsides
      • You can only do one thing at once
      • Although suited to the request/response cycle, it can only really do that
      • Persistent connections are not simple to implement

  98. Asynchronous Upsides
      • Massively scalable network concurrency
      • Multiple “threads” of execution: the code handling the request doesn’t have to finish after the request is written
      • Handling persistent/evented connections is super easy
      • Reactor model async is threadless
      • Python 3 adds some syntactic sugar that makes it easier to write

  99. Asynchronous Downsides
      • “Callback hell” when using raw futures/deferreds
      • You have to be a good citizen: blocking in the reactor loop is disastrous for performance
      • Doing I/O is “harder” because you have to be explicit about it
      • Python 2 lacks a bunch of async syntactic sugar

  100. You can’t get the upsides of both

  101. But you can try!

  102. Threaded WSGI Runner
      • The standard Django deployment method: run lots of threads, so it doesn’t matter if it blocks
      • Each thread is blocking, so it can’t run multiple I/O operations at once
      • To handle many concurrent requests, you need many threads

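The threaded model can be sketched with a thread pool from the standard library. `blocking_view()` and `serve()` are hypothetical names, and `time.sleep()` stands in for blocking I/O such as a database query; this is an illustration of the idea, not how any particular WSGI server is implemented.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_view(request_id):
    # time.sleep stands in for blocking I/O; the thread can do nothing else
    # until it returns.
    time.sleep(0.05)
    return "response %d" % request_id


def serve(request_ids, threads):
    # Each worker thread handles one request at a time, so the number of
    # requests in flight is capped by the number of threads.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(blocking_view, request_ids))
```

With `threads=4`, four requests overlap their waits; a fifth has to queue until a thread frees up.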
  103. Hendrix
      • Hendrix is a “Twisted Django”
      • WSGI server using Twisted, plus WebSockets
      • Multiprocessing, multithreaded
      • https://github.com/hangarunderground/hendrix

  104. Crochet
      • Run Twisted code side-by-side with blocking code
      • Runs a Twisted reactor in another thread, rather than Twisted calling Django
      • https://github.com/itamarst/crochet

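The "event loop in another thread" arrangement can be illustrated with the standard library. This sketch uses an asyncio loop as a stand-in for the Twisted reactor and does not use Crochet's API; `start_background_loop()`, `fetch_answer()` and `blocking_caller()` are hypothetical names.

```python
import asyncio
import threading


def start_background_loop():
    # The event loop lives in its own thread, so blocking code elsewhere
    # never stalls it.
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


async def fetch_answer():
    await asyncio.sleep(0.01)  # stands in for asynchronous network I/O
    return 42


def blocking_caller():
    """Ordinary synchronous code, such as a Django view, calling async code."""
    loop, thread = start_background_loop()
    try:
        # Hand the coroutine to the loop's thread and block only this
        # thread until the result is ready.
        future = asyncio.run_coroutine_threadsafe(fetch_answer(), loop)
        return future.result(timeout=5)
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join(timeout=5)
        loop.close()
```

The synchronous caller blocks, but the loop itself keeps running and could be serving other work in the meantime.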
  105. The Future of Django (Django Channels)

  106. Brainchild of Andrew Godwin

  107. Django Channels makes Django event-driven

  108. Asynchronous server (Twisted) + Synchronous “workers”

  109. Requests and WebSocket events are now “events” sent through “channels”

  110. You write synchronous code which handles these events

  111. Channel events go on a queue, and are picked up by workers

  112. Workers can also put things on the queue (but can’t get the result)

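The queue-and-workers arrangement can be sketched with `queue.Queue` and threads. This is an illustration of the pattern only, not the Channels API; `run_workers()` and the upper-casing "work" are hypothetical.

```python
import queue
import threading


def run_workers(messages, worker_count=2):
    channel = queue.Queue()  # stands in for a channel
    handled = []
    lock = threading.Lock()

    def worker():
        while True:
            message = channel.get()
            if message is None:  # sentinel: no more work
                break
            with lock:
                # Synchronous handling code; nothing is sent back to
                # whoever put the message on the queue.
                handled.append(message.upper())

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for message in messages:
        channel.put(message)  # fire and forget: put() returns no result
    for _ in threads:
        channel.put(None)
    for thread in threads:
        thread.join()
    return sorted(handled)
```

The producer never learns what a worker did with a message, which mirrors the "can't get the result" limitation.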
  113. Channels Upsides
      • It allows you to use WebSockets!
      • If you don’t care about the response (e.g. a page view counter), it can be sent by a channel and run by a worker without blocking the current event
      • The workers don’t have to be on the same machine, allowing distribution

  114. Channels Downsides
      • You can’t get the results of events you create in your code
      • Your code can still only “do” one thing at a time
      • Your code is a few steps removed from the real WebSocket or HTTP connections, which makes it less flexible

  115. So, what does Channels look like?

  117. When an HTTP/WebSocket event comes in from a client, it sends a message to a channel

  118. You implement consumers for these channels

  119. You are given a channel to send the result of your consumer when it is called

  120. In the case of an HTTP request, you send back a “channel encoded” response object

  121. In the case of WebSockets, you send back content

  122. This content is then returned to the client

  123. WebSocket clients can be put into “Groups”

  124. You can then broadcast a message out to a Group
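The consumer, reply channel and Group ideas can be modelled in a few lines. Everything here is hypothetical and exists only for illustration: `ToyChannelLayer`, its methods and the message dictionary shape are not the Django Channels API.

```python
class ToyChannelLayer:
    """Hypothetical in-memory model of reply channels and Groups."""

    def __init__(self):
        self.delivered = {}  # reply channel name -> messages sent to that client
        self.groups = {}     # group name -> set of reply channel names

    def send(self, reply_channel, content):
        self.delivered.setdefault(reply_channel, []).append(content)

    def group_add(self, group, reply_channel):
        self.groups.setdefault(group, set()).add(reply_channel)

    def group_send(self, group, content):
        # Broadcasting is just sending to every reply channel in the Group.
        for reply_channel in sorted(self.groups.get(group, ())):
            self.send(reply_channel, content)


def websocket_consumer(layer, message):
    """A synchronous consumer: it is handed a message and a reply channel."""
    layer.group_add("chat", message["reply_channel"])
    layer.send(message["reply_channel"], "echo: " + message["content"])


layer = ToyChannelLayer()
websocket_consumer(layer, {"reply_channel": "client-a", "content": "hi"})
websocket_consumer(layer, {"reply_channel": "client-b", "content": "yo"})
layer.group_send("chat", "hello everyone")
```

Each client receives its own echo first and then the broadcast, because the consumer only ever writes to channels and never to a socket.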

  125. What makes it different?

  126. Channels doesn’t actually make your code asynchronous, it just adds async runners for your sync code
  127. It doesn’t tackle the “hard” problem of running Django asynchronously

  128. So it doesn’t get all the benefits as if it

  129. Maybe that’s enough?

  130. It’s a positive development for Django

  131. It supports Python 2.7 and Python 3.3+

  132. Check it out: http://git.io/vYEbp

  133. So, why not just use Twisted?

  134. Well…

  135. The Future of Django (alternate)

  136. WSGI II Electric Boogaloo

  137. WSGI is currently inherently request/response

  138. WebSockets is useful, and WSGI II would need to support

  139. WebSockets 2?

  140. HTTP falls out of use?

  141. Metal WSGear Solid 3 Snake Eater

  142. Async is undergoing another renaissance

  143. Django has to decide where it is going to sit

  144. Adopting an asynchronous framework is a long-term way forward

  145. It will require a lot of broken eggs, but Django can make the transition

  147. This is Django…

  148. This is Django… …with async views…

  149. This is Django… …with async views… …with an async ORM…

  150. This is Django… …with async views… …with an async ORM… …running on Twisted Web…

  151. This is Django… …with async views… …with an async ORM… …running on Twisted Web… …with no WSGI.

  152. Live Demo

  154. Caveats: I wrote this on a plane, the ORM runs in a threadpool, the tests fail hilariously

  155. But it’s serving concurrent web requests in pure Python

  156. async_create() which returns a Deferred, etc
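Running a blocking ORM call in a threadpool and handing back something awaitable can be sketched as follows. This uses asyncio from the standard library as a stand-in; the demo described in the talk returns a Deferred, for which Twisted's `deferToThread` plays the equivalent role. `blocking_create()` is a hypothetical placeholder for a real ORM call, and this `async_create()` is not the one from the demo.

```python
import asyncio
import time


def blocking_create(name):
    # Hypothetical placeholder for a blocking ORM call such as a database
    # INSERT; time.sleep stands in for the round trip.
    time.sleep(0.01)
    return {"id": 1, "name": name}


async def async_create(name):
    loop = asyncio.get_running_loop()
    # The blocking call runs in the default thread pool, so the event loop
    # stays free to serve other requests while it waits.
    return await loop.run_in_executor(None, blocking_create, name)


async def create_two():
    # Both creates are in flight at the same time.
    return await asyncio.gather(async_create("a"), async_create("b"))


def demo():
    return asyncio.run(create_two())
```

This keeps the event loop responsive, but each call still occupies a thread, which is why the slides go on to ask for truly asynchronous backends.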

  157. ORM needs more work

  158. The ORM does a lot of things that trigger cursor.execute() where you wouldn’t expect it
  159. The backends need to be truly asynchronous

  160. More separation between SQL generation, and executing that SQL

  161. Then we have all the requirements for an asynchronous Django!

  162. Django users have to be good async citizens

  163. Like I said, everything has to cooperate or it all

    falls apart
  164. yield from Python 3.4

  165. await, async iterators, async context managers PEP 492 in Python

  166. Django might be able to support async & sync views

  167. WSGI would work as it does now

  168. If using Twisted as your web server, you can use async views

  169. Django’s ORM and other features would then be usable by Twisted libraries

  170. Then Django doesn’t need to care about WebSockets, or whatever comes next
  171. “Django should have been a Twisted plugin.” (someone, unless I imagined that)
  172. The Future of Twisted

  173. Twisted isn’t perfect

  174. Contributor onboarding improvements

  175. Contributor tooling improvements

  176. Git migration

  177. Twisted’s future is new blood, and we need to work for that
  178. Adopting a Django-style Deprecation Policy (removing deprecated junk)

  179. Shedding the past (Python 2.6 support)

  180. Adopting Python 3 features (async def, yield from)

  181. Twisted + Django

  182. I would like to see this happen

  183. Like I said earlier, you cannot get the upsides of async and sync code at the same time
  184. But with asyncio, writing asynchronous code in Python is becoming

  185. Features like yield from and async def can be adopted by Twisted, even though they’re targeted at asyncio

  186. This removes some of the difficulty of writing async code (“callback hell”)
  187. Makes async code look sequential
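A short illustration of "looks sequential", using asyncio from the standard library (Python 3.7+ for `asyncio.run`). `fetch()` and `handler()` are hypothetical names and `asyncio.sleep()` stands in for network I/O.

```python
import asyncio


async def fetch(name, delay):
    await asyncio.sleep(delay)  # stands in for asynchronous network I/O
    return name + " done"


async def handler():
    # Reads top to bottom like blocking code, yet every `await` is a point
    # where the event loop may run something else.
    first = await fetch("first", 0.01)
    second = await fetch("second", 0.01)
    return [first, second]


def demo():
    return asyncio.run(handler())
```

Written with raw callbacks, the same flow would need one function per step, each registered on the result of the previous one.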

  188. Ugly hax: github.com/hawkowl/django

  189. Questions answered before you ask

  190. What about gevent?

  191. Glyph’s “Unyielding” https://goo.gl/lYDtct

  192. “Despite the fact that implicit coroutines masquerade under many different names, many of which don’t include the word “thread” – for example, “greenlets”, “coroutines”, “fibers”, “tasks” – green or lightweight threads are indeed threads … In the long run, when you build a system that relies upon them, you eventually have all the pitfalls and dangers of full-blown preemptive threads.” (Glyph)
  193. What would an async Django get me?

  194. • WebSockets
      • More I/O efficiency
      • You don’t need a task manager to run things after a response
  195. Why do you wear a red trenchcoat?

  197. Questions!