Django & Twisted (Django Under The Hood 2015)

Amber Brown (HawkOwl)

November 06, 2015

Transcript

  1. Django & Twisted
    Django Under The Hood, 2015

  2. Hello, I’m
    Amber Brown
    (HawkOwl)

  3. I live in Perth, Western Australia

  4. I organise Django Girls events!

  5. omg it’s russ
    I organise Django Girls events!

  6. I serve on the Django Code of Conduct
    Committee.

  7. I’m a Twisted core
    developer
    …and release manager
    (get hype for 15.5!)

  8. (image by isometri.cc)

  9. (no text on this slide)

  10. Paraphrasing the
    DjangoCon AU 2014 Keynote

  11. I’m an invited speaker

  12. It’s expected that I
    have something of use
    to tell you

  13. Talks are only worthwhile
    if they educate or entertain

  14. So I’m going to say this
    upfront, with no ambiguity:

  15. This talk’s conclusion is NOT
    “Django sucks”.

  16. This talk’s conclusion is
    NOT “using Twisted makes
    you a better programmer”.

  17. This talk’s conclusion is
    that the future of Python
    web development is
    working together.

  18. Any interpretation
    drawing a different
    conclusion is incorrect

  19. >>> Django == good
    True
    >>> Twisted == good
    True

  20. WARNING
    This talk is full of spiders

  21. No question is
    stupid

  22. Concepts

  23. For the purposes of this
    talk, synchronous code
    returns in-line

  24. def sync():
          return 1

  25. …and asynchronous code
    calls another function with
    a result at some later time

  26. def async_(func):  # renamed: "async" is a reserved word since Python 3.7
          func(1)

  27. However, this is also
    asynchronous

  28. def asyncyieldfrom():
          a = yield from somefunc()
          return a

  29. Contrary to how it looks,
    functions using yield from
    do not “return immediately”

  30. Python suspends at the
    yield point, and can run
    other things — purely
    syntactic sugar
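
As a rough illustration of "suspends at the yield point, and can run other things", here is a minimal sketch using plain generators. The names worker and run_all are made up for this example and do not come from the talk; run_all is a tiny round-robin loop standing in for a real event loop.

```python
from collections import deque

def worker(name, steps, log):
    """A generator that pauses at every yield so other work can run."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        yield  # suspension point: control goes back to run_all()

def run_all(generators):
    """Hypothetical round-robin loop: resume each generator in turn."""
    pending = deque(generators)
    while pending:
        gen = pending.popleft()
        try:
            next(gen)            # run until the next yield
        except StopIteration:
            continue             # this generator has finished
        pending.append(gen)      # not finished, queue it again

log = []
run_all([worker("a", 2, log), worker("b", 2, log)])
print(log)
```

The two workers interleave even though neither one ever returns early: each simply stops at its yield until it is resumed.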

  31. Blocking, for the purposes of
    this talk, means that Python
    cannot run absolutely
    anything else during that
    period due to I/O operations

  32. Short, CPU-bound
    tasks are not
    considered “blocking”

  33. Long CPU-bound, or
    “short”/long I/O-bound
    operations are
    “blocking”

  34. “Short” I/O still takes a
    long time

  35. PING google.com (150.101.170.180): 56 data bytes
    64 bytes: icmp_seq=0 ttl=60 time=13.217 ms
    64 bytes: icmp_seq=1 ttl=60 time=18.227 ms
    64 bytes: icmp_seq=2 ttl=60 time=13.117 ms

  36. 13ms in computer time
    is an eternity
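
To put a number on "an eternity", this small calculation converts the first ping time above into clock cycles. The 3 GHz clock speed is an assumption chosen for illustration; it is not a figure from the talk.

```python
CLOCK_HZ = 3_000_000_000          # assumed 3 GHz core, not a figure from the talk
ROUND_TRIP_SECONDS = 13.217e-3    # first ping time shown above

# Cycles the CPU could have spent on other work while waiting for one reply.
cycles_spent_waiting = round(CLOCK_HZ * ROUND_TRIP_SECONDS)
print(f"{cycles_spent_waiting:,} clock cycles pass during one round trip")
```

Under that assumption, a single round trip costs tens of millions of cycles that a blocked process cannot use.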

  37. What is Twisted?

  38. Asynchronous
    networking framework

  39. At least a decade old

  40. Stable & Mature
    (thanks to a robust
    Compatibility Policy)

  41. Many protocol
    implementations
    (HTTP/1.0+1.1, SMTP, IMAP,
    DNS, SSH, many many more)

  42. Python 2.7/3.3+
    (Python 3.3+ port is
    incomplete, 50%+ there)

  43. Time-based versioning
    15.0 == 1st release in ’15
    15.5 == 6th release in ’15

  44. How Twisted’s Reactor
    Works

  45. Sockets usually block
    until the data is sent

  46. Twisted configures the
    sockets to be non-
    blocking

  47. Given a non-blocking socket named
    socket, socket.write()
    will write to the send buffer
    and return immediately

  48. If the send buffer is full,
    it raises EWOULDBLOCK
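
The behaviour described on the last few slides can be observed with the standard library alone. This is a minimal sketch, assuming a local socket pair whose far end never reads; note that Python's socket objects spell the call send() rather than write(), and report EWOULDBLOCK (or its alias EAGAIN) as BlockingIOError. The helper name fill_send_buffer is made up for this example.

```python
import errno
import socket

def fill_send_buffer():
    """Write to a non-blocking socket until the kernel refuses more data."""
    left, right = socket.socketpair()   # connected pair; nobody reads from `right`
    left.setblocking(False)
    sent_total = 0
    try:
        while True:
            sent_total += left.send(b"x" * 4096)   # returns immediately
    except BlockingIOError as error:
        # The send buffer is full: Python raises instead of waiting.
        return sent_total, error.errno
    finally:
        left.close()
        right.close()

sent_total, code = fill_send_buffer()
print(sent_total, errno.errorcode[code])
```

How many bytes fit before the error depends on the operating system's buffer sizes, so the total will vary between machines.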

  49. Further socket.write()
    calls are put in a secondary
    send buffer by Twisted

  50. This secondary send
    buffering is taken care of by
    the Twisted Protocol class
    (socket.write() is never
    called directly by user code)

  51. socket.read() is also
    called automatically by
    Protocol

  52. Twisted’s reactor then
    alerts Protocol when there
    is more data to be read, or
    more data can be written

  53. select, poll, epoll,
    kqueue

  54. Takes a list of file descriptors
    (e.g. sockets) and returns the
    ones that can have further
    data written/read

  55. If more data can be written,
    Protocol tries to empty its
    secondary send buffer

  56. If more data can be read,
    Protocol reads it and
    gives it to user code with
    the overridden
    dataReceived method
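
The reactor steps above can be sketched with the standard library's selectors module, which wraps select, poll, epoll and kqueue. This is a toy model under stated assumptions: ToyProtocol and run_once are hypothetical names for this example, not Twisted's classes, and error handling is left out.

```python
import selectors
import socket

class ToyProtocol:
    """Hypothetical stand-in for the buffering described above (not Twisted's class)."""

    def __init__(self, sock, selector):
        self.sock = sock
        self.selector = selector
        self.outgoing = b""          # the "secondary send buffer"
        self.received = []
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ, self)

    def write(self, data):
        """User code calls this; the real send happens once the socket is writable."""
        self.outgoing += data
        self.selector.modify(
            self.sock, selectors.EVENT_READ | selectors.EVENT_WRITE, self)

    def dataReceived(self, data):
        """Override point for user code."""
        self.received.append(data)

    def handle(self, events):
        if events & selectors.EVENT_READ:
            self.dataReceived(self.sock.recv(4096))
        if events & selectors.EVENT_WRITE and self.outgoing:
            sent = self.sock.send(self.outgoing)
            self.outgoing = self.outgoing[sent:]
            if not self.outgoing:    # buffer drained, stop asking about writes
                self.selector.modify(self.sock, selectors.EVENT_READ, self)

def run_once(selector, timeout=1.0):
    """One turn of a toy reactor loop: ask which sockets are ready, then dispatch."""
    for key, events in selector.select(timeout):
        key.data.handle(events)

selector = selectors.DefaultSelector()
a, b = socket.socketpair()
sender, receiver = ToyProtocol(a, selector), ToyProtocol(b, selector)
sender.write(b"hello")
run_once(selector)   # `a` is writable, so the buffered bytes are sent
run_once(selector)   # `b` is now readable, so dataReceived fires
print(receiver.received)
```

User code only ever calls write() and overrides dataReceived(); the actual socket calls happen inside the loop, when the selector says they will not block.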

  57. That handles sending/
    receiving data, but we
    operate on a higher
    level

  58. Each Protocol
    implements something
    — WebSockets, SMTP, et
    al

  59. The Protocol is
    asynchronous, so the
    consumption of its data
    must also be asynchronous

  60. Deferreds

  61. (no text on this slide)

  62. “If you don’t understand
    Deferreds, you’re too
    stupid for Twisted”

  63. That belief has no place
    in any Twisted I’m a
    part of

  64. If you don’t “get”
    Deferreds, that is OUR
    failure.

  65. We need better
    documentation

  66. We need better
    examples

  67. We need to adopt
    syntactic changes that
    make it easier

  68. (no text on this slide)

  69. A Deferred is an object
    which will hold a result at
    some point in time

  70. Callbacks mean ‘when you
    have a result, call this
    function with the result’

  71. Deferreds have a “callback
    chain”, where the result is
    passed through

  72. d = Deferred()
    d.addCallback(lambda t: t + 1)
    d.addCallback(lambda t: print(t))
    d.callback(12)

  73. >>> d = Deferred()
    >>> d.addCallback(lambda t: t + 1)

    >>> d.addCallback(lambda t: print(t))

    >>> d.callback(12)
    13

  74. addCallback returns a
    Deferred, so you can
    chain it

  75. Deferred() \
    .addCallback(lambda t: t + 1) \
    .addCallback(lambda t: print(t)) \
    .callback(12)

  76. Callbacks can be synchronous
    (although they should not block)
    or return more Deferreds
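
To make the callback chain less magical, here is a toy model of the idea, including the case where a callback returns another deferred. ToyDeferred is a hypothetical class written for this example; it is not Twisted's implementation and leaves out errbacks, cancellation and everything else the real class handles.

```python
class ToyDeferred:
    """Toy model of a callback chain. Illustrative only, not Twisted's Deferred."""

    _NO_RESULT = object()

    def __init__(self):
        self._callbacks = []
        self._result = self._NO_RESULT
        self._paused = False

    def addCallback(self, func):
        self._callbacks.append(func)
        self._run()
        return self              # returning self is what allows chaining

    def callback(self, result):
        self._result = result
        self._run()

    def _resume(self, result):
        # Called when an inner ToyDeferred, returned by a callback, fires.
        self._result = result
        self._paused = False
        self._run()
        return result

    def _run(self):
        if self._result is self._NO_RESULT or self._paused:
            return
        while self._callbacks:
            func = self._callbacks.pop(0)
            self._result = func(self._result)
            if isinstance(self._result, ToyDeferred):
                # Pause this chain until the inner one has a result.
                self._paused = True
                self._result.addCallback(self._resume)
                return

seen = []
inner = ToyDeferred()
outer = ToyDeferred()
outer.addCallback(lambda t: t + 1)      # synchronous callback
outer.addCallback(lambda t: inner)      # returns another ToyDeferred
outer.addCallback(seen.append)          # waits for inner's result
outer.callback(12)
before = list(seen)                     # nothing yet: the chain is paused
inner.callback("from inner")
print(before, seen)
```

The last callback does not run when outer fires; it runs only once inner has a result, and it receives that result.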

  77. Many things return
    Deferreds

  78. >>> import treq
    >>> treq.get("https://google.com")

  79. import treq
      from twisted.internet.task import react
      def get(reactor):
          d = treq.get("http://atleastfornow.net")
          d.addCallback(treq.content)
          d.addCallback(lambda _: print(_))
          return d
      react(get)

  80. @inlineCallbacks

  81. inlineCallbacks
    makes Deferreds act
    like Futures/coroutines

  82. import treq
      from twisted.internet.task import react
      from twisted.internet.defer import inlineCallbacks
      @inlineCallbacks
      def get(reactor):
          request = yield treq.get("http://atleastfornow.net")
          content = yield treq.content(request)
          print(content)
      react(get)

  83. Supported in Twisted
    since generators were
    introduced

  84. Return a value with
    defer.returnValue()

  85. Works with regular Deferreds
    — a function wrapped with
    inlineCallbacks returns a
    Deferred automatically

  86. To wait for a Deferred
    to fire, use yield in
    the function
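
The trick behind a decorator like this can be sketched without Twisted. The example below is an assumption-laden miniature: the decorator name inline is made up, it drives a generator using the standard library's concurrent.futures.Future in place of a Deferred, and it ignores errors entirely. On Python 3 a plain return inside the generator carries the value out, which is the job defer.returnValue() does on Python 2.

```python
import functools
from concurrent.futures import Future

def inline(gen_func):
    """Hypothetical miniature of the idea: drive a generator with futures."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        outcome = Future()           # what the caller gets back immediately

        def step(value=None):
            try:
                waiting_on = gen.send(value)     # run until the next yield
            except StopIteration as stop:
                outcome.set_result(stop.value)   # the generator returned
                return
            # Resume the generator when the yielded future has a result.
            waiting_on.add_done_callback(lambda f: step(f.result()))

        step()
        return outcome
    return wrapper

@inline
def add_one(source):
    number = yield source            # reads like blocking code, but is not
    return number + 1

source = Future()
result = add_one(source)
pending_before = not result.done()   # nothing has fired yet
source.set_result(12)
print(pending_before, result.result())
```

The decorated function hands back a future straight away, and the body only continues past each yield once the thing it is waiting on has fired.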

  87. Making Django
    Asynchronous

  88. Django is synchronous
    at its core

  89. WSGI relies on what it
    calls being synchronous

  90. Django’s ORM does
    blocking I/O

  91. Making either of these
    asynchronous is complex

  92. Asynchronousness
    can’t be bolted on

  93. Everything has to
    cooperate or
    everything falls apart

  94. “Common Sense”
    async == hard
    sync == easy

  95. In reality, each
    approach has tradeoffs

  96. Synchronous Upsides
    • Code flow is easier to understand — do x, then y
    • Only one “thread” of execution, for simplicity
    • Many libraries are synchronous

  97. Synchronous Downsides
    • You can only do one thing at once
    • Although suited to the request/response cycle, it
    can only really do that
    • Persistent connections are not simple to
    implement

  98. Asynchronous Upsides
    • Massively scalable network concurrency
    • Multiple “threads” of execution — the code handling
    the request doesn’t have to finish after the request is
    written
    • Handling persistent/evented connections is super easy
    • Reactor model async is threadless
    • Python 3 adds some syntactic sugar that makes it
    easier to write

  99. Asynchronous Downsides
    • “Callback hell” when using raw futures/deferreds
    • You have to be a good citizen — blocking in the
    reactor loop is disastrous for performance
    • Doing I/O is “harder” because you have to be
    explicit about it
    • Python 2 lacks a bunch of async syntactic sugar
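
The "good citizen" point can be shown with a small timing experiment. This sketch uses the standard library's asyncio event loop rather than Twisted's reactor, on the assumption that the principle carries over; the delay value and function names are arbitrary choices for the example.

```python
import asyncio
import time

DELAY = 0.05  # seconds; arbitrary value for the demonstration

async def bad_citizen():
    time.sleep(DELAY)            # blocks the whole event loop

async def good_citizen():
    await asyncio.sleep(DELAY)   # yields to the loop while waiting

async def time_pair(task_func):
    """Run two copies concurrently and return the wall-clock time taken."""
    start = time.monotonic()
    await asyncio.gather(task_func(), task_func())
    return time.monotonic() - start

blocking_elapsed = asyncio.run(time_pair(bad_citizen))
cooperative_elapsed = asyncio.run(time_pair(good_citizen))
print(f"blocking: {blocking_elapsed:.3f}s, cooperative: {cooperative_elapsed:.3f}s")
```

The two blocking tasks run one after the other, so their waits add up; the two cooperative tasks wait at the same time.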

  100. You can’t get the
    upsides of both

  101. But you can try!

  102. Threaded WSGI Runner
    • The standard Django deployment method — run
    lots of threads, so it doesn’t matter if it blocks
    • Each thread is blocking, so it can’t run multiple
    I/O operations at once
    • To handle many concurrent requests, you need
    many threads
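
A quick way to see why blocking handlers need many threads is to time a pool of them. This is a sketch only: handle_request and serve are made-up names, time.sleep stands in for blocking I/O, and a thread pool stands in for a real WSGI server.

```python
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05  # pretend each request spends this long blocked on I/O

def handle_request(request_id):
    time.sleep(DELAY)   # stands in for a blocking database or network call
    return f"response {request_id}"

def serve(requests, threads):
    """Handle all requests on a pool of `threads` workers; return (responses, seconds)."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        responses = list(pool.map(handle_request, requests))
    return responses, time.monotonic() - start

responses_one, seconds_one = serve(range(4), threads=1)
responses_four, seconds_four = serve(range(4), threads=4)
print(f"1 thread: {seconds_one:.2f}s, 4 threads: {seconds_four:.2f}s")
```

With one thread the four waits happen back to back; with four threads they overlap, at the cost of one thread per in-flight request.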

  103. Hendrix
    • Hendrix is a “Twisted Django”
    • WSGI server using Twisted, plus WebSockets
    • Multiprocessing, multithreaded
    • https://github.com/hangarunderground/hendrix

  104. Crochet
    • Run Twisted code side-by-side with blocking
    code
    • Runs a Twisted reactor in another thread, rather
    than Twisted calling Django
    • https://github.com/itamarst/crochet
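
The "event loop in another thread" arrangement can be sketched with the standard library. To be clear about the assumptions: this uses asyncio in place of the Twisted reactor and does not show Crochet's own API; fetch_answer and blocking_view are hypothetical names.

```python
import asyncio
import threading

# Start an event loop in a background thread, roughly the role Crochet gives
# the Twisted reactor. This uses asyncio only; it is not Crochet's API.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def fetch_answer():
    await asyncio.sleep(0.01)   # stands in for asynchronous network I/O
    return 42

def blocking_view():
    """Ordinary synchronous code that hands work to the loop and waits for it."""
    future = asyncio.run_coroutine_threadsafe(fetch_answer(), loop)
    return future.result(timeout=5)

print(blocking_view())
```

The synchronous caller still blocks while it waits, but the asynchronous code runs on its own loop and never has to know who called it.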

  105. The Future of Django
    (Django Channels)

  106. Brainchild of
    Andrew Godwin

  107. Django Channels makes
    Django event-driven

  108. Asynchronous server
    (Twisted)
    +
    Synchronous “workers”

  109. Requests and WebSocket
    events are now “events”
    sent through “channels”

  110. You write synchronous
    code which handles
    these events

  111. Channel events go on a
    queue, and are picked
    up by workers

  112. Workers can also put
    things on the queue
    (but can’t get the result)
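
The queue-and-worker shape can be modelled in a few lines with the standard library. This is a sketch of the pattern, not of Channels itself: the queue, the event dictionary and the sentinel used to stop the worker are all assumptions made for the example.

```python
import queue
import threading

channel = queue.Queue()        # stands in for the channel backend
page_views = []                # side effect produced by the worker

def worker():
    """Synchronous worker: take the next event off the queue and handle it."""
    while True:
        event = channel.get()
        if event is None:      # sentinel used here to stop the worker
            break
        page_views.append(event["path"])
        channel.task_done()

thread = threading.Thread(target=worker)
thread.start()

# The sender puts an event on the queue and carries on; nothing comes back.
channel.put({"path": "/about/"})
channel.join()                 # only so the example can show the outcome
channel.put(None)
thread.join()
print(page_views)
```

put() returns nothing useful to the sender, which is the "can't get the result" limitation in miniature.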

  113. Channels Upsides
    • It allows you to use WebSockets!
    • If you don’t care about the response (e.g. a page
    view counter), it can be sent by a channel and
    run by a worker without blocking the current
    event
    • The workers don’t have to be on the same
    machine, allowing distribution

  114. Channels Downsides
    • You can’t get the results of events you create in
    your code
    • Your code can still only “do” one thing at a time
    • Your code is a few steps removed from the real
    WebSocket or HTTP connections, which makes
    it less flexible

  115. So, what does
    Channels look like?

  116. (no text on this slide)

  117. When an HTTP/WebSocket
    event comes in from a client, it
    sends a message to a channel

  118. You implement consumers
    for these channels

  119. When your consumer is called,
    you are given a channel on
    which to send its result

  120. In the case of an HTTP request,
    you send back a “channel
    encoded” response object

  121. In the case of WebSockets,
    you send back content

  122. This content is then
    returned to the client

  123. WebSocket clients can
    be put into “Groups”

  124. You can then broadcast a
    message out to a Group
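
The flow on the last few slides (events arrive with a reply channel, consumers handle them, groups fan messages out) can be modelled in memory. Everything here is hypothetical: ToyChannelLayer, the consumer names, the "chat" group and the message keys are invented for illustration and are not the Channels API.

```python
from collections import defaultdict

class ToyChannelLayer:
    """Hypothetical in-memory model of channels and groups (not the Channels API)."""

    def __init__(self):
        self.inboxes = defaultdict(list)   # channel name -> delivered messages
        self.groups = defaultdict(set)     # group name -> member channel names

    def send(self, channel, message):
        self.inboxes[channel].append(message)

    def group_add(self, group, channel):
        self.groups[group].add(channel)

    def group_send(self, group, message):
        for channel in sorted(self.groups[group]):
            self.send(channel, message)

def on_connect(layer, event):
    """Consumer for a new WebSocket: remember its reply channel in a group."""
    layer.group_add("chat", event["reply_channel"])

def on_message(layer, event):
    """Consumer for an incoming frame: broadcast the content to the group."""
    layer.group_send("chat", {"content": event["content"]})

layer = ToyChannelLayer()
on_connect(layer, {"reply_channel": "client-1"})
on_connect(layer, {"reply_channel": "client-2"})
on_message(layer, {"reply_channel": "client-1", "content": "hi"})
print(dict(layer.inboxes))
```

Both consumers are ordinary synchronous functions; they never touch a socket, only named channels.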

  125. What makes it
    different?

  126. Channels doesn’t actually
    make your code asynchronous,
    it just adds async runners for
    your sync code

  127. It doesn’t tackle the
    “hard” problem of running
    Django asynchronously

  128. So it doesn’t get all the
    benefits it would if it did

  129. Maybe that’s enough?

  130. It’s a positive
    development for Django

  131. It supports Python 2.7
    and Python 3.3+

  132. Check it out:
    http://git.io/vYEbp

  133. So, why not just use
    Twisted?

  134. Well…

  135. The Future of Django
    (alternate)

  136. WSGI II
    Electric Boogaloo

  137. WSGI is currently
    inherently request/
    response

  138. WebSockets are useful,
    and WSGI II would need
    to support it

  139. WebSockets 2?

  140. HTTP falls out of use?

  141. Metal WSGear Solid 3
    Snake Eater

  142. Async is undergoing
    another renaissance

  143. Django has to decide
    where it is going to sit

  144. Adopting an asynchronous
    framework is a long-term
    way forward

  145. It will require a lot of
    broken eggs, but Django
    can make the transition

  146. (no text on this slide)

  147. This is Django…

  148. This is Django…
    …with async views…

  149. This is Django…
    …with async views…
    …with an async ORM…

  150. This is Django…
    …with async views…
    …with an async ORM…
    …running on Twisted Web…

  151. This is Django…
    …with async views…
    …with an async ORM…
    …running on Twisted Web…
    …with no WSGI.

  152. Live Demo

  153. (no text on this slide)

  154. Caveats: I wrote this on
    a plane, the ORM runs
    in a threadpool, the
    tests fail hilariously

  155. But it’s serving
    concurrent web
    requests in pure
    Python

  156. async_create()
    which returns a
    Deferred, etc
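
The "ORM runs in a threadpool" approach can be sketched with asyncio standing in for Twisted. The assumptions are significant: blocking_create is a stand-in for a synchronous ORM call, this async_create is a hypothetical wrapper rather than the code from the demo, and it returns an awaitable rather than a Deferred. In Twisted, twisted.internet.threads.deferToThread plays the role run_in_executor plays here.

```python
import asyncio
import time

def blocking_create(name):
    """Stands in for a synchronous ORM call such as a model's create()."""
    time.sleep(0.01)                  # pretend this is the database round trip
    return {"name": name, "saved": True}

async def async_create(name):
    """Hypothetical wrapper: run the blocking call in the default thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_create, name)

async def main():
    # Both creates are in flight at once; the event loop itself never blocks.
    return await asyncio.gather(async_create("Amber"), async_create("Russ"))

created = asyncio.run(main())
print(created)
```

This keeps the loop responsive, but the database work is still blocking code on borrowed threads, which is why the following slides ask for more.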

  157. ORM needs more work

  158. The ORM does a lot of
    things that cause
    cursor.execute() where
    you wouldn’t expect

  159. The backends need to
    be truly asynchronous

  160. More separation
    between SQL generation,
    and executing that SQL
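
One way to picture that separation: a pure function that only builds SQL, and a single, separate place that performs I/O. This is a sketch using sqlite3 from the standard library; build_insert and execute are hypothetical helpers, not Django internals, and the table and column names are interpolated directly, so they must come from trusted code.

```python
import sqlite3

def build_insert(table, values):
    """Pure SQL generation: no I/O, so it is safe to call from anywhere."""
    columns = ", ".join(values)
    placeholders = ", ".join("?" for _ in values)
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, tuple(values.values())

def execute(connection, statement):
    """The only place that touches the database; this is the part to make async."""
    sql, params = statement
    return connection.execute(sql, params)

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE person (name TEXT)")
statement = build_insert("person", {"name": "Amber"})
execute(connection, statement)
rows = connection.execute("SELECT name FROM person").fetchall()
print(statement, rows)
```

With the split in place, only execute() needs an asynchronous counterpart; the generation step can be shared unchanged.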

  161. Then we have all the
    requirements for an
    asynchronous Django!

  162. Django users have to be
    good async citizens

  163. Like I said, everything
    has to cooperate or it
    all falls apart

  164. yield from
    PEP 380 in Python 3.3, used by asyncio in Python 3.4

  165. await, async iterators,
    async context managers
    PEP 492 in Python 3.5
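
For reference, here is a small example that uses all three PEP 492 features together. The Countdown and Connection classes are invented for the illustration, and asyncio.run is used to start the loop, which needs Python 3.7 or later.

```python
import asyncio

class Countdown:
    """Async iterator: __anext__ is a coroutine, so it may await between items."""

    def __init__(self, start):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current == 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)       # a point where other tasks could run
        self.current -= 1
        return self.current + 1

class Connection:
    """Async context manager: setup and teardown may both await."""

    def __init__(self, log):
        self.log = log

    async def __aenter__(self):
        self.log.append("open")
        return self

    async def __aexit__(self, exc_type, exc, traceback):
        self.log.append("close")

async def main():
    log = []
    async with Connection(log):
        async for number in Countdown(3):
            log.append(number)
    return log

result = asyncio.run(main())
print(result)
```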

  166. Django might be able
    to support async & sync
    views
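
One possible shape for supporting both kinds of view is a dispatcher that checks what it has been given. This is speculative and not how Django works: dispatch and the two views are hypothetical, and asyncio stands in for whichever event loop would actually be underneath.

```python
import asyncio
import inspect

def sync_view(request):
    return f"sync response to {request}"

async def async_view(request):
    await asyncio.sleep(0)           # stands in for awaiting real I/O
    return f"async response to {request}"

async def dispatch(view, request):
    """Hypothetical handler: await coroutine views, push sync views to a thread."""
    if inspect.iscoroutinefunction(view):
        return await view(request)
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, view, request)

async def main():
    return [await dispatch(sync_view, "/a"), await dispatch(async_view, "/b")]

responses = asyncio.run(main())
print(responses)
```

Existing synchronous views keep working unchanged, while new views can opt in to being asynchronous.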

  167. WSGI would work as it
    does now

  168. If using Twisted as your
    web server, you can use
    async views

  169. Django’s ORM and
    other features would
    then be usable by
    Twisted libraries

  170. Then Django doesn’t
    need to care about
    WebSockets, or
    whatever comes next

  171. “Django should have been a
    Twisted plugin.”
    – someone, unless I imagined that

  172. The Future of Twisted

  173. Twisted isn’t perfect

  174. Contributor onboarding
    improvements

  175. Contributor tooling
    improvements

  176. Git migration

  177. Twisted’s future is new
    blood, and we need to
    work for that

  178. Adopting a Django-style
    Deprecation Policy
    (removing deprecated junk)

  179. Shedding the past
    (Python 2.6 support)

  180. Adopting Python 3
    features
    (async def, yield from)

  181. Twisted + Django

  182. I would like to see this
    happen

  183. Like I said earlier, you
    cannot get the upsides
    of async and sync code
    at the same time

  184. But with asyncio,
    writing asynchronous
    code in Python is
    becoming “normal”

  185. Features like yield
    from and async def can
    be adopted by Twisted,
    even though they’re
    targeted at asyncio

  186. This removes some of
    the difficulty of writing
    async code
    (“callback hell”)

  187. Makes async code look
    sequential

  188. Ugly hax:
    github.com/hawkowl/django

  189. Questions answered
    before you ask

  190. What about gevent?

  191. Glyph’s “Unyielding”
    https://goo.gl/lYDtct

  192. “Despite the fact that implicit coroutines
    masquerade under many different names, many of
    which don’t include the word “thread” – for
    example, “greenlets”, “coroutines”, “fibers”, “tasks”
    – green or lightweight threads are indeed threads
    … In the long run, when you build a system that
    relies upon them, you eventually have all the
    pitfalls and dangers of full-blown preemptive
    threads.”
    – Glyph

  193. What would an async
    Django get me?

  194. WebSockets
    More I/O efficiency
    You don’t need a task manager
    to run things after a response

  195. Why do you wear a red
    trenchcoat?

  196. (no text on this slide)

  197. Questions!
