The Future of Twisted, and Pretty Much Everything Else (PyCon CZ Keynote, 2015)

Keynote #1 at PyCon Czech Republic.


Amber Brown (HawkOwl)

November 14, 2015


  1. The Future of Twisted (and pretty much everything else), PyCon CZ, 2015
  2. Hello, I’m Amber Brown (HawkOwl)

  3. I live in Perth, Western Australia

  4. I organise Django Girls events!

  5. omg it’s russ I organise Django Girls events!

  6. I’m a Twisted core developer …and release manager (get hyped for 15.5!)
  7. (image by isometri.cc)

  8. None
  9. WARNING This talk is full of spiders

  10. No question is stupid

  11. Concepts

  12. For the purposes of this talk, synchronous code returns inline

  13. def sync(): return 1

  14. …and asynchronous code calls another function with a result at some later time

  15. def async_(func): func(1)

  16. However, this is also asynchronous

  17. def asyncyieldfrom():
          a = yield from somefunc()
          return a

  18. Contrary to how it looks, functions that use yield from don’t return an immediate result

  19. It’s a generator: Python suspends it at the yield point, and can run other code while it is suspended
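The suspension described above can be sketched with plain generators. This is a toy illustration, not Twisted code: `worker` and `run_round_robin` are hypothetical names, and the "scheduler" simply resumes each generator in turn.

```python
log = []

def worker(name, steps):
    """A generator: each bare `yield` is a point where it is suspended."""
    for i in range(steps):
        log.append((name, i))
        yield  # suspend here; the scheduler decides who runs next

def run_round_robin(tasks):
    """Hypothetical toy scheduler: resume each generator in turn until all finish."""
    tasks = list(tasks)
    while tasks:
        task = tasks.pop(0)
        try:
            next(task)  # run the task up to its next yield
        except StopIteration:
            continue  # this task is finished; do not reschedule it
        tasks.append(task)

run_round_robin([worker("a", 2), worker("b", 2)])
```

While "a" is suspended at its `yield`, "b" gets to run, so the entries in `log` interleave instead of one worker finishing before the other starts.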
  20. Blocking means that Python cannot run anything else at all during that period due to I/O operations

  21. Short, CPU-bound tasks are not considered “blocking”

  22. Long CPU-bound operations, and both “short” and long I/O-bound operations, are “blocking”

  23. “Short” I/O still takes a long time

  24. PING google.com ( 56 data bytes
      64 bytes: icmp_seq=0 ttl=60 time=13.217 ms
      64 bytes: icmp_seq=1 ttl=60 time=18.227 ms
      64 bytes: icmp_seq=2 ttl=60 time=13.117 ms
  25. 13ms in computer time is an eternity

  26. Courtesy of Cory Benfield (@Lukasaoz)

  27. What is Twisted?

  28. Asynchronous networking framework (helps you write network servers & clients)

  29. At least a decade old

  30. Stable & Mature (thanks to a robust Compatibility Policy)

  31. Many protocol implementations (HTTP/1.0+1.1, SMTP, IMAP, DNS, SSH, many many

  32. Python 2.7/3.3+ (Python 3.3+ port is incomplete, 50%+ there)

  33. Time-based versioning: 15.0 == 1st release in ’15, 15.5 == 6th release in ’15
  34. To explain the future, we need to explain the present

  35. How Twisted Works

  36. Cooperative Multitasking System

  37. Ideally, nothing blocks

  38. Everything revolves around the Reactor, and all the code is event-driven: things happen and the Reactor tells you
  39. Everything starts with a Protocol

  40. A Protocol implements a network protocol (eg. HTTP)

  41. Protocols are given a Transport to write to when a client/server connection is made

  42. Twisted’s Transports are capable of doing non-blocking socket I/O

  43. Twisted manages the Transport, and tries to write as much as it can

  44. If the socket send buffer is full, it keeps the unsent data in a secondary buffer and schedules it for later

  45. It then notifies the Reactor that it wants to send more data when it can
  46. How does Twisted know when more can be done?

  47. select, poll, epoll, kqueue

  48. Takes a list of file descriptors (eg. sockets) and returns the ones that can have further data written/read

  49. If more data can be written, the Reactor tells the Transport to flush what it can of its send buffer

  50. If more data can be read, the Reactor notifies Transport, which reads it and gives it to the Protocol
  51. At no point does the network I/O block
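The readiness check the Reactor relies on can be tried directly with the standard library's `selectors` module, which wraps select/poll/epoll/kqueue. This is only a sketch of the underlying mechanism, not Twisted's reactor; the socket pair stands in for a real client and server.

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # picks epoll/kqueue/poll/select for the platform
left, right = socket.socketpair()   # two connected sockets, standing in for client and server
left.setblocking(False)
right.setblocking(False)

sel.register(right, selectors.EVENT_READ)

# Nothing has been sent yet, so no descriptor is reported as readable.
ready_before = sel.select(timeout=0)

left.send(b"ping")

# Now the selector reports `right` as readable, so recv() will not block.
ready_after = sel.select(timeout=1)
received = [key.fileobj.recv(1024) for key, events in ready_after]

sel.unregister(right)
sel.close()
left.close()
right.close()
```

The program only calls `recv()` on sockets the selector has reported as ready, which is why the read never has to wait.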

  52. Each Protocol works together, yielding control

  53. Protocols implement useful things, with higher level abstractions

  54. For example, the SMTP Protocol sends an email and gets its delivery status

  55. How do we handle when function calls return before the data is sent?
  56. Deferreds

  57. <anger>

  58. “If you don’t understand Deferreds, you’re too stupid for Twisted”

  59. That belief has no place in any Twisted I’m a part of
  60. If you don’t “get” Deferreds, that is OUR failure.

  61. We need better documentation

  62. We need better examples

  63. We need to adopt syntactic changes that make it easier

  64. </anger>

  65. A Deferred is an object which will hold a result at some point in time

  66. Callbacks mean ‘when you have a result, call this function with the result’

  67. Deferreds have a “callback chain”, where the result is passed

  68. d = Deferred()
      d.addCallback(lambda t: t + 1)
      d.addCallback(print)
      d.callback(12)

  69. In [11]: d = Deferred()
      In [12]: d.addCallback(lambda t: t + 1)
      Out[12]: <Deferred at 0x10a52f440>
      In [13]: d.addCallback(print)
      Out[13]: <Deferred at 0x10a52f440>
      In [14]: d.callback(12)
      13
  70. addCallback returns the Deferred you are adding callbacks to, so you can chain it
  71. Deferred() \ .addCallback(lambda t: t + 1) \ .addCallback(print) \
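To make the callback chain less magical, here is a toy stand-in written from scratch. `MiniDeferred` is a hypothetical name; it has no errbacks and is not how Twisted implements `Deferred`. It only mirrors the `addCallback`/`callback` behaviour shown in the slides above.

```python
class MiniDeferred:
    """Toy stand-in for a Deferred; an illustration, not Twisted's implementation."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallback(self, func):
        self._callbacks.append(func)
        if self._fired:
            self._run()  # a result already exists, so run the new callback now
        return self  # returning self is what makes chaining possible

    def callback(self, result):
        if self._fired:
            raise RuntimeError("already fired")
        self._fired = True
        self._result = result
        self._run()

    def _run(self):
        # Each callback receives the previous callback's return value.
        while self._callbacks:
            func = self._callbacks.pop(0)
            self._result = func(self._result)


seen = []
d = MiniDeferred()
d.addCallback(lambda t: t + 1).addCallback(seen.append)
d.callback(12)
```

Nothing runs when the callbacks are added; both run, in order, once `callback(12)` supplies the result.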

  72. Callbacks can be synchronous (although they should not block) or return more Deferreds
  73. Many things return Deferreds

  74. >>> import treq
      >>> treq.get("https://google.com")
      <Deferred at 0x10d6db5c0>

  75. import treq
      from twisted.internet.task import react

      def get(reactor):
          d = treq.get("http://atleastfornow.net")
          d.addCallback(treq.content)
          d.addCallback(print)
          return d

      react(get)
  76. Deferreds are generally seen as “boilerplate-y”

  77. Which is why we have…

  78. @inlineCallbacks

  79. @inlineCallbacks is a decorator that makes Deferreds act like Futures/coroutines inside a function

  80. import treq
      from twisted.internet.task import react
      from twisted.internet.defer import inlineCallbacks

      @inlineCallbacks
      def get(reactor):
          request = yield treq.get("http://atleastfornow.net")
          content = yield treq.content(request)
          print(content)

      react(get)
  81. Simpler to understand and use

  82. You don’t manage a callback chain

  83. To wait for a Deferred to fire, use yield in the function

  84. Supported in Twisted since generators were introduced

  85. Works with regular Deferreds: a function wrapped with @inlineCallbacks returns a Deferred automatically

  86. Choose between the power of Deferreds and the ease of use of coroutines
  87. Return a value with defer.returnValue()
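How a decorator can turn `yield` into "wait for this Deferred" is easier to see in miniature. The sketch below is an assumption-laden toy, not Twisted's `inlineCallbacks`: `MiniDeferred` and `inline_callbacks` are hypothetical names, errors are not handled, and the generator uses a plain Python 3 `return` where Python 2 code would call `defer.returnValue()`.

```python
import functools

class MiniDeferred:
    """Toy Deferred stand-in (no errbacks); not Twisted's implementation."""
    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallback(self, func):
        self._callbacks.append(func)
        if self._fired:
            self._run()
        return self

    def callback(self, result):
        self._fired = True
        self._result = result
        self._run()

    def _run(self):
        while self._callbacks:
            self._result = self._callbacks.pop(0)(self._result)

def inline_callbacks(gen_func):
    """Hypothetical toy decorator: drive a generator that yields MiniDeferreds."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        done = MiniDeferred()  # fires with the generator's return value

        def step(value):
            try:
                waiting_on = gen.send(value)  # resume until the next yield
            except StopIteration as stop:
                done.callback(stop.value)
                return value
            waiting_on.addCallback(step)  # resume again when that result arrives
            return value  # pass the result through unchanged

        step(None)
        return done
    return wrapper

pending = MiniDeferred()

@inline_callbacks
def add_one_later():
    number = yield pending  # suspended here until `pending` fires
    return number + 1

results = []
add_one_later().addCallback(results.append)
results_before = list(results)  # still empty: the generator is suspended
pending.callback(41)
```

The decorated function returns its own Deferred straight away, and that Deferred fires only after the generator has been resumed with the awaited result.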

  88. The Future of Everything That Isn’t Twisted (in the web

  89. WSGI II Electric Boogaloo

  90. WSGI is currently inherently request/response

  91. WebSockets is useful, and WSGI II would need to support

  92. WebSockets 2?

  93. HTTP falls out of use?

  94. Metal WSGear Solid 3 Snake Eater

  95. Async is undergoing another renaissance

  96. Twisted, Asyncio, Tornado All good options

  97. Interoperability: Tornado works with Twisted; txaio allows supporting asyncio and Twisted with one code base
  98. Less about the concurrency framework

  99. Bytes In Abstractions Out

  100. Lukasa’s HyperH2

  101. Implementation of HTTP/2 protocol with no ties to concurrency frameworks

  102. Crossbar.io’s Autobahn

  103. Underlying protocol is non-specific

  104. Higher level abstractions use txaio to provide instantiation-time concurrency framework selection

  105. The hard work is separating the things that do logic and things that do I/O
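A small sketch of the "bytes in, abstractions out" idea: the protocol logic below never touches a socket, so any framework (Twisted, asyncio, or a test) can feed it bytes. `LineParser` and `receive_data` are hypothetical names for illustration and are not taken from HyperH2 or Autobahn.

```python
class LineParser:
    """Hypothetical protocol core: bytes in, complete lines out, no I/O."""

    def __init__(self):
        self._buffer = b""

    def receive_data(self, data):
        """Feed raw bytes from any source; return the lines completed by this call."""
        self._buffer += data
        # Everything before the last CRLF is complete; the remainder stays buffered.
        *lines, self._buffer = self._buffer.split(b"\r\n")
        return lines


parser = LineParser()
first = parser.receive_data(b"HELO exa")                  # no complete line yet
second = parser.receive_data(b"mple.org\r\nQUIT\r\nNO")   # two lines complete, "NO" is buffered
```

Because the parser only transforms bytes into events, the same class can be tested without a network and reused under any concurrency framework.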
  106. A Case Study: Django

  107. It’s not impossible to go asynchronous

  108. I have a proof of concept that proves Django can make the transition
  109. None
  110. This is Django…

  111. This is Django… …with async views…

  112. This is Django… …with async views… …with an async ORM…

  113. This is Django… …with async views… …with an async ORM… …running on Twisted Web…

  114. This is Django… …with async views… …with an async ORM… …running on Twisted Web… …with no WSGI.
  115. Live Demo

  116. Caveats: I wrote this on a plane, the ORM runs in a threadpool, the tests fail hilariously
  117. async_create() which returns a Deferred, etc

  118. This is a proof of concept! The ORM needs more

  119. The ORM does a lot of things that call cursor.execute() where you wouldn’t expect it
  120. The backends need to be truly asynchronous

  121. More separation between SQL generation, and executing that SQL

  122. Then we have all the requirements for an asynchronous Django!

  123. The Future of Python

  124. Python 3 is (finally!) hitting critical mass

  125. With Python 3.5, porting is easier

  126. …but Python 3.3 and 3.4 are still huge deployment targets

  127. Next PyPy3 will target the Python 3.3 standard

  128. Python 3.4 is still the default Python on every Linux distribution (Okay, except for Arch)
  129. Python 3.5 is shipped on Ubuntu 15.10

  130. So we still need to care about older Python 3

  131. Porting Real Code to Python 3

  132. Porting to Python 3 is easy!

  133. …assuming you haven’t got any legacy

  134. …and assuming you have unittests for every part of your codebase, and every way it could be used
  135. …and assuming you’re not relying on Python 2 misfeatures

  136. So what if you have, you don’t, and you are?

  137. Twisted has code that’s unchanged from Py2.3 (Don’t fix what ain’t broke)

  138. Twisted has thousands of unittests, but it takes one oversight to have a hidden bug

  139. Lots of Twisted relies on string formatting on bytes objects

  140. …plus we have a Pyrex extension for some reactor support on Windows

  141. How do we get ourselves out of this mess, and why has it taken so long?
  142. Syntax & Semantics

  143. Python 2:   raise Exception, "oh no"
       Python 2/3: raise Exception("oh no")
       You must raise instances of Exceptions, not a tuple of the Exception and its arguments

  144. Python 2:   print "Hello!"
       Python 2/3: from __future__ import print_function
                   print("Hello!")
       print is a function, not a keyword.

  145. Python 2:   import innerpackage
       Python 2/3: from __future__ import absolute_import
                   from . import innerpackage
       Python 3 imports are absolute.

  146. Python 2:   2 / 5.0 == 0.4
                   2 / 5 == 0
       Python 2/3: from __future__ import division
                   2 / 5 == 0.4
                   2 // 5 == 0
       / in Python 3 returns a float, // returns an int.
  147. Strings

  148. bytes == str
       Python 2: True
       Python 3: False
       In Python 2, bytes is an alias to str. In Python 3, it is its own type. DO NOT USE unicode_literals

  149. unicode == str
       Python 2: False
       Python 3: NameError: name 'unicode' is not defined
       The unicode type has been renamed to str and is not aliased in Python 3. Use from six import text_type instead.
  150. Python has three string types

  151. explicit bytes: b"I'm some bytes!" Used for writing to the network, to binary files, etc

  152. explicit Unicode: u"I'm a Unicode string!" Used whenever text is required. Encoded to/decoded from bytes

  153. str: "I'm a str!" Used for Python identifiers

  154. What should be a str? A: More or less nothing

  155. What should be bytes? A: Data that goes over the network or in files
  156. What should be Unicode? A: Everything else.

  157. Encodings are tricky

  158. Pop Quiz! What encoding is a HTTP header?

  159. A: Who knows! But ASCII probably works.

  160. “Historically, HTTP has allowed field content with text in the ISO-8859-1 charset, supporting other charsets only through use of RFC2047 encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content as opaque data.” (RFC 7230, section 3.2.4, “Field Parsing”)
  161. None
  162. None
  163. b"Hello %s" % (b"PyCon CZ",)
       Python 2:   'Hello PyCon CZ'
       Python 3.4: <explodes>
       Python 3.5: b'Hello PyCon CZ'
       Python 3 removed formatting on bytes, and reinstated it in Python 3.5.

  164. Python 2:   b"Content-Length: %d" % (110,)
       Python 2/3: b"Content-Length: " + str(110).encode('ascii')
       Being unable to use percent-formatting means that some easy tasks become much more verbose

  165. b"Hello {name}".format(name=b"PyCon CZ")
       Python 2:   'Hello PyCon CZ'
       Python 3.4: AttributeError: 'bytes' object has no attribute 'format'
       Python 3 removed .format() on bytes, and it did not return in 3.5.
  166. So how do you format bytes on Python 2 and

  167. A: Decode the format string and all arguments, then format, and encode back to bytes (Very slow, doesn’t work if you have random binary trash in it)

  168. B: Use percent formatting, drop 3.3 & 3.4 support (Not desirable, 3.4 is still a huge platform)

  169. C: Eschew formatting altogether and concatenate some bytes (Not as easy to read or understand, but works everywhere and is fast)
  170. Twisted picked C.
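Option C in practice might look like the sketch below: the header is assembled purely by joining bytes, so no percent-formatting or `.format()` on bytes is needed. `content_length_header` is a hypothetical helper for illustration, not a Twisted function.

```python
def content_length_header(length):
    """Build a header line by joining bytes pieces (no % or .format() on bytes)."""
    return b"".join([
        b"Content-Length: ",
        str(length).encode("ascii"),  # digits are always ASCII-safe
        b"\r\n",
    ])


header = content_length_header(110)
```

It is wordier than `b"Content-Length: %d" % (110,)`, but it avoids formatting on bytes entirely.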

  171. Before:
       response = b64encode('%s:%s' % (self.username, self.password))
       response = response.strip('=')
       After:
       response = b64encode(b''.join([self.username, b':', self.password]))
       response = response.strip(b'=')

  172. So we need to rewrite every instance of percent formatting or use of .format() for on-wire protocols
  173. 13.2: ~1700 instances 14.0: ~1750 instances 15.5: ~1800 instances

  174. Not all instances are for wire protocols, but it’s safe to assume that many are
  175. Basic Types

  176. type({}.keys())
       Python 2: <type 'list'>
       Python 3: <class 'dict_keys'>
       The keys, items, and values methods on dicts no longer return lists, but view objects.
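A short sketch of what this change means for code that must run on both versions. On Python 3 the object returned by `keys()` tracks later changes to the dict, so code that needs list behaviour (indexing, a stable snapshot) can wrap the call in `list()`. The variable names are only illustrative.

```python
settings = {"host": "localhost"}

keys = settings.keys()             # Python 3: a live view; Python 2: a list copy
settings["port"] = 8080            # on Python 3 the view sees this new key

keys_after = sorted(keys)
snapshot = list(settings.keys())   # explicit list: behaves the same on 2 and 3
```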
  177. Python 2:   Exception("oh no!").message
       Python 2/3: Exception("oh no!").args[0]
       Exception.message was introduced in Python 2.5, deprecated in Python 2.6, and removed in Python 3.0.
  178. Classes & Inheritance

  179. Python 2:   class MyExcept:
       Python 2/3: class MyExcept(BaseException):
       Custom Exceptions must inherit from BaseException.

  180. class Foo: pass
       print(Foo.__mro__)
       Python 2: AttributeError: class Foo has no attribute '__mro__'
       Python 3: (<class '__main__.Foo'>, <class 'object'>)
       All classes are new-style classes in Python 3.
  181. Migrating to all-new-style classes is hard

  182. class OldStyle: pass class Foo(OldStyle): pass

  183. class OldStyle: pass class Foo(object, OldStyle): pass

  184. >>> Foo.__mro__ (<class '__main__.Foo'>, <type 'object'>, <class __main__.OldStyle>)

  185. TypeError: Cannot create a consistent method resolution order (MRO) for bases OldStyle, object
  186. class Bad(object, OldStyle): pass class Good(OldStyle, object): pass
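The ordering problem can be reproduced on Python 3, where every class is new-style. In this sketch `Base` plays the role of the formerly old-style class; the names are illustrative only.

```python
class Base:  # on Python 3 this implicitly inherits from object
    pass

# Listing object last is consistent: Good -> Base -> object.
class Good(Base, object):
    pass

# Listing object first asks for object to come before its own subclass Base.
try:
    class Bad(object, Base):
        pass
except TypeError as exc:
    bad_error = str(exc)
else:
    bad_error = None

good_mro = [cls.__name__ for cls in Good.__mro__]
```

The failure happens when the `class` statement executes, so a misordered base list breaks at import time on Python 3.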

  187. class Foo(object):
           def __init__(self): pass
       print(Foo.__init__)
       Python 2: <unbound method Foo.__init__>
       Python 3: <function Foo.__init__ at 0x10908cbf8>
       Python 3 removes the concept of “unbound methods”.

  188. Python 2:
       class MyProtocol(object):
           implements(ISomeInterface)
           pass
       Python 3:
       @implementer(ISomeInterface)
       class MyProtocol(object):
           pass
       On Python 3, zope.interface declarations must use the @implementer class decorator.
  189. New Features Things on the horizon

  190. yield from (PEP 380, Python 3.3+): not yet implemented in Twisted

  191. def some_function():
           page = yield from treq.get(...)
           content = yield from treq.content(page)
           return content
  192. await, async iterators, async context managers, async def: PEP 492 in Python 3.5. Not yet implemented in Twisted

  193. async def some_function():
           page = await treq.get(...)
           content = await treq.content(page)
           return content
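Since the slides note that Twisted does not yet support this syntax, the runnable sketch below uses the standard library's asyncio instead, purely to show how `async def` and `await` behave. `fetch_page` is a hypothetical stand-in that sleeps instead of doing network I/O, and `asyncio.run` assumes Python 3.7 or newer.

```python
import asyncio

async def fetch_page(url):
    """Hypothetical stand-in for a network call; yields to the loop instead of doing I/O."""
    await asyncio.sleep(0)
    return "<html>%s</html>" % url

async def some_function():
    page = await fetch_page("example.invalid")  # suspended until fetch_page finishes
    return len(page)

length = asyncio.run(some_function())
```

Calling `some_function()` on its own only creates a coroutine object; nothing runs until an event loop drives it.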
  194. async def some_db_function(db):
           async with db.transaction():
               await db.execute('DROP * FROM *;')
           return 'bye bye data!'
  195. The Future of Twisted

  196. Twisted isn’t perfect

  197. Contributor onboarding improvements

  198. Contributor tooling improvements

  199. Git migration

  200. Twisted’s future is new blood, and we need to work for that
  201. Adopting a Django-style Deprecation Policy (removing deprecated junk)

  202. Shedding the past (Python 2.6 support)

  203. Adopting Python 3 features (async def, yield from)

  204. Questions answered before you ask

  205. What about gevent?

  206. Glyph’s “Unyielding” https://goo.gl/lYDtct

  207. “Despite the fact that implicit coroutines masquerade under many different names, many of which don’t include the word “thread” – for example, “greenlets”, “coroutines”, “fibers”, “tasks” – green or lightweight threads are indeed threads … In the long run, when you build a system that relies upon them, you eventually have all the pitfalls and dangers of full-blown preemptive threads.” (Glyph)
  208. Why do you wear a red trenchcoat?

  209. None
  210. Questions! Statements will be ignored! :)