Slide 1

Slide 1 text

Django & Twisted Django Under The Hood, 2015

Slide 2

Slide 2 text

Hello, I’m Amber Brown (HawkOwl)

Slide 3

Slide 3 text

I live in Perth, Western Australia

Slide 4

Slide 4 text

I organise Django Girls events!

Slide 5

Slide 5 text

omg it’s russ I organise Django Girls events!

Slide 6

Slide 6 text

I serve on the Django Code of Conduct Committee.

Slide 7

Slide 7 text

I’m a Twisted core developer …and release manager (get hype for 15.5!)

Slide 8

Slide 8 text

(image by isometri.cc)

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Paraphrasing the DjangoCon AU 2014 Keynote

Slide 11

Slide 11 text

I’m an invited speaker

Slide 12

Slide 12 text

It’s expected that I have something of use to tell you

Slide 13

Slide 13 text

Talks are only worthwhile if they educate or entertain

Slide 14

Slide 14 text

So I’m going to say this upfront, with no ambiguity:

Slide 15

Slide 15 text

This talk’s conclusion is NOT “Django sucks”.

Slide 16

Slide 16 text

This talk’s conclusion is NOT “using Twisted makes you a better programmer”.

Slide 17

Slide 17 text

This talk’s conclusion is that the future of Python web development is working together.

Slide 18

Slide 18 text

Any interpretation drawing a different conclusion is incorrect

Slide 19

Slide 19 text

>>> Django == good
True
>>> Twisted == good
True

Slide 20

Slide 20 text

WARNING This talk is full of spiders

Slide 21

Slide 21 text

No question is stupid

Slide 22

Slide 22 text

Concepts

Slide 23

Slide 23 text

For the purposes of this talk, synchronous code returns in-line

Slide 24

Slide 24 text

def sync():
    return 1

Slide 25

Slide 25 text

…and asynchronous code calls another function with a result at some later time

Slide 26

Slide 26 text

def async_(func):  # "async" is a reserved word since Python 3.7
    func(1)

Slide 27

Slide 27 text

However, this is also asynchronous

Slide 28

Slide 28 text

def asyncyieldfrom():
    a = yield from somefunc()
    return a

Slide 29

Slide 29 text

Contrary to how it looks, functions that use yield from do not “return immediately”

Slide 30

Slide 30 text

Python suspends at the yield point, and can run other things — purely syntactic sugar
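
A minimal sketch of the suspension described above, using plain generators and a hand-written round-robin loop rather than a real event loop. The names `task` and `run_round_robin` are made up for illustration and are not part of Twisted or asyncio.

```python
from collections import deque


def task(name, steps, log):
    """A generator that records progress and suspends at every yield."""
    for step in range(steps):
        log.append((name, step))
        yield  # execution pauses here until the scheduler resumes this task


def run_round_robin(tasks):
    """Resume each suspended generator in turn until all are exhausted."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)        # run until the next yield point
        except StopIteration:
            continue             # this task has finished
        queue.append(current)    # not finished yet: schedule it again


log = []
run_round_robin([task("a", 2, log), task("b", 2, log)])
print(log)
```

The log interleaves the two tasks, which is the point: while one generator is parked at its yield, the other gets to run.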

Slide 31

Slide 31 text

Blocking, for the purposes of this talk, means that Python cannot run anything else at all for that period because it is waiting on I/O

Slide 32

Slide 32 text

Short, CPU-bound tasks are not considered “blocking”

Slide 33

Slide 33 text

Long CPU-bound, or “short”/long I/O bound operations are “blocking”

Slide 34

Slide 34 text

“Short” I/O still takes a long time

Slide 35

Slide 35 text

PING google.com (150.101.170.180): 56 data bytes
64 bytes: icmp_seq=0 ttl=60 time=13.217 ms
64 bytes: icmp_seq=1 ttl=60 time=18.227 ms
64 bytes: icmp_seq=2 ttl=60 time=13.117 ms

Slide 36

Slide 36 text

13ms in computer time is an eternity
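
To put a rough number on that, here is a back-of-the-envelope calculation. The 3 GHz clock speed is an assumption picked for illustration and does not come from the talk; the 13 ms figure is rounded from the ping times above.

```python
CLOCK_HZ = 3_000_000_000   # assumed 3 GHz core (illustrative, not from the talk)
ROUND_TRIP_MS = 13         # rounded from the ping output

# Integer arithmetic keeps the result exact.
cycles_spent_waiting = CLOCK_HZ * ROUND_TRIP_MS // 1000
print(f"{cycles_spent_waiting:,} clock cycles pass during one round trip")
```

Under that assumption a single core ticks tens of millions of times while one "short" network round trip completes.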

Slide 37

Slide 37 text

What is Twisted?

Slide 38

Slide 38 text

Asynchronous networking framework

Slide 39

Slide 39 text

At least a decade old

Slide 40

Slide 40 text

Stable & Mature (thanks to a robust Compatibility Policy)

Slide 41

Slide 41 text

Many protocol implementations (HTTP/1.0+1.1, SMTP, IMAP, DNS, SSH, many many more)

Slide 42

Slide 42 text

Python 2.7/3.3+ (Python 3.3+ port is incomplete, 50%+ there)

Slide 43

Slide 43 text

Time-based versioning
15.0 == 1st release in ’15
15.5 == 6th release in ’15

Slide 44

Slide 44 text

How Twisted’s Reactor Works

Slide 45

Slide 45 text

Sockets usually block until the data is sent

Slide 46

Slide 46 text

Twisted configures the sockets to be non-blocking

Slide 47

Slide 47 text

Given a non-blocking socket, socket.write() will write to the send buffer and return immediately

Slide 48

Slide 48 text

If the send buffer is full, it raises EWOULDBLOCK
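
A small standard-library sketch of that behaviour, with no Twisted involved. The slides say socket.write(); on Python's socket objects the method is send(), and the error surfaces as BlockingIOError carrying EWOULDBLOCK (or EAGAIN, which is the same value on many platforms). The 64 KiB chunk size is arbitrary.

```python
import errno
import socket

# A connected pair of sockets in one process; nothing ever reads from `right`.
left, right = socket.socketpair()
left.setblocking(False)

chunk = b"x" * 65536
bytes_buffered = 0
full_errno = None
try:
    while True:
        # Returns at once with the number of bytes the kernel accepted.
        bytes_buffered += left.send(chunk)
except BlockingIOError as exc:
    # The kernel send buffer is full; a blocking socket would have waited here.
    full_errno = exc.errno
finally:
    left.close()
    right.close()

print(bytes_buffered, errno.errorcode[full_errno])
```

How many bytes fit before the error depends on the operating system's buffer sizes, so the first number printed will vary.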

Slide 49

Slide 49 text

Further socket.write() calls are put in a secondary send buffer by Twisted

Slide 50

Slide 50 text

This secondary send buffering is taken care of by the Twisted Protocol class (socket.write() is never called directly by user code)

Slide 51

Slide 51 text

socket.read() is also called automatically by Protocol

Slide 52

Slide 52 text

Twisted’s reactor then alerts Protocol when there is more data to be read, or more data can be written

Slide 53

Slide 53 text

select, poll, epoll, kqueue

Slide 54

Slide 54 text

Each takes a list of file descriptors (e.g. sockets) and returns the ones that can have further data written/read
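
A minimal sketch of that readiness check using select from the standard library. The two socket pairs are only there to have something to ask about; Twisted's reactors wrap calls like this one.

```python
import select
import socket

a, b = socket.socketpair()
c, d = socket.socketpair()
b.send(b"ping")   # now `a` has data waiting to be read; `c` does not

# select(read candidates, write candidates, error candidates, timeout)
readable, _, _ = select.select([a, c], [], [], 1.0)
_, writable, _ = select.select([], [a, c], [], 1.0)

print("readable:", ["a" if s is a else "c" for s in readable])
print("writable:", ["a" if s is a else "c" for s in writable])

for sock in (a, b, c, d):
    sock.close()
```

Only the socket with pending data comes back as readable, while both come back as writable because their send buffers are empty.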

Slide 55

Slide 55 text

If more data can be written, Protocol tries to empty its secondary send buffer

Slide 56

Slide 56 text

If more data can be read, Protocol reads it and gives it to user code with the overridden dataReceived method
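
The pieces so far (a secondary send buffer, readiness notifications, and dataReceived) can be sketched together with the standard-library selectors module. `ToyProtocol` and `run_until_idle` are hypothetical names for illustration; this is not Twisted's Protocol or reactor, and it leaves out connection close and error handling.

```python
import selectors
import socket


class ToyProtocol:
    """Toy stand-in for a Protocol; not Twisted's class."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.send_buffer = b""   # the "secondary send buffer"
        self.received = []

    def write(self, data):
        # User code never touches the socket: it only queues bytes.
        self.send_buffer += data

    def dataReceived(self, data):
        # User code would override this to consume incoming bytes.
        self.received.append(data)

    def doWrite(self):
        sent = self.sock.send(self.send_buffer)
        self.send_buffer = self.send_buffer[sent:]

    def doRead(self):
        self.dataReceived(self.sock.recv(4096))


def run_until_idle(protocols, timeout=0.1):
    """Tiny reactor loop: stop once nothing is ready and nothing is queued."""
    selector = selectors.DefaultSelector()
    for protocol in protocols:
        selector.register(protocol.sock, selectors.EVENT_READ, data=protocol)
    while True:
        for protocol in protocols:
            events = selectors.EVENT_READ
            if protocol.send_buffer:
                events |= selectors.EVENT_WRITE   # only ask about writes when needed
            selector.modify(protocol.sock, events, data=protocol)
        ready = selector.select(timeout)
        if not ready and not any(p.send_buffer for p in protocols):
            break
        for key, mask in ready:
            if mask & selectors.EVENT_WRITE:
                key.data.doWrite()
            if mask & selectors.EVENT_READ:
                key.data.doRead()
    selector.close()


left_sock, right_sock = socket.socketpair()
left, right = ToyProtocol(left_sock), ToyProtocol(right_sock)
left.write(b"hello ")
left.write(b"world")
run_until_idle([left, right])
left_sock.close()
right_sock.close()
print(b"".join(right.received))
```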

Slide 57

Slide 57 text

That handles sending/receiving data, but we operate on a higher level

Slide 58

Slide 58 text

Each Protocol implements something — WebSockets, SMTP, et al

Slide 59

Slide 59 text

The Protocol is asynchronous, so the consumption of its data must also be asynchronous

Slide 60

Slide 60 text

Deferreds

Slide 61

Slide 61 text

Slide 62

Slide 62 text

“If you don’t understand Deferreds, you’re too stupid for Twisted”

Slide 63

Slide 63 text

That belief has no place in any Twisted I’m a part of

Slide 64

Slide 64 text

If you don’t “get” Deferreds, that is OUR failure.

Slide 65

Slide 65 text

We need better documentation

Slide 66

Slide 66 text

We need better examples

Slide 67

Slide 67 text

We need to adopt syntactic changes that make it easier

Slide 68

Slide 68 text

Slide 69

Slide 69 text

A Deferred is an object that will hold a result at some point in time

Slide 70

Slide 70 text

Callbacks mean ‘when you have a result, call this function with the result’

Slide 71

Slide 71 text

Deferreds have a “callback chain”, where the result is passed through

Slide 72

Slide 72 text

d = Deferred()
d.addCallback(lambda t: t + 1)
d.addCallback(lambda t: print(t))
d.callback(12)

Slide 73

Slide 73 text

>>> d = Deferred()
>>> d.addCallback(lambda t: t + 1)
>>> d.addCallback(lambda t: print(t))
>>> d.callback(12)
13

Slide 74

Slide 74 text

addCallback returns a Deferred, so you can chain it

Slide 75

Slide 75 text

Deferred() \
    .addCallback(lambda t: t + 1) \
    .addCallback(lambda t: print(t)) \
    .callback(12)

Slide 76

Slide 76 text

Callbacks can be synchronous (although they should not block) or return more Deferreds
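
To make the callback chain less magical, here is a deliberately tiny model of one. `ToyDeferred` is a hypothetical class written for this sketch. It is not Twisted's Deferred: it has no errbacks and no error handling, and it only shows how a result flows through the chain and how the chain pauses when a callback returns another deferred.

```python
class ToyDeferred:
    """Illustrative only: not Twisted's Deferred."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._waiting = False   # True while paused on an inner ToyDeferred
        self._result = None

    def addCallback(self, func):
        self._callbacks.append(func)
        if self._fired:
            self._run()
        return self  # returning self is what makes chaining possible

    def callback(self, result):
        self._fired = True
        self._result = result
        self._run()

    def _run(self):
        while self._callbacks and not self._waiting:
            func = self._callbacks.pop(0)
            outcome = func(self._result)
            if isinstance(outcome, ToyDeferred):
                # Pause this chain until the inner one has a result.
                self._waiting = True
                outcome.addCallback(self._resume)
            else:
                self._result = outcome

    def _resume(self, inner_result):
        self._waiting = False
        self._result = inner_result
        self._run()
        return inner_result


seen = []
inner = ToyDeferred()
d = ToyDeferred()
d.addCallback(lambda t: t + 1).addCallback(lambda t: inner).addCallback(seen.append)
d.callback(12)        # runs t + 1, then pauses because a callback returned `inner`
inner.callback(100)   # the outer chain resumes with inner's result
print(seen)
```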

Slide 77

Slide 77 text

Many things return Deferreds

Slide 78

Slide 78 text

>>> import treq
>>> treq.get("https://google.com")

Slide 79

Slide 79 text

import treq
from twisted.internet.task import react

def get(reactor):
    d = treq.get("http://atleastfornow.net")
    d.addCallback(treq.content)
    d.addCallback(lambda _: print(_))
    return d

react(get)

Slide 80

Slide 80 text

@inlineCallbacks

Slide 81

Slide 81 text

inlineCallbacks makes Deferreds act like Futures/coroutines

Slide 82

Slide 82 text

import treq
from twisted.internet.task import react
from twisted.internet.defer import inlineCallbacks

@inlineCallbacks
def get(reactor):
    request = yield treq.get("http://atleastfornow.net")
    content = yield treq.content(request)
    print(content)

react(get)

Slide 83

Slide 83 text

Supported in Twisted since generators were introduced

Slide 84

Slide 84 text

Return a value with defer.returnValue()

Slide 85

Slide 85 text

Works with regular Deferreds — a function wrapped with inlineCallbacks returns a Deferred automatically

Slide 86

Slide 86 text

To wait for a Deferred to fire, use yield in the function
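
A rough sketch of the idea behind such a decorator, built on concurrent.futures.Future from the standard library so that it runs without Twisted. The decorator name `inline` is made up for this example. Twisted's real inlineCallbacks does considerably more, including propagating errors into the generator, which this sketch omits.

```python
import functools
from concurrent.futures import Future


def inline(gen_func):
    """Drive a generator: each yielded Future resumes it with that Future's result."""

    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        final = Future()                 # what the caller gets back immediately
        gen = gen_func(*args, **kwargs)

        def step(value=None):
            try:
                waited_on = gen.send(value)   # run until the next yield
            except StopIteration as stop:
                final.set_result(stop.value)  # the generator returned
                return
            # Resume the generator once the yielded Future has a result.
            waited_on.add_done_callback(lambda f: step(f.result()))

        step()
        return final

    return wrapper


pending = Future()


@inline
def add_one():
    value = yield pending   # "wait" here without blocking the caller
    return value + 1


outcome = add_one()
print(outcome.done())   # the wrapped function has returned, but has no result yet
pending.set_result(41)
print(outcome.result())
```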

Slide 87

Slide 87 text

Making Django Asynchronous

Slide 88

Slide 88 text

Django is synchronous at its core

Slide 89

Slide 89 text

WSGI relies on what it calls being synchronous

Slide 90

Slide 90 text

Django’s ORM does blocking I/O

Slide 91

Slide 91 text

Making either of these asynchronous is complex

Slide 92

Slide 92 text

Asynchronousness can’t be bolted on

Slide 93

Slide 93 text

Everything has to cooperate or everything falls apart

Slide 94

Slide 94 text

“Common Sense”
async == hard
sync == easy

Slide 95

Slide 95 text

In reality, each approach has tradeoffs

Slide 96

Slide 96 text

Synchronous Upsides
• Code flow is easier to understand: do x, then y
• Only one “thread” of execution, for simplicity
• Many libraries are synchronous

Slide 97

Slide 97 text

Synchronous Downsides
• You can only do one thing at once
• Although suited to the request/response cycle, it can only really do that
• Persistent connections are not simple to implement

Slide 98

Slide 98 text

Asynchronous Upsides
• Massively scalable network concurrency
• Multiple “threads” of execution: the code handling the request doesn’t have to finish after the request is written
• Handling persistent/evented connections is super easy
• Reactor model async is threadless
• Python 3 adds some syntactic sugar that makes it easier to write

Slide 99

Slide 99 text

Asynchronous Downsides
• “Callback hell” when using raw futures/deferreds
• You have to be a good citizen: blocking in the reactor loop is disastrous for performance
• Doing I/O is “harder” because you have to be explicit about it
• Python 2 lacks a bunch of async syntactic sugar

Slide 100

Slide 100 text

You can’t get the upsides of both

Slide 101

Slide 101 text

But you can try!

Slide 102

Slide 102 text

Threaded WSGI Runner
• The standard Django deployment method: run lots of threads, so it doesn’t matter if it blocks
• Each thread is blocking, so it can’t run multiple I/O operations at once
• To handle many concurrent requests, you need many threads
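
A small sketch of the threaded approach using a standard-library thread pool. `handle_request` is a made-up stand-in for a blocking view, and the sleep stands in for blocking I/O such as a database query. No WSGI server is involved here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    time.sleep(0.2)   # stands in for blocking I/O inside a view
    return f"response {request_id}"


started = time.monotonic()
# One thread per in-flight request: each thread blocks, but they overlap.
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(handle_request, range(4)))
elapsed = time.monotonic() - started

print(responses)
print(f"took about {elapsed:.2f}s")
```

Four threads overlap their waits; with max_workers=1 the same four requests would queue up behind each other, which is why this model needs many threads to serve many concurrent requests.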

Slide 103

Slide 103 text

Hendrix
• Hendrix is a “Twisted Django”
• WSGI server using Twisted, plus WebSockets
• Multiprocessing, multithreaded
• https://github.com/hangarunderground/hendrix

Slide 104

Slide 104 text

Crochet
• Run Twisted code side-by-side with blocking code
• Runs a Twisted reactor in another thread, rather than Twisted calling Django
• https://github.com/itamarst/crochet
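
The "event loop in another thread" arrangement can be illustrated with the standard library alone. This sketch uses an asyncio loop instead of a Twisted reactor and does not use or reproduce Crochet's API; `fetch_answer` and `blocking_view` are hypothetical names.

```python
import asyncio
import threading

# Run an event loop in a background thread, alongside ordinary blocking code.
loop = asyncio.new_event_loop()
thread = threading.Thread(target=loop.run_forever, daemon=True)
thread.start()


async def fetch_answer():
    await asyncio.sleep(0.05)   # stands in for non-blocking network I/O
    return 42


def blocking_view():
    """Ordinary synchronous code that hands one job to the event loop."""
    future = asyncio.run_coroutine_threadsafe(fetch_answer(), loop)
    return future.result(timeout=5)   # blocks only the calling thread


answer = blocking_view()

# Shut the background loop down cleanly.
loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
loop.close()
print(answer)
```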

Slide 105

Slide 105 text

The Future of Django (Django Channels)

Slide 106

Slide 106 text

Brainchild of Andrew Godwin

Slide 107

Slide 107 text

Django Channels makes Django event-driven

Slide 108

Slide 108 text

Asynchronous server (Twisted) + Synchronous “workers”

Slide 109

Slide 109 text

Requests and WebSocket events are now “events” sent through “channels”

Slide 110

Slide 110 text

You write synchronous code which handles these events

Slide 111

Slide 111 text

Channel events go on a queue, and are picked up by workers

Slide 112

Slide 112 text

Workers can also put things on the queue (but can’t get the result)
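
The queue-and-workers shape can be sketched with the standard library. This is not the Channels API: `channel`, `count_view`, and `worker` are names invented for the sketch, and the in-process queue stands in for whatever backend carries the messages.

```python
import queue
import threading

channel = queue.Queue()   # stands in for a named channel
page_views = {}
lock = threading.Lock()


def count_view(message):
    """A synchronous consumer: handles one event at a time."""
    with lock:
        path = message["path"]
        page_views[path] = page_views.get(path, 0) + 1


def worker():
    while True:
        message = channel.get()
        try:
            if message is None:   # sentinel: no more work
                break
            count_view(message)
        finally:
            channel.task_done()


workers = [threading.Thread(target=worker) for _ in range(2)]
for w in workers:
    w.start()

for path in ["/", "/about", "/"]:
    channel.put({"path": path})   # fire and forget: no result comes back

for _ in workers:
    channel.put(None)
channel.join()
for w in workers:
    w.join()
print(page_views)
```

The sender never sees a return value from `count_view`, which mirrors the "can't get the result" limitation.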

Slide 113

Slide 113 text

Channels Upsides
• It allows you to use WebSockets!
• If you don’t care about the response (e.g. a page view counter), it can be sent by a channel and run by a worker without blocking the current event
• The workers don’t have to be on the same machine, allowing distribution

Slide 114

Slide 114 text

Channels Downsides
• You can’t get the results of events you create in your code
• Your code can still only “do” one thing at a time
• Your code is a few steps removed from the real WebSocket or HTTP connections, which makes it less flexible

Slide 115

Slide 115 text

So, what does Channels look like?

Slide 116

Slide 116 text

No content

Slide 117

Slide 117 text

When an HTTP/WebSocket event comes in from a client, it sends a message to a channel

Slide 118

Slide 118 text

You implement consumers for these channels

Slide 119

Slide 119 text

You are given a channel to send the result of your consumer when it is called

Slide 120

Slide 120 text

In the case of an HTTP request, you send back a “channel encoded” response object

Slide 121

Slide 121 text

In the case of WebSockets, you send back content

Slide 122

Slide 122 text

This content is then returned to the client

Slide 123

Slide 123 text

WebSocket clients can be put into “Groups”

Slide 124

Slide 124 text

You can then broadcast a message out to a Group
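
A toy model of group broadcast, to show the bookkeeping involved. The function names and the in-memory dictionaries are assumptions made for this sketch; they are not taken from Channels.

```python
from collections import defaultdict

group_members = defaultdict(set)   # group name -> reply channel names
outboxes = defaultdict(list)       # reply channel name -> messages queued for that client


def group_add(group, reply_channel):
    group_members[group].add(reply_channel)


def group_discard(group, reply_channel):
    group_members[group].discard(reply_channel)


def group_send(group, content):
    """Queue the same content for every client currently in the group."""
    for reply_channel in group_members[group]:
        outboxes[reply_channel].append(content)


group_add("chat", "client-1")
group_add("chat", "client-2")
group_send("chat", "hello everyone")
group_discard("chat", "client-2")
group_send("chat", "only client-1 sees this")
print(dict(outboxes))
```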

Slide 125

Slide 125 text

What makes it different?

Slide 126

Slide 126 text

Channels doesn’t actually make your code asynchronous, it just adds async runners for your sync code

Slide 127

Slide 127 text

It doesn’t tackle the “hard” problem of running Django asynchronously

Slide 128

Slide 128 text

So it doesn’t get all the benefits it would if it did

Slide 129

Slide 129 text

Maybe that’s enough?

Slide 130

Slide 130 text

It’s a positive development for Django

Slide 131

Slide 131 text

It supports Python 2.7 and Python 3.3+

Slide 132

Slide 132 text

Check it out: http://git.io/vYEbp

Slide 133

Slide 133 text

So, why not just use Twisted?

Slide 134

Slide 134 text

Well…

Slide 135

Slide 135 text

The Future of Django (alternate)

Slide 136

Slide 136 text

WSGI II Electric Boogaloo

Slide 137

Slide 137 text

WSGI is currently inherently request/response

Slide 138

Slide 138 text

WebSockets is useful, and WSGI II would need to support it

Slide 139

Slide 139 text

WebSockets 2?

Slide 140

Slide 140 text

HTTP falls out of use?

Slide 141

Slide 141 text

Metal WSGear Solid 3 Snake Eater

Slide 142

Slide 142 text

Async is undergoing another renaissance

Slide 143

Slide 143 text

Django has to decide where it is going to sit

Slide 144

Slide 144 text

Adopting an asynchronous framework is a long-term way forward

Slide 145

Slide 145 text

It will require a lot of broken eggs, but Django can make the transition

Slide 146

Slide 146 text

No content

Slide 147

Slide 147 text

This is Django…

Slide 148

Slide 148 text

This is Django… …with async views…

Slide 149

Slide 149 text

This is Django… …with async views… …with an async ORM…

Slide 150

Slide 150 text

This is Django… …with async views… …with an async ORM… …running on Twisted Web…

Slide 151

Slide 151 text

This is Django… …with async views… …with an async ORM… …running on Twisted Web… …with no WSGI.

Slide 152

Slide 152 text

Live Demo

Slide 153

Slide 153 text

No content

Slide 154

Slide 154 text

Caveats: I wrote this on a plane, the ORM runs in a threadpool, the tests fail hilariously

Slide 155

Slide 155 text

But it’s serving concurrent web requests in pure Python

Slide 156

Slide 156 text

async_create() which returns a Deferred, etc

Slide 157

Slide 157 text

ORM needs more work

Slide 158

Slide 158 text

The ORM does a lot of things that cause cursor.execute() where you wouldn’t expect

Slide 159

Slide 159 text

The backends need to be truly asynchronous

Slide 160

Slide 160 text

More separation between SQL generation, and executing that SQL
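
One way to picture that separation, sketched with sqlite3 from the standard library. `build_insert` and `execute` are hypothetical helpers, not Django internals: the first only produces SQL text and parameters, and the second is the only place that touches a connection, so it could in principle be swapped for an asynchronous executor.

```python
import sqlite3


def build_insert(table, fields):
    """Pure function: produce SQL and parameters without touching a database.

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    columns = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, tuple(fields.values())


def execute(connection, statement):
    """The only function that performs I/O."""
    sql, params = statement
    return connection.execute(sql, params)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE speaker (name TEXT, city TEXT)")

statement = build_insert("speaker", {"name": "Amber", "city": "Perth"})
execute(connection, statement)

rows = connection.execute("SELECT name, city FROM speaker").fetchall()
connection.close()
print(statement)
print(rows)
```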

Slide 161

Slide 161 text

Then we have all the requirements for an asynchronous Django!

Slide 162

Slide 162 text

Django users have to be good async citizens

Slide 163

Slide 163 text

Like I said, everything has to cooperate or it all falls apart

Slide 164

Slide 164 text

yield from (available since Python 3.3, paired with asyncio in Python 3.4)

Slide 165

Slide 165 text

await, async iterators, async context managers
PEP 492 in Python 3.5

Slide 166

Slide 166 text

Django might be able to support async & sync views
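
A speculative sketch of how one dispatcher could serve both kinds of view. It uses asyncio rather than Twisted, and `dispatch`, `sync_view`, and `async_view` are invented for the example; nothing here reflects actual Django code.

```python
import asyncio
import inspect


def sync_view(request):
    return f"sync {request}"


async def async_view(request):
    await asyncio.sleep(0)   # a point where other work could run
    return f"async {request}"


async def dispatch(view, request):
    """Await coroutine views directly; push blocking views onto a thread."""
    if inspect.iscoroutinefunction(view):
        return await view(request)
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, view, request)


async def main():
    return [
        await dispatch(sync_view, "/a"),
        await dispatch(async_view, "/b"),
    ]


results = asyncio.run(main())
print(results)
```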

Slide 167

Slide 167 text

WSGI would work as it does now

Slide 168

Slide 168 text

If using Twisted as your web server, you can use async views

Slide 169

Slide 169 text

Django’s ORM and other features would then be usable by Twisted libraries

Slide 170

Slide 170 text

Then Django doesn’t need to care about WebSockets, or whatever comes next

Slide 171

Slide 171 text

“Django should have been a Twisted plugin.”
– someone, unless I imagined that

Slide 172

Slide 172 text

The Future of Twisted

Slide 173

Slide 173 text

Twisted isn’t perfect

Slide 174

Slide 174 text

Contributor onboarding improvements

Slide 175

Slide 175 text

Contributor tooling improvements

Slide 176

Slide 176 text

Git migration

Slide 177

Slide 177 text

Twisted’s future is new blood, and we need to work for that

Slide 178

Slide 178 text

Adopting a Django-style Deprecation Policy (removing deprecated junk)

Slide 179

Slide 179 text

Shedding the past (Python 2.6 support)

Slide 180

Slide 180 text

Adopting Python 3 features (async def, yield from)

Slide 181

Slide 181 text

Twisted + Django

Slide 182

Slide 182 text

I would like to see this happen

Slide 183

Slide 183 text

Like I said earlier, you cannot get the upsides of async and sync code at the same time

Slide 184

Slide 184 text

But with asyncio, writing asynchronous code in Python is becoming “normal”

Slide 185

Slide 185 text

Features like yield from and async def can be adopted by Twisted, even though they’re targeted at asyncio

Slide 186

Slide 186 text

This removes some of the difficulty of writing async code (“callback hell”)

Slide 187

Slide 187 text

Makes async code look sequential
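
A short illustration of that sequential look, written with asyncio's async def and await. The `fetch` coroutine and its delays are placeholders for real network calls.

```python
import asyncio


async def fetch(name, delay):
    await asyncio.sleep(delay)   # stands in for waiting on the network
    return f"{name} done"


async def main():
    # Reads top to bottom like synchronous code, yet never blocks the loop.
    first = await fetch("first", 0.01)
    second = await fetch("second", 0.01)
    # Independent operations can still be run concurrently when wanted.
    together = await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))
    return [first, second, *together]


results = asyncio.run(main())
print(results)
```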

Slide 188

Slide 188 text

Ugly hax: github.com/hawkowl/django

Slide 189

Slide 189 text

Questions answered before you ask

Slide 190

Slide 190 text

What about gevent?

Slide 191

Slide 191 text

Glyph’s “Unyielding” https://goo.gl/lYDtct

Slide 192

Slide 192 text

“Despite the fact that implicit coroutines masquerade under many different names, many of which don’t include the word “thread” – for example, “greenlets”, “coroutines”, “fibers”, “tasks” – green or lightweight threads are indeed threads … In the long run, when you build a system that relies upon them, you eventually have all the pitfalls and dangers of full-blown preemptive threads.”
– Glyph

Slide 193

Slide 193 text

What would an async Django get me?

Slide 194

Slide 194 text

WebSockets
More I/O efficiency
You don’t need a task manager to run things after a response

Slide 195

Slide 195 text

Why do you wear a red trenchcoat?

Slide 196

Slide 196 text

No content

Slide 197

Slide 197 text

Questions!