Slide 1

Slide 1 text

HTTP for Great Good
Scaling Django with HTTP
DjangoCon US, September 5th 2013
Matt Robenolt

Slide 2

Slide 2 text

Hello < me irl

Slide 3

Slide 3 text

Site Reliability Engineer

Slide 4

Slide 4 text

“DJANGO ALL THE THINGS!”

Slide 5

Slide 5 text

“...but

Slide 6

Slide 6 text

“...but

Slide 7

Slide 7 text

The slowest part of a web application is typically not your code.

Slide 8

Slide 8 text

Between databases and memcaches and Redises and Cassandras and MongoDBs and networks, Django is not the problem.

Slide 9

Slide 9 text

“...everything

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

A few vanity metrics.

Slide 12

Slide 12 text

Monthly Unique Visitors: 1,115,080,411

Slide 13

Slide 13 text

Monthly Page Views: 7,516,761,301

Slide 14

Slide 14 text

Inbound Traffic: 42k total req/s, 15k app req/s. Not my fault.

Slide 15

Slide 15 text

36% of all requests actually hit a Django server (15k/42k ≈ 36%)

Slide 16

Slide 16 text

...what happened to the other 64%?

Slide 17

Slide 17 text

Let’s talk about HTTP: the Hypertext Transfer Protocol.

Slide 18

Slide 18 text

$ curl -v disqus.com

Slide 19

Slide 19 text

> GET / HTTP/1.1
> User-Agent: curl/7.24.0
> Host: disqus.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Fri, 30 Aug 2013 06:38:37 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 10453
< Last-Modified: Fri, 30 Aug 2013 00:32:14 GMT
< Vary: Accept-Encoding
< Expires: Fri, 30 Aug 2013 06:38:36 GMT
< Cache-Control: no-cache, must-revalidate

Slide 20

Slide 20 text

Same exchange as Slide 19. Highlighted: Request

Slide 21

Slide 21 text

Same exchange as Slide 19. Highlighted: Method

Slide 22

Slide 22 text

Same exchange as Slide 19. Highlighted: Path

Slide 23

Slide 23 text

Same exchange as Slide 19. Highlighted: Version

Slide 24

Slide 24 text

Same exchange as Slide 19. Highlighted: Headers

Slide 25

Slide 25 text

Same exchange as Slide 19. Highlighted: Response

Slide 26

Slide 26 text

Same exchange as Slide 19. Highlighted: Status

Slide 27

Slide 27 text

Same exchange as Slide 19. Highlighted: Headers

Slide 28

Slide 28 text

Request in Django

request.method                    # method
request.get_full_path()           # path
request.META['HTTP_USER_AGENT']
request.META['HTTP_ACCEPT']
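The META keys above follow the WSGI naming rule for request headers: upper-case the name, turn hyphens into underscores, and prefix with HTTP_ (Content-Type and Content-Length are the two exceptions that carry no prefix). A minimal sketch of that mapping, where meta_key is a hypothetical helper name and not part of Django:

```python
def meta_key(header_name: str) -> str:
    """Return the request.META / WSGI environ key for an HTTP header name."""
    key = header_name.upper().replace("-", "_")
    # Content-Type and Content-Length are stored without the HTTP_ prefix.
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key


if __name__ == "__main__":
    for name in ("User-Agent", "Accept", "If-Modified-Since", "Content-Type"):
        print(name, "->", meta_key(name))
```

So a header such as If-Modified-Since would be read as request.META['HTTP_IF_MODIFIED_SINCE'].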

Slide 29

Slide 29 text

Response in Django

response = HttpResponse(body)
response.status_code = 200
response['X-Foo'] = 'bar'

Slide 30

Slide 30 text

“Cool,

Slide 31

Slide 31 text

Same exchange as Slide 19. Hmm. What can we do with this information?

Slide 32

Slide 32 text

> GET / HTTP/1.1
> User-Agent: curl/7.24.0
> Host: disqus.com
> Accept: */*
> If-Modified-Since: Fri, 30 Aug 2013 00:32:14 GMT
>
< HTTP/1.1 304 Not Modified
< Server: nginx
< Date: Fri, 30 Aug 2013 18:05:38 GMT
< Last-Modified: Fri, 30 Aug 2013 00:32:14 GMT
< Vary: Accept-Encoding
< Expires: Fri, 30 Aug 2013 18:05:37 GMT
< Cache-Control: no-cache, must-revalidate

Slide 33

Slide 33 text

304 Not Modified: No body is sent with the response

Slide 34

Slide 34 text

304 Not Modified: Client reuses its cached version

Slide 35

Slide 35 text

304 Not Modified: Usually cheaper for the server to compute than a full response
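The decision behind a 304 can be sketched in a few lines of standard-library Python. This is an illustration of the If-Modified-Since comparison only, not how nginx or Django implement it; conditional_status is a hypothetical name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    if_modified_since: raw header value from the request, or None.
    last_modified: timezone-aware datetime of the resource's last change.
    """
    if if_modified_since is None:
        return 200  # Unconditional request: send the full body.
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # Unparseable date: ignore the header.
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


if __name__ == "__main__":
    modified = datetime(2013, 8, 30, 0, 32, 14, tzinfo=timezone.utc)
    print(conditional_status("Fri, 30 Aug 2013 00:32:14 GMT", modified))
    print(conditional_status(None, modified))
```

The saving comes from the 304 branch: finding the modification time is typically much cheaper than rendering the body.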

Slide 36

Slide 36 text

notbad.gif

Slide 37

Slide 37 text

But we can do better!

Slide 38

Slide 38 text

Static files have been promoting good practices for a long time.

Slide 39

Slide 39 text

“Far future headers”

Slide 40

Slide 40 text

$ curl -v a.disquscdn.com/dotcom/d-6203c8f/css/disqus-web/pages/home.css
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: text/css; charset=utf-8
< Last-Modified: Fri, 16 Aug 2013 20:31:05 GMT
< Expires: Sun, 15 Sep 2013 20:34:21 GMT
< Cache-Control: max-age=2592000
< Content-Length: 30749
< Date: Sun, 18 Aug 2013 03:23:37 GMT
< Via: 1.1 varnish
< Age: 110956
< Connection: keep-alive
< Vary: Accept-Encoding

Slide 41

Slide 41 text

Same response as Slide 40. max-age=2592000: 30 days in the future.
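The numbers in that response can be checked by hand. The sketch below recomputes them with the standard library: 2592000 seconds is 30 days, and Date minus Age (when the object entered the cache) plus max-age lands exactly on the Expires value. The helper names far_future_headers and is_fresh are hypothetical, for illustration only.

```python
from datetime import datetime, timedelta, timezone
from email.utils import formatdate

THIRTY_DAYS = 30 * 24 * 60 * 60  # 2592000 seconds


def far_future_headers(now_epoch, max_age=THIRTY_DAYS):
    """Build Cache-Control and a matching Expires header for a static file."""
    return {
        "Cache-Control": f"max-age={max_age}",
        "Expires": formatdate(now_epoch + max_age, usegmt=True),
    }


def is_fresh(age, max_age):
    """A cached copy may be reused without a request while age < max-age."""
    return age < max_age


# Values copied from the response headers on the slide.
date = datetime(2013, 8, 18, 3, 23, 37, tzinfo=timezone.utc)
entered_cache = date - timedelta(seconds=110956)           # Date - Age
expires = entered_cache + timedelta(seconds=THIRTY_DAYS)   # + max-age

if __name__ == "__main__":
    print(expires)                         # compare with the Expires header
    print(is_fresh(110956, THIRTY_DAYS))   # still fresh, so no request needed
    print(far_future_headers(0))
```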

Slide 42

Slide 42 text

Chrome Web Inspector on second visit

Slide 43

Slide 43 text

No HTTP request 0ms

Slide 44

Slide 44 text

No HTTP request The client knew to just use its local cache

Slide 45

Slide 45 text

No HTTP request Computer actually did something right for once

Slide 46

Slide 46 text

“I SEE WHAT YOU DID THERE” - Hopefully you

Slide 47

Slide 47 text

Takeaways: Clients behave differently depending on the response headers

Slide 48

Slide 48 text

Takeaways: These usually come with minimal effort for static files

Slide 49

Slide 49 text

Takeaways: We can and should utilize these to our advantage to improve UX

Slide 50

Slide 50 text

Same logic can be applied to dynamic content.

Slide 51

Slide 51 text

What’s this look like in Django?

Slide 52

Slide 52 text

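Last-Modified

from django.shortcuts import render

def lol(request):
    response = render(request, 'lol.html')
    # Hard-coded date: the header never changes when the content does.
    response['Last-Modified'] = \
        'Fri, 16 Aug 2013 20:31:05 GMT'
    return response

* don’t do this.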

Slide 53

Slide 53 text

Last-Modified

from django.views.decorators.http import \
    last_modified

def post_last_modified(request, slug):
    # Post is your own model with a `modified` datetime field.
    return Post.objects.get(slug=slug).modified

@last_modified(post_last_modified)
def blog_post_detail(request, slug):
    # Your view
    ...

Slide 54

Slide 54 text

Cache-Control

from django.shortcuts import render

def lol(request):
    response = render(request, 'lol.html')
    response['Cache-Control'] = 'max-age=600'
    return response

Slide 55

Slide 55 text

Cache-Control

from django.http import HttpResponse
from django.views.decorators.cache import \
    cache_control

@cache_control(max_age=600)  # Cache for 10m
def home(request):
    return HttpResponse('lol')
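The decorator's keyword arguments become header directives: underscores turn into hyphens, and a value of True yields a bare directive. The function below is a simplified stand-alone imitation of that convention, written for illustration. It is not Django's code, and cache_control_value is a hypothetical name.

```python
def cache_control_value(**directives):
    """Build a Cache-Control header value from keyword arguments."""
    parts = []
    for name, value in directives.items():
        name = name.replace("_", "-")  # max_age -> max-age
        if value is True:
            parts.append(name)  # bare directive, e.g. "public"
        elif value is not False and value is not None:
            parts.append(f"{name}={value}")
    return ", ".join(parts)


if __name__ == "__main__":
    print(cache_control_value(max_age=600))
    print(cache_control_value(public=True, max_age=600))
```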

Slide 56

Slide 56 text

“OMG!

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

Well... not really.

Slide 59

Slide 59 text

How many requests will {{user}} make to the same page within 10 minutes?

Slide 60

Slide 60 text

1? 2? 3?

Slide 61

Slide 61 text

...out of 42k requests per second.

Slide 62

Slide 62 text

Even with caching, your app is doing a lot of work.

Slide 63

Slide 63 text

Parsing HTTP.

Slide 64

Slide 64 text

WSGI.

Slide 65

Slide 65 text

Django middleware stack.

Slide 66

Slide 66 text

Do some stuff.

Slide 67

Slide 67 text

Render a template?

Slide 68

Slide 68 text

Back out through the Django middleware.

Slide 69

Slide 69 text

Transform an HttpResponse into a real HTTP response.

Slide 70

Slide 70 text

...at 42k requests per second.

Slide 71

Slide 71 text

“You’re gonna have a bad time.” - me

Slide 72

Slide 72 text

Until now, “client” has been a user’s browser.

Slide 73

Slide 73 text

“If only we could utilize this Cache-Control stuff better...”

Slide 74

Slide 74 text

Introducing

Slide 75

Slide 75 text

$ apt-get install varnish

Slide 76

Slide 76 text

$ brew install varnish

Slide 77

Slide 77 text

tl;dr: Varnish sits between Django and your users. (Diagram label: Internet)

Slide 78

Slide 78 text

tl;dr: Caches HTTP responses and respects proper HTTP headers. (Diagram label: Internet)

Slide 79

Slide 79 text

tl;dr: Its sole purpose in life is to be a cache, so it’s really fast. (Diagram label: Internet)

Slide 80

Slide 80 text

Stand back, science is happening.

Slide 81

Slide 81 text

Stand back, science is happening. benchmarking

Slide 82

Slide 82 text

Simple, non-scientific “Hello World”

Slide 83

Slide 83 text

“Hello World”

from django.views.decorators.cache import \
    cache_control
from django.http import HttpResponse

@cache_control(max_age=5)
def hello(request):
    return HttpResponse('Hello world')

Slide 84

Slide 84 text

Django + gunicorn

$ httperf --server 127.0.0.1 --port 8000 --uri /hello/ --rate 150 --num-conn 10 --num-call 500 --hog
Request rate: 369.6 req/s (2.7 ms/req)

* on my MacBook Air

Slide 85

Slide 85 text

Varnish

$ httperf --server 127.0.0.1 --port 8888 --uri /hello/ --rate 150 --num-conn 10 --num-call 10000 --hog
Request rate: 15633.4 req/s (0.1 ms/req)

* on my MacBook Air
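Putting the two reported request rates side by side gives the rough size of the gap. This only re-does the arithmetic on the figures quoted above. As the talk says, the benchmark itself is non-scientific.

```python
DJANGO_RATE = 369.6     # req/s, Django + gunicorn, from the slide
VARNISH_RATE = 15633.4  # req/s, Varnish, from the slide


def ms_per_request(rate):
    """Convert a request rate (req/s) to average milliseconds per request."""
    return 1000.0 / rate


speedup = VARNISH_RATE / DJANGO_RATE

if __name__ == "__main__":
    print(f"Django:  {ms_per_request(DJANGO_RATE):.1f} ms/req")
    print(f"Varnish: {ms_per_request(VARNISH_RATE):.1f} ms/req")
    print(f"Speedup: about {speedup:.0f}x")
```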

Slide 86

Slide 86 text

Varnish: How does it work?

Slide 87

Slide 87 text

First request

Slide 88

Slide 88 text

First response “Lemme

Slide 89

Slide 89 text

“Yo,

Slide 90

Slide 90 text

Next response “wut
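The first-request / next-request flow can be modelled with a toy in-memory cache: the first request for a URL is a miss and goes to the backend, and later requests are served from memory until max-age runs out. This is an illustrative model only, not how Varnish is implemented. TinyHttpCache and the backend callable are hypothetical names.

```python
import time


class TinyHttpCache:
    """Toy HTTP cache: stores a response body per URL until its max-age expires."""

    def __init__(self, backend, clock=time.monotonic):
        self.backend = backend  # callable: url -> (body, max_age_seconds)
        self.clock = clock      # injectable so tests can control time
        self.store = {}         # url -> (expires_at, body)

    def get(self, url):
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and now < entry[0]:
            return entry[1], "HIT"   # Backend is not touched.
        body, max_age = self.backend(url)
        if max_age > 0:
            self.store[url] = (now + max_age, body)
        return body, "MISS"


if __name__ == "__main__":
    def django(url):
        # Stand-in for the view with @cache_control(max_age=5).
        return "Hello world", 5

    cache = TinyHttpCache(django)
    print(cache.get("/hello/"))  # first request: MISS
    print(cache.get("/hello/"))  # next request: HIT
```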

Slide 91

Slide 91 text

Caching: ProMoves™

Slide 92

Slide 92 text

Augment with JavaScript: Update your UI optimistically

Slide 93

Slide 93 text

Augment with JavaScript: Leverage cookies to store non-critical data

Slide 94

Slide 94 text

Augment with JavaScript: Defer fetching user-specific data until needed

Slide 95

Slide 95 text

Short TTLs are good: Most things can be cached for at least 5s

Slide 96

Slide 96 text

Short TTLs are good: At 10k requests/s for one URL, a 5s TTL absorbs 49,999 of every 50,000 requests
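The 49,999 figure is simple arithmetic: 10,000 req/s over a 5 s window is 50,000 requests, and only the one that refills the cache reaches the backend. A small sketch, assuming all traffic is for a single cacheable URL and exactly one backend fetch per TTL window (absorbed_per_ttl is a hypothetical name):

```python
def absorbed_per_ttl(requests_per_second, ttl_seconds):
    """Requests served from cache during one TTL window.

    Assumes a single cacheable URL and one backend fetch per window.
    """
    total = requests_per_second * ttl_seconds
    return total - 1  # one request per window refills the cache


if __name__ == "__main__":
    absorbed = absorbed_per_ttl(10_000, 5)
    print(absorbed, "of", 10_000 * 5, "requests never reach Django")
```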

Slide 97

Slide 97 text

Let’s meet: John and Jane Doe.

Slide 98

Slide 98 text

John and Jane are different users.

Slide 99

Slide 99 text

John logs into Disqus.

Slide 100

Slide 100 text

Jane logs into Disqus.

Slide 101

Slide 101 text

Jane sees John’s stuff.

Slide 102

Slide 102 text

Jane sees John’s stuff. ^ not

Slide 103

Slide 103 text

We really want to avoid this from ever happening.

Slide 104

Slide 104 text

Cookies

Slide 105

Slide 105 text

How do users even work?

Slide 106

Slide 106 text

$ curl -vd "username=foo&password=bar" https://disqus.com/profile/login/
> POST /profile/login/ HTTP/1.1
> User-Agent: curl/7.24.0
> Host: disqus.com
>
< HTTP/1.1 302 FOUND
< Server: nginx
< Date: Fri, 30 Aug 2013 21:34:36 GMT
< Vary: Cookie
< Set-Cookie: sessionid=f7aa9598-11bb-11e3-9eb1-003048d9a288; Domain=.disqus.com; expires=Sun, 29-Sep-2013 21:34:36 GMT; httponly; Max-Age=2592000; Path=/

Slide 107

Slide 107 text

Same exchange as Slide 106. Highlighted: Set-Cookie

Slide 108

Slide 108 text

Same exchange as Slide 106. Highlighted: Session Id
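To see what the Set-Cookie header from the login response carries, it can be parsed with Python's standard http.cookies module. The header value below is copied from the slide, and the rest is a sketch for inspection only.

```python
from http.cookies import SimpleCookie

# Header value copied from the login response on the slide.
SET_COOKIE = (
    "sessionid=f7aa9598-11bb-11e3-9eb1-003048d9a288; Domain=.disqus.com; "
    "expires=Sun, 29-Sep-2013 21:34:36 GMT; httponly; Max-Age=2592000; Path=/"
)

cookie = SimpleCookie()
cookie.load(SET_COOKIE)
session = cookie["sessionid"]  # a Morsel: the value plus its attributes

if __name__ == "__main__":
    print("session id:", session.value)
    print("domain:    ", session["domain"])
    print("path:      ", session["path"])
    print("max-age:   ", int(session["max-age"]) // 86400, "days")
    print("httponly:  ", bool(session["httponly"]))
```

The browser sends sessionid back in a Cookie header on every later request to the domain, which is why a cache has to take cookies into account.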

Slide 109

Slide 109 text

Session Id: Unique id that represents a logged-in user

Slide 110

Slide 110 text

Session Id: django.contrib.sessions / django.contrib.auth

Slide 111

Slide 111 text

Session Id: Could potentially cache per session id

Slide 112

Slide 112 text

By default, Varnish will not cache any request with a Cookie header at all.

Slide 113

Slide 113 text

Think about if your endpoint changes based on a user’s authentication.

Slide 114

Slide 114 text

If it doesn’t, Varnish can normalize it.

Slide 115

Slide 115 text

No content

Slide 116

Slide 116 text

Learn: Varnish Configuration Language (VCL) in 30 seconds

Slide 117

Slide 117 text

sub vcl_recv {
    // These urls can be stripped of all
    // cookies since they serve the same
    // data for anon and auth'd user
    if (
        req.url == "/" ||
        req.url ~ "^/embed/comments/"
    ) {
        unset req.http.Cookie;
    }
}
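For readers who think in Python rather than VCL, the same rule can be expressed as a plain function: for URLs that serve identical data to everyone, drop the Cookie header so every user shares one cache entry. This mirrors the logic only. strip_cookies_if_shared is a hypothetical name and the dict of headers is a stand-in for a real request object.

```python
def strip_cookies_if_shared(url, headers):
    """Return request headers, minus Cookie for URLs that ignore the user."""
    # Same conditions as the VCL: exact "/" or the "^/embed/comments/" prefix.
    if url == "/" or url.startswith("/embed/comments/"):
        return {k: v for k, v in headers.items() if k.lower() != "cookie"}
    return dict(headers)


if __name__ == "__main__":
    request_headers = {"Host": "disqus.com", "Cookie": "sessionid=abc"}
    print(strip_cookies_if_shared("/", request_headers))
    print(strip_cookies_if_shared("/profile/", request_headers))
```

Any URL not on the list keeps its cookies, so a user-specific page is never cached by accident.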

Slide 118

Slide 118 text

Basically, caching is hard.

Slide 119

Slide 119 text

Go make some stuff faster.

Slide 120

Slide 120 text

We’re hiring people who hate computers. disqus.com/jobs

Slide 121

Slide 121 text

Questions? I have some answers. github.com/mattrobenolt @mattrobenolt