HTTP for Great Good
Matt Robenolt
September 05, 2013
Scaling Django with HTTP
DjangoCon US 2013
http://www.youtube.com/watch?v=HAjOQ09I1UY
Transcript
HTTP for Great Good: Scaling Django with HTTP
DjangoCon US, September 5th 2013, Matt Robenolt
Hello < me irl
Site Reliability Engineer
“DJANGO ALL THE THINGS!”
“...but
The slowest part of a web application is typically not your code.
Between databases and memcaches and Redises and Cassandras and MongoDBs and networks, Django is not the problem.
“...everything
A few vanity metrics.
Monthly Unique Visitors 1,115,080,411
Monthly Page Views 7,516,761,301
Inbound Traffic: 42k total req/s, 15k app req/s. Not my fault.
36% of all requests actually hit a Django server (15k / 42k ≈ 36%)
...what happened to the other 64%?
Let’s talk about HTTP. Hypertext Transfer Protocol
$ curl -v disqus.com
> GET / HTTP/1.1
> User-Agent: curl/7.24.0
> Host: disqus.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Fri, 30 Aug 2013 06:38:37 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 10453
< Last-Modified: Fri, 30 Aug 2013 00:32:14 GMT
< Vary: Accept-Encoding
< Expires: Fri, 30 Aug 2013 06:38:36 GMT
< Cache-Control: no-cache, must-revalidate

Request: Method, Path, Version, Headers
Response: Status, Headers
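To make the parts of the request concrete, here is a minimal sketch that splits the raw request from the curl output into method, path, version and headers. The function name `parse_request` is made up for illustration, and a real server does far more validation than this.

```python
def parse_request(raw):
    """Split a raw HTTP/1.1 request head into (method, path, version, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]       # everything before the blank line
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)  # the request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, path, version, headers


RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "User-Agent: curl/7.24.0\r\n"
    "Host: disqus.com\r\n"
    "Accept: */*\r\n"
    "\r\n"
)

method, path, version, headers = parse_request(RAW_REQUEST)
print(method, path, version)
print(headers["host"])
```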
Request in Django
request.method                   # method
request.get_full_path()          # path
request.META['HTTP_USER_AGENT']  # User-Agent header
request.META['HTTP_ACCEPT']      # Accept header
Response in Django
response = HttpResponse(body)
response.status_code = 200
response['X-Foo'] = 'bar'
“Cool,
Hmm. What can we do with this information?
> GET / HTTP/1.1
> User-Agent: curl/7.24.0
> Host: disqus.com
> Accept: */*
> If-Modified-Since: Fri, 30 Aug 2013 00:32:14 GMT
>
< HTTP/1.1 304 Not Modified
< Server: nginx
< Date: Fri, 30 Aug 2013 18:05:38 GMT
< Last-Modified: Fri, 30 Aug 2013 00:32:14 GMT
< Vary: Accept-Encoding
< Expires: Fri, 30 Aug 2013 18:05:37 GMT
< Cache-Control: no-cache, must-revalidate
304 Not Modified: No body is sent with the response
304 Not Modified: Client reuses its cached version
304 Not Modified: Usually cheaper for the server to compute than a full response
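The decision behind a 304 can be sketched in a few lines of Python. This is a simplified illustration using only the standard library, not how nginx or Django implement it; the name `conditional_get` is made up for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get(if_modified_since, last_modified):
    """Return (status, send_body) for a GET on a resource last changed at last_modified."""
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: behave as if the header was absent
        if since is not None and last_modified <= since:
            return 304, False  # the client's copy is still current, skip the body
    return 200, True


# Values taken from the curl output in the slides.
last_modified = datetime(2013, 8, 30, 0, 32, 14, tzinfo=timezone.utc)
print(conditional_get("Fri, 30 Aug 2013 00:32:14 GMT", last_modified))
print(conditional_get(None, last_modified))
```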
notbad.gif
But we can do better!
Static files have been promoting good practices for a long time.
“Far future headers”
$ curl -v a.disquscdn.com/dotcom/d-6203c8f/css/disqus-web/pages/home.css
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: text/css; charset=utf-8
< Last-Modified: Fri, 16 Aug 2013 20:31:05 GMT
< Expires: Sun, 15 Sep 2013 20:34:21 GMT
< Cache-Control: max-age=2592000
< Content-Length: 30749
< Date: Sun, 18 Aug 2013 03:23:37 GMT
< Via: 1.1 varnish
< Age: 110956
< Connection: keep-alive
< Vary: Accept-Encoding

Cache-Control: max-age=2592000 is 30 days in the future.
Chrome Web Inspector on second visit
No HTTP request: 0ms
No HTTP request: The client knew to just use its local cache
No HTTP request: Computer actually did something right for once
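Why no request at all? A rough sketch of the freshness check a cache performs, using the max-age and Age values from the static file response. The helper names are made up for illustration, and real caches follow the fuller rules of the HTTP caching specification.

```python
def parse_max_age(cache_control):
    """Return the max-age value in seconds from a Cache-Control header, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def remaining_freshness(max_age, age):
    """Seconds a cached copy may still be reused without contacting the server."""
    return max(0, max_age - age)


max_age = parse_max_age("max-age=2592000")
print(max_age // 86400)                      # lifetime in days
print(remaining_freshness(max_age, 110956))  # seconds of freshness left
```

While the remaining freshness is above zero, the cached copy is served directly and nothing goes over the network.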
“I SEE WHAT YOU DID THERE” - Hopefully you
Takeaways: Clients behave differently depending on the response headers
Takeaways: These usually come with minimal effort with static files
Takeaways: We can and should utilize these to our advantage to improve UX
Same logic can be applied to dynamic content.
What’s this look like in Django?
Last-Modified
from django.shortcuts import render

def lol(request):
    response = render(request, 'lol.html')
    # Hardcoded date: the header never reflects real changes
    response['Last-Modified'] = \
        'Fri, 16 Aug 2013 20:31:05 GMT'
    return response
* don’t do this.
Last-Modified
from django.views.decorators.http import \
    last_modified

def post_last_modified(request, slug):
    return Post.objects.get(slug=slug).modified

@last_modified(post_last_modified)
def blog_post_detail(request, slug):
    # Your view
    ...
Cache-Control
def lol(request):
    response = render(request, 'lol.html')
    response['Cache-Control'] = 'max-age=600'
    return response
Cache-Control
from django.http import HttpResponse
from django.views.decorators.cache import \
    cache_control

@cache_control(max_age=600)  # Cache for 10m
def home(request):
    return HttpResponse('lol')
“OMG!
Well... not really.
How many requests will {{user}} make to the same page within 10 minutes?
1? 2? 3?
...out of 42k requests per second.
Even with caching, your app is doing a lot of work.
Parsing HTTP.
WSGI.
Django middleware stack.
Do some stuff.
Render a template?
Back out through the Django middleware.
Transform an HttpResponse into a real HTTP response.
...at 42k requests per second.
You’re gonna have a bad time. me
Until now, “client” has been a user’s browser.
“If only we could utilize this Cache-Control stuff better...”
Introducing
$ apt-get install varnish
$ brew install varnish
tl;dr: Varnish sits between Django and your users
tl;dr: Caches HTTP responses and respects proper HTTP headers
tl;dr: Its sole purpose in life is to be a cache, so it’s really fast.
Stand back, science is happening.
Stand back, science is happening. benchmarking
Simple, non-scientific “Hello World”
“Hello World”
from django.views.decorators.cache import \
    cache_control
from django.http import HttpResponse

@cache_control(max_age=5)
def hello(request):
    return HttpResponse('Hello world')
Django + gunicorn (* on my MacBook Air)
$ httperf --server 127.0.0.1 --port 8000 --uri /hello/ --rate 150 --num-conn 10 --num-call 500 --hog
Request rate: 369.6 req/s (2.7 ms/req)
Varnish (* on my MacBook Air)
$ httperf --server 127.0.0.1 --port 8888 --uri /hello/ --rate 150 --num-conn 10 --num-call 10000 --hog
Request rate: 15633.4 req/s (0.1 ms/req)
Varnish: How does it work?
First request
First response “Lemme
“Yo,
Next response “wut
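The first-request / next-response flow can be mimicked with a toy cache. This is a sketch of the general idea only, not Varnish's actual design: `TinyCache`, its `origin` callable and the injectable `clock` are all invented for this example.

```python
import time


class TinyCache:
    """Toy caching proxy: serves stored responses until their TTL expires."""

    def __init__(self, origin, ttl, clock=time.monotonic):
        self.origin = origin      # callable standing in for the Django app
        self.ttl = ttl            # seconds a response stays fresh
        self.clock = clock        # injectable so the example can fake time
        self.store = {}           # url -> (expires_at, body)
        self.origin_hits = 0

    def get(self, url):
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]       # cache hit: the origin never sees this request
        self.origin_hits += 1     # cache miss: ask the origin and remember the answer
        body = self.origin(url)
        self.store[url] = (now + self.ttl, body)
        return body


fake_now = [0.0]
cache = TinyCache(lambda url: "Hello world", ttl=5, clock=lambda: fake_now[0])
for _ in range(1000):
    cache.get("/hello/")
print(cache.origin_hits)  # 1
fake_now[0] = 6.0         # the TTL has passed
cache.get("/hello/")
print(cache.origin_hits)  # 2
```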
Caching: ProMoves™
Augment with JavaScript: Update your UI optimistically
Augment with JavaScript: Leverage cookies to store non-critical data
Augment with JavaScript: Defer fetching user-specific data until needed
Short TTLs are good: Most things can be cached for at least 5s
Short TTLs are good: At 10k requests/s, a 5s TTL absorbs 49,999 requests
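The 49,999 figure is simple arithmetic, sketched below under the assumption that every request in the window is for the same cacheable URL and only the first one reaches the backend. The function names are made up for illustration.

```python
def absorbed_requests(rate_per_second, ttl_seconds):
    """Requests served from cache during one TTL window for a single URL."""
    total = rate_per_second * ttl_seconds
    return total - 1  # only the first request in the window reaches the backend


def hit_ratio(rate_per_second, ttl_seconds):
    """Fraction of requests in the window that never touch the backend."""
    total = rate_per_second * ttl_seconds
    return absorbed_requests(rate_per_second, ttl_seconds) / total


print(absorbed_requests(10_000, 5))  # 49999
print(hit_ratio(10_000, 5))
```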
Let’s meet: John and Jane Doe.
John and Jane are different users.
John logs into Disqus.
Jane logs into Disqus.
Jane sees John’s stuff.
Jane sees John’s stuff. ^ not
We really want to avoid this from ever happening.
Cookies
How do users even work?
$ curl -vd "username=foo&password=bar" https://disqus.com/profile/login/
> POST /profile/login/ HTTP/1.1
> User-Agent: curl/7.24.0
> Host: disqus.com
>
< HTTP/1.1 302 FOUND
< Server: nginx
< Date: Fri, 30 Aug 2013 21:34:36 GMT
< Vary: Cookie
< Set-Cookie: sessionid=f7aa9598-11bb-11e3-9eb1-003048d9a288; Domain=.disqus.com; expires=Sun, 29-Sep-2013 21:34:36 GMT; httponly; Max-Age=2592000; Path=/

Highlighted in turn: the Set-Cookie header, then the Session Id it carries.
Session Id: Unique id that represents a logged in user
Session Id: django.contrib.sessions / django.contrib.auth
Session Id: Could potentially cache per session id
By default, Varnish will not cache any request with a Cookie header at all.
Think about whether your endpoint’s response changes based on the user’s authentication.
If it doesn’t, Varnish can normalize it.
Learn: Varnish Configuration Language (VCL) in 30 seconds
sub vcl_recv {
    // These urls can be stripped of all
    // cookies since they serve the same
    // data for anon and auth'd user
    if (
        req.url == "/" ||
        req.url ~ "^/embed/comments/"
    ) {
        unset req.http.Cookie;
    }
}
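For readers more at home in Python than VCL, the same rule can be mirrored as a small sketch. This only imitates the logic of the snippet above; it is not how Varnish is configured, and `normalize_headers` and `is_cacheable` are names invented for this example.

```python
import re

# Same pattern as the VCL rule: URLs that serve identical data to everyone.
EMBED_RE = re.compile(r"^/embed/comments/")


def normalize_headers(url, headers):
    """Return a copy of the request headers, dropping Cookie for shared URLs."""
    headers = dict(headers)
    if url == "/" or EMBED_RE.search(url):
        headers.pop("Cookie", None)
    return headers


def is_cacheable(url, headers):
    """Mimic the default described earlier: a request with a Cookie header is not cached."""
    return "Cookie" not in normalize_headers(url, headers)


request_headers = {"Host": "disqus.com", "Cookie": "sessionid=abc"}
print(is_cacheable("/", request_headers))                # True
print(is_cacheable("/embed/comments/", request_headers)) # True
print(is_cacheable("/profile/", request_headers))        # False
```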
Basically, caching is hard.
Go make some stuff faster.
We’re hiring people who hate computers. disqus.com/jobs
Questions? I have some answers.
github.com/mattrobenolt
@mattrobenolt