Python Nordeste, May 2nd 2014
Cheating Your Way to #webscale
Matt Robenolt
Hello
[Photo: me irl]
Lead Operations Engineer
Core Contributor
So what is #webscale?
10 million requests per second
4ms mean response time
asynchronous io
mongodb
NOPE
Disqus only does 150 req/s per web server.*
* we also write some bad code
Real world #webscale:
150 per second
12,960,000 per day
388,800,000 per month
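The per-day and per-month figures follow directly from 150 requests per second. A quick check, assuming a 30-day month (the slide does not say which month length it uses, but 30 days reproduces its number); the function name `scale` is just for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds
DAYS_PER_MONTH = 30             # assumption: a 30-day month matches the slide's figure


def scale(requests_per_second):
    """Return (requests per day, requests per month) for one web server."""
    per_day = requests_per_second * SECONDS_PER_DAY
    per_month = per_day * DAYS_PER_MONTH
    return per_day, per_month


per_day, per_month = scale(150)
print(f"{per_day:,} per day, {per_month:,} per month")
# prints: 12,960,000 per day, 388,800,000 per month
```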
Scale is about hiding the fact that your application is actually really slow.
If your application feels fast, then it’s probably good enough.
Users hate waiting for shit.
So how do we do it?
Cheating 101
When a user asks for new data, let’s give them old data instead.
Caching
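A minimal sketch of "give them old data instead": a time-to-live cache that keeps answering with a stored value until it is too old. The class name `TTLCache`, its method names, and the injectable `clock` are assumptions made for illustration, not anything from the talk.

```python
import time


class TTLCache:
    """Serve a stored value until it is older than `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock    # injectable so the behaviour can be tested
        self._store = {}      # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]     # old data, served without touching the slow path
        value = compute()     # slow path: actually ask the application
        self._store[key] = (now + self.ttl, value)
        return value
```

With `ttl=300`, a value computed at time 0 is reused for every call before time 300, and recomputed on the first call at or after it.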
When telling us to do something, let’s say we did and maybe do it later.
Queueing
Rule #1: Don’t get caught.
Rule #2: Don’t get caught.
Rule #3: Don’t get caught.
HTTP Caching
Introducing Varnish
tl;dr: Varnish sits between your application and your users.
[Diagram label: Internet]
Let’s talk about HTTP.
Hypertext Transfer Protocol
$ curl -v disqus.com
> GET / HTTP/1.1
> User-Agent: curl/7.24.0
> Host: disqus.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Date: Fri, 02 May 2014 06:38:37 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 10453
< Last-Modified: Fri, 30 Aug 2013 00:32:14 GMT
< Vary: Accept-Encoding
< Expires: Fri, 02 May 2014 06:43:36 GMT
< Cache-Control: public, max-age=300

Request: the lines prefixed with >
Method: GET
Path: /
Version: HTTP/1.1
Headers: User-Agent, Host, Accept
Response: the lines prefixed with <
Status: 200 OK
Cache-Control: public, max-age=300
For 300 seconds, all users will get the same response without talking to our application.
With power comes great responsibility.
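The header has to come from the application. A minimal sketch using only the standard library's WSGI helpers rather than the Django stack the talk runs on; the app body and the `call` helper are hypothetical and exist only to show where `Cache-Control` gets set.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Minimal WSGI app whose response may be cached for 300 seconds."""
    body = b"<h1>Hello</h1>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # public: shared caches (such as Varnish) may store the response.
        # max-age=300: the stored copy counts as fresh for 300 seconds.
        ("Cache-Control", "public, max-age=300"),
    ]
    start_response("200 OK", headers)
    return [body]


def call(application, path="/"):
    """Invoke a WSGI app once, without a server; return (status, headers, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)   # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body
```

Only send `public` on responses that are identical for every user; anything personalised would be handed to everyone for the next 300 seconds.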
[Diagram: Internet -> Varnish -> Web servers, with GET / arriving from the Internet]
[Diagram: same path, Varnish marked CACHED! with “Cache-Control: max-age=300”]
[Diagram: same path, Varnish marked CACHED!]
BUT WAIT… THERE’S MORE
REQUEST COLLAPSING
[Diagram: Internet -> Varnish -> Web servers, GET /]
If multiple users request the same object, Varnish makes one fetch and returns it to all users.
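The same idea can be sketched in a few lines of Python. This is an illustration of the technique, not how Varnish implements it; `CollapsingFetcher` and `_Call` are hypothetical names. The first caller for a key does the fetch, and every caller that arrives while that fetch is in flight waits for its result instead of starting another one.

```python
import threading


class _Call:
    """State shared by everyone waiting on one in-flight fetch."""

    def __init__(self):
        self.done = threading.Event()
        self.value = None
        self.error = None


class CollapsingFetcher:
    """Collapse concurrent requests for the same key into a single fetch."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._in_flight = {}   # key -> _Call

    def get(self, key):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._in_flight[key] = call
        if leader:
            try:
                call.value = self._fetch(key)   # the one real fetch
            except Exception as exc:
                call.error = exc                # followers see the same failure
            finally:
                with self._lock:
                    del self._in_flight[key]
                call.done.set()                 # wake everyone who piled up
        else:
            call.done.wait()                    # follower: reuse the leader's result
        if call.error is not None:
            raise call.error
        return call.value
```

Note that this only collapses requests that overlap in time. It does not cache: once the fetch finishes, the next request for the key triggers a new one.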
Queueing
Do as little work as possible, and return a promise that this work will be done.
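A minimal sketch of that promise, using the standard library's `queue` and a thread as stand-ins for a real broker and worker pool (the talk's stack lists RabbitMQ and Celery). The names `handle_post`, `worker`, `tasks`, and `results` are hypothetical, and `payload.upper()` is a placeholder for the slow work.

```python
import queue
import threading
import uuid

tasks = queue.Queue()
results = {}   # task id -> result; stands in for the data store


def handle_post(payload):
    """Web handler: enqueue the work and answer immediately."""
    task_id = str(uuid.uuid4())
    tasks.put((task_id, payload))
    # 202 Accepted means "we got it", not "we did it".
    return 202, {"task_id": task_id}


def worker():
    """Task worker: does the slow part later, off the request path."""
    while True:
        task_id, payload = tasks.get()
        try:
            results[task_id] = payload.upper()   # placeholder for the slow work
        finally:
            tasks.task_done()


threading.Thread(target=worker, daemon=True).start()
```

The handler's cost is one `put`, regardless of how slow the real work is. Calling `tasks.join()` blocks until everything queued so far has been processed, which is handy in tests.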
[Diagram: Internet, Web servers, Queue, Task workers, Slow/Fast Data store; request: POST /foo]
Workers can rate limit, debounce, increment counters, generate a fast materialized view, etc.
Make sure your tasks finish before a user tries to read the data back.
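One of the worker tricks listed above, incrementing counters, is a good example of why the queue helps: many queued "+1" events can be collapsed into one write per key. A minimal sketch, assuming each queued item is just the key to increment and a plain dict stands in for the data store; `drain` and `flush_counts` are hypothetical names.

```python
import queue
from collections import Counter


def drain(q, max_items=1000):
    """Pull up to `max_items` pending events off the queue without blocking."""
    events = []
    while len(events) < max_items:
        try:
            events.append(q.get_nowait())
        except queue.Empty:
            break
    return events


def flush_counts(q, store):
    """Collapse queued "+1" events into one write per key; return the write count."""
    batch = Counter(drain(q))
    for key, delta in batch.items():
        store[key] = store.get(key, 0) + delta   # one write instead of `delta` writes
    return len(batch)
```

Four queued events for two keys turn into two writes. The trade-off is the one the slides warn about: until the batch is flushed, a user reading the counter back sees the old value.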
Final Thoughts
Understand your application.
Where can you cheat without disrupting user experience?
Is seeing data that is a few seconds old going to damage the product?
Cheating should only enhance user experience.
Cheat any way you can, just don’t get caught.
Django & Varnish & RabbitMQ & Celery & PostgreSQL & Redis & Cassandra & Riak
Thanks
Questions? I have some answers.
github.com/mattrobenolt
@mattrobenolt