Cheating Your Way to Webscale

Python Nordeste, May 2nd 2014

Matt Robenolt

May 02, 2014

Transcript

  1. Python Nordeste, May 2nd 2014. Matt Robenolt: Cheating Your Way to #webscale
  2. Hello < me irl

  3. Lead Operations Engineer

  4. Core Contributor

  5. So what is #webscale?

  6. 10 million requests per second, 4 ms mean response time, asynchronous I/O, MongoDB

  7. 10 million requests per second, 4 ms mean response time, asynchronous I/O, MongoDB: NOPE

  8. Disqus only does 150 req/s per web server.* (*We also write some bad code.)

  9. Real world #webscale: 150 per second, 12,960,000 per day, 388,800,000 per month.

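The figures on slide 9 follow from a steady 150 requests per second. A quick check, assuming a 30-day month (the slide does not state the month length, but 30 days reproduces its number):

```python
REQUESTS_PER_SECOND = 150          # per web server, from slide 8
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400
DAYS_PER_MONTH = 30                # assumption: a 30-day month

per_day = REQUESTS_PER_SECOND * SECONDS_PER_DAY
per_month = per_day * DAYS_PER_MONTH

print(f"{per_day:,} per day, {per_month:,} per month")
```
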
  10. Scale is about hiding the fact that your application is

    actually really slow.
  11. If your application feels fast, then it’s probably good enough.

  12. Users hate waiting for shit.

  13. So how do we do it?

  14. Cheating 101

  15. When a user asks for new data, let’s give them old data instead.

  16. When a user asks for new data, let’s give them old data instead. (Caching)

  17. When a user tells us to do something, let’s say we did it and maybe do it later.

  18. When a user tells us to do something, let’s say we did it and maybe do it later. (Queueing)

  19. Rule #1: Don’t get caught.

  20. Rule #2: Don’t get caught.

  21. Rule #3: Don’t get caught.

  22. HTTP Caching

  23. Introducing Varnish

  24. tl;dr: Varnish sits between your application and your users (the Internet).

  25. Let’s talk about HTTP: the Hypertext Transfer Protocol.

  26. $ curl -v disqus.com

  27. > GET / HTTP/1.1
    > User-Agent: curl/7.24.0
    > Host: disqus.com
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Server: nginx
    < Date: Fri, 02 May 2014 06:38:37 GMT
    < Content-Type: text/html; charset=utf-8
    < Content-Length: 10453
    < Last-Modified: Fri, 30 Aug 2013 00:32:14 GMT
    < Vary: Accept-Encoding
    < Expires: Fri, 02 May 2014 06:43:36 GMT
    < Cache-Control: public, max-age=300

  28. (Same output as slide 27.) Highlighted: Request

  29. (Same output as slide 27.) Highlighted: Method

  30. (Same output as slide 27.) Highlighted: Path

  31. (Same output as slide 27.) Highlighted: Version

  32. (Same output as slide 27.) Highlighted: Headers

  33. (Same output as slide 27.) Highlighted: Response

  34. (Same output as slide 27.) Highlighted: Status

  35. (Same output as slide 27.) Highlighted: Headers

  36. (Same output as slide 27.)

  37. (Same output as slide 27.) For 300 seconds, all users will get the same response without talking to our application.

  38. (Same output as slide 27.) With power comes great responsibility.

  39. [Diagram: Internet, Varnish, Web servers. Request: GET /]

  40. [Diagram: Internet, Varnish, Web servers. Request: GET /] CACHED! “Cache-Control: max-age=300”

  41. [Diagram: Internet, Varnish, Web servers. Request: GET /]

  42. [Diagram: Internet, Varnish, Web servers. Request: GET /] CACHED!
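The behavior in slides 37 to 42 (serve the stored response until `max-age` runs out, then go back to the application) can be sketched in a few lines. This is only an illustration of the idea, not how Varnish is implemented; `TinyCache`, `max_age`, and `slow_app` are hypothetical names, and the clock is injectable purely so the example does not have to wait 300 real seconds.

```python
import re
import time


def max_age(cache_control):
    """Return the max-age value in seconds, or None if the header has none."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


class TinyCache:
    """Toy cache keyed by path that honors Cache-Control: max-age."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable: path -> (body, cache_control header)
        self._clock = clock
        self._store = {}       # path -> (expires_at, body)

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry and entry[0] > now:
            return entry[1]    # still fresh: the application is never called
        body, cache_control = self._fetch(path)
        ttl = max_age(cache_control)
        if ttl:
            self._store[path] = (now + ttl, body)
        return body


# Usage: a fake application and a fake clock (both assumptions for the demo).
calls = []

def slow_app(path):
    calls.append(path)
    return f"page for {path}", "public, max-age=300"

fake_now = [0.0]
cache = TinyCache(slow_app, clock=lambda: fake_now[0])

cache.get("/")        # miss: hits the application
cache.get("/")        # hit: served from the cache
fake_now[0] = 301.0   # pretend 301 seconds have passed
cache.get("/")        # expired: hits the application again
print(len(calls), "application hits for 3 requests")
```
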

  43. BUT WAIT… THERE’S MORE

  44. COLLAPSING REQUEST

  45. [Diagram: Internet, Varnish, Web servers. Request: GET /]

  46. [Diagram: Internet, Varnish, Web servers. Request: GET /]

  47. [Diagram: Internet, Varnish, Web servers. Request: GET /] If multiple users request the same object, Varnish makes one fetch and returns it to all of them.
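The collapsing idea from slide 47 can be sketched with threads: the first request for a key performs the fetch, and every request that arrives while it is in flight waits for that same result. This is a toy model under stated assumptions, not Varnish's implementation; `CollapsingFetcher` and `slow_backend` are hypothetical names.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class CollapsingFetcher:
    """Share one in-flight backend fetch among concurrent requests for a key."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._in_flight = {}   # key -> Future shared by all waiters

    def get(self, key):
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                future = Future()
                self._in_flight[key] = future
        if leader:
            # Only the first caller talks to the backend.
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        # Leader and followers all read the same result.
        return future.result()


# Usage: five simultaneous requests for "/" against a slow fake backend.
calls = []

def slow_backend(key):
    calls.append(key)
    time.sleep(0.2)            # simulate a slow application
    return f"response for {key}"

fetcher = CollapsingFetcher(slow_backend)
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(fetcher.get, ["/"] * 5))

print(len(results), "responses from", len(calls), "backend fetch(es)")
```

Note that this only collapses requests that overlap in time. Once the fetch completes, the next request triggers a new one unless a cache sits in front of it.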
  48. Queueing

  49. Do as little work as possible, and return a promise

    that this work will be done.
  50. [Diagram: Internet, Web servers, Queue, Task workers, Slow/Fast data store. Request: POST /foo]

  51. [Diagram: Internet, Web servers, Queue, Task workers, Slow/Fast data store. Request: POST /foo]

  52. [Diagram: Internet, Web servers, Queue, Task workers, Slow/Fast data store. Request: POST /foo] Workers can rate limit, debounce, increment counters, generate a fast materialized view, etc.

  53. [Diagram: Internet, Web servers, Queue, Task workers, Slow/Fast data store. Request: POST /foo] Make sure your tasks finish before a user tries to read the data back.
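The queueing pattern from slides 49 to 53 can be sketched with the standard library alone. This is an in-process stand-in for a real broker and worker pool (the talk's stack lists RabbitMQ and Celery); `handle_post`, `worker`, and `comment_counts` are hypothetical names, and the dictionary stands in for the data store.

```python
import queue
import threading

tasks = queue.Queue()
comment_counts = {}            # stand-in for the data store


def handle_post(thread_id):
    """Web-server side: enqueue the work and answer immediately."""
    tasks.put(thread_id)
    return 202, "Accepted"     # a promise that the work will be done


def worker():
    """Task-worker side: do the slow work later."""
    while True:
        thread_id = tasks.get()
        if thread_id is None:  # sentinel: shut the worker down
            tasks.task_done()
            break
        comment_counts[thread_id] = comment_counts.get(thread_id, 0) + 1
        tasks.task_done()


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

responses = [handle_post("thread-1") for _ in range(3)]

# Slide 53: make sure the tasks finish before the data is read back.
tasks.join()
print(responses[0], comment_counts)

tasks.put(None)
worker_thread.join()
```

The `tasks.join()` call is the part that keeps the cheat from being noticed: the read happens only after the queued work has been applied.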
  54. Final Thoughts

  55. Understand your application. Where can you cheat without disrupting the user experience? Is seeing data that is a few seconds old going to damage the product?

  56. Cheating should only enhance user experience.

  57. Cheat any way you can, just don’t get caught.

  58. Thanks: Django & Varnish & RabbitMQ & Celery & PostgreSQL & Redis & Cassandra & Riak

  59. Questions? I have answers.^ (^ some) github.com/mattrobenolt, @mattrobenolt