Cache Strategies for Web Apps

Given at Zendcon, October 29, 2014

Glen Campbell


Transcript

  1. Cache strategies for web apps Glen Campbell @glenc

  2. Yes, I picked the dullest title ever

  3. None
  4. “A web cache is a mechanism for the temporary storage (caching) of web documents, such as HTML pages and images, to reduce bandwidth usage, server load, and perceived lag. A web cache stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met.” (Wikipedia)
  5. What is the most common type of web cache?

  6. REST
    • Client-server
    • Stateless
    • Cacheable
    • Layered system
    • Code on demand (optional)
    • Uniform interface
  7. Example: local
    HTTP/1.1 200 OK
    Date: Wed, 29 Oct 2014 15:04:20 GMT
    Server: Apache/2.2.15 (CentOS)
    Last-Modified: Wed, 29 Oct 2014 14:54:23 GMT
    Accept-Ranges: bytes
    Content-Length: 212
    Cache-Control: max-age=31536000
    Expires: Thu, 29 Oct 2015 15:04:20 GMT
    Vary: Accept-Encoding
    Connection: close
    Content-Type: text/html; charset=UTF-8
  8. Example: hotel
    HTTP/1.0 200 OK
    Date: Wed, 29 Oct 2014 15:05:32 GMT
    Server: Apache/2.2.15 (CentOS)
    Last-Modified: Wed, 29 Oct 2014 14:54:23 GMT
    Accept-Ranges: bytes
    Content-Length: 212
    Cache-Control: max-age=31536000
    Expires: Thu, 29 Oct 2015 15:05:32 GMT
    Vary: Accept-Encoding
    Content-Type: text/html; charset=UTF-8
    X-Cache: MISS from localhost
    X-Cache-Lookup: MISS from localhost:3128
    Via: 1.1 localhost:3128 (squid/2.7.STABLE3)
    Connection: close
  9. What changed?
    $ diff local hotel
    1,2c1,2
    < HTTP/1.1 200 OK
    < Date: Wed, 29 Oct 2014 15:04:20 GMT
    ---
    > HTTP/1.0 200 OK
    > Date: Wed, 29 Oct 2014 15:05:32 GMT
    8c8
    < Expires: Thu, 29 Oct 2015 15:04:20 GMT
    ---
    > Expires: Thu, 29 Oct 2015 15:05:32 GMT
    10d9
    < Connection: close
    11a11,14
    > X-Cache: MISS from localhost
    > X-Cache-Lookup: MISS from localhost:3128
    > Via: 1.1 localhost:3128 (squid/2.7.STABLE3)
    > Connection: close
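
The diff shows that the hotel network's proxy added the X-Cache, X-Cache-Lookup, and Via headers. The same check can be done in code. This is only a sketch: `parse_response_headers()` and `proxy_headers()` are hypothetical helper names, the sample response is shortened from the slide, and X-Cache is a Squid convention rather than a standard header. It assumes PHP 7.4 or later.

```php
<?php
// Parse a raw HTTP response head into [lowercased-name => value].
// Repeated header names overwrite each other, which is fine for this check.
function parse_response_headers(string $raw): array
{
    $headers = [];
    $lines = preg_split('/\r\n|\n|\r/', trim($raw));
    array_shift($lines); // drop the status line, e.g. "HTTP/1.0 200 OK"
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not "Name: value"
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }
    return $headers;
}

// Return the proxy-added headers that are present in a parsed response.
function proxy_headers(array $headers): array
{
    $added = ['via', 'x-cache', 'x-cache-lookup'];
    return array_values(array_filter($added, fn($h) => isset($headers[$h])));
}

$local = "HTTP/1.1 200 OK\nCache-Control: max-age=31536000\nConnection: close";
$hotel = "HTTP/1.0 200 OK\nCache-Control: max-age=31536000\n"
       . "X-Cache: MISS from localhost\n"
       . "Via: 1.1 localhost:3128 (squid/2.7.STABLE3)\nConnection: close";

print_r(proxy_headers(parse_response_headers($local))); // empty array
print_r(proxy_headers(parse_response_headers($hotel))); // via, x-cache
```

An empty result does not prove that no cache is involved; a proxy is free to omit these headers.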
  10. HTTP/1.1

  11. Relevant HTTP/1.1 Headers
    • Age:
    • Authorization:
    • Cache-Control:
    • Connection:
    • ETag:
    • Expires:
    • If-Match:
    • If-None-Match:
    • If-Range:
    • Pragma:
    • Vary:
    • Warning:
  12. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html or http://bit.ly/1p0KHQr

  13. Cache-Control: Requests
    • no-cache
    • no-store
    • max-age={seconds}
    • max-stale={seconds}
    • min-fresh={seconds}
    • no-transform
    • only-if-cached
  14. Cache-Control: Responses
    • public
    • private
    • no-cache
    • no-store
    • no-transform
    • must-revalidate
    • proxy-revalidate
    • max-age={seconds}
    • s-maxage={seconds}
  15. Adding headers in PHP
    void header ( string $string [, bool $replace = true [, int $http_response_code ]] )
  16. Cache-Control: in PHP
    header('Cache-Control: no-cache');
    header('Cache-Control: max-age=600');
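
Several response directives are usually combined into one header line. A minimal sketch of a builder follows; `cache_control()` is a hypothetical helper, not part of PHP, and the directive values are illustrative.

```php
<?php
// Build a Cache-Control response header line from a directive list.
// Valueless directives are plain strings; valued ones are name => seconds.
function cache_control(array $directives): string
{
    $parts = [];
    foreach ($directives as $key => $value) {
        // Integer keys mean the entry is a bare directive such as "public".
        $parts[] = is_int($key) ? $value : "$key=$value";
    }
    return 'Cache-Control: ' . implode(', ', $parts);
}

$line = cache_control(['public', 'max-age' => 600, 's-maxage' => 3600]);

// In a web request this would be sent with header($line) before any output.
echo $line, "\n"; // Cache-Control: public, max-age=600, s-maxage=3600
```

Building the whole line first matters because header() replaces an earlier header of the same name by default ($replace = true).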

  17. ETag:
    • Don’t use them
    • Generation is not specified by the HTTP standard, and is often not consistent across a cluster.
    • Error-prone and can be used to track users who refuse cookies.
    • Turn them off; don’t use them
  18. Expires:
    • Indicates when the resource becomes stale.
    • Specifies a date/time rather than delta seconds (Cache-Control: max-age=S).
    • Mostly used for compatibility with HTTP 1.0; Cache-Control: is more semantically rich.
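
The two forms can be derived from each other: Expires is the response time plus max-age, written as an HTTP date. The sketch below assumes a hypothetical helper `expires_header()` and reuses the values from the "local" example (max-age=31536000, sent at 15:04:20 GMT on 29 Oct 2014).

```php
<?php
// Turn a max-age in seconds into the equivalent Expires header line.
// $now is passed in so the calculation is repeatable; use time() in real code.
function expires_header(int $maxAge, int $now): string
{
    // HTTP dates are always in GMT, so gmdate() is used rather than date().
    return 'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT';
}

$sentAt = gmmktime(15, 4, 20, 10, 29, 2014); // 29 Oct 2014 15:04:20 GMT
echo expires_header(31536000, $sentAt), "\n";
// Expires: Thu, 29 Oct 2015 15:04:20 GMT
```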
  19. Extensions
    • Cache-Control: max-age={s}, stale-while-revalidate={s}
    • Cache-Control: max-age={s}, stale-if-error={s}
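
These extensions (defined in RFC 5861, where the second one is spelled stale-if-error) give a cache a grace period after max-age runs out. The sketch below shows one way a cache might decide what to do. The function name `cache_state()` and its return labels are made up for illustration, and real caches apply further rules.

```php
<?php
// Decide how a cache may answer, given the age of its stored copy.
// $swr = stale-while-revalidate window, $sie = stale-if-error window (seconds).
function cache_state(int $age, int $maxAge, int $swr = 0, int $sie = 0, bool $originDown = false): string
{
    if ($age < $maxAge) {
        return 'fresh'; // serve from cache, no origin contact needed
    }
    $staleFor = $age - $maxAge; // seconds since the copy became stale
    if ($originDown) {
        // The origin is failing: a stale copy may still be served for a while.
        return $staleFor < $sie ? 'serve-stale-on-error' : 'error';
    }
    if ($staleFor < $swr) {
        // Answer immediately with the stale copy and refresh in the background.
        return 'serve-stale-and-revalidate';
    }
    return 'revalidate-first'; // must check with the origin before answering
}

echo cache_state(100, 600), "\n";                // fresh
echo cache_state(610, 600, 30), "\n";            // serve-stale-and-revalidate
echo cache_state(700, 600, 30), "\n";            // revalidate-first
echo cache_state(700, 600, 30, 300, true), "\n"; // serve-stale-on-error
```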
  20. ?

  21. Is data cacheable?
    • Highly cacheable data: news stories, blog posts, aggregated data such as ratings or reviews (“likes”).
    • Uncacheable: secure, private, personal data such as user login information, credit card info, etc. Also data that must change rapidly, such as stock quotes or health monitoring systems.
  22. Cache Architectures

  23. Example 1. No cache [diagram labels: Web Server, Service]

  24. Example 2. Shared Cache [diagram labels: Web Server ×2, Cache (Proxy), Service]
  25. Example 3. Distributed Cache [diagram labels: Web Server ×4, Cache (Proxy) ×2, ICP, Service]
  26. Example 4. Local+Remote Cache [diagram labels: Web Server ×4, each with a (local cache); Cache (Proxy) ×2, ICP, Service]
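
The lookup order implied by Example 4 can be sketched in application code: try the local cache, then the shared cache, then the service. This is an analogy, not the slide's setup. The slide shows HTTP proxies, while the sketch uses plain arrays as stand-in caches, and `tiered_get()` is a hypothetical name. It assumes PHP 7.4 or later.

```php
<?php
// Look in the local cache, then the shared cache, then ask the origin service.
// Returns [value, tier that answered] so the caller can see where the hit was.
function tiered_get(string $key, array &$local, array &$shared, callable $origin): array
{
    if (array_key_exists($key, $local)) {
        return [$local[$key], 'local'];
    }
    if (array_key_exists($key, $shared)) {
        $local[$key] = $shared[$key]; // copy into the local tier for next time
        return [$shared[$key], 'shared'];
    }
    $value = $origin($key);           // full miss: the service does the work
    $shared[$key] = $value;
    $local[$key] = $value;
    return [$value, 'origin'];
}

$local  = [];
$shared = ['/a' => 'page A'];
$origin = fn(string $k) => "rendered $k";

echo tiered_get('/a', $local, $shared, $origin)[1], "\n"; // shared
echo tiered_get('/a', $local, $shared, $origin)[1], "\n"; // local
echo tiered_get('/b', $local, $shared, $origin)[1], "\n"; // origin
```

The sketch ignores expiry and invalidation, which any real multi-tier cache has to handle.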
  27. HTTP Proxies

  28. Squid
    • Old, venerable; the reference implementation for the HTTP standard
    • Single-threaded
    • Can be tricky to configure (a multitude of options) but very high-performance
    • Implements ICP (Internet Cache Protocol) for distributed and hierarchical caches
  29. Varnish
    • More modern implementation than Squid; relies on virtual memory and multi-threaded access
    • Easier to set up and configure than Squid
    • Does not support ICP or cache hierarchies
  30. nginx
    • Reverse proxy and web server in one; does not need a separate web server process
    • Great for static content, according to users
    • Uses asynchronous sockets; one-process-per-core architecture
  31. Manual Caching

  32. DIY caching
    • Tools let you build your own cache system.
    • Not transparent by default, but you can build transparency on top.
    • Most are simple key/value stores
    • Requires writing code
  33. DIY cache example
    • Object retrieval interface fetches data from service.
    • Internal methods query the data store (memcached, Redis) first and use stored data if possible.
    • If data is not in the cache, fetch it from the backend service and store it in the cache.
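
The three steps above can be sketched as follows. Everything named here is an assumption made for illustration: `ArrayStore` is an in-process stand-in for memcached or Redis, `ArticleRepository` is a hypothetical retrieval interface, and the 300-second TTL is arbitrary. It assumes PHP 7.4 or later.

```php
<?php
// Stand-in for memcached or Redis: an in-process key/value store with expiry.
class ArrayStore
{
    private array $items = [];

    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] <= $now) {
            return null; // miss, or the entry has expired
        }
        return $this->items[$key]['value'];
    }

    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }
}

// Retrieval interface: callers ask for an article and never see the cache.
class ArticleRepository
{
    public int $backendCalls = 0; // counted only to make the behavior visible
    private ArrayStore $store;
    private $backend;             // callable that fetches from the real service

    public function __construct(ArrayStore $store, callable $backend)
    {
        $this->store = $store;
        $this->backend = $backend;
    }

    public function find(int $id, int $now): array
    {
        $key = "article:$id";
        $cached = $this->store->get($key, $now);
        if ($cached !== null) {
            return $cached;                       // use stored data if possible
        }
        $this->backendCalls++;
        $article = ($this->backend)($id);         // otherwise fetch from the backend
        $this->store->set($key, $article, 300, $now); // and store it for next time
        return $article;
    }
}

$repo = new ArticleRepository(
    new ArrayStore(),
    fn(int $id) => ['id' => $id, 'title' => "Article $id"]
);
$repo->find(7, 1000); // miss: goes to the backend
$repo->find(7, 1100); // hit: served from the store
$repo->find(7, 1400); // entry expired at 1300, so the backend is called again
echo $repo->backendCalls, "\n"; // 2
```

The clock is passed in as `$now` only to keep the example repeatable; production code would normally rely on the store's own expiry.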
  34. Upsides for DIY caching
    • Provides a very clean programmatic interface (transparent at the application level)
    • Can be tailored to specific solutions where you understand the data.
    • Often very high performance
  35. Downsides to DIY caching
    • Requires code to be written, tested, etc.
    • Requires code maintenance if the underlying data model is changed.
    • Not standardized like HTTP for specifying age and freshness of data (i.e., a custom solution, not a generic one)
  36. Edge Caching

  37. What is an “edge cache?”
    • A content delivery network (CDN) that holds static content on the “edges” of the Internet
    • Akamai is the biggest, but there are others: Limelight, Microsoft Azure, Amazon CloudFront
    • Stores static content in multiple data centers
    • Content like JavaScript, CSS, images, and other media
  38. How does a CDN work?
    • Primary site (www.example.com) serves the HTML page.
    • <script> <style> <img> etc. tags reference static content on the CDN
    • The user’s browser loads (and often stores) the static content locally, because it’s served with a Cache-Control: max-age=32767 header.
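
On the primary site this usually comes down to writing asset tags that point at the CDN host. The sketch below rests on assumptions: `cdn.example.com` is a placeholder host, `cdn_url()` and `asset_tag()` are hypothetical helpers, and the `?v=` version parameter is one common way to force a fresh download once a long max-age has been handed out.

```php
<?php
// Build the CDN URL for a static asset, tagged with a version string.
function cdn_url(string $path, string $version, string $cdnHost = 'cdn.example.com'): string
{
    return 'https://' . $cdnHost . '/' . ltrim($path, '/') . '?v=' . rawurlencode($version);
}

// Emit the HTML tag that references the asset on the CDN.
function asset_tag(string $path, string $version): string
{
    $url = htmlspecialchars(cdn_url($path, $version), ENT_QUOTES);
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    switch ($ext) {
        case 'js':
            return "<script src=\"$url\"></script>";
        case 'css':
            return "<link rel=\"stylesheet\" href=\"$url\">";
        default:
            return "<img src=\"$url\" alt=\"\">"; // treat everything else as an image
    }
}

echo asset_tag('/js/app.js', '42'), "\n";
echo asset_tag('css/site.css', '42'), "\n";
```

Changing the version string changes the URL, so browsers and edge caches treat it as a new resource without waiting for the old copy to expire.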
  39. Q&A

  40. glen.campbell@rackspace.com
    @glenc
    http://www.glencampbell.co
    http://developer.rackspace.com
    Free Cloud!