Content Caching with NGINX

For the recorded webinar, visit nginx.com/webinars.

Content Caching is one of the most effective ways to dramatically improve the performance of a web site. In this webinar, we’ll deep-dive into NGINX’s caching abilities and investigate the architecture used, debugging techniques and advanced configuration. By the end of the webinar, you’ll be well equipped to configure NGINX to cache content exactly as you need.

NGINX Inc

May 07, 2014

Transcript

  1. About this webinar

     Content Caching is one of the most effective ways to dramatically improve the performance of a web site. In this webinar, we’ll deep-dive into NGINX’s caching abilities and investigate the architecture used, debugging techniques and advanced configuration. By the end of the webinar, you’ll be well equipped to configure NGINX to cache content exactly as you need.
  2. Basic Principles

     [Diagram: Internet, NGINX (N); GET /index.html shown on each side of NGINX]

     Used by: Browser Cache, Content Delivery Network and/or Reverse Proxy Cache
  3. Mechanics of HTTP Caching

     • Origin server declares cacheability of content
     • Requesting client honors cacheability
       - May issue conditional GETs

         Expires: Tue, 06 May 2014 02:28:12 GMT
         Cache-Control: public, max-age=60
         X-Accel-Expires: 30
         Last-Modified: Tue, 29 Apr 2014 02:28:12 GMT
         ETag: "3e86-410-3596fbbc"
  4. What does NGINX cache?

     • Cache GET and HEAD with no Set-Cookie response
     • Uniqueness defined by raw URL or:

         proxy_cache_key $scheme$proxy_host$uri$is_args$args;

     • Cache time defined by
       - X-Accel-Expires
       - Cache-Control
       - Expires

     http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html
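
     As a rough illustration of how the cache key defines uniqueness, the sketch below adds a cookie value to the key so that each value gets its own cached copy. It is a fragment for the http context; the cookie name "region" is invented for the example, and the zone name and backend address simply reuse the values from the deck.

     ```nginx
     # Hypothetical sketch (http context). The "region" cookie is an assumed name.
     proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2;

     server {
         listen 80;

         location / {
             proxy_pass http://localhost:8080;
             proxy_cache one;

             # Two requests for the same URL with different "region" cookie
             # values produce different keys, so they are cached separately.
             proxy_cache_key "$scheme$proxy_host$uri$is_args$args$cookie_region";
         }
     }
     ```

     Every extra variable in the key multiplies the number of stored copies, so it is usually worth adding only values the origin really uses to vary its response.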
  5. NGINX Config

         proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2 inactive=60m;

         server {
             listen 80;
             server_name localhost;

             location / {
                 proxy_pass http://localhost:8080;
                 proxy_cache one;
             }
         }
  6. Caching Process

     [Flowchart boxes: Read request; Check Cache (HIT / MISS); Respond from cache; Wait? (proxy_cache_lock_timeout); Response cacheable?; Stream to disk]

     NGINX can use stale content under the following circumstances:

         proxy_cache_use_stale error | timeout | invalid_header |
             updating | http_500 | http_502 | http_503 | http_504 |
             http_403 | http_404 | off
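
     One way the "Wait?" step and the stale-content rules might be combined is sketched below. The 5s timeout and the particular list of stale conditions are illustrative choices, not values taken from the webinar; the zone name and backend address reuse the deck's example.

     ```nginx
     # Hypothetical sketch (http context); values are assumptions.
     proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2;

     server {
         listen 80;

         location / {
             proxy_pass http://localhost:8080;
             proxy_cache one;

             # On a MISS, let one request populate the entry while other
             # requests for the same key wait, up to the lock timeout.
             proxy_cache_lock on;
             proxy_cache_lock_timeout 5s;

             # Serve a stale copy when the upstream fails or while the
             # entry is being refreshed by another request.
             proxy_cache_use_stale error timeout updating
                                   http_500 http_502 http_503 http_504;
         }
     }
     ```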
  7. Caching is not just for HTTP

     • FastCGI
       - Functions much like HTTP
     • Memcache
       - Retrieve content from memcached server (must be prepopulated)
     • uwsgi and SCGI

     [Diagram: NGINX (N) in front of HTTP, FastCGI, memcached, uwsgi and SCGI]

     NGINX is more than just a reverse proxy
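
     Since FastCGI caching functions much like HTTP caching, a FastCGI setup might look like the sketch below. The cache path, zone name "fcgi", PHP backend address and 10m validity are all assumptions made for the example, and it presumes the stock fastcgi_params file is present in the configuration directory.

     ```nginx
     # Hypothetical sketch (http context); paths and addresses are assumptions.
     fastcgi_cache_path /tmp/fcgi_cache keys_zone=fcgi:10m levels=1:2 inactive=60m;

     server {
         listen 80;

         location ~ \.php$ {
             include fastcgi_params;
             fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
             fastcgi_pass 127.0.0.1:9000;

             fastcgi_cache fcgi;
             # Unlike proxy_cache_key, fastcgi_cache_key has no default value,
             # so it has to be set explicitly.
             fastcgi_cache_key $scheme$request_method$host$request_uri;
             fastcgi_cache_valid 200 10m;
         }
     }
     ```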
  8. Cache Instrumentation

         add_header X-Cache-Status $upstream_cache_status;

     MISS         Response not found in cache; got from upstream. Response may have been saved to cache
     BYPASS       proxy_cache_bypass got response from upstream. Response may have been saved to cache
     EXPIRED      Entry in cache has expired; we return fresh content from upstream
     STALE        proxy_cache_use_stale takes control and serves stale content from cache because upstream is not responding correctly
     UPDATING     Serve stale content from cache because proxy_cache_lock has timed out and proxy_cache_use_stale takes control
     REVALIDATED  proxy_cache_revalidate verified that the current cached content was still valid (If-Modified-Since)
     HIT          We serve valid, fresh content direct from cache
  9. Cache Instrumentation

         map $remote_addr $cache_status {
             127.0.0.1 $upstream_cache_status;
             default   "";
         }

         server {
             location / {
                 proxy_pass http://localhost:8002;
                 proxy_cache one;

                 add_header X-Cache-Status $cache_status;
             }
         }
  10. How it works...

      • NGINX uses a persistent disk-based cache
        - OS Page Cache keeps content in memory, with hints from NGINX processes
      • We’ll look at:
        - How is content stored in the cache?
        - How is the cache loaded at startup?
        - Pruning the cache over time
        - Purging content manually from the cache
  11. How is cached content stored?

      • Define cache key:

          proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2 max_size=40m;
          proxy_cache_key $scheme$proxy_host$uri$is_args$args;

      • Get the content into the cache, then check the md5:

          $ echo -n "httplocalhost:8002/time.php" | md5sum
          6d91b1ec887b7965d6a926cff19379b4  -

      • Verify it’s there:

          $ cat /tmp/cache/4/9b/6d91b1ec887b7965d6a926cff19379b4
  12. Loading cache from disk

      • Cache metadata stored in shared memory segment
      • Populated at startup from cache by cache loader
        - Loads files in blocks of 100
        - Takes no longer than 200ms
        - Pauses for 50ms, then repeats

          proxy_cache_path path keys_zone=name:size
              [loader_files=number] [loader_threshold=time] [loader_sleep=time];

          Defaults: loader_files=100, loader_threshold=200ms, loader_sleep=50ms
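
      For a large cache on fast storage, the loader parameters could be raised so that startup loading finishes sooner. The numbers below are purely illustrative and have not been tuned for any real workload.

      ```nginx
      # Hypothetical sketch (http context): load up to 200 files per iteration,
      # let each iteration run for up to 300ms, then pause for 50ms.
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2
                       loader_files=200 loader_threshold=300ms loader_sleep=50ms;
      ```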
  13. Managing the disk cache

      • Cache Manager runs periodically, purging files that were inactive irrespective of cache time, and deleting files in LRU style if the cache is too big
        - Remove files that have not been used within 10m
        - Remove files if cache size exceeds max_size

          proxy_cache_path path keys_zone=name:size
              [inactive=time] [max_size=size];

          Default: inactive=10m
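
      The interaction between inactivity and cache time can be seen in a small sketch. The 1h, 1g and 7d values are assumptions picked to make the contrast obvious.

      ```nginx
      # Hypothetical sketch (http context); sizes and times are assumptions.
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2
                       inactive=1h max_size=1g;

      server {
          listen 80;

          location / {
              proxy_pass http://localhost:8080;
              proxy_cache one;

              # A 200 response is considered fresh for 7 days, but the cache
              # manager still removes it if nobody requests it for 1 hour,
              # or earlier if the cache grows past 1g and it is least recently used.
              proxy_cache_valid 200 7d;
          }
      }
      ```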
  14. Purging content from disk

      • Find it and delete it
        - Relatively easy if you know the key
      • NGINX Plus: cache purge capability

          $ curl -X PURGE -D - "http://localhost:8001/*"
          HTTP/1.1 204 No Content
          Server: nginx/1.5.12
          Date: Sat, 03 May 2014 16:33:04 GMT
          Connection: keep-alive
          X-Cache-Key: httplocalhost:8002/*
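
      A purge request like the one above needs purging to be enabled on the NGINX Plus side. The sketch below follows the pattern from the NGINX Plus documentation for the commercial-only proxy_cache_purge directive; it is not necessarily the configuration used in the webinar, and the variable name $purge_method is just a convention. The ports mirror the curl example.

      ```nginx
      # Hypothetical sketch (http context). proxy_cache_purge is available
      # in NGINX Plus only; open source nginx rejects this directive.
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2;

      # Turn the PURGE request method into a flag.
      map $request_method $purge_method {
          PURGE   1;
          default 0;
      }

      server {
          listen 8001;

          location / {
              proxy_pass http://localhost:8002;
              proxy_cache one;

              # When the flag is non-empty and not "0", the matching
              # cache entries are purged instead of served.
              proxy_cache_purge $purge_method;
          }
      }
      ```

      In practice the PURGE method would normally be restricted to trusted addresses, for example with allow and deny rules.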
  15. Delayed caching

      • Saves on disk writes for very cool caches

          proxy_cache_min_uses number;

      Cache revalidation

      • Saves on upstream bandwidth and disk writes

          proxy_cache_revalidate on;
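
      Both directives can sit in the same location. In this sketch the threshold of 3 requests is an arbitrary example value.

      ```nginx
      # Hypothetical sketch (http context); the threshold is an assumption.
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2;

      server {
          listen 80;

          location / {
              proxy_pass http://localhost:8080;
              proxy_cache one;

              # Delayed caching: only write a response to disk once the
              # same key has been requested 3 times.
              proxy_cache_min_uses 3;

              # Revalidation: refresh an expired entry with a conditional
              # GET (If-Modified-Since) rather than a full download.
              proxy_cache_revalidate on;
          }
      }
      ```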
  16. Control over cache time

      • Priority is:
        - X-Accel-Expires
        - Cache-Control
        - Expires
        - proxy_cache_valid

          proxy_cache_valid 200 302 10m;
          proxy_cache_valid 404 1m;

      Set-Cookie response header means no caching
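
      Because proxy_cache_valid comes last in the priority order, it only applies when the origin sends none of the three headers. If the origin's headers are unhelpful, one option is to tell NGINX to disregard them, as in this sketch (the decision to ignore all three is an assumption about the origin, not a recommendation from the deck):

      ```nginx
      # Hypothetical sketch (http context).
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2;

      server {
          listen 80;

          location / {
              proxy_pass http://localhost:8080;
              proxy_cache one;

              # Ignore the origin's caching headers so that the
              # proxy_cache_valid rules below decide the cache time.
              proxy_ignore_headers X-Accel-Expires Cache-Control Expires;

              proxy_cache_valid 200 302 10m;
              proxy_cache_valid 404 1m;
          }
      }
      ```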
  17. Cache / don’t cache

      • Bypass the cache: go to origin; may cache result

          proxy_cache_bypass string ...;

      • No_Cache: if we go to origin, don’t cache result

          proxy_no_cache string ...;

      • Typically used with a complex cache key, and only if the origin does not send appropriate Cache-Control responses

          proxy_no_cache $cookie_nocache $arg_nocache $http_authorization;
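
      The two directives are often paired. In the sketch below, which variables go to which directive is an illustrative choice; either directive takes effect when at least one of its values is non-empty and not "0".

      ```nginx
      # Hypothetical sketch (http context).
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2;

      server {
          listen 80;

          location / {
              proxy_pass http://localhost:8080;
              proxy_cache one;

              # Skip the cache lookup for requests carrying a "nocache"
              # cookie or a ?nocache=1 argument; the fresh response may
              # still be stored.
              proxy_cache_bypass $cookie_nocache $arg_nocache;

              # Never store responses to requests that carry an
              # Authorization header.
              proxy_no_cache $http_authorization;
          }
      }
      ```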
  18. Multiple Caches

      • Different cache policies for different tenants
      • Pin caches to specific disks
      • Temp-file considerations: put on same disk!

          proxy_cache_path /tmp/cache1 keys_zone=one:10m levels=1:2 inactive=60s;
          proxy_cache_path /tmp/cache2 keys_zone=two:2m levels=1:2 inactive=20s;
          proxy_temp_path path [level1 [level2 [level3]]];
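
      A per-tenant layout might look like the sketch below. The server names and temp directories are invented for the example, and it assumes each temp directory lives on the same filesystem as its cache so that finished files can be moved into place with a rename instead of a copy.

      ```nginx
      # Hypothetical sketch (http context); host names and temp paths are assumptions.
      proxy_cache_path /tmp/cache1 keys_zone=one:10m levels=1:2 inactive=60s;
      proxy_cache_path /tmp/cache2 keys_zone=two:2m  levels=1:2 inactive=20s;

      server {
          listen 80;
          server_name tenant-a.example.com;

          # Temp files on the same disk as cache "one".
          proxy_temp_path /tmp/cache1_temp;

          location / {
              proxy_pass http://localhost:8080;
              proxy_cache one;
          }
      }

      server {
          listen 80;
          server_name tenant-b.example.com;

          # Temp files on the same disk as cache "two".
          proxy_temp_path /tmp/cache2_temp;

          location / {
              proxy_pass http://localhost:8080;
              proxy_cache two;
          }
      }
      ```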
  19. Gotchas with NGINX caching

      • No ‘Vary’ support: use cache key
      • ETags work with Last-Modified, but not alone
      • Supports SSI, but not ESI
      • Other than that, you’re good to go!
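
      To work around the first gotcha, the request header the origin varies on can be folded into the cache key. The sketch assumes the origin varies on Accept-Encoding and normalizes the header to two variants with a map, so that differently ordered or weighted header values do not fragment the cache; the variable name $encoding_variant is invented.

      ```nginx
      # Hypothetical sketch (http context); assumes the origin varies on Accept-Encoding.
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2;

      # Collapse the many possible Accept-Encoding values into "gzip" or "".
      map $http_accept_encoding $encoding_variant {
          default "";
          ~gzip   gzip;
      }

      server {
          listen 80;

          location / {
              proxy_pass http://localhost:8080;
              proxy_cache one;

              # gzip-capable and other clients now get separate cache entries.
              proxy_cache_key $scheme$proxy_host$uri$is_args$args$encoding_variant;
          }
      }
      ```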
  20. Why is page speed important?

      • We used to talk about the ‘N second rule’:
        - 10-second rule (Jakob Nielsen, March 1997)
        - 8-second rule (Zona Research, June 2001)
        - 4-second rule (Jupiter Research, June 2006)
        - 3-second rule (PhocusWright, March 2010)

      [Chart: the ‘N second rule’ over time; vertical axis 0 to 12, horizontal axis Jan-97 to Jan-14]
  21. Google changed the rules

      “We want you to be able to get from one page to another as quickly as you turn the page on a book”

      Urs Hölzle, Google
  22. The costs of poor performance

      • Google: search enhancements cost 0.5s page load
        - Ad CTR dropped 20%
      • Amazon: Artificially increased page load by 100ms
        - Customer revenue dropped 1%
      • Walmart, Yahoo, Shopzilla, Edmunds, Mozilla…
        - All reported similar effects on revenue
      • Google Pagerank: Page Speed affects Page Rank
        - Time to First Byte is what appears to count
  23. NGINX Caching lets you

      • Improve end-user performance
      • Consolidate and simplify your web infrastructure
      • Increase server capacity
      • Insulate yourself from server failures
  24. Closing thoughts

      • 38% of the world’s busiest websites use NGINX
      • Check out the blogs on nginx.com
      • Future webinars: nginx.com/webinars

      Try NGINX F/OSS (nginx.org) or NGINX Plus (nginx.com)