Slide 1

Slide 1 text

Stupid Web Caching Tricks
Mark Nottingham / mnot@yahoo-inc.com / [email protected] / @mnot

Slide 2

Slide 2 text

foo.yahoo.com

Slide 3

Slide 3 text

[Diagram labels: front-end, the internets]

Slide 4

Slide 4 text

[Diagram labels: services, front-end, the internets]

Slide 5

Slide 5 text

[Diagram labels: services, front-end, caching, the internets]

Slide 6

Slide 6 text

Simple, right? Well, let’s bring it into rotation...

Slide 7

Slide 7 text

Oops.
1276007531.061 205 192.168.1.16 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.062 205 192.168.1.17 TCP_MISS/200 9287 GET /details?ticker=ABC
1276007531.064 218 192.168.1.16 TCP_MISS/200 9285 GET /details?ticker=ABC
1276007531.065 198 192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.065 215 192.168.1.15 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.068 205 192.168.1.15 TCP_MISS/200 9288 GET /details?ticker=ABC
1276007531.072 1 192.168.1.17 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007531.254 398 192.168.1.15 TCP_MISS/200 9288 GET /details?ticker=ABC
1276007531.261 408 192.168.1.15 TCP_MISS/200 9287 GET /details?ticker=ABC
1276007531.289 429 192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.922 852 192.168.1.15 TCP_MISS/504 282 GET /details?ticker=ABC
1276007532.005 0 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007532.044 987 192.168.1.16 TCP_MISS/504 283 GET /details?ticker=ABC
1276007532.045 2 192.168.1.16 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007532.068 1001 192.168.1.17 TCP_MISS/000 0 GET /details?ticker=ABC
1276007532.072 998 192.168.1.16 TCP_MISS/504 278 GET /details?ticker=ABC
1276007591.062 60001 192.168.1.16 TCP_MISS/000 0 GET /details?ticker=ABC
1 2 3 4 5 6

Slide 8

Slide 8 text

Collapsed Forwarding
1276007531.068 205 192.168.1.16 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.068 205 192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.068 205 192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.068 205 192.168.1.15 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.068 205 192.168.1.16 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.068 205 192.168.1.15 TCP_MISS/200 9286 GET /details?ticker=ABC
1276007531.072 1 192.168.1.17 TCP_HIT/200 9287 GET /details?ticker=ABC
1276007531.072 0 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007531.073 1 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007531.073 0 192.168.1.15 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007531.074 0 192.168.1.17 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007531.076 1 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007531.076 0 192.168.1.17 TCP_HIT/200 9287 GET /details?ticker=ABC
1276007531.077 0 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007531.078 1 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007531.079 1 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1 2 3 4 5 6
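The idea behind collapsed forwarding can be sketched in a few lines of Python. This is only an illustration of request coalescing, not how Squid implements it: the class name `CollapsingCache`, the `fetch` callable, and the status labels are all assumptions made for this example. The first request for a key goes to the origin, and concurrent requests for the same key wait for that one fetch instead of starting their own.

```python
import threading


class CollapsingCache:
    """Toy cache that collapses concurrent misses for the same key (illustrative only)."""

    def __init__(self, fetch):
        self._fetch = fetch        # hypothetical callable that asks the origin for a key
        self._lock = threading.Lock()
        self._store = {}           # key -> cached body
        self._inflight = {}        # key -> Event set when the pending fetch finishes

    def get(self, key):
        with self._lock:
            if key in self._store:
                return "HIT", self._store[key]
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                # Nobody is fetching this key yet, so this request becomes the leader.
                event = threading.Event()
                self._inflight[key] = event
        if leader:
            try:
                body = self._fetch(key)
                with self._lock:
                    self._store[key] = body
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()        # wake the requests that were collapsed onto this fetch
            return "MISS", body
        event.wait()
        with self._lock:
            # None here means the leader's fetch failed and nothing was stored.
            return "COLLAPSED", self._store.get(key)
```

With six simultaneous requests for `/details?ticker=ABC`, the origin in this sketch sees one fetch rather than six.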

Slide 9

Slide 9 text

in squid2: collapsed_forwarding on
in squid2.HEAD: collapsed_forwarding_timeout

Slide 10

Slide 10 text

[Diagram labels: services, front-end, caching, the internets]
SPOF!

Slide 11

Slide 11 text

[Diagram labels: services, front-end, caching, the internets]

Slide 12

Slide 12 text

+ good business continuity
+ more flexible
- worse hit rate
- high load when new caches come online
- caches can come out of sync
answer: Cache Peering

Slide 13

Slide 13 text

[Diagram labels: services, front-end, caching, the internets]

Slide 14

Slide 14 text

RFC 2186 - Internet Cache Protocol: UDP-based; just the URI; Query only; in Squid / Traffic Server
RFC 2756 - Hyper Text Caching Protocol: UDP-based (option for TCP in spec); includes URI + Headers; Query, CLR operations; in Squid

Slide 15

Slide 15 text

[Diagram labels: services, front-end, caching, the internets]
?!

Slide 16

Slide 16 text

24 front-end servers
x 24 Apache children
x 5 pages / second
x 8 service requests / page
x 10k / service response
/ 2 cache servers
= 11,520 req/sec, 900 Mbits/sec per cache server
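The arithmetic on this slide can be checked with a short calculation. The variable names are made up for the example, and "10k" is read here as 10,000 bytes per response, which is an assumption; with that reading the result is about 920 Mbit/s, in line with the slide's rounded 900 Mbits/sec.

```python
front_end_servers = 24
apache_children_per_server = 24
pages_per_second_per_child = 5
service_requests_per_page = 8
response_size_bytes = 10_000   # assumption: "10k" means 10,000 bytes
cache_servers = 2

# Total back-end requests generated by the front-end tier each second.
total_requests = (front_end_servers * apache_children_per_server
                  * pages_per_second_per_child * service_requests_per_page)

# Load on each cache server if the two caches split the traffic evenly.
per_cache_requests = total_requests // cache_servers
per_cache_mbits = per_cache_requests * response_size_bytes * 8 / 1_000_000

print(per_cache_requests)  # 11520
print(per_cache_mbits)     # 921.6
```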

Slide 17

Slide 17 text

Hierarchy
[Diagram labels: services, front-end, proxy caching, local caching, the internets]

Slide 18

Slide 18 text

Content Becomes
1276007530.037 0 192.168.1.17 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.057 1 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007530.083 0 192.168.1.17 TCP_HIT/200 9287 GET /details?ticker=ABC
1276007530.119 0 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.141 1 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.179 0 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007530.397 205 192.168.1.15 TCP_REFRESH_MISS/200 9285 GET /details?...
1276007530.401 1 192.168.1.17 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.414 201 192.168.1.17 TCP_REFRESH_MISS/200 9285 GET /details?...
1276007530.418 1 192.168.1.15 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007530.434 0 192.168.1.16 TCP_HIT/200 9287 GET /details?ticker=ABC
1276007530.442 198 192.168.1.17 TCP_REFRESH_MISS/200 9285 GET /details?...
1276007530.372 0 192.168.1.15 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007530.494 201 192.168.1.16 TCP_REFRESH_MISS/200 9285 GET /details?...
1276007530.525 1 192.168.1.17 TCP_HIT/200 9284 GET /details?ticker=ABC
1276007530.548 201 192.168.1.17 TCP_REFRESH_MISS/200 9285 GET /details?...
1276007530.563 1 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.594 0 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1 2 3 4 5 6

Slide 19

Slide 19 text

Cache-Control: stale-while-revalidate=30
RFC 5861
implemented: Squid 2.7
coming soon: Squid 3.2, Apache Traffic Server

Slide 20

Slide 20 text

stale-while-revalidate
1276007530.037 0 192.168.1.17 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.057 1 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007530.083 0 192.168.1.17 TCP_HIT/200 9287 GET /details?ticker=ABC
1276007530.119 0 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.141 1 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.179 0 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1276007530.192 0 192.168.1.15 TCP_STALE_HIT/200 9285 GET /details?...
1276007530.213 1 192.168.1.17 TCP_STALE_HIT/200 9286 GET /details?...
1276007530.243 0 192.168.1.17 TCP_STALE_HIT/200 9285 GET /details?...
1276007530.294 0 192.168.1.16 TCP_STALE_HIT/200 9287 GET /details?...
1276007530.347 0 192.168.1.17 TCP_STALE_HIT/200 9285 GET /details?...
1276007530.384 219 0.0.0.0 TCP_ASYNC_MISS/200 9285 GET /details?...
1276007530.401 1 192.168.1.17 TCP_HIT/200 9284 GET /details?ticker=ABC
1276007530.418 1 192.168.1.15 TCP_HIT/200 9286 GET /details?ticker=ABC
1276007530.434 0 192.168.1.16 TCP_HIT/200 9285 GET /details?ticker=ABC
1 2 3 4 5 6
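The decision a cache makes under stale-while-revalidate can be sketched as below. This is a simplified reading of RFC 5861, not Squid's code: the function names and the three result labels are invented for the example, the header parser is deliberately naive, and a real cache would also consider `Expires`, `s-maxage`, and other directives.

```python
def parse_cache_control(value):
    """Naive parser: splits on commas, so quoted values containing commas are not handled."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"')
    return directives


def classify(age, cache_control):
    """Decide how a cache may answer from a stored response that is `age` seconds old."""
    directives = parse_cache_control(cache_control)
    max_age = int(directives.get("max-age", 0))
    swr = int(directives.get("stale-while-revalidate", 0))
    if age < max_age:
        return "fresh"                    # serve from cache, no origin traffic
    if age < max_age + swr:
        return "stale-while-revalidate"   # serve stale now, refresh in the background
    return "revalidate-first"             # the client waits for the origin


for age in (5, 20, 45):
    print(age, classify(age, "max-age=10, stale-while-revalidate=30"))
```

In the middle case the client gets an immediate answer, which would correspond to the zero-millisecond stale hits in the log, while a single background request refreshes the entry.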

Slide 21

Slide 21 text

[Diagram labels: services, front-end, proxy caching, local caching, the internets]

Slide 22

Slide 22 text

Cache-Control: stale-if-error=3600
RFC 5861
implemented: Squid 2.7
coming soon: Squid 3.2, Apache Traffic Server
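A rough sketch of the stale-if-error behaviour, under stated assumptions: the `entry` dictionary layout, the `serve` function, and the `fetch_origin` callable are hypothetical, and an unreachable origin is modelled as an `OSError`. Following RFC 5861, the error cases are taken to be a 500, 502, 503, or 504 response or a failure to reach the origin. The sketch does not store the fresh response on success, which a real cache would.

```python
import time

ERROR_STATUSES = {500, 502, 503, 504}


def serve(entry, fetch_origin, now=None):
    """Return (status, body) for a cached entry, falling back to stale content on errors.

    entry: dict with 'body', 'stored_at', 'max_age', 'stale_if_error' (hypothetical layout).
    fetch_origin: hypothetical callable returning (status, body) or raising OSError.
    """
    now = time.time() if now is None else now
    age = now - entry["stored_at"]
    if age < entry["max_age"]:
        return 200, entry["body"]          # still fresh, no origin contact needed
    try:
        status, body = fetch_origin()
    except OSError:
        status, body = 504, ""             # treat an unreachable origin as a gateway timeout
    if status in ERROR_STATUSES and age < entry["max_age"] + entry["stale_if_error"]:
        return 200, entry["body"]          # origin is failing: hide it behind the stale copy
    return status, body
```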

Slide 23

Slide 23 text

Dealing with Aborted Requests
[Diagram labels: services, front-end, caching]
front-end timeout: 500ms
slow service = no cached response
dropped client connection
not cached = always slow
Squid: quick_abort
Apache Traffic Server: background_fill

Slide 24

Slide 24 text

Getting an Immediate Answer
[Diagram labels: services, front-end, caching]
Cache-Control: only-if-cached
504 Gateway Timeout
Cache-Control: max-age=3600, max-stale
Squid (soon): fetch_only_if_cached_access
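What only-if-cached asks of a cache can be shown with a small sketch. The function `answer_from_cache` and the plain-dictionary `store` are assumptions for illustration, and freshness checks are left out entirely: the cache either answers from what it holds or replies 504 without contacting the origin.

```python
def answer_from_cache(url, request_headers, store):
    """Return (status, body) if the cache can answer on its own, else None to forward."""
    directives = {d.strip().lower()
                  for d in request_headers.get("Cache-Control", "").split(",")}
    if url in store:
        return 200, store[url]
    if "only-if-cached" in directives:
        # The client asked for an immediate answer, so do not go to the origin.
        return 504, "Gateway Timeout"
    return None  # a real cache would now forward the request upstream
```

A front-end that must respond within its own deadline could send the request with `only-if-cached` and treat the 504 as "render without this component".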

Slide 25

Slide 25 text

[Diagram labels: services, front-end, proxy caching, local caching, the internets]
cache_peer...round-robin

Slide 26

Slide 26 text

[Diagram labels: services, front-end, proxy caching, local caching, the internets]
cache_peer...carp
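The difference from round-robin is that hash-based peer selection always sends a given URL to the same cache, so each response is stored once across the array. The sketch below uses highest-score (rendezvous) hashing with SHA-256 to show the principle; it is not the hash function or load-factor weighting that CARP itself specifies, and the peer names are hypothetical.

```python
import hashlib


def pick_peer(url, peers):
    """Choose the peer with the highest combined hash score for this URL."""
    def score(peer):
        digest = hashlib.sha256((peer + "|" + url).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return max(peers, key=score)


peers = ["cache1.example.com", "cache2.example.com", "cache3.example.com"]
print(pick_peer("/details?ticker=ABC", peers))
```

A useful property of this scheme is that removing a peer only remaps the URLs that peer was responsible for; every other URL keeps going to the same cache.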

Slide 27

Slide 27 text

Why won’t Squid cache that response?

Slide 28

Slide 28 text

Easy Answers
request Cache-Control
response Cache-Control
authentication
unfriendly freshness information
lack of LM/ETag

Slide 29

Slide 29 text

...in Squid
refresh_pattern . 10 100% 10 [options]
request Cache-Control: ignore-reload
response Cache-Control: ignore-[no-cache, no-store, must-revalidate, private]
authentication: ignore-auth
unfriendly freshness information: override-[expire, lastmod]
lack of LM/ETag: store-stale

Slide 30

Slide 30 text

...in Traffic Server
request Cache-Control: proxy.config.http.cache.ignore_client_no_cache
response Cache-Control: proxy.config.http.cache.ignore_server_no_cache
authentication: proxy.config.http.cache.ignore_authentication
unfriendly freshness information: proxy.config.http.cache.when_to_revalidate
lack of LM/ETag: proxy.config.http.cache.required_headers
dest_domain=example.com method=GET pin-in-cache=2d

Slide 31

Slide 31 text

Not So Easy: Wandering URLs
http://srv254.dctr.example.com/foo/image.gif
http://example.com/thing.xml?uselessToken=abc123
http://example.com/endPointforEverything
[Diagram labels: http://b, http://a, http://a]
storeurl_rewrite

Slide 32

Slide 32 text

No Answers*
non-GET methods
Protocol Errors
Vary: *
*without hacking

Slide 33

Slide 33 text

Your API will be cached.

Slide 34

Slide 34 text

Accelerator Caching
[Diagram labels: services, proxy caching, the internets]

Slide 35

Slide 35 text

non-canonical URLs = low cache hit rate
/people?name=britney_spears&page=2
/people?name=Britney_Spears&page=2
/people?name=Britney_Spears&page=02
/people?NAME=Britney_Spears&page=02
/people?page=2&name=Britney_Spears
/people?name=Britney_Spears&page=2&
/people?name=Britney_Spears&page=2&token=abc
/people?name=Britney_Spears&page=2&user=jane
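One way to see why these variants hurt is to write the canonicalization that would make them a single cache key. The sketch below is an illustration only: which parameters can be ignored, whether values are case-insensitive, and how `page` is normalized all depend on the application, so `IGNORED_PARAMS` and the rules inside `canonicalize` are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

IGNORED_PARAMS = {"token", "user"}   # assumption: these never change the response


def canonicalize(url):
    """Map equivalent URLs to one form so a cache stores a single copy."""
    parts = urlsplit(url)
    params = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        name = name.lower()                  # assumption: parameter names ignore case
        if name in IGNORED_PARAMS:
            continue
        if name == "page" and value.isdigit():
            value = str(int(value))          # "02" becomes "2"
        else:
            value = value.lower()            # assumption: values ignore case
        params.append((name, value))
    params.sort()                            # parameter order no longer matters
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), ""))


print(canonicalize("/people?NAME=Britney_Spears&page=02"))
# /people?name=britney_spears&page=2
```

The better fix is for clients to generate one canonical URL in the first place, since a stock cache compares URLs as opaque strings.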

Slide 36

Slide 36 text

Director XML format local in-cache / fetched from site

Slide 37

Slide 37 text

There are only two hard things in CS: cache invalidation & naming things.
Phil Karlton

Slide 38

Slide 38 text

reliability, scalability, immediacy.
Choose two. Or maybe one.

Slide 39

Slide 39 text

Invalidations after Updates or Deletions
[Diagram labels: the internets, http acceleration, origin server, POST/PUT/DELETE/etc.]
RFC 2616: Request-URI, Content-Location, Location
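The RFC 2616 rule (section 13.10) can be sketched as a function that lists which cached URIs a cache should drop after an unsafe request passes through it. The function name and the dictionary of response headers are assumptions for the example; the same-host check on `Location` and `Content-Location` follows the RFC's requirement that the host part match the Request-URI. Only the three methods the RFC names are covered here.

```python
from urllib.parse import urljoin, urlsplit

UNSAFE_METHODS = {"POST", "PUT", "DELETE"}   # the methods RFC 2616 section 13.10 names


def uris_to_invalidate(method, request_uri, response_headers):
    """Return the set of cached URIs to invalidate after this request/response pair."""
    if method.upper() not in UNSAFE_METHODS:
        return set()
    targets = {request_uri}
    host = urlsplit(request_uri).netloc
    for header in ("Location", "Content-Location"):
        value = response_headers.get(header)
        if value:
            absolute = urljoin(request_uri, value)
            # Only honour the header if it points at the same host as the request.
            if urlsplit(absolute).netloc == host:
                targets.add(absolute)
    return targets
```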

Slide 40

Slide 40 text

Problem 1: Peered Caches
[Diagram labels: the internets, http acceleration, origin server, POST/PUT/DELETE/etc.]

Slide 41

Slide 41 text

Sharing Invalidations with HTCP CLR
[Diagram labels: the internets, http acceleration, origin server, POST/PUT/DELETE/etc.]

Slide 42

Slide 42 text

Problem 2: Related Responses
POST /articles/123/new_comment
/newest_comments
/articles/123/comments
/comment_feed

Slide 43

Slide 43 text

Link: rel=invalidated-by
POST /articles/123/new_comment
/newest_comments
/articles/123/comments
/comment_feed
Link: ; rel="invalidated-by"
Link: ; rel="invalidated-by"
Link: ; rel="invalidated-by"

Slide 44

Slide 44 text

Problem 3: Dynamic Relations
POST /articles/123/new_comment
/newest_comments
/articles/123/comments
/comment_feed
Link: ; rel="invalidated-by"
Link: ; rel="invalidated-by"
Link: ; rel="invalidated-by"
/bob/comments
/cat/vuvuzela

Slide 45

Slide 45 text

Link: rel=invalidates
POST /articles/123/new_comment
/newest_comments
/articles/123/comments
/comment_feed
Link: ; rel="invalidated-by"
Link: ; rel="invalidated-by"
Link: ; rel="invalidated-by"
/bob/comments
/cat/vuvuzela
Link: ; rel="invalidates"
Link: ; rel="invalidates"

Slide 46

Slide 46 text

Linked Cache Invalidation = “side effect” invalidation + link relations
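How the two link relations combine can be sketched with a toy in-memory cache. Everything here is an assumption made for illustration (the class, its method names, and passing link targets as plain lists rather than parsing `Link` headers): a stored response remembers which URLs it is invalidated-by, and the response to an unsafe request may name further URLs it invalidates.

```python
class LinkedInvalidationCache:
    """Toy cache illustrating invalidated-by and invalidates (not a real implementation)."""

    def __init__(self):
        self._store = {}   # url -> (body, set of URLs this response is invalidated-by)

    def put(self, url, body, invalidated_by=()):
        self._store[url] = (body, set(invalidated_by))

    def get(self, url):
        entry = self._store.get(url)
        return entry[0] if entry else None

    def on_unsafe_response(self, request_url, invalidates=()):
        """Call after a successful POST/PUT/DELETE to request_url; returns what went stale."""
        stale = {request_url, *invalidates}          # side effect plus rel="invalidates"
        for url, (_, deps) in self._store.items():
            if request_url in deps:                  # rel="invalidated-by" on stored responses
                stale.add(url)
        for url in stale:
            self._store.pop(url, None)
        return stale
```

With the URLs from the slides, a POST to `/articles/123/new_comment` would drop the three comment listings that declared invalidated-by, plus `/bob/comments` and `/cat/vuvuzela` if the POST response listed them as invalidates.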

Slide 47

Slide 47 text

Cache Channels
[Diagram labels: the internets, http acceleration, origin server]
“What’s become stale?”

Slide 48

Slide 48 text

Cache Channels
Good for: occasional tight control
Bottleneck: number of events in channel
Caveat: ~10-30s lag; not immediate
Linked Cache Invalidation
Good for: user-generated content
Bottleneck: complexity of relationships
Caveat: not 100% reliable

Slide 49

Slide 49 text

The whole point of using a Web cache is that you’re not writing code.

Slide 50

Slide 50 text

http://www.squid-cache.org/
http://trafficserver.apache.org/
http://www.mnot.net/cache_docs/
http://redbot.org/
http://github.com/mnot/