Caching up and down the stack in Django - James Meickle

Whether you're looking to make your web app run faster or scale better, one great way to achieve both is to simply do less work. How? By using caches, the data hidey-holes which generations of engineers have thoughtfully left at key junctures in computing infrastructure from your CPU to the backbone of the internet.

PyGotham 2014

August 17, 2014

Transcript

  1. Caching Up and Down the Stack

    James Meickle, Developer evangelist, AppNeta (@jmeickle)
    PyGotham, August 17, 2014
  2. WHAT IS CACHING?

    Uncached: Client -> Data Source
    Cached: Client -> Cache Intermediary -> Data Source
  3. WHAT IS CACHING?

    Uncached: Client -> Data Source
    Cached: Client -> Cache Intermediary -> Data Source
    Diagram labels: Fast! Slow...
  4. WHAT GETS CACHED BY CLIENTS?

    • Images
    • CSS
    • JavaScript
    • HTML documents
    • DNS
  5. WHAT GETS CACHED BY APPS?

    • HTML documents
    • HTML fragments
    • Operations on data
    • Database queries
    • Expensive objects
  6. WHAT GETS CACHED BY SERVERS?

    • Compiled source
    • Packages
    • Disk access
    • Memory access
    • CPU instructions
  7. CACHING IN DJANGO: CLIENT-SIDE

    • HTTP responses:
      • JavaScript
      • CSS
      • HTML documents
  8. CACHING HTTP RESPONSES

    • Use HTTP caches!
      • CDN
      • Intermediate proxies
      • Browser
    • Set policy with cache headers
      • Cache-Control
      • Expires
  9. CACHING HTTP RESPONSES

    /tl-layouts_base-compiled-757f5eec3603f60850acfdb86e6701cf104f80ae.css
    Request Method: GET
    Status Code: 304 Not Modified

    Cache-Control: max-age=315360000
    Connection: keep-alive
    Date: Mon, 18 Feb 2013 22:46:12 GMT
    Expires: Thu, 31 Dec 2037 23:55:55 GMT
    Last-Modified: Tue, 12 Feb 2013 21:10:20 GMT
    Server: nginx/0.8.54
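The response above pairs a content-hashed filename with a very long cache lifetime, so a changed file gets a new URL instead of needing invalidation. A minimal standard-library sketch of that pattern follows. The function names `fingerprinted_name` and `far_future_headers` are hypothetical, the ten-year lifetime simply mirrors the `max-age` value on the slide, and the filename is assumed to have an extension.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

TEN_YEARS = 10 * 365 * 24 * 60 * 60  # 315360000 seconds, the max-age shown on the slide


def fingerprinted_name(name, content):
    """Insert a SHA-1 digest of the content before the file extension.

    Assumes `name` contains a dot; `content` is the file body as bytes.
    """
    digest = hashlib.sha1(content).hexdigest()
    stem, dot, ext = name.rpartition(".")
    return f"{stem}-{digest}{dot}{ext}"


def far_future_headers(now):
    """Build cache headers for an asset whose URL changes whenever its content does.

    `now` must be a timezone-aware datetime in UTC.
    """
    expires = now + timedelta(seconds=TEN_YEARS)
    return {
        "Cache-Control": f"max-age={TEN_YEARS}",
        "Expires": format_datetime(expires, usegmt=True),
    }


headers = far_future_headers(datetime.now(timezone.utc))
```

For responses produced by Django views rather than static files, the `cache_control` decorator in `django.views.decorators.cache` sets the `Cache-Control` header in a similar spirit.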
  10. CACHING IN DJANGO: SERVER-SIDE

    • Full pages
    • Partial pages
    • Objects
    • Queries
  11. FULL-PAGE HTTP CACHING

    Client -> Varnish -> Webserver
    Do HTTP caching, but with your rules. No internet standards necessary!
  12. FULL-PAGE HTTP CACHING

    • Why do it server-side?
      • Invalidation
      • Amount cached
      • Changing cache policies
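The three reasons above all come down to control: the server decides how long a page lives and can drop it on demand. The sketch below is a toy in-process stand-in, not Varnish, and is meant only to show those control points. The class name `PageCache` and its methods are hypothetical.

```python
import time


class PageCache:
    """In-memory full-page cache keyed by URL path, with a TTL and explicit purge."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds      # cache policy lives server-side and can change at will
        self.clock = clock          # injectable so expiry can be tested without sleeping
        self._pages = {}            # path -> (expires_at, body)

    def get_or_render(self, path, render):
        """Return the cached body for `path`, rendering it only on a miss or expiry."""
        entry = self._pages.get(path)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        body = render(path)
        self._pages[path] = (now + self.ttl, body)
        return body

    def purge(self, path):
        """Invalidate one page immediately, something a browser cache cannot be told to do."""
        self._pages.pop(path, None)
```

A caller would wrap its page rendering function, for example `cache.get_or_render("/", render_home)`, and call `cache.purge("/")` whenever the underlying data changes.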
  13. CACHING IN DJANGO: SERVER-SIDE

    • Full pages
    • Partial pages
    • Objects
    • Queries
  14. CACHING IN DJANGO: SERVER-SIDE

    • Full pages
    • Partial pages
    • Objects
    • Queries
  15. OBJECT CACHING

    def get_item_by_id(key):
        # Look up the item in our database
        return session.query(User)\
            .filter_by(id=key)\
            .first()
  16. OBJECT CACHING

    def get_item_by_id(key):
        # Check in cache
        val = mc.get(key)

        # If exists, return it
        if val:
            return val

        # If not, get the val, store it in the cache
        val = session.query(User)\
            .filter_by(id=key)\
            .first()
        mc.set(key, val)
        return val
  17. OBJECT CACHING

    @decorator
    def cache(expensive_func, key):
        # Check in cache
        val = mc.get(key)

        # If exists, return it
        if val:
            return val

        # If not, get the val, store it in the cache
        val = expensive_func(key)
        mc.set(key, val)
        return val
  18. OBJECT CACHING

    @cache
    def get_item_by_id(key):
        # Look up the item in our database
        return session.query(User)\
            .filter_by(id=key)\
            .first()
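The slides leave `mc`, `session`, and `User` undefined, so the decorator cannot be run as shown. Below is a self-contained sketch of the same idea under stated assumptions: `DictCache` is a hypothetical stand-in for a memcached client, its `get` accepts a default (not every real client does), and the database lookup is replaced by a stub. It also uses a sentinel so that cached falsy values such as `0` or `None` still count as hits, and prefixes keys with the function name so two cached functions do not collide.

```python
import functools

_MISSING = object()  # sentinel: distinguishes "not cached" from a cached falsy value


class DictCache:
    """Hypothetical stand-in for a memcached client exposing get() and set()."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


mc = DictCache()


def cache(expensive_func):
    """Cache the result of a one-argument function, keyed by function name and argument."""
    @functools.wraps(expensive_func)
    def wrapper(key):
        cache_key = f"{expensive_func.__name__}:{key}"
        val = mc.get(cache_key, _MISSING)
        if val is not _MISSING:
            return val
        # Miss: do the expensive work once and remember the result.
        val = expensive_func(key)
        mc.set(cache_key, val)
        return val
    return wrapper


lookups = []  # records every call that reaches the "database"


@cache
def get_item_by_id(key):
    lookups.append(key)  # stands in for the database query on the slide
    return {"id": key}
```

Calling `get_item_by_id(1)` twice reaches the stub only once; the second call is served from `mc`.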
  19. CACHING IN DJANGO: SERVER-SIDE

    • Full pages
    • Partial pages
    • Objects
    • Queries
  20. QUERY CACHING

    Diagram: DB client, Query Cache, SQL server, Cached Table Data
    Retrieve results from memory…
    …or from memcached…
    …or cache in the DB itself!
  21. QUERY CACHING

    mysql> select SQL_CACHE count(*) from traces;
    +----------+
    | count(*) |
    +----------+
    |  3135623 |
    +----------+
    1 row in set (0.56 sec)

    mysql> select SQL_CACHE count(*) from traces;
    +----------+
    | count(*) |
    +----------+
    |  3135623 |
    +----------+
    1 row in set (0.00 sec)
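The MySQL session above shows the database caching a result itself. The "retrieve results from memory" option can be sketched on the client side as well. The example below is an illustration only, using SQLite from the standard library rather than MySQL; the class `CachedQueries` and its methods are hypothetical, and invalidation is deliberately coarse (any write drops every cached result).

```python
import sqlite3


class CachedQueries:
    """Cache SELECT results in process memory, keyed by SQL text and parameters."""

    def __init__(self, conn):
        self.conn = conn
        self._results = {}
        self.db_hits = 0  # how many SELECTs actually reached the database

    def select(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self._results:
            self.db_hits += 1
            self._results[key] = self.conn.execute(sql, params).fetchall()
        return self._results[key]

    def execute_write(self, sql, params=()):
        self.conn.execute(sql, params)
        self.conn.commit()
        # Coarse invalidation: a write may change any cached result, so drop them all.
        self._results.clear()


conn = sqlite3.connect(":memory:")
conn.execute("create table traces (id integer primary key)")
queries = CachedQueries(conn)
queries.execute_write("insert into traces default values")
```

Repeating `queries.select("select count(*) from traces")` hits the database once; the next write clears the cache so the following read is fresh.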
  22. WHAT CAN GO WRONG WHEN CACHING?

    • Invalidation
    • Fragmentation
    • Warming
    • Stampedes
    • Operational complexity
  23. CACHE INVALIDATION

    Uncached: Client -> Data Source
    Cached: Client -> Cache Intermediary -> Data Source
    Diagram label: Invalidation
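One common way to handle the invalidation step in the diagram is to delete the cached entry whenever the data source is written. The sketch below assumes that approach; `UserStore` and its two dictionaries are hypothetical stand-ins for the data source and the cache intermediary.

```python
class UserStore:
    """Read-through cache in front of a data source, invalidated on every write."""

    def __init__(self):
        self.db = {}     # stands in for the data source
        self.cache = {}  # stands in for the cache intermediary

    def get(self, user_id):
        if user_id in self.cache:
            return self.cache[user_id]
        value = self.db[user_id]
        self.cache[user_id] = value
        return value

    def update(self, user_id, value):
        self.db[user_id] = value
        # Invalidate: the next read repopulates the cache from the data source.
        self.cache.pop(user_id, None)
```

Any write that bypasses `update` leaves the cached copy stale, which is exactly the failure mode this slide is about.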
  24. CACHE FRAGMENTATION

    “In the beginning there was NCSA Mosaic, and Mosaic called itself NCSA_Mosaic/2.0 (Windows 3.1), and Mosaic displayed pictures along with text, and there was much rejoicing.”
    History of the browser user-agent string
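The slide does not spell out the fix, but one reading of the quote is that keying a cache on the raw User-Agent string splits it into nearly as many entries as there are browser versions. A hedged sketch of one mitigation is to collapse the header into a few coarse buckets before building the cache key. The function names and the two-bucket rule below are assumptions for illustration, not a real device detector.

```python
def device_class(user_agent):
    """Collapse a raw User-Agent string into a coarse bucket for cache keys."""
    ua = user_agent.lower()
    if "mobi" in ua:
        return "mobile"
    return "desktop"


def page_cache_key(path, user_agent):
    """Build a cache key that varies by device class instead of by exact browser."""
    return f"{path}|{device_class(user_agent)}"
```

With this key, two different desktop browsers share one cached page, and the number of variants per URL is bounded by the number of buckets.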
  25. CACHE WARMING

    • On a cache miss extra work is done
    • What if the cache is empty (“cold”)?
      • Use a persistent cache (Redis), or
      • Pre-“warm” your cache, or
      • Ramp up traffic gradually (brute force!)
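The pre-warming option can be as simple as loading a list of known-hot keys before real traffic arrives. A minimal sketch, assuming the cache behaves like a dictionary and that `load` is whatever function normally runs on a miss (both names are hypothetical):

```python
def warm_cache(cache, keys, load):
    """Populate the cache for known-hot keys and return how many were loaded."""
    loaded = 0
    for key in keys:
        if key not in cache:  # skip entries that are already warm
            cache[key] = load(key)
            loaded += 1
    return loaded
```

Where the list of hot keys comes from is application-specific; recent access logs are one plausible source.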
  26. CACHE STAMPEDES

    • On a cache miss extra work is done
    • What if multiple simultaneous misses?
      • Every node tries to do the same work at the same time and your app dies
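One way to blunt a stampede is to let a single caller recompute a missing key while the others wait for its result. The sketch below does this with a per-key lock inside one process; the class name `SingleFlightCache` is hypothetical, and protecting several nodes at once would need a shared lock or a similar mechanism that is not shown here.

```python
import threading


class SingleFlightCache:
    """Let one thread compute a missing key while other threads wait for the result."""

    def __init__(self):
        self._values = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the _locks dictionary itself

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key, compute):
        if key in self._values:
            return self._values[key]
        with self._lock_for(key):
            # Re-check: another thread may have filled the key while we waited.
            if key not in self._values:
                self._values[key] = compute(key)
            return self._values[key]
```

However many threads miss on the same key at once, `compute` runs a single time for that key.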
  27. OPERATIONAL COMPLEXITY

    • What caching scheme?
    • How many extra servers?
    • What happens if they fail?
    • What will you do to debug it?
    • When are you looking at stale data?
  28. TAKEAWAYS

    • The ‘how’ of caching:
      • What are you caching?
      • Where are you caching it?
      • How bad is a cache miss?
      • How and when are you invalidating?
  29. TAKEAWAYS

    • The ‘why’ of caching:
      • Did it actually get faster?
      • Is speed worth extra complexity?
      • Don’t guess – measure!
      • Always use real-world conditions.
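"Don't guess, measure" can start with something as small as timing the same call cold and warm. The sketch below uses `functools.lru_cache` and `time.perf_counter` from the standard library; `slow_square` is a hypothetical stand-in for expensive work, and a micro-benchmark like this is no substitute for measuring under real traffic.

```python
import functools
import time


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def slow_square(n):
    time.sleep(0.01)  # stands in for expensive work
    return n * n


cached_square = functools.lru_cache(maxsize=None)(slow_square)

cold_result, cold_seconds = timed(cached_square, 12)  # miss: pays the full cost
warm_result, warm_seconds = timed(cached_square, 12)  # hit: served from memory
```

`cached_square.cache_info()` reports hits and misses, which helps confirm that the speedup really comes from the cache.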
  30. RESOURCES

    • Django documentation on caching: https://docs.djangoproject.com/en/dev/topics/cache/
    • Varnish caching, via Disqus: http://blog.disqus.com/post/62187806135/scaling-django-to-8-billion-page-views
    • Django cache option comparisons: http://codysoyland.com/2010/jan/17/evaluating-django-caching-options/
    • More Django-specific tips: http://www.slideshare.net/csky/where-django-caching-bust-at-the-seams
    • Guide to cache-related HTTP headers: http://www.mobify.com/blog/beginners-guide-to-http-cache-headers/
  31. THANK YOU!

    I’m @jmeickle, and I work for @AppNeta in Boston! We love Python, so come work for us :)

    Aug 18-19: DevOpsDays Boston
    Sep 15-17: Velocity New York
    Sep 18: WebPerfDays New York
    Oct 21: TechBreakfast New York
    Nov 11-14: AWS re:Invent (Vegas!)

    Or, just try us out: TraceView, http://www.appneta.com/products/traceview/