15-437 Large Scale Webapps

ThierrySans
November 10, 2013


Transcript

  1. Users respond to speed
    • “Amazon found every 100ms of latency cost them 1% in sales”
    • “Google found an extra .5 seconds in search page generation time dropped traffic by 20%”
    • “A broker could lose $4 million in revenues per millisecond if their electronic trading platform is 5 milliseconds behind the competition”
    http://blog.gigaspaces.com/amazon-found-every-100ms-of-latency-cost-them-1-in-sales/
  2. How to serve millions
    • HTML/JS compression (a.k.a. minifier)
    • Optimize code (JavaScript / server)
    • Web caching
    • Scaling over multiple servers
  3. Two ways to do web cache
    Where to do it?              How to do it?       Why to do it?
    At the architecture level    HTTP proxy cache    Static content
    At the program level         Memory cache        Dynamic content
  4. HTTP caching with a proxy server (for static content)
    [Diagram: HTTP proxy, Web App, Static Files]
    Cache repeated HTTP requests for a given time
    ✗ Bad for dynamic content (updates are not visible until the cached copy expires)
    ✓ Good for static content (JavaScript, CSS, media, static HTML)
    ➡ Popular HTTP proxies: Squid and Varnish
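Caching a response "for a given time" can be sketched as a small time-to-live (TTL) cache. This is only an illustration of the idea, not how Squid or Varnish are implemented; the class name `TTLCache`, the `origin` callback and the injectable `clock` are all assumptions made for the example.

```python
import time


class TTLCache:
    """Toy time-based cache (hypothetical names, not a real proxy API)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the behaviour can be tested
        self.store = {}         # url -> (expiry_time, body)

    def fetch(self, url, origin):
        """Return the cached body for url; call origin(url) on miss or expiry."""
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]     # still fresh: serve from the cache
        body = origin(url)      # miss or expired: ask the web app
        self.store[url] = (now + self.ttl, body)
        return body
```

Until the entry expires, every request for the same URL is served without reaching the web app, which is also why a change to the content stays invisible for up to `ttl_seconds`.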
  5. Fine-grained caching with the web application (for dynamic content)
    [Diagram: HTTP proxy, Web App, Memory Cache, Static Files]
    Cache controlled by the program
    ✗ Specific to each app
    ✓ Good for dynamic content
    ➡ Popular memory cache: Memcached
  6. What to put in the cache to improve performance?
    Processing the request means:
    1. Parse the HTTP request
    2. Map the URL to the handler
    3. Query the database
    4. Compute the view
    DB access is expensive (time, and money when your host charges you for DB access)
  7. Distributed Shared Cache: Memcached
    http://memcached.org/
    • Store key/value pairs in memory
    • Throw away data that is the least recently used
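The "least recently used" policy can be illustrated with a few lines of Python. This is a sketch of the eviction rule only; Memcached's real memory management is more involved, and the class name `LRUCache` and its `capacity` parameter are assumptions for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Toy in-memory key/value store with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # ordered from oldest to most recent use

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # reading a key makes it most recent
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used key
```

With a capacity of 2, storing `a` and `b`, reading `a`, then storing `c` evicts `b`, because `a` was touched more recently.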
  8. A typical cache algorithm
    retrieve from cache
    if data not in cache:    # cache miss
        query the database
        update the cache
    return result
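The algorithm above can be written as a short runnable function. This is a minimal sketch that uses a plain dict as the cache; `get_with_cache` and `query_database` are hypothetical names, and it assumes `None` is never a legitimate cached value.

```python
def get_with_cache(key, cache, query_database):
    """Return the value for key, querying the database only on a cache miss."""
    data = cache.get(key)            # retrieve from cache
    if data is None:                 # cache miss
        data = query_database(key)   # query the database
        cache[key] = data            # update the cache
    return data
```

Calling it twice with the same key triggers one database query; the second call is answered from the cache.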
  9. Activate the cache (simpsonsapp/settings.py)
    ....
    CACHES = {
        'default': {
            'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
            'LOCATION': '127.0.0.1:11211',
        }
    }
    ....
    You can have as many locations as needed (see distributed shared cache)
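A configuration with several locations might look like the fragment below. Django accepts a list of `host:port` strings for `LOCATION` with its Memcached backends; the two addresses are hypothetical placeholders, and the backend string is kept as on the slide (newer Django releases ship other Memcached backend classes, such as `PyMemcacheCache`, so the exact string depends on the version in use).

```python
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': [
            '10.0.0.1:11211',  # hypothetical Memcached server
            '10.0.0.2:11211',  # hypothetical Memcached server
        ],
    }
}
```

The application still sees one logical cache; the client library decides which server holds each key.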
  10. Caching (api/views.py)
    ....
    def getEntries():
        cache_key = 'index'
        data = cache.get(cache_key)
        if data is None:  # cache miss: query the database and fill the cache
            print("*********** Database Request ***********")
            entry_list = Entry.objects.all()
            data = serializers.serialize('json', entry_list)
            cache.set(cache_key, data, None)  # timeout None: the entry never expires
        return HttpResponse(data, content_type="application/json")
    ....
  11. Cache Invalidation (api/views.py)
    ....
    def addCharacter(request):
        param = json.loads(request.body)
        e = Entry(firstname=param['firstname'],
                  w_url=param['w_url'],
                  img_url=param['img_url'])
        e.save()
        cache.delete('index')  # invalidate: the next read misses and queries the database
        return getEntries()
    ....
  12. Cache Stampede (a.k.a. dog piling)
    [Diagram: Web App, Cache, cache miss!]
    Problem: multiple concurrent requests run the same database query because the cache was cleared
    Solution:
    • update the cache instead of clearing it after an insert
    • a page view will never query the database
    ➡ Requires cache warming
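The stampede can be reproduced with a small thread simulation. This is a sketch under stated assumptions: a dict stands in for the cache, a counter stands in for the database, and a barrier forces the worst-case interleaving in which every request reads the cache before any of them refills it. The function name `simulate` is hypothetical.

```python
import threading


def simulate(n_requests, cache):
    """Run n_requests concurrent page views; return how many hit the database."""
    db_queries = 0
    lock = threading.Lock()
    # Worst case: every request reads the cache before any request refills it.
    all_have_read = threading.Barrier(n_requests)

    def page_view():
        nonlocal db_queries
        data = cache.get('index')
        all_have_read.wait()
        if data is None:                      # cache miss
            with lock:
                db_queries += 1               # stands in for a database query
                cache['index'] = 'cached JSON'

    threads = [threading.Thread(target=page_view) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return db_queries
```

With a cleared cache, `simulate(8, {})` counts eight database queries for the same data; with a warmed cache, `simulate(8, {'index': 'cached JSON'})` counts none.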
  13. Cache Update (api/views.py)
    ....
    def warmCache():
        print("*********** cache warming ***********")
        cache_key = 'index'
        entry_list = Entry.objects.all()
        data = serializers.serialize('json', entry_list)
        cache.set(cache_key, data, None)  # timeout None: the entry never expires

    def getEntries():
        cache_key = 'index'
        data = cache.get(cache_key)
        if data is None:
            print("*********** init only ***********")
            warmCache()
            data = cache.get(cache_key)
        return HttpResponse(data, content_type="application/json")
    ....
  14. Cache Warming (api/views.py)
    ....
    def addCharacter(request):
        param = json.loads(request.body)
        e = Entry(firstname=param['firstname'],
                  w_url=param['w_url'],
                  img_url=param['img_url'])
        e.save()
        warmCache()  # refresh the cache right away instead of deleting the entry
        return getEntries()
    ....
  15. Serving multiple apps with a load balancer
    [Diagram: Load Balancer, HTTP proxy, Web App (×3) …, Memcached (×3)]
    This is not an efficient cache
  16. Distributed Shared Cache
    [Diagram: Load Balancer, HTTP proxy, Web App (×3) …, Memcached (×3) …]
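One way to picture a distributed shared cache is that every web app uses the same rule to map a key to a server, so they all look for a given key in the same place. The sketch below uses a simple hash-modulo rule; it is an assumption for illustration, since real Memcached client libraries may use other schemes such as consistent hashing. The function name `pick_server` and the server addresses are hypothetical.

```python
import zlib


def pick_server(key, servers):
    """Map a cache key to one server; identical on every web app instance."""
    index = zlib.crc32(key.encode('utf-8')) % len(servers)
    return servers[index]


# Hypothetical pool shared by all web app instances.
SERVERS = ['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211']
```

Because the mapping depends only on the key and the server list, an entry written by one web app is found by the others. The trade-off of the modulo rule is that adding or removing a server remaps most keys.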
  17. Distributed Databases
    [Diagram: Load Balancer, HTTP proxy, Web App (×3) …, Memcached (×3) …, databases …]
  18. High-Performance Software
    Load Balancer            HAProxy
    Web Server               Nginx
    HTTP proxy cache         Squid / Varnish
    Memory Cache             Memcached
    Configuration Manager    ZooKeeper