pioneer ▸ Logical Design of a Digital Computer with Multiple Asynchronous Rotating Drums and Automatic High Speed Memory Operation (1956) ▸ Automated organization and swapping of pages
IBM System 360 Model 85 introduces CPU cache ▸ 1982 - Motorola 68010 features on-board instruction cache ▸ 1987 - Motorola 68030 features a 256-byte data cache ▸ 1989 - Intel 486 features split cache with on-die L1 and on mainboard L2 ▸ 1995 - Intel Pentium Pro has L1 and L2 on die ▸ 2013 - Intel Haswell MA has 3 caches with shared L4 for CPU & on-board GPU
of data (arrays are allocated in row-major order) ▸ When A[i][j] is referenced, nearby memory addresses are brought into cache ▸ Code optimized to use cache has the potential to be MUCH faster. ▸ This is often what internals devs are talking about when considering whether or not some object is “in cache”
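The locality point can be made concrete with a little address arithmetic. The sketch below is an illustration only: it models a C-style row-major array (PHP's own arrays are hash tables and are not laid out this way), and the 8-byte element size, 64-byte cache line, and helper names are assumptions chosen for the example.

```php
<?php
// Assumptions for illustration: 8-byte elements, 64-byte cache lines, base address 0.
const ELEMENT_SIZE = 8;
const CACHE_LINE_SIZE = 64;

/** Byte offset of A[$i][$j] in a row-major array with $cols columns. */
function rowMajorOffset(int $i, int $j, int $cols): int
{
    return ($i * $cols + $j) * ELEMENT_SIZE;
}

/** Counts how often consecutive accesses land on a different cache line. */
function countLineChanges(array $offsets): int
{
    $changes = 0;
    $previousLine = null;
    foreach ($offsets as $offset) {
        $line = intdiv($offset, CACHE_LINE_SIZE);
        if ($line !== $previousLine) {
            $changes++;
            $previousLine = $line;
        }
    }
    return $changes;
}

$rows = 64;
$cols = 64;
$rowWise = [];
$columnWise = [];
for ($i = 0; $i < $rows; $i++) {
    for ($j = 0; $j < $cols; $j++) {
        $rowWise[] = rowMajorOffset($i, $j, $cols);    // inner loop walks along a row
        $columnWise[] = rowMajorOffset($j, $i, $cols); // inner loop walks down a column
    }
}

$rowChanges = countLineChanges($rowWise);       // 512: a new line every 8 elements
$columnChanges = countLineChanges($columnWise); // 4096: a new line on every access
printf("row-wise: %d line changes, column-wise: %d\n", $rowChanges, $columnChanges);
```

Under these assumptions the same 4096 elements touch a new cache line eight times more often when walked column by column, which is the gap that cache-aware code tries to close.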
- Theoretical algorithm that always discards the object which will not be accessed for the longest time. ▸ LRU - Least Recently Used objects are discarded first. Typically uses 2 bits per object to record “age” ▸ PLRU - LRU algorithm that sacrifices miss ratio for lower latency and lower power requirements. Typical implementations use 1 bit per object. Common in CPU caches. ▸ MRU - Most Recently Used objects are discarded first. Useful for cyclical or random access patterns over large sets of data. ▸ Random Replacement - Objects are discarded at random. Used in RISC platforms such as ARM as no housekeeping bits are required.
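As a rough illustration of the LRU policy listed above, here is a minimal software sketch. It relies on PHP arrays preserving insertion order rather than on per-object age bits, so it is not how a CPU cache does it; the class name `LruCache` and its methods are hypothetical.

```php
<?php
final class LruCache
{
    /** @var array<string, mixed> ordered from least to most recently used */
    private $items = [];
    /** @var int maximum number of entries */
    private $capacity;

    public function __construct(int $capacity)
    {
        if ($capacity < 1) {
            throw new InvalidArgumentException('Capacity must be at least 1');
        }
        $this->capacity = $capacity;
    }

    public function has(string $key): bool
    {
        return array_key_exists($key, $this->items);
    }

    /** Returns the cached value (or null on a miss) and marks it most recently used. */
    public function get(string $key)
    {
        if (!$this->has($key)) {
            return null;
        }
        $value = $this->items[$key];
        unset($this->items[$key]);   // remove from its current position...
        $this->items[$key] = $value; // ...and re-append at the "most recent" end
        return $value;
    }

    /** Stores a value, evicting the least recently used entry when full. */
    public function set(string $key, $value): void
    {
        unset($this->items[$key]);
        $this->items[$key] = $value;
        if (count($this->items) > $this->capacity) {
            unset($this->items[array_key_first($this->items)]);
        }
    }
}

$cache = new LruCache(2);
$cache->set('a', 1);
$cache->set('b', 2);
$cache->get('a');    // 'a' is now the most recently used entry
$cache->set('c', 3); // cache is full, so 'b' (least recently used) is evicted
```

Swapping `array_key_first` for `array_key_last` in `set` (applied before the new entry is appended) would turn this into an MRU policy, which shows how small the difference between the two is in code.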
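[Flowchart steps: in cache?, DB fetch, write to cache, return data, end]
<?php
class DataRepository
{
    private $cache;
    private $db;

    // Both collaborators are injected; any objects exposing get()/set()
    // and fetchByKey() respectively will do.
    public function __construct($cache, $db)
    {
        $this->cache = $cache;
        $this->db = $db;
    }

    public function getDataByKey($key)
    {
        // Try the cache first.
        $data = $this->cache->get($key);
        if (null === $data) {
            // Cache miss: fetch from the database and populate the cache.
            $data = $this->db->fetchByKey($key);
            $this->cache->set($key, $data);
        }
        return $data;
    }
}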
$ttl);
$m->setMultiByKey($server_key, $items, $ttl);
[Diagram: Memcached client computes hash of key ▸ Hash is used to compute server affinity ▸ Serialized object is stored on appropriate server]
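The hash-to-server step can be sketched in a few lines. This is an assumption-laden illustration, not the Memcached client's actual algorithm: `crc32()` stands in for whatever hash the client uses, plain modulo stands in for its distribution strategy, and the server names and `pickServer` helper are hypothetical.

```php
<?php
/**
 * Picks a server for a key using simple modulo distribution.
 * crc32() is a stand-in for the client's hash function (assumption).
 */
function pickServer(string $key, array $servers): string
{
    if ($servers === []) {
        throw new InvalidArgumentException('No servers configured');
    }
    $servers = array_values($servers);  // ensure 0-based indexes
    $hash = crc32($key) & 0x7fffffff;   // keep the value non-negative on 32-bit builds
    return $servers[$hash % count($servers)];
}

$servers = ['cache-1:11211', 'cache-2:11211', 'cache-3:11211']; // hypothetical hosts
$server = pickServer('user:42', $servers);

// The value is serialized before it is sent to the chosen server.
$payload = serialize(['id' => 42, 'name' => 'example']);
printf("user:42 -> %s (%d bytes serialized)\n", $server, strlen($payload));
```

The same key always lands on the same server, but with plain modulo most keys change servers when the server count changes, which is the problem consistent hashing addresses.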
-> http://www.paperplanes.de/2011/12/9/the-magic-of-consistent-hashing.html ▸ Nodes claim multiple partitions of hash key space (e.g. the 160-bit space of SHA-1, 2^160 values) ▸ Keys are hashed and assigned to the appropriate node ▸ Adding/removing a node only requires remapping K/n keys on average (K keys, n nodes)
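A minimal hash ring makes the remapping property easier to see. This is a sketch under stated assumptions rather than any particular client's implementation: the `HashRing` class, the 64 virtual points per node, the node names, and the use of only the first 7 hex digits of SHA-1 (to stay inside a 32-bit integer) are all choices made for the example.

```php
<?php
final class HashRing
{
    /** @var array<int, string> ring position => node name */
    private $ring = [];
    /** @var int[] ring positions in ascending order */
    private $positions = [];
    /** @var int virtual points claimed by each node */
    private $replicas;

    public function __construct(int $replicas = 64)
    {
        $this->replicas = $replicas;
    }

    /** Each node claims several points (partitions) on the ring. */
    public function addNode(string $node): void
    {
        for ($i = 0; $i < $this->replicas; $i++) {
            $this->ring[$this->hash($node . '#' . $i)] = $node;
        }
        ksort($this->ring);
        $this->positions = array_keys($this->ring);
    }

    /** A key belongs to the first node point at or after its hash, wrapping around. */
    public function nodeFor(string $key): string
    {
        if ($this->positions === []) {
            throw new RuntimeException('The ring has no nodes');
        }
        $hash = $this->hash($key);
        $low = 0;
        $high = count($this->positions);
        while ($low < $high) { // binary search for the first position >= $hash
            $mid = intdiv($low + $high, 2);
            if ($this->positions[$mid] < $hash) {
                $low = $mid + 1;
            } else {
                $high = $mid;
            }
        }
        $position = $this->positions[$low % count($this->positions)];
        return $this->ring[$position];
    }

    private function hash(string $value): int
    {
        // Truncated SHA-1 (28 bits) keeps positions inside PHP's integer range.
        return (int) hexdec(substr(sha1($value), 0, 7));
    }
}

$ring = new HashRing();
foreach (['cache-a', 'cache-b', 'cache-c'] as $node) {
    $ring->addNode($node);
}

$before = [];
for ($i = 0; $i < 1000; $i++) {
    $before["key:$i"] = $ring->nodeFor("key:$i");
}

$ring->addNode('cache-d');
$moved = 0;
$movedElsewhere = 0;
foreach ($before as $key => $oldNode) {
    $newNode = $ring->nodeFor($key);
    if ($newNode !== $oldNode) {
        $moved++;
        if ($newNode !== 'cache-d') {
            $movedElsewhere++;
        }
    }
}
printf("%d of %d keys moved, %d to a node other than cache-d\n", $moved, count($before), $movedElsewhere);
```

Every key that changes owner moves to the new node and the rest stay put, so the number of moved keys should come out near K/n rather than near K.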
Sanfilippo in 2009 ▸ Modeled as a Data Structure Server ▸ Hashes ▸ Sets (Sorted and Unsorted) ▸ Lists ▸ Strings ▸ High-performance in-memory key-value store with optional persistence
key-value store ▸ Redis, among other things, is a fast, volatile key-value store ▸ Redis supports persistence (and thus can be used for “less” volatile data such as sessions) ▸ Redis allows storing native data structures without serialization ▸ Redis implementations generally use consistent hashing to distribute keys
in-memory caches are difficult in PHP because PHP does not persist state between requests ▸ APC (defunct) ▸ APCu (beta) ▸ Given the speed of modern distributed caches, use cases are limited
output on request or based on a publish event ▸ Output is written to shared storage ▸ Web servers deliver content from shared storage. Application servers are isolated from traffic. ▸ Breaks down completely with large/complex applications
[Diagram: three origin servers, shared high-speed storage, two app servers]
Sysoev in 2002 ▸ Streamlined web server optimized for highly concurrent, low-overhead HTTP content delivery ▸ Particularly optimized for static file delivery ▸ Designed to proxy over HTTP, uwsgi, FastCGI (can be used as a load balancer) ▸ Can be configured to generate and maintain a file-based cache of output from external origins (over network/gateway protocols)
by Poul-Henning Kamp for Verdens Gang ▸ Uses memory heap allocation to minimize IO ▸ Optimizations are focused on eliminating system calls ▸ Algorithms to deliver requests to threads most likely to have objects cached in L1/L2
    # we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return(synth(403,"Forbidden"));
        }
        return(purge);
    }
    set req.backend_hint = vdir.backend();
    if (req.url ~ "wp-admin|wp-login") {
        return (pass);
    }
    unset req.http.cookie;
}
4.9 Billion devices currently connected to the Internet ▸ The network of CDNs, proxies, gateways, and browsers constitutes the single largest distributed cache ever created ▸ They (mostly) speak a common language!
are designed to allow origins to communicate cache parameters to clients and proxies ▸ Directives dictate who should or shouldn’t cache, for how long objects should be considered fresh, and set revalidation policies
intermediary caches if a response is specific to the end user or not (THIS IS NOT A SECURITY FEATURE) ▸ no-cache [=<header>] - Without a header, tells caches that they must revalidate with the origin before reusing a stored response (typically by comparing validators such as ETags). With a header provided, this tells caches that they may store the object as long as they strip out the specified header. ▸ no-store - Directs caches to never store this object under any circumstances.
for how long an object can be considered fresh ▸ s-maxage <seconds> - Max-age for shared caches (CDNs/CRPs). These caches will generally respect s-maxage over max-age ▸ must-revalidate - Tells caches that once an object becomes stale they must revalidate it with the origin before reuse and never serve stale data, even if otherwise configured to serve stale content. ▸ proxy-revalidate - must-revalidate for shared caches
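To show how these directives combine on the wire, here is a small sketch that assembles a `Cache-Control` value. The helper `buildCacheControl` and the chosen lifetimes are hypothetical; `public`, `private`, `max-age`, `s-maxage`, and `must-revalidate` are the standard directive names.

```php
<?php
/**
 * Builds a Cache-Control header value from a few common directives.
 * Hypothetical helper for illustration; it covers only a subset of directives.
 */
function buildCacheControl(
    string $visibility,
    int $maxAge,
    ?int $sharedMaxAge = null,
    bool $mustRevalidate = false
): string {
    if (!in_array($visibility, ['public', 'private'], true)) {
        throw new InvalidArgumentException('Visibility must be "public" or "private"');
    }
    $directives = [$visibility, 'max-age=' . $maxAge];
    if ($sharedMaxAge !== null) {
        $directives[] = 's-maxage=' . $sharedMaxAge; // applies to shared caches only
    }
    if ($mustRevalidate) {
        $directives[] = 'must-revalidate';
    }
    return implode(', ', $directives);
}

// An article page: browsers keep it for a minute, shared caches for an hour.
$articleHeader = buildCacheControl('public', 60, 3600);
// → "public, max-age=60, s-maxage=3600"

// An account page: per-user, always revalidated once stale.
$accountHeader = buildCacheControl('private', 0, null, true);
// → "private, max-age=0, must-revalidate"

// In a web request this would be sent with:
// header('Cache-Control: ' . $articleHeader);
```

Giving shared caches a longer lifetime than browsers is a common pattern when the origin can purge the CDN or proxy but has no way to purge a browser.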
date/time for an object. Largely superseded by max-age ▸ etag - “Entity tag,” usually a hash of the object or hash of the object’s last modified time, used to check freshness ▸ vary <header> - Informs caches that they can store one version of content per distinct version of <header>. For example, cache one version per User-Agent ▸ pragma - Deprecated
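Revalidation with an entity tag can be sketched as follows. This is a simplified illustration, assuming the ETag is a SHA-1 hash of the response body and that the client echoes it back in `If-None-Match`; the function names and the array-shaped response are hypothetical, and real servers handle more edge cases.

```php
<?php
/** Derives a quoted entity tag from the response body (assumed scheme: SHA-1 of the body). */
function makeEtag(string $body): string
{
    return '"' . sha1($body) . '"';
}

/**
 * Returns a 304 with an empty body when the client's If-None-Match value
 * matches the current ETag, otherwise a full 200 response.
 */
function respond(string $body, ?string $ifNoneMatch): array
{
    $etag = makeEtag($body);
    $candidates = $ifNoneMatch === null ? [] : array_map('trim', explode(',', $ifNoneMatch));
    foreach ($candidates as $candidate) {
        // If-None-Match uses weak comparison, so a W/ prefix is ignored.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === '*' || $candidate === $etag) {
            return ['status' => 304, 'headers' => ['ETag' => $etag], 'body' => ''];
        }
    }
    return ['status' => 200, 'headers' => ['ETag' => $etag], 'body' => $body];
}

$first = respond('hello world', null);                        // first visit: full response
$second = respond('hello world', $first['headers']['ETag']);  // unchanged: 304, no body
$third = respond('hello again', $first['headers']['ETag']);   // changed: full response
printf("%d, %d, %d\n", $first['status'], $second['status'], $third['status']);
```

The 304 path still costs a round trip to the origin, but it avoids retransmitting the body, which is why validators pair well with short max-age values.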
edgier - much of our growth is happening in developing markets ▸ User agent diversity is increasing dramatically (mobile dominance) ▸ Content collection definitions are less and less deterministic, requiring more flexibility in search and query ops
high-performance NoSQL document store that features ▸ High-availability via clustering ▸ Rack/Datacentre-aware sharding ▸ Expressive & dynamic query DSL
[Diagram: CLUSTER (AP-NRT), GLOBAL MESSAGING BUS, CONTENT MANAGEMENT SERVICE]
function (event, callback) {
    var index = 'posts';
    var type = 'post';
    var id = event['id'];
    if (!id) {
        return callback('Invalid post object received');
    }
    indexRecord(index, type, id, event, callback);
},
SEARCHABLE DOCUMENT CACHE CONTENT CACHE DATA CACHE GLOBAL MESSAGING BUS ▸ EXPOSED TO END-USER LOAD ▸ CAN BE LOCATED NEAR THE EDGE ▸ SELF-CONTAINED ORIGIN ▸ SLOWEST COMPONENTS ▸ MINIMAL LOAD ▸ CAN BE CENTRALIZED
operations in your application ▸ Optimizing the most common operations in your application ▸ Minimizing the distance between where data lives and where data is used
don’t care about your users’ experience ▸ When you have infinite money to waste on compute time ▸ When you don’t care how much carbon you pump into the atmosphere