
Caching

An obvious solution to getting a faster & more scalable website is to introduce caching.

When should you use a cache? Why is cache invalidation one of the hard things in computer science?

Let’s discuss key-value stores (like Memcached & Redis): how to use them in your application, the problems they solve & the ones they bring.

Matthias Mullie

April 18, 2017

Transcript

  1. Caching » Intro Examples
    • CPU cache (L1, L2)
    • Browser cache
    • Proxy servers (e.g. Varnish)
    • SQL query cache
    • Key-value stores (e.g. Memcached)
    • …
  2. Caching » Intro What to cache?
    Repeated slow/intensive operations. E.g.:
    • Results of expensive processing (CPU)
    • Results from slow external sources (I/O)
  3. Caching » Intro What not to cache?
    • Frequently changing data
    • Barely any (repeat) traffic
    • Fast & easy data (e.g. most queries, if indexed correctly)
  4. Caching » Intro Caution!
    Caching = increases complexity = more dev $$$ = more bugs
    Don’t cache if you don’t anticipate traffic. When you do, design for it early on.
  5. Caching » Practical Example: no cache
    function expensiveOperation() {
        sleep(5);
        return array('your data');
    }
    $data = expensiveOperation();
  6. Caching » Practical Example: no cache
    Same code as the previous slide, now labelled: BAD!
  7. Caching » Practical Example: no cache
    • It always takes 5 seconds!
    • While that operation executes, the worker can’t process other requests
  8. Caching » Practical Example: no cache
    But… but… I have powerful servers!
    • They cost a lot! (wasted resources)
  9. Caching » Practical Example: no cache
    But… but… I have powerful servers!
    • They cost a lot! (wasted resources)
    • What about unexpected peaks?
  10. Caching » Practical Example: no cache
    The main reason for caching (or any optimization) is usually not raw speed.
  11. Caching » Practical Example: no cache
    The main reason for caching (or any optimization) is usually not raw speed. Scalability!
  12. Caching » Practical Example: cache
    $cache = new \Memcached();
    $cache->addServer('localhost', 11211);
    $data = $cache->get('key');
    if ($data === false) {
        $data = expensiveOperation();
        $cache->set('key', $data);
    }
  13. Caching » Practical Example: cache
    Bootstrap (new \Memcached(), addServer()): connect to the cache server. Memcached, in this case, with PECL’s Memcached API.
  14. Caching » Practical Example: cache
    $cache->get('key'): get the data from the cache. Don’t forget that this round trip to the cache server also takes a few ms!
  15. Caching » Practical Example: cache
    if ($data === false): check if we actually retrieved data from the cache (PECL’s Memcached::get() returns false on a miss). If we did, we’re done. We don’t need to perform that expensive operation, we already have the result!
  16. Caching » Practical Example: cache
    $data = expensiveOperation(): too bad, data was not (yet) in cache. Guess we’ll have to perform that expensive operation anyway…
  17. Caching » Practical Example: cache
    $cache->set('key', $data): finally, after having performed the expensive operation, store the result to cache, so we don’t have to do that again the next time!
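The get / check / compute / set steps walked through above are often wrapped in one helper. The following is a minimal sketch of that idea, not code from the deck: `ArrayCache` and `remember()` are hypothetical names, and the in-memory class only stands in for a real cache client so the example runs without a Memcached server.

```php
<?php
// Hypothetical in-memory stand-in for a cache client (assumption, not a real
// library). Like PECL Memcached, get() returns false on a miss.
final class ArrayCache
{
    private $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): bool
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Cache-aside helper (hypothetical name): return the cached value, or compute
// it once, store it, and return it.
function remember(ArrayCache $cache, string $key, callable $compute)
{
    $data = $cache->get($key);
    if ($data === false) {
        $data = $compute();
        $cache->set($key, $data);
    }
    return $data;
}

$calls = 0;
$expensiveOperation = function () use (&$calls) {
    $calls++; // count how often the slow path actually runs
    return ['your data'];
};

$cache  = new ArrayCache();
$first  = remember($cache, 'key', $expensiveOperation); // miss: computes
$second = remember($cache, 'key', $expensiveOperation); // hit: served from cache
```

Only the first call pays for the expensive operation; the second is answered from the cache.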
  18. Caching » Practical Drawbacks
    • Application got more complex
        • Another (cache) server to maintain
        • A new library (PECL Memcached)
        • A few extra lines of code
    New library + new server also means more potential for security issues.
  19. Caching » Practical Drawbacks
    • Application got more complex
    • The first request got slower
        • Still had to execute the operation
        • + check if item was in cache
        • + store the result to cache
  20. Caching » Practical Drawbacks
    • Application got more complex
    • The first request got slower
    • Data from cache may be stale
        • I’ll get back to this!
  21. Caching » Practical Caching is extra work
    • Consider cache misses
    • Consider data consistency
    • Keeping all parts running
  22. Caching » Practical Caching saves $$
    • Less wasted machine resources
    • Faster response times
    • Scales better, more reliable
  23. Caching » Data consistency Race conditions
    Request 1:
    $data = $cache->get('article-1');
    // set status to hidden
    $data['status'] = 'hidden';
    // save to database
    $db->prepare('UPDATE articles SET status = ? WHERE id = ?')->execute([$data['status'], $data['id']]);
    // store to cache
    $cache->set('article-1', $data);

    Request 2, same time:
    $data = $cache->get('article-1');
    // update title
    $data['title'] = 'new-title';
    // save to database
    $db->prepare('UPDATE articles SET title = ? WHERE id = ?')->execute([$data['title'], $data['id']]);
    // store to cache
    $cache->set('article-1', $data);
  24. Caching » Data consistency Race conditions
    • What did we end up storing to cache?
    • Whatever we ended up storing to cache is inconsistent with the DB! The DB has both changes, the cache has only one:
        • Either status === 'hidden',
        • Or title === 'new-title'
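To make the interleaving concrete, here is a small single-process simulation of the two requests. It is only a sketch: plain arrays stand in for the database row and the cache entry, and the initial values ('old-title', 'visible') are assumptions chosen for illustration.

```php
<?php
// Plain arrays stand in for the database row and the cache (assumption).
$db    = ['id' => 1, 'title' => 'old-title', 'status' => 'visible'];
$cache = ['article-1' => $db];

// Both requests read the cached article before either one writes.
$request1 = $cache['article-1'];
$request2 = $cache['article-1'];

// Request 1 hides the article: it updates one column, then overwrites the cache.
$request1['status'] = 'hidden';
$db['status'] = $request1['status'];
$cache['article-1'] = $request1;

// Request 2 renames the article, working from its now stale copy.
$request2['title'] = 'new-title';
$db['title'] = $request2['title'];
$cache['article-1'] = $request2;

// The database holds both changes; the cache lost request 1's change.
$consistent = ($cache['article-1'] == $db);
```

With this ordering the database ends up hidden and renamed, while the cached copy is renamed but still visible. Swapping the order of the two final writes loses the title instead.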
  25. Caching » Data consistency Atomic operations!
    Increment/decrement:
    $cache->increment('count');
    instead of
    $count = $cache->get('count');
    $cache->set('count', $count + 1);
  26. Caching » Data consistency Atomic operations!
    Conditional depending on existing:
    $cache->replace('key', $replacement);
    instead of
    $data = $cache->get('key');
    if ($data !== false) {
        $cache->set('key', $replacement);
    }
  27. Caching » Data consistency Atomic operations!
    Conditional depending on non-existing:
    $cache->add('key', $replacement);
    instead of
    $data = $cache->get('key');
    if ($data === false) {
        $cache->set('key', $replacement);
    }
  28. Caching » Data consistency Atomic operations!
    Conditional depending on current data:
    $cache->get('key', null, $token);
    $cache->cas($token, 'key', 'new-value');
    instead of
    $cache->get('key');
    $cache->set('key', 'new-value');
    The CAS operation will fail if 'key' was changed since it was fetched (detected by comparing $token).
    (The by-reference $token argument is the pecl memcached 2.x signature; 3.x returns the token through the \Memcached::GET_EXTENDED flag instead.)
  29. Caching » Data consistency Atomic operations!
    These examples were Memcached.
    These kinds of indivisible operations can be done with other cache engines too, but sometimes differently (e.g. locking).
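A check-and-set operation is usually paired with a retry loop: read the value with its token, modify it, and try again if someone else wrote in between. The sketch below illustrates that loop under stated assumptions: `VersionedCache`, `getWithToken()` and `update()` are hypothetical names, and the in-memory store only imitates the token behaviour of a real cache backend.

```php
<?php
// Hypothetical in-memory store with per-key version tokens (assumption),
// imitating the get-with-token / cas pair shown on the slide.
final class VersionedCache
{
    private $values = [];
    private $tokens = [];

    // Returns [value, token], or null when the key is missing.
    public function getWithToken(string $key)
    {
        if (!array_key_exists($key, $this->values)) {
            return null;
        }
        return [$this->values[$key], $this->tokens[$key]];
    }

    public function set(string $key, $value): void
    {
        $this->values[$key] = $value;
        $this->tokens[$key] = ($this->tokens[$key] ?? 0) + 1;
    }

    // Writes only if the key still carries the token that was read.
    public function cas(int $token, string $key, $value): bool
    {
        if (($this->tokens[$key] ?? null) !== $token) {
            return false;
        }
        $this->set($key, $value);
        return true;
    }
}

// Retry the read-modify-write until nobody changed the key in between.
function update(VersionedCache $cache, string $key, callable $modify, int $maxAttempts = 5): bool
{
    for ($i = 0; $i < $maxAttempts; $i++) {
        $found = $cache->getWithToken($key);
        if ($found === null) {
            return false;
        }
        list($value, $token) = $found;
        if ($cache->cas($token, $key, $modify($value))) {
            return true;
        }
    }
    return false;
}

$cache = new VersionedCache();
$cache->set('article-1', ['title' => 'old-title', 'status' => 'visible']);

// Simulate a concurrent writer that changes the title during the first attempt.
$interfered = false;
$ok = update($cache, 'article-1', function (array $article) use ($cache, &$interfered) {
    if (!$interfered) {
        $interfered = true;
        $other = $article;
        $other['title'] = 'new-title';
        $cache->set('article-1', $other);
    }
    $article['status'] = 'hidden';
    return $article;
});
$final = $cache->getWithToken('article-1')[0];
```

The first attempt fails because the token changed; the second attempt re-reads the renamed article and applies the status change on top of it, so neither update is lost.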
  30. Caching » Data consistency Stale data
    Request 1:
    $data = $cache->get('article-1');
    // set status to hidden
    $data['status'] = 'hidden';
    // save to database
    $db->prepare('UPDATE articles SET status = ? WHERE id = ?')->execute([$data['status'], $data['id']]);
    // store to cache
    $cache->set('article-1', $data);
    What if the final $cache->set() fails? (server down, connection failure, fatal error before getting here, …)
  31. Caching » Data consistency Stale data
    • Cache failed to update
    • Wrong result
    • Inconsistent with DB
  32. Caching » Data consistency
    “There are only two hard things in Computer Science: cache invalidation and naming things.” (Phil Karlton)
  33. Caching » Data consistency Cache invalidation
    • Check if operation succeeds
    • ->delete() invalid cache keys
    • Set cache expiration times
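The three invalidation measures can be sketched together. This is an illustration only: `ExpiringCache` is a hypothetical in-memory class, the clock is passed in explicitly so the run is deterministic, and `$dbUpdateSucceeded` stands in for the result of a real database write.

```php
<?php
// Hypothetical in-memory cache with expiry (assumption). The current time is
// passed in explicitly so the example behaves the same on every run.
final class ExpiringCache
{
    private $items = [];

    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] <= $now) {
            unset($this->items[$key]);
            return false;
        }
        return $this->items[$key]['value'];
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

$cache = new ExpiringCache();
$now = 1000;
$cache->set('article-1', ['status' => 'visible'], 60, $now);

// 1. Check that the write succeeded, 2. delete the now invalid key.
$dbUpdateSucceeded = true; // stands in for the outcome of the real UPDATE
if ($dbUpdateSucceeded) {
    $cache->delete('article-1');
}
$afterDelete = $cache->get('article-1', $now);

// 3. Expiration as a safety net: even if a delete is missed, the entry ages out.
$cache->set('article-2', ['status' => 'visible'], 60, $now);
$beforeExpiry = $cache->get('article-2', $now + 59);
$afterExpiry  = $cache->get('article-2', $now + 60);
```

Deleting rather than overwriting means the next reader repopulates the cache from the database, which sidesteps writing a stale copy back.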
  34. Caching » Data consistency Inconsistent data
    $data = array(…);
    $db->beginTransaction();
    $db->prepare('UPDATE articles SET title = ? WHERE id = ?')->execute([$data['title'], $data['id']]);
    $db->prepare('UPDATE blobs SET text = ? WHERE article_id = ?')->execute([$data['text'], $data['id']]);
    // store to cache
    $cache->set('article-1', $data);
    $db->commit();
  35. Caching » Data consistency Inconsistent data
    This example was easy to fix: only write to the cache once the commit has succeeded.
    if ($db->commit()) {
        $cache->set('article-1', $data);
    }
    But a real application may be harder!
  36. Caching » Data consistency Inconsistent data
    If you don’t need to cache, don’t. If you do, start thinking about it early on. It’s much harder to implement as an afterthought.
  37. Caching » Scaling cache Too much data
    Too big to fit on 1 machine? What on earth are you storing? :o
  38. Caching » Scaling cache Too much data
    • Store less/smaller data
    • Get a bigger machine
    • Spread it over multiple machines
  39. Caching » Scaling cache Too much data
    Sharding is easy with key-value stores (no relationship between data).
    Horizontal partitioning: every server has (a different) part of the data.
  40. Caching » Scaling cache Too much data
    E.g.:
    • Keys that start with A-M: server 1
    • Keys that start with N-Z: server 2
    (better to use consistent hashing to ensure equal balancing)
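Consistent hashing places servers and keys on the same hash ring and sends each key to the next server position on the ring, so adding a server only takes over part of the keys. The following is a minimal sketch under assumptions: `HashRing` is a hypothetical class, `crc32` is used purely for brevity, and 64 virtual positions per server is an arbitrary choice.

```php
<?php
// Minimal consistent-hash ring (hypothetical class, for illustration only).
final class HashRing
{
    private $ring = [];   // ring position => server name
    private $sorted = []; // ring positions in ascending order

    public function __construct(array $servers, int $replicas = 64)
    {
        foreach ($servers as $server) {
            // Several virtual positions per server even out the distribution.
            for ($i = 0; $i < $replicas; $i++) {
                $this->ring[crc32($server . '#' . $i)] = $server;
            }
        }
        ksort($this->ring);
        $this->sorted = array_keys($this->ring);
    }

    // A key belongs to the first server position at or after its own hash.
    public function serverFor(string $key): string
    {
        $hash = crc32($key);
        foreach ($this->sorted as $position) {
            if ($position >= $hash) {
                return $this->ring[$position];
            }
        }
        return $this->ring[$this->sorted[0]]; // past the end: wrap around
    }
}

$two   = new HashRing(['server 1', 'server 2']);
$three = new HashRing(['server 1', 'server 2', 'server 3']);

$moved = 0;          // keys whose server changed after adding 'server 3'
$movedElsewhere = 0; // keys that moved to anything other than the new server
for ($i = 0; $i < 1000; $i++) {
    $before = $two->serverFor('key' . $i);
    $after  = $three->serverFor('key' . $i);
    if ($before !== $after) {
        $moved++;
        if ($after !== 'server 3') {
            $movedElsewhere++;
        }
    }
}
```

Keys only ever move to the newly added server; nothing is reshuffled between the existing ones. With a plain modulo scheme (`hash % serverCount`), most keys would change servers when the count changes.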
  41. Caching » Scaling cache Too much traffic
    Replication… it’s tricky!
    Hard to do client-side (data inconsistency).
    Some cache servers provide it (e.g. Redis).
  42. Caching » Scaling cache Too much traffic
    Too many writes? Not an issue!
    If you’re writing more than you are reading, you probably shouldn’t be using a cache…
  43. Caching » Scaling cache Stampedes
    Slashdot effect:
    • Unexpected,
    • Concurrent traffic, for
    • Data not in cache
  44. Caching » Scaling cache Stampedes
    Hundreds of requests at the same time, with the data not in cache = hundreds of expensive computations, all at once
  45. Caching » Scaling cache Stampedes
    • Cache warming
    • Stampede protection
    • Locking
    • Early recomputation
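Locking-based stampede protection can be built on the atomic add() operation from earlier: whoever manages to add a lock key does the computation, everyone else waits for the result. The sketch below simulates 100 requests in one process; `ArrayCache` and the 'key.lock' naming are assumptions made for illustration, and a real lock would also need an expiry so a crashed request cannot hold it forever.

```php
<?php
// Hypothetical in-memory stand-in for a cache client (assumption).
final class ArrayCache
{
    private $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): bool
    {
        $this->items[$key] = $value;
        return true;
    }

    // Stores the value only if the key does not exist yet.
    public function add(string $key, $value): bool
    {
        if (array_key_exists($key, $this->items)) {
            return false;
        }
        $this->items[$key] = $value;
        return true;
    }

    public function delete(string $key): bool
    {
        unset($this->items[$key]);
        return true;
    }
}

$cache = new ArrayCache();
$computations = 0; // requests that run the expensive operation
$waiting = 0;      // requests that back off and wait for the result

// 100 requests arrive while 'key' is not cached and the first is still computing.
for ($request = 1; $request <= 100; $request++) {
    if ($cache->get('key') !== false) {
        continue; // cache hit, nothing to protect against
    }
    if ($cache->add('key.lock', 1)) {
        $computations++; // lock obtained: this request computes the value
    } else {
        $waiting++;      // lock held by someone else: poll the cache again later
    }
}

// The lock holder finishes, stores the result and releases the lock.
$cache->set('key', ['your data']);
$cache->delete('key.lock');
$afterwards = $cache->get('key');
```

Instead of a hundred simultaneous computations there is exactly one; the other requests find the value in the cache on their next poll.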
  46. Caching » psr/cache & psr/simple-cache Why use a PSR library?
    • Easy to change cache libraries
    • Easy to change cache backends
    • Familiarity from other projects
    And they probably fix implementation inconsistencies across versions/systems.
  47. Caching » psr/cache & psr/simple-cache psr/simple-cache (PSR-16)
    • Memcached-like API
    • 1 object to interface with cache directly
    $value = $cache->get('key');
    $cache->set('key', $value);
  48. Caching » psr/cache & psr/simple-cache psr/cache (PSR-6)
    • 2 classes (Pool & Item)
    • Pool = cache backend; Item = value
    • Operations happen on Item objects
    $item = $pool->getItem('key');
    $value = $item->get();
    $item->set($value);
    $pool->save($item);
  49. Caching » psr/cache & psr/simple-cache Get
    psr/simple-cache:
    $value = $cache->get('key');
    psr/cache:
    $item = $pool->getItem('key');
    $value = $item->get();
  50. Caching » psr/cache & psr/simple-cache Set
    psr/simple-cache:
    $cache->set('key', $value, $ttl);
    psr/cache:
    $item = $pool->getItem('key');
    $item->set($value);
    $item->expiresAfter($ttl);
    $pool->save($item);
  51. Caching » psr/cache & psr/simple-cache Get multiple
    psr/simple-cache:
    $values = $cache->getMultiple(['k1', 'k2']);
    // ['k1' => 'v1', 'k2' => 'v2']
    psr/cache:
    $items = $pool->getItems(['k1', 'k2']);
    // ['k1' => object(Item), 'k2' => object(Item)]
  52. Caching » psr/cache & psr/simple-cache Set multiple
    psr/simple-cache:
    $cache->setMultiple(['k1' => $v1, 'k2' => $v2]);
    psr/cache:
    $items = $pool->getItems(['k1', 'k2']);
    $items['k1']->set($v1);
    $items['k2']->set($v2);
    array_map(array($pool, 'saveDeferred'), $items);
    $pool->commit();
  53. Caching » Scrapbook KeyValueStore
    • Very similar to psr/simple-cache
    • But supports more operations
    • I’ll get back to this!
    • Everything is a KeyValueStore instance
    • Adapters
    • Features
  54. Caching » Scrapbook Adapters
    • Memcached
    • Redis
    • Couchbase
    • APC(u)
    • MySQL, PostgreSQL, SQLite
    • Filesystem
    • Memory
  55. Caching » Scrapbook Adapters
    E.g.:
    // create \Memcached object pointing to your Memcached server
    $client = new \Memcached();
    $client->addServer('localhost', 11211);
    // create Scrapbook KeyValueStore object
    $cache = new \MatthiasMullie\Scrapbook\Adapters\Memcached($client);
    or:
    // create Scrapbook KeyValueStore object
    $cache = new \MatthiasMullie\Scrapbook\Adapters\Apc();
  56. Caching » Scrapbook Features
    • Local Buffer
    • Transactions
    • Stampede protection
    • Sharding
    They all implement the KeyValueStore interface!
  57. Caching » Scrapbook Local Buffer
    Avoids multiple requests for the same key.
    // wrap BufferedStore around adapter
    $cache = new \MatthiasMullie\Scrapbook\Buffered\BufferedStore($cache);
    $cache->get('key');
    $cache->get('key');
    Only reaches out to cache once.
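The idea behind a local buffer can be shown with a small decorator that remembers values for the duration of the request. This is a sketch of the general technique, not Scrapbook’s implementation: `KeyValue`, `CountingBackend` and `LocalBuffer` are hypothetical names, and the backend counts reads so the effect is visible.

```php
<?php
// Minimal interface shared by the backend and the wrapper (assumption).
interface KeyValue
{
    public function get(string $key);
    public function set(string $key, $value): bool;
}

// Stand-in backend that counts how often it is actually queried.
final class CountingBackend implements KeyValue
{
    public $reads = 0;
    private $items = [];

    public function get(string $key)
    {
        $this->reads++;
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): bool
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Decorator that keeps values (and misses) it has already seen in a PHP array.
final class LocalBuffer implements KeyValue
{
    private $backend;
    private $local = [];

    public function __construct(KeyValue $backend)
    {
        $this->backend = $backend;
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->local)) {
            $this->local[$key] = $this->backend->get($key);
        }
        return $this->local[$key];
    }

    public function set(string $key, $value): bool
    {
        $this->local[$key] = $value;
        return $this->backend->set($key, $value);
    }
}

$backend = new CountingBackend();
$backend->set('key', 'value');

$cache = new LocalBuffer($backend);
$a = $cache->get('key'); // goes to the backend
$b = $cache->get('key'); // answered from the local array
```

The trade-off is that the buffered copy does not notice changes made by other processes during the same request.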
  58. Caching » Scrapbook Transactions
    Similar to transactions in databases.
    // wrap TransactionalStore around adapter
    $cache = new \MatthiasMullie\Scrapbook\Buffered\TransactionalStore($cache);
    $cache->begin();
    $cache->add('key', $value);
    $cache->replace('other-key', $value2);
    $cache->commit(); // or rollback();
    Either both succeed, or both fail.
  59. Caching » Scrapbook Transactions
    Caveat: this is not native in most cache backends. Due to the cleverness involved, this doesn’t remember time-to-live when restoring values. Only use with infinite TTL.
  60. Caching » Scrapbook Stampede protection
    Stampede protection (with locking):
    // wrap StampedeProtector around adapter
    $cache = new \MatthiasMullie\Scrapbook\Scale\StampedeProtector($cache);
    When no lock can be obtained (another request is already processing), it just waits, instead of doing the complex operation.
  61. Caching » Scrapbook Sharding
    // first Redis server
    $client = new \Redis();
    $client->connect('192.168.1.100');
    $cache1 = new \MatthiasMullie\Scrapbook\Adapters\Redis($client);
    // second Redis server
    $client2 = new \Redis();
    $client2->connect('192.168.1.101');
    $cache2 = new \MatthiasMullie\Scrapbook\Adapters\Redis($client2);
    // wrap Shard around adapters
    $cache = new \MatthiasMullie\Scrapbook\Scale\Shard($cache1, $cache2);
    $cache->set('key', $value);
    $cache->set('key2', $value2);
    'key' goes to first server, 'key2' goes to second
  62. Caching » Scrapbook Multiple features
    Just keep wrapping them:
    // init adapter
    $cache = new \MatthiasMullie\Scrapbook\Adapters\Apc();
    // add stampede protection
    $cache = new \MatthiasMullie\Scrapbook\Scale\StampedeProtector($cache);
    // add local buffer
    $cache = new \MatthiasMullie\Scrapbook\Buffered\BufferedStore($cache);
    // add transactions
    $cache = new \MatthiasMullie\Scrapbook\Buffered\TransactionalStore($cache);
    …
  63. Caching » Scrapbook Supports more operations
    We’ve been over these already…
    $cache->cas($token, 'key', $value);
    $cache->add('key', $value);
    $cache->replace('key', $value);
    $cache->increment('key', $offset, $initial);
    $cache->decrement('key', $offset, $initial);
  64. Caching » Scrapbook Use PSR
    If you don’t need these operations, stick to psr/cache or psr/simple-cache. Scrapbook comes with adapters for both PSRs:
    $psr6 = new \MatthiasMullie\Scrapbook\Psr6\Pool($keyvaluestore);
    $item = $psr6->getItem('key');
    $psr16 = new \MatthiasMullie\Scrapbook\Psr16\SimpleCache($keyvaluestore);
    $value = $psr16->get('key');