Caching

An obvious solution to getting a faster & more scalable website is to introduce caching.

When should you use a cache? Why is cache invalidation one of the hard things in computer science?

Let’s discuss key-value stores (like Memcached & Redis): how to use them in your application, the problems they solve & the ones they bring.

Matthias Mullie

April 18, 2017

Transcript

  1. CACHING: How to use key-value stores effectively

  2. @matthiasmullie

  3. INTRO

  4. Definition • Fast • Temporary • Storage

  5. Examples • CPU cache (L1, L2) • Browser cache • Proxy servers (e.g. Varnish) • SQL query cache • Key-value stores (e.g. Memcached) • …

  6. What to cache? Repeated slow/intensive operations, e.g.: • Results of expensive processing (CPU) • Results from slow external sources (I/O)

  7. Why? • Speed • Scaling (frees up other resources)

  8. What not to cache? • Frequently changing data • Barely any (repeat) traffic • Fast & easy data (e.g. most queries, if indexed correctly)

  9. Caution! Caching = increases complexity = more dev $$$ = more bugs. Don’t cache if you don’t anticipate traffic. When you do, design for it, early on.

  10. PRACTICAL

  11. Example: no cache
        function expensiveOperation() {
            sleep(5);
            return array('your data');
        }
        $data = expensiveOperation();

  12. Example: no cache (same code as slide 11). BAD!

  13. Example: no cache • It always takes 5 seconds!

  14. Example: no cache • It always takes 5 seconds! • While that operation executes, the process running it can’t handle other requests

  15. Example: no cache. But… but… I have powerful servers!

  16. Example: no cache. But… but… I have powerful servers! • They cost a lot! (wasted resources)

  17. Example: no cache. But… but… I have powerful servers! • They cost a lot! (wasted resources) • What about unexpected peaks?

  18. Example: no cache. The main reason for caching (or any optimization) is usually not raw speed.

  19. Example: no cache. The main reason for caching (or any optimization) is usually not raw speed. Scalability!

  20. Example: cache
        $cache = new \Memcached();
        $cache->addServer('localhost', 11211);
        $data = $cache->get('key');
        // Memcached::get() returns false when the key is not found
        if ($data === false) {
            $data = expensiveOperation();
            $cache->set('key', $data);
        }

  21. Example: cache (same code as slide 20). Bootstrap: connect to the cache server. Memcached, in this case, with PECL’s Memcached API.

  22. Example: cache (same code as slide 20). Get the data from the cache. Don’t forget that this round trip to the cache server also takes a few ms!

  23. Example: cache (same code as slide 20). Check if we actually retrieved data from the cache. If so, we’re done: we don’t need to perform that expensive operation, we already have the result!

  24. Example: cache (same code as slide 20). Too bad, the data was not (yet) in the cache. Guess we’ll have to perform that expensive operation anyway…

  25. Example: cache (same code as slide 20). Finally, after having performed the expensive operation, store the result in the cache, so we don’t have to do that again next time!

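
The get, check, compute and store steps above can be folded into one reusable helper. The sketch below is only an illustration and not part of the talk: `ArrayCache` and `remember()` are hypothetical names, and the in-memory array stands in for a real Memcached connection so the example runs without a server (assuming PHP 7.4 or later). It follows Memcached's convention of returning false on a miss.

```php
<?php
// Hypothetical in-memory stand-in for a cache server (assumption: not a real client).
final class ArrayCache
{
    private array $items = [];

    // Returns the stored value, or false on a miss (mirrors Memcached::get()).
    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): bool
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Hypothetical helper: return the cached value, computing and storing it on a miss.
function remember(ArrayCache $cache, string $key, callable $compute)
{
    $data = $cache->get($key);
    if ($data === false) {
        $data = $compute();
        $cache->set($key, $data);
    }
    return $data;
}

$calls = 0;
$expensiveOperation = function () use (&$calls) {
    $calls++;               // count how often the slow path actually runs
    return ['your data'];
};

$cache = new ArrayCache();
$first = remember($cache, 'key', $expensiveOperation);   // miss: computes and stores
$second = remember($cache, 'key', $expensiveOperation);  // hit: served from the cache
```

One consequence of the false-on-miss convention is that a legitimately cached `false` cannot be told apart from a miss, so values of that kind would need wrapping before being stored.
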
  26. Drawbacks • Application got more complex: another (cache) server to maintain, a new library (PECL Memcached), a few extra lines of code. New library + new server also means more potential for security issues.

  27. Drawbacks • Application got more complex • The first request got slower: still had to execute the operation, + check if the item was in cache, + store the result to cache

  28. Drawbacks • Application got more complex • The first request got slower • Data from cache may be stale (I’ll get back to this!)

  29. Caching is extra work • Consider cache misses • Consider data consistency • Keeping all parts running

  30. Caching saves $$ • Less wasted machine resources • Faster response times • Scales better, more reliable

  31. DATA CONSISTENCY

  32. Race conditions
      Request 1:
        $data = $cache->get('article-1');
        // set status to hidden
        $data['status'] = 'hidden';
        // save to database (illustration only; real code should use prepared statements)
        $db->exec(
            "UPDATE articles SET status = '{$data['status']}' WHERE id = {$data['id']}"
        );
        // store to cache
        $cache->set('article-1', $data);
      Request 2, same time:
        $data = $cache->get('article-1');
        // update title
        $data['title'] = 'new-title';
        // save to database
        $db->exec(
            "UPDATE articles SET title = '{$data['title']}' WHERE id = {$data['id']}"
        );
        // store to cache
        $cache->set('article-1', $data);

  33. Race conditions • What did we end up storing to cache? • Whatever we ended up storing to cache is inconsistent with the DB! • The cache holds either status === 'hidden' or title === 'new-title', while the DB has both changes

  34. Atomic operations! Increment/decrement:
        $cache->increment('count');
      instead of
        $count = $cache->get('count');
        $cache->set('count', $count + 1);

  35. Atomic operations! Conditional depending on existing:
        $cache->replace('key', $replacement);
      instead of
        $data = $cache->get('key');
        if ($data !== false) {
            $cache->set('key', $replacement);
        }

  36. Atomic operations! Conditional depending on non-existing:
        $cache->add('key', $replacement);
      instead of
        $data = $cache->get('key');
        if ($data === false) {
            $cache->set('key', $replacement);
        }

  37. Atomic operations! Conditional depending on current data:
        $cache->get('key', null, $token);
        $cache->cas($token, 'key', 'new-value');
      instead of
        $cache->get('key');
        $cache->set('key', 'new-value');
      The CAS operation will fail if 'key' was changed since it was fetched (detected by comparing $token). The by-reference $token argument is the ext-memcached 2.x signature; in 3.x the token is returned through the \Memcached::GET_EXTENDED flag instead.

  38. Atomic operations! These examples were for Memcached. These kinds of indivisible operations can be done with other cache engines too, but sometimes differently (e.g. locking).

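
A compare-and-swap update is usually wrapped in a retry loop: read the value with its token, apply the change, and try again if another writer got in first. The following sketch illustrates that loop under stated assumptions: `VersionedCache`, `gets()` and `updateWithCas()` are hypothetical names, and the class is an in-memory stand-in that imitates token checking rather than a real Memcached client. The competing writer is simulated inside the callback so the conflict is reproducible in a single process.

```php
<?php
// Hypothetical in-memory store with per-key version tokens (assumption: not a real server).
final class VersionedCache
{
    private array $values = [];
    private array $tokens = [];

    // Returns [value, token], or [false, null] on a miss.
    public function gets(string $key): array
    {
        if (!array_key_exists($key, $this->values)) {
            return [false, null];
        }
        return [$this->values[$key], $this->tokens[$key]];
    }

    // Unconditional write; every write bumps the key's token.
    public function set(string $key, $value): bool
    {
        $this->values[$key] = $value;
        $this->tokens[$key] = ($this->tokens[$key] ?? 0) + 1;
        return true;
    }

    // Stores $value only if the key's token still matches $token.
    public function cas(int $token, string $key, $value): bool
    {
        if (($this->tokens[$key] ?? null) !== $token) {
            return false;
        }
        return $this->set($key, $value);
    }
}

// Hypothetical helper: apply $change to the cached value, retrying on conflicts.
function updateWithCas(VersionedCache $cache, string $key, callable $change, int $maxAttempts = 5): bool
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        [$value, $token] = $cache->gets($key);
        if ($token === null) {
            return false; // nothing cached to update
        }
        if ($cache->cas($token, $key, $change($value))) {
            return true;
        }
        // Token mismatch: someone else wrote in between, so re-read and retry.
    }
    return false;
}

$cache = new VersionedCache();
$cache->set('article-1', ['title' => 'old-title', 'status' => 'visible']);

// Simulate a competing request that hides the article during our first attempt.
$interfered = false;
$ok = updateWithCas($cache, 'article-1', function (array $article) use ($cache, &$interfered) {
    if (!$interfered) {
        $interfered = true;
        $cache->set('article-1', ['status' => 'hidden'] + $article);
    }
    $article['title'] = 'new-title';
    return $article;
});

[$article] = $cache->gets('article-1');
```

The first attempt is rejected because the token changed, and the second attempt starts from the competing writer's data, so both the hidden status and the new title survive.
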
  39. Stale data
      Request 1:
        $data = $cache->get('article-1');
        // set status to hidden
        $data['status'] = 'hidden';
        // save to database
        $db->exec(
            "UPDATE articles SET status = '{$data['status']}' WHERE id = {$data['id']}"
        );
        // store to cache
        $cache->set('article-1', $data);
      What if this last step fails? (server down, connection failure, fatal error before getting here, …)

  40. Stale data • Cache failed to update • Wrong result • Inconsistent with DB

  41. “There are only two hard things in Computer Science: cache invalidation and naming things.” (Phil Karlton)

  42. Cache invalidation • Check if operation succeeds • ->delete() invalid cache keys • Set cache expiration times

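
The invalidation points above can be combined: write to the database first, delete the cache key once that write has gone through, and give every cached entry an expiration time as a safety net. The sketch below is a hedged illustration only. `ExpiringCache`, `readArticle()` and `updateTitle()` are hypothetical names, a plain array stands in for the database, and the clock is passed in explicitly so the expiry behaviour is deterministic.

```php
<?php
// Hypothetical in-memory cache with expiry (assumption: stands in for a real cache server).
final class ExpiringCache
{
    private array $items = [];

    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] <= $now) {
            return false; // miss or expired
        }
        return $this->items[$key]['value'];
    }

    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

function readArticle(array $database, ExpiringCache $cache, string $key, int $now): array
{
    $data = $cache->get($key, $now);
    if ($data === false) {
        $data = $database[$key];
        $cache->set($key, $data, 60, $now); // expire after 60 seconds as a safety net
    }
    return $data;
}

function updateTitle(array &$database, ExpiringCache $cache, string $key, string $title): void
{
    $database[$key]['title'] = $title; // 1. write to the source of truth first
    $cache->delete($key);              // 2. then drop the now-invalid cache entry
}

$database = ['article-1' => ['title' => 'old-title']]; // stand-in for the real database
$cache = new ExpiringCache();
$now = 1000; // fixed clock value

$before = readArticle($database, $cache, 'article-1', $now);
updateTitle($database, $cache, 'article-1', 'new-title');
$after = readArticle($database, $cache, 'article-1', $now + 1);

// A write that bypasses invalidation leaves a stale entry until the TTL runs out.
$database['article-1']['title'] = 'newer-title';
$stale = readArticle($database, $cache, 'article-1', $now + 2);
$fresh = readArticle($database, $cache, 'article-1', $now + 62);
```

Deleting rather than overwriting the key means a failed cache write can at worst cost an extra miss, while the expiration time bounds how long any missed invalidation can serve stale data.
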
  43. Inconsistent data
        $data = array(…);
        $db->beginTransaction();
        $db->exec(
            "UPDATE articles SET title = '{$data['title']}' WHERE id = {$data['id']}"
        );
        $db->exec(
            "UPDATE blobs SET text = '{$data['text']}' WHERE article_id = {$data['id']}"
        );
        // store to cache
        $cache->set('article-1', $data);
        $db->commit();

  44. Inconsistent data • No problem! • As long as the transaction succeeds…

  45. Inconsistent data. This example was easy to fix:
        if ($db->commit()) {
            $cache->set('article-1', $data);
        }
      But a real application may be harder!

  46. Inconsistent data. If you don’t need to cache, don’t. If you do, start thinking about it early on. It’s much harder to implement as an afterthought.

  47. SCALING CACHE

  48. Too much data. Too big to fit on 1 machine? What on earth are you storing? :o

  49. Too much data • Store less/smaller data • Get a bigger machine • Spread it over multiple machines

  50. Too much data. Sharding is easy with key-value stores (no relationship between data). Horizontal partitioning: every server has (a different) part of the data.

  51. Too much data. E.g.: • Server 1: keys that start with A-M • Server 2: keys that start with N-Z (better to use consistent hashing to ensure equal balancing)

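
Consistent hashing, mentioned above as the better alternative to splitting by first letter, places servers and keys on the same hash ring and assigns each key to the next server point clockwise. The sketch below is a minimal illustration rather than a production implementation: `HashRing` is a hypothetical class, the server names are made up, `crc32` is chosen only for brevity, and the ring is assumed to contain at least one server.

```php
<?php
// Hypothetical consistent-hash ring (a sketch under the assumptions stated above).
final class HashRing
{
    private array $ring = []; // ring position => server name, sorted by position

    public function __construct(array $servers, int $replicas = 64)
    {
        foreach ($servers as $server) {
            $this->add($server, $replicas);
        }
    }

    public function add(string $server, int $replicas = 64): void
    {
        // Place several virtual points per server to even out the distribution.
        for ($i = 0; $i < $replicas; $i++) {
            $this->ring[crc32("$server#$i")] = $server;
        }
        ksort($this->ring);
    }

    public function serverFor(string $key): string
    {
        $position = crc32($key);
        // Walk clockwise to the first virtual point at or after the key's position.
        foreach ($this->ring as $point => $server) {
            if ($point >= $position) {
                return $server;
            }
        }
        return reset($this->ring); // wrapped around: use the first point on the ring
    }
}

$keys = [];
for ($i = 0; $i < 1000; $i++) {
    $keys[] = "article-$i";
}

$ring = new HashRing(['server1', 'server2']);
$before = [];
foreach ($keys as $key) {
    $before[$key] = $ring->serverFor($key);
}

// Adding a third server should only move keys onto the new server.
$ring->add('server3');
$moved = 0;
$movedElsewhere = 0;
foreach ($keys as $key) {
    $after = $ring->serverFor($key);
    if ($after !== $before[$key]) {
        $moved++;
        if ($after !== 'server3') {
            $movedElsewhere++;
        }
    }
}
```

The point of the design is visible in `$movedElsewhere`: keys never shuffle between the existing servers when one is added, so only the share taken over by the new server becomes a cache miss. The linear scan in `serverFor()` keeps the sketch short; a binary search over the sorted positions would be the usual choice for larger rings.
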
  52. Too much traffic. Shard it! It’ll also spread the requests.

  53. Too much traffic. Replication… it’s tricky! Hard to do client-side (data inconsistency). Some cache servers provide it (e.g. Redis).

  54. Too much traffic. Too many writes? Not an issue! If you’re writing more than you are reading, you probably shouldn’t be using a cache…

  55. Stampedes. Slashdot effect • Unexpected, • Concurrent traffic, for • Data not in cache

  56. Stampedes. Hundreds of requests at the same time, with the data not in cache = hundreds of expensive computations, all at once

  57. Stampedes • Cache warming • Stampede protection • Locking • Early recomputation

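
Locking-based stampede protection can be sketched with the atomic add operation shown earlier: only the request that manages to add a lock key computes the value, and everyone else waits and re-checks the cache. The example below is an assumption-laden illustration, not the talk's implementation: `LockableCache` and `getOrCompute()` are hypothetical names, the store is in-memory, and the "other request" is simulated through the wait callback because plain PHP runs a single request per process.

```php
<?php
// Hypothetical in-memory cache (assumption: stands in for a shared cache server).
final class LockableCache
{
    private array $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }

    // Stores the value only if the key does not exist yet.
    // On a real cache server this check-and-store is a single atomic operation.
    public function add(string $key, $value): bool
    {
        if (array_key_exists($key, $this->items)) {
            return false;
        }
        $this->items[$key] = $value;
        return true;
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

// Hypothetical helper: compute at most once while other callers wait for the result.
function getOrCompute(LockableCache $cache, string $key, callable $compute, callable $wait, int $maxWaits = 50)
{
    for ($i = 0; $i <= $maxWaits; $i++) {
        $data = $cache->get($key);
        if ($data !== false) {
            return $data;
        }
        if ($cache->add("$key.lock", 1)) {   // only one request wins the lock
            try {
                $data = $compute();
                $cache->set($key, $data);
                return $data;
            } finally {
                $cache->delete("$key.lock"); // always release the lock
            }
        }
        $wait(); // another request is computing: wait, then re-check the cache
    }
    return $compute(); // gave up waiting; fall back to computing ourselves
}

$cache = new LockableCache();
$computations = 0;
$compute = function () use (&$computations) {
    $computations++;
    return ['your data'];
};

// Simulate another request that already holds the lock and is busy computing.
$cache->add('key.lock', 1);

// While we wait, that other request finishes and stores its result.
$wait = function () use ($cache) {
    $cache->set('key', ['your data']);
    $cache->delete('key.lock');
};

$data = getOrCompute($cache, 'key', $compute, $wait);    // waits, then reads the cache
$other = getOrCompute($cache, 'other', $compute, $wait); // no lock held: computes once
```

In a real deployment the wait callback would sleep briefly, and the lock key would carry a short expiration time so that a crashed lock holder cannot block everyone else indefinitely.
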
  58. PSR/CACHE & PSR/SIMPLE-CACHE

  59. Why use a PSR library? • Easy to change cache libraries • Easy to change cache backends • Familiarity from other projects. And they probably fix implementation inconsistencies across versions/systems.

  60. psr/simple-cache (PSR-16) • Memcached-like API • 1 object to interface with cache directly
        $value = $cache->get('key');
        $cache->set('key', $value);

  61. psr/cache (PSR-6) • 2 classes (Pool & Item) • Pool = cache backend; Item = value • Operations happen on Item objects
        $item = $pool->getItem('key');
        $value = $item->get();
        $item->set($value);
        $pool->save($item);

  62. Get
      psr/simple-cache:
        $value = $cache->get('key');
      psr/cache:
        $item = $pool->getItem('key');
        $value = $item->get();

  63. Set
      psr/simple-cache:
        $cache->set('key', $value, $ttl);
      psr/cache:
        $item = $pool->getItem('key');
        $item->set($value);
        $item->expiresAfter($ttl);
        $pool->save($item);

  64. Delete
      psr/simple-cache:
        $cache->delete('key');
      psr/cache:
        $pool->deleteItem('key');

  65. Flush (clear entire server)
      psr/simple-cache:
        $cache->clear();
      psr/cache:
        $pool->clear();

  66. Exists
      psr/simple-cache:
        $exists = $cache->has('key');
      psr/cache:
        $exists = $pool->hasItem('key');

  67. Get multiple
      psr/simple-cache:
        $values = $cache->getMultiple(['k1', 'k2']);
        // ['k1' => 'v1', 'k2' => 'v2']
      psr/cache:
        $items = $pool->getItems(['k1', 'k2']);
        // ['k1' => object(Item), 'k2' => object(Item)]

  68. Set multiple
      psr/simple-cache:
        $cache->setMultiple(['k1' => $v1, 'k2' => $v2]);
      psr/cache:
        $items = $pool->getItems(['k1', 'k2']);
        $items['k1']->set($v1);
        $items['k2']->set($v2);
        array_map(array($pool, 'saveDeferred'), $items);
        $pool->commit();

  69. Delete multiple
      psr/simple-cache:
        $cache->deleteMultiple(['k1', 'k2']);
      psr/cache:
        $pool->deleteItems(['k1', 'k2']);

  70. Which to use? Whichever you prefer. Both are equivalent in features.

  71. SCRAPBOOK. A PHP cache library: scrapbook.cash

  72. KeyValueStore • Very similar to psr/simple-cache • But supports more operations (I’ll get back to this!) • Everything is a KeyValueStore instance • Adapters • Features

  73. Adapters • Memcached • Redis • Couchbase • APC(u) • MySQL, PostgreSQL, SQLite • Filesystem • Memory

  74. Adapters. E.g.:
        // create \Memcached object pointing to your Memcached server
        $client = new \Memcached();
        $client->addServer('localhost', 11211);
        // create Scrapbook KeyValueStore object
        $cache = new \MatthiasMullie\Scrapbook\Adapters\Memcached($client);
      or:
        // create Scrapbook KeyValueStore object
        $cache = new \MatthiasMullie\Scrapbook\Adapters\Apc();

  75. Features • Local Buffer • Transactions • Stampede protection • Sharding. They’re all KeyValueStore interfaces!

  76. Local Buffer. Avoids multiple requests for the same key.
        // wrap BufferedStore around adapter
        $cache = new \MatthiasMullie\Scrapbook\Buffered\BufferedStore($cache);
        $cache->get('key');
        $cache->get('key');
      Only reaches out to cache once.

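
The idea behind a local buffer can be shown with a small decorator that remembers what it has already fetched during the current request. This is a generic sketch of the pattern and makes no claim about how Scrapbook's BufferedStore is implemented: `CountingBackend` and `LocalBuffer` are hypothetical classes, and the backend simply counts reads to stand in for network round trips.

```php
<?php
// Hypothetical backend that counts how often it is actually queried.
final class CountingBackend
{
    public int $reads = 0;
    private array $items = ['key' => 'value'];

    public function get(string $key)
    {
        $this->reads++; // each call here represents a network round trip
        return $this->items[$key] ?? false;
    }
}

// Hypothetical buffering decorator: remembers values already fetched in this request.
final class LocalBuffer
{
    private CountingBackend $backend;
    private array $local = [];

    public function __construct(CountingBackend $backend)
    {
        $this->backend = $backend;
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->local)) {
            $this->local[$key] = $this->backend->get($key);
        }
        return $this->local[$key];
    }
}

$backend = new CountingBackend();
$cache = new LocalBuffer($backend);
$first = $cache->get('key');
$second = $cache->get('key');
```

Note that this sketch also buffers misses, so a key written by another request in the meantime would not be seen until the buffer is discarded at the end of the request.
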
  77. Transactions. Similar to transactions in databases.
        // wrap TransactionalStore around adapter
        $cache = new \MatthiasMullie\Scrapbook\Buffered\TransactionalStore($cache);
        $cache->begin();
        $cache->add('key', $value);
        $cache->replace('other-key', $value2);
        $cache->commit(); // or rollback();
      Either both succeed, or both fail.

  78. Transactions. Caveat: this is not native in most cache backends. Due to the cleverness involved, this doesn’t remember time-to-live when restoring values. Only use with infinite TTL.

  79. Stampede protection (with locking):
        // wrap StampedeProtector around adapter
        $cache = new \MatthiasMullie\Scrapbook\Scale\StampedeProtector($cache);
      When no lock can be obtained (another request is already processing), it just waits, instead of doing the complex operation.

  80. Sharding
        // first Redis server
        $client = new \Redis();
        $client->connect('192.168.1.100');
        $cache1 = new \MatthiasMullie\Scrapbook\Adapters\Redis($client);
        // second Redis server
        $client2 = new \Redis();
        $client2->connect('192.168.1.101');
        $cache2 = new \MatthiasMullie\Scrapbook\Adapters\Redis($client2);
        // wrap Shard around adapter
        $cache = new \MatthiasMullie\Scrapbook\Scale\Shard($cache1, $cache2);
        $cache->set('key', $value);
        $cache->set('key2', $value2);
      'key' goes to first server, 'key2' goes to second

  81. Multiple features. Just keep wrapping them:
        // init adapter
        $cache = new \MatthiasMullie\Scrapbook\Adapters\Apc();
        // add stampede protection
        $cache = new \MatthiasMullie\Scrapbook\Scale\StampedeProtector($cache);
        // add local buffer
        $cache = new \MatthiasMullie\Scrapbook\Buffered\BufferedStore($cache);
        // add transactions
        $cache = new \MatthiasMullie\Scrapbook\Buffered\TransactionalStore($cache);
        …

  82. Supports more operations. We’ve been over these already…
        $cache->cas($token, 'key', $value);
        $cache->add('key', $value);
        $cache->replace('key', $value);
        $cache->increment('key', $offset, $initial);
        $cache->decrement('key', $offset, $initial);

  83. Use PSR. If you don’t need these operations, stick to psr/cache or psr/simple-cache. Scrapbook comes with adapters for both PSRs:
        $psr6 = new \MatthiasMullie\Scrapbook\Psr6\Pool($keyvaluestore);
        $item = $psr6->getItem('key');
        $psr16 = new \MatthiasMullie\Scrapbook\Psr16\SimpleCache($keyvaluestore);
        $value = $psr16->get('key');

  84. Presentation title

  85. Questions?

  86. Resources: mullie.eu • scrapbook.cash