Slide 1

Slide 1 text

CACHING
How to use key-value stores effectively

Slide 2

Slide 2 text

@matthiasmullie Caching

Slide 3

Slide 3 text

Caching INTRO

Slide 4

Slide 4 text

Caching » Intro » Definition
• Fast
• Temporary
• Storage

Slide 5

Slide 5 text

Caching » Intro » Examples
• CPU cache (L1, L2)
• Browser cache
• Proxy servers (e.g. Varnish)
• SQL query cache
• Key-value stores (e.g. Memcached)
• …

Slide 6

Slide 6 text

Caching » Intro » What to cache?
Repeated slow/intensive operations, e.g.:
• Results of expensive processing (CPU)
• Results from slow external sources (I/O)

Slide 7

Slide 7 text

Caching » Intro » Why?
• Speed
• Scaling (frees up other resources)

Slide 8

Slide 8 text

Caching » Intro » What not to cache?
• Frequently changing data
• Data with barely any (repeat) traffic
• Data that is already fast and easy to get (e.g. most queries, if indexed correctly)

Slide 9

Slide 9 text

Caching » Intro » Caution!
Caching
= increases complexity
= more dev $$$
= more bugs
Don’t cache if you don’t anticipate traffic. When you do, design for it early on.

Slide 10

Slide 10 text

Caching PRACTICAL

Slide 11

Slide 11 text

Caching » Practical » Example: no cache
    function expensiveOperation() {
        sleep(5);
        return array('your data');
    }

    $data = expensiveOperation();

Slide 12

Slide 12 text

Caching » Practical » Example: no cache
    function expensiveOperation() {
        sleep(5);
        return array('your data');
    }

    $data = expensiveOperation();
BAD!

Slide 13

Slide 13 text

Caching » Practical » Example: no cache
• It always takes 5 seconds!

Slide 14

Slide 14 text

Caching » Practical » Example: no cache
• It always takes 5 seconds!
• While that operation executes, the worker can’t process other requests

Slide 15

Slide 15 text

Caching » Practical » Example: no cache
But… but… I have powerful servers!

Slide 16

Slide 16 text

Caching » Practical » Example: no cache
But… but… I have powerful servers!
• They cost a lot! (wasted resources)

Slide 17

Slide 17 text

Caching » Practical » Example: no cache
But… but… I have powerful servers!
• They cost a lot! (wasted resources)
• What about unexpected peaks?

Slide 18

Slide 18 text

Caching » Practical » Example: no cache
The main reason for caching (or any optimization) is usually not raw speed.

Slide 19

Slide 19 text

Caching » Practical » Example: no cache
The main reason for caching (or any optimization) is usually not raw speed.
It’s scalability!

Slide 20

Slide 20 text

Caching » Practical » Example: cache
    $cache = new \Memcached();
    $cache->addServer('localhost', 11211);

    $data = $cache->get('key');
    // Memcached::get() returns false (not null) when the key is missing
    if ($data === false) {
        $data = expensiveOperation();
        $cache->set('key', $data);
    }

Slide 21

Slide 21 text

Caching » Practical » Example: cache
    $cache = new \Memcached();
    $cache->addServer('localhost', 11211);
Bootstrap: connect to the cache server. Memcached, in this case, with PECL’s Memcached API.

Slide 22

Slide 22 text

Caching » Practical » Example: cache
    $data = $cache->get('key');
Get the data from the cache. Don’t forget that this request to the cache server also takes a few ms!

Slide 23

Slide 23 text

Caching » Practical » Example: cache
    if ($data === false) {
Check if we actually retrieved data from the cache (Memcached::get() returns false on a miss). If we did, we’re done: we don’t need to perform that expensive operation, we already have the result!

Slide 24

Slide 24 text

Caching » Practical » Example: cache
        $data = expensiveOperation();
Too bad, the data was not (yet) in cache. Guess we’ll have to perform that expensive operation anyway…

Slide 25

Slide 25 text

Caching » Practical » Example: cache
        $cache->set('key', $data);
Finally, after having performed the expensive operation, store the result to cache, so we don’t have to do it again next time!
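The walkthrough on the preceding slides can be folded into one reusable helper. This is a minimal sketch, not part of the original deck: `ArrayCache` and `remember()` are hypothetical names, and the in-memory class only imitates the two Memcached calls used above (including returning false on a miss) so the example runs without a cache server.

```php
<?php
// Hypothetical in-memory stand-in for the two Memcached calls used in the slides.
final class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        // Like Memcached::get(), report a miss as false.
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): bool
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Cache-aside: return the cached value, or compute and store it on a miss.
function remember(ArrayCache $cache, string $key, callable $compute)
{
    $data = $cache->get($key);
    if ($data === false) {
        $data = $compute();
        $cache->set($key, $data);
    }
    return $data;
}

$calls = 0;
$expensiveOperation = function () use (&$calls) {
    $calls++; // count how often the slow path runs
    return ['your data'];
};

$cache = new ArrayCache();
$first = remember($cache, 'key', $expensiveOperation);  // miss: computes
$second = remember($cache, 'key', $expensiveOperation); // hit: served from cache
```

One limitation of this shape: a computed value of false can never be told apart from a miss, so it would be recomputed every time.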

Slide 26

Slide 26 text

Caching » Practical » Drawbacks
• Application got more complex
  • Another (cache) server to maintain
  • A new library (PECL Memcached)
  • A few extra lines of code
New library + new server also means more potential for security issues

Slide 27

Slide 27 text

Caching » Practical » Drawbacks
• Application got more complex
• The first request got slower
  • Still had to execute the operation
  • + check if item was in cache
  • + store the result to cache

Slide 28

Slide 28 text

Caching » Practical » Drawbacks
• Application got more complex
• The first request got slower
• Data from cache may be stale
  • I’ll get back to this!

Slide 29

Slide 29 text

Caching » Practical » Caching is extra work
• Consider cache misses
• Consider data consistency
• Keep all parts running

Slide 30

Slide 30 text

Caching » Practical » Caching saves $$
• Less wasted machine resources
• Faster response times
• Scales better, more reliable

Slide 31

Slide 31 text

Caching DATA CONSISTENCY

Slide 32

Slide 32 text

Caching » Data consistency » Race conditions
Request 1:
    $data = $cache->get('article-1');
    // set status to hidden
    $data['status'] = 'hidden';
    // save to database (prepared statement, so the value is quoted safely)
    $db->prepare('UPDATE articles SET status = ? WHERE id = ?')
       ->execute([$data['status'], $data['id']]);
    // store to cache
    $cache->set('article-1', $data);
Request 2, same time:
    $data = $cache->get('article-1');
    // update title
    $data['title'] = 'new-title';
    // save to database
    $db->prepare('UPDATE articles SET title = ? WHERE id = ?')
       ->execute([$data['title'], $data['id']]);
    // store to cache
    $cache->set('article-1', $data);

Slide 33

Slide 33 text

Caching » Data consistency » Race conditions
• What did we end up storing to cache?
  • Either status === 'hidden',
  • Or title === 'new-title'
• Whatever we ended up storing to cache is inconsistent with the DB: the DB has both changes, the cache only the last writer’s!
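The interleaving described above can be replayed step by step. In this sketch plain PHP arrays stand in for the database row and the cache entry (an assumption made purely for illustration; no real database or cache server is involved), and the starting values 'visible' and 'old-title' are made up.

```php
<?php
// Plain arrays stand in for the database row and the cache entry (assumption).
$db    = ['id' => 1, 'status' => 'visible', 'title' => 'old-title'];
$cache = ['article-1' => $db];

// Both requests read the cached article before either one writes.
$request1 = $cache['article-1'];
$request2 = $cache['article-1'];

// Request 1 hides the article: column-level DB update, then full cache overwrite.
$request1['status'] = 'hidden';
$db['status'] = $request1['status'];
$cache['article-1'] = $request1;

// Request 2 renames it, still working from its earlier copy.
$request2['title'] = 'new-title';
$db['title'] = $request2['title'];
$cache['article-1'] = $request2;

// The database received both updates; the cache only kept the last writer's copy.
$dbHasBoth = $db['status'] === 'hidden' && $db['title'] === 'new-title';
$cacheIsStale = $cache['article-1']['status'] === 'visible';
```

Swapping the order of the two final cache writes would leave the cache with the new status but the old title instead.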

Slide 34

Slide 34 text

Caching » Data consistency » Atomic operations!
Increment/decrement:
    $cache->increment('count');
instead of
    $count = $cache->get('count');
    $cache->set('count', $count + 1);

Slide 35

Slide 35 text

Caching » Data consistency » Atomic operations!
Conditional depending on existing:
    $cache->replace('key', $replacement);
instead of
    $data = $cache->get('key');
    if ($data !== false) { // Memcached::get() returns false on a miss
        $cache->set('key', $replacement);
    }

Slide 36

Slide 36 text

Caching » Data consistency » Atomic operations!
Conditional depending on non-existing:
    $cache->add('key', $replacement);
instead of
    $data = $cache->get('key');
    if ($data === false) { // Memcached::get() returns false on a miss
        $cache->set('key', $replacement);
    }

Slide 37

Slide 37 text

Caching » Data consistency » Atomic operations!
Conditional depending on current data:
    // php-memcached 3.x; in 2.x the token came back by reference: get('key', null, $token)
    $result = $cache->get('key', null, \Memcached::GET_EXTENDED);
    $cache->cas($result['cas'], 'key', 'new-value');
instead of
    $cache->get('key');
    $cache->set('key', 'new-value');
The CAS operation fails if 'key' was changed after it was read (detected by comparing the token).
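A failed CAS is usually followed by a retry: read again, reapply the change, try the CAS again. The sketch below shows that loop against a hypothetical in-memory `CasStore` with integer version tokens; the class, the `update()` helper and the token scheme are assumptions for illustration, and real backends hand out tokens differently.

```php
<?php
// Hypothetical in-memory store with version tokens; real backends expose CAS differently.
final class CasStore
{
    private array $values = [];
    private array $tokens = [];

    // Returns [value, token]; value is false and token is null on a miss.
    public function get(string $key): array
    {
        if (!array_key_exists($key, $this->values)) {
            return [false, null];
        }
        return [$this->values[$key], $this->tokens[$key]];
    }

    public function set(string $key, $value): void
    {
        $this->values[$key] = $value;
        $this->tokens[$key] = ($this->tokens[$key] ?? 0) + 1;
    }

    // Writes only if nobody changed the key since $token was handed out.
    public function cas(int $token, string $key, $value): bool
    {
        if (($this->tokens[$key] ?? null) !== $token) {
            return false;
        }
        $this->set($key, $value);
        return true;
    }
}

// Retry the read-modify-write until the CAS succeeds (bounded attempts).
function update(CasStore $store, string $key, callable $modify, int $attempts = 5): bool
{
    for ($i = 0; $i < $attempts; $i++) {
        [$value, $token] = $store->get($key);
        if ($token === null) {
            return false; // key missing: nothing to update
        }
        if ($store->cas($token, $key, $modify($value))) {
            return true;
        }
    }
    return false;
}

$store = new CasStore();
$store->set('article-1', ['status' => 'visible', 'title' => 'old-title']);

[, $staleToken] = $store->get('article-1');
$ok = update($store, 'article-1', function (array $article) {
    $article['status'] = 'hidden';
    return $article;
});
// A writer still holding the old token is rejected instead of overwriting.
$rejected = !$store->cas($staleToken, 'article-1', ['title' => 'new-title']);
```

Because the modification is reapplied to the freshly read value on every attempt, the two concurrent edits from the race-condition slide would both survive.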

Slide 38

Slide 38 text

Caching » Data consistency » Atomic operations!
These examples used Memcached.
These kinds of indivisible operations can be done with other cache engines too, but sometimes differently (e.g. locking).

Slide 39

Slide 39 text

Caching » Data consistency » Stale data
Request 1:
    $data = $cache->get('article-1');
    // set status to hidden
    $data['status'] = 'hidden';
    // save to database (prepared statement, so the value is quoted safely)
    $db->prepare('UPDATE articles SET status = ? WHERE id = ?')
       ->execute([$data['status'], $data['id']]);
    // store to cache
    $cache->set('article-1', $data);
What if this last step fails? (server down, connection failure, fatal error before getting here, …)

Slide 40

Slide 40 text

Caching » Data consistency » Stale data
• Cache failed to update
• Wrong result
• Inconsistent with DB

Slide 41

Slide 41 text

Caching » Data consistency
“There are only two hard things in Computer Science: cache invalidation and naming things.”
Phil Karlton

Slide 42

Slide 42 text

Caching » Data consistency » Cache invalidation
• Check if the operation succeeds
• ->delete() cache keys that are no longer valid
• Set cache expiration times
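The first two points can be combined: write to the database, and only once that has succeeded delete the cached copy so the next reader repopulates it. A minimal sketch under stated assumptions: `MemoryCache` and `saveThenInvalidate()` are hypothetical names, the callback stands in for the real database write, and expiration times are left out.

```php
<?php
// Hypothetical in-memory cache; only get/set/delete are modelled.
final class MemoryCache
{
    private array $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

// $write stands in for the database update and returns true on success.
function saveThenInvalidate(MemoryCache $cache, string $key, callable $write): bool
{
    if (!$write()) {
        return false;     // database unchanged: cached copy is still valid
    }
    $cache->delete($key); // next reader repopulates from the database
    return true;
}

$db = ['status' => 'visible'];
$cache = new MemoryCache();
$cache->set('article-1', $db);

$saved = saveThenInvalidate($cache, 'article-1', function () use (&$db) {
    $db['status'] = 'hidden';
    return true;
});
```

Deleting rather than overwriting means a failed cache call leaves at worst an old entry, which an expiration time would eventually clear.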

Slide 43

Slide 43 text

Caching » Data consistency » Inconsistent data
    $data = array(…);

    $db->beginTransaction();
    $db->prepare('UPDATE articles SET title = ? WHERE id = ?')
       ->execute([$data['title'], $data['id']]);
    $db->prepare('UPDATE blobs SET text = ? WHERE article_id = ?')
       ->execute([$data['text'], $data['id']]);

    // store to cache
    $cache->set('article-1', $data);

    $db->commit();

Slide 44

Slide 44 text

Caching » Data consistency » Inconsistent data
• No problem!
• As long as the transaction succeeds…

Slide 45

Slide 45 text

Caching » Data consistency » Inconsistent data
This example was easy to fix:
    if ($db->commit()) {
        $cache->set('article-1', $data);
    }
But a real application may be harder!

Slide 46

Slide 46 text

Caching » Data consistency » Inconsistent data
If you don’t need to cache, don’t.
If you do, start thinking about it early on. It’s much harder to implement as an afterthought.

Slide 47

Slide 47 text

Caching SCALING CACHE

Slide 48

Slide 48 text

Caching » Scaling cache » Too much data
Too big to fit on 1 machine?
What on earth are you storing? :o

Slide 49

Slide 49 text

Caching » Scaling cache » Too much data
• Store less/smaller data
• Get a bigger machine
• Spread it over multiple machines

Slide 50

Slide 50 text

Caching » Scaling cache » Too much data
Sharding is easy with key-value stores (no relationship between data)
Horizontal partitioning: every server has (a different) part of the data

Slide 51

Slide 51 text

Caching » Scaling cache » Too much data
E.g.:
• Keys that start with A-M → server 1
• Keys that start with N-Z → server 2
(better to use consistent hashing for a more even balance)
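Consistent hashing, mentioned above, places servers and keys on the same hash ring and sends each key to the next server point on the ring. The following is a minimal sketch, not how any particular client library implements it: `HashRing`, the server names, the use of `crc32()` and the replica count of 64 are all illustrative assumptions.

```php
<?php
// Minimal consistent-hash ring; server names and replica count are illustrative.
final class HashRing
{
    /** @var array<int, string> ring position => server name, sorted by position */
    private array $ring = [];

    public function __construct(array $servers, int $replicas = 64)
    {
        foreach ($servers as $server) {
            // Several virtual points per server even out the distribution.
            for ($i = 0; $i < $replicas; $i++) {
                $this->ring[crc32("$server#$i")] = $server;
            }
        }
        ksort($this->ring);
    }

    public function serverFor(string $key): string
    {
        $hash = crc32($key);
        foreach ($this->ring as $position => $server) {
            if ($position >= $hash) {
                return $server; // first point clockwise from the key
            }
        }
        return reset($this->ring); // wrap around to the start of the ring
    }
}

$two = new HashRing(['server 1', 'server 2']);
$three = new HashRing(['server 1', 'server 2', 'server 3']);

// Adding a server should only move keys onto the new server.
$moved = 0;
$movedElsewhere = 0;
for ($i = 0; $i < 1000; $i++) {
    $before = $two->serverFor("key$i");
    $after = $three->serverFor("key$i");
    if ($before !== $after) {
        $moved++;
        if ($after !== 'server 3') {
            $movedElsewhere++;
        }
    }
}
```

With the A-M / N-Z split, adding a third server means redrawing every boundary; with the ring, keys that stay on the old servers keep their place, so their cached entries remain usable.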

Slide 52

Slide 52 text

Caching » Scaling cache » Too much traffic
Shard it!
It’ll also spread the requests

Slide 53

Slide 53 text

Caching » Scaling cache » Too much traffic
Replication… it’s tricky!
• Hard to do client-side (data inconsistency)
• Some cache servers provide it (e.g. Redis)

Slide 54

Slide 54 text

Caching » Scaling cache » Too much traffic
Too many writes? Not an issue!
If you’re writing more than you are reading, you probably shouldn’t be using a cache…

Slide 55

Slide 55 text

Caching » Scaling cache » Stampedes
Slashdot effect
• Unexpected,
• Concurrent traffic, for
• Data not in cache

Slide 56

Slide 56 text

Caching » Scaling cache » Stampedes
Hundreds of requests at the same time, with the data not in cache
= Hundreds of expensive computations, all at once

Slide 57

Slide 57 text

Caching » Scaling cache » Stampedes
• Cache warming
• Stampede protection
  • Locking
  • Early recomputation
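The locking variant can be built on the atomic add() shown earlier: only the one request whose add() on a lock key succeeds runs the expensive operation, the others wait and re-read. A minimal sketch with a hypothetical in-memory `LockableCache` and a hypothetical `getWithLock()` helper; a real lock would also need an expiration time so that a crashed request cannot hold it forever.

```php
<?php
// Hypothetical in-memory cache with the atomic add() the locking relies on.
final class LockableCache
{
    private array $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }

    // Stores the value only if the key does not exist yet.
    public function add(string $key, $value): bool
    {
        if (array_key_exists($key, $this->items)) {
            return false;
        }
        $this->items[$key] = $value;
        return true;
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

function getWithLock(LockableCache $cache, string $key, callable $compute, int $retries = 10)
{
    for ($i = 0; $i <= $retries; $i++) {
        $data = $cache->get($key);
        if ($data !== false) {
            return $data;
        }
        // add() succeeds for exactly one caller: that caller computes.
        if ($cache->add("$key.lock", 1)) {
            try {
                $data = $compute();
                $cache->set($key, $data);
                return $data;
            } finally {
                $cache->delete("$key.lock");
            }
        }
        usleep(1000); // someone else is computing: wait, then re-check the cache
    }
    return $compute(); // gave up waiting: fall back to computing ourselves
}

$calls = 0;
$compute = function () use (&$calls) {
    $calls++;
    return 'result';
};

$cache = new LockableCache();
$a = getWithLock($cache, 'report', $compute); // takes the lock and computes
$b = getWithLock($cache, 'report', $compute); // served from cache
```

The retry count and sleep interval are arbitrary here; they trade waiting time against the risk of falling back to a duplicate computation.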

Slide 58

Slide 58 text

Caching PSR/CACHE & PSR/SIMPLE-CACHE

Slide 59

Slide 59 text

Caching » psr/cache & psr/simple-cache » Why use a PSR library?
• Easy to change cache libraries
• Easy to change cache backends
• Familiarity from other projects
And they probably fix implementation inconsistencies across versions/systems.

Slide 60

Slide 60 text

Caching » psr/cache & psr/simple-cache » psr/simple-cache (PSR-16)
• Memcached-like API
• 1 object to interface with cache directly
    $value = $cache->get('key');
    $cache->set('key', $value);
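PSR-16's get() takes an optional second argument that is returned on a miss (null by default), which makes it possible to cache null or false values and still detect a miss. The sketch below assumes a hypothetical `TinySimpleCache` that only mirrors the shape of PSR-16's get() and set(); it does not implement the real `Psr\SimpleCache\CacheInterface`, ignores the TTL, and `getOrCompute()` is an illustrative helper, not part of the PSR.

```php
<?php
// Hypothetical array-backed class exposing just the PSR-16 get()/set() shapes.
// It does not implement the real Psr\SimpleCache\CacheInterface; TTL is ignored.
final class TinySimpleCache
{
    private array $items = [];

    public function get($key, $default = null)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : $default;
    }

    public function set($key, $value, $ttl = null)
    {
        $this->items[$key] = $value;
        return true;
    }
}

function getOrCompute(TinySimpleCache $cache, string $key, callable $compute, ?int $ttl = null)
{
    $miss = new \stdClass(); // unique sentinel: cannot be identical to a stored value
    $value = $cache->get($key, $miss);
    if ($value === $miss) {
        $value = $compute();
        $cache->set($key, $value, $ttl);
    }
    return $value;
}

$calls = 0;
$compute = function () use (&$calls) {
    $calls++;
    return null; // a legitimate result that happens to be null
};

$cache = new TinySimpleCache();
$first = getOrCompute($cache, 'key', $compute);
$second = getOrCompute($cache, 'key', $compute); // null is served from cache
```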

Slide 61

Slide 61 text

Caching » psr/cache & psr/simple-cache » psr/cache (PSR-6)
• 2 classes (Pool & Item)
• Pool = cache backend; Item = value
• Operations happen on Item objects
    $item = $pool->getItem('key');
    $value = $item->get();
    $item->set($value);
    $pool->save($item);

Slide 62

Slide 62 text

Caching » psr/cache & psr/simple-cache » Get
psr/simple-cache:
    $value = $cache->get('key');
psr/cache:
    $item = $pool->getItem('key');
    $value = $item->get();

Slide 63

Slide 63 text

Caching » psr/cache & psr/simple-cache » Set
psr/simple-cache:
    $cache->set('key', $value, $ttl);
psr/cache:
    $item = $pool->getItem('key');
    $item->set($value);
    $item->expiresAfter($ttl);
    $pool->save($item);

Slide 64

Slide 64 text

Caching » psr/cache & psr/simple-cache » Delete
psr/simple-cache:
    $cache->delete('key');
psr/cache:
    $pool->deleteItem('key');

Slide 65

Slide 65 text

Caching » psr/cache & psr/simple-cache » Flush (clear entire server)
psr/simple-cache:
    $cache->clear();
psr/cache:
    $pool->clear();

Slide 66

Slide 66 text

Caching » psr/cache & psr/simple-cache » Exists
psr/simple-cache:
    $exists = $cache->has('key');
psr/cache:
    $exists = $pool->hasItem('key');

Slide 67

Slide 67 text

Caching » psr/cache & psr/simple-cache » Get multiple
psr/simple-cache:
    $values = $cache->getMultiple(['k1', 'k2']);
    // ['k1' => 'v1', 'k2' => 'v2']
psr/cache:
    $items = $pool->getItems(['k1', 'k2']);
    // ['k1' => object(Item), 'k2' => object(Item)]

Slide 68

Slide 68 text

Caching » psr/cache & psr/simple-cache » Set multiple
psr/simple-cache:
    $cache->setMultiple(['k1' => $v1, 'k2' => $v2]);
psr/cache:
    $items = $pool->getItems(['k1', 'k2']);
    $items['k1']->set($v1);
    $items['k2']->set($v2);
    array_map(array($pool, 'saveDeferred'), $items);
    $pool->commit();

Slide 69

Slide 69 text

Caching » psr/cache & psr/simple-cache » Delete multiple
psr/simple-cache:
    $cache->deleteMultiple(['k1', 'k2']);
psr/cache:
    $pool->deleteItems(['k1', 'k2']);

Slide 70

Slide 70 text

Caching » psr/cache & psr/simple-cache » Which to use?
Whichever you prefer.
Both are equivalent in features.

Slide 71

Slide 71 text

Caching SCRAPBOOK A PHP CACHE LIBRARY: SCRAPBOOK.CASH

Slide 72

Slide 72 text

Caching » Scrapbook » KeyValueStore
• Very similar to psr/simple-cache
  • But supports more operations
  • I’ll get back to this!
• Everything is a KeyValueStore instance
  • Adapters
  • Features

Slide 73

Slide 73 text

Caching » Scrapbook » Adapters
• Memcached
• Redis
• Couchbase
• APC(u)
• MySQL, PostgreSQL, SQLite
• Filesystem
• Memory

Slide 74

Slide 74 text

Caching » Scrapbook » Adapters
E.g.:
    // create \Memcached object pointing to your Memcached server
    $client = new \Memcached();
    $client->addServer('localhost', 11211);
    // create Scrapbook KeyValueStore object
    $cache = new \MatthiasMullie\Scrapbook\Adapters\Memcached($client);
or:
    // create Scrapbook KeyValueStore object
    $cache = new \MatthiasMullie\Scrapbook\Adapters\Apc();

Slide 75

Slide 75 text

Caching » Scrapbook » Features
• Local Buffer
• Transactions
• Stampede protection
• Sharding
They’re all KeyValueStore interfaces!

Slide 76

Slide 76 text

Caching » Scrapbook » Local Buffer
Avoids multiple requests for the same key.
    // wrap BufferedStore around adapter
    $cache = new \MatthiasMullie\Scrapbook\Buffered\BufferedStore($cache);

    $cache->get('key');
    $cache->get('key');
Only reaches out to the cache once.
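The idea behind a local buffer can be shown with a small decorator that remembers what it has already fetched during the current request. This is an illustrative sketch only, not Scrapbook's BufferedStore implementation: `CountingBackend` and `LocalBuffer` are hypothetical classes, and the backend counts reads so the saved round trip is visible.

```php
<?php
// Hypothetical backend that counts round trips.
final class CountingBackend
{
    public int $reads = 0;
    private array $items = ['key' => 'value'];

    public function get(string $key)
    {
        $this->reads++;
        return $this->items[$key] ?? false;
    }
}

// Decorator that remembers values already fetched during this request.
final class LocalBuffer
{
    private array $local = [];
    private CountingBackend $backend;

    public function __construct(CountingBackend $backend)
    {
        $this->backend = $backend;
    }

    public function get(string $key)
    {
        // Misses (false) are remembered too, so they are not re-requested either.
        if (!array_key_exists($key, $this->local)) {
            $this->local[$key] = $this->backend->get($key);
        }
        return $this->local[$key];
    }
}

$backend = new CountingBackend();
$cache = new LocalBuffer($backend);

$first = $cache->get('key');  // goes to the backend
$second = $cache->get('key'); // answered from the local buffer
```

The trade-off is the same staleness question as before, one level down: a value changed by another request stays invisible until the local copy is dropped.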

Slide 77

Slide 77 text

Caching » Scrapbook » Transactions
Similar to transactions in databases.
    // wrap TransactionalStore around adapter
    $cache = new \MatthiasMullie\Scrapbook\Buffered\TransactionalStore($cache);

    $cache->begin();
    $cache->add('key', $value);
    $cache->replace('other-key', $value2);
    $cache->commit(); // or rollback();
Either both succeed, or both fail.

Slide 78

Slide 78 text

Caching » Scrapbook » Transactions
Caveat: transactions are not native to most cache backends.
Because of the workarounds involved, the time-to-live is not remembered when values are restored. Only use this with an infinite TTL.

Slide 79

Slide 79 text

Caching » Scrapbook » Stampede protection
Stampede protection (with locking):
    // wrap StampedeProtector around adapter
    $cache = new \MatthiasMullie\Scrapbook\Scale\StampedeProtector($cache);
When no lock can be obtained (another request is already processing), it just waits instead of running the complex operation itself.

Slide 80

Slide 80 text

Caching » Scrapbook » Sharding
    // first Redis server
    $client = new \Redis();
    $client->connect('192.168.1.100');
    $cache1 = new \MatthiasMullie\Scrapbook\Adapters\Redis($client);

    // second Redis server
    $client2 = new \Redis();
    $client2->connect('192.168.1.101');
    $cache2 = new \MatthiasMullie\Scrapbook\Adapters\Redis($client2);

    // wrap Shard around adapters
    $cache = new \MatthiasMullie\Scrapbook\Scale\Shard($cache1, $cache2);

    $cache->set('key', $value);
    $cache->set('key2', $value2);
'key' goes to the first server, 'key2' goes to the second

Slide 81

Slide 81 text

Caching » Scrapbook » Multiple features
Just keep wrapping them:
    // init adapter
    $cache = new \MatthiasMullie\Scrapbook\Adapters\Apc();
    // add stampede protection
    $cache = new \MatthiasMullie\Scrapbook\Scale\StampedeProtector($cache);
    // add local buffer
    $cache = new \MatthiasMullie\Scrapbook\Buffered\BufferedStore($cache);
    // add transactions
    $cache = new \MatthiasMullie\Scrapbook\Buffered\TransactionalStore($cache);
    …

Slide 82

Slide 82 text

Caching » Scrapbook » Supports more operations
We’ve been over these already…
    $cache->cas($token, 'key', $value);
    $cache->add('key', $value);
    $cache->replace('key', $value);
    $cache->increment('key', $offset, $initial);
    $cache->decrement('key', $offset, $initial);

Slide 83

Slide 83 text

Caching » Scrapbook » Use PSR
If you don’t need these operations, stick to psr/cache or psr/simple-cache.
Scrapbook comes with adapters for both PSRs:
    $psr6 = new \MatthiasMullie\Scrapbook\Psr6\Pool($keyvaluestore);
    $item = $psr6->getItem('key');

    $psr16 = new \MatthiasMullie\Scrapbook\Psr16\SimpleCache($keyvaluestore);
    $value = $psr16->get('key');

Slide 85

Slide 85 text

Questions? Caching

Slide 86

Slide 86 text

Caching » Resources
mullie.eu • scrapbook.cash