Slide 1

Give Your Site A Boost With Memcache
Ben Ramsey
July 23, 2008

Slide 2

Why cache?

Slide 3

To make it faster.

Slide 4

“A cache is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache. In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access.”
Source: Wikipedia

Slide 5

Why cache?
- You want to reduce the number of retrieval queries made to the database
- You want to reduce the number of external requests (retrieving data from other web services)
- You want to cut down on filesystem access

Slide 6

Caching options
- Flat file caching
- Caching data in the database
- MySQL 4.x query caching
- Shared memory (APC)
- RAM disk
- memcached

Slide 7

What is memcached?
- Distributed Memory Object Caching System
- Caching daemon
- Developed by Danga Interactive for LiveJournal.com
- Uses RAM for storage
- Acts as a dictionary of stored data with key/value pairs

Slide 8

Is memcached fast?
- Stored in memory (RAM), not on disk
- Uses non-blocking network I/O (TCP/IP)
- Uses libevent to scale to any number of open connections
- Uses its own slab allocator and hash table to ensure virtual memory never gets externally fragmented and allocations are guaranteed O(1)

Slide 9

General usage
1. Set up a pool of memcached servers
2. Assign values to keys that are stored in the cluster
3. The memcache client hashes the key to a particular machine in the cluster
4. Subsequent requests for that key retrieve the value from the memcached server on which it was stored
5. Values time out after the specified TTL
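
The five steps above can be sketched without a running server. The following is a minimal in-memory simulation in Python, not a real memcached client: the `ToyPool` class, the server addresses, and the CRC32-modulo hashing are all assumptions chosen for illustration, and real client libraries may pick servers differently.

```python
import time
import zlib


class ToyPool:
    """In-memory stand-in for a pool of memcached servers (illustration only)."""

    def __init__(self, servers):
        # Step 1: each "server" is simulated by a plain dict.
        self.names = list(servers)
        self.stores = {name: {} for name in self.names}

    def server_for(self, key):
        # Step 3: the client hashes the key to pick one server.
        index = zlib.crc32(key.encode("utf-8")) % len(self.names)
        return self.names[index]

    def set(self, key, value, ttl, now=None):
        # Step 2: store the value together with its expiry time.
        now = time.time() if now is None else now
        self.stores[self.server_for(key)][key] = (value, now + ttl)

    def get(self, key, now=None):
        # Step 4: the same hash leads back to the same server.
        now = time.time() if now is None else now
        store = self.stores[self.server_for(key)]
        entry = store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # Step 5: the value has outlived its TTL.
            del store[key]
            return None
        return value


# Hypothetical addresses; "now" is passed explicitly to make the TTL visible.
pool = ToyPool(["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"])
pool.set("user:42", {"name": "Ben"}, ttl=60, now=1000)
print(pool.get("user:42", now=1030))  # within the TTL: the stored value
print(pool.get("user:42", now=1061))  # past the TTL: None
```

Note that the servers never talk to each other in this sketch; the choice of server happens entirely on the client side.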

Slide 10

Memcached principles
- It’s a non-blocking server
- It is not a database
- It does not provide redundancy
- It doesn't handle failover
- It does not provide authentication

Slide 11

Memcached principles
- Data is not replicated across the cluster
- Works well on a small, local-area network
- A single value cannot contain more than 1MB of data
- Keys are strings limited to 250 characters
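
The size limits above are easy to check before an item is sent to the cache. This is a small Python sketch under stated assumptions: `check_item` is a hypothetical helper, the 250-character and 1MB figures are taken from the slide, and the whitespace/control-character rule comes from the memcached text protocol rather than from this deck.

```python
MAX_KEY_LENGTH = 250            # characters, per the slide
MAX_VALUE_BYTES = 1024 * 1024   # 1MB, per the slide


def check_item(key, value_bytes):
    """Return a list of problems that would stop the item from being cached."""
    problems = []
    if len(key) > MAX_KEY_LENGTH:
        problems.append("key longer than 250 characters")
    # The text protocol separates fields with spaces, so keys cannot
    # contain whitespace or control characters.
    if any(ch.isspace() or ord(ch) < 32 for ch in key):
        problems.append("key contains whitespace or control characters")
    if len(value_bytes) > MAX_VALUE_BYTES:
        problems.append("value larger than 1MB")
    return problems


print(check_item("user:42", b"small value"))
print(check_item("k" * 300, b"small value"))
```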

Slide 12

Storing data in the pool
- Advantage is in scalability
- To fully see the advantage, use a “pool”
- memcached itself doesn't know about the pool
- The pool is created by and managed from the client library

Slide 13

[Diagram: web servers www 1, www 2, and www 3 with three memcached instances]

Slide 14

Deterministic failover
- When one server goes down, the system fails over to another server in the pool
- Memcached does not provide this
- Some memcache clients provide failover
- If you can’t find the data in memcache, eat the look-up cost and retrieve from your data source again, storing it back to the cache
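
The last point, falling back to the data source on a miss and storing the result back, can be sketched as follows. This is an illustrative Python example, not code from the talk: `fetch_with_cache` and `slow_lookup` are hypothetical names, and a plain dict stands in for the memcache client.

```python
def fetch_with_cache(cache, key, load_from_source):
    """Return the cached value, rebuilding it from the source on a miss."""
    value = cache.get(key)
    if value is None:
        # Miss (never stored, expired, or its server is gone): pay the
        # look-up cost once and store the result back.
        value = load_from_source()
        cache[key] = value
    return value


calls = []


def slow_lookup():
    calls.append(1)               # count how often the data source is hit
    return ["row 1", "row 2"]


cache = {}                        # a dict stands in for the memcache client
first = fetch_with_cache(cache, "report", slow_lookup)
second = fetch_with_cache(cache, "report", slow_lookup)   # served from cache
cache.clear()                     # simulate losing the server holding the key
third = fetch_with_cache(cache, "report", slow_lookup)    # rebuilt from source
print(len(calls))                 # number of times the source was queried
```

Because a miss and a lost server look the same to the caller, the application keeps working when a cache node disappears; it only gets slower until the cache is warm again.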

Slide 15

[Diagram: web servers www 1, www 2, and www 3 with their memcached instances. Labels: “Data inaccessible!” and “Recreate data; Store back to memcache”]

Slide 16

The memcached protocol API
- Storage commands: set, add, replace, append, prepend, cas
- Retrieval commands: get, gets
- Deletion command: delete
- Increment/decrement: incr, decr
- Other commands: stats, flush_all, version, verbosity, quit

Slide 17

    $> telnet localhost 11211
    Trying ::1...
    Connected to localhost.
    Escape character is '^]'.
    set foobar 0 0 15
    This is a test.
    STORED
    get foobar
    VALUE foobar 0 15
    This is a test.
    END
    quit
    Connection closed by foreign host.
    $>
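
The telnet session above shows the shape of the text protocol: a `set` line carries the key, flags, expiry time, and byte count, followed by the data itself. As a rough sketch of that framing in Python (no network connection is made, `build_set` and `parse_get` are hypothetical helper names, and only a single-key `get` reply is handled):

```python
def build_set(key, data, flags=0, exptime=0):
    """Build the bytes for a text-protocol "set" request."""
    header = "set {} {} {} {}\r\n".format(key, flags, exptime, len(data))
    return header.encode("ascii") + data + b"\r\n"


def parse_get(response):
    """Parse a single-key "get" reply; return (flags, data) or None on a miss."""
    if response == b"END\r\n":
        return None
    header, rest = response.split(b"\r\n", 1)
    _, _key, flags, length = header.split(b" ")
    # The byte count in the header says how much of the body is data.
    return int(flags), rest[:int(length)]


request = build_set("foobar", b"This is a test.")
reply = b"VALUE foobar 0 15\r\nThis is a test.\r\nEND\r\n"
print(request)
print(parse_get(reply))
```

The `15` in the session is simply the length in bytes of `This is a test.`, which is why the server knows where the value ends.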

Slide 18

Setting it up
- http://danga.com/memcached/
    $> ./configure; make; make install
    $> memcached -d -m 2048 -p 11211
- Done!
- Windows port of v1.2.4 at http://www.splinedancer.com/memcached-win32/

Slide 19

Memcached clients
- Perl, Python, Ruby, Java, C#
- C (libmemcached)
- PostgreSQL (access memcached from procs and triggers)
- MySQL (adds memcache_engine storage engine)
- PHP (pecl/memcache)

Slide 20

pecl/memcache
- The PHP client for connecting to memcached and managing a pool of memcached servers
- http://pecl.php.net/package/memcache
    $> pecl install memcache
- Stable: 2.2.3
- Beta: 3.0.1

Slide 21

Slide 22

Features of pecl/memcache
- memcache.allow_failover
- memcache.hash_strategy
- memcache.hash_function
- memcache.protocol
- memcache.redundancy
- memcache.session_redundancy
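
The `memcache.hash_strategy` setting selects how keys are mapped to servers, with consistent hashing as one of the options. To show why that matters, here is a generic consistent-hashing ring in Python. It is a sketch of the general technique and not the extension's actual implementation; the `Ring` class, the server names, the use of MD5, and the 100 points per server are all assumptions.

```python
import bisect
import hashlib


def _point(text):
    # Map a string to a position on the ring using MD5 (arbitrary choice).
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, servers, points_per_server=100):
        # Each server is placed on the ring at many pseudo-random points.
        self.ring = sorted(
            (_point("{}#{}".format(server, i)), server)
            for server in servers
            for i in range(points_per_server)
        )
        self.positions = [pos for pos, _ in self.ring]

    def server_for(self, key):
        # Walk clockwise to the first server point at or after the key,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_left(self.positions, _point(key))
        return self.ring[index % len(self.ring)][1]


keys = ["key{}".format(i) for i in range(1000)]
before = Ring(["cache-a", "cache-b", "cache-c"])   # hypothetical names
after = Ring(["cache-a", "cache-b"])               # cache-c has gone away
moved = [k for k in keys if before.server_for(k) != after.server_for(k)]

# Only keys that lived on the removed server are remapped.
print(all(before.server_for(k) == "cache-c" for k in moved))
```

With plain modulo hashing, by contrast, changing the number of servers reshuffles most keys, so a single failed node can empty much of the cache.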

Slide 23

pecl/memcache interface
- MemcachePool::connect()
- MemcachePool::addServer()
- MemcachePool::setServerParams()
- MemcachePool::get()
- MemcachePool::add()
- MemcachePool::set()
- MemcachePool::replace()
- MemcachePool::cas()

Slide 24

pecl/memcache interface
- MemcachePool::append()
- MemcachePool::prepend()
- MemcachePool::delete()
- MemcachePool::increment()
- MemcachePool::decrement()
- MemcachePool::setFailureCallback()

Slide 25

Key hashing
- Keys longer than 250 characters are truncated without warning
- Good practice to hash your key (with MD5 or SHA) at the userland level to ensure long keys don’t get truncated
- Keys are “global”
- Use something to uniquely identify keys, e.g. a method signature or an SQL statement
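
Hashing the identifier yourself keeps every key short and of fixed length, however long the SQL statement or method signature is. A minimal Python sketch, assuming a hypothetical `cache_key` helper and a namespace prefix to keep "global" keys from colliding across applications:

```python
import hashlib


def cache_key(namespace, identifier):
    """Build a short, fixed-length key from an arbitrarily long identifier."""
    digest = hashlib.md5(identifier.encode("utf-8")).hexdigest()
    return "{}:{}".format(namespace, digest)


# Hypothetical query used as the unique identifier.
sql = "SELECT id, title FROM posts WHERE author_id = 7 ORDER BY created DESC"
key = cache_key("blog", sql)
print(key, len(key))
```

The MD5 hex digest is always 32 characters, so the key length depends only on the namespace and stays well under the 250-character limit.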

Slide 26

Object serialization
- Objects are serialized before being stored to memcache:
    get key
    VALUE key 1 59
    O:8:"stdClass":2:{s:3:"foo";s:3:"bar";s:3:"baz";s:3:"quz";}
    END
- Extension unserializes them before returning the object
- Only objects that can be serialized safely can be stored to memcache; DOM and SimpleXML objects, for example, cause problems
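
To make the stored payload less opaque, the serialized form above can be reproduced by hand. The Python sketch below imitates the PHP serialization format for an object whose properties are all public strings; `php_serialize_object` is a hypothetical helper that covers only this one case and is not a general serializer.

```python
def php_serialize_object(class_name, properties):
    """Serialize string properties in the style of PHP's serialize()."""
    def s(text):
        # Each string is written with its byte length in front.
        return 's:{}:"{}";'.format(len(text.encode("utf-8")), text)

    body = "".join(s(name) + s(value) for name, value in properties.items())
    # O:<class name length>:"<class name>":<property count>:{<properties>}
    return 'O:{}:"{}":{}:{{{}}}'.format(
        len(class_name), class_name, len(properties), body
    )


payload = php_serialize_object("stdClass", {"foo": "bar", "baz": "quz"})
print(payload)
print(len(payload))   # matches the byte count in the VALUE line
```

The `59` in `VALUE key 1 59` is the length of this serialized string, which is what memcached actually stores; it has no notion of PHP objects.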

Slide 27

Redundancy and failover
- memcache.redundancy & memcache.session_redundancy
- Implement redundancy at the userland level?
- Again, memcache is not a database

Slide 28

Extending MemcachePool
- Implement global values vs. page-specific values
- Ensure a single instance of the MemcachePool object
- Do complex key hashing, if you so choose
- Set a default expiration for all your data
- Add all of your servers upon object instantiation

Slide 29

Database techniques
- Create a wrapper for mysql_query() that checks the cache first and returns an array of database results
- Extend PDO to store results to the cache and get them when you execute a statement
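
The query-wrapper idea can be sketched as follows. To stay self-contained, this example swaps in Python's built-in `sqlite3` module for MySQL and a plain dict for the memcache client; `cached_query` is a hypothetical helper, and the table and data are made up for illustration.

```python
import hashlib
import sqlite3


def cached_query(conn, cache, sql, params=()):
    """Return query results as a list of rows, consulting the cache first."""
    key = hashlib.md5(repr((sql, params)).encode("utf-8")).hexdigest()
    rows = cache.get(key)
    if rows is None:
        # Fetch every row now: a list can be cached, a live cursor cannot.
        rows = conn.execute(sql, params).fetchall()
        cache[key] = rows
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO posts (title) VALUES ('Hello'), ('World')")
cache = {}   # stands in for a memcache client

query = "SELECT id, title FROM posts ORDER BY id"
first = cached_query(conn, cache, query)    # hits the database
conn.execute("DELETE FROM posts")           # the cached copy is now stale
second = cached_query(conn, cache, query)   # served from the cache
print(first == second)
```

The second call returns the old rows even though the table is empty, which illustrates the trade-off: cached results stay fast but can be stale until they expire or are deleted.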

Slide 30

Database techniques
- For large datasets, run a scheduled query once an hour and store it to the cache
- Please note: memcached can store arrays, objects, etc., but it cannot store a resource, which some database functions (e.g. mysql_query()) return

Slide 31

Session storage
- As of 2.1.1, you can set the session save handler as “memcache” and all will work automagically
    session.save_handler = memcache
    session.save_path = "tcp://192.168.1.10:11211,tcp://192.168.1.11:11211,tcp://192.168.1.12:11211"
- Store sessions to both the database and memcache
- Write your own session handler that stores to the database and memcache
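
The last two points describe a handler that writes to both stores so that a lost cache node does not destroy the session. A minimal Python sketch of that idea, assuming a hypothetical `DualSessionStore` class with plain dicts standing in for the memcache pool and the database:

```python
class DualSessionStore:
    """Write sessions to both stores; read from the cache, then the database."""

    def __init__(self, cache, database):
        self.cache = cache
        self.database = database

    def write(self, session_id, data):
        self.database[session_id] = data   # durable copy
        self.cache[session_id] = data      # fast copy

    def read(self, session_id):
        data = self.cache.get(session_id)
        if data is None:
            # Cache miss: fall back to the database and repopulate the cache.
            data = self.database.get(session_id)
            if data is not None:
                self.cache[session_id] = data
        return data


cache, database = {}, {}
sessions = DualSessionStore(cache, database)
sessions.write("abc123", {"user_id": 42})
cache.clear()                              # simulate losing a memcached server
print(sessions.read("abc123"))             # recovered from the database
```

Reads stay fast in the common case, and the database is only touched on writes and on cache misses.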

Slide 32

[Diagram: web servers www 1, www 2, and www 3, each with a memcached instance. Labels: “Session inaccessible!” and “Need to recreate the session!”]

Slide 33

A demonstration...

Slide 34

For more information...
- http://danga.com/memcached/
- http://pecl.php.net/package/memcache
- http://www.socialtext.net/memcached/

Slide 35

Thank You
Slides available for download at benramsey.com.
Ben Ramsey
Software Architect
Schematic
http://www.schematic.com/

Slide 36

We’re Hiring!
It goes without saying: Schematic is only as good as the people who work here. That’s why we’re so particular about recruiting, training, nurturing, and retaining the very best people in our field. If you have digital expertise (technical, creative, managerial, or something else entirely), enthusiasm, curiosity, and the ability to collaborate with others, we’d love to hear from you. Please visit http://www.schematic.com/ for more information.