Buckets
(Photo: http://www.flickr.com/photos/linneberg/4481309196/)
• Buckets contain things
• Things are keys that have values
• A bucket is like a namespace (and it’s cheap - could even be free)
An entry
• lives in a bucket (an arbitrary name)
• has a key (an arbitrary name)
• has a value (a binary blob and a MIME type)
Bucket and key together form the path to the value
= Store anything, yay!
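To make "bucket and key form the path to the value" concrete, here is a minimal sketch. It assumes the node serves entries under a `/riak` prefix; the `entryPath` helper and the sample entry are hypothetical and only illustrate the idea.

```javascript
// Build the HTTP path for an entry from its bucket and key.
// Assumption: the node exposes entries under the "/riak" prefix.
function entryPath(bucket, key) {
  return '/riak/' + encodeURIComponent(bucket) + '/' + encodeURIComponent(key);
}

// Hypothetical entry: the value is a binary blob plus its MIME type.
var entry = {
  bucket: 'photos',
  key: 'bucket.jpg',
  contentType: 'image/jpeg',
  value: Buffer.from([0xff, 0xd8, 0xff])
};

console.log(entryPath(entry.bucket, entry.key)); // prints "/riak/photos/bucket.jpg"
```

Escaping both names keeps arbitrary bucket and key names from breaking the path.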
What is Riak and what’s the agenda?
• Decentralized key-value store
• A database ideally suited for web applications
• A flexible map/reduce engine
Database -> Persistence -> Durability -> Storage backends
Storage
Riak Key/Value Store back-ends:
• Bitcask: disk based, durable (default)
• InnoDB: disk based, durable (common)
• DETS: disk based, durable
• File system: disk based, durable
• ETS: RAM based, not durable
• Balanced trees: RAM based, not durable
• LRU: RAM based, not durable
Notes:
• Pluggable back-ends, per bucket
• Bitcask keeps the key set in memory; in some (rare) cases it might not fit, then look to InnoDB
• ETS is built-in Erlang storage; DETS is ETS on disk
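Why the Bitcask key set has to fit in memory can be seen in a toy model. This is only a sketch of the idea, not Bitcask's actual file format: `ToyBitcask` is a hypothetical name, and an array stands in for the on-disk data file.

```javascript
// Toy model of the Bitcask idea: values go to an append-only log,
// while an in-memory index maps every key to its latest log position.
function ToyBitcask() {
  this.log = [];           // stands in for the on-disk data file
  this.keydir = new Map(); // in-memory key set: key -> index into the log
}

ToyBitcask.prototype.put = function (key, value) {
  this.log.push({ key: key, value: value }); // append, never overwrite
  this.keydir.set(key, this.log.length - 1); // point the key at the newest record
};

ToyBitcask.prototype.get = function (key) {
  if (!this.keydir.has(key)) return undefined; // a miss is answered from memory alone
  return this.log[this.keydir.get(key)].value; // a hit costs one log lookup
};

var store = new ToyBitcask();
store.put('a', 'first');
store.put('a', 'second');
console.log(store.get('a'));    // prints "second"
console.log(store.keydir.size); // prints 1
console.log(store.log.length);  // prints 2
```

The index grows with the number of keys, not with the size of the values, which is why only very large key sets become a problem.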
Decentralized -> Cluster
• No master node
• No single point of failure
The Ring
(Figure: a ring with ring size = 12, partitions numbered 1 to 12)
• A ring size of 1024 should accommodate most needs
• Once you’ve set your ring size, it’s fixed
• The only way to change it is to back up and restore your entire cluster
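A rough sketch of how a key could land on one of the ring's partitions, and why the ring size is baked in. Everything here is a simplification under stated assumptions: the bucket and key are joined into one string, only the first 32 bits of a SHA-1 digest are used, and partitions are handed to hypothetical nodes round-robin. It runs on Node.js.

```javascript
var crypto = require('crypto');

// Map a bucket/key pair to one of ringSize partitions (0-based).
// Simplification: only the first 32 bits of the SHA-1 digest are used.
function partitionFor(bucket, key, ringSize) {
  var digest = crypto.createHash('sha1').update(bucket + '/' + key).digest();
  var position = digest.readUInt32BE(0); // 0 .. 2^32 - 1
  return Math.floor(position / 0x100000000 * ringSize);
}

// Hand out partitions to nodes round-robin (node names are hypothetical).
function claimPartitions(ringSize, nodes) {
  var owners = [];
  for (var p = 0; p < ringSize; p++) owners.push(nodes[p % nodes.length]);
  return owners;
}

var owners = claimPartitions(12, ['node1', 'node2', 'node3']);
var p = partitionFor('photos', 'bucket.jpg', 12);
// The slide numbers partitions from 1, so add one for display.
console.log('partition', p + 1, 'is owned by', owners[p]);
```

Because the partition index depends on `ringSize`, changing the ring size would move existing keys to different partitions.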
The N to the R to the W to the DW and the RW
• N: number of copies, i.e. distribute to N nodes
• R: read, i.e. have R nodes agree
• W: write, i.e. ack’d by W nodes
• DW: durable write, i.e. persistently written by DW nodes
• RW: read-write, i.e. persistently deleted by RW nodes
Buckets have defaults for R, W, DW and RW
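The R and W rules can be sketched as simple counting. This is an illustration only: `readWithQuorum` and `writeAcked` are hypothetical helpers, and replica values are compared as plain strings, which is an assumption made to keep the example short.

```javascript
// Return the value that at least r replicas agree on, or null if no value reaches r.
// replies holds one value per replica that answered (at most N of them).
function readWithQuorum(replies, r) {
  var tally = new Map();
  for (var i = 0; i < replies.length; i++) {
    var votes = (tally.get(replies[i]) || 0) + 1;
    tally.set(replies[i], votes);
    if (votes >= r) return replies[i];
  }
  return null;
}

// A write succeeds once w replicas have acknowledged it.
function writeAcked(acks, w) {
  return acks >= w;
}

// N = 3 copies; one replica still holds a stale value.
console.log(readWithQuorum(['v2', 'v1', 'v2'], 2)); // prints "v2"
console.log(readWithQuorum(['v2', 'v1'], 2));       // prints null
console.log(writeAcked(2, 3));                      // prints false
```

Lower R and W answer faster and survive more failed nodes; higher values demand more agreement before a request counts as successful.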
I do care!
• Resolve conflicts in application logic
• Conflicts are exposed as siblings beneath a key
• Response is HTTP 300 Multiple Choices
• Siblings are served as multipart/mixed
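Resolving siblings in application logic might look like the sketch below. It assumes the sibling bodies have already been pulled out of the multipart response, and that each one is a JSON document of the form `{"items": [...]}`; the shopping cart and the `resolveSiblings` helper are hypothetical.

```javascript
// Merge sibling values by taking the union of their items.
// Assumption: every sibling is a JSON document of the form {"items": [...]}.
function resolveSiblings(siblings) {
  var seen = {};
  var merged = [];
  siblings.forEach(function (body) {
    JSON.parse(body).items.forEach(function (item) {
      if (!Object.prototype.hasOwnProperty.call(seen, item)) {
        seen[item] = true;
        merged.push(item);
      }
    });
  });
  return { items: merged.sort() };
}

// Two conflicting versions of a hypothetical shopping cart.
var siblings = ['{"items":["milk","eggs"]}', '{"items":["eggs","bread"]}'];
console.log(JSON.stringify(resolveSiblings(siblings)));
// prints {"items":["bread","eggs","milk"]}
```

A union never loses an added item, but it also cannot express a removal; which merge rule is right depends on the application.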
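Map / Reduce: count words (map)
function(v) {
  // Split the value into lower-cased words; match() returns null when nothing matches.
  var words = v.values[0].data.toLowerCase().match(/\w+/g) || [];
  var counts = [];
  // Emit one {word: 1} object per occurrence.
  for (var i = 0; i < words.length; i++) {
    var count = {};
    count[words[i]] = 1;
    counts.push(count);
  }
  return counts;
}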
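Map / Reduce: count words (reduce)
function(values) {
  var result = {};
  for (var i = 0; i < values.length; i++) {
    for (var word in values[i]) {
      // Sum the partial counts; the own-property check guards words such as "constructor".
      if (Object.prototype.hasOwnProperty.call(result, word)) result[word] += values[i][word];
      else result[word] = values[i][word];
    }
  }
  return [result];
}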
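The two phases can be tried without a cluster. The sketch below is a local dry run: `mapWords` and `reduceCounts` are hypothetical stand-ins for the map and reduce functions, and the two fake entries only mimic the `values[0].data` shape the map function reads.

```javascript
// Local stand-ins for the map and reduce phases (names are hypothetical).
function mapWords(v) {
  var words = v.values[0].data.toLowerCase().match(/\w+/g) || [];
  return words.map(function (word) {
    var count = {};
    count[word] = 1;
    return count;
  });
}

function reduceCounts(values) {
  var result = {};
  values.forEach(function (partial) {
    Object.keys(partial).forEach(function (word) {
      var seen = Object.prototype.hasOwnProperty.call(result, word);
      result[word] = (seen ? result[word] : 0) + partial[word];
    });
  });
  return [result];
}

// Two fake entries shaped like the objects the map function reads.
var entries = [
  { values: [{ data: 'Buckets contain things' }] },
  { values: [{ data: 'Things are keys' }] }
];

// Map every entry, flatten the results, then reduce them in one pass.
var mapped = entries.reduce(function (all, e) { return all.concat(mapWords(e)); }, []);
console.log(JSON.stringify(reduceCounts(mapped)[0]));
// prints {"buckets":1,"contain":1,"things":2,"are":1,"keys":1}
```

The reduce function returns a list holding one object of the same shape as its inputs, so feeding its output back into it gives the same totals.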
Map & Reduce: count words
{"inputs": "bucket",
 "query": [
  {"map": {"language": "javascript",
           "source": "function(v) { var words = v.values[0].data.toLowerCase().match(/\\w+/g) || []; var counts = []; for (var i = 0; i < words.length; i++) { var count = {}; count[words[i]] = 1; counts.push(count); } return counts; }"}},
  {"reduce": {"language": "javascript",
              "source": "function(values) { var result = {}; for (var i = 0; i < values.length; i++) { for (var word in values[i]) { if (Object.prototype.hasOwnProperty.call(result, word)) result[word] += values[i][word]; else result[word] = values[i][word]; } } return [result]; }"}}]}
Put this in your POST request and let Riak smoke it
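Escaping function source by hand inside JSON is error-prone (note the doubled backslash in the regular expression). A minimal sketch that lets `JSON.stringify` do the escaping is shown here; `buildJob` is a hypothetical helper, and the bucket name is the placeholder from the slide.

```javascript
// Phase functions are written as ordinary code and serialized with toString().
var mapFn = function (v) {
  var words = v.values[0].data.toLowerCase().match(/\w+/g) || [];
  var counts = [];
  for (var i = 0; i < words.length; i++) {
    var count = {};
    count[words[i]] = 1;
    counts.push(count);
  }
  return counts;
};

var reduceFn = function (values) {
  var result = {};
  for (var i = 0; i < values.length; i++) {
    for (var word in values[i]) {
      if (Object.prototype.hasOwnProperty.call(result, word)) result[word] += values[i][word];
      else result[word] = values[i][word];
    }
  }
  return [result];
};

// Build the JSON job description; JSON.stringify escapes backslashes and newlines.
function buildJob(bucket, mapFunction, reduceFunction) {
  return JSON.stringify({
    inputs: bucket,
    query: [
      { map: { language: 'javascript', source: mapFunction.toString() } },
      { reduce: { language: 'javascript', source: reduceFunction.toString() } }
    ]
  });
}

var body = buildJob('bucket', mapFn, reduceFn);
console.log(body);
```

The resulting string is what goes into the body of the POST request, sent with a JSON content type.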
The whole enchilada
• Erlang / OTP
• Riak Core: partitioning (consistent hashing, hinted handoff), membership management (leave/join), work distribution, cluster state (gossip protocol)
• Riak Key/Value Store, with back-ends: Bitcask, InnoDB, DETS, ETS, balanced trees, file system, LRU
• Riak Search
• Luwak
• HTTP API