Slide 1

Slide 1 text

Riak, Ripple, and Rails Designing Apps for Eventual Consistency Orlando Ruby Users Group June 14, 2012 Thursday, June 14, 12

Slide 2

Slide 2 text

I’m Bryce Kerley

Slide 8

Slide 8 text

I’m Bryce Kerley basho

Slide 9

Slide 9 text

I’m Bryce Kerley basho Developer Advocate

Slide 11

Slide 11 text

I’m Bryce Kerley basho Developer Advocate [email protected] @bonzoesc

Slide 12

Slide 12 text

I’m Bryce Kerley

Slide 13

Slide 13 text

I’m Bryce Kerley #riak on Freenode

Slide 15

Slide 15 text

Riak

Slide 16

Slide 16 text

Riak Distributed

Slide 17

Slide 17 text

Riak Distributed Eventually Consistent

Slide 18

Slide 18 text

Riak Distributed Eventually Consistent Fault-tolerant

Slide 19

Slide 19 text

Riak Distributed Eventually Consistent Fault-tolerant Open Source

Slide 20

Slide 20 text

Riak is Distributed r1n1 r1n2 r1n3 r1n4 r1n5

Slide 22

Slide 22 text

Riak is Distributed r1n1 r1n2 r1n3 r1n4 r1n5 Nodes are joined into a cluster

Slide 23

Slide 23 text

Riak is Distributed r1n1 r1n2 r1n3 r1n4 r1n5 Nodes are joined into a cluster All nodes equal

Slide 24

Slide 24 text

Riak is Distributed r1n1 r1n2 r1n3 r1n4 r1n5 Nodes are joined into a cluster All nodes equal Self-sharding

Slide 25

Slide 25 text

Riak is Distributed r1n1 r1n2 r1n3 r1n4 r1n5 Nodes are joined into a cluster All nodes equal Self-sharding Generalizable structure for distributed software

Slide 26

Slide 26 text

Riak is Distributed r1n1 r1n2 r1n3 r1n4 r1n5 Nodes are joined into a cluster All nodes equal Self-sharding Generalizable structure for distributed software http://git.io/riak_core

Slide 27

Slide 27 text

Riak is Eventually Consistent

Slide 28

Slide 28 text

Riak is Eventually Consistent No locks

Slide 29

Slide 29 text

Riak is Eventually Consistent No locks Simultaneous Writes

Slide 30

Slide 30 text

Riak is Eventually Consistent Siblings vs. Last Write Wins
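
The difference between the two strategies can be sketched in plain Ruby. This is only an in-memory illustration: the `Version` struct, the timestamps, and the union-style merge are assumptions made for the example, not part of Riak, riak-client, or Ripple.

```ruby
# Hypothetical in-memory model of two conflicting writes to one key.
# Version is an illustrative struct, not a riak-client or Ripple class.
Version = Struct.new(:value, :written_at)

# Last write wins: keep only the version with the newest timestamp,
# silently discarding whatever the other writer stored.
def last_write_wins(versions)
  versions.max_by(&:written_at)
end

# Siblings: the application is handed every conflicting version and
# merges them itself, here by taking the union of two item lists.
def merge_siblings(versions)
  versions.flat_map(&:value).uniq.sort
end

siblings = [
  Version.new(%w[brains shovel], 100),
  Version.new(%w[brains flashlight], 101)
]

lww    = last_write_wins(siblings).value # the shovel is lost
merged = merge_siblings(siblings)        # all three items survive
```

With last write wins the earlier write disappears; with siblings the application decides what a correct merge looks like for its own data.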

Slide 31

Slide 31 text

Riak is Fault-Tolerant

Riak objects are distributed on a “ring,” shown here as a “pizza,” and split into 2^n slices. Strictly speaking, five isn’t a power of two, but it makes the diagrams easier.

Slide 32

Slide 32 text

Riak is Fault-Tolerant r1n1 r1n2 r1n3 r1n4 r1n5

We could store each partition on only a single node, but that’s not fault-tolerant: if you lose a node, you lose 1/N of your data.

Slide 33

Slide 33 text

Riak is Fault-Tolerant r1n1 r1n2 r1n3 r1n4 r1n5

Instead, we put each partition on multiple nodes, three by default.

Slide 34

Slide 34 text

Riak is Fault-Tolerant r1n1 r1n2 r1n3 r1n4 r1n5

So when r1n3 blows up, spring, turquoise, and blueberry are still available on two other nodes.

Slide 36

Slide 36 text

Riak is Fault-Tolerant r1n1 r1n2 r1n3 r1n4 r1n5

And when a new node replaces it, it can bring itself back online based on other replicas.
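
The replication idea from the last few slides can be modeled in a few lines of Ruby. This is a simplified sketch, not the riak_core implementation: the eight-partition ring, the round-robin assignment of partitions to nodes, and every method name here are assumptions made for illustration.

```ruby
require 'digest/sha1'

# Simplified, hypothetical model of a ring; the real logic lives in riak_core.
PARTITIONS = 8                        # a power of two, unlike the 5-slice diagrams
NODES = %w[r1n1 r1n2 r1n3 r1n4 r1n5]
N_VAL = 3                             # replicas per object, three by default

# Hash a bucket/key pair onto one of the equal-sized slices of the
# 160-bit SHA-1 space.
def partition_for(bucket, key)
  hash = Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16)
  (hash * PARTITIONS) >> 160
end

# Assumed round-robin ownership: partition i belongs to node i mod 5.
def owner_of(partition)
  NODES[partition % NODES.size]
end

# Owners of the key's partition and the next N_VAL - 1 partitions
# walking around the ring.
def preference_list(bucket, key)
  first = partition_for(bucket, key)
  (0...N_VAL).map { |i| owner_of((first + i) % PARTITIONS) }
end

# Replicas that are still reachable when some nodes are down.
def surviving_replicas(bucket, key, down)
  preference_list(bucket, key) - down
end

replicas = preference_list('orug', 'brains')
still_up = surviving_replicas('orug', 'brains', ['r1n3'])
```

Under these assumptions every key lands on three distinct nodes, so losing r1n3 alone always leaves at least two replicas to serve reads and to rebuild a replacement node from.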

Slide 38

Slide 38 text

Riak is Open Source http://git.io/basho http://git.io/riak http://git.io/riak_core http://git.io/riak_kv (mostly in Erlang)

Slide 39

Slide 39 text

Backends

Slide 40

Slide 40 text

Backends Bitcask

Slide 41

Slide 41 text

Backends Bitcask LevelDB

Slide 42

Slide 42 text

Backends Bitcask LevelDB Memory

Slide 43

Slide 43 text

Backends Bitcask LevelDB Memory Multi

Slide 44

Slide 44 text

Querying Key-Value Search Secondary Indexes Map-Reduce

Slide 45

Slide 45 text

Key-Value Buckets containing keys Keys can be given names or auto-named /riak/orug/YwEFtEC8NUUiEUjuY4mW8WfqtNd

Slide 46

Slide 46 text

Key-Value

    > curl http://127.0.0.1:8091/riak/orug/YwEFtEC8NUUiEUjuY4mW8WfqtNd
    Hello ORUG!

Slide 47

Slide 47 text

Key-Value Voxer Posterous GitHub’s git.io

Slide 48

Slide 48 text

Querying Key-Value Search Secondary Indexes Map-Reduce

Slide 49

Slide 49 text

Search Full text Structure Aware Post-commit hook Painful Re-indexing

Structure aware: it knows how to index fields in JSON, XML, and Erlang terms, besides plain text. Because it only indexes on save, re-indexing means it has to scan every value.

Slide 50

Slide 50 text

Querying Key-Value Search Secondary Indexes Map-Reduce

Slide 51

Slide 51 text

Secondary Indexes Requires LevelDB backend Indexes specified as metadata My Favorite

You can index at-rest encrypted data this way (and for my HIPAA-sensitive project, we do). One re-indexing caveat: you have to re-specify the indexes every time you update a record.
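
As a rough illustration of “indexes specified as metadata,” the sketch below builds the kind of HTTP headers that, as far as I understand Riak’s HTTP interface, carry secondary-index entries alongside an object (`x-riak-index-<name>_bin` for strings, `x-riak-index-<name>_int` for integers). The `index_headers` helper and the sample values are hypothetical; riak-client and Ripple normally do this for you.

```ruby
# Hypothetical helper: turn a hash of index names and values into the
# metadata headers sent with an object. Integers get the _int suffix,
# everything else is treated as a binary (string) index.
def index_headers(indexes)
  indexes.each_with_object({}) do |(name, value), headers|
    suffix = value.is_a?(Integer) ? 'int' : 'bin'
    headers["x-riak-index-#{name}_#{suffix}"] = value.to_s
  end
end

# The index values live beside the stored value, not inside it, which is
# why an encrypted value can still be indexed, and why the headers have
# to be sent again on every update.
headers = index_headers('zombie_key' => 'abc123', 'hideout_id' => 5)
```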

Slide 52

Slide 52 text

Querying Key-Value Search Secondary Indexes Map-Reduce

Slide 53

Slide 53 text

Map-Reduce Not Really Querying Aggregate and Filter Results
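
To make “aggregate and filter” concrete, here is a plain-Ruby sketch of the two phases over an in-memory array. It does not talk to Riak; the sample records, the `map_phase` and `reduce_phase` names, and the size threshold are all assumptions chosen for illustration.

```ruby
# Hypothetical brain records standing in for stored objects.
brains = [
  { zombie_key: 'z1', size: 3 },
  { zombie_key: 'z2', size: 5 },
  { zombie_key: 'z1', size: 4 }
]

# Map phase: runs once per object and emits zero or more values, which
# is where the filtering happens.
def map_phase(brain, min_size)
  brain[:size] >= min_size ? [{ brain[:zombie_key] => brain[:size] }] : []
end

# Reduce phase: folds the emitted values into one aggregate, here the
# total size eaten per zombie.
def reduce_phase(emitted)
  emitted.reduce({}) { |acc, h| acc.merge(h) { |_key, a, b| a + b } }
end

totals = reduce_phase(brains.flat_map { |b| map_phase(b, 4) })
```

The inputs still have to be chosen somehow (a key list, a bucket, an index lookup), which is why the slide calls this “not really querying.”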

Slide 54

Slide 54 text

Riak and Ruby Riak-Client http://rubygems.org/gems/riak-client Objects and methods for talking to Riak

Slide 55

Slide 55 text

Riak and Ruby Riak-Client http://rubygems.org/gems/riak-client Objects and methods for talking to Riak Building block for…

Slide 56

Slide 56 text

Riak and Ruby Ripple http://rubygems.org/gems/ripple ActiveModel for Riak Validations, Associations, and more!

Slide 57

Slide 57 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

Let’s start with a zombie-themed example. There’s a bunch of brains wandering around, and they get eaten by zombies. We want to easily query a few things about them.

Slide 58

Slide 58 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size, next to the indexes owner_id, zombie_id, zombie_id-hideout_id, zombie_id-eaten_at, hideout-eaten_at]

We want to find a given brain by its owner (i.e. whose brain got eaten) or by the zombie that ate it, find which brains a zombie ate where, find which brains a zombie ate on a given day, and find which days a given hideout gets raided.

Slide 59

Slide 59 text

Query Design
[Diagram: the Brains record (owner_id, zombie_id, hideout_id, eaten_at, size) and the indexes owner_id, zombie_id, zombie_id-hideout_id, zombie_id-eaten_at, hideout-eaten_at, annotated with the labels “range query,” “multiple records,” and “one record”]

Slide 60

Slide 60 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

    class Brain
      include Ripple::Document
      property :owner_key, String, presence: true, index: true
      property :zombie_key, String, index: true
      property :hideout_id, Integer
      property :eaten_at, Time
      one :owner, using: :stored_key
      one :zombie, using: :stored_key
      index :zombie_hideout, String do
        "#{zombie_key}-#{hideout_id}"
      end
      index :zombie_eaten, String do
        "#{zombie_key}-#{eaten_at}"
      end
      index :hideout_eaten, String do

For the sake of discussion, Hideouts are stored with ActiveRecord.

Slide 61

Slide 61 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

    class Brain
      include Ripple::Document
      property :owner_key, String, presence: true, index: true
      property :zombie_key, String, index: true
      property :hideout_id, Integer
      property :eaten_at, Time
      one :owner, using: :stored_key
      one :zombie, using: :stored_key
      index :zombie_hideout, String do
        "#{zombie_key}-#{hideout_id}"
      end
      index :zombie_eaten, String do
        "#{zombie_key}-#{eaten_at}"
      end
      index :hideout_eaten, String do
        "#{hideout_id}-#{eaten_at}"
      end
      def hideout
        Hideout.find hideout_id
      end
      def hideout=(hideout)

Cardinality is important in multi-field indexes. If we wanted to group by zombie_key, these would be fine.

Slide 62

Slide 62 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

      index :zombie_hideout, String do
        "#{zombie_key}-#{hideout_id}"
      end
      index :zombie_eaten, String do
        "#{zombie_key}-#{eaten_at}"
      end
      index :hideout_eaten, String do
        "#{hideout_id}-#{eaten_at}"
      end
      def hideout
        Hideout.find hideout_id
      end
      def hideout=(hideout)
        self.hideout_id = hideout.id
      end
    end

For the sake of discussion, Hideouts are stored with ActiveRecord.

Slide 63

Slide 63 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

For the sake of discussion, Hideouts are stored with ActiveRecord.

Slide 64

Slide 64 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

We’re going to load all our brains into “bs.”

Slide 65

Slide 65 text

    bs = Brain.find_by_index '$bucket', '_'
    #=> []
    bs.first.indexes_for_persistence
    #=> {"owner_key_bin"=>#, "zombie_key_bin"=>#, "zombie_hideout_bin"=>#, "zombie_eaten_bin"=>#

Slide 68

Slide 68 text

Query Design
[Diagram: the Brains record, with field labels truncated to ow, zom, hid, eat, siz]

These are the five indexes.

Slide 69

Slide 69 text

Query Design
[Diagram: the Brains record, with field labels truncated to ow, zom, hid, eat, siz]

    #=> []
    bs.first.indexes_for_persistence
    #=> {"owner_key_bin"=>#, "zombie_key_bin"=>#, "zombie_hideout_bin"=>#, "zombie_eaten_bin"=>#, "hideout_eaten_bin"=>#}
    Brain.find_by_index 'hideout_eaten',

These are the five indexes.

Slide 70

Slide 70 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

We use a range query to find all the brains that got eaten at hideout 5 between timestamps 1339645100 and 1339645200, and we see that the one eaten at 170 counts.

Slide 71

Slide 71 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

    {"IzlkD8XbH3efwkXj5n8tAQzzCZp-5"}>, "zombie_eaten_bin"=>#, "hideout_eaten_bin"=>#}
    Brain.find_by_index 'hideout_eaten', ("5-00000000001339645100".."5-00000000001339645200")
    #=> []
    Brain.find_by_index 'hideout_eaten', ("5-00000000001339645100".."5-00000000001339645150")
    #=> []

We use a range query to find all the brains that got eaten at hideout 5 between timestamps 1339645100 and 1339645200, and we see that the one eaten at 170 counts.

Slide 72

Slide 72 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

And there wasn’t anything between 100 and 150; 170 is outside of that range.

Slide 73

Slide 73 text

Query Design
[Diagram: a Brains record with fields owner_id, zombie_id, hideout_id, eaten_at, size]

    ("5-00000000001339645100".."5-00000000001339645200")
    #=> []
    Brain.find_by_index 'hideout_eaten', ("5-00000000001339645100".."5-00000000001339645150")
    #=> []

And there wasn’t anything between 100 and 150; 170 is outside of that range.
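
The range queries work because binary index values compare as strings, so the timestamp half of the composite value has to be zero-padded to a fixed width. The sketch below imitates that with plain Ruby strings and an in-memory hash. The 20-digit padding matches the values shown in the queries; the `hideout_eaten_key` and `range_query` helpers and the `brain-a` key are hypothetical.

```ruby
# Build a composite index value shaped like the ones in the queries:
# hideout id, a dash, then the epoch timestamp padded to 20 digits.
def hideout_eaten_key(hideout_id, eaten_at)
  format('%d-%020d', hideout_id, eaten_at.to_i)
end

# In-memory stand-in for a binary-index range query: plain string
# comparison against both ends of the range.
def range_query(index, range)
  index.select { |value, _key| range.cover?(value) }.values
end

# One brain eaten at hideout 5 at timestamp ...170.
index = { hideout_eaten_key(5, Time.at(1339645170)) => 'brain-a' }

wide   = range_query(index, '5-00000000001339645100'..'5-00000000001339645200')
narrow = range_query(index, '5-00000000001339645100'..'5-00000000001339645150')
```

Without the padding, `"5-9"` would sort after `"5-10"`, and numeric ranges would silently return the wrong records.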

Slide 75

Slide 75 text

Hooking Up Rails Add ripple to Gemfile Add config/ripple.yml Put Ripple-backed models in app/models

Slide 76

Slide 76 text

Hooking Up Rails Add ripple to Gemfile

    source 'https://rubygems.org'
    gem 'rails', '3.2.5'
    gem 'sqlite3'
    gem 'ripple', '~> 1.0.0.beta2'
    gem 'haml'

Slide 77

Slide 77 text

Hooking Up Rails Add config/ripple.yml

    development:
      host: localhost
      http_port: 8091
      pb_port: 8081

Slide 78

Slide 78 text

Hooking Up Rails Put Ripple-backed models in app/models

    class Zombie
      include Ripple::Document
    end

Slide 79

Slide 79 text

Hooking Up Rails Enjoy!

Slide 80

Slide 80 text

Hooking Up Rails

Slide 82

Slide 82 text

Hooking Up Rails But these IDs are awful!

Slide 83

Slide 83 text

Eventually Consistent Counters WARNING ONGOING RESEARCH DON’T USE

Slide 84

Slide 84 text

Eventually Consistent Counters Convergent Replicated Data Types

Slide 85

Slide 85 text

Eventually Consistent Counters http://git.io/dMxQFw aphyr/meangirls

Slide 86

Slide 86 text

Eventually Consistent Counters http://bit.ly/prime-cheat “You can’t use prime numbers to cheat on CRDTs”
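
For readers who have not met convergent replicated data types, the simplest one is a grow-only counter. The sketch below is a minimal plain-Ruby illustration of that idea, in keeping with the warning above. It is not taken from meangirls or any Riak library, and the `GCounter` class and actor names are assumptions made for the example.

```ruby
# Grow-only counter (G-counter): one slot per actor, merged by taking
# the per-actor maximum. Illustrative only, not production code.
class GCounter
  attr_reader :counts

  def initialize(counts = {})
    @counts = counts.dup
  end

  # Each actor (for example, one app server) only ever bumps its own slot.
  def increment(actor, by = 1)
    @counts[actor] = @counts.fetch(actor, 0) + by
    self
  end

  # The counter's value is the sum over all actors.
  def value
    @counts.values.sum
  end

  # Per-actor max is commutative, associative, and idempotent, so
  # siblings can be merged in any order, any number of times.
  def merge(other)
    merged = @counts.merge(other.counts) { |_actor, a, b| [a, b].max }
    GCounter.new(merged)
  end
end

a = GCounter.new.increment('web1').increment('web1')
b = GCounter.new.increment('web2')
total = a.merge(b).value
```

Two siblings written by different servers converge on the same total no matter which one is merged into the other.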

Slide 87

Slide 87 text

Photos
http://flic.kr/p/5NfKi2 - Satellite Beach
http://flic.kr/p/9kTSCe - USF Bull
http://flic.kr/p/53K5JV - Coconut Grove
http://flic.kr/p/4r3Vjk - Steven Bristol