Slide 1

Slide 1 text

Scaling Data Safely with Riak
@alexmoore


Slide 2

Slide 2 text

Internet 1 (2) (3)

Slide 3

Slide 3 text

Internet 503 Timeout 1 (2) (3)

Slide 4

Slide 4 text

Internet 1 (2) 3

Slide 5

Slide 5 text

Internet 1 (2) 3

Slide 6

Slide 6 text

Internet (2) 3 2 3 1

Slide 7

Slide 7 text

Internet 4 2 3 (2) 1 3 3

Slide 8

Slide 8 text

Internet CDN 2 6 7 2 4 3 1 3 3

Slide 9

Slide 9 text

Internet CDN 5 6 2 4 3 2 1 3 3 7 3

Slide 10

Slide 10 text

Internet CDN 6 7 2 3 2 1 3 3

Slide 11

Slide 11 text

Internet CDN P S S 3 6 7 2 5 3 2 1 4 4 3 3 7

Slide 12

Slide 12 text

Internet CDN 6 8 2 3 2 1 3 3

Slide 13

Slide 13 text

Internet CDN 6 8 2 3 2 1 3 3 3? 4? q User Session Widgets Lolcats Stuff 8? magic

Slide 14

Slide 14 text

Internet CDN 6 8 2 3 2 1 3 3 3? 4? q User Session Widgets Lolcats Stuff 8? magic

Slide 15

Slide 15 text

Internet CDN 6 8 2 3 2 1 3 3 3? 4? q User Session Widgets Lolcats Stuff 8? magic

Slide 16

Slide 16 text

Internet CDN 6 8 2 3 2 1 3 3 3? 4? q User Session Widgets Lolcats Stuff 8? magic Who’s on first?

Slide 17

Slide 17 text

Internet CDN 6 8 2 3 2 1 3 3 3? 4? q User Session Widgets Lolcats Stuff 8? magic What the hell have you built? Who’s on first?

Slide 18

Slide 18 text

Enter Riak

Slide 19

Slide 19 text

Riak is a distributed, fault-tolerant, highly available database

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Distributed

Slide 22

Slide 22 text

Distributed: a cluster of nodes; each node has an identical role

Slide 23

Slide 23 text

Fault-tolerant

Slide 24

Slide 24 text

Fault-Tolerant: symmetrical; no Single Point of Failure

Slide 25

Slide 25 text

Highly Available

Slide 26

Slide 26 text

Highly Available: replicas & redundancy

Slide 27

Slide 27 text

Scalable

Slide 28

Slide 28 text

Scalable: add or remove nodes; linear scalability

Slide 29

Slide 29 text

Scalable: add or remove nodes; linear scalability

Slide 30

Slide 30 text

Multiple Platforms

Slide 31

Slide 31 text

Multiple Platforms

Slide 32

Slide 32 text

Multiple Platforms

Slide 33

Slide 33 text

Predictable Performance

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

Internet CDN 5 6 2 3 2 1 3 3

Slide 36

Slide 36 text

Internet CDN 5 6 2 3 2 1 3 3 3 3 4 4

Slide 37

Slide 37 text

Building A Cluster

# All nodes
$ riak start

# Nodes 2..n, each
$ riak-admin cluster join [email protected]

# Any node
$ riak-admin cluster plan
$ riak-admin cluster commit

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

Architecture

Slide 41

Slide 41 text

Riak is the ops-friendly database

Slide 42

Slide 42 text

Riak is distributed

Slide 43

Slide 43 text

Consistent Hashing hash(“bucket/key”)

Slide 44

Slide 44 text

hash ring

Slide 45

Slide 45 text

tokenize it

Slide 46

Slide 46 text

node 0 node 1 node 2

Slide 47

Slide 47 text

node 0 node 1 node 2 hash(key)

Slide 48

Slide 48 text

node 0 node 1 node 2
Replicas are stored on the next N - 1 contiguous partitions

Slide 49

Slide 49 text

node 0 node 1 node 2 hash(“companies/GE”)
Replicas are stored on the next N - 1 contiguous partitions

Slide 50

Slide 50 text

node 0 node 1 node 2 hash(“companies/GE”)
Replicas are stored on the next N - 1 contiguous partitions
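
The ring slides above can be sketched in a few lines of Ruby. This is a simplified model, not Riak's implementation: it assumes a SHA-1 (160-bit) ring cut into 64 equal partitions, three nodes that own partitions round-robin, and N = 3. The constant and method names (`NUM_PARTITIONS`, `partition_for`, `owner_of`, `preference_list`) are made up for illustration.

```ruby
require 'digest/sha1'

# Illustrative values only; none of these numbers come from the slides.
RING_SIZE       = 2**160 # size of the SHA-1 output space
NUM_PARTITIONS  = 64     # equal slices of the ring
NODES           = %w[node0 node1 node2].freeze
N_VAL           = 3      # copies kept of every object
PARTITION_WIDTH = RING_SIZE / NUM_PARTITIONS

# Hash "bucket/key" onto the ring and return the index of the partition it lands in.
def partition_for(bucket, key)
  position = Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16)
  position / PARTITION_WIDTH
end

# Partitions are handed out round-robin, so consecutive partitions usually
# sit on different nodes (the wrap-around at the end of the ring can be an exception).
def owner_of(partition)
  NODES[partition % NODES.size]
end

# The first partition plus the next N - 1 contiguous partitions, wrapping around the ring.
def preference_list(bucket, key, n = N_VAL)
  first = partition_for(bucket, key)
  (0...n).map { |offset| (first + offset) % NUM_PARTITIONS }
end

preference_list('companies', 'GE').each do |partition|
  puts "partition #{partition} -> #{owner_of(partition)}"
end
```

Because the partition is derived only from the hash of "bucket/key", any node can work out where an object lives without asking a central directory.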

Slide 51

Slide 51 text

node 0 node 1 node 2

Slide 52

Slide 52 text

node 0 node 1 node 2 node 3 + Scaling out

Slide 53

Slide 53 text

Quorum requests: N, R, W, PR/PW, DW
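
As a rough illustration of how the quorum values relate: N is the number of replicas, R and W are how many replicas must answer a read or a write. The sketch below only does the arithmetic; the method names are hypothetical and it says nothing about PR/PW or DW.

```ruby
# True when every read quorum is guaranteed to share at least one replica
# with every write quorum (the classic R + W > N rule).
def quorums_overlap?(n:, r:, w:)
  unless (1..n).cover?(r) && (1..n).cover?(w)
    raise ArgumentError, 'r and w must be between 1 and n'
  end
  r + w > n
end

# How many of the N replicas may be unreachable while a request
# can still collect `quorum` answers.
def tolerated_failures(n:, quorum:)
  n - quorum
end

puts quorums_overlap?(n: 3, r: 2, w: 2)    # => true
puts quorums_overlap?(n: 3, r: 1, w: 1)    # => false
puts tolerated_failures(n: 3, quorum: 2)   # => 1
```

Lower R and W favor availability and latency; higher values favor reading your latest write.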

Slide 54

Slide 54 text

get(“users/clay-davis”)

Slide 55

Slide 55 text

get(“users/clay-davis”) client Riak

Slide 56

Slide 56 text

get(“users/clay-davis”) Get Handler (FSM) client Riak

Slide 57

Slide 57 text

get(“users/clay-davis”) Get Handler (FSM) client Riak hash(“users/clay-davis”) == 10, 11, 12

Slide 58

Slide 58 text

get(“users/clay-davis”) Get Handler (FSM) client Riak hash(“users/clay-davis”) == 10, 11, 12 Coordinating node Cluster 6 7 8 9 10 11 12 13 14 15 16 The Ring

Slide 59

Slide 59 text

get(“users/clay-davis”) Get Handler (FSM) client Riak get(“users/clay-davis”) Coordinating node Cluster 6 7 8 9 10 11 12 13 14 15 16 The Ring

Slide 60

Slide 60 text

get(“users/clay-davis”) Get Handler (FSM) client Riak Coordinating node Cluster 6 7 8 9 10 11 12 13 14 15 16 The Ring R=2

Slide 61

Slide 61 text

get(“users/clay-davis”) Get Handler (FSM) client Riak Coordinating node Cluster 6 7 8 9 10 11 12 13 14 15 16 The Ring R=2 obj

Slide 62

Slide 62 text

get(“users/clay-davis”) Get Handler (FSM) client Riak R=2 obj obj

Slide 63

Slide 63 text

get(“users/clay-davis”) Get Handler (FSM) client Riak R=2 obj obj

Slide 64

Slide 64 text

get(“users/clay-davis”) obj
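
The get walkthrough above can be mimicked with a toy coordinator. This is a sketch under simplifying assumptions: replicas are asked one after another instead of in parallel, and `Replica`, `QuorumNotMet` and `quorum_get` are invented names, not part of Riak or its clients.

```ruby
# A stand-in for one vnode holding a copy of the object.
Replica = Struct.new(:name, :up, :value)

class QuorumNotMet < StandardError; end

# Ask the replicas in the preference list and return as soon as R of them have answered.
def quorum_get(replicas, r:)
  replies = []
  replicas.each do |replica|
    next unless replica.up          # an unreachable vnode simply never answers
    replies << replica.value
    return replies if replies.size >= r
  end
  raise QuorumNotMet, "only #{replies.size} of #{r} replies"
end

ring = [
  Replica.new('vnode 10', true,  'obj'),
  Replica.new('vnode 11', false, nil),   # this one is down
  Replica.new('vnode 12', true,  'obj')
]

p quorum_get(ring, r: 2)   # => ["obj", "obj"]
```

With R=2 the read still succeeds while one of the three vnodes is down, which is the behaviour the slides illustrate.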

Slide 65

Slide 65 text

Read Repair (Anti-Entropy)

Slide 66

Slide 66 text

replica replica replica

Slide 67

Slide 67 text

replica replica replica X

Slide 68

Slide 68 text

replica replica replica replica replica replica
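
Read repair, as pictured above, can be sketched as "find the newest copy, then overwrite the stale or missing ones". The integer `version` below is a deliberate simplification standing in for Riak's causal metadata, and `Copy` and `read_repair` are hypothetical names.

```ruby
# One replica's view of an object; version 0 with a nil value models a missing copy.
Copy = Struct.new(:vnode, :version, :value)

# Returns the winning value plus the names of the vnodes that had to be repaired.
def read_repair(copies)
  newest = copies.max_by(&:version)
  stale  = copies.select { |copy| copy.version < newest.version }
  stale.each do |copy|
    copy.version = newest.version
    copy.value   = newest.value
  end
  [newest.value, stale.map(&:vnode)]
end

copies = [
  Copy.new('vnode 10', 2, 'new'),
  Copy.new('vnode 11', 1, 'old'),
  Copy.new('vnode 12', 0, nil)
]

value, repaired = read_repair(copies)
puts value              # => new
puts repaired.inspect   # => ["vnode 11", "vnode 12"]
```

The repair happens as a side effect of a normal read, so only objects that are actually read get fixed this way.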

Slide 69

Slide 69 text

Active Anti-Entropy (self healing clusters)

Slide 70

Slide 70 text

real-time updates, persistent, non-blocking, disk-based

Slide 71

Slide 71 text

merkle tree to track changes
coordinated at the vnode level
runs as a background process
exchange with neighbor vnodes for inconsistencies
resolution semantics: trigger read-repair

Slide 72

Slide 72 text

= hashes marked “dirty”

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

No content

Slide 75

Slide 75 text

No content

Slide 76

Slide 76 text

No content

Slide 77

Slide 77 text

= keys to read-repair
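
The hash-tree exchange can be approximated with a two-level tree: keys are grouped into segments, each segment gets a hash, and the segment hashes are hashed into a root. Two replicas compare roots first and only descend where hashes differ. This is a toy model with assumed names (`SEGMENTS`, `build_tree`, `keys_to_repair`); Riak's on-disk trees are considerably more elaborate.

```ruby
require 'digest/sha1'

SEGMENTS = 4 # illustrative; a real tree has far more leaves

def segment_of(key)
  Digest::SHA1.hexdigest(key).to_i(16) % SEGMENTS
end

# Build { root:, leaves:, segments: } for one replica's key => value store.
def build_tree(store)
  segments = Array.new(SEGMENTS) { {} }
  store.each { |key, value| segments[segment_of(key)][key] = Digest::SHA1.hexdigest(value) }
  leaves = segments.map do |segment|
    Digest::SHA1.hexdigest(segment.sort.map { |key, hash| "#{key}=#{hash}" }.join(','))
  end
  { root: Digest::SHA1.hexdigest(leaves.join), leaves: leaves, segments: segments }
end

# Compare two trees top-down and return the keys whose hashes disagree.
def keys_to_repair(tree_a, tree_b)
  return [] if tree_a[:root] == tree_b[:root]   # identical replicas: nothing to do
  differing = []
  SEGMENTS.times do |i|
    next if tree_a[:leaves][i] == tree_b[:leaves][i]
    a = tree_a[:segments][i]
    b = tree_b[:segments][i]
    differing.concat((a.keys | b.keys).reject { |key| a[key] == b[key] })
  end
  differing.sort
end

replica_a = { 'users/jimmy' => 'v2', 'users/clay-davis' => 'v1' }
replica_b = { 'users/jimmy' => 'v1', 'users/clay-davis' => 'v1' }
p keys_to_repair(build_tree(replica_a), build_tree(replica_b))   # => ["users/jimmy"]
```

Only the keys returned at the end need a read-repair, so two mostly identical replicas exchange very little data.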

Slide 78

Slide 78 text

Using Riak

Slide 79

Slide 79 text

Riak is the ops-friendly database

Slide 80

Slide 80 text

Riak is friendly to Developers, too!

Slide 81

Slide 81 text

{“key”: “value”}

Slide 82

Slide 82 text

[Diagram: key/value pairs grouped into buckets]
Keys are namespaced into Buckets

Slide 83

Slide 83 text

Buckets aren’t “real”

Slide 84

Slide 84 text

✘ ✘ ✘ Buckets aren’t “real”
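
One way to picture "buckets aren't real": the bucket is just a prefix on the key, not a separate table. The `FlatStore` class below is a hypothetical in-memory sketch of that idea and not how Riak stores data on disk.

```ruby
# Every object lives in one flat keyspace under "bucket/key".
class FlatStore
  def initialize
    @objects = {}
  end

  def put(bucket, key, value)
    @objects["#{bucket}/#{key}"] = value
  end

  def get(bucket, key)
    @objects["#{bucket}/#{key}"]
  end

  # There is no per-bucket container, so listing a bucket means scanning every key.
  def keys_in(bucket)
    prefix = "#{bucket}/"
    @objects.keys.select { |k| k.start_with?(prefix) }.map { |k| k.delete_prefix(prefix) }
  end
end

store = FlatStore.new
store.put('books', '1111979723', 'Moby Dick')
store.put('users', 'clay-davis', 'sheeeeit')
p store.keys_in('books')   # => ["1111979723"]
```

Lookups by bucket and key stay cheap, while anything that walks "the whole bucket" has to touch the whole keyspace.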

Slide 85

Slide 85 text

Metadata Bucket Properties

Slide 86

Slide 86 text

Aitch Tee Tee Pee

GET    /buckets/bucket/keys/key
PUT    /buckets/bucket/keys/key
DELETE /buckets/bucket/keys/key
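
For readers who want to try the HTTP interface from Ruby without a client library, the three requests can be built with the standard library's Net::HTTP. The helper names (`riak_path`, `riak_requests`) are hypothetical, the example does not escape special characters in keys, and the commented-out send assumes a local node on port 8098.

```ruby
require 'net/http'
require 'json'

# Path layout as shown on the slide.
def riak_path(bucket, key)
  "/buckets/#{bucket}/keys/#{key}"
end

# Build (but do not send) the fetch, store and delete requests for one object.
def riak_requests(bucket, key, document)
  path = riak_path(bucket, key)
  put  = Net::HTTP::Put.new(path, 'Content-Type' => 'application/json')
  put.body = JSON.generate(document)
  { get: Net::HTTP::Get.new(path), put: put, delete: Net::HTTP::Delete.new(path) }
end

requests = riak_requests('books', '1111979723', title: 'Moby Dick')
requests.each_value { |request| puts "#{request.method} #{request.path}" }

# Sending one of them (assumes a reachable node; adjust host and port to your setup):
#   Net::HTTP.start('127.0.0.1', 8098) { |http| http.request(requests[:get]) }
```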

Slide 87

Slide 87 text

Working With Ruby - Starting

$ gem install riak-client

book = {
  isbn: '1111979723',
  title: 'Moby Dick',
  author: 'Herman Melville',
  body: 'Call me Ishmael. Some years ago...',
  copies_owned: 3
}

Slide 88

Slide 88 text

Working With Ruby - Create

require 'riak'

# Starting Client
client = Riak::Client.new(protocol: 'pbc',
                          pb_port:  8087,
                          host:     '10.0.0.1')

books_bucket = client.bucket('books')

# Key the new object by ISBN, attach the hash as its data, and write it
robj_moby = books_bucket.new(book[:isbn])
robj_moby.data = book
robj_moby.store

Slide 89

Slide 89 text

Working With Ruby - Read

fetched_book = books_bucket.get('1111979723')

puts fetched_book.raw_data
# => {"isbn":"1111979723","title":"Moby Dick",
#     "author":"Herman Melville",
#     "body":"Call me Ishmael. Some years ago...","copies_owned":3}

fetched_book.data['isbn']
# => "1111979723"

Slide 90

Slide 90 text

Working With Ruby - Delete

fetched_book.delete

# or

books_bucket.delete('1111979723')

Slide 91

Slide 91 text

Working With Ruby - 2I

fetched_book.indexes['author_bin'] = []
fetched_book.indexes['author_bin'] << fetched_book.data['author']

fetched_book.store

books_bucket.get_index('author_bin', 'Herman Melville')
# => ["1111979723"]

Slide 92

Slide 92 text

etc...

Slide 93

Slide 93 text

JUST ONE MORE THING

Slide 94

Slide 94 text

Internet CDN 5 6 2 3 2 1 3 3 3 3 4 4

Slide 95

Slide 95 text

Internet CDN 5 6 2 3 2 1 3 3 3 4

Slide 96

Slide 96 text

Internet CDN 5 6 2 3 2 1 3 3 3 4

Slide 97

Slide 97 text

in riak.conf:

Slide 98

Slide 98 text

in riak.conf: search = off

Slide 99

Slide 99 text

in riak.conf: search = off -> search = on

Slide 100

Slide 100 text

client.create_search_index("tweets_idx")

bucket = Riak::Bucket.new(client, "tweets")
bucket.props = {"search_index" => "tweets_idx"}

# load some tweets into riak…

# queries are addressed to the index, not the bucket
results = client.search("tweets_idx", "text:who")

puts results['docs'].first

Slide 101

Slide 101 text

pp results['docs'].first

{"score"=>"2.50804900000000019489e+00",
 "_yz_rb"=>"tweets",
 "_yz_rk"=>"421672897250226176",
 ...}

Slide 102

Slide 102 text

More resources: docs.basho.com “Taste of Riak”

Slide 103

Slide 103 text

Multi-Datacenter Replication (e.g., nuke-proofing)

Slide 104

Slide 104 text

Full Sync

Slide 105

Slide 105 text

Real-Time

Slide 106

Slide 106 text

No content

Slide 107

Slide 107 text

put(“users/jimmy”)

Slide 108

Slide 108 text

put(“users/jimmy”)

Slide 109

Slide 109 text

put(“users/jimmy”)

Slide 110

Slide 110 text

put(“users/jimmy”)

Slide 111

Slide 111 text

put(“users/jimmy”)

Slide 112

Slide 112 text

Internet CDN 5 6 2 3 2 1 3 3 3 4

Slide 113

Slide 113 text

Internet CDN 5 6 2 3 2 1 3 3 3 4 ?

Slide 114

Slide 114 text

Internet CDN 5 6 2 3 2 1 3 3 3 4 ?

Slide 115

Slide 115 text

Q & A
Alex Moore - Client Services Engineer, Basho
@alexmoore
[email protected]
http://alexmoore.io