Slide 1

Slide 1 text

Riak Use Cases
Dissecting the Solutions to Hard Problems
NoSQL Roadshow - Amsterdam, 29/11/2012
Friday, 30 November 12

Slide 4

Slide 4 text

$ whoami
Name: Chris Molozian
Title: Client Services Engineer
Company: Basho Technologies
Email: [email protected]

Slide 8

Slide 8 text

$ whoami
Name: Matthew Revell
Title: Community Manager
Company: Basho Technologies
Twitter: @matthewrevell
$ ./presentation

Slide 13

Slide 13 text

NoSQL
"departs from the relational model altogether; it should therefore have been called more appropriately 'NoREL'" ~ Carlo Strozzi
• Divided into a (growing) list of categories (the more exotic ones include Multivalue and Tuple stores)
• All are “optimized” for record storage
• Arguably the largest categories are: Graph, Key-Value, Document

Slide 19

Slide 19 text

Graph
• Data is represented using:
  • Nodes - an entity of some kind (e.g. User)
  • Edges - the relationship between nodes
• Use when the important data is in the edges
[Diagram: three nodes connected by three edges]

Slide 22

Slide 22 text

Document
• Data is represented using:
  • Documents - a flexible collection of k/v pairs (fields)
• Use when most queries are not by primary key
Example document:
  UserID (key): test_user
  Name: Chris
  Job: Engineer

Slide 25

Slide 25 text

Key-Value
• Data is represented using:
  • Record - a key/value pair
• Use when most queries are by primary key, or when you can denormalize the data into key/value pairs
[Diagram: a namespace containing several key/value pairs]
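
The namespace/key/value layout on this slide can be sketched as a toy in-memory store. This is only an illustration of the data model, not how Riak stores data; the class name, the "users" namespace and the JSON value are assumptions made for the example.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory store: namespace -> key -> opaque value."""

    def __init__(self) -> None:
        self._namespaces: Dict[str, Dict[str, bytes]] = {}

    def put(self, namespace: str, key: str, value: bytes) -> None:
        # Create the namespace on first use, then overwrite the key.
        self._namespaces.setdefault(namespace, {})[key] = value

    def get(self, namespace: str, key: str) -> Optional[bytes]:
        # Every lookup is by primary key; there is no secondary query path.
        return self._namespaces.get(namespace, {}).get(key)


store = KeyValueStore()
# Denormalized: the whole user record lives under a single key.
store.put("users", "test_user", b'{"name": "Chris", "job": "Engineer"}')
```

Because the value is opaque to the store, anything you need to look up later has to be reachable through the key you choose.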

Slide 34

Slide 34 text

What is Riak?
• Distributed key/value store + extras
• Advanced query features
• Pre/Post commit hooks
• Multiple storage engines
• Scales Linearly + Fault Tolerant
• Open Source (Apache 2.0)
• Written in Erlang/OTP

Slide 41

Slide 41 text

Tunable Consistency
• We haven’t solved CAP; no one has
• With Riak, you tune the CAP values:
  • N: number of instances of your data
  • R: number of nodes Riak reads from
  • W: number of nodes Riak writes to, before optional further replication
• Per cluster, per bucket or per operation
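
One way to reason about N, R and W is the usual Dynamo-style overlap rule: if R + W > N, every read set shares at least one replica with every write set. The slide does not state this rule, so treat the sketch below as an assumption-laden aid for thinking about the trade-off rather than a guarantee Riak makes; the function names are hypothetical.

```python
def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def read_and_write_sets_overlap(n: int, r: int, w: int) -> bool:
    """True if any R replicas must share a member with any W replicas."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


# N=3 with majority reads and writes: the sets always overlap.
strict = read_and_write_sets_overlap(3, majority(3), majority(3))
# N=3 with R=1, W=1: faster, but a read can miss the latest write.
relaxed = read_and_write_sets_overlap(3, 1, 1)
```

Lower R and W favour latency and availability; higher values favour reading what was most recently written.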

Slide 46

Slide 46 text

Conflict Resolution (1)
• Concurrent actors modifying the same data cause data divergence.
• Riak provides two solutions to manage this:
  • Last Write Wins: naive approach, but works for some use cases
  • Vector Clocks: retain “sibling” copies of data for merging
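
To make the vector clock option concrete, here is a minimal textbook comparison of two clocks. It is a sketch of the general technique, not Riak's internal implementation; the dictionary representation and actor names are assumptions. Two versions become siblings when neither clock descends from the other.

```python
from typing import Dict

VClock = Dict[str, int]  # actor id -> number of updates seen from that actor


def descends(a: VClock, b: VClock) -> bool:
    """True if clock a has seen every update that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def relation(a: VClock, b: VClock) -> str:
    """Classify two versions by comparing their clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "concurrent"  # neither saw the other: keep both as siblings


newer = relation({"app1": 2, "app2": 1}, {"app1": 1, "app2": 1})
siblings = relation({"app1": 2}, {"app2": 1})
```

Last Write Wins skips this comparison and keeps whichever write carries the latest timestamp, which is why it can silently drop a concurrent update.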

Slide 49

Slide 49 text

Conflict Resolution (2)
Application layer timestamps, with siblings
[Diagram: three App VMs, a load balancer (LB) and a five-node Riak cluster]

Slide 52

Slide 52 text

Conflict Resolution (3)
Application layer business logic, with siblings
[Diagram: five App instances, a load balancer (LB) and a five-node Riak cluster]

Slide 57

Slide 57 text

Sibling Handling
"We don't ever do conflict resolution by picking a random sibling."
"For an array property, we often take the union of all values in all siblings. This works great for array properties that we only ever add to."
"We often take the maximum sibling value or the minimum sibling value, depending on the semantics of that attribute."
~ Myron Marston, SEOmoz
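
The merge rules quoted above can be sketched as a small resolver. This is not SEOmoz's code; the field names (tags, first_seen, last_seen) are hypothetical and chosen only to show one add-only array, one minimum and one maximum.

```python
from typing import Any, Dict, List


def merge_siblings(siblings: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Resolve siblings deterministically, one rule per attribute."""
    if not siblings:
        raise ValueError("need at least one sibling")
    return {
        # Add-only array property: union of the values in all siblings.
        "tags": sorted(set().union(*(s.get("tags", []) for s in siblings))),
        # "Earliest" semantics: take the minimum sibling value.
        "first_seen": min(s["first_seen"] for s in siblings),
        # "Latest" semantics: take the maximum sibling value.
        "last_seen": max(s["last_seen"] for s in siblings),
    }


resolved = merge_siblings([
    {"tags": ["a", "b"], "first_seen": 3, "last_seen": 10},
    {"tags": ["b", "c"], "first_seen": 5, "last_seen": 12},
])
```

The union rule only works for properties that are never removed from; a deleted element would reappear whenever an older sibling still contains it.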

Slide 58

Slide 58 text

Client Libraries
• HTTP REST or optimised binary interface (Protocol Buffers, PB)
• Official Basho supported:
• Community: C#, C/C++, Haskell, Clojure, Scala, Go, PHP and many others
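
As a rough illustration of the HTTP interface, the helper below only builds an object URL; it makes no network call. The /buckets/<bucket>/keys/<key> path, port 8098 and the r query parameter reflect Riak's HTTP API as I understand it for releases of this era, so verify them against the documentation for your version. The function name, host, bucket and key are assumptions.

```python
from urllib.parse import quote, urlencode


def object_url(bucket: str, key: str, host: str = "127.0.0.1",
               port: int = 8098, **params: int) -> str:
    """Build the URL for one object, with optional per-request tunables."""
    # Percent-encode bucket and key so "/" or spaces cannot break the path.
    path = f"/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    query = f"?{urlencode(params)}" if params else ""
    return f"http://{host}:{port}{path}{query}"


# Hypothetical read of one account, asking for two replicas to respond.
url = object_url("accounts", "user_42", r=2)
```

Any HTTP client can then GET or PUT that URL; the Protocol Buffers interface carries the same operations over a binary protocol.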

Slide 66

Slide 66 text

Riak Use Cases
• Reliability, flexibility, scalability
• Session Data
• Serving Advertising
• Log and Sensor Data
• Content Addressable Storage (CAS)
• Private Cloud [S3 API] - Riak CS
• Wherever low latency increases revenue

Slide 70

Slide 70 text

Rovio is an industry-changing entertainment media company based in Finland, and the creator of the globally successful Angry Birds franchise.
• Store game session data in Riak: a per-user collection of game “states”.
• Synchronization of a user’s data across gaming devices.
• Buckets:
  • Account - keyed by user_id

Slide 75

Slide 75 text

An Enterprise Social Network that brings together employees, content, conversations, and business data in a single location.
• Store “Notifications” in Riak: a per-user sorted set of events with calls to action.
• Data types consist of: Cursor, Item List, Items
  {id: 41626118990497, timestamp: 1300845012, category: "likes-message", properties: {liker_id: 97238, [... etc]}}
• Buckets:
  • Cursor - keyed by user_id + cursor_name
  • Stream - keyed by user_id
• SOA - known as “Streamie”
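
A per-user sorted set of events fits the sibling model well, because two divergent item lists can be merged without losing anything. The sketch below is an assumption about how such a merge could look, not the "Streamie" implementation; the ":" separator, the "unread" cursor name and the newest-first ordering are all hypothetical.

```python
from typing import Any, Dict, Iterable, List

Item = Dict[str, Any]


def cursor_key(user_id: int, cursor_name: str) -> str:
    """Key for the Cursor bucket: user_id + cursor_name."""
    return f"{user_id}:{cursor_name}"


def merge_item_lists(*item_lists: Iterable[Item]) -> List[Item]:
    """Union divergent item lists by id, newest first."""
    by_id: Dict[Any, Item] = {}
    for items in item_lists:
        for item in items:
            by_id[item["id"]] = item  # same id means same event
    return sorted(by_id.values(),
                  key=lambda i: (i["timestamp"], i["id"]), reverse=True)


sibling_a = [{"id": 1, "timestamp": 100}, {"id": 2, "timestamp": 200}]
sibling_b = [{"id": 2, "timestamp": 200}, {"id": 3, "timestamp": 150}]
stream = merge_item_lists(sibling_a, sibling_b)
```

Keying the stream by user_id means rendering one user's notifications is a single primary-key read.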

Slide 79

Slide 79 text

SEOmoz is the world’s most popular provider of SEO software. Their easy-to-use tools and tutorials make search engine optimization accessible to everyone.
• Ranking collections of web documents
• Data types consist of: Subscription(s), Ranking List, Ranking History, Recent Ranking Report... (etc)
• Buckets:
  • Ranking List - keyed by engine+locale+keyword+URL_fragment
  • Subscription - keyed by user_campaign
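
A composite key such as engine+locale+keyword+URL_fragment lets the application compute the key from what it already knows and fetch the record directly. The slide does not say how the parts are joined or escaped, so the "+" separator and the percent-encoding below are assumptions, as are the function name and sample values.

```python
from urllib.parse import quote


def ranking_list_key(engine: str, locale: str, keyword: str,
                     url_fragment: str, sep: str = "+") -> str:
    """Join the four parts into one key, escaping each part first."""
    parts = (engine, locale, keyword, url_fragment)
    # Percent-encoding keeps a literal "+" or "/" inside a part from
    # being confused with the separator.
    return sep.join(quote(part, safe="") for part in parts)


key = ranking_list_key("google", "en-US", "seo tools", "example.com/blog")
```

Escaping matters because two different inputs must never produce the same key.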

Slide 85

Slide 85 text

Other Users
• High Availability Environments (e.g. Health Care)
• Content Addressable Storage as a Service (like a private Dropbox cloud)
• Oil/Gas Rig Environment Logging
• Web Gaming Platforms
• Product Catalog (and other Retail use cases)

Slide 90

Slide 90 text

With any Use Case Consider 3 Things
• Query Patterns
• Inter-connectivity of your data (how much can it be denormalised)
• Polyglot solution (SOA + Database) (no single database fits every problem)
Understand your data access patterns and you’ll be able to choose the right database for you

Slide 91

Slide 91 text

Basho EMEA

Slide 92

Slide 92 text

Questions?
Chris Molozian, [email protected]
Matthew Revell, [email protected]

Slide 93

Slide 93 text

Want to know more?
We will come and give a Riak tech talk at your organisation or group: bit.ly/RiakTechTalk