Slide 1

Slide 1 text

Data Modeling for Scale with Riak Data Types
Sean Cribbs
@seancribbs #riak #datatypes
QCon NYC 2014

Slide 2

Slide 2 text

I work for Basho. We make Riak. Visit our booth!

Slide 3

Slide 3 text

Riak is Eventually Consistent
key-value + indexes + search + MapReduce

Slide 4

Slide 4 text

Eventual Consistency
1. Replicated
2. Loose coordination
3. Convergence

Slide 5

Slide 5 text

Eventual is Good
✔ Fault-tolerant
✔ Highly available
✔ Low-latency

Slide 6

Slide 6 text

Consistency?
[Diagram: replicas 1, 2, 3 holding values A and B]
No clear winner!
Throw one out?
Keep both?

Slide 7

Slide 7 text

Consistency?
[Diagram: replicas 1, 2, 3 holding values A and B]
No clear winner!
Throw one out? (Cassandra)
Keep both?

Slide 8

Slide 8 text

Consistency?
[Diagram: replicas 1, 2, 3 holding values A and B]
No clear winner!
Throw one out? (Cassandra)
Keep both? (Riak)

Slide 9

Slide 9 text

Conflicts! A! B!

Slide 10

Slide 10 text

Semantic Resolution
• Your app knows the domain - use business rules to resolve
• Amazon Dynamo’s shopping cart

Slide 11

Slide 11 text

Semantic Resolution
• Your app knows the domain - use business rules to resolve
• Amazon Dynamo’s shopping cart
“Ad hoc approaches have proven brittle and error-prone”
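The shopping-cart style of semantic resolution named on this slide can be sketched as follows. This is a minimal illustration, assuming each conflicting version (sibling) is a plain Python set of item names; `resolve_cart` is a hypothetical helper, not a Riak API. Merging by union never loses an add, but it also brings back an item that one client removed, which is one example of how ad hoc resolution can be brittle.

```python
def resolve_cart(siblings):
    """Merge conflicting cart versions by taking the union of their items."""
    merged = set()
    for cart in siblings:
        merged |= cart
    return merged


# Two replicas start from the same cart and then diverge.
base = {"book", "lamp"}
replica_a = base - {"lamp"}      # one client removes the lamp
replica_b = base | {"socks"}     # another client adds socks

resolved = resolve_cart([replica_a, replica_b])
print(sorted(resolved))          # the removed lamp is back in the cart
```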

Slide 12

Slide 12 text

Convergent Replicated Data Types

Slide 13

Slide 13 text

Convergent Replicated Data Types
Data Types: useful abstractions

Slide 14

Slide 14 text

Convergent Replicated Data Types
Replicated: multiple independent copies
Data Types: useful abstractions

Slide 15

Slide 15 text

Convergent Replicated Data Types
Convergent: resolves automatically toward a single value
Replicated: multiple independent copies
Data Types: useful abstractions

Slide 16

Slide 16 text

How CRDTs Work
• A partially-ordered set of values
• A merge function
• An identity value
• Inflation operations

Slide 17

Slide 17 text

How CRDTs Work
• A partially-ordered set of values
• A merge function
• An identity value
• Inflation operations

What CRDTs Enable
• Consistency without coordination
• Fluent, rich interaction with data
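The four ingredients listed under "How CRDTs Work" can be illustrated with a grow-only counter. This is a minimal sketch of the general idea, not Riak's implementation; the class name `GCounter` and the actor names `n1` and `n2` are assumptions made for the example.

```python
class GCounter:
    """Grow-only counter: one non-negative tally per actor."""

    def __init__(self, counts=None):
        # Identity value: the empty map, i.e. every actor at zero.
        self.counts = dict(counts or {})

    def increment(self, actor, amount=1):
        # Inflation operation: the new state is >= the old one in the partial order.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[actor] = self.counts.get(actor, 0) + amount

    def merge(self, other):
        # Merge function: per-actor maximum (commutative, associative, idempotent).
        actors = set(self.counts) | set(other.counts)
        return GCounter(
            {a: max(self.counts.get(a, 0), other.counts.get(a, 0)) for a in actors}
        )

    def leq(self, other):
        # Partial order: self <= other when every tally is <= its counterpart.
        return all(n <= other.counts.get(a, 0) for a, n in self.counts.items())

    @property
    def value(self):
        return sum(self.counts.values())


# Two replicas accept writes independently, then converge on merge.
a = GCounter()
b = GCounter()
a.increment("n1", 2)
b.increment("n2", 3)
merged = a.merge(b)
print(merged.value)
```

Because the merge is commutative and idempotent, replicas can exchange state in any order, any number of times, and still reach the same value without coordinating.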

Slide 18

Slide 18 text

This research is supported in part by European FP7 project 609 551 SyncFree http://syncfree.lip6.fr/ (2013-2016).

Slide 19

Slide 19 text

by @joedevivo

Slide 20

Slide 20 text

Forget CRDTs Do Data Modeling

Slide 21

Slide 21 text

Data Modeling for Riak
• Identify needs for both read and write
• Design around key as index
• Denormalize relationships if possible
• Weigh data size against coherence

Slide 22

Slide 22 text

Riak Data Types

Slide 23

Slide 23 text

Riak Data Types
Counter :: int          increment, decrement

Slide 24

Slide 24 text

Riak Data Types
Counter :: int          increment, decrement
Set :: { bytes }        add*, remove

Slide 25

Slide 25 text

Riak Data Types
Counter :: int          increment, decrement
Set :: { bytes }        add*, remove
Map :: bytes → DT       remove, update*

Slide 26

Slide 26 text

Riak Data Types
Counter :: int          increment, decrement
Set :: { bytes }        add*, remove
Map :: bytes → DT       remove, update*

Slide 27

Slide 27 text

Riak Data Types
Counter :: int          increment, decrement
Set :: { bytes }        add*, remove
Register :: bytes       assign
Map :: bytes → DT       remove, update*

Slide 28

Slide 28 text

Riak Data Types
Counter :: int          increment, decrement
Set :: { bytes }        add*, remove
Flag :: boolean         enable*, disable
Register :: bytes       assign
Map :: bytes → DT       remove, update*

Slide 29

Slide 29 text

MADDATA

Slide 30

Slide 30 text

Counters

Slide 31

Slide 31 text

Ad Network
• Impressions - when someone sees an ad
• Click-through - when someone clicks on an ad
• Hourly rollups
ad-metrics//-

Slide 32

Slide 32 text

Ad Network
$ riak-admin bucket-type create ad-metrics \
    '{"props":{"datatype":"counter"}}'
ad-metrics created
$ riak-admin bucket-type activate ad-metrics
ad-metrics has been activated
$ riak-admin bucket-type list
ad-metrics (active)

Slide 33

Slide 33 text

Ad Network
from riak import RiakClient
from rogersads import RIAK_CONFIG
from time import strftime

client = RiakClient(**RIAK_CONFIG)
metrics = client.bucket_type('ad-metrics')

def record_metric(campaign, metric_type):
    # One counter per metric type per hour, e.g. a key ending in -YYYYMMDD-HH
    key = metric_type + strftime('-%Y%m%d-%H')
    counter = metrics.bucket(campaign).new(key)
    counter.increment()
    counter.store()
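Reading a rollup back means knowing which keys to ask for. The sketch below rebuilds the hourly keys for one day using the same `'-%Y%m%d-%H'` suffix as `record_metric`. The helpers `hourly_keys` and `daily_total` are hypothetical, and `fetch_count` stands in for whatever lookup returns a counter's integer value for a key; no Riak call is made here.

```python
from datetime import date, datetime, timedelta


def hourly_keys(metric_type, day):
    """Return the 24 hourly counter keys for `day` (a date or datetime)."""
    start = datetime(day.year, day.month, day.day)
    return [
        metric_type + (start + timedelta(hours=h)).strftime('-%Y%m%d-%H')
        for h in range(24)
    ]


def daily_total(fetch_count, metric_type, day):
    """Sum the hourly counts; `fetch_count(key)` is an assumed lookup returning an int."""
    return sum(fetch_count(key) for key in hourly_keys(metric_type, day))


keys = hourly_keys('clicks', date(2014, 6, 11))
print(keys[0], keys[-1])
```

Because the key is the index, a day's total costs 24 key lookups and needs no query; that is the trade-off behind designing around the key.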

Slide 34

Slide 34 text

Ad Network
[Chart: y-axis 0 to 3000, x-axis 9 to 16]

Slide 35

Slide 35 text

Sets

Slide 36

Slide 36 text

PartyOn
• RSVPs - guest lists
• Connections - friends lists per-user
• Likes - expressing interest

Slide 37

Slide 37 text

PartyOn
$ riak-admin bucket-type create partyon-sets \
    '{"props":{"datatype":"set"}}'
partyon-sets created
$ riak-admin bucket-type activate partyon-sets
partyon-sets has been activated
$ riak-admin bucket-type list
partyon-sets (active)

Slide 38

Slide 38 text

PartyOn
• RSVPs: partyon-sets/rsvps/
• Connections: partyon-sets/friends/
• Likes: partyon-sets/likes/

Slide 39

Slide 39 text

PartyOn
from riak.datatypes import Set

sets = client.bucket_type('partyon-sets')
rsvps = sets.bucket('rsvps')
friends = sets.bucket('friends')
likes = sets.bucket('likes')

Slide 40

Slide 40 text

PartyOn
def rsvp_get(event):
    return rsvps.get(event)  # Returns a Set

def rsvp_add(event, user):
    guests = rsvps.new(event)
    guests.add(user)
    guests.store(return_body=True)
    # Hand the context back so a later remove can refer to what was observed
    return guests.context

def rsvp_remove(event, user, context):
    guests = Set(rsvps, event, context=context)
    guests.remove(user)
    guests.store()
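`rsvp_remove` takes a context because a remove is only meaningful relative to the adds the caller has observed. The sketch below models that idea with a simplified observed-remove set in plain Python. It is an illustration under assumptions, not Riak's implementation: `ORSet` and its methods are hypothetical, and a process-wide counter stands in for the unique tags a real system would derive from actor identities.

```python
import itertools


class ORSet:
    """Observed-remove set: each add gets a unique tag; a remove deletes only observed tags."""

    _tags = itertools.count()  # stand-in for globally unique (actor, sequence) tags

    def __init__(self):
        self.adds = set()      # (element, tag) pairs
        self.removed = set()   # (element, tag) pairs that were observed, then removed

    def add(self, element):
        self.adds.add((element, next(ORSet._tags)))

    def context(self):
        # Snapshot of the adds this replica has seen.
        return frozenset(self.adds)

    def remove(self, element, context):
        # Only the adds present in the caller's context are cancelled.
        self.removed |= {pair for pair in context if pair[0] == element}

    def merge(self, other):
        merged = ORSet()
        merged.adds = self.adds | other.adds
        merged.removed = self.removed | other.removed
        return merged

    @property
    def value(self):
        return {element for element, _tag in self.adds - self.removed}


# Replica A adds alice; replica B starts as a copy of A.
a = ORSet()
a.add("alice")
b = ORSet()
b.adds = set(a.adds)

# B removes the alice it observed while A concurrently adds her again.
b.remove("alice", b.context())
a.add("alice")

merged = a.merge(b)
print(merged.value)  # the concurrent add survives the remove
```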

Slide 41

Slide 41 text

Maps (and the rest)

Slide 42

Slide 42 text

GameNet
• User profiles - demographic data: users/profiles/
• Achievements - trophies per game: users/trophies/
• Game state - progress and stats: users//

Slide 43

Slide 43 text

GameNet
$ riak-admin bucket-type create users \
    '{"props":{"datatype":"map"}}'
users created
$ riak-admin bucket-type activate users
users has been activated
$ riak-admin bucket-type list
users (active)

Slide 44

Slide 44 text

GameNet
users = client.bucket_type('users')

def update_profile(user, fields):
    profile = users.bucket('profiles').get(user)
    for field in fields:
        if field in USER_FLAGS:
            # Boolean fields are stored as flags inside the map
            if fields[field]:
                profile.flags[field].enable()
            else:
                profile.flags[field].disable()
        else:
            # Everything else is stored as a register
            value = fields[field]
            profile.registers[field].assign(value)
    profile.store()

Slide 45

Slide 45 text

GameNet
def add_trophy(user, game, trophy):
    trophies = users.bucket('trophies').get(user)
    trophies.sets[game].add(trophy)
    trophies.store()

def get_trophies(user, game):
    trophies = users.bucket('trophies').get(user)
    return trophies.sets[game].value

Slide 46

Slide 46 text

GameNet
def build_structure(user, game, structure, gold, wood, stone):
    gamestate = users.bucket(game).get(user)
    # All of these mutations are sent together in a single update to one map
    gamestate.sets['structures'].add(structure)
    gamestate.counters['gold'].decrement(gold)
    gamestate.counters['wood'].decrement(wood)
    gamestate.counters['stone'].decrement(stone)
    gamestate.store(return_body=True)
    return gamestate.value

Slide 47

Slide 47 text

GameNet
client.create_search_index('asteroids')
users.bucket('asteroids').set_property('search_index', 'asteroids')

def find_asteroids_opponents(min_score=0):
    # Solr range queries require an uppercase TO
    query = "score_counter:[{} TO *]".format(min_score)
    results = client.fulltext_search(
        'asteroids', query,
        fl=['userid_register', 'score_counter'])
    return results['docs']

Slide 48

Slide 48 text

Benefits
• Richer interactions, familiar types
• Write mutations, not state
• No merge function to write
• Same reliability and predictability of vanilla Riak

Slide 49

Slide 49 text

Caveats
• Value size still matters
• Updates not idempotent (yet)
• Cross-key atomicity not possible (yet)
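The idempotence caveat is easiest to see with a counter. The sketch below is a plain-Python model, not Riak client code: `apply_increment` and `apply_add` are hypothetical stand-ins for the server applying an operation, and the scenario assumes a client that retries after losing the acknowledgement of a write that actually succeeded.

```python
def apply_increment(counter, amount):
    """A counter update is an operation: applying it twice counts twice."""
    return counter + amount


def apply_add(members, element):
    """Adding an element that is already present leaves the set unchanged."""
    return members | {element}


# The first attempt succeeded, but the client never saw the reply and retried.
clicks = 0
clicks = apply_increment(clicks, 1)   # original request
clicks = apply_increment(clicks, 1)   # retry of the same logical click

guests = frozenset()
guests = apply_add(guests, "alice")   # original request
guests = apply_add(guests, "alice")   # retry changes nothing in this model

print(clicks, sorted(guests))
```

One click was recorded as two, so callers that need exact counts have to decide how to handle retries themselves.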

Slide 50

Slide 50 text

Future
• Riak 2.0 due out this summer - betas available now!
• Richer querying, lighter storage requirements, more types
