Slide 1

Slide 1 text

CRDT: Convergent Replicated Data Types

Bryce Kerley

SRC Fringe, Sunday, July 1, 2012

Slide 2

Slide 2 text

I’m Bryce Kerley

Slide 7

Slide 7 text

I’m Bryce Kerley, Basho

Slide 8

Slide 8 text

I’m Bryce Kerley, Basho Developer Advocate

Slide 10

Slide 10 text

I’m Bryce Kerley, Basho Developer Advocate

[email protected]
@bonzoesc

Slide 11

Slide 11 text

Riak is Distributed

r1n1, r1n2, r1n3, r1n4, r1n5

Slide 12

Slide 12 text

Non-Distributed: Just use a lock

Slide 13

Slide 13 text

Distributed: It’s hard :(

Slide 14

Slide 14 text

Distributed: Let’s play a game!

Notes: Yes, we will both attack at the agreed-upon time.

Slide 15

Slide 15 text

Distributed: Embrace uncertainty.

Notes: You can’t win, but you can change the game.

Slide 16

Slide 16 text

Distributed: Synchronization vs. Tolerance

Slide 17

Slide 17 text

Commutative and Convergent

1 + 2 + 3
3 + 1 + 2
1 + 3 + 2

Slide 18

Slide 18 text

Commutative and Convergent

Addition
Set union

Slide 19

Slide 19 text

Commutative and Convergent

Addition
Set union
But not floating point

Slide 20

Slide 20 text

Commutative and Convergent

Addition
Set union
But not floating point; never use floating point
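The point about ordering can be checked directly. The sketch below is only an illustration and not taken from the talk: it sums the slide's integers in every order, unions some made-up sets in every order, and then shows why floating point is excluded. Pairwise floating-point addition is commutative, but it is not associative, so replicas that group the same additions differently can end up with different totals.

```python
from functools import reduce
from itertools import permutations

# Integer addition: every ordering of the operands gives the same sum.
int_sums = {reduce(lambda x, y: x + y, p) for p in permutations([1, 2, 3])}

# Set union: every merge order gives the same set (example sets are made up).
parts = [{"a"}, {"b"}, {"a", "c"}]
unions = {reduce(lambda x, y: x | y, p, frozenset()) for p in permutations(parts)}

# Floating point: regrouping the same operands changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(int_sums)       # one distinct sum
print(len(unions))    # one distinct union
print(left == right)  # False on IEEE 754 doubles
```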

Slide 21

Slide 21 text

G-Counter: How many times has something been seen to happen?

Slide 22

Slide 22 text

G-Counter

Notes: Does it matter if I see something momentarily inaccurate? Not really. Does it matter if inaccuracies accumulate? Probably.

Slide 23

Slide 23 text

G-Counter

node   count
a      1000
b      100
c      77
total  1177

Notes: Does it matter if I see something momentarily inaccurate? Not really. Does it matter if inaccuracies accumulate? Probably.

Slide 24

Slide 24 text

G-Counter

Left:
node   count
a      1000
b      100
c      77
total  1177

Right:
node   count
a      1000
b      100
c      77
total  1177

Notes: On the left, we see node “B”. On the right, we see node “A”.

Slide 25

Slide 25 text

G-Counter

Left (node “B”):
node   count
a      1000
b      105
c      77
total  1182

Right (node “A”):
node   count
a      1007
b      100
c      77
total  1184

Notes: “B” increments by 5. “A” increments by 7. They each see different counts.

Slide 26

Slide 26 text

G-Counter

node   count
a      1007
b      105
c      77
total  1189

Notes: However, they converge to the same count.

Slide 27

Slide 27 text

G-Counter

Each counting node counts for itself only.
When merging, the highest count per row wins.

Slide 28

Slide 28 text

G-Counter

    {
      type: 'g-counter',
      counts: { 'a': 1007, 'b': 105, 'c': 77 }
    }

node   count
a      1007
b      105
c      77
total  1189
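The two G-Counter rules (each node increments only its own row; a merge keeps the highest count per row) can be sketched in a few lines. This is a minimal illustration, not the code from the talk: the class name `GCounter` and its method names are assumptions, while the node names and counts come from the slides.

```python
class GCounter:
    """Grow-only counter: one row per node, merged by per-row maximum."""

    def __init__(self, node_id, counts=None):
        self.node_id = node_id            # the only row this replica may increment
        self.counts = dict(counts or {})  # node name -> count

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Highest count per row wins.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


start = {"a": 1000, "b": 100, "c": 77}
on_a = GCounter("a", start)
on_b = GCounter("b", start)

on_a.increment(7)
on_b.increment(5)
print(on_a.value(), on_b.value())  # 1184 1182: the replicas disagree for now

on_a.merge(on_b)
on_b.merge(on_a)
print(on_a.value(), on_b.value())  # 1189 1189: they converge
```

Because the merge is a per-row maximum, merging the same state twice or in a different order gives the same result, which is what lets replicas exchange state without coordination.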

Slide 29

Slide 29 text

G-Set: Set union is also commutative and convergent

Notes: Set insertion is a special case of set union.

Slide 30

Slide 30 text

G-Counter

    {
      type: 'g-counter',
      counts: { 'a': 1007, 'b': 105, 'c': 77 }
    }

node   count
a      1007
b      105
c      77
total  1189

Slide 31

Slide 31 text

G-Set

    {
      type: 'g-set',
      members: [ 'a', 'b', 'c' ]
    }

members
a
b
c
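A grow-only set follows the same pattern as the counter, with set union as the merge. The sketch below is an assumption-laden illustration: the class name `GSet` and the `to_dict` helper are hypothetical, and only the `type` and `members` keys come from the slide.

```python
class GSet:
    """Grow-only set: elements can be added, never removed."""

    def __init__(self, members=()):
        self.members = set(members)

    def add(self, element):
        # Insertion is union with a one-element set.
        self.members |= {element}

    def merge(self, other):
        self.members |= other.members

    def to_dict(self):
        # Mirrors the representation shown on the slide.
        return {"type": "g-set", "members": sorted(self.members)}


left = GSet(["a", "b"])
right = GSet(["b"])
right.add("c")

left.merge(right)
right.merge(left)
print(left.to_dict())  # {'type': 'g-set', 'members': ['a', 'b', 'c']}
```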

Slide 34

Slide 34 text

Sets: Can we remove an element from a set?

Slide 35

Slide 35 text

naïve set: Just remove it?

Slide 36

Slide 36 text

naïve set

members: a, b
members: b, c

Notes: Did we remove “a” and add “c”? Did we remove “c” and add “a”? We don’t know which came first.

Slide 37

Slide 37 text

Removal

Notes: When you add an object to signal that another one’s been removed, it’s called a “tombstone.”

Slide 38

Slide 38 text

2P-Set

additions: a, b, c
removals: a

Slide 39

Slide 39 text

2P-Set

additions: a, b, c
removals: a

How do I add “a” again?
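A two-phase set is two grow-only sets, one for additions and one for removals (the tombstones). The sketch below is illustrative only, with the hypothetical class name `TwoPSet`. It also answers the slide's question: once “a” is in the removals set, adding it again has no visible effect.

```python
class TwoPSet:
    """Two-phase set: a G-Set of additions plus a G-Set of tombstones."""

    def __init__(self):
        self.additions = set()
        self.removals = set()

    def add(self, element):
        self.additions.add(element)

    def remove(self, element):
        # Only something that has been added can be removed.
        if element in self.additions:
            self.removals.add(element)

    def members(self):
        return self.additions - self.removals

    def merge(self, other):
        # Both halves are grow-only, so both merge by union.
        self.additions |= other.additions
        self.removals |= other.removals


s = TwoPSet()
for name in ("a", "b", "c"):
    s.add(name)
s.remove("a")
print(sorted(s.members()))  # ['b', 'c']

s.add("a")                  # the tombstone for "a" is still there
print(sorted(s.members()))  # ['b', 'c']
```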

Slide 40

Slide 40 text

LWW-Element-Set

element  +  -
tef      3  4
sam      1
ry       5  5

Notes: Last write wins; each write is indexed by an increasing integer.

Slide 42

Slide 42 text

LWW-Element-Set

element  +  -
ry       1  2

Notes: Ryan is added to the set on the left, added and then removed on the right, which is how we got into the current conflict.

Slide 43

Slide 43 text

LWW-Element-Set

element  +  -
ry       1  2

element  +  -
ry       4  5

Notes: Ryan is added to the set on the left, added and then removed on the right, which is how we got into the current conflict.

Slide 44

Slide 44 text

LWW-Element-Set

element  +  -
ry       1  2

element  +  -
ry       5  2

element  +  -
ry       4  5

Notes: Ryan is added to the set on the left, added and then removed on the right, which is how we got into the current conflict.

Slide 45

Slide 45 text

LWW-Element-Set

element  +  -
ry       1  2

element  +  -
ry       5  5

element  +  -
ry       5  2

element  +  -
ry       4  5

Notes: Ryan is added to the set on the left, added and then removed on the right, which is how we got into the current conflict.

Slide 46

Slide 46 text

LWW-Element-Set

element  +  -
tef      3  4
sam      1
ry       5  5

bias: +
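The tables above can be read as two counters per element: the highest add index and the highest remove index seen so far. The following sketch rests on that reading and is not the talk's code. The class name `LWWElementSet` and the `bias` parameter are assumptions; the element names and indices come from the slides.

```python
class LWWElementSet:
    """Last-write-wins element set with a tie-breaking bias ("+" or "-")."""

    def __init__(self, bias="+"):
        if bias not in ("+", "-"):
            raise ValueError("bias must be '+' or '-'")
        self.bias = bias
        self.added = {}    # element -> highest add index seen
        self.removed = {}  # element -> highest remove index seen

    def add(self, element, index):
        self.added[element] = max(self.added.get(element, index), index)

    def remove(self, element, index):
        self.removed[element] = max(self.removed.get(element, index), index)

    def __contains__(self, element):
        if element not in self.added:
            return False
        if element not in self.removed:
            return True
        add_index, remove_index = self.added[element], self.removed[element]
        if add_index == remove_index:
            return self.bias == "+"   # a tie is settled by the bias
        return add_index > remove_index

    def members(self):
        return {e for e in self.added if e in self}

    def merge(self, other):
        # Keep the highest index per element and per column.
        for element, index in other.added.items():
            self.add(element, index)
        for element, index in other.removed.items():
            self.remove(element, index)


left = LWWElementSet(bias="+")
left.add("ry", 5)
left.remove("ry", 2)

right = LWWElementSet(bias="+")
right.add("ry", 4)
right.remove("ry", 5)

left.merge(right)
print(left.added["ry"], left.removed["ry"])  # 5 5
print("ry" in left)                          # True, because the bias is "+"
```

With a bias of "-" the same merged state would report “ry” as absent, so the bias has to be chosen per use case and kept identical on every replica.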

Slide 47

Slide 47 text

So far…

G-Counter: a set of counters
G-Set: a set
2P-Set: two sets
LWW-Element-Set: a set of counters

Slide 48

Slide 48 text

OR-Set

Slide 49

Slide 49 text

OR-Set

element  +     -
eric     1, 2  1
aaron    3
ryan     4, 5  4

Notes: Each element is a 2P-set of additions and removals. To remove ryan from the set, we put the corresponding removal in the “removed” part of his 2P-set.

Slide 50

Slide 50 text

OR-Set

element  +     -
eric     1, 2  1
aaron    3
ryan     4, 5  4, 5

Notes: Each element is a 2P-set of additions and removals. To remove ryan from the set, we put the corresponding removal in the “removed” part of his 2P-set.
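The structure in the table can be sketched as a map from each element to two grow-only sets of tags. This is a hedged illustration rather than the speaker's implementation: the class name `ORSet` is hypothetical, and the tags are the small integers from the slide, passed in by the caller. A real system would need tags that are unique across replicas.

```python
class ORSet:
    """Observed-remove set: each element carries add tags and removed tags."""

    def __init__(self):
        self.adds = {}     # element -> set of add tags
        self.removes = {}  # element -> set of removed tags

    def add(self, element, tag):
        self.adds.setdefault(element, set()).add(tag)

    def remove(self, element):
        # Tombstone every add tag this replica has observed for the element.
        observed = self.adds.get(element, set())
        self.removes.setdefault(element, set()).update(observed)

    def __contains__(self, element):
        live = self.adds.get(element, set()) - self.removes.get(element, set())
        return bool(live)

    def members(self):
        return {e for e in self.adds if e in self}

    def merge(self, other):
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        for element, tags in other.removes.items():
            self.removes.setdefault(element, set()).update(tags)


rsvp = ORSet()
rsvp.add("eric", 1)
rsvp.remove("eric")   # removes tag 1
rsvp.add("eric", 2)   # tag 2 is still live
rsvp.add("aaron", 3)
rsvp.add("ryan", 4)
rsvp.remove("ryan")   # removes tag 4
rsvp.add("ryan", 5)
print(sorted(rsvp.members()))  # ['aaron', 'eric', 'ryan']

rsvp.remove("ryan")   # removes tags 4 and 5
print(sorted(rsvp.members()))  # ['aaron', 'eric']
```

Unlike the 2P-Set, an element can come back after a removal, because a later add uses a fresh tag that the earlier removal did not cover.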

Slide 51

Slide 51 text

OR-Set

element  +     -
eric     1, 2  1
aaron    3
ryan     4, 5  4, 5

Notes: We can use this to implement a list of users that have responded to a social event: they add to the “+” set when they say yes, and to the “-” set when they say no.

Slide 52

Slide 52 text

OR-Set

element  +     -
eric     1, 2  1
aaron    3
ryan     4, 5  4, 5

“I was indecisive!”

Notes: We can use this to implement a list of users that have responded to a social event: they add to the “+” set when they say yes, and to the “-” set when they say no.

Slide 53

Slide 53 text

OR-Set

element  +     -
eric     1, 2  1
aaron    3
ryan     4, 5  4, 5

“I was indecisive!”
“Of course!”

Notes: We can use this to implement a list of users that have responded to a social event: they add to the “+” set when they say yes, and to the “-” set when they say no.

Slide 54

Slide 54 text

OR-Set

element  +     -
eric     1, 2  1
aaron    3
ryan     4, 5  4, 5

“I was indecisive!”
“Of course!”
“I have to do my hair that day :(”

Notes: We can use this to implement a list of users that have responded to a social event: they add to the “+” set when they say yes, and to the “-” set when they say no.

Slide 55

Slide 55 text

Garbage Collection

An update f will sometimes add some information r(f) to the payload in order to deal cleanly with operations concurrent with f. As an example, in the Add-Remove Partial Order of Section 3.4.2, remove leaves a tombstone in order to allow addBetweens to proceed. Once f is stable, i.e., all operations concurrent with f have been delivered, r(f) serves no useful purpose. A GC opportunity exists to detect this condition and discard r(f).

Liveness of $\Phi$ requires that the set of replicas be known and that they not crash permanently (undetectably). Under these assumptions, the stability algorithm of Wuu and Bernstein [44] can be adapted. The algorithm assumes causal delivery. An update g has an associated vector clock v(g). Replica $x_i$ maintains the last vector clock value received from every other replica $x_j$, noted $V^{\min}(j)$, which identifies all updates that $x_i$ knows to have been delivered by $x_j$. Replica must periodically propagate its vector clock to update $V^{\min}$ values, possibly by sending empty messages. With this information, $(\forall j : V^{\min}(j) \ge v(f)) \Rightarrow \Phi(f)$. Importantly, the information required is typically already used by a reliable delivery mechanism, and GC can be performed in the background.

Slide 56

Slide 56 text

Garbage Collection

An update f will sometimes add some information r(f) to the payload in order to deal cleanly with operations concurrent with f. As an example, in the Add-Remove Partial Order of Section 3.4.2, remove leaves a tombstone in order to allow addBetweens to proceed. Once f is stable, i.e., all operations concurrent with f have been delivered, r(f) serves no useful purpose. A GC opportunity exists to detect this condition and discard r(f).

Liveness of $\Phi$ requires that the set of replicas be known and that they not crash permanently (undetectably). Under these assumptions, the stability algorithm of Wuu and Bernstein [44] can be adapted. The algorithm assumes causal delivery. An update g has an associated vector clock v(g). Replica $x_i$ maintains the last vector clock value received from every other replica $x_j$, noted $V^{\min}(j)$, which identifies all updates that $x_i$ knows to have been delivered by $x_j$. Replica must periodically propagate its vector clock to update $V^{\min}$ values, possibly by sending empty messages. With this information, $(\forall j : V^{\min}(j) \ge v(f)) \Rightarrow \Phi(f)$. Importantly, the information required is typically already used by a reliable delivery mechanism, and GC can be performed in the background.

tl;dr: garbage collection is necessary but can be dicey!
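The stability condition in the quoted passage is a component-wise comparison of vector clocks. The sketch below shows only that check, under stated assumptions: clocks are plain dictionaries from replica name to counter, missing entries count as zero, and the function names `dominates` and `is_stable` are hypothetical. It does not implement the Wuu and Bernstein algorithm or the message exchange around it.

```python
def dominates(clock, other):
    """True if clock >= other in every component (missing entries count as 0)."""
    return all(clock.get(node, 0) >= count for node, count in other.items())


def is_stable(update_clock, v_min):
    """Check the condition: for all j, V_min(j) >= v(f).

    v_min maps each other replica j to the last vector clock received from it.
    """
    return all(dominates(seen, update_clock) for seen in v_min.values())


v_f = {"a": 2, "b": 1}  # vector clock of the update f (made-up values)

v_min = {
    "b": {"a": 2, "b": 3},
    "c": {"a": 1, "b": 1},  # replica c has not yet reported a >= 2
}
print(is_stable(v_f, v_min))  # False: the tombstone r(f) must be kept

v_min["c"] = {"a": 2, "b": 1, "c": 4}
print(is_stable(v_f, v_min))  # True: r(f) can be discarded
```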

Slide 57

Slide 57 text

Why?

Notes: If you’re lucky, your app will grow bigger than you can handle with a single-machine database. Distributed databases make compromises that CRDTs can work

Slide 58

Slide 58 text

Why?

r1n1, r1n2, r1n3, r1n4, r1n5

Notes: If you’re lucky, your app will grow bigger than you can handle with a single-machine database. Distributed databases make compromises that CRDTs can work

Slide 59

Slide 59 text

Why?

r1n1, r1n2, r1n3, r1n4, r1n5
r1n1, r1n2, r1n3, r1n4, r1n5
r1n1, r1n2, r1n3, r1n4, r1n5

Notes: If you’re really spectacularly lucky, your app will grow beyond one datacenter. Distributed databases are the only way to handle this.

Slide 60

Slide 60 text

Code

Slide 62

Slide 62 text

Thanks

http://bit.ly/crdt-src