Convergent Replicated Data Types - SRC Fringe

Data types friendly to distributed databases.

Bryce "BonzoESC" Kerley

July 01, 2012

Transcript

  1. Distributed: Let’s play a game! Sunday, July 1, 12

    Yes, we will both attack at the agreed-upon time.
  2. Commutative and Convergent: 1 + 2 + 3, 3 + 1 + 2, 1 + 3 + 2
  3. Commutative and Convergent: Addition. Set union. But not floating point; never use floating point.
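As a quick check of the claim on slides 2 and 3, the Python sketch below folds the same three numbers in every arrival order. The helper name `fold_add` is made up for this illustration. Integer totals agree in every order; IEEE 754 float totals need not, because float addition is not associative.

```python
from itertools import permutations


def fold_add(values):
    """Add the values left to right, the way updates might arrive at a replica."""
    total = values[0]
    for value in values[1:]:
        total = total + value
    return total


# Every arrival order of integer increments gives the same total.
int_totals = {fold_add(order) for order in permutations([1, 2, 3])}

# The same experiment with floats can give more than one total,
# because rounding depends on the order of the additions.
float_totals = {fold_add(order) for order in permutations([0.1, 0.2, 0.3])}

print(int_totals)             # a single total
print(len(float_totals) > 1)  # more than one distinct total
```

This is why the counters in the following slides are assumed to hold integers.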
  4. G-Counter

    Does it matter if I see something momentarily inaccurate? Not really. Does it matter if inaccuracies accumulate? Probably.
  5. G-Counter (node: count): a: 1000, b: 100, c: 77; total: 1177
  6. G-Counter (node: count), shown twice with identical state: a: 1000, b: 100, c: 77; total: 1177

    On the left, we see node “B”. On the right, we see node “A”.
  7. G-Counter (node: count). Left: a: 1000, b: 105, c: 77; total: 1182. Right: a: 1007, b: 100, c: 77; total: 1184.

    “B” increments by 5. “A” increments by 7. They each see different counts.
  8. G-Counter (node: count): a: 1007, b: 105, c: 77; total: 1189

    However, they converge to the same count.
  9. G-Counter: Each counting node counts for itself only. When merging, the highest count per row wins.
  10. G-Counter: { type: 'g-counter', counts: { 'a': 1007, 'b': 105, 'c': 77 } }. As a table (node: count): a: 1007, b: 105, c: 77; total: 1189
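The G-Counter slides can be replayed with a minimal Python sketch. The class name `GCounter` and its method names are assumptions made for this illustration, not an API from the talk. Each node increments only its own row, and merging keeps the highest count per row.

```python
class GCounter:
    """Grow-only counter: one row per node, merge takes the per-row maximum."""

    def __init__(self, node_id, counts=None):
        self.node_id = node_id
        self.counts = dict(counts or {})

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        # A node only ever touches its own row.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Highest count per row wins.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


start = {"a": 1000, "b": 100, "c": 77}
node_a = GCounter("a", start)
node_b = GCounter("b", start)

node_b.increment(5)
node_a.increment(7)
before_merge = (node_b.value(), node_a.value())  # the two differing totals

node_a.merge(node_b)
node_b.merge(node_a)
print(before_merge, node_a.value(), node_b.value())
```

Because `max` is commutative, associative, and idempotent, the replicas can merge in any order, and repeatedly, and still end up with the same rows.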
  11. G-Set: Set union is also commutative and convergent.

    Set insertion is a special case of set union.
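A grow-only set can be sketched the same way. The class name `GSet` is an assumption for this illustration. Insertion is written as a union with a one-element set, mirroring the note above.

```python
class GSet:
    """Grow-only set: elements can be added but never removed."""

    def __init__(self, items=()):
        self.items = set(items)

    def add(self, item):
        # Insertion is just a union with a one-element set.
        self.merge(GSet([item]))

    def merge(self, other):
        self.items |= other.items


left = GSet(["a"])
right = GSet(["a"])
left.add("b")
right.add("c")

# Merge in both directions; the order does not matter.
left.merge(right)
right.merge(left)
print(sorted(left.items), sorted(right.items))
```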
  13. Sets: Can we remove from a set?
  14. Naïve set: one copy has members a, b; the other has members b, c.

    Did we remove “a” and add “c”? Did we remove “c” and add “a”? We don’t know who was first.
  15. Removal

    When you add an object to signal that another one’s been removed, it’s called a “tombstone.”
  16. 2P-Set: additions: a, b, c; removals: a. How do I add “a” again?
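The question on slide 16 can be made concrete with a small Python sketch. The class name `TwoPhaseSet` and its methods are assumptions for this illustration. The removal set acts as a set of tombstones, and because it also only grows, a removed element cannot come back.

```python
class TwoPhaseSet:
    """2P-Set: one grow-only set of additions, one grow-only set of removals."""

    def __init__(self):
        self.additions = set()
        self.removals = set()  # tombstones

    def add(self, item):
        self.additions.add(item)

    def remove(self, item):
        # Only something that has been added can be removed.
        if item in self.additions:
            self.removals.add(item)

    def members(self):
        return self.additions - self.removals

    def merge(self, other):
        self.additions |= other.additions
        self.removals |= other.removals


s = TwoPhaseSet()
for name in ("a", "b", "c"):
    s.add(name)
s.remove("a")

s.add("a")  # has no visible effect: the tombstone for "a" is permanent
print(sorted(s.members()))
```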
  17. LWW-Element-Set (element: +, -): tef: + 3, - 4; sam: + 1; ry: + 5, - 5

    Last write wins; each write is indexed by an increasing integer.
  19. LWW-Element-Set: ry: + 1, - 2

    Ryan is added to the set on the left, added and then removed on the right, which is how we got into the current conflict.
  20. LWW-Element-Set: [ry: + 1, - 2] [ry: + 4, - 5]
  21. LWW-Element-Set: [ry: + 1, - 2] [ry: + 5, - 2] [ry: + 4, - 5]
  22. LWW-Element-Set: [ry: + 1, - 2] [ry: + 5, - 5] [ry: + 5, - 2] [ry: + 4, - 5]
  23. LWW-Element-Set: tef: + 3, - 4; sam: + 1; ry: + 5, - 5. bias: +
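The “ry” conflict from slides 19 to 23 can be replayed in a short Python sketch. The class name `LWWElementSet`, the `bias` parameter, and the integer stamps are assumptions for this illustration. Each element keeps its latest add stamp and its latest remove stamp; on a tie, the bias decides.

```python
class LWWElementSet:
    """Last-write-wins element set with a configurable tie-breaking bias."""

    def __init__(self, bias="+"):
        if bias not in ("+", "-"):
            raise ValueError("bias must be '+' or '-'")
        self.bias = bias
        self.adds = {}     # element -> latest add stamp
        self.removes = {}  # element -> latest remove stamp

    def add(self, element, stamp):
        self.adds[element] = max(stamp, self.adds.get(element, stamp))

    def remove(self, element, stamp):
        self.removes[element] = max(stamp, self.removes.get(element, stamp))

    def contains(self, element):
        if element not in self.adds:
            return False
        if element not in self.removes:
            return True
        added, removed = self.adds[element], self.removes[element]
        if added == removed:
            return self.bias == "+"  # tie: the bias decides
        return added > removed

    def merge(self, other):
        for element, stamp in other.adds.items():
            self.add(element, stamp)
        for element, stamp in other.removes.items():
            self.remove(element, stamp)


def replay(bias):
    left, right = LWWElementSet(bias), LWWElementSet(bias)
    for replica in (left, right):  # shared starting point: + 1, - 2
        replica.add("ry", 1)
        replica.remove("ry", 2)
    left.add("ry", 5)              # left: + 5, - 2
    right.add("ry", 4)             # right: + 4, - 5
    right.remove("ry", 5)
    left.merge(right)              # merged: + 5, - 5
    return left


merged_plus = replay("+")
merged_minus = replay("-")
print(merged_plus.contains("ry"), merged_minus.contains("ry"))
```

With a “+” bias the tie at stamp 5 keeps “ry” in the set; with a “-” bias the same merged state drops him.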
  24. So far… G-Counter: a set of counters. G-Set: a set. 2P-Set: two sets. LWW-Element-Set: a set of counters.
  25. OR-Set (element: +, -): eric: + {1, 2}, - {1}; aaron: + {3}; ryan: + {4, 5}, - {4}

    Each element is a 2P-set of additions and removals. To remove ryan from the set, we put the corresponding removal in the “removed” part of his 2P-set.
  26. OR-Set (element: +, -): eric: + {1, 2}, - {1}; aaron: + {3}; ryan: + {4, 5}, - {4, 5}
  27. OR-Set (element: +, -): eric: + {1, 2}, - {1}; aaron: + {3}; ryan: + {4, 5}, - {4, 5}

    We can use this to implement a list of users that have responded to a social event: they add to the “+” set when they say yes, and to the “-” set when they say no.
  28. OR-Set (element: +, -): eric: + {1, 2}, - {1}; aaron: + {3}; ryan: + {4, 5}, - {4, 5}. “I was indecisive!”
  29. OR-Set (element: +, -): eric: + {1, 2}, - {1}; aaron: + {3}; ryan: + {4, 5}, - {4, 5}. “I was indecisive!” “Of course!”
  30. OR-Set (element: +, -): eric: + {1, 2}, - {1}; aaron: + {3}; ryan: + {4, 5}, - {4, 5}. “I was indecisive!” “Of course!” “I have to do my hair that day :(”
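The RSVP table on the OR-Set slides can be rebuilt with a minimal Python sketch. The class name `ORSet`, the explicit integer tags, and the `uuid` fallback are assumptions for this illustration. Each add carries a unique tag, a remove tombstones only the tags it has observed, and an element is present while it has at least one live tag.

```python
import uuid


class ORSet:
    """Observed-remove set: per element, a 2P-set of add tags and removed tags."""

    def __init__(self):
        self.added = {}    # element -> set of add tags ("+")
        self.removed = {}  # element -> set of tombstoned tags ("-")

    def add(self, element, tag=None):
        tag = uuid.uuid4().hex if tag is None else tag
        self.added.setdefault(element, set()).add(tag)
        return tag

    def remove(self, element):
        # Tombstone every tag this replica has observed for the element.
        observed = self.added.get(element, set())
        self.removed.setdefault(element, set()).update(observed)

    def contains(self, element):
        live = self.added.get(element, set()) - self.removed.get(element, set())
        return bool(live)

    def members(self):
        return {element for element in self.added if self.contains(element)}

    def merge(self, other):
        for element, tags in other.added.items():
            self.added.setdefault(element, set()).update(tags)
        for element, tags in other.removed.items():
            self.removed.setdefault(element, set()).update(tags)


rsvp = ORSet()
rsvp.add("eric", 1)
rsvp.remove("eric")   # eric says no: tag 1 is tombstoned
rsvp.add("eric", 2)   # eric says yes again with a fresh tag
rsvp.add("aaron", 3)
rsvp.add("ryan", 4)
rsvp.add("ryan", 5)
rsvp.remove("ryan")   # ryan says no: tags 4 and 5 are tombstoned
members_on_slide = rsvp.members()

# A concurrent "yes" from another replica uses an unseen tag, so it survives.
other = ORSet()
other.add("ryan", 6)
rsvp.merge(other)
print(sorted(members_on_slide), sorted(rsvp.members()))
```

Unlike the 2P-Set, an element can return after removal, because the new add uses a tag that no tombstone covers.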
  31. Garbage Collection

    An update f will sometimes add some information r(f) to the payload in order to deal cleanly with operations concurrent with f. As an example, in the Add-Remove Partial Order of Section 3.4.2, remove leaves a tombstone in order to allow addBetweens to proceed. Once f is stable, i.e., all operations concurrent with f have been delivered, r(f) serves no useful purpose. A GC opportunity exists to detect this condition and discard r(f). Liveness of Φ requires that the set of replicas be known and that they not crash permanently (undetectably). Under these assumptions, the stability algorithm of Wuu and Bernstein [44] can be adapted. The algorithm assumes causal delivery. An update g has an associated vector clock v(g). Replica $x_i$ maintains the last vector clock value received from every other replica $x_j$, noted $V_i^{\min}(j)$, which identifies all updates that $x_i$ knows to have been delivered by $x_j$. Replica must periodically propagate its vector clock to update $V^{\min}$ values, possibly by sending empty messages. With this information, $(\forall j : V_i^{\min}(j) \ge v(f)) \Rightarrow \Phi_i(f)$. Importantly, the information required is typically already used by a reliable delivery mechanism, and GC can be performed in the background.
  32. Garbage Collection (the same paper excerpt as slide 31)

    tl;dr: garbage collection is necessary but can be dicey!
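The final condition in the excerpt (every last-known vector clock is at least v(f)) can be illustrated with a small Python check. This is only a sketch of that one comparison, not the Wuu and Bernstein algorithm; the function names `dominates` and `is_stable` and the dictionary representation of vector clocks are assumptions for this illustration.

```python
def dominates(clock, other):
    """True when `clock` >= `other` in every component (missing entries count as 0)."""
    return all(clock.get(node, 0) >= count for node, count in other.items())


def is_stable(update_clock, last_clock_from):
    """Check the stability condition for one update.

    update_clock:    vector clock v(f) of the update f
    last_clock_from: for each other replica j, the last vector clock
                     this replica received from j
    """
    return all(dominates(seen, update_clock) for seen in last_clock_from.values())


v_f = {"a": 3, "b": 1}

# Replica c has not yet reported seeing a's third update.
lagging = {"b": {"a": 3, "b": 2}, "c": {"a": 2, "b": 1}}

# After c propagates a newer clock, f is known to be delivered everywhere.
caught_up = {"b": {"a": 3, "b": 2}, "c": {"a": 3, "b": 1, "c": 1}}

print(is_stable(v_f, lagging), is_stable(v_f, caught_up))
```

Only once the check passes would it be safe to discard tombstones left by f, which is part of why the speaker calls garbage collection dicey: it depends on hearing from every replica.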
  33. Why?

    If you’re lucky, your app will grow bigger than you can handle with a single-machine database. Distributed databases make compromises that CRDTs can work
  34. Why? (diagram nodes: r1n1, r1n2, r1n3, r1n4, r1n5)
  35. Why? (diagram nodes: r1n1, r1n2, r1n3, r1n4, r1n5, shown three times)

    If you’re really spectacularly lucky, your app will grow beyond one datacenter. Distributed databases are the only way to handle this.