Convergent Replicated Data Types in Riak 2.0

Talk by Gordon Guthrie, Senior Software Engineer at Basho.

Summary

A review of the CAP theorem and the difficulties of resolving conflicts in highly distributed systems, covering the issues and the various theories on how to resolve them, including the use of CRDTs in Riak.

Details

CRDTs are used to replicate data across multiple computers in a network, executing updates without the need for remote synchronisation. This leads to merge conflicts in systems using conventional eventual consistency technology, but CRDTs are designed such that conflicts are mathematically impossible. Under the constraints of the CAP theorem they provide the strongest consistency guarantees for available/partition-tolerant (AP) settings.

The CRDT concept was first formally defined in 2007 by Marc Shapiro and Nuno Preguiça in terms of operation commutativity, and development was initially motivated by collaborative text editing. The concept of semilattice evolution of replicated states was first defined by Baquero and Moura in 1997, and development was initially motivated by mobile computing. The two concepts were later unified in 2011.

Basho has worked with the EU and Marc Shapiro's team to push CRDTs into distributed systems. Riak v2.x is the first commercial product to include this functionality.

Big Data Spain

May 29, 2015

Transcript

  1. Convergent Replicated Data Types in Riak 2.0
    Gordon Guthrie, Senior Software Engineer - Basho
  2. Popping a Cap in the Enterprise’s Ass
    [Diagram labels: OLAP; OLTP; Financials; Customers; Company]
    Availability Pressure: 99.9999%, 99.999%, 99.99%, 99.9%, 99%
    Consistency Pressure: Year End, Quarter End, Month End, Week End, End Of Day, End Of Hour, End Of Minute
  3. CRDT’s – The Magic
    [Diagram labels: ‘Data Platform’; Riak; Riak; Available; Replication; Consistent At A Point In Time; Customers; Company]
  4. What is Riak?

  5. What is it for?
    • Riak is a highly-available, low-latency KV store
    • it is an open-source implementation of the Dynamo paper from Amazon
    • Dynamo underpins the Amazon shopping cart
  6. Why does it matter?
    • availability matters
    • latency matters
    • Amazon found every 100ms of latency cost them 1% in sales
    • Google found an extra 0.5 seconds in search page generation time dropped traffic by 20%
  7. Riak Availability

  8. The Ring

  9. …it’s a hash function…
    If I have:
    • a ring of 16 slots
    • a hashed keyspace of 2^8
    Then the slots are:
    • Slot 1: 0 – 15
    • Slot 2: 16 – 31
    • Slot 3: 32 – 47
    • etc, etc
    {key: ‘Alice’, value: ‘hello’}
    hash(‘Alice’, 2^8) -> 17
    Store it in Slot 2
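The slot arithmetic on this slide can be sketched in a few lines of Python. This is a toy illustration under the slide's numbers (16 slots, a keyspace of 2^8), not Riak's actual ring code; the function names are hypothetical, and reducing a SHA-1 digest to 8 bits is an assumption made only so the example matches the slide's scale.

```python
import hashlib

RING_SLOTS = 16                      # number of slots in the ring (from the slide)
KEYSPACE = 2 ** 8                    # size of the hashed keyspace (from the slide)
SLOT_WIDTH = KEYSPACE // RING_SLOTS  # 16 hash values per slot


def hash_key(key: str) -> int:
    """Hash a key into 0..KEYSPACE-1. SHA-1 reduced mod 2^8 is an assumption."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % KEYSPACE


def slot_for_hash(h: int) -> int:
    """Return the 1-based slot that owns hash value h (Slot 1 owns 0..15)."""
    return h // SLOT_WIDTH + 1


def slot_for_key(key: str) -> int:
    """Hash the key, then look up the slot that owns the hash."""
    return slot_for_hash(hash_key(key))
```

With these definitions, a hash value of 17 (the slide's value for ‘Alice’) falls in Slot 2, since Slot 2 owns 16 to 31. The toy hash here will not necessarily return 17 for ‘Alice’.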
  10. Mapped To Machines
    Physical Machine 1, Physical Machine 2, Physical Machine 3, Physical Machine 4, Physical Machine 5
  11. Machines Go Down
    Physical Machine 1, Physical Machine 2, Physical Machine 3, Physical Machine 3, Physical Machine 4, Physical Machine 5
  12. And Come Back
    Physical Machine 1, Physical Machine 2, Physical Machine 3, Physical Machine 4, Physical Machine 5
  13. but that’s not enough
    That’s how the data survives a box dying
  14. What happens if the network partitions

  15. CAP Theorem

  16. The Cap Theorem Pick 2 Consistency Availability Partition Tolerance Riak

  17. CAP Theorem
    • In a distributed computer system you trade:
      - Consistency
      - Availability
      - Partition tolerance
    • Pick two
      - CA (Relational)
      - CP (Mongo, HBase, Redis)
      - AP (Riak, Cassandra, Couchbase, DynamoDB)
    [Diagram labels: C; A; P; Relaxed consistency; X]
  18. Tuning the CAP
    • Think of C, A and P as dials on a dashboard
    • In a good implementation (like Riak) they can be tuned to achieve the most appropriate balance of consistency, availability and partition tolerance for specific workloads
    [Figure: three dials labelled A, P and C, each scaled 0 to 11]
  19. Partitions Happen
    • because Riak is a highly-available data store it will continue to store your data
    • but your data will not be consistent
  20. AP
    Physical Machine 1, Physical Machine 2, Physical Machine 3, Physical Machine 4, Physical Machine 5
    [Diagram labels: Partition; Val 1; Val 2]
  21. And Come Back
    Physical Machine 1, Physical Machine 2, Physical Machine 3, Physical Machine 4, Physical Machine 5
  22. Now you have siblings
    Physical Machine 1, Physical Machine 2, Physical Machine 3, Physical Machine 4, Physical Machine 5
    {Val 1, Val 2}
  23. Siblings
    the developer’s problem that Riak 2.0 is trying to fix
  24. Sibling Resolution
    • Developers hate siblings
    • Developers hate resolving siblings
    • Developers play Timestamp Roulette and go for Last Write Wins
    • but with intercontinental lag, NTP failure, clock skew and latency, sometimes timestamps kill your data
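To make the "Timestamp Roulette" point concrete, here is a minimal sketch of Last Write Wins losing data under clock skew. The `Write` type, the skew value and the cart contents are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp: float  # wall-clock time reported by the writer's own machine


def last_write_wins(a: Write, b: Write) -> Write:
    """Resolve two siblings by keeping the one with the larger timestamp."""
    return a if a.timestamp >= b.timestamp else b


# Real order of events: Alice writes at t=100, Bob writes later at t=102.
# Bob's clock runs 5 seconds slow (hypothetical skew), so his write is
# stamped 97 even though it happened last.
true_time_alice, true_time_bob = 100.0, 102.0
bob_skew = -5.0

alice = Write("cart: [book]", true_time_alice)
bob = Write("cart: [book, pen]", true_time_bob + bob_skew)

winner = last_write_wins(alice, bob)
```

Here `winner` is Alice's older write: Bob's later update, which added the pen, is silently discarded because his timestamp is smaller.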
  25. Google’s View
    • “Designing applications to cope with concurrency anomalies in their data is very error-prone, time-consuming, and ultimately not worth the performance gains.”
    • “We have a lot of experience with eventual consistency systems at Google.”
    • “We find developers spend a significant fraction of their time building extremely complex and error-prone mechanisms to cope with eventual consistency”
  26. The Partition Cycle
    [Diagram labels, in extraction order: Consistent, Available, Available, Partitioned, Not Consistent, Available, (Eventually) Consistent, Available, partitioned, unpartitioned, Developer Hell, Developer Heaven, CRDTs]
  27. CRDTs Convergent Replicated Data Types

  28. Understanding CRDTs
    • Not going to go into a lot of detail
    • Based on the idea of Lamport Vector Clocks
    • A Lamport Vector Clock allows an ‘actor’ in a distributed data system to say 2 things:
    • this is when I changed this data
    • and when I changed it I knew all the things every other actor had done up to these times
    • It’s a vector clock because it has a separate clock for every actor
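The two statements a vector clock lets an actor make can be sketched as follows. This is a minimal illustration, assuming a clock is a plain dictionary from actor name to counter; the function names are hypothetical and are not Riak's API.

```python
from typing import Dict

VectorClock = Dict[str, int]  # actor name -> that actor's logical counter


def increment(clock: VectorClock, actor: str) -> VectorClock:
    """Record one more change by `actor`, returning a new clock."""
    new = dict(clock)
    new[actor] = new.get(actor, 0) + 1
    return new


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` has seen everything `b` has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if neither clock has seen all of the other: a real conflict."""
    return not descends(a, b) and not descends(b, a)


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Combine knowledge from both clocks: per-actor maximum."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0))
            for actor in a.keys() | b.keys()}


base = increment({}, "alice")     # {"alice": 1}
left = increment(base, "alice")   # Alice changes the data again
right = increment(base, "bob")    # Bob changes it without seeing Alice's update
```

In this run `left` descends from `base`, while `left` and `right` are concurrent, which is the situation that produces siblings. Merging them gives `{"alice": 2, "bob": 1}`.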
  29. Where Are The Actors?
    • Actors can be client-side or server-side or both
    • Riak is implementing an architecture that allows users to use server-side CRDTs
    • to avoid the risk of ‘actor’ explosion
  30. Client Side Actors
    • Alice’s Laptop
    • Alice’s phone
    • Alice’s iPad
    • Alice’s daughter’s iPad (her battery is flat)
    • Bob’s new laptop
    • Bob’s old laptop
    • etc, etc
  31. None
  32. How They Work
    • Let’s take a simple example with two actions:
    • add an item to a shopping cart
    • empty the shopping cart
    • these actions don’t commute:
    • add item then clear gives an empty cart
    • empty cart then add item gives a cart with one item in it
  33. How They Work
    [Diagram labels, in extraction order: Consistent, Available, Available, Partitioned, Not Consistent, Available, (Eventually) Consistent, Available, partitioned, unpartitioned, CRDTs, Clear, Add Item]
    [{a, }] [{b, }] {[{a, }], [{b, }]}
    The ‘actor’ clearing the shopping cart didn’t know there was anything in the cart. Therefore there should be something in the cart after merging the two updates.
  34. How They Work
    [Diagram labels, in extraction order: Consistent, Available, Available, Partitioned, Not Consistent, Available, (Eventually) Consistent, Available, partitioned, unpartitioned, CRDTs, Clear, Add Item]
    [{a, }] [{a, },{b, }] {[{a, }], [{a, },{b, }]}
    The ‘actor’ clearing the shopping cart knew there was something in the cart. Therefore there should be nothing in the cart after merging the two updates.
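The two outcomes on slides 33 and 34 can be reproduced with a small observed-remove cart. This is a sketch under assumptions: every add is tagged with a unique "dot" (actor, counter), and a clear removes only the dots the clearing replica has actually seen. The representation and names are hypothetical, and Riak's own set implementation may differ.

```python
from typing import FrozenSet, Set, Tuple

Dot = Tuple[str, int]    # (actor, per-actor counter): unique tag for one add
Entry = Tuple[str, Dot]  # (item, dot)
State = Tuple[FrozenSet[Entry], FrozenSet[Dot]]  # (all adds, removed dots)

EMPTY: State = (frozenset(), frozenset())


def add(state: State, item: str, dot: Dot) -> State:
    """Add an item, tagged with a dot supplied by the acting replica."""
    adds, removed = state
    return (adds | {(item, dot)}, removed)


def clear(state: State) -> State:
    """Empty the cart: remove only the adds this replica has observed."""
    adds, removed = state
    return (adds, removed | {dot for _, dot in adds})


def merge(a: State, b: State) -> State:
    """Union both components; order and repetition of merges do not matter."""
    return (a[0] | b[0], a[1] | b[1])


def items(state: State) -> Set[str]:
    """Items whose add has not been removed."""
    adds, removed = state
    return {item for item, dot in adds if dot not in removed}


# Slide 33: the clearing actor never saw the add, so the item survives.
y1 = add(EMPTY, "book", ("y", 1))
x1 = clear(EMPTY)
after_blind_clear = items(merge(x1, y1))

# Slide 34: the clearing actor had seen the add, so the cart ends up empty.
x2 = clear(merge(EMPTY, y1))
after_informed_clear = items(merge(x2, y1))
```

`after_blind_clear` contains the book and `after_informed_clear` is empty, matching the two slides. This sketch keeps removed dots forever as tombstones, which is the simplest design but not the most space-efficient one.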
  35. Scary Maths Stuff
    • Join Semi-Lattices that have a Bottom and a Least Upper Bound
    • that have defined properties involving Associativity, Commutativity, Idempotency
    • with a Merge Function
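The properties named on this slide can be checked mechanically for a small example. The sketch below uses the simplest join semilattice, a grow-only set whose merge is set union and whose bottom is the empty set; the helper `laws_hold` and the sample values are hypothetical and only test the laws on a handful of states, they do not prove them.

```python
from itertools import product
from typing import Callable, FrozenSet, List

GSet = FrozenSet[str]

BOTTOM: GSet = frozenset()  # the "Bottom" element: the empty set


def merge(a: GSet, b: GSet) -> GSet:
    """Least upper bound of two grow-only sets: their union."""
    return a | b


def laws_hold(join: Callable[[GSet, GSet], GSet],
              samples: List[GSet], bottom: GSet) -> bool:
    """Check commutativity, associativity, idempotency and the bottom law."""
    for x, y, z in product(samples, repeat=3):
        if join(x, y) != join(y, x):                    # commutative
            return False
        if join(join(x, y), z) != join(x, join(y, z)):  # associative
            return False
    return all(join(x, x) == x and join(x, bottom) == x for x in samples)


samples = [BOTTOM, frozenset({"a"}), frozenset({"b"}), frozenset({"a", "c"})]
ok = laws_hold(merge, samples, BOTTOM)
```

Because merge is commutative, associative and idempotent, replicas can exchange state in any order and any number of times and still converge on the same value.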
  36. References
    Your essential CRDT reading list: http://christophermeiklejohn.com/crdt/2014/07/22/readings-in-crdts.html

  37. CRDTs in Riak 2.0

  38. Native Eventually Consistent Data Types
    • Flags
    • Registers
    • Counters
    • Sets
    • Maps
    • can contain Flags, Registers, Counters, Sets and Maps
    • the basis of composable data types
  39. Flags
    • Can have one of two values:
    • enabled
    • disabled
    • Operations:
    • enable
    • disable
    • Examples - use like Booleans
    • has a tweet been sent?
    • has the customer signed up to a pricing plan?
  40. Registers
    • Named Binaries that have contents which have a value
    • Operations:
    • store new value
    • Examples
    • store “My New Post” in the field “Blog Post Title”
  41. Counters
    • Contain a number which can be incremented or decremented
    • Operations
    • increment
    • decrement
    • Examples
    • the number of “likes” in Facebook
    • the number of Twitter followers
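One common way to build a counter that supports both operations is a state-based PN-Counter: each actor tracks its own increments and decrements, and merge takes the per-actor maximum. The sketch below is illustrative only; the function names and the actor names `n1` and `n2` are hypothetical and this is not Riak's client API.

```python
from typing import Dict, Tuple

# State: (increments per actor, decrements per actor)
PNCounter = Tuple[Dict[str, int], Dict[str, int]]


def new() -> PNCounter:
    return ({}, {})


def increment(c: PNCounter, actor: str, by: int = 1) -> PNCounter:
    p, n = c
    p = dict(p)
    p[actor] = p.get(actor, 0) + by
    return (p, n)


def decrement(c: PNCounter, actor: str, by: int = 1) -> PNCounter:
    p, n = c
    n = dict(n)
    n[actor] = n.get(actor, 0) + by
    return (p, n)


def value(c: PNCounter) -> int:
    """Current count: total increments minus total decrements."""
    p, n = c
    return sum(p.values()) - sum(n.values())


def _join(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, int]:
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def merge(a: PNCounter, b: PNCounter) -> PNCounter:
    """Per-actor maximum on both halves, so merging never double-counts."""
    return (_join(a[0], b[0]), _join(a[1], b[1]))


base = increment(new(), "n1")                        # one like
left = increment(base, "n1")                         # n1 adds another
right = decrement(increment(base, "n2", 3), "n2")    # n2: +3, then -1
merged = merge(left, right)
```

The four operations are +1, +1, +3 and -1, and `value(merged)` is 4 regardless of merge order. Merging the same state in again leaves the result unchanged.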
  42. Sets
    • Collections of unique binaries
    • Operations
    • add an element
    • remove an element
    • add a list of elements
    • remove a list of elements
    • Examples
    • shopping cart
  43. Maps
    • Maps are like hash maps
    • Operations
    • add a named Flag, Register, Counter, Set or Map
    • remove a named Flag, Register, Counter, Set or Map
    • pass through an op for an element in the Map
    • Example
    • the Map is used to compose the data model
  44. Questions? Fin