Understanding Eventual Consistency and Riak

The success of applications and businesses on the Internet is often measured by converting visitors into customers, retaining users' attention, and providing rich experiences, even on mobile devices.
All of these things require responsiveness -- low latency and constant availability.

Riak is a networked key-value database that enables applications to be highly available and keep latencies low, but it requires the developer to grasp eventual consistency. I'll discuss why the tradeoff in favor of eventual consistency is necessary (including a review of the CAP theorem), how Riak implements eventual consistency, and how developers can harness Riak to tackle hard availability problems.

Sean Cribbs

April 23, 2013

Transcript

  3. FLP Impossibility

    Perfect asynchronous consensus is impossible with even one fault. Failure and indefinite delay are indistinguishable.
    http://the-paper-trail.org/blog/a-brief-tour-of-flp-impossibility/
    Fischer, Lynch, Paterson - 1985
  4. CAP

    Under partition (failure), a system cannot maintain both availability and consistency. Practical systems favor one or the other.
    Brewer, et al. - 1997, 2001
  7. Harvest & Yield

    Harvest: how much of the dataset is reflected in a response (C)
    Yield: how likely the datastore is to complete a request (A)
    http://codahale.com/you-cant-sacrifice-partition-tolerance/
    Fox, Brewer - 1999
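As a rough illustration of the two metrics, the sketch below computes them over a small batch of requests. The `Response` record and its fields are assumptions made up for this example; they are not part of Riak or of the original slides.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Outcome of one request (hypothetical record, for illustration only)."""
    completed: bool      # did the datastore answer at all?
    shards_reached: int  # partitions whose data is reflected in the answer
    shards_total: int    # partitions that hold relevant data


def yield_fraction(responses):
    """Yield: fraction of requests that completed."""
    return sum(r.completed for r in responses) / len(responses)


def mean_harvest(responses):
    """Harvest: average fraction of the dataset reflected in completed responses."""
    done = [r for r in responses if r.completed]
    return sum(r.shards_reached / r.shards_total for r in done) / len(done)


responses = [
    Response(True, 4, 4),   # complete answer
    Response(True, 3, 4),   # answered, but one shard was unreachable
    Response(False, 0, 4),  # request failed outright
    Response(True, 4, 4),
]
print(yield_fraction(responses))  # 0.75
print(mean_harvest(responses))
```

A system can trade one metric for the other: answering with partial data keeps yield high at the cost of harvest, while refusing to answer does the opposite.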
  8. Safety & Liveness

    Safety: bad things don’t happen (agreement, durability, integrity)
    Liveness: good things eventually happen (progress, responsiveness, resilience)
    Leslie Lamport, 1977
  12. Failure Tolerance

    To tolerate F failures:
    F + 1 replicas (at least one): Eventual
    2F + 1 replicas (majority): Paxos, 2PC
    3F + 1 replicas (super-majority): Byzantine
    “Any sufficiently large system is in a constant state of partial failure.” Justin Sheehy, Basho CTO
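The arithmetic on this slide can be checked with a few lines of Python. The function name and the model labels below are invented for this sketch.

```python
def replicas_needed(f, model):
    """Minimum replicas needed to tolerate f failures under a given model."""
    if f < 0:
        raise ValueError("f must be non-negative")
    # Multiplier on F for each row of the slide (labels are assumptions).
    factors = {"eventual": 1, "majority": 2, "byzantine": 3}
    return factors[model] * f + 1


for f in (1, 2):
    print(f, [replicas_needed(f, m) for m in ("eventual", "majority", "byzantine")])
# prints:
# 1 [2, 3, 4]
# 2 [3, 5, 7]
```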
  16. Replication

    By default, 3 copies are stored (N).
    Consistent hashing and virtual nodes decouple storage from physical machines.
    “Sloppy” quorum increases availability.
    Hinted handoff ensures eventual delivery.
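The following is a simplified sketch of how these pieces can fit together, not Riak's actual implementation. It hashes a key onto a ring of fixed partitions, assigns partitions to physical nodes round-robin, and builds a list of N replicas. When a primary is down, the next available node stands in and carries a hint naming the intended owner. The tiny ring size, the round-robin assignment, and all function names are assumptions for illustration.

```python
import hashlib

RING_SIZE = 2 ** 160   # size of the SHA-1 hash space
NUM_PARTITIONS = 8     # deliberately tiny for the example
N_VAL = 3              # copies stored per key


def partition_of(key):
    """Map a key to one of the equal-sized ring partitions."""
    digest = int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")
    return digest // (RING_SIZE // NUM_PARTITIONS)


def build_ring(nodes):
    """Assign each partition (virtual node) to a physical node, round-robin."""
    return [nodes[i % len(nodes)] for i in range(NUM_PARTITIONS)]


def preference_list(ring, key, down=frozenset()):
    """Return up to N (node, hint) pairs for a key.

    hint is None for a primary, or the name of the unavailable primary
    that a fallback node is standing in for (to be handed off later).
    """
    start = partition_of(key)
    walk = [ring[(start + i) % NUM_PARTITIONS] for i in range(NUM_PARTITIONS)]
    primaries = walk[:N_VAL]
    spares = []
    for node in walk[N_VAL:]:
        if node not in down and node not in primaries and node not in spares:
            spares.append(node)
    result = []
    for owner in primaries:
        if owner not in down:
            result.append((owner, None))
        elif spares:
            result.append((spares.pop(0), owner))  # sloppy quorum fallback
    return result


ring = build_ring(["node_a", "node_b", "node_c", "node_d"])
healthy = preference_list(ring, "cart:42")
degraded = preference_list(ring, "cart:42", down={healthy[0][0]})
print(healthy)
print(degraded)
```

In the degraded case the write still reaches three nodes, which is what keeps the system available; the hint records where the data should eventually be delivered once the primary returns.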
  19. Vector Clocks

    Each read includes a token of the current state (vector clock).
    Clients send the token back with writes.
    Riak updates the token and detects stale versions and conflicts by comparison.
    Conflicts are exposed to your application.
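The comparison step can be sketched with plain dictionaries that map an actor name to a counter. This is a minimal model of vector clock comparison in general, not Riak's encoding; the function names and actor labels are assumptions.

```python
def increment(clock, actor):
    """Return a copy of clock with actor's counter advanced by one."""
    new = dict(clock)
    new[actor] = new.get(actor, 0) + 1
    return new


def descends(a, b):
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify the relationship between two clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a_newer"   # b is stale
    if descends(b, a):
        return "b_newer"   # a is stale
    return "conflict"      # neither has seen the other: concurrent writes


base = increment({}, "x")   # initial write
v1 = increment(base, "y")   # client y read base, then wrote
v2 = increment(base, "z")   # client z also read base, then wrote

print(compare(v1, base))    # a_newer
print(compare(v1, v2))      # conflict
```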
  20. Conflicts happen frequently without strong consistency. For example, have two threads write the same key at the same time. Riak detects this conflict for you.
  21. Your application is also part of the eventually consistent system. It makes the hard decisions that Riak can’t.
  22. Riak will give you all values that can’t be resolved automatically. Just write back the resolved value.
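The read, resolve, write-back cycle might look like the sketch below. `ToyStore` is a hypothetical in-memory stand-in written for this example; it is not a Riak client and it tracks versions with a simple counter rather than real vector clocks.

```python
class ToyStore:
    """In-memory store that keeps conflicting values (hypothetical, illustration only)."""

    def __init__(self):
        self._entries = {}  # key -> list of (version, value)
        self._counter = 0

    def get(self, key):
        """Return (context, values); context names the versions this read saw."""
        entries = self._entries.get(key, [])
        context = frozenset(version for version, _ in entries)
        return context, [value for _, value in entries]

    def put(self, key, value, context=frozenset()):
        """Store value, replacing only the versions named in context."""
        self._counter += 1
        survivors = [(v, val) for v, val in self._entries.get(key, [])
                     if v not in context]
        self._entries[key] = survivors + [(self._counter, value)]


store = ToyStore()
store.put("color", "red")
ctx, _ = store.get("color")        # two clients read the same state...
store.put("color", "green", ctx)   # ...and both write back
store.put("color", "blue", ctx)    # concurrent write: both values are kept

ctx, values = store.get("color")
print(sorted(values))              # ['blue', 'green']

resolved = max(values)             # application-specific rule (arbitrary here)
store.put("color", resolved, ctx)  # write back the resolved value
print(store.get("color")[1])       # ['green']
```

The resolution rule is the part the datastore cannot choose for you; `max` is only a placeholder for whatever makes sense for your data.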
  23. Think of it like merging or rebasing git branches. Sometimes you MUST use git mergetool to fix things.
  25. Example

    If your value is a shopping cart, merge all items and sum quantities into a single cart. Maybe the customer buys more that way!
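A minimal sketch of that merge rule, assuming each conflicting cart is a dictionary from item name to quantity (a representation chosen for this example):

```python
from collections import Counter


def merge_carts(siblings):
    """Merge conflicting carts by taking every item and summing quantities."""
    merged = Counter()
    for cart in siblings:
        merged.update(cart)  # Counter.update adds quantities per item
    return dict(merged)


siblings = [
    {"book": 1, "pen": 2},
    {"book": 1, "mug": 1},
]
print(merge_carts(siblings))  # {'book': 2, 'pen': 2, 'mug': 1}
```

Note that an item present in both versions is counted twice, which is exactly how the customer may end up buying more.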
  27. Conflict-Free Replicated Data-Types

    Values that converge automatically, aka Bounded Join Semi-Lattices
    Registers, counters, sets, graphs, etc.
    Active area of research in the EU
    http://hal.inria.fr/inria-00397981/en/
  29. A CRDT: G-Set

    A set that only grows in elements
    Converges via set-union
    Allows add and member operations
    {A} ⋃ {B} = {A,B}
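A G-Set is small enough to sketch in full. The class below is an illustrative model with invented method names, not an implementation from any particular library.

```python
class GSet:
    """Grow-only set: supports add and member, and converges by union."""

    def __init__(self, elements=()):
        self._elements = frozenset(elements)

    def add(self, element):
        """Return a new G-Set that also contains element."""
        return GSet(self._elements | {element})

    def member(self, element):
        return element in self._elements

    def merge(self, other):
        """Combine two replicas; union is commutative, associative, idempotent."""
        return GSet(self._elements | other._elements)

    def value(self):
        return self._elements


a = GSet().add("A")   # replica 1
b = GSet().add("B")   # replica 2
merged = a.merge(b)
print(sorted(merged.value()))                 # ['A', 'B']
print(merged.value() == b.merge(a).value())   # True: merge order does not matter
```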
  30. Example

    A “like” button could use an Observed-Remove Set that contains all the users who clicked it. Each “like” is a separate add entry. To “unlike”, tombstone all the “likes” the user has added.
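One way to model this is sketched below: every add gets a unique tag, and a remove tombstones only the tagged entries that the removing replica has observed. The class and its method names are assumptions for illustration, and the sketch never garbage-collects tombstones.

```python
import uuid


class ORSet:
    """Observed-Remove Set sketch (illustrative, tombstones are never collected)."""

    def __init__(self):
        self.adds = set()        # (element, unique_tag) pairs
        self.tombstones = set()  # removed (element, unique_tag) pairs

    def add(self, element):
        """Record a new, uniquely tagged add entry."""
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element):
        """Tombstone only the add entries this replica has observed."""
        self.tombstones |= {entry for entry in self.adds if entry[0] == element}

    def members(self):
        return {element for element, _ in self.adds - self.tombstones}

    def merge(self, other):
        """Combine two replicas by taking the union of adds and of tombstones."""
        merged = ORSet()
        merged.adds = self.adds | other.adds
        merged.tombstones = self.tombstones | other.tombstones
        return merged


a = ORSet()
a.add("alice")
a.add("bob")
b = a.merge(ORSet())   # replica b starts from a's state
b.remove("alice")      # alice unlikes on replica b
a.add("alice")         # concurrently, alice likes again on replica a

merged = a.merge(b)
print(sorted(b.members()))       # ['bob']
print(sorted(merged.members()))  # ['alice', 'bob']
```

Because the second “like” carries a tag that replica b never observed, it survives the merge, so a concurrent add wins over the remove.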
  31. Use-Cases

    Session storage: Wikia
    Sensor data: Boundary, Temetra
    Social: Yammer, Voxer
    Gaming: Rovio
    Infrastructure: Comcast, IDC Frontier
  33. Conclusion

    Any sufficiently large system is inconsistent and constantly failing.
    Being consistent or available under failure is a practical tradeoff and spectrum of choices.
    Riak remains available during failures and progresses toward consistency.