CAP Theorem: You don’t need CP, you don’t want AP, and you can’t have CA

Video: https://www.youtube.com/watch?v=hUd_9FENShA

CAP Theorem is everywhere: “Consistency, Availability, Partition tolerance — choose any two!” But it is oversimplified and misunderstood more often than not. CAP’s consistency isn’t what most people think it is; CAP’s availability isn’t what most people think it is; what does partition-tolerance even mean?

In this talk we’ll explore the CAP Theorem and understand what it is really asserting. We’ll see that simply labelling a system CP or AP (or even CA) is pretty pointless, and learn to judge systems beyond these monikers. We’ll also analyse some popular databases (MySQL, ZooKeeper, Cassandra, Kafka etc.) with this framework.

Siddhartha Reddy

July 16, 2015

Transcript

  2. Show of hands
    ✤ Heard of CAP Theorem?
    ✤ Used CAP Theorem?
    ✤ Read the paper/proof?
    https://www.flickr.com/photos/dotbenjamin/2765083201/
  3. History
    ✤ Proposed by Dr. Eric Brewer [Symposium on Principles of Distributed Computing, 2000]
    ✤ Proved by Gilbert & Lynch [Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services, 2002]
    ✤ “a shared-data system can have at most two of the three following properties: Consistency, Availability, and tolerance to network Partitions”
  6. CAP Theorem
    ✤ Consistency, Availability, Partition-tolerance: “pick any two”
    ✤ CP: Consistent & Partition-tolerant (sacrifice Availability)
    ✤ AP: Available & Partition-tolerant (sacrifice Consistency)
    ✤ CA: Consistent & Available (sacrifice Partition-tolerance)
  7. CAP-Partitions
    “When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost”
    https://aphyr.com/posts/285-call-me-maybe-riak
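
The partition definition above, together with the CP and AP labels, can be made concrete with a toy model. This is only a sketch under stated assumptions: two replicas of a single register, a partition that drops every message between them, and hypothetical names (`Replica`, `Unavailable`, `partition`, `run`) that do not come from the talk.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a CP-style replica that cannot reach its peer."""


@dataclass
class Replica:
    mode: str                    # "CP" or "AP" (labels used only in this sketch)
    value: int = 0               # local copy of the single shared register
    peer_reachable: bool = True  # False while the network is partitioned

    def write(self, new_value: int, peer: "Replica") -> None:
        if self.peer_reachable:
            # Healthy network: update both copies before acknowledging.
            self.value = new_value
            peer.value = new_value
        elif self.mode == "AP":
            # Partitioned: accept locally, so the two copies now diverge.
            self.value = new_value
        else:
            # Partitioned: refuse rather than risk divergence.
            raise Unavailable("peer unreachable; write rejected")


def partition(a: Replica, b: Replica) -> None:
    """Model a partition: all messages between a and b are lost."""
    a.peer_reachable = False
    b.peer_reachable = False


def run(mode: str):
    """Write once, partition, write again; report what happened."""
    a, b = Replica(mode), Replica(mode)
    a.write(1, b)        # replicated to both sides
    partition(a, b)
    try:
        a.write(2, b)
        accepted = True
    except Unavailable:
        accepted = False
    return accepted, a.value, b.value
```

In this model `run("CP")` rejects the second write and both copies stay at 1, while `run("AP")` accepts it and leaves the copies at 2 and 1. Neither outcome is free, which is the trade-off the theorem describes.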
  8. CA: Sacrificing Partition-tolerance
    Partitions can’t or won’t happen
    A single-node database → nothing to partition. Yay!
    “The network is reliable”
    ✤ LAN, high quality equipment etc.
  9. Partitions are a reality
    Network equipment failure
    Power/cooling failure
    Network congestion
    Garbage Collection
    Misconfigurations
    Software bugs
  11. CAP-Consistency
    ✤ Linearizable consistency
    ✤ “requests of the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time”
    Once an operation is complete, it will be visible to all.
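
To see what "once an operation is complete, it will be visible to all" rules out, here is a minimal sketch of a register with asynchronous replication. The class and function names are hypothetical and the model is deliberately tiny: one primary, one follower, and a replication step that runs some time after the write is acknowledged.

```python
class AsyncReplicatedRegister:
    """Primary acknowledges writes before the follower has applied them."""

    def __init__(self):
        self.primary = 0
        self.follower = 0
        self.pending = []            # writes not yet shipped to the follower

    def write(self, value):
        self.primary = value
        self.pending.append(value)   # replication happens later
        return "ack"                 # the write is now complete for the client

    def replicate(self):
        """Apply all pending writes to the follower, in order."""
        for value in self.pending:
            self.follower = value
        self.pending.clear()

    def read(self, from_follower=False):
        return self.follower if from_follower else self.primary


def stale_read_possible():
    """A read that starts after a completed write can still miss it."""
    reg = AsyncReplicatedRegister()
    reg.write(42)                          # completed write
    seen = reg.read(from_follower=True)    # later read, served by the other node
    return seen != 42
```

Here `stale_read_possible()` returns `True`: the read began after the write completed yet did not observe it, so this register is not linearizable. Calling `replicate()` inside `write` before returning the acknowledgement (synchronous replication) would close the gap, at the cost of waiting for the follower.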
  14. Linearizability is hard
    ✤ Two-phase commits
    ✤ Sync replication
    ✤ Memory access in CPU: not linearizable (memory barriers can be used to make it linearizable)
    ✤ Variables: not linearizable (volatile keyword in Java makes them linearizable)
    ✤ Relational DBs
    https://aphyr.com/posts/313-strong-consistency-models
  15. Linearizability: extreme form of consistency & extremely costly to provide; possible to build useful systems without it
    You don’t need CP
  16. Availability: The blue pill
    Or why you don’t want AP
    https://www.flickr.com/photos/mlemos/3911634112
  20. High Availability (HA)
    ✤ Uptime: 99%, 99.9%, 99.99%, 99.999%
    ✤ CAP-availability has little to do with High Availability
    ✤ All nodes returning empty response ⇒ CAP-available!
    ✤ No response from any node ⇒ CAP-available!

    CAP Availability          | High Availability
    non-failing nodes         | the whole system
    unbounded response time   | realistic response times
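
The uptime figures on this slide are easier to compare as downtime budgets. The arithmetic below is a small sketch (the function name is made up for illustration, and a 365-day year is assumed). It describes High Availability only; CAP-availability places no bound on response time, so it implies no such budget.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours per year a system may be down at the given uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


for pct in (99, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

For example, 99.9% uptime allows about 8.76 hours of downtime a year, and 99.999% allows a little over five minutes.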
  22. Consistency vs. Availability
    ✤ Trade-offs between consistency & availability are real
    ✤ AP to CP is a spectrum and “the whole space is useful” – Eric Brewer
    ✤ Most systems are neither CP nor AP, don’t need CP or AP
    ✤ eventual consistency & high availability is usually good enough
    [Diagram: a spectrum from AP (Availability) to CP (Consistency)]
  25. Consistency, Availability & more
    ✤ Trade-offs between consistency & availability are real; there are also trade-offs with latency, operational simplicity etc.
    ✤ CAP-Theorem is an incomplete tool for analysing these trade-offs
    [Diagram: the AP (Availability) to CP (Consistency) spectrum, with Latency and Simplicity as further dimensions]
  27. PACELC (pass-elk)
    If there is a Partition, how does the system trade off Availability & Consistency?
    Else, how does the system trade off Latency & Consistency?
  28. [Diagram: the spectrum from AP to CP, labelled Availability vs. Consistency and Latency vs. Consistency]
    AP end: Write-consistency: ANY; Hinted-handoffs: on; Read-consistency: ONE
    CP end: Write-consistency: ALL; Read-consistency: ALL
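
The two ends of this spectrum can be compared with the usual replica-overlap rule for Dynamo-style stores: a read is guaranteed to see the latest acknowledged write only when the read and write replica sets must intersect, that is, R + W > N. The sketch below is an illustration, not the slide's own method. The slide does not name the system; the setting names match Cassandra's tunable consistency, so that is assumed here. The function names, the replication factor of 3, and the QUORUM row are additions for illustration.

```python
def replicas_required(level: str, n: int) -> int:
    """Replicas that must respond for a consistency level (simplified model)."""
    table = {
        "ANY": 0,               # a stored hint suffices; no replica need acknowledge
        "ONE": 1,
        "QUORUM": n // 2 + 1,   # not on the slide; included for comparison
        "ALL": n,
    }
    return table[level]


def read_overlaps_write(write_level: str, read_level: str, n: int = 3) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    r = replicas_required(read_level, n)
    w = replicas_required(write_level, n)
    return r + w > n


def tolerated_failures(level: str, n: int = 3) -> int:
    """Replicas that can be unreachable while the operation still succeeds."""
    return n - replicas_required(level, n)
```

With three replicas, `read_overlaps_write("ANY", "ONE")` is `False` and both operations survive the loss of at least two replicas, while `read_overlaps_write("ALL", "ALL")` is `True` but a single unreachable replica fails every request (`tolerated_failures("ALL")` is 0). Waiting for more replicas also tends to raise latency, which is the "Else" half of PACELC.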
  31. Concluding
    ✤ Availability vs. Consistency vs. Latency
    ✤ Trade-offs are real
    ✤ A business (not engineering) decision
    ✤ “Would you rather be down or show wrong prices?”
    ✤ “Would you rather be slow or show wrong prices?”
    ✤ Financial transactions?
    ✤ Finance industry runs on availability!