CAP Theorem: You don’t need CP, you don’t want AP, and you can’t have CA

Video: https://www.youtube.com/watch?v=hUd_9FENShA

The CAP Theorem is everywhere: “Consistency, Availability, Partition tolerance: choose any two!” But it is oversimplified and misunderstood more often than not. CAP’s consistency isn’t what most people think it is, and neither is CAP’s availability. And what does partition-tolerance even mean?

In this talk we’ll explore the CAP Theorem and understand what it really asserts. We’ll see that labelling a system as CP or AP (or even CA) is pretty pointless, and learn to judge systems beyond these simple monikers. We’ll also analyse some popular databases (MySQL, ZooKeeper, Cassandra, Kafka etc.) with this framework.

Siddhartha Reddy

July 16, 2015

Transcript

  1. 16th July 2015
    Siddhartha Reddy
    [Diagram labels: Partition-tolerance, Consistency, Availability]
    You don’t need CP
    You don’t want AP
    You can’t have CA

  2. Show of hands
    https://www.flickr.com/photos/dotbenjamin/2765083201/

  3. Show of hands
    ✤ Heard of CAP Theorem?
    https://www.flickr.com/photos/dotbenjamin/2765083201/

  4. Show of hands
    ✤ Heard of CAP Theorem?
    ✤ Used CAP Theorem?
    https://www.flickr.com/photos/dotbenjamin/2765083201/

  5. Show of hands
    ✤ Heard of CAP Theorem?
    ✤ Used CAP Theorem?
    ✤ Read the paper/proof?
    https://www.flickr.com/photos/dotbenjamin/2765083201/

  6. History
    ✤ Proposed by Dr. Eric Brewer
      [Symposium on Principles of Distributed Computing, 2000]

  7. History
    ✤ Proposed by Dr. Eric Brewer
      [Symposium on Principles of Distributed Computing, 2000]
    ✤ Proved by Gilbert & Lynch
      [Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services, 2002]
    ✤ “a shared-data system can have at most two of the three following properties: Consistency, Availability, and tolerance to network Partitions”

  8. Partition!
    Choose Availability
    Choose Consistency

  9. CAP Theorem
    ✤ Consistency, Availability, Partition-tolerance: “pick any two”

  10. CAP Theorem
    ✤ Consistency, Availability, Partition-tolerance: “pick any two”
    ✤ CP: Consistent & Partition-tolerant (sacrifice Availability)

  11. CAP Theorem
    ✤ Consistency, Availability, Partition-tolerance: “pick any two”
    ✤ CP: Consistent & Partition-tolerant (sacrifice Availability)
    ✤ AP: Available & Partition-tolerant (sacrifice Consistency)

  12. CAP Theorem
    ✤ Consistency, Availability, Partition-tolerance: “pick any two”
    ✤ CP: Consistent & Partition-tolerant (sacrifice Availability)
    ✤ AP: Available & Partition-tolerant (sacrifice Consistency)
    ✤ CA: Consistent & Available (sacrifice Partition-tolerance)
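
As a rough illustration of the choice a partition forces, the sketch below models two replicas of a single value. Everything in it (the `TwoNodeStore` class, the `mode` flag, the `demo` helper) is hypothetical and not from the talk; real systems make this choice per operation and in far subtler ways.

```python
class TwoNodeStore:
    """Toy store with two replicas of one value (hypothetical, for illustration).

    mode="CP": refuse writes during a partition, so replicas never diverge.
    mode="AP": accept writes during a partition, so replicas may diverge.
    """

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = {"a": None, "b": None}
        self.partitioned = False

    def write(self, node, value):
        other = "b" if node == "a" else "a"
        if self.partitioned:
            if self.mode == "CP":
                return "unavailable"   # give up availability
            self.values[node] = value  # the replication message is lost
            return "ok"                # give up consistency
        self.values[node] = value
        self.values[other] = value     # the replication message is delivered
        return "ok"

    def read(self, node):
        return self.values[node]


def demo(mode):
    """Write once while healthy, then once more during a partition."""
    store = TwoNodeStore(mode)
    store.write("a", "v1")
    store.partitioned = True
    result = store.write("a", "v2")
    return result, store.read("a"), store.read("b")
```

Under these assumptions `demo("CP")` returns `("unavailable", "v1", "v1")`, while `demo("AP")` returns `("ok", "v2", "v1")`: the write succeeds, but the two replicas now disagree.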

  13. Consistent & Available
    — An existential crisis
    Or why you can’t have CA

  14. CAP-Partitions
    “When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost”
    https://aphyr.com/posts/285-call-me-maybe-riak

  15. CA: Sacrificing Partition-tolerance

  16. CA: Sacrificing Partition-tolerance
    Partitions can’t or won’t happen

  17. CA: Sacrificing Partition-tolerance
    Partitions can’t or won’t happen
    A single-node database → nothing to partition. Yay!

  18. CA: Sacrificing Partition-tolerance
    Partitions can’t or won’t happen
    A single-node database → nothing to partition. Yay!
    “The network is reliable”
    ✤ LAN, high quality equipment etc.

  19. Reality-check: Networks Fail
    Image Credits: Switch, PDU, Wiring, Congested Network

  20. Partitions are a reality
    Network equipment failure
    Power/cooling failure
    Network congestion

  21. Partitions are a reality
    Network equipment failure
    Power/cooling failure
    Network congestion
    Garbage Collection

  22. Partitions are a reality
    Network equipment failure
    Power/cooling failure
    Network congestion
    Garbage Collection
    Misconfigurations

  23. Partitions are a reality
    Network equipment failure
    Power/cooling failure
    Network congestion
    Garbage Collection
    Misconfigurations
    Software bugs

  24. You can’t wish away Partitions!
    You can’t have CA

  25. Consistency
    — A costly fare
    Or why you don’t need CP

  26. CAP-Consistency
    ✤ Linearizable consistency
    ✤ “requests of the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time”

  27. CAP-Consistency
    ✤ Linearizable consistency
    ✤ “requests of the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time”
    Once an operation is complete, it will be visible to all.
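
The “once an operation is complete, it will be visible to all” rule can be checked mechanically in the simplest case. The sketch below is an illustration under strong assumptions, not a general linearizability checker: it handles a single register and a history whose operations do not overlap in time. The `first_stale_read` name and the history format are made up for this example.

```python
def first_stale_read(history):
    """Return the index of the first read that misses a completed write.

    history: list of ("write", value) or ("read", value) tuples, listed in
    the real-time order in which the operations completed. Operations are
    assumed not to overlap, which sidesteps the hard part of linearizability
    checking (concurrent operations).
    Returns None when every read saw the latest completed write.
    """
    latest = None
    for index, (op, value) in enumerate(history):
        if op == "write":
            latest = value
        elif op == "read":
            if value != latest:
                return index
        else:
            raise ValueError(f"unknown operation: {op}")
    return None
```

For example, a history of `write 1`, `write 2`, `read 1` is flagged at index 2: the read returned a value that a completed write had already replaced, which is what a lagging replica would produce.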

  28. Linearizability is hard

  29. Linearizability is hard
    https://aphyr.com/posts/313-strong-consistency-models

  30. Linearizability is hard
    ✤ Two-phase commits
    https://aphyr.com/posts/313-strong-consistency-models

  31. Linearizability is hard
    ✤ Two-phase commits
    ✤ Sync replication
    https://aphyr.com/posts/313-strong-consistency-models

  32. Linearizability is hard
    ✤ Two-phase commits
    ✤ Sync replication
    ✤ Memory access in CPU: not linearizable (memory barriers can be used to make it linearizable)
    https://aphyr.com/posts/313-strong-consistency-models

  33. Linearizability is hard
    ✤ Two-phase commits
    ✤ Sync replication
    ✤ Memory access in CPU: not linearizable (memory barriers can be used to make it linearizable)
    ✤ Variables: not linearizable (volatile keyword in Java makes them linearizable)
    https://aphyr.com/posts/313-strong-consistency-models

  34. Linearizability is hard
    ✤ Two-phase commits
    ✤ Sync replication
    ✤ Memory access in CPU: not linearizable (memory barriers can be used to make it linearizable)
    ✤ Variables: not linearizable (volatile keyword in Java makes them linearizable)
    ✤ Relational DBs
    https://aphyr.com/posts/313-strong-consistency-models

  35. Linearizability: an extreme form of consistency & extremely costly to provide; it is possible to build useful systems without it
    You don’t need CP

  36. Availability:
    The blue pill
    Or why you don’t want AP
    https://www.flickr.com/photos/mlemos/3911634112

  37. CAP-Availability
    ✤ “every request received by a non-failing node in the system must result in a response”

  38. High Availability (HA)

  39. High Availability (HA)
    ✤ Uptime: 99%, 99.9%, 99.99%, 99.999%

  40. High Availability (HA)
    ✤ Uptime: 99%, 99.9%, 99.99%, 99.999%
    ✤ CAP-availability has little to do with High Availability

  41. High Availability (HA)
    ✤ Uptime: 99%, 99.9%, 99.99%, 99.999%
    ✤ CAP-availability has little to do with High Availability

    CAP Availability          | High Availability
    non-failing nodes         | the whole system
    unbounded response time   | realistic response times

  42. High Availability (HA)
    ✤ Uptime: 99%, 99.9%, 99.99%, 99.999%
    ✤ CAP-availability has little to do with High Availability
    ✤ All nodes returning empty response ⇒ CAP-available!

    CAP Availability          | High Availability
    non-failing nodes         | the whole system
    unbounded response time   | realistic response times

  43. High Availability (HA)
    ✤ Uptime: 99%, 99.9%, 99.99%, 99.999%
    ✤ CAP-availability has little to do with High Availability
    ✤ All nodes returning empty response ⇒ CAP-available!
    ✤ No response from any node ⇒ CAP-available!

    CAP Availability          | High Availability
    non-failing nodes         | the whole system
    unbounded response time   | realistic response times
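
To make the uptime percentages concrete, the arithmetic below converts a number of “nines” into a yearly downtime budget. It is a small sketch, not part of the talk; it assumes a 365-day year, and the function name is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(nines):
    """Yearly downtime allowed by an uptime of `nines` nines.

    nines=2 means 99%, nines=3 means 99.9%, and so on: the allowed
    unavailability is 10 ** -nines of the year.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return MINUTES_PER_YEAR / 10 ** nines


# Downtime budgets for the four uptime levels listed on the slide.
budgets = {nines: downtime_minutes_per_year(nines) for nines in (2, 3, 4, 5)}
```

The four levels on the slide work out to roughly 3.65 days (99%), 8.76 hours (99.9%), 52.6 minutes (99.99%) and 5.3 minutes (99.999%) of downtime per year. Note that this is a statement about the whole system within realistic response times, which is exactly what CAP-availability does not measure.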

  44. Your goal is High Availability; CAP-availability has little overlap with that
    You don’t want AP

  45. Consistency & Availability
    Relationship status: It’s complicated

  46. Consistency vs. Availability
    ✤ Trade-offs between consistency & availability are real

  47. Consistency vs. Availability
    ✤ Trade-offs between consistency & availability are real
    ✤ AP to CP is a spectrum, and “the whole space is useful” – Eric Brewer
    [Diagram: spectrum from AP (Availability) to CP (Consistency)]

  48. Consistency vs. Availability
    ✤ Trade-offs between consistency & availability are real
    ✤ AP to CP is a spectrum, and “the whole space is useful” – Eric Brewer
    ✤ Most systems are neither CP nor AP, and don’t need to be either
    ✤ Eventual consistency & high availability are usually good enough
    [Diagram: spectrum from AP (Availability) to CP (Consistency)]

  49. Consistency, Availability & more
    ✤ Trade-offs between consistency & availability are real; there are also trade-offs with latency, operational simplicity etc.
    [Diagram: spectrum from AP (Availability) to CP (Consistency), with Latency and Simplicity as additional dimensions]

  51. Consistency, Availability & more
    ✤ Trade-offs between consistency & availability are real; there are also trade-offs with latency, operational simplicity etc.
    ✤ CAP-Theorem is an incomplete tool for analysing these trade-offs
    [Diagram: spectrum from AP (Availability) to CP (Consistency), with Latency and Simplicity as additional dimensions]

  52. PACELC: A CAP-theorem alternative

  53. PACELC: A CAP-theorem alternative
    [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]

  54. PACELC (pass-elk)

  55. PACELC (pass-elk)
    If there is a Partition,
    how does the system trade off Availability & Consistency?

  56. PACELC (pass-elk)
    If there is a Partition,
    how does the system trade off Availability & Consistency?
    Else,
    how does the system trade off Latency & Consistency?
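
The two PACELC questions combine into a four-way classification. As a minimal sketch (the `pacelc_label` helper and its single-letter arguments are made up for this example), the labels can be generated like this:

```python
from itertools import product


def pacelc_label(on_partition, otherwise):
    """Build a PACELC label such as 'PA/EL'.

    on_partition: 'A' (favour availability) or 'C' (favour consistency),
                  the choice made if there is a Partition.
    otherwise:    'L' (favour latency) or 'C' (favour consistency),
                  the choice made Else, when the network is healthy.
    """
    if on_partition not in ("A", "C"):
        raise ValueError("on_partition must be 'A' or 'C'")
    if otherwise not in ("L", "C"):
        raise ValueError("otherwise must be 'L' or 'C'")
    return f"P{on_partition}/E{otherwise}"


# Every combination of the two choices.
ALL_LABELS = [pacelc_label(p, e) for p, e in product("AC", "LC")]
```

`ALL_LABELS` evaluates to `["PA/EL", "PA/EC", "PC/EL", "PC/EC"]`. In practice a system's position is often a matter of configuration rather than a fixed property, so a single label is at best a summary of one deployment.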

  57. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]

  58. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    PC/EC

  59. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    PA/EL
    PC/EC

  60. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    PA/EL
    PA/EC
    PC/EC

  61. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    PC/EL
    PA/EL
    PA/EC
    PC/EC

  62. Applying the theory
    Examining some real-world databases

  63. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]

  64. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    Async Replication

  65. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    Semi-sync Replication
    Async Replication
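
The difference between the two replication modes can be sketched as a latency-versus-safety calculation. This is a simplified model, not taken from the talk: it assumes that semi-sync means the primary waits for at least one replica to acknowledge receipt before answering the client, and that the wait simply adds to the local commit time. The function names and millisecond figures are hypothetical.

```python
def ack_latency_ms(mode, local_commit_ms, replica_ack_ms):
    """Time until the client sees its write acknowledged, in milliseconds.

    replica_ack_ms: per-replica time to acknowledge receipt of the write.
    """
    if mode == "async":
        # The primary answers as soon as it has committed locally.
        return local_commit_ms
    if mode == "semi-sync":
        # The primary also waits for the fastest replica to acknowledge.
        return local_commit_ms + min(replica_ack_ms)
    raise ValueError("mode must be 'async' or 'semi-sync'")


def replicas_with_write_at_ack(mode):
    """Lower bound on replicas holding the write when the client is answered."""
    if mode not in ("async", "semi-sync"):
        raise ValueError("mode must be 'async' or 'semi-sync'")
    return 0 if mode == "async" else 1
```

With a 2 ms local commit and replicas that acknowledge in 5 ms and 40 ms, async answers in 2 ms and semi-sync in 7 ms. The extra latency buys the guarantee that an acknowledged write exists on at least one replica if the primary is lost.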

  66. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]

  67. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    read

  68. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    read
    sync+read

  69. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]

  70. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    Write-consistency: ANY, Hinted-handoffs: on, Read-consistency: ONE

  71. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    Write-consistency: ANY, Hinted-handoffs: on, Read-consistency: ONE
    Write-consistency: ALL, Read-consistency: ALL
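
One way to see why these two configurations sit at opposite ends is the replica-overlap rule: a read is guaranteed to contact a replica holding the latest acknowledged write only when the read and write replica counts add up to more than the replication factor. The sketch below assumes the usual meanings of the consistency levels; the helper names are made up, `QUORUM` is included only for comparison and does not appear on the slides, and treating `ANY` as zero replicas is a simplification of hinted handoff.

```python
def replicas_required(level, replication_factor):
    """Replica acknowledgements needed before an operation is reported successful."""
    if level == "ANY":
        # Assumption: with hinted handoff on, a stored hint alone can satisfy
        # the write, so no replica is guaranteed to hold it yet.
        return 0
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def read_overlaps_write(write_level, read_level, replication_factor):
    """True when every read set must intersect every acknowledged write set."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, `read_overlaps_write("ANY", "ONE", 3)` is `False` and `read_overlaps_write("ALL", "ALL", 3)` is `True`. Overlap alone is not linearizability, but without it stale reads are always possible.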

  72. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]

  73. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    required.acks=0

  74. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    required.acks=0
    required.acks=-1, min.isr=<#replicas>

  75. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    required.acks=0
    required.acks=-1, min.isr=1
    required.acks=-1, min.isr=<#replicas>
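
These three settings can be read as a rule for when a producer's write is acknowledged. The sketch assumes that `required.acks` and `min.isr` are the slides' shorthand for Kafka's producer acknowledgement setting and its `min.insync.replicas` setting; `produce_outcome` is a hypothetical helper that models the rule, not a Kafka API.

```python
def produce_outcome(acks, min_isr, in_sync_replicas):
    """Model how a write is treated for a given acknowledgement setting.

    acks:             0 (don't wait), 1 (leader only) or -1 (all in-sync replicas)
    min_isr:          minimum in-sync replicas required when acks is -1
    in_sync_replicas: how many replicas are currently in sync, leader included
    """
    if acks == 0:
        return "sent, not acknowledged"          # lowest latency, may be lost
    if acks == 1:
        return "acknowledged by leader only"
    if acks == -1:
        if in_sync_replicas < min_isr:
            return "rejected"                    # consistency over availability
        return "acknowledged by all in-sync replicas"
    raise ValueError("acks must be 0, 1 or -1")
```

With three replicas of which only two are in sync, `produce_outcome(-1, 3, 2)` is `"rejected"` while `produce_outcome(-1, 1, 2)` is still acknowledged. Raising the minimum trades write availability for a stronger guarantee about where an acknowledged write lives.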

  76. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]

  77. [Diagram: AP (Availability) to CP (Consistency); Latency to Consistency]
    PNUTS

  78. Concluding
    ✤ Availability vs. Consistency vs. Latency
    ✤ Trade-offs are real

  79. Concluding
    ✤ Availability vs. Consistency vs. Latency
    ✤ Trade-offs are real
    ✤ A business (not engineering) decision

  80. Concluding
    ✤ Availability vs. Consistency vs. Latency
    ✤ Trade-offs are real
    ✤ A business (not engineering) decision
    ✤ “Would you rather be down or show wrong prices?”
    ✤ “Would you rather be slow or show wrong prices?”

  81. Concluding
    ✤ Availability vs. Consistency vs. Latency
    ✤ Trade-offs are real
    ✤ A business (not engineering) decision
    ✤ “Would you rather be down or show wrong prices?”
    ✤ “Would you rather be slow or show wrong prices?”
    ✤ Financial transactions?

  82. Concluding
    ✤ Availability vs. Consistency vs. Latency
    ✤ Trade-offs are real
    ✤ A business (not engineering) decision
    ✤ “Would you rather be down or show wrong prices?”
    ✤ “Would you rather be slow or show wrong prices?”
    ✤ Financial transactions?
    ✤ Finance industry runs on availability!

  83. Thank You.
    Siddhartha Reddy
    @sids
    sid@flipkart.com
    References: https://pinboard.in/u:sids/t:fifthel2015/
