Consistency
"[...] a total order on all operations
such that each operation looks as if it
were completed at a single instant."
Slide 4 text
Availability
"[...] every request received by a non-
failing node in the system must result
in a response."
Slide 5 text
Partition Tolerance
"In order to model partition tolerance,
the network will be allowed to lose
arbitrarily many messages sent from
one node to another."
Slide 6 text
Proof by Construction
If a client writes to one side of a
partition, any reads that go to the
other side of that partition cannot
know about this write
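The construction can be replayed in a few lines. This is only a minimal sketch under assumed names: `Node`, `Network`, `write` and `read` are hypothetical, and the model is two replicas of a single register with best-effort replication.

```python
class Node:
    """One replica holding a single register (hypothetical model)."""
    def __init__(self, name):
        self.name = name
        self.value = None


class Network:
    """Link between the two replicas; drops everything while partitioned."""
    def __init__(self):
        self.partitioned = False

    def send(self, target, value):
        # Returns True if the replication message was delivered.
        if self.partitioned:
            return False
        target.value = value
        return True


def write(node, peer, network, value):
    # Apply locally, then replicate on a best-effort basis.
    node.value = value
    network.send(peer, value)


def read(node):
    return node.value


net = Network()
a, b = Node("a"), Node("b")

write(a, b, net, "v1")      # healthy network: both sides see v1
net.partitioned = True
write(a, b, net, "v2")      # the replication message is lost

stale = read(b)             # b cannot know about v2
print(read(a), stale)
```

At this point `b` has two options: answer with the old value (giving up consistency) or refuse to answer (giving up availability).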
Slide 7 text
No content
Slide 8 text
Conclusion
"most real-world systems today are
forced to settle with returning “most
of the data, most of the time.”"
Slide 9 text
No content
Slide 10 text
Fallacy #1:
The Network Is Reliable
https://aphyr.com/posts/288-the-network-is-reliable
Partitions will happen, so distributed systems
can only choose between C and A
Slide 11 text
Fallacy #2:
Guarantees must hold
under all (critical)
conditions
Slide 12 text
Fallacy #3:
Beating the CAP Theorem
Checklist
http://ferd.ca/beating-the-cap-theorem-checklist.html
Slide 13 text
✔ you pushed the actual problem to another layer of the system
✔ you are not, in fact, designing a distributed system
✔ latency is a thing that exists
✔ using "infinite timeouts" is not an acceptable solution to lost messages
✔ read-only mode is still unavailability for writes
✔ you shouldn't be in charge of people's data
Slide 14 text
No content
Slide 15 text
CAP vs ACID
Terms:
Distributed systems vs databases
Slide 16 text
Consistency
Predicate: Integrity constraint
History: Total order / single copy
Increment / decrement counter consistency: ACID !, CAP ❓
Slide 17 text
Isolation
Often relaxed in RDBMS
Serializable, repeatable read, read (un)committed
Consistency model in CAP
Slide 18 text
Atomicity
RDBMS: "all-or-nothing", which does not imply isolation
Consistency model in CAP
Slide 19 text
Durability
Not covered in the CAP paper
Slide 20 text
Repeating for the 4th time
today, fsync is not required
for strong consistency, this
post saying so is wrong
http://antirez.com/news/67
1 https://twitter.com/kellabyte/status/410224523602960385
Slide 21 text
No content
Slide 22 text
"the impossibility of guaranteeing both
safety and liveness in an unreliable
distributed system"
Slide 23 text
"some systems may sacrifice both
consistency and availability! In doing so
they may achieve a trade-off better
suited for the application at hand."
Slide 24 text
No content
Slide 25 text
"CAP prohibits only a tiny part of the
design space: perfect availability and
consistency in the presence of
partitions, which are rare."
Slide 26 text
"retrying communication indefinitely is
in essence choosing C over A."
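A small sketch of what that choice looks like in code. `PartitionedLink` and `replicate` are hypothetical names, and the link is assumed to drop a fixed number of messages before it heals; real code would also sleep or back off between attempts.

```python
import itertools


class PartitionedLink:
    """Hypothetical link that loses the first `drops` messages, then heals."""
    def __init__(self, drops):
        self.drops = drops
        self.attempts = 0

    def send(self, msg):
        self.attempts += 1
        if self.attempts <= self.drops:
            raise ConnectionError("message lost")
        return "ack:" + msg


def replicate(link, msg, max_attempts=None):
    """Send until acknowledged.

    max_attempts=None retries indefinitely: the caller never answers
    before the peer has the data (C over A). With a bound, the function
    gives up and returns None, and the caller must decide what to tell
    the client.
    """
    if max_attempts is None:
        attempts = itertools.count(1)
    else:
        attempts = range(1, max_attempts + 1)
    for _ in attempts:
        try:
            return link.send(msg)
        except ConnectionError:
            continue  # lost message: try again
    return None


healing = PartitionedLink(drops=3)
result_forever = replicate(healing, "x=1")
result_bounded = replicate(PartitionedLink(drops=3), "x=1", max_attempts=2)
print(result_forever, healing.attempts, result_bounded)
```

With unbounded retries the request blocks for as long as the partition lasts, which from the client's point of view is unavailability.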
Slide 27 text
"Most systems cannot always merge
conflicts [after a partition recovery]."
Slide 28 text
CAP in practice
https://aphyr.com/tags/jepsen
MongoDB, Elasticsearch, etcd & Consul, RabbitMQ, Redis,
Cassandra, Kafka, NuoDB, ZooKeeper, Riak, PostgreSQL
Slide 29 text
Pick Two — At Most
https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-reads
MongoDB replication: CP
Slide 30 text
Play It Safe
Write concern majority
Read from primary
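One way to express both settings is through the MongoDB connection string, using the `w` and `readPreference` options. The sketch below only builds the URI with the standard library; the host names, the replica set name `rs0` and the helper `safe_mongo_uri` are assumptions for illustration.

```python
from urllib.parse import urlencode


def safe_mongo_uri(hosts, replica_set):
    """Build a connection string with majority writes and primary reads."""
    options = {
        "replicaSet": replica_set,
        "w": "majority",              # write concern majority
        "readPreference": "primary",  # read from primary
    }
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


# Hypothetical three-node replica set.
uri = safe_mongo_uri(
    ["db1.example.com:27017", "db2.example.com:27017", "db3.example.com:27017"],
    "rs0",
)
print(uri)
```

A driver would take this string as its connection argument. These settings narrow the window for anomalies; the slides that follow describe reads that can still go wrong on a minority primary.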
Slide 31 text
No content
Slide 32 text
Dirty Read
Isolation: Read uncommitted
Writes are readable from a (minority) primary while async replication
is still running; after a network partition they are rolled back
once a timeout fires
Slide 33 text
Stale Read
Reading old data from a (minority)
primary until a timeout happens
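The window can be illustrated with a simulated clock. This is a toy model, not MongoDB's actual election protocol: `Primary`, `tick` and the timeout value are hypothetical.

```python
class Primary:
    """Toy primary that steps down once it has not seen a majority
    for `election_timeout` time units (simulated clock)."""
    def __init__(self, name, value, election_timeout):
        self.name = name
        self.value = value
        self.election_timeout = election_timeout
        self.last_majority_contact = 0.0
        self.is_primary = True

    def tick(self, now, sees_majority):
        if sees_majority:
            self.last_majority_contact = now
        elif now - self.last_majority_contact >= self.election_timeout:
            self.is_primary = False  # step down after the timeout

    def read(self):
        if not self.is_primary:
            raise RuntimeError("not primary")
        return self.value


# t=0: the partition isolates the old primary on the minority side.
old = Primary("old", "v1", election_timeout=10.0)
# The majority side elects a new primary, which accepts a newer write.
new = Primary("new", "v2", election_timeout=10.0)

old.tick(2.0, sees_majority=False)
stale = old.read()           # still answers, with old data

old.tick(10.0, sees_majority=False)
try:
    old.read()
    refused = False
except RuntimeError:
    refused = True           # after the timeout it stops serving reads

print(stale, new.read(), refused)
```

Between the start of the partition and the timeout, two nodes both believe they are primary, and clients of the old one read `v1` although `v2` has already been written.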
Slide 34 text
/dev/null breaks CAP: effect of
write are always consistent,
it's always available, and all
replicas are consistent even
during partitions.
1 https://twitter.com/ashic/status/591511683987701760