Don't Give Up on Serializability Just Yet

Consistency and Candy Crush
Neha Narula @neha
dotScale, June 8, 2015
Don’t give up on serializability just yet

@neha
•  PhD from MIT
•  Formerly at Google
•  Research in fast transactions for multi-core databases and distributed systems

“… the most important person in my gang will be a systems programmer. A person who can debug a device driver or a distributed system is a person who can be trusted in a Hobbesian nightmare of breathtaking scope; a systems programmer has seen the terrors of the world and understood the intrinsic horror of existence.”

Consistency models help us reason about our code and avoid
subtle bugs

Outline
•  Consistency as in ACID
•  Consistency models
•  Consistency as in CAP

Consistency as in ACID

mysql> START TRANSACTION;
mysql> UPDATE t SET x=x+1 WHERE y=2;
mysql> UPDATE t SET y=y+1 WHERE z=3;
mysql> COMMIT;

ACID transactions
•  Atomic: whole thing happens or not
•  Consistent: application-defined correctness
•  Isolated: transactions don’t interfere with each other
•  Durable: database can recover correctly from a crash

What is serializability?
The result of executing a set of transactions is equivalent to executing those transactions one at a time, in some serial order.
If each transaction preserves correctness, the database will be in a correct state.
We can pretend there’s no concurrency!

What is serializability?
serializability != serial execution

Serializable database transactions
TXN1(k, j Key) (int, int) {
    a := GET(k)
    b := GET(j)
    return a, b
}
TXN2(k, j Key) {
    ADD(k,1)
    ADD(j,1)
}
Initial state: k=0, j=0
To the programmer, the execution looks like TXN1 then TXN2, or TXN2 then TXN1.
Valid return values for TXN1: (0,0) or (1,1)

Transactions can execute in parallel
Interleaved execution over time of GET(k), GET(j) (TXN1) with ADD(k,1), ADD(j,1) (TXN2), starting from k=0, j=0: TXN1 returns (1,1)

Non-serializable means incorrect interleavings
Interleaved execution over time of GET(k), GET(j) (TXN1) with ADD(k,1), ADD(j,1) (TXN2), starting from k=0, j=0: TXN1 returns (1,0)!
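The two interleaving slides can be checked mechanically. The following is a minimal sketch, not the speaker's code: it assumes an in-memory map as the store, and the names op, run, interleavings, and outcomes are hypothetical helpers invented for illustration. It enumerates every interleaving of TXN1 and TXN2 that keeps each transaction's own steps in order, and flags a TXN1 result as acceptable only if one of the two serial orders also produces it. Judging by TXN1's return value alone is enough in this toy case because every schedule ends with k=1, j=1.

```go
package main

import "fmt"

// op is one step of a transaction acting on the shared store.
type op struct {
	txn  int    // 1 for TXN1, 2 for TXN2
	kind string // "GET" or "ADD"
	key  string
}

// run executes a schedule starting from k=0, j=0 and returns
// the pair of values that TXN1's two GETs observed.
func run(schedule []op) [2]int {
	store := map[string]int{"k": 0, "j": 0}
	var reads []int
	for _, o := range schedule {
		if o.kind == "GET" {
			reads = append(reads, store[o.key])
		} else {
			store[o.key]++
		}
	}
	return [2]int{reads[0], reads[1]}
}

// interleavings returns every merge of a and b that keeps each
// transaction's own operations in program order.
func interleavings(a, b []op) [][]op {
	if len(a) == 0 {
		return [][]op{append([]op{}, b...)}
	}
	if len(b) == 0 {
		return [][]op{append([]op{}, a...)}
	}
	var out [][]op
	for _, rest := range interleavings(a[1:], b) {
		out = append(out, append([]op{a[0]}, rest...))
	}
	for _, rest := range interleavings(a, b[1:]) {
		out = append(out, append([]op{b[0]}, rest...))
	}
	return out
}

// outcomes maps each possible TXN1 result to whether one of the
// two serial orders produces the same result.
func outcomes() map[[2]int]bool {
	txn1 := []op{{1, "GET", "k"}, {1, "GET", "j"}}
	txn2 := []op{{2, "ADD", "k"}, {2, "ADD", "j"}}
	serial := map[[2]int]bool{
		run(append(append([]op{}, txn1...), txn2...)): true,
		run(append(append([]op{}, txn2...), txn1...)): true,
	}
	result := map[[2]int]bool{}
	for _, s := range interleavings(txn1, txn2) {
		r := run(s)
		result[r] = serial[r]
	}
	return result
}

func main() {
	for r, ok := range outcomes() {
		fmt.Printf("TXN1 returns %v: matches a serial order = %v\n", r, ok)
	}
}
```

Of the six possible interleavings, only those returning (0,0) or (1,1) match a serial order; (1,0) and (0,1) are the results a serializable database must never expose.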

Benefits of serializability
•  Do not have to reason about interleavings
•  Express invariants in one place: the code

Consistency models

Eventual consistency: key/value stores
•  Bigtable
•  Dynamo

Eventual consistency If no new updates are made to a
key, eventually all accesses will return the last updated value.

Eventual consistency
If no new updates are made to a key, eventually all accesses will return the same value (rather than “the last updated value”).
(What is last, really?) (And when do we stop writing?)
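To make "the same value" concrete, here is a minimal sketch of replicas converging once writes stop. It is an illustration under stated assumptions, not how Bigtable or Dynamo work: it assumes a last-writer-wins rule with a logical timestamp and a writer ID as tie-breaker, and the names versioned, replica, exchange, and converge are hypothetical. Two writers pick the same timestamp, so "last" is decided by an arbitrary tie-break, which is exactly the ambiguity the slide points at.

```go
package main

import "fmt"

// versioned is a value tagged with a logical timestamp and a writer ID.
type versioned struct {
	value  string
	ts     int // logical timestamp chosen by the writer (assumed scheme)
	writer int // tie-breaker when two timestamps collide
}

// newer reports whether a should replace b under last-writer-wins.
func newer(a, b versioned) bool {
	if a.ts != b.ts {
		return a.ts > b.ts
	}
	return a.writer > b.writer
}

// replica holds one copy of a single key.
type replica struct {
	cur versioned
}

// write applies v only if it wins against the replica's current value.
func (r *replica) write(v versioned) {
	if newer(v, r.cur) {
		r.cur = v
	}
}

// exchange is one round of state exchange between two replicas.
func exchange(a, b *replica) {
	a.write(b.cur)
	b.write(a.cur)
}

// converge performs two concurrent writes on different replicas,
// stops writing, gossips, and returns each replica's final value.
func converge() []string {
	rs := []*replica{{}, {}, {}}
	rs[0].write(versioned{value: "red", ts: 1, writer: 0})
	rs[2].write(versioned{value: "blue", ts: 1, writer: 2})

	// No more updates from here on; replicas only exchange state.
	exchange(rs[0], rs[1])
	exchange(rs[1], rs[2])
	exchange(rs[0], rs[1])

	var vals []string
	for _, r := range rs {
		vals = append(vals, r.cur.value)
	}
	return vals
}

func main() {
	fmt.Println(converge())
}
```

All three replicas end up agreeing, but only because the tie-break picks one of two writes that neither writer could order in real time. Before the exchanges complete, a reader may see "red", "blue", or nothing at all.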

Strict consistency
•  Reads and writes appear to have executed in a total order that matches time
•  Single processor semantics
•  Linearizability

Different Consistency Models (stronger to weaker)
•  Strict consistency
•  Sequential consistency
•  Causal consistency
•  PRAM consistency
•  Read-your-writes consistency
•  Eventual consistency

Consistency as in CAP

CAP theorem
•  Brewer’s PODC talk in 2000: Consistency, Availability, Partition-tolerance: choose two
–  Partition-tolerance is a failure model
–  Choice: can you process reads and writes during a partition or not?
•  FLP result: “Impossibility of Distributed Consensus with One Faulty Process” in 1985
–  Asynchronous model; cannot tell the difference between message delay and failure

What does this mean? Is it impossible to run a
correct distributed database?

NP-hard

What does CAP mean?
•  It is impossible to make progress and get the right answer 100% of the time if we can’t rely on synchronous messaging
•  We can make progress and get the right answer 100% of the time if partitions heal (we know the upper bound on message delays)
•  We can still play Candy Crush

CAP
Consistency vs. performance
Consistency requires communication and blocking. How do we reduce these costs while producing a correct ordering of reads and writes and handling failures?

Spanner/F1
“We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”
Corbett, James C., Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, Jeffrey John Furman, Sanjay Ghemawat et al. "Spanner: Google’s globally distributed database." ACM Transactions on Computer Systems (TOCS), 2013.

Outline
•  Consistency as in ACID
•  Consistency models
•  Consistency as in CAP

Takeaways
•  Use well-tested, long-lived databases with SERIALIZABLE until you have a performance problem
•  Be aware of what is changing when you move between systems with different consistency models
•  Consciously decide what trade-offs to make

Thanks!
The Stata Center via emax: http://hip.cat/emax/
http://nehanaru.la
@neha
