Don't Give Up on Serializability Just Yet

Neha
June 08, 2015

A short version of a talk on serializability and consistency, given at dotScale in Paris. Describes consistency in three contexts: database transactions, consistency models, and the CAP theorem.

Transcript

  1. Consistency and Candy Crush
    Neha Narula
    @neha
    dotScale, June 8, 2015
    Don’t give up on serializability just yet

  2. @neha
    •  PhD from MIT
    •  Formerly at Google
    •  Research in fast transactions for multi-core databases and distributed systems

  3. “… the most important person in my gang will be a systems programmer. A
    person who can debug a device driver or a distributed system is a person
    who can be trusted in a Hobbesian nightmare of breathtaking scope; a
    systems programmer has seen the terrors of the world and understood the
    intrinsic horror of existence.”

  4. Consistency models help us reason about our code and avoid subtle bugs

  5. Outline
    Consistency as in ACID
    Consistency models
    Consistency as in CAP

  6. Outline
    Consistency as in ACID (current section)
    Consistency models
    Consistency as in CAP

  7. mysql> START TRANSACTION;
    mysql> UPDATE t SET x=x+1 WHERE y=2;
    mysql> UPDATE t SET y=y+1 WHERE z=3;
    mysql> COMMIT;

  8. ACID transactions
    Atomic: Whole thing happens or not
    Consistent: Application-defined correctness
    Isolated: Transactions don’t interfere with each other
    Durable: Database can recover correctly from a crash

  9. What is serializability?
    The result of executing a set of transactions is
    equivalent to executing those transactions
    one at a time, in some serial order.

    If each transaction preserves correctness, the
    database will be in a correct state.

    We can pretend like there’s no concurrency!

  10. What is serializability?
    serializability != serial execution

  11. Serializable database transactions

    TXN1(k, j Key) (int, int) {
      a := GET(k)
      b := GET(j)
      return a, b
    }

    TXN2(k, j Key) {
      ADD(k,1)
      ADD(j,1)
    }

    Initially: k=0, j=0
    Serial orders over time: TXN1 then TXN2, or TXN2 then TXN1
    To the programmer: valid return values for TXN1 are (0,0) or (1,1)
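
The two serial orders on this slide can be checked directly. The following is a minimal Go sketch, assuming a toy in-memory map in place of a real database; the `store` type and the `serialResults` helper are hypothetical names introduced here, not part of the talk.

```go
package main

import "fmt"

// store is a toy in-memory key/value store; missing keys read as 0.
type store map[string]int

// txn1 mirrors TXN1 on the slide: read k, then read j.
func txn1(s store, k, j string) (int, int) {
	a := s[k]
	b := s[j]
	return a, b
}

// txn2 mirrors TXN2 on the slide: add 1 to k, then add 1 to j.
func txn2(s store, k, j string) {
	s[k]++
	s[j]++
}

// serialResults runs both serial orders, each starting from k=0, j=0,
// and returns what TXN1 observes in each order.
func serialResults() [][2]int {
	var out [][2]int

	// Order 1: TXN1 then TXN2.
	s := store{}
	a, b := txn1(s, "k", "j")
	txn2(s, "k", "j")
	out = append(out, [2]int{a, b})

	// Order 2: TXN2 then TXN1.
	s = store{}
	txn2(s, "k", "j")
	a, b = txn1(s, "k", "j")
	out = append(out, [2]int{a, b})

	return out
}

func main() {
	fmt.Println(serialResults()) // [[0 0] [1 1]]
}
```

A serializable database may run the two transactions concurrently, but TXN1 must still observe one of these two pairs.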

  12. Transactions can execute in parallel
    Initially: k=0, j=0
    Interleaved execution over time:
      TXN1: GET(k) GET(j)
      TXN2: ADD(k,1) ADD(j,1)
    TXN1 returns (1,1)

  13. Non-serializable means incorrect interleavings
    Initially: k=0, j=0
    Interleaved execution over time:
      TXN1: GET(k) GET(j)
      TXN2: ADD(k,1) ADD(j,1)
    TXN1 returns (1,0)!
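
To see which interleavings are the incorrect ones, the four operations can be merged in every order that preserves each transaction's own sequence. This is a rough Go sketch under that assumption; `interleavings`, `run`, and `serializable` are hypothetical helpers written for this illustration. Of the six possible schedules, two give TXN1 a result, (1,0) or (0,1), that no serial order could produce.

```go
package main

import "fmt"

// interleavings returns every merge of xs and ys that keeps each
// slice's internal order.
func interleavings(xs, ys []string) [][]string {
	if len(xs) == 0 {
		return [][]string{append([]string{}, ys...)}
	}
	if len(ys) == 0 {
		return [][]string{append([]string{}, xs...)}
	}
	var out [][]string
	for _, rest := range interleavings(xs[1:], ys) {
		out = append(out, append([]string{xs[0]}, rest...))
	}
	for _, rest := range interleavings(xs, ys[1:]) {
		out = append(out, append([]string{ys[0]}, rest...))
	}
	return out
}

// run executes one schedule from k=0, j=0 and returns TXN1's result.
func run(schedule []string) [2]int {
	var k, j, a, b int
	for _, step := range schedule {
		switch step {
		case "GET(k)":
			a = k
		case "GET(j)":
			b = j
		case "ADD(k,1)":
			k++
		case "ADD(j,1)":
			j++
		}
	}
	return [2]int{a, b}
}

// serializable reports whether TXN1's result matches one of the two
// serial orders: (0,0) for TXN1 first, (1,1) for TXN2 first.
func serializable(r [2]int) bool {
	return r == [2]int{0, 0} || r == [2]int{1, 1}
}

func main() {
	txn1 := []string{"GET(k)", "GET(j)"}
	txn2 := []string{"ADD(k,1)", "ADD(j,1)"}
	for _, s := range interleavings(txn1, txn2) {
		r := run(s)
		fmt.Println(s, "->", r, "serializable:", serializable(r))
	}
}
```

A serializable system has to prevent the two anomalous schedules; it is free to allow the other four, which is why serializability does not imply serial execution.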

  14. Benefits of serializability
    •  Do not have to reason about interleavings
    •  Express invariants in one place: the code

  15. Outline
    Consistency as in ACID
    Consistency models (current section)
    Consistency as in CAP

  16. Eventual consistency: key/value stores
    •  Bigtable
    •  Dynamo

  17. Eventual consistency
    If no new updates are made to a key,
    eventually all accesses will return the last
    updated value.

  18. Eventual consistency
    If no new updates are made to a key,
    eventually all accesses will return the same value
    (not necessarily the last updated value).

    (What is last, really?)

    (And when do we stop writing?)
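
One common way to answer "what is last?" is a last-writer-wins rule with a deterministic tie-break. The Go sketch below assumes that rule; it is not taken from the talk or from any particular store, and the `version`, `replica`, and `exchange` names are hypothetical. Two replicas accept conflicting writes locally, disagree for a while, and converge once they exchange state.

```go
package main

import "fmt"

// version is a value tagged with a logical timestamp and the ID of the
// replica that wrote it. Both tags are assumptions of this sketch.
type version struct {
	value string
	ts    int
	node  int
}

// newer defines "last": the higher timestamp wins, and ties go to the
// higher node ID so that every replica picks the same winner.
func newer(a, b version) bool {
	if a.ts != b.ts {
		return a.ts > b.ts
	}
	return a.node > b.node
}

// replica stores one version per key.
type replica struct {
	id   int
	data map[string]version
}

func newReplica(id int) *replica {
	return &replica{id: id, data: map[string]version{}}
}

// put accepts a write locally without contacting other replicas.
func (r *replica) put(key, value string, ts int) {
	v := version{value: value, ts: ts, node: r.id}
	if cur, ok := r.data[key]; !ok || newer(v, cur) {
		r.data[key] = v
	}
}

// get returns whatever this replica currently holds, possibly stale.
func (r *replica) get(key string) string { return r.data[key].value }

// mergeInto copies every version from src that beats what dst holds.
func mergeInto(dst, src *replica) {
	for k, v := range src.data {
		if cur, ok := dst.data[k]; !ok || newer(v, cur) {
			dst.data[k] = v
		}
	}
}

// exchange performs one two-way state exchange between replicas.
func exchange(a, b *replica) {
	mergeInto(b, a)
	mergeInto(a, b)
}

func main() {
	r1, r2 := newReplica(1), newReplica(2)
	r1.put("x", "red", 5)
	r2.put("x", "blue", 5) // conflicting write with the same timestamp
	fmt.Println(r1.get("x"), r2.get("x")) // red blue: replicas disagree
	exchange(r1, r2)
	fmt.Println(r1.get("x"), r2.get("x")) // blue blue: converged
}
```

Both replicas end up with the same value, but which write counts as "last" was decided by the tie-break rule, not by real time. That is the gap the slide's parenthetical questions point at.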

  19. Strict consistency
    •  Reads and writes appear to have executed in a total order that matches time
    •  Single processor semantics
    •  Linearizability

  20. Different Consistency Models
    Strict consistency (stronger)
    Sequential consistency
    Causal consistency
    PRAM consistency
    Read-your-writes consistency
    Eventual consistency (weaker)

  21. Outline
    Consistency as in ACID
    Consistency models
    Consistency as in CAP (current section)

  22. CAP theorem
    •  Brewer’s PODC talk (2000): Consistency, Availability,
    Partition-tolerance: choose two
    –  Partition-tolerance is a failure model
    –  Choice: can you process reads and writes during a
    partition or not?

    •  FLP result (1985): “Impossibility of Distributed
    Consensus with One Faulty Process”
    –  Asynchronous model; cannot tell the difference
    between message delay and failure
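
The choice in the first bullet can be made concrete with two replicas of a single register. This is a deliberately small Go sketch; the `node` type, its `preferAvail` flag, and the synchronous two-replica setup are assumptions for illustration, not a description of any real system. During a partition a node either refuses the write and stays consistent, or accepts it and diverges from its peer.

```go
package main

import (
	"errors"
	"fmt"
)

var errUnavailable = errors.New("partitioned: refusing request to stay consistent")

// node is one of two replicas of a single register.
type node struct {
	value       int
	peer        *node
	partitioned bool // true when the peer cannot be reached
	preferAvail bool // true: keep serving writes during a partition
}

// write replicates synchronously when the peer is reachable. During a
// partition it either fails (consistency) or applies locally (availability).
func (n *node) write(v int) error {
	if n.partitioned {
		if !n.preferAvail {
			return errUnavailable
		}
		n.value = v // the peer will not see this until the partition heals
		return nil
	}
	n.value = v
	n.peer.value = v
	return nil
}

// pair builds two connected replicas that share the same policy.
func pair(preferAvail bool) (*node, *node) {
	a := &node{preferAvail: preferAvail}
	b := &node{preferAvail: preferAvail}
	a.peer, b.peer = b, a
	return a, b
}

func main() {
	// Consistency first: the write is rejected and the replicas still agree.
	a, b := pair(false)
	a.partitioned, b.partitioned = true, true
	fmt.Println(a.write(1), a.value, b.value)

	// Availability first: the write succeeds and the replicas diverge.
	a, b = pair(true)
	a.partitioned, b.partitioned = true, true
	fmt.Println(a.write(1), a.value, b.value)
}
```

The sketch sets the `partitioned` flag by hand. In the asynchronous model of the second bullet a node cannot reliably know that flag, because a slow message and a failed peer look the same.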

  23. What does this mean?
    Is it impossible to run a correct distributed database?

  24. NP-hard

  25. What does CAP mean?
    It is impossible to make progress and get the right answer 100% of the
    time if we can’t rely on synchronous messaging

    We can make progress and get the right answer 100% of the time if
    partitions heal (we know the upper bound on message delays)

    We can still play Candy Crush

  26. CAP
    Consistency vs. performance
    Consistency requires communication and blocking.

    How do we reduce these costs while producing a correct ordering of reads
    and writes and handling failures?

  27. Spanner/F1
    “We believe it is better to have application programmers deal with
    performance problems due to overuse of transactions as bottlenecks arise,
    rather than always coding around the lack of transactions.”
    Corbett, James C., Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, Jeffrey John Furman,
    Sanjay Ghemawat et al. "Spanner: Google’s globally distributed database." ACM Transactions on
    Computer Systems (TOCS), 2013.

  28. Outline
    Consistency as in ACID
    Consistency models
    Consistency as in CAP

  29. Takeaways
    Use well-tested, long-lived databases with SERIALIZABLE until you have a
    performance problem

    Be aware of what is changing when you move between systems with different
    consistency models

    Consciously decide what trade-offs to make

  30. Thanks!

    The Stata Center via emax: http://hip.cat/emax/
    [email protected]
    http://nehanaru.la
    @neha
