Upgrade to Pro — share decks privately, control downloads, hide ads and more …

A Critique of Snapshot Isolation

A Critique of Snapshot Isolation

"A Critique of Snapshot Isolation" is a notable transaction paper implemented in CockroachDB and Badger. It analyzes snapshot isolation (SI) and highlights that while SI offers high concurrency by checking only for write-write conflicts, it fails to guarantee serializability, allowing anomalies like write skew. The authors introduce write-snapshot isolation (WSI), an alternative that ensures serializability by preventing read-write conflicts rather than write-write conflicts. The paper demonstrates that WSI can achieve performance comparable to SI while providing the stricter consistency guarantees of serializability, making it a more robust solution for transactional support in large-scale distributed databases.

Arjun Sunil Kumar

June 14, 2024
Tweet

More Decks by Arjun Sunil Kumar

Other Decks in Research

Transcript

  1. A critique of snapshot isolation Authors: Maysam Yabandeh , Daniel

    Gómez Ferro - Yahoo Research Presenter: Arjun Sunil Kumar
  2. 1. Isolation Levels and anomalies 2. Serialization Strategies 3. SI

    and SSI Implementations 4. Intuitive example 5. Eval Agenda
  3. SQL 92 - Transaction Isolation RedBook - http://www.redbook.io/ch6-isolation.html SQL 92

    - https://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
  4. 1. Dirty Read Good for Write Heavy Workload - Metrics

    Ref: https://www.thenile.dev/blog/transaction-isolation-postgres Read Uncommitted No locks
  5. 6. No Anomaly Serializable (R/W/Range Lock) T1 T2 R(x) W(x)

    R(y) W(y) R(x) W(x) R(y) W(y) Range Locks applies for SELECT … WHERE SELECT … BETWEEN 2PL - Growing Phase: Get necessary Shared/Exclusive/Range locks on the dataset. - Do work - Shrinking Phase: release locks
  6. Snapshot Isolation - a form of MVCC S S C

    C https://unsplash.com/photos/group-of-men-running-in-track-field-HgoKvtKpyHA
  7. 5. Write Skew R same R same W diff W

    diff Ref: https://drkp.net/papers/ssi-vldb12.pdf
  8. 1. Sequential Execution Main Memory Database Each Core runs Single

    Thread Exec Engine. Each Core has a single partition. https://www.vldb.org/pvldb/vol6/p1942-debrabant.pdf https://hstore.cs.brown.edu/documentation/architecture-overview/
  9. 1. Timeout 2. Pre-commit Phase 3. Higher Message Overhead 3.

    3PC 1. Participant Failure - Halts Txn. 2. Coordinator Failure - Needs recovery time.
  10. - Deadlock/Starvation - Failed Locks or Slow Txn prevents the

    others Txns from making progress. Problems With Locks Based Approach
  11. Serializable - It’s like Git Commit https://fangpenlin.com/posts/2013/09/30/keep-a-readable-git-history/ SERIALIZABLE isolation guarantees

    that even though transactions may execute in parallel, the result is the same as if they had executed one at a time, without any concurrency. This ensures data correctness by preventing all "anomalies" allowed by weaker isolation levels.
  12. Lock based Strategy vs Timestamp Based Strategy Grow Shrink Work

    LOCK (Eager) TS Validate Work OCC (Lazy) Commit Conflict Detection! May or May not use locks TS Work/Validate TO (As you go) Commit Validate Work Scheduling (Eager) Order TS Batch You have more chance of conflict.
  13. 5. Basic Timestamp Ordering > Record keeps track of WTS,

    RTS https://www.britannica.co m/topic/Big-Ben-clock-Lo ndon https://15445.courses.cs.cmu.edu/fall2023/notes/17-timestampordering.pdf BTO uses timestamps to control the access and update of data items at the time of execution. It aborts any transaction that violates the timestamp order. OCC uses timestamps to validate the transactions at the end of execution. Ref: https://muratbuffalo.blogspot.com/2022/11/timestamp-based-algorithms-for.html
  14. 8. OCC - Central Oracle (Recent Committed Txns) Central Oracle

    for Ts and Last Txns. https://www.etsy.com/listing/60 3500543/cairn-photo-clock-sta cked-stones https://15445.courses.cs.cmu.edu/fall2023/notes/17-timestampordering.pdf
  15. 9. Scheduling - Calvin Sequencer https://www.cs.yale.edu/homes/thomson/publications/calvin-sigmod12.pdf Smaller Timeunit batches Need

    to know R/W set before hand Preloading. https://blog.acolyer.org/2019/03/29/calvin-fast-distributed-transactions-f or-partitioned-database-systems/
  16. Serialization Techniques - Summary Locks TSO Conflict Check at Work

    Graph Testing For Conflict Check At End Central Manager For Conflict Check At End Batching and Scheduling Seq Exec
  17. 1. Lock Based SI Stores latest Commit TS for WW

    conflict check. Used by lock-free SI. Used in 2PC
  18. Traffic Types 1. RO Read Set = empty Write Set

    = empty Analytics 2. OLTP Read Set = small ~20 rows Write Set = small ~20 rows • Approximate Data Set • Separate OLAP workload from OLTP oracle.
  19. Txn

  20. Read Ts 1. Get the next timestamp 2. Add to

    minHeap so that we can find the smallest active readTs for GC. 3. Wait for all ready to commit txn to commit.
  21. Commit Ts 1. Conflict Check 2. Close ReadTs 3. GC

    Commit Txn’s older than lowest ReadTs in MinHeap. 4. Generate CommitTs 5. Add commitTs to Commit Visibility Waiter
  22. The Cashier A →$10→ B A →$20→ C 3 4

    5 6 Oracle Compute Nodes Performing Txns Txn Req
  23. Reat Ts 3 4 5 6 A →Balance 7 Txn’s

    which have CommitTS (CommitWaiter)
  24. Read Done/GC Trigger A →$10→ B 7 8, 9 C:10

    GC cheques before timestamp 7, If 7 is the MinHeap Peek. Kind of like Sliding Window of Committed Txn’s
  25. Expectation • RO Txn’s are never aborted by either RSI

    or WSI • RSI and WSI have same level of concurrency
  26. Overhead on Status Oracle - W vs RW set CS

    is more for WSI 2x Memory Item Write Set Read Set Clients: 1 to 64 (exponential) To get higher throughput, you could partition the database and use a status oracle for each partition.
  27. Abort Rate Abort Rate in ZipfianLatest increases more quickly. ZipfianLatest,

    the read set is selected mostly from the recent written data, which increases the chance of R-w conflict in WSI.
  28. To Summarize • Serializability is Expensive in 2PL • Weaker

    Isolation Levels and it’s anomaly • Snapshot Isolation and Write Skew Anomaly • Use RW conflict instead of WW conflict for Serializability • Doesn’t impact RO Txn. • WSI and RSI performance almost comparable. • Serializability could be brought to lock-free txn systems without hurting Performance.
  29. RSI vs WSI RSI WSI Serializability Read from a particular

    snapshot readTs Write to a Particular Snapshot (ie Serializable in Timeline) Take the Dr Alice, Dr Bob situation.