2017 Jim Gray Award Talk: Coordination Avoidance in Distributed Databases

Transcript

  1. COORDINATION AVOIDANCE IN DISTRIBUTED DATABASES
    PETER BAILIS, Stanford, bailis.org
    2017 ACM SIGMOD Jim Gray Award Talk, Chicago, IL, May 2017
  2. How should we design database systems that enable new applications to scale?
    “post on timeline” “accept friend request”
  3. CLASSIC: ACID
  4. CLASSIC: ACID serializable transactions
    “accept friend request” “post on timeline”
  5. CLASSIC: ACID serializable transactions
  6. Problem: Serializability requires Coordination
    transactions cannot make progress independently
    Two-Phase Locking, Optimistic Concurrency Control, Pre-Scheduling, Multi-Version Concurrency Control
    Blocking, Waiting, Aborts
  7. Problem: Serializability requires Coordination
    transactions cannot make progress independently
    133.7+ ms RTT (7.5/s)
    Well-known for decades, but…
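The 7.5/s figure on slide 7 is consistent with a simple bound: if every coordinated operation on a contended item must wait one full round trip, the round-trip time caps the sequential rate. A minimal sketch of that arithmetic follows; the one-RTT-per-operation model and the function name `max_serial_rate` are assumptions for illustration, not something the slides spell out.

```python
def max_serial_rate(rtt_ms: float) -> float:
    """Upper bound on back-to-back coordinated operations per second,
    assuming each operation costs exactly one round trip (an assumption)."""
    return 1000.0 / rtt_ms


rate = max_serial_rate(133.7)
print(f"{rate:.1f} operations/s")  # 7.5 operations/s
```

Under this model, adding servers does not raise the bound; only shortening the round trip or avoiding the coordination does.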
  8. Mid 2000s Internet NoSQL

  9. NoSQL
    Major focus: coordination-free execution, or guaranteed response from every replica
    Benefits: Availability, Low latency, Perfect horizontal scalability
  10. Major focus: coordination-free execution, or guaranteed response from every replica
    Benefits: Availability, Low latency, Perfect horizontal scalability
    Cost: rarely guarantee application safety properties
  11. Serializability (“ACID”): COORDINATION REQUIRED, GUARANTEED SAFETY
    Eventual Consistency (“NoSQL”): COORDINATION FREE, NO SAFETY
    COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    THESIS WORK: What is the coordination cost of a given safety guarantee? How do we achieve the minimum?
  12. Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
    Eventual Consistency: COORDINATION FREE, NO SAFETY
    COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    MORE APP SEMANTICS, MORE SAFETY
    Atomic Visibility (SIGMOD14); Database Constraints (VLDB15, SIGMOD15); Weak Isolation (HotOS13, VLDB14); Causality (SOCC12, SIGMOD13); PBS (VLDB12, VLDBJ14, SIGMOD13, CACM14)
  13. COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    Data Serving and Transactions: Atomic Visibility (SIGMOD14); Database Constraints (VLDB15, SIGMOD15); Weak Isolation (HotOS13, VLDB14); Causality (SOCC12, SIGMOD13)
    Analytics: Model Prediction and Training (CIDR15, LearningSys15)
  14. Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
    Eventual Consistency: COORDINATION FREE, NO SAFETY
    COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    MORE APP SEMANTICS, MORE SAFETY
    Atomic Visibility (SIGMOD14); Database Constraints (VLDB15, SIGMOD15); Model Prediction and Training (CIDR15, LearningSys15); Weak Isolation (HotOS13, VLDB14); Causality (SOCC12, SIGMOD13); PBS (VLDB12, VLDBJ14, SIGMOD13, CACM14)
    COORDINATION FREE
  15. The Far Side, Gary Larson

  16. WHAT THE APPLICATION SAYS: “post on timeline” “accept friend request”
    WHAT THE DATABASE HEARS: write read write read write write read write write write read write read read read read read read
  17. (Abridged) Related Work
    » Semantics-based concurrency control: esp. commutativity and CALM analysis, laws of order
    » Available storage systems: optimistic replication, causal memory, CRDTs, eventually consistent transactions
    » Distributed computing: CAP, FLP, NBAC, quorums
    » Here: focus on necessary coordination for common, modern data-intensive apps
  18. Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
    Eventual Consistency: COORDINATION FREE, NO SAFETY
    COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    MORE SEMANTICS, MORE SAFETY
    Atomic Visibility (SIGMOD14); Database Constraints (VLDB15, SIGMOD15); Weak Isolation (HotOS13, VLDB14); Causality (SOCC12, SIGMOD13); PBS (VLDB12, VLDBJ14, SIGMOD13, CACM14); Model Prediction and Training (CIDR15, LearningSys15)
  19. Only 3/18 databases are serializable by default; only 10/18 provide serializability at all [VLDB 2014]
  20. many RDBMSs don’t provide serializability?!?
    does this weak isolation require coordination?
  21. Highly Available Transactions
    Example: Read Committed (RC)
    Informal: no dirty reads
    Transactions conceal writes until commit
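Slide 21's "conceal writes until commit" can be illustrated with a toy, single-process model in which each transaction buffers its writes locally and publishes them only at commit, so no other transaction ever reads uncommitted (dirty) data. The `Store` and `Transaction` classes below are hypothetical names for illustration; this is a sketch of the idea, not the algorithm from the cited paper.

```python
class Store:
    """Holds only committed data (a single in-memory replica)."""

    def __init__(self):
        self.committed = {}


class Transaction:
    """Buffers writes privately until commit, so others see no dirty reads."""

    def __init__(self, store):
        self.store = store
        self.buffer = {}  # uncommitted writes, invisible to other transactions

    def write(self, key, value):
        self.buffer[key] = value

    def read(self, key):
        # A transaction sees its own writes first, else the committed value.
        if key in self.buffer:
            return self.buffer[key]
        return self.store.committed.get(key)

    def commit(self):
        # Publish all buffered writes; no locks or cross-replica calls needed.
        self.store.committed.update(self.buffer)
        self.buffer = {}


store = Store()
t1, t2 = Transaction(store), Transaction(store)
t1.write("x", 1)
print(t2.read("x"))  # None: t1's write is still concealed
t1.commit()
print(t2.read("x"))  # 1: visible only after commit
```

Nothing in this sketch waits on another transaction, which is the property that lets Read Committed be served without coordination.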
  22. Existing Database Isolation, Session Guarantees, Distributed Registers

  23. Legend: Unavailable, Sticky Available, Highly Available
    † prevents lost update, ‡ prevents write skew, ⊕ requires recency guarantees [VLDB 2014]
  24. Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
    Eventual Consistency: COORDINATION FREE, NO SAFETY
    COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    MORE SEMANTICS, MORE SAFETY
    Atomic Visibility (SIGMOD14); Database Constraints (VLDB15, SIGMOD15); Weak Isolation (HotOS13, VLDB14); Causality (SOCC12, SIGMOD13); PBS (VLDB12, VLDBJ14, SIGMOD13, CACM14); Model Prediction and Training (CIDR15, LearningSys15)
  25. Typical database constraints and operations (SQL)

    Constraint            Operation
    Equality, Inequality  Any
    Generate unique ID    Any
    Specify unique ID     Insert
    >                     Increment
    >                     Decrement
    <                     Decrement
    <                     Increment
    Foreign Key           Insert
    Foreign Key           Delete
    Secondary Indexing    Any
    Materialized Views    Any
    AUTO_INCREMENT        Insert
  26. ICT: Invariant Confluence Test
    Key idea: Check if constraints can be violated by “merging” independent operations
    CONSTRAINT: User IDs are unique
    OPERATION: Add users
    MERGE: Set union
    Starting from {}, two independent operations, add {Stu,ID=1} and add {Ann,ID=1}, MERGE into {{Stu,ID=1}, {Ann,ID=1}}: Constraint violated!
  27. ICT: Invariant Confluence Test
    Key idea: Check if constraints can be violated by “merging” independent operations
    CONSTRAINT: User IDs are positive
    OPERATION: Add users
    MERGE: Set union
    Starting from {}, two independent operations, add {Stu,ID=1} and add {Ann,ID=1}, MERGE into {{Stu,ID=1}, {Ann,ID=1}}: Constraint holds!
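The two examples on slides 26 and 27 can be replayed in a few lines. The sketch below applies two inserts independently to the same starting state, merges the results by set union, and checks the constraint on the merged state. The helper names (`ids_unique`, `ids_positive`, `check_merge`) are made up for illustration, and the code checks only this one pair of operations; the test described in the talk reasons about all valid states and operations, which this does not attempt.

```python
from typing import Callable, FrozenSet, Tuple

User = Tuple[str, int]  # (name, id)
State = FrozenSet[User]


def ids_unique(state: State) -> bool:
    ids = [uid for _, uid in state]
    return len(ids) == len(set(ids))


def ids_positive(state: State) -> bool:
    return all(uid > 0 for _, uid in state)


def check_merge(invariant: Callable[[State], bool],
                start: State, op_a: User, op_b: User) -> bool:
    """Apply two inserts independently, then merge by set union.

    Returns True if the merged state still satisfies the invariant.
    """
    left = start | {op_a}    # replica A adds one user
    right = start | {op_b}   # replica B adds another, without talking to A
    # Each replica is locally valid before the merge.
    assert invariant(left) and invariant(right)
    return invariant(left | right)


start: State = frozenset()
stu, ann = ("Stu", 1), ("Ann", 1)
print(check_merge(ids_unique, start, stu, ann))    # False: constraint violated
print(check_merge(ids_positive, start, stu, ann))  # True: constraint holds
```

Both replicas are valid on their own in each case; only the uniqueness constraint breaks once their states meet, which is why it needs coordination and positivity does not.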
  28. ICT: Invariant Confluence Test
    Key idea: Check if constraints can be violated by “merging” independent operations
    OUR CONTRIBUTION: Generalizes classic partitioning-based indistinguishability arguments
    Theorem. A globally I-valid system can execute a set of transactions T with coordination-freedom, transactional availability, and convergence if and only if T are I-confluent with respect to I. [VLDB 2015]
    ICT ⟺ safe, coordination-free execution possible
  29. Typical database constraints and operations (SQL), under set merge

    Constraint            Operation   OK?
    Equality, Inequality  Any         ???
    Generate unique ID    Any         ???
    Specify unique ID     Insert      ???
    >                     Increment   ???
    >                     Decrement   ???
    <                     Decrement   ???
    <                     Increment   ???
    Foreign Key           Insert      ???
    Foreign Key           Delete      ???
    Secondary Indexing    Any         ???
    Materialized Views    Any         ???
    AUTO_INCREMENT        Insert      ???
  30. Typical database constraints and operations (SQL), under set merge [VLDB 2015]

    Constraint            Operation   OK?
    Equality, Inequality  Any         Y
    Generate unique ID    Any         Y
    Specify unique ID     Insert      N
    >                     Increment   Y
    >                     Decrement   N
    <                     Decrement   Y
    <                     Increment   N
    Foreign Key           Insert      Y
    Foreign Key           Delete      Y*
    Secondary Indexing    Any         Y
    Materialized Views    Any         Y
    AUTO_INCREMENT        Insert      N

    RAMP [SIGMOD 2014]
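The ">" rows of the table on slide 30 can be checked on a small example. The sketch below models a counter as a set of tagged deltas so that merge is set union (matching "under set merge"), with the counter's value being the sum of deltas. That modeling choice, the starting value of 2, and the names `value` and `positive` are assumptions for illustration.

```python
def value(ops: frozenset) -> int:
    """Counter value is the sum of all deltas seen so far."""
    return sum(delta for _, delta in ops)


def positive(ops: frozenset) -> bool:
    """The constraint under test: counter > 0."""
    return value(ops) > 0


start = frozenset({("init", 2)})

# Two replicas each increment independently, then merge by union.
inc_a = start | {("a", +1)}
inc_b = start | {("b", +1)}
print(positive(inc_a | inc_b))  # True: merged value is 4

# Two replicas each decrement independently, then merge by union.
dec_a = start | {("a", -1)}
dec_b = start | {("b", -1)}
print(positive(dec_a), positive(dec_b))  # True True: each is locally valid
print(positive(dec_a | dec_b))           # False: merged value is 0
```

Increments can only move the value away from the lower bound, so merging them stays safe; two decrements that are each fine locally can jointly cross it, which matches the Y and N entries for ">" in the table.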
  31. adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig
  32. CONSTRAINTS INCREDIBLY COMMON
    67 projects, 1.77M LoC, 1957 tables, 9986 constraints total; avg. 5.1 per table [SIGMOD 2015]
    86.9% PASS ICT
    adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena
  33. 14/16 CONSTRAINTS PASS ICT
    TPC-C: scale to over 25x best listed result; 6-11x faster than ACID/serializability
    [Plots: Total Throughput (txn/s) and Throughput (txn/s/server) vs. Number of Servers; Throughput (txns/s) vs. Number of Warehouses, Coordination-Avoiding vs. Serializable (2PL)]
  34. Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
    Eventual Consistency: COORDINATION FREE, NO SAFETY
    COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    MORE SEMANTICS, MORE SAFETY
    Atomic Visibility (SIGMOD14); Database Constraints (VLDB15, SIGMOD15); Model Prediction and Training (CIDR15, TBA); Weak Isolation (HotOS13, VLDB14); Causality (SOCC12, SIGMOD13); PBS (VLDB12, VLDBJ14, SIGMOD13, CACM14)
    COORDINATION FREE
  35. Unruly developers are fantastic inspiration
    • As applications have evolved, so have their database demands and desired semantics
    • Our opportunity: build systems that implement the semantics users want (not just what we want)
  36. • Mounting evidence: many programmers don’t use transactions correctly (or at all!)
    • Need not despair: opportunity for new theory and systems
    ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications [SIGMOD17]
  37. COADVISORS: Ali Ghodsi, Joe Hellerstein, Ion Stoica
    KEY COLLABORATORS, MENTORS: Alan Fekete, Mike Franklin
  38. Also many thanks to a host of phenomenal colleagues and collaborators: Peter Alvaro, Neil Conway, Shivaram Venkataraman, Joey Gonzalez, Haoyuan Li, Zhao Zhang, Aaron Davidson, Mike Jordan
    and a cast of unruly developers and renegades: Michael R. Bernstein, Rick Branson, Mark Callaghan, Adrian Colyer, Sean Cribbs, Jonathan Ellis, Alex Feinberg, Andy Gross, Coda Hale, Colin Jones, Evan Jones, Kyle Kingsbury, Adam Marcus, Caitie McCaffrey, Christopher Meiklejohn, Mike Miller, Jeremiah Peschka, Mark Phillips, Henry Robinson, Mehul Shah, Xavier Shay, Justin Sheehy, Ines Sombra, Kelly Sommers, Sriram Srinivasan
  39. Eventual Consistency: COORDINATION FREE, NO SAFETY
    COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
    MORE APP SEMANTICS, MORE SAFETY
    Atomic Visibility (SIGMOD14); Database Constraints (VLDB15, SIGMOD15); Model Prediction and Training (CIDR15, TBA); Weak Isolation (HotOS13, VLDB14); Causality (SOCC12, SIGMOD13); PBS (VLDB12, VLDBJ14, SIGMOD13, CACM14)
    COORDINATION FREE
    Joint work with Ali Ghodsi, Joe Hellerstein, Ion Stoica, Mike Franklin, Michael Jordan, Alan Fekete, Dan Crankshaw, Shivaram Venkataraman, Neil Conway, Peter Alvaro, Aaron Davidson, Joey Gonzalez, Kyle Kingsbury, Haoyuan Li, and Zhao Zhang