Slide 1

Slide 1 text

COORDINATION AVOIDANCE IN DISTRIBUTED DATABASES
PETER BAILIS
Stanford, bailis.org
2017 ACM SIGMOD Jim Gray Award Talk
Chicago, IL, May 2017

Slide 2

Slide 2 text

How should we design database systems that enable new applications to scale? “post on timeline” “accept friend request”

Slide 3

Slide 3 text

CLASSIC:
 ACID

Slide 4

Slide 4 text

CLASSIC:
 ACID serializable transactions “accept friend request” “post on timeline”

Slide 5

Slide 5 text

CLASSIC:
 ACID serializable transactions

Slide 6

Slide 6 text

Problem: Serializability requires Coordination
transactions cannot make progress independently
Two-Phase Locking, Optimistic Concurrency Control, Pre-Scheduling, Multi-Version Concurrency Control
Blocking, Waiting, Aborts

Slide 7

Slide 7 text

Problem: Serializability requires Coordination
transactions cannot make progress independently
133.7+ ms RTT (7.5/s)
Well-known for decades, but…
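The 7.5/s figure follows from the round-trip time if we assume, as a simplification not stated on the slide, that conflicting transactions run one at a time and each needs at least one round trip:

```latex
\text{throughput} \;\le\; \frac{1}{\text{RTT}} \;=\; \frac{1}{0.1337\ \text{s}} \;\approx\; 7.5\ \text{transactions/s}
```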

Slide 8

Slide 8 text

Mid 2000s Internet NoSQL

Slide 9

Slide 9 text

NoSQL
Major focus: coordination-free execution, or guaranteed response from every replica
Benefits: Availability, Low latency, Perfect horizontal scalability

Slide 10

Slide 10 text

Major focus: coordination-free execution, or guaranteed response from every replica
Benefits: Availability, Low latency, Perfect horizontal scalability
Cost: rarely guarantee application safety properties

Slide 11

Slide 11 text

Serializability (“ACID”): COORDINATION REQUIRED, GUARANTEED SAFETY
Eventual Consistency (“NoSQL”): COORDINATION FREE, NO SAFETY
COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
THESIS WORK: What is the coordination cost of a given safety guarantee? How do we achieve the minimum?

Slide 12

Slide 12 text

Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
Eventual Consistency: COORDINATION FREE, NO SAFETY
COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
MORE APP SEMANTICS, MORE SAFETY
Atomic Visibility: SIGMOD14
Database Constraints: VLDB15, SIGMOD15
Weak Isolation: HotOS13, VLDB14
Causality: SOCC12, SIGMOD13
PBS: VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 13

Slide 13 text

COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
Data Serving and Transactions
Atomic Visibility: SIGMOD14
Database Constraints: VLDB15, SIGMOD15
Weak Isolation: HotOS13, VLDB14
Causality: SOCC12, SIGMOD13
Analytics
Model Prediction and Training: CIDR15, LearningSys15

Slide 14

Slide 14 text

Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
Eventual Consistency: COORDINATION FREE, NO SAFETY
COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
MORE APP SEMANTICS, MORE SAFETY
Atomic Visibility: SIGMOD14
Database Constraints: VLDB15, SIGMOD15
Model Prediction and Training: CIDR15, LearningSys15
Weak Isolation: HotOS13, VLDB14
Causality: SOCC12, SIGMOD13
PBS: VLDB12, VLDBJ14, SIGMOD13, CACM14
COORDINATION FREE

Slide 15

Slide 15 text

The Far Side, Gary Larson

Slide 16

Slide 16 text

WHAT THE APPLICATION SAYS: “post on timeline”, “accept friend request”
WHAT THE DATABASE HEARS: write read write read write write read write write write read write read read read read read read

Slide 17

Slide 17 text

(Abridged) Related Work
» Semantics-based concurrency control: esp. commutativity and CALM analysis, laws of order
» Available storage systems: optimistic replication, causal memory, CRDTs, eventually consistent transactions
» Distributed computing: CAP, FLP, NBAC, quorums
» Here: focus on necessary coordination for common, modern data-intensive apps

Slide 18

Slide 18 text

Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
Eventual Consistency: COORDINATION FREE, NO SAFETY
COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
MORE SEMANTICS, MORE SAFETY
Atomic Visibility: SIGMOD14
Database Constraints: VLDB15, SIGMOD15
Weak Isolation: HotOS13, VLDB14
Causality: SOCC12, SIGMOD13
PBS: VLDB12, VLDBJ14, SIGMOD13, CACM14
Model Prediction and Training: CIDR15, LearningSys15

Slide 19

Slide 19 text

only 3/18 serializable by default
only 10/18 provide serializability at all
[VLDB 2014]

Slide 20

Slide 20 text

many RDBMSs don’t provide serializability?!?
does this weak isolation require coordination?

Slide 21

Slide 21 text

Highly Available Transactions
Example: Read Committed (RC)
Informal: no dirty reads
Transactions conceal writes until commit
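A minimal sketch of the "conceal writes until commit" idea, assuming a single in-memory replica and hypothetical `Replica` and `Transaction` classes (neither name comes from the talk). Each transaction buffers its writes locally, so no other transaction can read uncommitted data and no locks or cross-replica messages are needed:

```python
class Replica:
    """Hypothetical replica that stores only committed values."""

    def __init__(self):
        self.committed = {}


class Transaction:
    """Buffers writes locally so other transactions never see dirty data."""

    def __init__(self, replica):
        self.replica = replica
        self.buffer = {}  # uncommitted writes, private to this transaction

    def write(self, key, value):
        self.buffer[key] = value

    def read(self, key):
        # A transaction sees its own writes, otherwise committed state only.
        if key in self.buffer:
            return self.buffer[key]
        return self.replica.committed.get(key)

    def commit(self):
        # Writes become visible only at commit time.
        self.replica.committed.update(self.buffer)
        self.buffer = {}

    def abort(self):
        # Discarding the buffer leaves no trace in committed state.
        self.buffer = {}


replica = Replica()
t1, t2 = Transaction(replica), Transaction(replica)
t1.write("x", 1)
dirty = t2.read("x")      # t1 has not committed, so t2 sees nothing
t1.commit()
clean = t2.read("x")      # now the committed value is visible
```

This sketch only illustrates the absence of dirty reads; it makes no claim about atomic visibility or about how any particular system implements RC.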

Slide 22

Slide 22 text

Existing Database Isolation
Session Guarantees
Distributed Registers

Slide 23

Slide 23 text

[Figure: isolation and consistency models classified as Highly Available, Sticky Available, or Unavailable]
Legend: prevents lost update†, prevents write skew‡, requires recency guarantees⊕
[VLDB 2014]

Slide 24

Slide 24 text

Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
Eventual Consistency: COORDINATION FREE, NO SAFETY
COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
MORE SEMANTICS, MORE SAFETY
Atomic Visibility: SIGMOD14
Database Constraints: VLDB15, SIGMOD15
Weak Isolation: HotOS13, VLDB14
Causality: SOCC12, SIGMOD13
PBS: VLDB12, VLDBJ14, SIGMOD13, CACM14
Model Prediction and Training: CIDR15, LearningSys15

Slide 25

Slide 25 text

Typical database constraints and operations (SQL)
Constraint              Operation
Equality, Inequality    Any
Generate unique ID      Any
Specify unique ID       Insert
>                       Increment
>                       Decrement
<                       Decrement
<                       Increment
Foreign Key             Insert
Foreign Key             Delete
Secondary Indexing      Any
Materialized Views      Any
AUTO_INCREMENT          Insert

Slide 26

Slide 26 text

ICT: Invariant Confluence Test
Key idea: Check if constraints can be violated by “merging” independent operations
CONSTRAINT: User IDs are unique
OPERATION: Add users
MERGE: Set union
Start: {}
Independent operations: add {Stu,ID=1}, add {Ann,ID=1}
After MERGE: {{Stu,ID=1}, {Ann,ID=1}}
Constraint violated!

Slide 27

Slide 27 text

ICT: Invariant Confluence Test
Key idea: Check if constraints can be violated by “merging” independent operations
CONSTRAINT: User IDs are positive
OPERATION: Add users
MERGE: Set union
Start: {}
Independent operations: add {Stu,ID=1}, add {Ann,ID=1}
After MERGE: {{Stu,ID=1}, {Ann,ID=1}}
Constraint holds!
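The two examples (unique IDs vs. positive IDs) can be replayed with a small sketch. The function names `ids_unique`, `ids_positive`, `merge`, and `passes_ict` are illustrative assumptions, and the check is bounded: it only tries one-step divergences from a single start state, so it can find a violation but cannot prove invariant confluence in general.

```python
from itertools import combinations


def ids_unique(db):
    """Constraint: no two users share an ID."""
    ids = [uid for _, uid in db]
    return len(ids) == len(set(ids))


def ids_positive(db):
    """Constraint: every user ID is positive."""
    return all(uid > 0 for _, uid in db)


def merge(a, b):
    """Merge operator from the slides: set union."""
    return a | b


def passes_ict(invariant, start, operations):
    """Bounded check: run each pair of adds independently, then merge."""
    for op_a, op_b in combinations(operations, 2):
        left = start | {op_a}    # one replica applies op_a
        right = start | {op_b}   # another replica applies op_b
        both_valid = invariant(left) and invariant(right)
        if both_valid and not invariant(merge(left, right)):
            return False         # valid states merged into an invalid one
    return True


adds = [("Stu", 1), ("Ann", 1)]
print(passes_ict(ids_unique, frozenset(), adds))    # False
print(passes_ict(ids_positive, frozenset(), adds))  # True
```

Each replica's state is valid on its own in both cases; only the uniqueness constraint breaks once the states are merged.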

Slide 28

Slide 28 text

ICT: Invariant Confluence Test
Key idea: Check if constraints can be violated by “merging” independent operations
Theorem. A globally I-valid system can execute a set of transactions T with coordination-freedom, transactional availability, and convergence if and only if T is I-confluent with respect to I. [VLDB 2015]
ICT ⟺ safe, coordination-free execution possible
OUR CONTRIBUTION: Generalizes classic partitioning-based indistinguishability arguments
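For reference, the condition the theorem relies on can be written roughly as follows. This is a paraphrase rather than the paper's exact wording; the symbols $D_i$, $D_j$ (replica states) and $\sqcup$ (the merge operator) are notation chosen here:

```latex
T \text{ is } I\text{-confluent} \iff
\forall D_i, D_j \text{ reachable from a common ancestor via valid executions of } T:\quad
I(D_i) \wedge I(D_j) \implies I(D_i \sqcup D_j)
```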

Slide 29

Slide 29 text

Typical database constraints and operations (SQL)
Under set merge
Constraint              Operation    OK?
Equality, Inequality    Any          ???
Generate unique ID      Any          ???
Specify unique ID       Insert       ???
>                       Increment    ???
>                       Decrement    ???
<                       Decrement    ???
<                       Increment    ???
Foreign Key             Insert       ???
Foreign Key             Delete       ???
Secondary Indexing      Any          ???
Materialized Views      Any          ???
AUTO_INCREMENT          Insert       ???

Slide 30

Slide 30 text

Typical database constraints and operations (SQL)
Under set merge [VLDB 2015]
Constraint              Operation    OK?
Equality, Inequality    Any          Y
Generate unique ID      Any          Y
Specify unique ID       Insert       N
>                       Increment    Y
>                       Decrement    N
<                       Decrement    Y
<                       Increment    N
Foreign Key             Insert       Y
Foreign Key             Delete       Y*
Secondary Indexing      Any          Y
Materialized Views      Any          Y
AUTO_INCREMENT          Insert       N
RAMP [SIGMOD 2014]
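The ">" rows can be illustrated with a counter. The sketch below assumes one possible encoding that the slides do not spell out: a counter is a set of uniquely tagged deltas, its value is their sum, and merge is set union. The names `value`, `above_zero`, and `holds_after_merge` are hypothetical.

```python
def value(state):
    """Counter value: sum of all deltas recorded in the state."""
    return sum(delta for _, delta in state)


def above_zero(state):
    """Constraint: counter > 0."""
    return value(state) > 0


def holds_after_merge(invariant, start, op_a, op_b):
    """Apply two operations on separate replicas, then merge by set union.

    Returns False only when both replicas were valid but the merge is not.
    """
    left = start | {op_a}
    right = start | {op_b}
    both_valid = invariant(left) and invariant(right)
    return (not both_valid) or invariant(left | right)


start = frozenset({("init", 2)})  # counter starts at 2

# Two independent increments: merged value is 4, constraint still holds.
increments_ok = holds_after_merge(above_zero, start, ("a", +1), ("b", +1))

# Two independent decrements: each replica sees 1, but the merge yields 0.
decrements_ok = holds_after_merge(above_zero, start, ("a", -1), ("b", -1))
```

A single passing case does not prove that increments are always safe; it only shows the divergence scenario in which decrements fail and increments do not.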

Slide 31

Slide 31 text

adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig

Slide 32

Slide 32 text

CONSTRAINTS INCREDIBLY COMMON
adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena
67 projects, 1.77M LoC, 1957 tables
9986 total; avg. 5.1 per table
86.9% PASS ICT
[SIGMOD 2015]

Slide 33

Slide 33 text

TPC-C
14/16 CONSTRAINTS PASS ICT
scale to over 25x best listed result
6-11x faster than ACID/serializability
[Figure: Total Throughput (txn/s) and Throughput (txn/s/server) vs. Number of Servers]
[Figure: Throughput (txns/s) vs. Number of Warehouses, Coordination-Avoiding vs. Serializable (2PL)]

Slide 34

Slide 34 text

Serializability: COORDINATION REQUIRED, GUARANTEED SAFETY
Eventual Consistency: COORDINATION FREE, NO SAFETY
COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
MORE SEMANTICS, MORE SAFETY
Atomic Visibility: SIGMOD14
Database Constraints: VLDB15, SIGMOD15
Model Prediction and Training: CIDR15, TBA
Weak Isolation: HotOS13, VLDB14
Causality: SOCC12, SIGMOD13
PBS: VLDB12, VLDBJ14, SIGMOD13, CACM14
COORDINATION FREE

Slide 35

Slide 35 text

Unruly developers are fantastic inspiration
• As applications have evolved, so have their database demands and desired semantics
• Our opportunity: build systems that implement the semantics users want (not just what we want)

Slide 36

Slide 36 text

• Mounting evidence: many programmers don’t use transactions correctly (or at all!)
• Need not despair: opportunity for new theory and systems
ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications [SIGMOD17]

Slide 37

Slide 37 text

COADVISORS: Ali Ghodsi, Joe Hellerstein, Ion Stoica
KEY COLLABORATORS, MENTORS: Alan Fekete, Mike Franklin

Slide 38

Slide 38 text

Also many thanks to a host of phenomenal colleagues and collaborators: Peter Alvaro, Neil Conway, Shivaram Venkataraman, Joey Gonzalez, Haoyuan Li, Zhao Zhang, Aaron Davidson, Mike Jordan
and a cast of unruly developers and renegades: Michael R. Bernstein, Rick Branson, Mark Callaghan, Adrian Colyer, Sean Cribbs, Jonathan Ellis, Alex Feinberg, Andy Gross, Coda Hale, Colin Jones, Evan Jones, Kyle Kingsbury, Adam Marcus, Caitie McCaffrey, Christopher Meiklejohn, Mike Miller, Jeremiah Peschka, Mark Phillips, Henry Robinson, Mehul Shah, Xavier Shay, Justin Sheehy, Ines Sombra, Kelly Sommers, Sriram Srinivasan

Slide 39

Slide 39 text

Eventual Consistency: COORDINATION FREE, NO SAFETY
COORDINATION AVOIDANCE: GUARANTEED SAFETY WITHOUT COORDINATION
MORE APP SEMANTICS, MORE SAFETY
Atomic Visibility: SIGMOD14
Database Constraints: VLDB15, SIGMOD15
Model Prediction and Training: CIDR15, TBA
Weak Isolation: HotOS13, VLDB14
Causality: SOCC12, SIGMOD13
PBS: VLDB12, VLDBJ14, SIGMOD13, CACM14
COORDINATION FREE
Joint work with Ali Ghodsi, Joe Hellerstein, Ion Stoica, Mike Franklin, Michael Jordan, Alan Fekete, Dan Crankshaw, Shivaram Venkataraman, Neil Conway, Peter Alvaro, Aaron Davidson, Joey Gonzalez, Kyle Kingsbury, Haoyuan Li, and Zhao Zhang