The Case for Invariant-Based Concurrency Control

pbailis
January 05, 2015

Transcript

  2. THE CASE FOR INVARIANT-BASED CONCURRENCY CONTROL. Peter Bailis, UC Berkeley,

    with Alan Fekete, Mike Franklin, Ali Ghodsi, Ion Stoica, Joe Hellerstein. CIDR 2015 Gong Show, 5 January 2015, Pacific Grove, CA
  4. Serializability supported? [VLDB 2014]

    | Database | Serializability supported? |
    |---|---|
    | Actian Ingres | YES |
    | Aerospike | NO |
    | Persistit | NO |
    | Clustrix | NO |
    | Greenplum | YES |
    | IBM DB2 | YES |
    | IBM Informix | YES |
    | MySQL | YES |
    | MemSQL | NO |
    | MS SQL Server | YES |
    | NuoDB | NO |
    | Oracle 11G | NO |
    | Oracle BDB | YES |
    | Oracle BDB JE | YES |
    | Postgres 9.2.2 | YES* |
    | SAP Hana | NO |
    | ScaleDB | NO |
    | VoltDB | YES |

    8/18 databases surveyed didn’t support serializability (SAP HANA among them); 15/18 used weak models by default
  6. READ COMMITTED [Atul Adya’s Ph.D, 1999]

    G0: Write Cycles. A history H exhibits phenomenon G0 if DSG(H) contains a directed cycle consisting entirely of write-dependency edges.

    G1a: Aborted Reads. A history H shows phenomenon G1a if it contains an aborted transaction T1 and a committed transaction T2 such that T2 has read some object (maybe via a predicate) modified by T1.

    G1b: Intermediate Reads. A history H shows phenomenon G1b if it contains a committed transaction T2 that has read a version of object x (maybe via a predicate) written by transaction T1 that was not T1’s final modification of x.

    G1c: Circular Information Flow. A history H exhibits phenomenon G1c if DSG(H) contains a directed cycle consisting entirely of dependency edges.

    Highly nuanced, very technical, sometimes incomplete!
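
The G0 and G1c definitions on this slide both reduce to finding a cycle in the direct serialization graph DSG(H) restricted to certain edge kinds. The following is a minimal sketch of that check, not Adya's formalism in full: it assumes the DSG is already available as a list of `(from_txn, to_txn, kind)` triples, where the labels `"ww"` (write-dependency), `"wr"` (read-dependency) and `"rw"` (anti-dependency) are names chosen here for illustration. G1a and G1b need the full history and are not covered.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    state = {}  # node -> "visiting" while on the DFS stack, "done" afterwards

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":  # back edge: cycle found
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)


def exhibits_g0(dsg):
    """G0: a cycle made only of write-dependency (ww) edges."""
    return has_cycle((s, d) for s, d, kind in dsg if kind == "ww")


def exhibits_g1c(dsg):
    """G1c: a cycle made only of dependency edges (ww or wr)."""
    return has_cycle((s, d) for s, d, kind in dsg if kind in ("ww", "wr"))


# Hypothetical graphs, hand-built for illustration.
dirty_write = [("T1", "T2", "ww"), ("T2", "T1", "ww")]
circular_flow = [("T1", "T2", "wr"), ("T2", "T1", "wr")]
lost_update = [("T1", "T2", "ww"), ("T2", "T1", "rw")]

print(exhibits_g0(dirty_write), exhibits_g1c(circular_flow), exhibits_g1c(lost_update))
```

In this sketch the third graph contains a cycle only through an anti-dependency edge, so neither check flags it. That is a small instance of why reasoning at this level is hard for application developers.
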
  8. It is insane to assume users can/should reason about weak isolation… a fate worse than death

    …and yet they still use it!
  11. Invariants: “usernames should be unique” “each patient should have an attending doctor” “account balances should be positive”

    1.) Are easier to reason about than weak isolation

    2.) Are already specified in many applications
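
The three example invariants on this slide can each be written as a plain predicate over database state, which is part of what makes them easier to reason about than isolation anomalies. The sketch below is illustrative only: the `State` class, its fields, and the function names are assumptions made here, not anything from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical in-memory stand-in for application data."""
    usernames: list = field(default_factory=list)
    attending: dict = field(default_factory=dict)  # patient -> doctor (or None)
    balances: dict = field(default_factory=dict)   # account -> balance


def usernames_unique(db):
    return len(db.usernames) == len(set(db.usernames))


def every_patient_has_attending(db):
    return all(doctor is not None for doctor in db.attending.values())


def balances_positive(db):
    # Literal reading of the slide: strictly greater than zero.
    return all(balance > 0 for balance in db.balances.values())


INVARIANTS = [usernames_unique, every_patient_has_attending, balances_positive]


def violated(db):
    """Return the names of the invariants that do not hold in db."""
    return [inv.__name__ for inv in INVARIANTS if not inv(db)]


ok = State(["ann", "bob"], {"p1": "dr_a"}, {"acct1": 10})
bad = State(["ann", "ann"], {"p1": None}, {"acct1": -5})
print(violated(ok))
print(violated(bad))
```
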
  16. 67 projects; 1.77M LoC; 1957 tables. 9986 total; avg. 5.1 per table. 259 total; avg. 0.13 per table. 39.2x more common! [Ask for draft; or interview me]

    Projects: adopt-a-hydrant, alchemy_cms, amahi, bostonrb, boxroom, brevidy, browsercms, bucketwise, calagator, canvas-lms, carter, chiliproject, citizenry, comas, comfortable-mexican-sofa, communityengine, copycopter-server, danbooru, diaspora, discourse, enki, fat_free_crm, fedena, forem, fulcrum, gitlab-ci, gitlabhq, govsgo, heaven, inkwell, insoshi, jobsworth, juvia, kandan, linuxfr.org, lobsters, lovd-by-less, nimbleshop, obtvse, onebody, opal, opencongress, opengovernment, openproject, piggybak, publify, radiant, railscollab, redmine, refinerycms, ror_ecommerce, rucksack, saasy, salor-retail, selfstarter, sharetribe, skyline, spot-us, spree, sprintapp, squaresquash, sugar, teambox, tracks, tryshoppe, wallgig, zena
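
The 39.2x figure can be checked against the other numbers on this slide. A short worked calculation, assuming both averages are taken over the same 1957 tables:

```latex
\begin{align*}
\frac{9986}{1957} &\approx 5.10, &
\frac{259}{1957} &\approx 0.13, \\
\frac{5.1}{0.13} &\approx 39.2, &
\frac{9986}{259} &\approx 38.6.
\end{align*}
```

Dividing the rounded per-table averages reproduces 39.2, while dividing the raw totals gives roughly 38.6, so the slide's ratio appears to be computed from the rounded averages.
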
  20. DB supported invariants today:

    | Invariant | Supported? |
    |---|---|
    | Foreign Key Constraints | YES |
    | Primary Key Constraints | YES |
    | Row-Level Check Constraints | YES |
    | Multi-Row Check Constraints | NO |
    | Generic ADT Invariants | NO |
    | UDF Invariants | NO |

    & little support for distributing, suggesting, mining invariants
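
The three "YES" rows can be demonstrated with any SQL engine. The sketch below uses Python's built-in `sqlite3` module purely as a convenient stand-in; the schema (`doctors`, `patients`, `accounts`) and the `rejected` helper are hypothetical names introduced here. SQLite does not allow subqueries inside `CHECK`, so an invariant spanning several rows cannot be declared this way, which lines up with the "NO" for multi-row checks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE doctors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE patients (
    id INTEGER PRIMARY KEY,
    attending_id INTEGER NOT NULL REFERENCES doctors(id)   -- foreign key
);
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance > 0)           -- row-level check
);
INSERT INTO doctors (id, name) VALUES (1, 'Dr. A');
""")


def rejected(sql):
    """Return True if the database refuses the statement as an integrity violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


pk_violation = rejected("INSERT INTO doctors (id, name) VALUES (1, 'Dr. B')")
fk_violation = rejected("INSERT INTO patients (id, attending_id) VALUES (1, 99)")
check_violation = rejected("INSERT INTO accounts (id, balance) VALUES (1, -5)")
valid_insert = rejected("INSERT INTO patients (id, attending_id) VALUES (2, 1)")
print(pk_violation, fk_violation, check_violation, valid_insert)
```
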
  21. Invariants:

    1.) Are easier to reason about than weak isolation

    2.) Are already specified in many applications

    3.) Should be a first-class database primitive

    4.) Enable more efficient systems design
  24. Scale to over 25x prior best on New-Order; 6-11x faster than ACID/serializability on New-Order

    [Figures, TPC-C: total throughput (txn/s) and per-server throughput (txn/s/server) against number of servers; throughput (txns/s) against number of warehouses. Series: Coordination-Avoiding, Serializable (2PL)]
  26. Invariants:

    1.) Are easier to reason about than weak isolation

    2.) Are already specified in many applications

    3.) Should be a first-class database primitive

    4.) Enable more efficient systems design

    We can do so much better than weak isolation
  27. Image Credits: world by Wayne Tyler Sall; surprised by Julian Deveaux; database by Austin Condiff; man by Simon Child; all from the Noun Project

    Creative Commons - Attribution (CC BY 3.0)