Slide 1

Slide 1 text

COORDINATION AVOIDANCE IN DISTRIBUTED DATABASES
Peter Bailis, UC Berkeley

Slide 3

Slide 3 text

DATA TODAY:

Slide 4

Slide 4 text

SCALE DATA TODAY: UNPRECEDENTED

Slide 5

Slide 5 text

DATA TODAY: UNPRECEDENTED
SCALE: Billion-user Internet services; 3B Internet users in 2014; 2.3B mobile broadband users
Sources: Ericsson Mobility Report, UN International Telecommunication Union, Facebook, Google, NSA

Slide 6

Slide 6 text

DATA TODAY: UNPRECEDENTED
SCALE: Billion-user Internet services; 3B Internet users in 2014; 2.3B mobile broadband users
VOLUME: Facebook RocksDB: 9B ops/sec; Google BigTable: 600M ops/sec; LinkedIn Kafka: 2.5M ops/sec
Sources: Ericsson Mobility Report, UN International Telecommunication Union, Facebook, Google, NSA, @RocksDB, @AKPurtell, Martin Kleppmann

Slide 7

Slide 7 text

DATA TODAY: UNPRECEDENTED
SCALE: Billion-user Internet services; 3B Internet users in 2014; 2.3B mobile broadband users
VOLUME: Facebook RocksDB: 9B ops/sec; Google BigTable: 600M ops/sec; LinkedIn Kafka: 2.5M ops/sec
INTERACTIVITY: Impatient users want low latency; always-on responsiveness; personalized user experiences
Sources: Ericsson Mobility Report, UN International Telecommunication Union, Facebook, Google, NSA, @RocksDB, @AKPurtell, Martin Kleppmann

Slide 8

Slide 8 text

SCALE VOLUME INTERACTIVITY DATA TODAY: UNPRECEDENTED

Slide 9

Slide 9 text

SCALE VOLUME INTERACTIVITY AND GROWING! DATA TODAY: UNPRECEDENTED

Slide 12

Slide 12 text

“post on timeline” “accept friend request”

Slide 13

Slide 13 text

How should we design database systems that enable applications to scale? “post on timeline” “accept friend request”

Slide 15

Slide 15 text

CLASSIC:
 ACID

Slide 16

Slide 16 text

CLASSIC:
 ACID serializable transactions “accept friend request” “post on timeline”

Slide 17

Slide 17 text

CLASSIC:
 ACID serializable transactions “accept friend request” “post on timeline”

Slide 18

Slide 18 text

CLASSIC:
 ACID serializable transactions

Slide 19

Slide 19 text

serializability: equivalence to some serial execution

Slide 20

Slide 20 text

“post on timeline” serializability: equivalence to some serial execution

Slide 21

Slide 21 text

“post on timeline” “accept friend request” serializability: equivalence to some serial execution

Slide 22

Slide 22 text

“post on timeline” “accept friend request” serializability: equivalence to some serial execution very general!

Slide 23

Slide 23 text

r(y) w(x←1) r(x) w(y←1) very general! serializability: equivalence to some serial execution

Slide 24

Slide 24 text

r(y) w(x←1) r(x) w(y←1) very general! …but restricts concurrency serializability: equivalence to some serial execution

Slide 25

Slide 25 text

serializability: equivalence to some serial execution very general! …but restricts concurrency

Slide 26

Slide 26 text

serializability: equivalence to some serial execution very general! …but restricts concurrency CONCURRENT EXECUTION

Slide 27

Slide 27 text

serializability: equivalence to some serial execution r(x)=0 very general! …but restricts concurrency CONCURRENT EXECUTION

Slide 28

Slide 28 text

serializability: equivalence to some serial execution r(x)=0 r(y)=0 very general! …but restricts concurrency CONCURRENT EXECUTION

Slide 29

Slide 29 text

serializability: equivalence to some serial execution r(x)=0 w(y←1) r(y)=0 very general! …but restricts concurrency CONCURRENT EXECUTION

Slide 30

Slide 30 text

serializability: equivalence to some serial execution r(x)=0 w(x←1) w(y←1) r(y)=0 very general! …but restricts concurrency CONCURRENT EXECUTION

Slide 31

Slide 31 text

serializability: equivalence to some serial execution r(x)=0 w(x←1) w(y←1) r(y)=0 very general! …but restricts concurrency r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 CONCURRENT EXECUTION

Slide 32

Slide 32 text

serializability: equivalence to some serial execution r(x)=0 w(x←1) w(y←1) r(y)=0 very general! …but restricts concurrency Should have r(y)→1 r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 CONCURRENT EXECUTION

Slide 34

Slide 34 text

serializability: equivalence to some serial execution r(x)=0 w(x←1) w(y←1) r(y)=0 very general! …but restricts concurrency Should have r(y)→1 r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 r(y)=0 w(x←1) 1 r(x)=0 w(y←1) 2 CONCURRENT EXECUTION

Slide 35

Slide 35 text

serializability: equivalence to some serial execution r(x)=0 w(x←1) w(y←1) r(y)=0 very general! …but restricts concurrency Should have r(y)→1 r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 Should have r(x)→1 r(y)=0 w(x←1) 1 r(x)=0 w(y←1) 2 CONCURRENT EXECUTION

Slide 37

Slide 37 text

serializability: equivalence to some serial execution r(x)=0 w(x←1) w(y←1) r(y)=0 very general! …but restricts concurrency Should have r(y)→1 r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 Should have r(x)→1 r(y)=0 w(x←1) 1 r(x)=0 w(y←1) 2 CONCURRENT EXECUTION IS NOT SERIALIZABLE!
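The claim on this slide can be checked by brute force. The sketch below is a minimal illustration in Python, not something from the talk: the labels `T1` and `T2`, the helper names, and the choice to compare only the values returned by reads are all assumptions made for the example. It runs the two transactions in every serial order and asks whether any order reproduces the reads seen in the concurrent execution.

```python
from itertools import permutations

# Each transaction is a list of operations: ("r", key) or ("w", key, value).
# T1 and T2 are illustrative labels for the two transactions on the slide.
T1 = [("r", "y"), ("w", "x", 1)]
T2 = [("r", "x"), ("w", "y", 1)]


def run_serial(order, initial):
    """Run whole transactions one after another; return what each read observes."""
    db = dict(initial)
    observed = {}
    for name, ops in order:
        for op in ops:
            if op[0] == "r":
                observed[(name, op[1])] = db[op[1]]
            else:
                db[op[1]] = op[2]
    return observed


def is_serializable(transactions, initial, observed):
    """True if some serial order yields exactly the observed read values."""
    return any(
        run_serial(order, initial) == observed
        for order in permutations(transactions.items())
    )


transactions = {"T1": T1, "T2": T2}
initial = {"x": 0, "y": 0}

# The concurrent execution from the slide: both reads return 0.
concurrent_reads = {("T1", "y"): 0, ("T2", "x"): 0}
print(is_serializable(transactions, initial, concurrent_reads))  # False
```

In either serial order the second transaction must read the first one's write, so no order returns 0 for both reads.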

Slide 38

Slide 38 text

serializability: equivalence to some serial execution r(x)=0 w(x←1) w(y←1) r(y)=0 very general! …but restricts concurrency transactions cannot make progress independently Serializability requires Coordination Should have r(y)→1 r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 Should have r(x)→1 r(y)=0 w(x←1) 1 r(x)=0 w(y←1) 2 CONCURRENT EXECUTION IS NOT SERIALIZABLE!

Slide 39

Slide 39 text

transactions cannot make progress independently Serializability requires Coordination

Slide 40

Slide 40 text

transactions cannot make progress independently Serializability requires Coordination Two-Phase Locking Optimistic Concurrency Control Pre-Scheduling Multi-Version Concurrency Control

Slide 41

Slide 41 text

transactions cannot make progress independently Serializability requires Coordination Two-Phase Locking Optimistic Concurrency Control Pre-Scheduling Multi-Version Concurrency Control Blocking Waiting Aborts

Slide 42

Slide 42 text

transactions cannot make progress independently Serializability requires Coordination Two-Phase Locking Optimistic Concurrency Control Pre-Scheduling Multi-Version Concurrency Control Blocking Waiting Aborts Costs of Coordination Between Concurrent Transactions

Slide 43

Slide 43 text

1. Decreased performance transactions cannot make progress independently Serializability requires Coordination Two-Phase Locking Optimistic Concurrency Control Pre-Scheduling Multi-Version Concurrency Control Blocking Waiting Aborts Costs of Coordination Between Concurrent Transactions

Slide 47

Slide 47 text

Figure: maximum throughput (txns/s) versus number of servers in transaction (2 to 8), for conflicting transactions. Local datacenter (Amazon EC2); based on [Bobtail, Xu et al., NSDI 13].

Slide 49

Slide 49 text

Figure (two panels), for conflicting transactions.
First panel: maximum throughput (txns/s) versus number of servers in transaction (2 to 8). Local datacenter (Amazon EC2); based on [Bobtail, Xu et al., NSDI 13].
Second panel: maximum throughput (txn/s) versus participating datacenters (+VA): +OR, +CA, +IR, +SP, +TO, +SI, +SY. Multi-datacenter (Amazon EC2); based on [HAT, Bailis et al., VLDB 14].

Slide 52

Slide 52 text

Serializability requires Coordination: transactions cannot make progress independently
Costs of Coordination Between Concurrent Transactions
1. Decreased performance
   » due to waiting, communication delays, aborts
   » exacerbated in distributed environment!
2. Decreased availability during failures

Slide 56

Slide 56 text

Serializability requires Coordination: transactions cannot make progress independently
Costs of Coordination Between Concurrent Transactions
1. Decreased performance
   » due to waiting, communication delays, aborts
   » exacerbated in distributed environment!
2. Decreased availability during failures
Well-known for decades; cf. “CAP”
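A back-of-the-envelope sketch of why communication delay caps throughput for conflicting transactions. It assumes, purely for illustration, that conflicting transactions execute one at a time and that each occupies the contended data for a fixed number of network round trips. The function name and the round-trip times below are hypothetical and are not the measurements behind the figures in this talk.

```python
def max_conflicting_throughput(round_trip_ms: float, round_trips: int = 1) -> float:
    """Upper bound in txns/s when conflicting transactions run one at a time.

    Assumption: each transaction holds the contended item for `round_trips`
    network round trips of `round_trip_ms` milliseconds each.
    """
    if round_trip_ms <= 0 or round_trips <= 0:
        raise ValueError("round_trip_ms and round_trips must be positive")
    return 1000.0 / (round_trip_ms * round_trips)


# Illustrative (hypothetical) round-trip times: within one datacenter
# versus between distant datacenters.
for rtt_ms in (0.5, 100.0):
    print(rtt_ms, "ms ->", max_conflicting_throughput(rtt_ms), "txns/s")
```

Under these assumptions the bound depends only on latency, so adding servers does not raise it.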

Slide 57

Slide 57 text

How should we design database systems that enable applications to scale?

Slide 58

Slide 58 text

Serializability COORDINATION REQUIRED How should we design database systems that enable applications to scale?

Slide 59

Slide 59 text

Serializability COORDINATION REQUIRED “NoSQL” COORDINATION FREE How should we design database systems that enable applications to scale?

Slide 60

Slide 60 text

NoSQL

Slide 61

Slide 61 text

NoSQL

Slide 63

Slide 63 text

Eventual Consistency “if no new updates are made to the [database], eventually all accesses will return the last updated value[s]” — Werner Vogels, Amazon CTO

Slide 67

Slide 67 text

Eventual Consistency “if no new updates are made to the [database], eventually all accesses will return the last updated value[s]” — Werner Vogels, Amazon CTO provides no safety: what happens in the meantime?

Slide 68

Slide 68 text

[VLDB 2012, VLDB Journal 2014 “Best of VLDB 2012”, SIGMOD 2013 (Demo), CACM Research Highlight] Probabilistically Bounded Staleness (PBS)

Slide 69

Slide 69 text

[VLDB 2012, VLDB Journal 2014 “Best of VLDB 2012”, SIGMOD 2013 (Demo), CACM Research Highlight] Probabilistically Bounded Staleness (PBS) » Monte Carlo analysis of protocol behavior

Slide 70

Slide 70 text

[VLDB 2012, VLDB Journal 2014 “Best of VLDB 2012”, SIGMOD 2013 (Demo), CACM Research Highlight] Probabilistically Bounded Staleness (PBS) » Monte Carlo analysis of protocol behavior » Key finding: frequently “correct” results…

Slide 71

Slide 71 text

[VLDB 2012, VLDB Journal 2014 “Best of VLDB 2012”, SIGMOD 2013 (Demo), CACM Research Highlight] Probabilistically Bounded Staleness (PBS) » Monte Carlo analysis of protocol behavior » Key finding: frequently “correct” results… PBS: Voldemort Database at LinkedIn 99% of reads return the last update 23ms after write

Slide 72

Slide 72 text

[VLDB 2012, VLDB Journal 2014 “Best of VLDB 2012”, SIGMOD 2013 (Demo), CACM Research Highlight] Probabilistically Bounded Staleness (PBS) » Monte Carlo analysis of protocol behavior » Key finding: frequently “correct” results… PBS: Voldemort Database at LinkedIn 99% of reads return the last update 23ms after write 32-90% decrease in 99.9th percentile latency

Slide 74

Slide 74 text

[VLDB 2012, VLDB Journal 2014 “Best of VLDB 2012”, SIGMOD 2013 (Demo), CACM Research Highlight] Probabilistically Bounded Staleness (PBS) » Monte Carlo analysis of protocol behavior » Key finding: frequently “correct” results… PBS: Voldemort Database at LinkedIn 99% of reads return the last update 23ms after write 32-90% decrease in 99.9th percentile latency …BUT NO GUARANTEES! ⇒ DIFFICULT TO PROGRAM
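To give a feel for what a Monte Carlo staleness analysis looks like, here is a toy sketch. It is not the PBS model from the cited papers. The exponential propagation delays, the single-replica read, and every name and parameter value are assumptions chosen only to keep the example short. It estimates the probability that a read issued `t_ms` after a write returns that write.

```python
import random


def prob_fresh_read(t_ms, n_replicas=3, mean_delay_ms=5.0, trials=20000, seed=0):
    """Estimate P(read at time t_ms after a write observes that write).

    Toy assumptions: the write reaches each replica after an independent
    exponentially distributed delay, and the read contacts one replica
    chosen uniformly at random.
    """
    rng = random.Random(seed)
    fresh = 0
    for _ in range(trials):
        arrival = [rng.expovariate(1.0 / mean_delay_ms) for _ in range(n_replicas)]
        replica = rng.randrange(n_replicas)
        if arrival[replica] <= t_ms:
            fresh += 1
    return fresh / trials


for t in (1.0, 10.0, 50.0):
    print(f"t={t:>5} ms  P(fresh) ~ {prob_fresh_read(t):.3f}")
```

The estimate rises toward 1 as `t_ms` grows but is only a probability, which is the gap the "no guarantees" point on the slide refers to.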

Slide 77

Slide 77 text

“…sometimes the [write] is retrieved from the datastore and sometimes it is not.”

Slide 78

Slide 78 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY

Slide 79

Slide 79 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 80

Slide 80 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 MY WORK:

Slide 81

Slide 81 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 MY WORK:

Slide 82

Slide 82 text

The Far Side, Gary Larson

Slide 84

Slide 84 text

WHAT THE APPLICATION SAYS “post on timeline” “accept friend request”

Slide 85

Slide 85 text

WHAT THE APPLICATION SAYS: “post on timeline” “accept friend request”
WHAT THE DATABASE HEARS: write read write read write write read write write write read write read read read read read read

Slide 87

Slide 87 text

DESIGN DATABASE SYSTEMS THAT EXPLOIT SEMANTICS OF HIGH-VALUE USE CASES MY APPROACH:

Slide 88

Slide 88 text

DESIGN DATABASE SYSTEMS THAT EXPLOIT SEMANTICS OF HIGH-VALUE USE CASES MY APPROACH: Study practical database use cases

Slide 89

Slide 89 text

DESIGN DATABASE SYSTEMS THAT EXPLOIT SEMANTICS OF HIGH-VALUE USE CASES MY APPROACH: Study practical database use cases Derive principles and algorithms

Slide 90

Slide 90 text

DESIGN DATABASE SYSTEMS THAT EXPLOIT SEMANTICS OF HIGH-VALUE USE CASES MY APPROACH: Study practical database use cases Derive principles and algorithms Build systems to realize the benefits

Slide 91

Slide 91 text

DESIGN DATABASE SYSTEMS THAT EXPLOIT SEMANTICS OF HIGH-VALUE USE CASES MY APPROACH: Study practical database use cases Derive principles and algorithms Build systems to realize the benefits

Slide 92

Slide 92 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 93

Slide 93 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 94

Slide 94 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 95

Slide 95 text

Causality SOCC12, SIGMOD13 Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 96

Slide 96 text

Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 97

Slide 97 text

Atomic Visibility SIGMOD14 Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 98

Slide 98 text

Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 99

Slide 99 text

Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION

Slide 100

Slide 100 text

Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION Data Serving and Transactions

Slide 101

Slide 101 text

Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION Data Serving and Transactions Model Prediction and Training CIDR15, TBA Analytics

Slide 102

Slide 102 text

Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 103

Slide 103 text

Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14

Slide 104

Slide 104 text

Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 105

Slide 105 text

(Abridged) Related Work

Slide 106

Slide 106 text

(Abridged) Related Work
» Semantics-based concurrency control: esp. commutativity and CALM analysis, laws of order
» Available storage systems: optimistic replication, causal memory, CRDTs, eventually consistent transactions
» Distributed computing: CAP, FLP, NBAC, quorums

Slide 107

Slide 107 text

(Abridged) Related Work
» Semantics-based concurrency control: esp. commutativity and CALM analysis, laws of order
» Available storage systems: optimistic replication, causal memory, CRDTs, eventually consistent transactions
» Distributed computing: CAP, FLP, NBAC, quorums
» Here: focus on necessary coordination for common, modern data-intensive apps

Slide 108

Slide 108 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 109

Slide 109 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE 1

Slide 110

Slide 110 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE 1 2

Slide 111

Slide 111 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE 1 2 3

Slide 112

Slide 112 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE 1

Slide 113

Slide 113 text

Social Graph

Slide 114

Slide 114 text

Social Graph

Slide 115

Slide 115 text

Social Graph Facebook

Slide 116

Slide 116 text

Social Graph 1.2B+ vertices Facebook

Slide 117

Slide 117 text

Social Graph 1.2B+ vertices 420B+ edges Facebook

Slide 118

Slide 118 text

Social Graph 1.2B+ vertices 420B+ edges Facebook

Slide 119

Slide 119 text

Social Graph 1 2 3 4 5 6 User Facebook 1.2B+ vertices 420B+ edges

Slide 120

Slide 120 text

Social Graph (Facebook: 1.2B+ vertices, 420B+ edges)

User  Adjacency List
1     2, 3, 5
2     1, 3, 5
3     1, 5, 6
4     6
5     1, 2, 3, 6
6     3, 4, 5

Slide 121

Slide 121 text

Social Graph (Facebook: 1.2B+ vertices, 420B+ edges)

User  Adjacency List
1     2, 3, 5
2     1, 3, 5
3     1, 5, 6
4     6
5     1, 2, 3, 6
6     3, 4, 5

Slide 122

Slide 122 text

1 2, 3, 5 6 3, 4, 5

Slide 123

Slide 123 text

1 2, 3, 5 6 3, 4, 5

Slide 124

Slide 124 text

1 2, 3, 5 6 3, 4, 5 ,6 ,1

Slide 125

Slide 125 text

User  Adjacency List
1     2, 3, 5, 6  (adding 6)
6     3, 4, 5, 1  (adding 1)

To preserve graph, should observe either:
» Both links
» Neither link

Slide 126

Slide 126 text

User  Adjacency List
1     2, 3, 5, 6  (adding 6)
6     3, 4, 5, 1  (adding 1)

To preserve graph, should observe either:
» Both links
» Neither link

Atomic Visibility
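The "both links or neither link" requirement can be stated as a small check. The sketch below is illustrative only: the dictionary layout and the `link_state` helper are assumptions, not part of the talk. It applies the two writes of the friend request between users 1 and 6 one at a time and reports what a reader would see at each point.

```python
# Adjacency lists for users 1 and 6 before the friend request,
# stored as sets for easy membership tests (an illustrative layout).
adjacency = {1: {2, 3, 5}, 6: {3, 4, 5}}


def link_state(adj, a, b):
    """Classify what a reader sees for the bidirectional link between a and b."""
    forward = b in adj[a]
    backward = a in adj[b]
    if forward and backward:
        return "both"
    if not forward and not backward:
        return "neither"
    return "fractured"  # only one direction visible: the graph is inconsistent


states = [link_state(adjacency, 1, 6)]     # before the request
adjacency[1].add(6)                         # first write of the transaction
states.append(link_state(adjacency, 1, 6))  # a reader arriving now sees half
adjacency[6].add(1)                         # second write of the transaction
states.append(link_state(adjacency, 1, 6))
print(states)  # ['neither', 'fractured', 'both']
```

Atomic visibility asks that no reader ever observes the middle state, without requiring the stronger ordering guarantees of serializability.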

Slide 127

Slide 127 text

Atomic Visibility

Slide 128

Slide 128 text

Atomic Visibility either all or none of each transaction’s updates should be visible to other transactions

Slide 129

Slide 129 text

Atomic Visibility either all or none of each transaction’s updates should be visible to other transactions

Slide 130

Slide 130 text

Atomic Visibility X = 1 WRITE Y = 1 WRITE either all or none of each transaction’s updates should be visible to other transactions

Slide 131

Slide 131 text

Atomic Visibility OR X = 1 READ Y = 1 READ READ X = READ Y = X = 1 WRITE Y = 1 WRITE either all or none of each transaction’s updates should be visible to other transactions

Slide 132

Slide 132 text

Atomic Visibility OR X = 1 READ Y = 1 READ READ X = READ Y = X = 1 WRITE Y = 1 WRITE either all or none of each transaction’s updates should be visible to other transactions

Slide 133

Slide 133 text

Atomic Visibility OR X = 1 READ Y = 1 READ READ X = READ Y = either all or none of each transaction’s updates should be visible to other transactions

Slide 134

Slide 134 text

BUT NOT Atomic Visibility OR X = 1 READ Y = 1 READ READ X = READ Y = either all or none of each transaction’s updates should be visible to other transactions OR X = 1 READ Y = 1 READ READ X = READ Y =

Slide 135

Slide 135 text

BUT NOT Atomic Visibility OR X = 1 READ Y = 1 READ READ X = READ Y = either all or none of each transaction’s updates should be visible to other transactions OR X = 1 READ Y = 1 READ READ X = READ Y = “FRACTURED READS”

Slide 136

Slide 136 text

Atomic Visibility is sufficient to correctly maintain: social graph structure

Slide 137

Slide 137 text

r(x)=0 w(x←1) w(y←1) r(y)=0 Should have r(y)→1 r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 Should have r(x)→1 r(y)=0 w(x←1) 1 r(x)=0 w(y←1) 2 CONCURRENT EXECUTION IS NOT SERIALIZABLE! Atomic Visibility is not serializability!

Slide 138

Slide 138 text

r(x)=0 w(x←1) w(y←1) r(y)=0 Should have r(y)→1 r(y)=0 w(x←1) 2 r(x)=0 w(y←1) 1 Should have r(x)→1 r(y)=0 w(x←1) 1 r(x)=0 w(y←1) 2 CONCURRENT EXECUTION IS NOT SERIALIZABLE! Atomic Visibility is not serializability! …but respects Atomic Visibility!

Slide 139

Slide 139 text

Atomic Visibility compared

                                   Fractured Reads   Item Anti-Dependency Cycles   Anti-Dependency Cycles
Serializability                    Prevents          Prevents                      Prevents
Snapshot Isolation                 Prevents          Prevents                      Doesn’t prevent
Atomic Visibility via Read Atomic  Prevents          Doesn’t prevent               Doesn’t prevent
Eventual Consistency               Doesn’t prevent   Doesn’t prevent               Doesn’t prevent

Slide 140

Slide 140 text

Atomic Visibility compared

                                   Fractured Reads   Item Anti-Dependency Cycles   Anti-Dependency Cycles
Serializability                    Prevents          Prevents                      Prevents
Snapshot Isolation                 Prevents          Prevents                      Doesn’t prevent
Atomic Visibility via Read Atomic  Prevents          Doesn’t prevent               Doesn’t prevent
Eventual Consistency               Doesn’t prevent   Doesn’t prevent               Doesn’t prevent

WANT TO PREVENT

Slide 146

Slide 146 text

Atomic Visibility compared

                                   Fractured Reads   Item Anti-Dependency Cycles   Anti-Dependency Cycles
Serializability                    Prevents          Prevents                      Prevents
Snapshot Isolation                 Prevents          Prevents                      Doesn’t prevent
Atomic Visibility via Read Atomic  Prevents          Doesn’t prevent               Doesn’t prevent
Eventual Consistency               Doesn’t prevent   Doesn’t prevent               Doesn’t prevent

WANT TO PREVENT
Require coordination to prevent! [VLDB 2014]

Slide 149

Slide 149 text

Atomic Visibility is sufficient to correctly maintain: social graph structure

Slide 150

Slide 150 text

Also applies to other relationships

Slide 151

Slide 151 text

Also applies to other relationships an attending doctor should have each patient

Slide 152

Slide 152 text

Atomic Visibility is sufficient to correctly maintain: social graph structure

Slide 153

Slide 153 text

Atomic Visibility is sufficient to correctly maintain: referential integrity secondary indexes materialized views social graph structure

Slide 154

Slide 154 text

Atomic Visibility is sufficient to correctly maintain: referential integrity secondary indexes materialized views despite being weaker than serializability social graph structure

Slide 155

Slide 155 text

Atomic Visibility via Locking

Slide 156

Slide 156 text

Atomic Visibility via Locking X=0 Y=0 X = 1 W Y = 1 W

Slide 157

Slide 157 text

Atomic Visibility via Locking X = 1 W Y = 1 W X=1 Y=1

Slide 158

Slide 158 text

Atomic Visibility via Locking X = 1 R Y = 1 R X = 1 W Y = 1 W X=1 Y=1

Slide 159

Slide 159 text

Atomic Visibility via Locking X = 1 W Y = 1 W Y=0 X=1

Slide 160

Slide 160 text

Atomic Visibility via Locking X = ? R X = 1 W Y = 1 W Y=0 Y = ? R X=1

Slide 161

Slide 161 text

Atomic Visibility via Locking X = ? R X = 1 W Y = 1 W Y=0 Y = ? R X=1 Server 1001 Server 1002

Slide 162

Slide 162 text

Atomic Visibility via Locking X = ? R X = 1 W Y = 1 W Y=0 Y = ? R X=1 Server 1001 Server 1002

Slide 163

Slide 163 text

Atomic Visibility via Locking X = ? R X = 1 W Y = 1 W Y=0 Y = ? R X=1 Server 1001 Server 1002

Slide 164

Slide 164 text

No content

Slide 165

Slide 165 text

T I M E

Slide 166

Slide 166 text

LOCKING W(Y) R(X) R(Y) W(X) T I M E

Slide 167

Slide 167 text

LOCKING W(Y) R(X) R(Y) W(X) ATOMICITY VIOLATED! T I M E

Slide 168

Slide 168 text

LOCKING W(Y) R(X) R(Y) W(X) ATOMICITY VIOLATED! T I M E

Slide 169

Slide 169 text

LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) ATOMICITY VIOLATED! T I M E OPTIMISTIC

Slide 170

Slide 170 text

Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) ATOMICITY VIOLATED! T I M E OPTIMISTIC VALIDATE ATOMICITY

Slide 171

Slide 171 text

Y X LOCKING VIOLATED? ABORT W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) ATOMICITY VIOLATED! T I M E OPTIMISTIC VALIDATE ATOMICITY

Slide 172

Slide 172 text

Y X LOCKING VIOLATED? ABORT W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) ATOMICITY VIOLATED! T I M E OPTIMISTIC VALIDATE ATOMICITY

Slide 173

Slide 173 text

Y X LOCKING VIOLATED? ABORT W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) ATOMICITY VIOLATED! T I M E OPTIMISTIC VALIDATE ATOMICITY BOTH RELY ON COORDINATION

Slide 174

Slide 174 text

Due to coordination overheads…

Slide 175

Slide 175 text

Facebook Tao Google Megastore LinkedIn Espresso Due to coordination overheads… Amazon DynamoDB Apache Cassandra Basho Riak Yahoo! PNUTS Google App Engine

Slide 176

Slide 176 text

Facebook Tao Google Megastore LinkedIn Espresso Due to coordination overheads… Amazon DynamoDB Apache Cassandra Basho Riak Yahoo! PNUTS …consciously choose to violate atomic visibility Google App Engine

Slide 177

Slide 177 text

Facebook Tao Google Megastore LinkedIn Espresso Due to coordination overheads… Amazon DynamoDB Apache Cassandra Basho Riak Yahoo! PNUTS …consciously choose to violate atomic visibility “[Tao] explicitly favors efficiency and availability over consistency…[an edge] may exist without an inverse; these hanging associations are scheduled for repair by an asynchronous job.” Google App Engine

Slide 178

Slide 178 text

Our contributions: to maintain social graph structure referential integrity [SIGMOD 2014, selected for “Best of SIGMOD” ACM TODS] secondary indexes materialized views

Slide 179

Slide 179 text

Our contributions: to maintain 1. A new model: atomic visibility (via Read Atomic isolation) is (provably) sufficient social graph structure referential integrity [SIGMOD 2014, selected for “Best of SIGMOD” ACM TODS] secondary indexes materialized views

Slide 180

Slide 180 text

Our contributions: to maintain 1. A new model: atomic visibility (via Read Atomic isolation) is (provably) sufficient 2. Efficient protocols: RAMP transactions enforce atomic visibility without coordination social graph structure referential integrity [SIGMOD 2014, selected for “Best of SIGMOD” ACM TODS] secondary indexes materialized views

Slide 181

Slide 181 text

WHAT THE APPLICATION SAYS “accept friend request” “update index entry” write write read write read write read read read read read write write read WHAT THE DATABASE HEARS read read read write read write

Slide 182

Slide 182 text

“accept friend request” “update index entry” write write read write read write read read read read read write write write read

Slide 183

Slide 183 text

“accept friend request” “update index entry” ATOMIC VISIBILITY write write read write read write read read read read read write write write read

Slide 184

Slide 184 text

“accept friend request” “update index entry” RAMP TRANSACTION ATOMIC VISIBILITY write write read write read write read read read read read write write write read

Slide 185

Slide 185 text

“accept friend request” “update index entry” RAMP TRANSACTION RAMP TRANSACTION ATOMIC VISIBILITY write write read write read write read read read read read write write write read

Slide 186

Slide 186 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC T I M E VIOLATED? ABORT VALIDATE ATOMICITY

Slide 187

Slide 187 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC RAMP TRANSACTIONS T I M E VIOLATED? ABORT VALIDATE ATOMICITY

Slide 188

Slide 188 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC RAMP TRANSACTIONS T I M E Without coordination, atomicity violations will (initially) occur! VIOLATED? ABORT VALIDATE ATOMICITY

Slide 189

Slide 189 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC RAMP TRANSACTIONS W(Y) R(X) R(Y) W(X) T I M E Without coordination, atomicity violations will (initially) occur! VIOLATED? ABORT VALIDATE ATOMICITY

Slide 190

Slide 190 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC RAMP TRANSACTIONS W(Y) R(X) R(Y) W(X) T I M E Without coordination, atomicity violations will (initially) occur! Don’t panic! Don’t abort! VIOLATED? ABORT VALIDATE ATOMICITY

Slide 191

Slide 191 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC RAMP TRANSACTIONS W(Y) R(X) R(Y) W(X) DETECT RACES T I M E Without coordination, atomicity violations will (initially) occur! Don’t panic! Don’t abort! VIOLATED? ABORT VALIDATE ATOMICITY

Slide 192

Slide 192 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC RAMP TRANSACTIONS W(Y) R(X) R(Y) W(X) REPAIR ATOMICITY DETECT RACES T I M E Without coordination, atomicity violations will (initially) occur! Don’t panic! Don’t abort! VIOLATED? ABORT VALIDATE ATOMICITY

Slide 193

Slide 193 text

ATOMICITY VIOLATED! Y X LOCKING W(Y) R(X) R(Y) W(X) W(Y) R(X) R(Y) W(X) OPTIMISTIC RAMP TRANSACTIONS W(Y) R(X) R(Y) W(X) REPAIR ATOMICITY DETECT RACES R(Y) T I M E Without coordination, atomicity violations will (initially) occur! Don’t panic! Don’t abort! VIOLATED? ABORT VALIDATE ATOMICITY

Slide 194

Slide 194 text

RAMP TRANSACTIONS REPAIR ATOMICITY DETECT RACES

Slide 195

Slide 195 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES

Slide 196

Slide 196 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES

Slide 197

Slide 197 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W Server 1001 X=0 Y=0 Server 1002

Slide 198

Slide 198 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W Server 1001 X=0 Y=0 Server 1002 X=1

Slide 199

Slide 199 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W Server 1001 X=0 Y=0 Server 1002 X=1 X = ? R Y = ? R X = 1 Y = 0
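
The race on this slide can be reproduced in a few lines. This is a minimal sketch, assuming two hypothetical in-memory "servers" that each hold one key; the names `server_1001`, `server_1002`, `write_x`, `write_y` and `read_both` are illustrative and not part of any real system.

```python
# Hypothetical two-server store: each server holds one key, as on the slide.
server_1001 = {"X": 0}
server_1002 = {"Y": 0}


def write_x(value):
    server_1001["X"] = value


def write_y(value):
    server_1002["Y"] = value


def read_both():
    return server_1001["X"], server_1002["Y"]


# The writer's transaction is "X = 1, Y = 1", but without coordination
# its two writes reach the two servers at different times.
write_x(1)               # first write has landed on Server 1001
snapshot = read_both()   # a reader runs before the second write lands
write_y(1)               # second write lands on Server 1002

print(snapshot)          # (1, 0): the reader saw only half of the transaction
```

The reader observes X = 1 together with Y = 0, which is the fractured read that atomic visibility is meant to rule out.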

Slide 200

Slide 200 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W Server 1001 X=0 Y=0 Server 1002 X=1 X = ? R Y = ? R X = 1 Y = 0

Slide 201

Slide 201 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W Server 1001 X=0 Y=0 Server 1002 X=1 X = ? R Y = ? R X = 1 Y = 0

Slide 202

Slide 202 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W Server 1001 X=0 Y=0 Server 1002 X=1 X = ? R Y = ? R X = 1 Y = 0 via intention metadata

Slide 203

Slide 203 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W Server 1001 Y=0 Server 1002 X=1 via intention metadata

Slide 204

Slide 204 text

Y=0 T0 {} intention · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W X=1 T1 {Y} intention · T0 intention · via intention metadata

Slide 205

Slide 205 text

value Y=0 T0 {} intention · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W value X=1 T1 {Y} intention · T0 intention · via intention metadata

Slide 206

Slide 206 text

value Y=0 T0 {} intention · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W value X=1 T1 {Y} intention · T0 intention · via intention metadata

Slide 207

Slide 207 text

value Y=0 T0 {} intention · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W value X=1 T1 {Y} intention · T0 intention · via intention metadata “A transaction called T1 wrote this and also wrote to Y”

Slide 208

Slide 208 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W value X=1 T1 {Y} intention · value Y=0 T0 {} intention · via intention metadata

Slide 209

Slide 209 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W value X=1 T1 {Y} intention · value Y=0 T0 {} intention · via intention metadata

Slide 210

Slide 210 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W value X=1 T1 {Y} intention · value Y=0 T0 {} intention · via intention metadata X = ? R Y = ? R

Slide 211

Slide 211 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES value X=1 T1 {Y} intention · via intention metadata X = ? R Y = ? R X = 1 W Y = 1 W value Y=0 T0 {} intention ·

Slide 212

Slide 212 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES value X=1 T1 {Y} intention · via intention metadata X = ? R R X = 1 W Y = 1 W X = 1 Y = 0 value Y=0 T0 {} intention · “A transaction called T1 wrote this and also wrote to Y”

Slide 213

Slide 213 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES value X=1 T1 {Y} intention · via intention metadata X = ? R R X = 1 W Y = 1 W X = 1 Y = 0 value Y=0 T0 {} intention · “A transaction called T1 wrote this and also wrote to Y”

Slide 214

Slide 214 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES value X=1 T1 {Y} intention · via intention metadata X = ? R R X = 1 W Y = 1 W X = 1 Y = 0 Where is T1’s write to Y? value Y=0 T0 {} intention ·
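
One way to picture the "Where is T1’s write to Y?" check is as a comparison over the metadata returned by the first read. The sketch below is an assumption-laden illustration: the `Version` record, the integer timestamps standing in for T0 and T1, and the `detect_races` helper are all hypothetical names, not the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    key: str
    value: int
    ts: int               # timestamp of the writing transaction (T0 -> 0, T1 -> 1)
    siblings: frozenset   # intention metadata: other keys that transaction wrote


def detect_races(result):
    """Return {key: required_ts} for every key whose returned version is older
    than a sibling write advertised by another returned version."""
    required = {}
    for v in result.values():
        for k in v.siblings:
            if k in result and result[k].ts < v.ts:
                required[k] = max(required.get(k, v.ts), v.ts)
    return required


# First-round result from the slide: X=1 written by T1 (which also wrote Y),
# but Y=0 still comes from T0.
first_round = {
    "X": Version("X", 1, ts=1, siblings=frozenset({"Y"})),
    "Y": Version("Y", 0, ts=0, siblings=frozenset()),
}
print(detect_races(first_round))   # {'Y': 1}
```

The result says the reader still needs T1's version of Y; a consistent result produces an empty dictionary and no second round.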

Slide 215

Slide 215 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES value X=1 T1 {Y} intention · via intention metadata X = ? R R X = 1 W Y = 1 W X = 1 Y = 0 Where is T1’s write to Y? value Y=0 T0 {} intention ·

Slide 216

Slide 216 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES value X=1 T1 {Y} intention · via intention metadata X = ? R R X = 1 W Y = 1 W X = 1 Y = 0 Where is T1’s write to Y? value Y=0 T0 {} intention ·

Slide 217

Slide 217 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES value X=1 T1 {Y} intention · via intention metadata X = ? R R X = 1 W Y = 1 W X = 1 Y = 0 Where is T1’s write to Y? value Y=0 T0 {} intention · via multi-versioning, ready bit

Slide 218

Slide 218 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES X = 1 W Y = 1 W value X=1 T1 {Y} intention · via intention metadata via multi-versioning, ready bit value Y=0 T0 {} intention ·

Slide 219

Slide 219 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W via multi-versioning, ready bit

Slide 220

Slide 220 text

Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W ready ready via multi-versioning, ready bit

Slide 221

Slide 221 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W ready ready 1.) Place write on each server. via multi-versioning, ready bit

Slide 222

Slide 222 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W ready ready 1.) Place write on each server. 2.) Set ready bit on each write on server. via multi-versioning, ready bit

Slide 223

Slide 223 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W ready ready 1.) Place write on each server. 2.) Set ready bit on each write on server. via multi-versioning, ready bit Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers
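
The two write steps and the ready bit invariant can be sketched as follows. This is a simplified model under stated assumptions: `Server`, `prepare`, `commit`, `ramp_write` and `invariant_holds` are hypothetical names, each key lives on its own in-memory server, and failures are ignored.

```python
class Server:
    """Hypothetical in-memory partition; real servers would persist versions."""

    def __init__(self):
        self.versions = {}   # (key, ts) -> (value, sibling keys)
        self.ready = {}      # key -> highest timestamp whose ready bit is set

    def prepare(self, key, ts, value, siblings):
        # Step 1: place the write. It is stored but not yet visible.
        self.versions[(key, ts)] = (value, frozenset(siblings))

    def commit(self, key, ts):
        # Step 2: set the ready bit for this write.
        self.ready[key] = max(self.ready.get(key, ts), ts)


def ramp_write(servers, ts, writes):
    """servers maps key -> Server, writes maps key -> value."""
    keys = set(writes)
    for key, value in writes.items():      # first round trip
        servers[key].prepare(key, ts, value, keys - {key})
    for key in writes:                     # second round trip
        servers[key].commit(key, ts)


def invariant_holds(servers):
    """Ready bit invariant, checked for the newest ready version of each key."""
    for key, server in servers.items():
        ts = server.ready.get(key)
        if ts is None:
            continue
        _, siblings = server.versions[(key, ts)]
        if any((k, ts) not in servers[k].versions for k in siblings):
            return False
    return True


servers = {"X": Server(), "Y": Server()}
ramp_write(servers, ts=1, writes={"X": 1, "Y": 1})
print(invariant_holds(servers))   # True
```

Because no ready bit is set until every write has been placed, the invariant also holds at every intermediate point of `ramp_write`, which is what lets readers repair a race without waiting for the writer.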

Slide 224

Slide 224 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W ready ready Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 225

Slide 225 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W ready ready Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 226

Slide 226 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · X = 1 W Y = 1 W ready ready X = ? R Y = ? R Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 227

Slide 227 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · ready ready X = ? R Y = ? R Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 228

Slide 228 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · ready ready X = ? R Y = ? R 1.) Fetch “highest” ready versions. Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 229

Slide 229 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · ready ready X = ? R Y = ? R 1.) Fetch “highest” ready versions. X = 1 Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 230

Slide 230 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · ready ready X = ? R Y = ? R 1.) Fetch “highest” ready versions. X = 1 Y = 0 Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 231

Slide 231 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · ready ready X = ? R Y = ? R 1.) Fetch “highest” ready versions. 2.) Fetch any missing writes using metadata. X = 1 Y = 0 Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 232

Slide 232 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · ready ready X = ? R Y = ? R 1.) Fetch “highest” ready versions. 2.) Fetch any missing writes using metadata. X = 1 Y = 0 Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers

Slide 233

Slide 233 text

Y=1 T1 {X} · X=1 T1 {Y} · Atomic Visibility via RAMP Transactions REPAIR ATOMICITY DETECT RACES via intention metadata via multi-versioning value intention X=0 T0 {} · value intention Y=0 T0 {} · ready ready X = ? R Y = ? R 1.) Fetch “highest” ready versions. 2.) Fetch any missing writes using metadata. X = 1 Y = 0 Y = 1 Ready bit invariant: if ready bit is set, all writes in transaction are present on their respective servers
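
The two read steps on this slide can be sketched end to end. This is an illustrative model only: `Server`, `put`, `get_latest_ready`, `get_version` and `ramp_read` are hypothetical names, each server holds a single key, and integer timestamps stand in for T0 and T1.

```python
class Server:
    """Hypothetical single-key server that keeps every version it has received."""

    def __init__(self):
        self.versions = {}     # ts -> (value, siblings), including non-ready writes
        self.ready_ts = None   # highest timestamp whose ready bit is set

    def put(self, ts, value, siblings=(), ready=False):
        self.versions[ts] = (value, frozenset(siblings))
        if ready:
            self.ready_ts = ts if self.ready_ts is None else max(self.ready_ts, ts)

    def get_latest_ready(self):
        value, siblings = self.versions[self.ready_ts]
        return self.ready_ts, value, siblings

    def get_version(self, ts):
        return self.versions[ts][0]


def ramp_read(servers, keys):
    # Round 1: fetch the highest ready version of each key.
    first = {k: servers[k].get_latest_ready() for k in keys}
    result = {k: value for k, (_, value, _) in first.items()}
    # Use sibling metadata to find keys whose returned version is too old.
    required = {}
    for ts, _, siblings in first.values():
        for k in siblings:
            if k in first and first[k][0] < ts:
                required[k] = max(required.get(k, ts), ts)
    # Round 2 (only after a race): fetch the missing versions by timestamp.
    for k, ts in required.items():
        result[k] = servers[k].get_version(ts)
    return result


x_server, y_server = Server(), Server()
x_server.put(0, 0, ready=True)
y_server.put(0, 0, ready=True)
x_server.put(1, 1, siblings={"Y"}, ready=True)   # T1 is ready on X's server
y_server.put(1, 1, siblings={"X"})               # T1 placed on Y's server, not yet ready
print(ramp_read({"X": x_server, "Y": y_server}, ["X", "Y"]))   # {'X': 1, 'Y': 1}
```

The second round can always be served because of the ready bit invariant: X's ready bit being set means T1's write to Y is already present on Y's server.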

Slide 234

Slide 234 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 235

Slide 235 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 236

Slide 236 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Ensures that readers never have to wait

Slide 237

Slide 237 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Ensures that readers never have to wait

Slide 238

Slide 238 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Ensures that readers never have to wait

Slide 239

Slide 239 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Ensures that readers never have to wait
2nd RTT for repair, in the event of a race

Slide 240

Slide 240 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Ensures that readers never have to wait
2nd RTT for repair, in the event of a race

Slide 241

Slide 241 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 242

Slide 242 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Transaction IDs: sequence number and client ID
» Also use to order overwrites!

Slide 243

Slide 243 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Transaction IDs: sequence number and client ID
» Also use to order overwrites!

Garbage collection of old versions:
» Set timeout (TTL) for overwritten versions
» Limit read transaction duration to TTL
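
Both details can be illustrated briefly. The sketch below assumes that IDs compare by sequence number first and client ID second (the slide names the two fields but not their order), and that the store records when each version was overwritten; `TxnId` and `collect_garbage` are hypothetical names.

```python
from typing import NamedTuple


class TxnId(NamedTuple):
    seq: int      # per-client sequence number
    client: int   # client ID breaks ties between clients


# Tuples compare field by field, so TxnIds are totally ordered, and unique
# as long as a client never reuses a sequence number.
a, b, c = TxnId(7, client=1), TxnId(7, client=2), TxnId(8, client=1)
assert a < b < c


def collect_garbage(versions, overwritten_at, now, ttl):
    """Drop versions that were overwritten more than `ttl` seconds ago.

    versions maps TxnId -> value; overwritten_at maps TxnId -> the time at
    which that version was superseded (absent for the current version).
    """
    return {tid: value for tid, value in versions.items()
            if tid not in overwritten_at or now - overwritten_at[tid] <= ttl}


versions = {a: "old", c: "new"}
overwritten_at = {a: 100.0}   # `a` was superseded by `c` at t=100
print(collect_garbage(versions, overwritten_at, now=120.0, ttl=10.0))
```

Limiting read transactions to the TTL matters here: a reader that takes longer could need a version in its second round that has already been collected.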

Slide 244

Slide 244 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 245

Slide 245 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 246

Slide 246 text

RAMP Details

Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
2 | 1 | 2 | O(txn len) write set summary

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Can we use less metadata for intent?

Slide 247

Slide 247 text

RAMP Variants

Algorithm | Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
RAMP-Fast | 2 | 1 | 2 | O(txn len) write set summary
RAMP-Small | 2 | 2 | 2 | O(1) timestamp
RAMP-Hybrid | 2 | 1+ε | 2 | O(1) Bloom filter

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 248

Slide 248 text

RAMP Variants

Algorithm | Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
RAMP-Fast | 2 | 1 | 2 | O(txn len) write set summary
RAMP-Small | 2 | 2 | 2 | O(1) timestamp
RAMP-Hybrid | 2 | 1+ε | 2 | O(1) Bloom filter

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 249

Slide 249 text

RAMP Variants

Algorithm | Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
RAMP-Fast | 2 | 1 | 2 | O(txn len) write set summary
RAMP-Small | 2 | 2 | 2 | O(1) timestamp
RAMP-Hybrid | 2 | 1+ε | 2 | O(1) Bloom filter

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 250

Slide 250 text

RAMP Variants

Algorithm | Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
RAMP-Fast | 2 | 1 | 2 | O(txn len) write set summary
RAMP-Small | 2 | 2 | 2 | O(1) timestamp
RAMP-Hybrid | 2 | 1+ε | 2 | O(1) Bloom filter

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Always attempt to repair… no metadata needed!

Slide 251

Slide 251 text

RAMP Variants

Algorithm | Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
RAMP-Fast | 2 | 1 | 2 | O(txn len) write set summary
RAMP-Small | 2 | 2 | 2 | O(1) timestamp
RAMP-Hybrid | 2 | 1+ε | 2 | O(B(ε)) Bloom filter

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Slide 252

Slide 252 text

RAMP Variants

Algorithm | Write RTT | Read RTT (best case) | Read RTT (worst case) | Metadata
RAMP-Fast | 2 | 1 | 2 | O(txn len) write set summary
RAMP-Small | 2 | 2 | 2 | O(1) timestamp
RAMP-Hybrid | 2 | 1+ε | 2 | O(B(ε)) Bloom filter

DETECT RACES: via intention metadata
REPAIR ATOMICITY: via multi-versioning, ready bit

Bloom filter summarizes intent
False positives: extra read RTTs
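
To make the Bloom filter trade-off concrete, here is a small generic Bloom filter used as a write set summary. It is a sketch under assumptions, not the filter the authors used: the `BloomFilter` class, its 64-bit size, its three hash functions and the SHA-256-based hashing are all illustrative choices.

```python
import hashlib


class BloomFilter:
    """Fixed-size set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0   # bit vector stored in a single integer

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))


# Summarize the write set {X, Y} of a transaction in constant space.
write_set = BloomFilter()
for key in ("X", "Y"):
    write_set.add(key)

print(write_set.might_contain("Y"))   # True: keys that were added always match
```

A key that was never added can occasionally match as well. In this setting such a false positive makes a reader suspect a race that did not happen, costing the extra read RTT that the slide mentions.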

Slide 253

Slide 253 text

RAMP Overview

SYSTEM KNOWS SEMANTICS ⇒ CLIENTS CAN COOPERATE WITHOUT WAITING FOR EACH OTHER

Slide 254

Slide 254 text

RAMP Overview

SYSTEM KNOWS SEMANTICS ⇒ CLIENTS CAN COOPERATE WITHOUT WAITING FOR EACH OTHER

KEY IDEA: DETECT RACES
Storing intention in metadata allows readers to check for missing writes

Slide 255

Slide 255 text

RAMP Overview

SYSTEM KNOWS SEMANTICS ⇒ CLIENTS CAN COOPERATE WITHOUT WAITING FOR EACH OTHER

KEY IDEA: DETECT RACES
Storing intention in metadata allows readers to check for missing writes

KEY IDEA: REPAIR ATOMICITY
Transactions “hide” writes until others can reliably complete them (ready bit)

Slide 256

Slide 256 text

RAMP Overview

SYSTEM KNOWS SEMANTICS ⇒ CLIENTS CAN COOPERATE WITHOUT WAITING FOR EACH OTHER

KEY IDEA: DETECT RACES
Storing intention in metadata allows readers to check for missing writes

KEY IDEA: REPAIR ATOMICITY
Transactions “hide” writes until others can reliably complete them (ready bit)

coordination free: transactions do not wait for any others to complete

Slide 257

Slide 257 text

RAMP Evaluation

Slide 258

Slide 258 text

RAMP Evaluation

Slide 259

Slide 259 text

RAMP Evaluation 1. What is the overhead of the RAMP protocols?

Slide 260

Slide 260 text

RAMP Evaluation 1. What is the overhead of the RAMP protocols? 2. What is the benefit of coordination-free execution?

Slide 261

Slide 261 text

RAMP Evaluation 1. What is the overhead of the RAMP protocols? 2. What is the benefit of coordination-free execution? 3. How do the RAMP protocols scale?

Slide 262

Slide 262 text

RAMP Evaluation evaluated on Amazon EC2 cr1.8xlarge servers (1-100 servers; default: 5) 1. What is the overhead of the RAMP protocols? 2. What is the benefit of coordination-free execution? 3. How do the RAMP protocols scale?

Slide 263

Slide 263 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.

Slide 264

Slide 264 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callout: No Concurrency Control.

Slide 265

Slide 265 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callout: No Concurrency Control (doesn’t enforce atomic visibility).

Slide 266

Slide 266 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; Serializable 2PL.

Slide 267

Slide 267 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; Serializable 2PL; Write Locks Only.

Slide 268

Slide 268 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-F, RAMP-S, RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; Serializable 2PL; Write Locks Only; RAMP-Fast.

Slide 269

Slide 269 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-F, RAMP-S, RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; Serializable 2PL; Write Locks Only; RAMP-Fast (within 5% of baseline).

Slide 270

Slide 270 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-F, RAMP-S, RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; Serializable 2PL; Write Locks Only; RAMP-Fast; RAMP-Small.

Slide 271

Slide 271 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-F, RAMP-S, RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; Serializable 2PL; Write Locks Only; RAMP-Fast; RAMP-Small (always needs 2RTT reads).

Slide 272

Slide 272 text

Plot: YCSB: WorkloadA, 95% reads, 1M items, 4 items/txn. X-axis: Concurrent Clients (0 to 10000). Y-axis: Throughput (txn/s), 0 to 180K.
Legend labels: RAMP-F, RAMP-S, RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; Serializable 2PL; Write Locks Only; RAMP-Fast; RAMP-Small; RAMP-Hybrid.

Slide 273

Slide 273 text

Plot: YCSB: uniform access, 1M items, 4 items/txn, 95% reads. X-axis: Number of Servers (0 to 100). Y-axis: Throughput (ops/s), 0 to 8M.

Slide 274

Slide 274 text

Plot: YCSB: uniform access, 1M items, 4 items/txn, 95% reads. X-axis: Number of Servers (0 to 100). Y-axis: Throughput (ops/s), 0 to 8M.
Legend labels: RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callout: No Concurrency Control.

Slide 275

Slide 275 text

Plot: YCSB: uniform access, 1M items, 4 items/txn, 95% reads. X-axis: Number of Servers (0 to 100). Y-axis: Throughput (ops/s), 0 to 8M.
Legend labels: RAMP-F, RAMP-S, RAMP-H, NWNR, LWNR, LWSR, LWLR, E-PCI.
Callouts: No Concurrency Control; RAMP-Fast; RAMP-Small; RAMP-Hybrid.

Slide 276

Slide 276 text

“accept friend request” “update index entry” RAMP TRANSACTION RAMP TRANSACTION ATOMIC VISIBILITY write write read write read write read read read read read write write write read

Slide 277

Slide 277 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 278

Slide 278 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 279

Slide 279 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 280

Slide 280 text

write read write read write write read write write write read write WHAT THE DATABASE HEARS read read read read read read WHAT THE APPLICATION SAYS my billing application is “correct” my new social app “does the right thing”

Slide 281

Slide 281 text

No content

Slide 282

Slide 282 text

Database users express correctness criteria via database constraints

Slide 283

Slide 283 text

“usernames should be unique” “account balances should remain positive” “there should only be one administrator” Database users express correctness criteria via database constraints
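
The three example constraints can be read as predicates over database state. The sketch below is an illustration under assumptions: the dictionary layout, the predicate names and the simple union-style merge of two replicas are hypothetical and only meant to show what it means for a constraint to hold or fail.

```python
def usernames_unique(db):
    names = [user["name"] for user in db["users"]]
    return len(names) == len(set(names))


def balances_positive(db):
    # "Positive" is taken literally from the slide: strictly greater than zero.
    return all(account["balance"] > 0 for account in db["accounts"])


def one_administrator(db):
    return sum(1 for user in db["users"] if user["admin"]) == 1


# Two replicas accept an insert of the same username without talking to each other.
replica_a = {"users": [{"name": "sue", "admin": True}], "accounts": [{"balance": 10}]}
replica_b = {"users": [{"name": "sue", "admin": False}], "accounts": [{"balance": 5}]}

# Hypothetical merge: take the union of rows from both replicas.
merged = {
    "users": replica_a["users"] + replica_b["users"],
    "accounts": replica_a["accounts"] + replica_b["accounts"],
}

print(usernames_unique(replica_a), usernames_unique(replica_b))   # True True
print(usernames_unique(merged))                                   # False
```

Each replica satisfies the uniqueness constraint on its own, yet the merged state does not, which is one way to see why a constraint like this is typically enforced with coordination.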

Slide 284

Slide 284 text

Typical database constraints and operations (SQL)

Constraint | Operation
Equality, Inequality | Any
Generate unique ID | Any
Specify unique ID | Insert
> | Increment
> | Decrement
< | Decrement
< | Increment
Foreign Key | Insert
Foreign Key | Delete
Secondary Indexing | Any
Materialized Views | Any
AUTO_INCREMENT | Insert

Slide 285

Slide 285 text

No content

Slide 286

Slide 286 text

adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig

Slide 287

Slide 287 text

adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena 67 projects 1.77M LoC 1957 tables [SIGMOD 2015]

Slide 288

Slide 288 text

adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena 67 projects 1.77M LoC 1957 tables 259 total; avg. 0.13 per table [SIGMOD 2015]

Slide 289

Slide 289 text

adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena 67 projects 1.77M LoC 1957 tables 9986 total; avg. 5.1 per table 259 total; avg. 0.13 per table [SIGMOD 2015]

Slide 290

Slide 290 text

CONSTRAINTS MORE COMMON 37x adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena 67 projects 1.77M LoC 1957 tables 9986 total; avg. 5.1 per table 259 total; avg. 0.13 per table [SIGMOD 2015]

Slide 291

Slide 291 text

WHAT THE APPLICATION SAYS: “no duplicate users” WHAT THE DATABASE HEARS: write read write read write write read write write write read write read read read read read read

Slide 292

Slide 292 text

WHAT THE APPLICATION SAYS: “no duplicate users” WHAT THE DATABASE HEARS: write read write read write write read write write write read write read read read read read read TODAY: ENFORCEMENT VIA COORDINATION

Slide 293

Slide 293 text

WHAT THE APPLICATION SAYS: “no duplicate users” WHAT THE DATABASE HEARS: write read write read write write read write write write read write read read read read read read CAN WE USE CONSTRAINTS TO AVOID COORDINATION?

Slide 294

Slide 294 text

WHAT THE APPLICATION SAYS “no duplicate users” constraint WHAT THE DATABASE HEARS constraint constraint constraint constraint constraint constraint constraint “no duplicate users” CAN WE USE CONSTRAINTS TO AVOID COORDINATION?

Slide 295

Slide 295 text

Key idea: Check if constraints can be violated by “merging” independent operations

Slide 296

Slide 296 text

Key idea: Check if constraints can be violated by “merging” independent operations ICT: Invariant Confluence Test

Slide 297

Slide 297 text

CONSTRAINT: User IDs are unique OPERATION: Add users MERGE: Set union Key idea: Check if constraints can be violated by “merging” independent operations ICT: Invariant Confluence Test

Slide 298

Slide 298 text

Key idea: Check if constraints can be violated by “merging” independent operations ICT: Invariant Confluence Test CONSTRAINT: User IDs are unique OPERATION: Add users MERGE: Set union Start: {} Independent operations: add {Stu,ID=1}; add {Ann,ID=1} MERGE result: {{Stu,ID=1}, {Ann,ID=1}} Constraint violated!

Slide 299

Slide 299 text

Key idea: Check if constraints can be violated by “merging” independent operations CONSTRAINT: User IDs are positive OPERATION: Add users MERGE: Set union ICT: Invariant Confluence Test

Slide 300

Slide 300 text

Key idea: Check if constraints can be violated by “merging” independent operations ICT: Invariant Confluence Test CONSTRAINT: User IDs are positive OPERATION: Add users MERGE: Set union Start: {} Independent operations: add {Stu,ID=1}; add {Ann,ID=1} MERGE result: {{Stu,ID=1}, {Ann,ID=1}} Constraint holds!
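
The two examples above (unique IDs vs. positive IDs under set-union merge) can be replayed as a small brute-force check. This is only a sketch: the function names (`passes_ict`, `ids_unique`, `ids_positive`) are made up for illustration, and the check covers just the listed operations one step deep, so it is a bounded test rather than a proof of invariant confluence.

```python
from itertools import combinations

def merge(a, b):
    """MERGE from the slide: set union of two replica states."""
    return a | b

def ids_unique(state):
    """CONSTRAINT: user IDs are unique."""
    ids = [uid for _, uid in state]
    return len(ids) == len(set(ids))

def ids_positive(state):
    """CONSTRAINT: user IDs are positive."""
    return all(uid > 0 for _, uid in state)

def passes_ict(invariant, start, operations):
    """Apply each operation on its own replica, then merge every pair.

    Returns False if two locally valid replicas merge into an invalid state.
    (Hypothetical helper; checks only single-operation divergence.)
    """
    assert invariant(start), "start state must satisfy the invariant"
    branches = [frozenset(start | {op}) for op in operations]
    valid = [b for b in branches if invariant(b)]
    return all(invariant(merge(a, b)) for a, b in combinations(valid, 2))

start = frozenset()                  # {}
adds = [("Stu", 1), ("Ann", 1)]      # add {Stu,ID=1}, add {Ann,ID=1}

print("unique IDs pass:", passes_ict(ids_unique, start, adds))
print("positive IDs pass:", passes_ict(ids_positive, start, adds))
```

The uniqueness check fails because each replica is valid on its own but the union holds two users with ID 1; the positivity check passes because union never introduces a non-positive ID.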

Slide 301

Slide 301 text

Key idea: Check if constraints can be violated by “merging” independent operations ICT: Invariant Confluence Test

Slide 302

Slide 302 text

Key idea: Check if constraints can be violated by “merging” independent operations OUR CONTRIBUTION: [VLDB 2015] ICT: Invariant Confluence Test

Slide 303

Slide 303 text

Key idea: Check if constraints can be violated by “merging” independent operations OUR CONTRIBUTION: Theorem. A globally I-valid system can execute a set of transactions T with coordination-freedom, transactional availability, and convergence if and only if T are I-confluent with respect to I. [VLDB 2015] ICT ⟺ safe, coordination-free execution possible ICT: Invariant Confluence Test

Slide 304

Slide 304 text

Key idea: Check if constraints can be violated by “merging” independent operations OUR CONTRIBUTION: Generalizes classic partitioning-based indistinguishability arguments Theorem. A globally I-valid system can execute a set of transactions T with coordination-freedom, transactional availability, and convergence if and only if T are I-confluent with respect to I. [VLDB 2015] ICT ⟺ safe, coordination-free execution possible ICT: Invariant Confluence Test
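
One way to write down the condition the theorem refers to is sketched below. The symbols $D_0$, $D_i$, $D_j$ and the merge operator $\sqcup$ are notation assumed here for illustration; the slides themselves only name the invariant $I$ and the transaction set $T$.

```latex
% Sketch: T is I-confluent with respect to I if merging any two valid,
% independently reached states yields a valid state.
% D_0, D_i, D_j and \sqcup are assumed notation, not taken from the slides.
\[
\forall D_i, D_j \ \text{reachable from a common ancestor } D_0
\text{ by executing transactions in } T \text{ through } I\text{-valid states}:
\qquad
I(D_i) \wedge I(D_j) \implies I(D_i \sqcup D_j)
\]
```

Read against the earlier examples: with $\sqcup$ as set union, "IDs are positive" satisfies this condition for inserts, while "IDs are unique" does not.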

Slide 305

Slide 305 text

Typical database constraints and operations (SQL), under set merge

Constraint              Operation    OK?
Equality, Inequality    Any          ???
Generate unique ID      Any          ???
Specify unique ID       Insert       ???
>                       Increment    ???
>                       Decrement    ???
<                       Decrement    ???
<                       Increment    ???
Foreign Key             Insert       ???
Foreign Key             Delete       ???
Secondary Indexing      Any          ???
Materialized Views      Any          ???
AUTO_INCREMENT          Insert       ???

Slide 306

Slide 306 text

Typical database constraints and operations (SQL), under set merge [VLDB 2015]

Constraint              Operation    OK?
Equality, Inequality    Any          Y
Generate unique ID      Any          Y
Specify unique ID       Insert       N
>                       Increment    Y
>                       Decrement    N
<                       Decrement    Y
<                       Increment    N
Foreign Key             Insert       Y
Foreign Key             Delete       Y*
Secondary Indexing      Any          Y
Materialized Views      Any          Y
AUTO_INCREMENT          Insert       N
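
The ">" rows of the table can be illustrated with a counter. The sketch below assumes one particular encoding that is not spelled out on the slide: each replica state is a set of tagged deltas, merge is set union, and the counter value is the initial value plus the sum of all deltas. The names `value`, `holds`, and the operation tags are hypothetical, and a single scenario like this demonstrates the Y/N entries without proving them.

```python
INITIAL = 100  # assumed starting counter value

def value(deltas):
    """Counter value = initial value plus every delta seen so far."""
    return INITIAL + sum(d for _, d in deltas)

def holds(deltas):
    """CONSTRAINT: counter > 0."""
    return value(deltas) > 0

def merge(a, b):
    """Set merge: union of the tagged deltas from both replicas."""
    return a | b

# Decrement under a ">" constraint: each replica is valid alone...
dec_a = frozenset({("replica-a/op1", -60)})
dec_b = frozenset({("replica-b/op1", -60)})
print("decrements valid locally:", holds(dec_a), holds(dec_b))
# ...but the merged state applies both decrements.
print("decrements valid after merge:", holds(merge(dec_a, dec_b)))

# Increment under a ">" constraint: merging can only raise the value.
inc_a = frozenset({("replica-a/op1", 5)})
inc_b = frozenset({("replica-b/op1", 7)})
print("increments valid after merge:", holds(merge(inc_a, inc_b)))
```

The "<" rows are the mirror image: decrements are safe to merge and increments are not.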

Slide 307

Slide 307 text

Typical database constraints and operations (SQL), under set merge [VLDB 2015]

Constraint              Operation    OK?
Equality, Inequality    Any          Y
Generate unique ID      Any          Y
Specify unique ID       Insert       N
>                       Increment    Y
>                       Decrement    N
<                       Decrement    Y
<                       Increment    N
Foreign Key             Insert       Y
Foreign Key             Delete       Y*
Secondary Indexing      Any          Y
Materialized Views      Any          Y
AUTO_INCREMENT          Insert       N

RAMP
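
The difference between "Generate unique ID" (Y) and the AUTO_INCREMENT row (N) can be sketched as follows. The `Replica` class and the replica-prefixed ID scheme are assumptions made for illustration (random UUIDs would be another option); the per-replica counter only stands in for a database sequence.

```python
class Replica:
    """Hypothetical replica that inserts users without talking to its peers."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counter = 0
        self.users = set()

    def add_with_generated_id(self, name):
        # The system picks the ID and scopes it to this replica.
        self.counter += 1
        self.users.add((name, f"{self.replica_id}-{self.counter}"))

    def add_with_auto_increment(self, name):
        # Each replica hands out 1, 2, 3, ... from its own local sequence.
        self.counter += 1
        self.users.add((name, self.counter))

def ids_unique(users):
    ids = [uid for _, uid in users]
    return len(ids) == len(set(ids))

a, b = Replica("A"), Replica("B")
a.add_with_generated_id("Stu")
b.add_with_generated_id("Ann")
generated_ok = ids_unique(a.users | b.users)      # merge = set union

c, d = Replica("C"), Replica("D")
c.add_with_auto_increment("Stu")
d.add_with_auto_increment("Ann")
sequential_ok = ids_unique(c.users | d.users)     # both replicas issued ID 1

print("generated IDs unique after merge:", generated_ok)
print("local sequences unique after merge:", sequential_ok)
```

Replica-scoped IDs can never collide across replicas, so no coordination is needed; a shared sequential counter has to be coordinated to stay collision-free.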

Slide 308

Slide 308 text

adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena 67 projects 1.77M LoC 1957 tables 9986 total; avg. 5.1 per table 259 total; avg. 0.13 per table [SIGMOD 2015]

Slide 309

Slide 309 text

adopt-a-hydrant alchemy_cms amahi bostonrb boxroom brevidy browsercms bucketwise calagator canvas-lms carter chiliproject citizenry comas comfortable-mexican-sofa communityengine copycopter-server danbooru diaspora discourse enki fat_free_crm fedena forem fulcrum gitlab-ci gitlabhq govsgo heaven inkwell insoshi jobsworth juvia kandan linuxfr.org lobsters lovd-by-less nimbleshop obtvse onebody opal opencongress opengovernment openproject piggybak publify radiant railscollab redmine refinerycms ror_ecommerce rucksack saasy salor-retail selfstarter sharetribe skyline spot-us spree sprintapp squaresquash sugar teambox tracks tryshoppe wallgig zena 67 projects 1.77M LoC 1957 tables 9986 total; avg. 5.1 per table 259 total; avg. 0.13 per table 86.9% PASS ICT [SIGMOD 2015]

Slide 310

Slide 310 text

No content

Slide 311

Slide 311 text

TPC-C

Slide 312

Slide 312 text

14/16 CONSTRAINTS PASS ICT TPC-C

Slide 313

Slide 313 text

14/16 CONSTRAINTS PASS ICT TPC-C 6-11x faster than ACID/serializability [Plot: Throughput (txns/s) vs. Number of Warehouses; series: Coordination-Avoiding, Serializable (2PL)]

Slide 314

Slide 314 text

14/16 CONSTRAINTS PASS ICT TPC-C 6-11x faster than ACID/serializability [Plot: Throughput (txns/s) vs. Number of Warehouses; series: Coordination-Avoiding, Serializable (2PL)] scale to over 25x best listed result [Plots: Total Throughput (txn/s) vs. Number of Servers; Throughput (txn/s/server) vs. Number of Servers]

Slide 315

Slide 315 text

WHAT THE APPLICATION SAYS “no duplicate users” constraint WHAT THE DATABASE HEARS constraint constraint constraint constraint constraint constraint constraint “no duplicate users” CAN WE USE CONSTRAINTS TO AVOID COORDINATION?

Slide 316

Slide 316 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 317

Slide 317 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 318

Slide 318 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 319

Slide 319 text

Key idea: Exploit statistical robustness in system designs

Slide 320

Slide 320 text

PLASMA: ASYNCHRONOUS LEARNING [Ongoing] Key idea: Exploit statistical robustness in system designs

Slide 321

Slide 321 text

PLASMA: ASYNCHRONOUS LEARNING [Ongoing] TIME Bulk Synch Parallel Key idea: Exploit statistical robustness in system designs

Slide 322

Slide 322 text

PLASMA: ASYNCHRONOUS LEARNING [Ongoing] ML task: Express algorithms via async iterator (e.g., ADMM) Bulk Async Parallel TIME TIME Bulk Synch Parallel Key idea: Exploit statistical robustness in system designs Break dataflow barriers using new iterator model
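
The contrast between the two execution models can be sketched with a toy problem. This is not Plasma's iterator API (which the slides do not show); it is a deterministic simulation, under assumed step sizes and an assumed staleness of 3 updates, of estimating the mean of a few values with and without a barrier.

```python
def bulk_synchronous(data, lr=0.5, rounds=50):
    """Every worker reads the same x; updates are combined at a barrier."""
    x = 0.0
    for _ in range(rounds):
        grads = [x - a for a in data]          # one gradient per worker
        x -= lr * sum(grads) / len(grads)      # barrier: wait for all, then step
    return x

def bulk_asynchronous(data, lr=0.05, steps=2000, delay=3):
    """Workers apply updates one at a time, computed from a stale copy of x."""
    history = [0.0]                            # every value x has taken so far
    for t in range(steps):
        stale_x = history[max(0, len(history) - 1 - delay)]
        a = data[t % len(data)]                # workers take turns, no barrier
        history.append(history[-1] - lr * (stale_x - a))
    return history[-1]

data = [1.0, 2.0, 3.0, 4.0]                    # one value per simulated worker
target = sum(data) / len(data)

print("target mean:", target)
print("synchronous estimate:", bulk_synchronous(data))
print("asynchronous estimate:", bulk_asynchronous(data))
```

The asynchronous run never waits for stragglers and still settles close to the same answer, which is the kind of statistical robustness the key idea relies on; it trades exactness for the absence of barriers.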

Slide 323

Slide 323 text

VELOX: FAST ONLINE PREDICTIONS [CIDR 2015] PLASMA: ASYNCHRONOUS LEARNING [Ongoing] ML task: Express algorithms via async iterator (e.g., ADMM) Bulk Async Parallel TIME TIME Bulk Synch Parallel Key idea: Exploit statistical robustness in system designs Break dataflow barriers using new iterator model

Slide 324

Slide 324 text

VELOX: FAST ONLINE PREDICTIONS [CIDR 2015] Fast incremental personalization Batch retrain shared features PLASMA: ASYNCHRONOUS LEARNING [Ongoing] ML task: Express algorithms via async iterator (e.g., ADMM) Bulk Async Parallel TIME TIME Bulk Synch Parallel Key idea: Exploit statistical robustness in system designs Break dataflow barriers using new iterator model

Slide 325

Slide 325 text

VELOX: FAST ONLINE PREDICTIONS [CIDR 2015] Fast incremental personalization Batch retrain shared features PLASMA: ASYNCHRONOUS LEARNING [Ongoing] ML task: Express algorithms via async iterator (e.g., ADMM) Bulk Async Parallel TIME TIME Bulk Synch Parallel Key idea: Exploit statistical robustness in system designs Prioritize model maintenance by robustness Break dataflow barriers using new iterator model

Slide 326

Slide 326 text

VELOX: FAST ONLINE PREDICTIONS [CIDR 2015] Fast incremental personalization Batch retrain shared features PLASMA: ASYNCHRONOUS LEARNING [Ongoing] ML task: Express algorithms via async iterator (e.g., ADMM) Bulk Async Parallel TIME TIME Bulk Synch Parallel Key idea: Exploit statistical robustness in system designs Prioritize model maintenance by robustness ML task: Split models according to robustness Break dataflow barriers using new iterator model

Slide 327

Slide 327 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 328

Slide 328 text

Serializability COORDINATION REQUIRED GUARANTEED SAFETY Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 329

Slide 329 text

MY APPROACH: DESIGN DATABASE SYSTEMS THAT EXPLOIT SEMANTICS OF HIGH-VALUE USE CASES
Study practical database use cases; derive principles and algorithms; build systems to realize the benefits

Slide 330

Slide 330 text

No content

Slide 331

Slide 331 text

PBS: Integrated into Cassandra 1.2 release + recent extensions at a major Internet company

Slide 332

Slide 332 text

PBS: Integrated into Cassandra 1.2 release + recent extensions at a major Internet company
RAMP: Proposed feature in Cassandra 3.0; (reportedly) on roadmap for Facebook Apollo, IBM Cloudant

Slide 333

Slide 333 text

PBS: Integrated into Cassandra 1.2 release + recent extensions at a major Internet company
RAMP: Proposed feature in Cassandra 3.0; (reportedly) on roadmap for Facebook Apollo, IBM Cloudant
HAT Isolation: part of Kleppmann@LinkedIn’s Hermitage testing suite

Slide 334

Slide 334 text

PBS: Integrated into Cassandra 1.2 release + recent extensions at a major Internet company
RAMP: Proposed feature in Cassandra 3.0; (reportedly) on roadmap for Facebook Apollo, IBM Cloudant
HAT Isolation: part of Kleppmann@LinkedIn’s Hermitage testing suite
Active dialogue with developer, NoSQL community via invited talks, blogging, social media

Slide 335

Slide 335 text

MY WORK: COORDINATION AVOIDANCE
Current Practice: PBS (VLDB12, SIGMOD13, VLDBJ14, CACM14); EC Today (CACM/Queue13); Consistency without Borders (SoCC13); Network Partitions (CACM/Queue14); Feral Concurrency Control (SIGMOD15)
Principles: I-Confluence (VLDB15); HATs (HotOS13, VLDB14); Explicit Causality (SoCC12)
Systems: Bolt-On (SIGMOD13); RAMP + Indexing (SIGMOD14); Velox (CIDR15); Plasma + BAP (Ongoing)

Slide 336

Slide 336 text

MY WORK: COORDINATION AVOIDANCE
Current Practice: PBS (VLDB12, SIGMOD13, VLDBJ14, CACM14); EC Today (CACM/Queue13); Consistency without Borders (SoCC13); Network Partitions (CACM/Queue14); Feral Concurrency Control (SIGMOD15)
Principles: I-Confluence (VLDB15); HATs (HotOS13, VLDB14); Explicit Causality (SoCC12)
Systems: Bolt-On (SIGMOD13); RAMP + Indexing (SIGMOD14); Velox (CIDR15); Plasma + BAP (Ongoing)

Slide 337

Slide 337 text

No content

Slide 338

Slide 338 text

FUTURE WORK

Slide 339

Slide 339 text

FUTURE WORK Automatically coordinated applications

Slide 340

Slide 340 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis

Slide 341

Slide 341 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis; “Query optimization” for transaction execution

Slide 342

Slide 342 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis; “Query optimization” for transaction execution
DB meets “Big Data” Learning

Slide 343

Slide 343 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis; “Query optimization” for transaction execution
DB meets “Big Data” Learning: View materialization and selection for model maintenance

Slide 344

Slide 344 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis; “Query optimization” for transaction execution
DB meets “Big Data” Learning: View materialization and selection for model maintenance; Bounded divergence control for coordinating learners

Slide 345

Slide 345 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis; “Query optimization” for transaction execution
DB meets “Big Data” Learning: View materialization and selection for model maintenance; Bounded divergence control for coordinating learners
Next-Generation Data Applications

Slide 346

Slide 346 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis; “Query optimization” for transaction execution
DB meets “Big Data” Learning: View materialization and selection for model maintenance; Bounded divergence control for coordinating learners
Next-Generation Data Applications: Next 10-100x growth in data volume due to sensors, apps

Slide 347

Slide 347 text

FUTURE WORK
Automatically coordinated applications: Bespoke analysis and coordination synthesis; “Query optimization” for transaction execution
DB meets “Big Data” Learning: View materialization and selection for model maintenance; Bounded divergence control for coordinating learners
Next-Generation Data Applications: Next 10-100x growth in data volume due to sensors, apps; New interfaces for increased coordination costs, heterogeneity

Slide 348

Slide 348 text

WHAT THE APPLICATION SAYS: “post on timeline” “accept friend request” WHAT THE DATABASE HEARS: write read write read write write read write write write read write read read read read read read

Slide 349

Slide 349 text

Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE

Slide 350

Slide 350 text

Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE Joint work with Ali Ghodsi, Joe Hellerstein, Ion Stoica, Mike Franklin, Michael Jordan, Alan Fekete, Dan Crankshaw, Shivaram Venkataraman, Neil Conway, Peter Alvaro, Aaron Davidson, Joey Gonzalez, Kyle Kingsbury, Haoyuan Li, and Zhao Zhang

Slide 351

Slide 351 text

Eventual Consistency COORDINATION FREE NO SAFETY Atomic Visibility SIGMOD14 Database Constraints VLDB15, SIGMOD15 Model Prediction and Training CIDR15, TBA Weak Isolation HotOS13, VLDB14 Causality SOCC12, SIGMOD13 COORDINATION AVOIDANCE GUARANTEED SAFETY WITHOUT COORDINATION MORE SEMANTICS MORE SAFETY PBS VLDB12, VLDBJ14, SIGMOD13, CACM14 COORDINATION FREE Joint work with Ali Ghodsi, Joe Hellerstein, Ion Stoica, Mike Franklin, Michael Jordan, Alan Fekete, Dan Crankshaw, Shivaram Venkataraman, Neil Conway, Peter Alvaro, Aaron Davidson, Joey Gonzalez, Kyle Kingsbury, Haoyuan Li, and Zhao Zhang

Slide 352

Slide 352 text

Many illustrations by the Noun Project (CC-Attribution): surprised by Julian Derveaux; world by Wayne Tyler Sall; database by Austin Condiff; earth by Martin Vanco; Woman by Simon Child; Man by Simon Child; Doctor by Simon Child; David-Hockney by Simon Child; Server by Simon Child; clock by christoph robausch