[Figure: timeline of server 0, server 1, and server 2, each issuing INCR(x,1) against the same record x.]
Increments on the same record execute one at a time and require coordination:
1) Network calls
2) Waiting for locks
Use replicated counters for x
[Figure: timeline of server 0, server 1, and server 2, each applying its increment to a local counter: INCR(x0,1), INCR(x1,1), INCR(x2,1). Each local counter holds 1.]
• Increments on the same record can proceed in parallel on local counters
• No network calls, no shared locks
}
INCR(rt_count:tweet, 1)
INSERT(rt_list:tweet, user)
INSERT(timeline:user, x)
followers := GET(follow:user)
for f := range followers {
    INSERT(timeline:f, x)
}

x := GET(tweet)
rts := GET(rt_list:tweet)
DELETE(rt_list:tweet)
followers := GET(follow:user)
DELETE(rt_count:tweet)
for u := range rts {
    REMOVE(timeline:u, x)
}
for f := range followers {
    REMOVE(timeline:f, x)
}
DELETE(tweet)

Result: Deleted tweets left around in timelines!
happens or not
Application-defined correctness
Other transactions do not interfere
Can recover correctly from a crash

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
...
COMMIT
transactions is the same as if those transactions had executed one at a time, in some serial order.
If each transaction preserves correctness, the DB will be in a correct state.
We can pretend like there’s no concurrency!
plan based on operations on popular records
• Split a record into local copies on every server
• Coordination-free execution in the common case
• Coordinate in the uncommon case to maintain serializability.
Operation Model
Developers write transactions as stored procedures, which are composed of operations on keys and values:

value GET(k)
void PUT(k,v)
void INCR(k,n)
void MAX(k,n)
void MULT(k,n)
void OPUT(k,v,o)
void TOPK_INSERT(k,v,o)
void UDF(k,v,a)

• Traditional key/value operations (GET, PUT)
• Operations on numeric values which modify the existing value (INCR, MAX, MULT)

• Replicate for reads
• Save last write
• Replicate for commutative operations
• Log operations
plans and no split data
• Sample remote operations on records and lock wait times
• Initiate plan based on most common operation
• Stop plan if common operation changes

PhaseDB handles dynamic, changing workloads
[Figure: timeline of servers issuing GET(x), PUT(y,2), and PUT(z,1); remote accesses to x are tallied (+1, +1, +1), followed by the label "Split x for reads".]
• Suppose x is on server 0 (home server).
• Home server watches remote accesses
operations to align with common operations
• Executes common operations in parallel when record is split
• Samples to automatically determine a good plan for contended records and adjusts to changing workloads
• Workloads are regular; we can optimize for the common case.
• Still many opportunities to improve performance while retaining easy-to-understand semantics.

http://nehanaru.la
@neha
[Figure: timeline. Before the split, servers issue INCR(x,1) alongside PUT(y,2) and PUT(z,1). After the split, server 0 runs INCR(x0,1); server 1 runs INCR(x1,1) and PUT(y,2); server 2 runs INCR(x2,1) and PUT(z,1); server 3 runs INCR(x3,1) and PUT(y,2).]
• When a record (x) is split, operations on it are transformed into operations on local copies (x0, x1, x2, x3)
• Home server sends copies to other servers
• Rest of the records use 2PL+2PC (y, z)
• 2PL+2PC ensures serializability for the non-split parts of the transaction
[Figure: split timeline. Servers 0 to 3 apply INCR(x0,1), INCR(x1,1), INCR(x2,1), and INCR(x3,1) to local copies alongside PUT(y,2) and PUT(z,1).]
a read of x in the current state
• Block operation to execute after reconciliation
[Figure: split timeline. Servers 0 to 3 apply INCR(x0,1), INCR(x1,1), INCR(x2,1), and INCR(x3,1) to local copies alongside PUT(y,2) and PUT(z,1); a GET(x) arrives while further INCR(x1,1), INCR(x2,1), INCR(x1,1) operations continue.]
[Figure: split timeline. Servers apply INCR(x1,1), INCR(x2,1), and INCR(x3,1) to local copies alongside PUT(y,2) and PUT(z,1); GET(x) is waiting while INCR(x1,1), INCR(x2,1), INCR(x1,1) arrive.]
• Home server initiates a cycle.
• All servers hear they should reconcile their local copies of x
• Stop processing local copy operations
[Figure: servers 0 to 3 in the cycling phase. The local copies are read with GET(x1), GET(x2), GET(x3) and combined as x = x + x0 + x1 + x2 + x3, while GET(x) waits.]
• Reconcile state to owning server
• Wait until all servers have finished reconciliation
• Unblock x for other operations