Slide 1

Slide 1 text

Haeinsa Overview
Design Principles, Transaction Algorithms and Performance Evaluation of Haeinsa
Copyright 2013 VCNC Inc. All rights reserved

Slide 2

Slide 2 text

There is no transaction in NoSQL
• ACID transaction in this document
  • Can be a series of multiple operations
  • No restrictions on operations (such as single row only)
  • A transaction should guarantee full ACID
• No full ACID transaction in NoSQL
  • HBase provides row-level ACID semantics only
  • Cassandra provides row-level ACID semantics only
  • MongoDB operations are atomic only at the level of a single document
  • Other NoSQL stores are not much different

Slide 3

Slide 3 text

Why no ACID for NoSQL?
• Providing reliable and fast transactions for a distributed system is hard
• Don’t misunderstand the CAP theorem!

Slide 4

Slide 4 text

ACID Transaction for NoSQL
• Google’s Percolator, Megastore, Spanner
  • Provide full ACID properties for distributed systems
  • Run on Google’s closed systems
• Attempts to provide ACID transactions for NoSQL
  • HAcid, Omid, HBaseSI and so on
  • But most of them were not linearly scalable
• Haeinsa is an ACID transaction library
  • Provides full ACID properties for HBase
  • Linearly scalable throughput and fault-tolerant
  • Battle-tested (used in a real service)

Slide 5

Slide 5 text

About Haeinsa
• Multi-row, multi-table transaction library for HBase
• Provides strong ACID properties
• Linearly scalable throughput
• Fault-tolerant against both client and HBase failures
• Isolation level is serializable
• Provides all of the basic operations: Get, Put, Delete, Scan
• Used successfully in a real service
• Open source
  • https://github.com/vcnc/haeinsa

Slide 6

Slide 6 text

Sample Code
    HaeinsaTransaction tx = tm.begin();
    HaeinsaPut put1 = new HaeinsaPut(rowKey1);
    put1.add(family, qualifier, value1);
    table.put(tx, put1);
    HaeinsaPut put2 = new HaeinsaPut(rowKey2);
    put2.add(family, qualifier, value2);
    table.put(tx, put2);
    tx.commit();

Slide 7

Slide 7 text

Design Principles
• No modification of HBase
• Optimistic concurrency control
• Lock column storing metadata of each row
• Two-phase commit protocol

Slide 8

Slide 8 text

Why no modification of HBase?
• Haeinsa is a client library, so it can be used with an existing HBase cluster as it is.
• Easy to implement, and easy to migrate an existing HBase cluster to Haeinsa
[Diagram: a client using Haeinsa in front of a bare-bones HBase cluster]

Slide 9

Slide 9 text

Pessimistic concurrency control
• Wait until the other concurrent transaction has completed, then execute.
• Locks are held for the entire transaction
[Diagram: timelines of T1 and T2. T2 can start only after T1 completes, due to locking]

Slide 10

Slide 10 text

Optimistic concurrency control
• Proceed without locking, and check for conflicts with other concurrent transactions before commit
• Lock only during the conflict check
[Diagram: timelines of T1 and T2. T2 can start even if T1 has not completed (no locking in this state); T2 is aborted if it conflicts with T1]
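The optimistic scheme on this slide can be sketched in plain Java. This is only an illustration under stated assumptions, not Haeinsa code: `VersionedCell`, `Snapshot`, `read` and `commit` are hypothetical names, and a version counter stands in for the Lock column. Both transactions read without locking, and the conflict check happens in one atomic step at commit.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of optimistic concurrency control; not Haeinsa code.
final class VersionedCell {
    // Immutable snapshot: a value plus the version at which it was written.
    static final class Snapshot {
        final long version;
        final int value;
        Snapshot(long version, int value) {
            this.version = version;
            this.value = value;
        }
    }

    private final AtomicReference<Snapshot> current;

    VersionedCell(int initialValue) {
        current = new AtomicReference<>(new Snapshot(0, initialValue));
    }

    // Reading takes no lock.
    Snapshot read() {
        return current.get();
    }

    // The commit succeeds only if nobody has committed since `seen` was read.
    boolean commit(Snapshot seen, int newValue) {
        return current.compareAndSet(seen, new Snapshot(seen.version + 1, newValue));
    }
}
```

If T1 and T2 both call `read()` and T1 commits first, T2's `commit` returns false, which corresponds to aborting T2 on conflict.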

Slide 11

Slide 11 text

Why OCC?
• Better concurrency and performance in a low-conflict environment.
• Typical HBase schema design for OLTP leads to a low-conflict environment.
  • E.g. each row stores all the data of a single user.
• Access to the user’s own row only is the most common kind of transaction in a general OLTP service.
Row key | Data | Lock
Bob | 3: $10 | State:STABLE, CommitTimestamp:3
Joe | 3: $2 | State:STABLE, CommitTimestamp:3

Slide 12

Slide 12 text

Lock column
• Special column used by Haeinsa internally
• Represents the locking state of the row
• Stores transactional metadata of each row
  • Metadata contains the state of the row, the mutations of the transaction and the timestamps of the row
Row key | Data | Lock
Bob | 3: $10 | State:STABLE, CommitTimestamp:3
Joe | 3: $2 | State:STABLE, CommitTimestamp:3
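One way to picture the Lock column is as a small value object. The sketch below models only the fields that appear in the slides' notation, such as (STABLE, 3) and (PREWRITTEN, 6, 4, [Joe]). The class name, field names and factory methods are assumptions made for illustration; the real Haeinsa lock format may differ.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of the per-row lock metadata; all names are assumptions.
final class RowLockSketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    final State state;
    final long commitTimestamp;
    final Long prewriteTimestamp;    // null unless the row is PREWRITTEN
    final List<String> secondaries;  // filled in on the primary row only
    final String primary;            // filled in on secondary rows only

    private RowLockSketch(State state, long commitTimestamp, Long prewriteTimestamp,
                          List<String> secondaries, String primary) {
        this.state = state;
        this.commitTimestamp = commitTimestamp;
        this.prewriteTimestamp = prewriteTimestamp;
        this.secondaries = Collections.unmodifiableList(secondaries);
        this.primary = primary;
    }

    // (STABLE, 3) in the slides' notation.
    static RowLockSketch stable(long commitTs) {
        return new RowLockSketch(State.STABLE, commitTs, null,
                Collections.<String>emptyList(), null);
    }

    // (PREWRITTEN, 6, 4, [Joe]) for the primary row.
    static RowLockSketch prewrittenPrimary(long commitTs, long prewriteTs,
                                           List<String> secondaries) {
        return new RowLockSketch(State.PREWRITTEN, commitTs, prewriteTs, secondaries, null);
    }

    // (PREWRITTEN, 6, 4, , Bob) for a secondary row.
    static RowLockSketch prewrittenSecondary(long commitTs, long prewriteTs, String primary) {
        return new RowLockSketch(State.PREWRITTEN, commitTs, prewriteTs,
                Collections.<String>emptyList(), primary);
    }

    // Other transactions may only work with rows that are stable.
    boolean isStable() {
        return state == State.STABLE;
    }
}
```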

Slide 13

Slide 13 text

Why lock column for each row?
• Lock column contains transactional metadata
  • Percolator stores transactional metadata in each cell
• Lock column is the basic unit of locking
  • Haeinsa stores metadata in each row
  • The unit of locking is wider than Percolator’s
  • This increases the probability of conflict but has less overhead
  • But we can presume that row-level locking is fine-grained enough to achieve a low conflict rate

Slide 14

Slide 14 text

Two-Phase Commit Protocol
• 2PC is an atomic commitment protocol
• Commit-request phase: the coordinator collects votes from the participants. If all participants vote YES, it starts the commit phase; otherwise it aborts.
• Commit phase: the coordinator sends the decision to the participants. Each participant follows the decision and sends an ack to the coordinator.
[Diagram: one coordinator and two participants, shown for each phase]
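The two phases above can be sketched as a small in-memory coordinator. This is generic 2PC under simplifying assumptions (no network, no failures, no acks), not Haeinsa's implementation; `TwoPhaseCommitSketch`, `Participant` and `Recording` are hypothetical names.

```java
import java.util.List;

// Hypothetical in-memory sketch of two-phase commit; not Haeinsa's implementation.
final class TwoPhaseCommitSketch {
    interface Participant {
        boolean prepare();  // vote: true means YES, false means NO
        void commit();
        void abort();
    }

    // Returns true if the transaction committed, false if it aborted.
    static boolean run(List<? extends Participant> participants) {
        // Commit-request phase: collect votes; a single NO decides the outcome.
        boolean allYes = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allYes = false;
                break;
            }
        }
        // Commit phase: send the same decision to every participant.
        for (Participant p : participants) {
            if (allYes) {
                p.commit();
            } else {
                p.abort();
            }
        }
        return allYes;
    }

    // Test double that votes as configured and records the decision it received.
    static final class Recording implements Participant {
        final boolean vote;
        String outcome = "PENDING";

        Recording(boolean vote) {
            this.vote = vote;
        }

        public boolean prepare() { return vote; }
        public void commit() { outcome = "COMMITTED"; }
        public void abort() { outcome = "ABORTED"; }
    }
}
```

In Haeinsa the client plays the coordinator role and the rows act as participants, as the later slides trace step by step.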

Slide 15

Slide 15 text

How read operation works
1. Read the Lock column
2. If the row is not in the stable state, abort the transaction or recover the row if possible
3. Read data from the cell
Row key | Bal | Lock
Bob | 3: $10 | State:STABLE, CommitTimestamp:3
Joe | 3: $2 | State:STABLE, CommitTimestamp:3

Slide 16

Slide 16 text

How write operation works
1. Read the Lock column
2. If the row is not in the stable state, abort the transaction or recover the row if possible
3. Store the new value in the client-side buffer
4. Write the value to HBase only if the Lock column has not changed (this can be executed atomically with the checkAndPut operation)
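Step 4 relies on compare-then-write happening as one atomic step. The sketch below imitates that behavior with a synchronized in-memory map so it can run without a cluster. It is a stand-in, not the HBase client API: the class name, the string-encoded lock values and the method signatures are assumptions. In this sketch a null expected value means the checked column must be absent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory stand-in for a table; it only mimics checkAndPut semantics.
final class CheckAndPutSketch {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    synchronized String get(String row, String column) {
        Map<String, String> cells = rows.get(row);
        return cells == null ? null : cells.get(column);
    }

    // Atomically: if rows[row][checkColumn] equals `expected`, apply all puts;
    // otherwise leave the table untouched and report failure.
    synchronized boolean checkAndPut(String row, String checkColumn, String expected,
                                     Map<String, String> puts) {
        Map<String, String> cells = rows.get(row);
        String actual = cells == null ? null : cells.get(checkColumn);
        if (!Objects.equals(actual, expected)) {
            return false;
        }
        if (cells == null) {
            cells = new HashMap<>();
            rows.put(row, cells);
        }
        cells.putAll(puts);
        return true;
    }
}
```

A writer that read the lock as "STABLE@3" passes that value as `expected` and includes a new lock value in `puts`, so any later writer still holding "STABLE@3" fails the check.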

Slide 17

Slide 17 text

How it works?
• Let’s trace how Haeinsa works by studying a transaction in which Bob gives $7 to Joe.
<< Before Transaction >>
HBase-side:
    Row key | Bal | Lock
    Bob | 3: $10 | State:STABLE, CommitTimestamp:3
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()

Slide 18

Slide 18 text

How it works?
C represents the client, and Rbob and Rjoe are the rows holding the balances of Bob and Joe. We will trace how Haeinsa works during the transaction.
[Diagram: sequence diagram with lifelines C, Rbob and Rjoe]
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 19

Slide 19 text

Nothing to do. Haeinsa just creates a Transaction instance in client memory.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 20

Slide 20 text

Read Bob’s Lock column first, and then read Bob’s Balance column. The lock holds the state of the row and its valid commit timestamp.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)

Slide 21

Slide 21 text

Bob’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)

Slide 22

Slide 22 text

Read Joe’s Lock column first, and then read Joe’s Balance column.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 23

Slide 23 text

Joe’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 24

Slide 24 text

Now it is time to explain the commit operation. Let’s assume that Bob’s row is the primary row and Joe’s is the secondary. We will use the 2PC protocol for the commit.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 25

Slide 25 text

Prewrite Bob’s new balance to Bob’s row. The row is now in the prewritten state, so Haeinsa prevents any other transaction from accessing it.
Remember: checkAndPut is atomic and ensures that the row has not been modified since it was read.
Row states: Rbob << prewritten >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (STABLE, 3)

Slide 26

Slide 26 text

Prewrite Joe’s new balance to Joe’s row.
Remember: checkAndPut is atomic and ensures that the row has not been modified since it was read.
Row states: Rbob << prewritten >>, Rjoe << prewritten >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

Slide 27

Slide 27 text

Change the state of Bob’s row to COMMITTED. The transaction can be treated as successful from now on.
Row states: Rbob << committed >>, Rjoe << prewritten >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

Slide 28

Slide 28 text

Change the state of Joe’s row to STABLE. Now other transactions can access the row.
Row states: Rbob << committed >>, Rjoe << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (STABLE, 6)

Slide 29

Slide 29 text

Change the state of Bob’s row to STABLE. Now other transactions can access the row.
Row states: Rbob << stable >>, Rjoe << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 6)
    Locks[Joe] = (STABLE, 6)

Slide 30

Slide 30 text

Transaction completed. All rows are in stable state.
Row states: Rbob << stable >>, Rjoe << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}
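The commit sequence traced on the preceding slides can be replayed with a small simulation. This is a sketch under heavy assumptions, not Haeinsa's real code: it is single-threaded, rows live in a plain map, the check on each row stands in for checkAndPut, and `CommitTraceSketch` with its members is a hypothetical name. The first key in `writes` is treated as the primary row, so callers should pass an ordered, non-empty map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-threaded simulation of the commit sequence; not Haeinsa's real code.
final class CommitTraceSketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    static final class Row {
        State state = State.STABLE;
        long commitTs;
        int bal;         // last committed balance
        Integer staged;  // prewritten balance, null when nothing is staged

        Row(int bal, long commitTs) {
            this.bal = bal;
            this.commitTs = commitTs;
        }
    }

    final Map<String, Row> table = new LinkedHashMap<>();
    final List<String> log = new ArrayList<>();

    // writes: row key -> new balance (first key acts as the primary row).
    // seenTs: commit timestamp each row had when the transaction first read it.
    boolean commit(Map<String, Integer> writes, Map<String, Long> seenTs, long newCommitTs) {
        List<String> keys = new ArrayList<>(writes.keySet());
        List<String> prewritten = new ArrayList<>();

        // Prewrite the primary row, then every secondary row.
        for (String key : keys) {
            Row row = table.get(key);
            boolean unchanged = row.state == State.STABLE && row.commitTs == seenTs.get(key);
            if (!unchanged) {
                // Conflict: undo the prewrites made so far and abort.
                for (String done : prewritten) {
                    Row undo = table.get(done);
                    undo.staged = null;
                    undo.state = State.STABLE;
                    log.add(done + ":ABORTED");
                }
                return false;
            }
            row.staged = writes.get(key);
            row.state = State.PREWRITTEN;
            prewritten.add(key);
            log.add(key + ":PREWRITTEN");
        }

        // Commit point: a single state change on the primary row.
        String primary = keys.get(0);
        table.get(primary).state = State.COMMITTED;
        log.add(primary + ":COMMITTED");

        // Apply and stabilize the secondaries first, then the primary.
        for (int i = 1; i < keys.size(); i++) {
            stabilize(keys.get(i), newCommitTs);
        }
        stabilize(primary, newCommitTs);
        return true;
    }

    private void stabilize(String key, long newCommitTs) {
        Row row = table.get(key);
        row.bal = row.staged;
        row.staged = null;
        row.commitTs = newCommitTs;
        row.state = State.STABLE;
        log.add(key + ":STABLE");
    }
}
```

With Bob at $10 and Joe at $2 (both at commit timestamp 3), committing the writes Bob=3 and Joe=9 with commit timestamp 6 walks through the same order as the slides: both rows prewritten, Bob committed, Joe stable, Bob stable.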

Slide 31

Slide 31 text

Correctness of the algorithm
Let's check whether this sequence of operations really ensures ACID transactions across multiple rows!
[Diagram: sequence of get, checkAndPut and write operations between C, Rbob and Rjoe]

Slide 32

Slide 32 text

checkAndPut is an atomic operation provided by HBase, so we can say that the row has not been modified since the get operation was executed.
checkAndPut ensures that the value of the row has not been modified since it was read. Remember: every modification via Haeinsa also modifies the Lock column.
[Diagram: sequence of get, checkAndPut and write operations between C, Rbob and Rjoe]

Slide 33

Slide 33 text

Haeinsa doesn't allow any operation to access unstable rows. That means Haeinsa locks the participating rows during the commit operation.
Since the row is not in the STABLE state, other transactions can't access the row during this interval. And each checkAndPut operation ensures that the row has not been accessed by another transaction.
[Diagram: sequence of get, checkAndPut and write operations between C, Rbob and Rjoe]

Slide 34

Slide 34 text

Atomicity of the transaction is ensured by a single checkAndPut operation.
This checkAndPut operation determines whether the whole transaction succeeds or not. The success of the transaction is determined by one atomic operation.
Row state: << committed >>
[Diagram: sequence of get, checkAndPut and write operations between C, Rbob and Rjoe]

Slide 35

Slide 35 text

If any checkAndPut operation fails, all rows can be recovered to the STABLE state. If the state of the primary row is COMMITTED, the transaction can be treated as successful, so apply the mutations to each row. If not, delete the prewritten values from all rows.
If any of these operations fails, the states of the rows can be recovered to STABLE.
[Diagram: sequence of get, checkAndPut and write operations between C, Rbob and Rjoe]
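The recovery rule on this slide can be sketched as a roll-forward or roll-back decision keyed on the primary row. This is an illustration only: `RecoverySketch`, `Row` and `recover` are hypothetical names, rows are plain in-memory objects, and timeouts and concurrent recoverers (which a real implementation has to handle) are ignored. It also assumes secondaries are always stabilized before the primary, as in the trace on the earlier slides.

```java
import java.util.Map;

// Hypothetical sketch of the recovery decision; not Haeinsa's real code.
final class RecoverySketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    static final class Row {
        State state;
        int bal;         // last committed balance
        Integer staged;  // prewritten balance, null when nothing is staged

        Row(State state, int bal, Integer staged) {
            this.state = state;
            this.bal = bal;
            this.staged = staged;
        }
    }

    // Brings every row of a failed transaction back to STABLE.
    static void recover(Map<String, Row> rows, String primaryKey) {
        // The primary row alone decides the outcome of the whole transaction.
        boolean committed = rows.get(primaryKey).state == State.COMMITTED;
        // Finish the secondaries first so the primary keeps recording the decision.
        for (Map.Entry<String, Row> entry : rows.entrySet()) {
            if (!entry.getKey().equals(primaryKey)) {
                finish(entry.getValue(), committed);
            }
        }
        finish(rows.get(primaryKey), committed);
    }

    private static void finish(Row row, boolean committed) {
        if (row.state == State.STABLE) {
            return;  // this row was already finished
        }
        if (committed && row.staged != null) {
            row.bal = row.staged;  // roll forward: apply the prewritten value
        }
        row.staged = null;         // on roll back the prewritten value is discarded
        row.state = State.STABLE;
    }
}
```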

Slide 36

Slide 36 text

Detailed trace of the transaction
There are rows representing the balances of Bob and Joe. Let’s trace how Haeinsa works by studying a transaction in which Bob gives $7 to Joe.
<< Before Transaction >>
HBase-side:
    Row key | Bal | Lock
    Bob | 3: $10 | State:STABLE, CommitTimestamp:3
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()

Slide 37

Slide 37 text

Nothing to do. Writes and Locks are client-side memory structures.
HBase-side:
    Row key | bal | lock
    Bob | 3: $10 | State:STABLE, CommitTimestamp:3
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 38

Slide 38 text

Read Bob’s Lock column first, and then read Bob’s Balance column.
HBase-side:
    Row key | bal | lock
    Bob | 3: $10 | State:STABLE, CommitTimestamp:3
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)

Slide 39

Slide 39 text

Bob’s new balance is put into Writes, which is kept in client-side memory. It will be written to HBase on commit.
HBase-side:
    Row key | bal | lock
    Bob | 3: $10 | State:STABLE, CommitTimestamp:3
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)

Slide 40

Slide 40 text

Read Joe’s Lock column first, and then read Joe’s Balance column.
HBase-side:
    Row key | bal | lock
    Bob | 3: $10 | State:STABLE, CommitTimestamp:3
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 41

Slide 41 text

Joe’s new balance is put into Writes.
HBase-side:
    Row key | bal | lock
    Bob | 3: $10 | State:STABLE, CommitTimestamp:3
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 42

Slide 42 text

Prewrite the value on the primary row. Haeinsa selects the primary row with its own algorithm.
HBase-side:
    Row key | bal | lock
    Bob | 4: $3, 3: $10 | State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Secondaries:[Joe]
    Joe | 3: $2 | State:STABLE, CommitTimestamp:3
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (STABLE, 3)

Slide 43

Slide 43 text

Prewrite the value on the secondary row. A secondary row is any participating row that is not the primary row.
HBase-side:
    Row key | bal | lock
    Bob | 4: $3, 3: $10 | State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Secondaries:[Joe]
    Joe | 4: $9, 3: $2 | State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Primary:Bob
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = [(Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

Slide 44

Slide 44 text

If all prewrites succeed, change the state of the primary row to COMMITTED. The transaction can be treated as successful from this moment.
HBase-side:
    Row key | bal | lock
    Bob | 4: $3, 3: $10 | State:COMMITTED, CommitTimestamp:6, Secondaries:[Joe]
    Joe | 4: $9, 3: $2 | State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Primary:Bob
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

Slide 45

Slide 45 text

Change the state of the secondary row to STABLE.
HBase-side:
    Row key | bal | lock
    Bob | 4: $3, 3: $10 | State:COMMITTED, CommitTimestamp:6, Secondaries:[Joe]
    Joe | 4: $9, 3: $2 | State:STABLE, CommitTimestamp:6
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (STABLE, 6)

Slide 46

Slide 46 text

Change the state of the primary row to STABLE.
HBase-side:
    Row key | bal | lock
    Bob | 4: $3, 3: $10 | State:STABLE, CommitTimestamp:6
    Joe | 4: $9, 3: $2 | State:STABLE, CommitTimestamp:6
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 6)
    Locks[Joe] = (STABLE, 6)

Slide 47

Slide 47 text

Transaction completed. All rows are in stable state.
HBase-side:
    Row key | bal | lock
    Bob | 4: $3, 3: $10 | State:STABLE, CommitTimestamp:6
    Joe | 4: $9, 3: $2 | State:STABLE, CommitTimestamp:6
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 48

Slide 48 text

Performance Optimization
• Read-only transaction
  • No Put or Delete operation.
  • No checkAndPut operation. Just get the Lock column and check whether it has been modified.
• Single-row transaction
  • Only a single row participates in the transaction.
  • Just one checkAndPut operation (STABLE → STABLE).

Slide 49

Slide 49 text

Read Only Transaction
C represents the client, and Rbob and Rjoe are the rows holding the balances of Bob and Joe. We will trace how a read-only transaction works.
[Diagram: sequence diagram with lifelines C, Rbob and Rjoe]
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 50

Slide 50 text

Nothing to do. Haeinsa just creates a Transaction instance in client memory.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 51

Slide 51 text

Read Bob’s Lock column and then read Bob’s Balance column.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)

Slide 52

Slide 52 text

Read Joe’s Lock column and then read Joe’s Balance column.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 53

Slide 53 text

Now we will see what happens in the commit operation if the transaction contains read operations only.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 54

Slide 54 text

Get Bob’s lock and check whether it has been modified since the first read operation was executed. If the lock has been modified, the transaction is aborted.
Row states: Rbob << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 55

Slide 55 text

Get Joe’s lock and check whether it has been modified since the first read operation was executed. If the lock has been modified, the transaction is aborted. If it has not been modified, the transaction is treated as successful.
Row states: Rbob << stable >>, Rjoe << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

Slide 56

Slide 56 text

The transaction completed successfully. This series of operations ensures that the rows have not been modified by other concurrent transactions.
Row states: Rbob << stable >>, Rjoe << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}
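The read-only commit check can be sketched as a comparison between the locks recorded at first read and the locks currently stored. This is an illustration with assumed names (`ReadOnlyCommitSketch`, `commitReadOnly`) and string-encoded locks; a real client would fetch the current locks from HBase with get operations.

```java
import java.util.Map;

// Hypothetical sketch of the read-only commit check; not Haeinsa's real code.
final class ReadOnlyCommitSketch {
    // seenLocks: lock value recorded for each row when the transaction first read it.
    // currentLocks: lock value stored for each row at commit time.
    // Returns true if the transaction may be treated as successful.
    static boolean commitReadOnly(Map<String, String> seenLocks,
                                  Map<String, String> currentLocks) {
        for (Map.Entry<String, String> seen : seenLocks.entrySet()) {
            String now = currentLocks.get(seen.getKey());
            if (!seen.getValue().equals(now)) {
                return false;  // the lock changed since the first read: abort
            }
        }
        return true;  // no checkAndPut was needed, only gets
    }
}
```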

Slide 57

Slide 57 text

Single Row Transaction
C represents the client, and Rbob is Bob’s row. Let’s trace how Haeinsa works with a single-row transaction.
[Diagram: sequence diagram with lifelines C and Rbob]
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 58

Slide 58 text

Nothing to do. Haeinsa just creates a Transaction instance in client memory.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 59

Slide 59 text

Read Bob’s Lock column first, and then read Bob’s Balance column. The lock contains the state of the row.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks[Bob] = (STABLE, 3)

Slide 60

Slide 60 text

Bob’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $17)]
    Locks[Bob] = (STABLE, 3)

Slide 61

Slide 61 text

Read Bob’s total column. We already have Bob’s lock, so we do not read it again this time.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $17)]
    Locks[Bob] = (STABLE, 3)

Slide 62

Slide 62 text

Bob’s new total is stored in the client’s memory. This new value will be applied to HBase on commit.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $17), (Bob, total, $17)]
    Locks[Bob] = (STABLE, 3)

Slide 63

Slide 63 text

Now it is time to explain the commit operation. The commit operation of a single-row transaction is much simpler than that of a multi-row transaction.
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $17), (Bob, total, $17)]
    Locks[Bob] = (STABLE, 3)

Slide 64

Slide 64 text

Only one checkAndPut operation is needed for a single-row transaction. All new values are applied to HBase with a single HBase operation.
Row states: Rbob << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = [(Bob, bal, $17), (Bob, total, $17)]
    Locks[Bob] = (STABLE, 4)

Slide 65

Slide 65 text

Transaction completed. The single checkAndPut operation ensures that the values were not modified by another concurrent transaction.
Row states: Rbob << stable >>
Client-side:
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
State of Transaction:
    Writes = []
    Locks = {}

Slide 66

Slide 66 text

Limitation of Haeinsa
• No control over timestamps
  • Timestamps in HBase are used by Haeinsa internally
  • No timestamp interface exists in the Haeinsa APIs
• Bounded transaction size
  • Targeted at transactions across a handful of rows (approx. 1 to 100s of rows)
  • Not for transactions against thousands of rows

Slide 67

Slide 67 text

Performance of Haeinsa
• Performance of Haeinsa can vary with the combination of operations in the transaction
• If the operations are concentrated in a small number of rows, performance is better

Slide 68

Slide 68 text

Measuring Performance
• Tested on AWS (c1.xlarge)
• Practical performance
  • Transaction = (3 writes + 1 read) * 2 rows + 1 read * 1 row
  • Simulates most transactions in our service
  • Better performance than raw HBase
  • Because: HBase does more RPCs than Haeinsa (Haeinsa applies writes on commit with checkAndPut)
• Worst-case performance
  • Transaction = 1 write * 2 rows + 1 read * 1 row
  • 2 to 3 times worse than raw HBase
  • But: it is much better than other transaction libraries

Slide 69

Slide 69 text

Practical Performance (Linear Scalability)
[Chart: Tx/Sec (axis 0 to 50000) against ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 70

Slide 70 text

Practical Performance (Latency)
[Chart: latency in ms (axis 0 to 35) against ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 71

Slide 71 text

Practical Performance (Throughput)
[Chart: Tx/ECU (axis 0 to 60) against ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 72

Slide 72 text

Worst-case Performance (Linear Scalability)
[Chart: Tx/Sec (axis 0 to 120000) against ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 73

Slide 73 text

Worst-case Performance (Latency)
[Chart: latency in ms (axis 0 to 30) against ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 74

Slide 74 text

Worst-case Performance (Throughput)
[Chart: Tx/ECU (axis 0 to 140) against ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 75

Slide 75 text

Conflict Rate
• If a conflict occurs during the commit operation, Haeinsa throws ConflictException
• If ConflictException is caught, our server retries the request with backoff
• If the maximum retry count is exceeded, the request fails
• We measured the conflict rate in our real service
  • Conflict rate: 0.004%~0.010%
  • Retry fail rate: 0.0003%~0.0010%
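The retry policy described on this slide can be sketched as a small helper. The slide does not say which backoff schedule the service uses, so exponential backoff is an assumption here, and so are the names `RetrySketch` and `withRetry`. The nested `ConflictException` is a local stand-in so that the example compiles without the Haeinsa library; it is not the library's class.

```java
import java.util.function.Supplier;

// Hypothetical retry helper; the backoff schedule and all names are assumptions.
final class RetrySketch {
    // Local stand-in for Haeinsa's ConflictException, so the sketch is self-contained.
    static final class ConflictException extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    // Runs `transaction`, retrying on conflict up to `maxRetries` times.
    // The wait doubles after each failed attempt, starting at `baseDelayMs`.
    static <T> T withRetry(Supplier<T> transaction, int maxRetries, long baseDelayMs) {
        for (int attempt = 0; ; attempt++) {
            try {
                return transaction.get();
            } catch (ConflictException e) {
                if (attempt >= maxRetries) {
                    throw e;  // maximum retry count exceeded: the request fails
                }
                try {
                    Thread.sleep(baseDelayMs << attempt);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", interrupted);
                }
            }
        }
    }
}
```

A caller would wrap the whole begin, read, write and commit sequence in the supplier, so that each retry starts a fresh transaction and re-reads the rows.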

Slide 76

Slide 76 text

Use Case
• Haeinsa is currently used in a real service
• Between
  • Mobile service for couples
  • Processes 300M+ transactions per day with Haeinsa
  • http://appbetween.us

Slide 77

Slide 77 text

Links
• Haeinsa source code: https://github.com/vcnc/haeinsa
• Haeinsa wiki: https://github.com/vcnc/haeinsa/wiki
• VCNC, the company that maintains Haeinsa: http://www.vcnc.co.kr
• Between, the service that uses Haeinsa: http://appbetween.us

Slide 78

Slide 78 text

How to Reach us
• Email: [email protected]
• You can report bugs and suggest improvements: https://github.com/vcnc/haeinsa/issues
• We are hiring: http://engineering.vcnc.co.kr/jobs