Haeinsa Overview (HBase Transaction Library)

VCNC
October 10, 2013

Haeinsa is a linearly scalable multi-row, multi-table transaction library for HBase. It uses two-phase locking and optimistic concurrency control to implement transactions. The isolation level of its transactions is serializable. Let's briefly see how Haeinsa works.
https://github.com/vcnc/haeinsa

Transcript

  1. Haeinsa Overview

    Design Principles, Transaction Algorithms and Performance Evaluation of Haeinsa
    Copyright 2013 VCNC Inc. All rights reserved
  2. There is no transaction in NoSQL

    • ACID transaction in this document
      • Can be a series of multiple operations
      • No restrictions on operations (such as single row only)
      • The transaction should guarantee full ACID
    • No full ACID transaction in NoSQL
      • HBase provides row-level ACID semantics only
      • Cassandra provides row-level ACID semantics only
      • MongoDB operations are atomic only at the level of a single document
      • Other NoSQL stores are not much different
  3. Why no ACID for NoSQL?

    • Providing reliable and fast transactions for a distributed system is hard
    • Don’t misunderstand the CAP theorem!
    (Diagram: two transactions, T1 and T2)
  4. ACID Transaction for NoSQL

    • Google’s Percolator, Megastore, Spanner
      • Provide full ACID properties for distributed systems
      • Run on Google’s closed systems
    • Attempts to provide ACID transactions for NoSQL
      • HAcid, Omid, HBaseSI and so on
      • But most of them were not linearly scalable
    • Haeinsa is the ACID transaction library
      • Provides full ACID properties for HBase
      • Linearly scalable throughput and fault-tolerant
      • Battle-tested (used in a real service)
  5. About Haeinsa

    • Multi-row, multi-table transaction library for HBase
    • Provides strong ACID properties
    • Linearly scalable throughput
    • Fault-tolerant against both client and HBase failures
    • Isolation level is serializable
    • Provides all of the basic operations: Get, Put, Delete, Scan
    • Used successfully in a real service
    • Open source
      • https://github.com/vcnc/haeinsa
  6. Sample Code

    // Start a transaction.
    HaeinsaTransaction tx = tm.begin();

    // Queue a put for the first row inside the transaction.
    HaeinsaPut put1 = new HaeinsaPut(rowKey1);
    put1.add(family, qualifier, value1);
    table.put(tx, put1);

    // Queue a put for the second row inside the same transaction.
    HaeinsaPut put2 = new HaeinsaPut(rowKey2);
    put2.add(family, qualifier, value2);
    table.put(tx, put2);

    // Commit both puts as one transaction.
    tx.commit();
  7. Design Principles

    • No modification on HBase
    • Optimistic concurrency control
    • Lock column storing the metadata of each row
    • Two-phase commit protocol
  8. Why no modification on HBase?

    • Haeinsa is a client library, so it can be used with an existing HBase cluster as it is.
    • Easy to implement, and easy to migrate an existing HBase cluster to Haeinsa
    (Diagram: a client using Haeinsa talks to a bare-bones HBase cluster)
  9. Pessimistic concurrency control

    • Wait until other concurrent transactions complete, then execute.
    • Locks the entire process of the transaction
    (Diagram: T1 and T2. T2 can start only after T1 completes, due to locking.)
  10. Optimistic concurrency control

    • Proceed without locking, and check for conflicts with other concurrent transactions before commit
    • Lock only during the conflict-check logic
    (Diagram: T1 and T2. T2 can start even if T1 has not completed, since nothing is locked in this state; T2 is aborted if it conflicts with T1.)
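The optimistic scheme on this slide can be sketched in a few lines of Java. This is only an illustration and not Haeinsa code: `VersionedCell`, `Snapshot`, `tryCommit` and `OccDemo` are hypothetical names, and a single in-memory cell stands in for an HBase row.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory cell; not part of Haeinsa or HBase.
final class VersionedCell {
    // Immutable snapshot of a value and the version it was written at.
    static final class Snapshot {
        final long version;
        final int value;
        Snapshot(long version, int value) {
            this.version = version;
            this.value = value;
        }
    }

    private final AtomicReference<Snapshot> current;

    VersionedCell(int initial) {
        current = new AtomicReference<Snapshot>(new Snapshot(0, initial));
    }

    // Reading takes no lock; the caller just remembers what it saw.
    Snapshot read() {
        return current.get();
    }

    // The commit succeeds only if nobody has committed since `seen` was read.
    boolean tryCommit(Snapshot seen, int newValue) {
        return current.compareAndSet(seen, new Snapshot(seen.version + 1, newValue));
    }
}

final class OccDemo {
    // T1 and T2 both read before either commits; returns {t1Committed, t2Committed}.
    static boolean[] run() {
        VersionedCell cell = new VersionedCell(10);
        VersionedCell.Snapshot t1 = cell.read();
        VersionedCell.Snapshot t2 = cell.read();          // T2 starts while T1 is still open
        boolean t1Ok = cell.tryCommit(t1, t1.value - 7);  // T1 commits first
        boolean t2Ok = cell.tryCommit(t2, t2.value + 1);  // T2 conflicts and must abort
        return new boolean[] { t1Ok, t2Ok };
    }
}
```

The only synchronized step is the compare-and-set at commit time, which corresponds to "lock only during the conflict-check logic".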
  11. Why OCC?

    • Better concurrency and performance in a low-conflict environment.
    • The usual schema design in HBase for OLTP leads to a low-conflict environment.
      • E.g. each row stores all the data of a single user.
    • Most transactions in a general OLTP service access only the user’s own row.

    Row key | Data   | Lock
    Bob     | 3: $10 | State:STABLE CommitTimestamp:3
    Joe     | 3: $2  | State:STABLE CommitTimestamp:3
  12. Lock column

    • Special column used by Haeinsa internally
    • Represents the locking state of the row
    • Stores the transactional metadata of each row
      • The metadata contains the state of the row, the mutations of the transaction and the timestamps of the row

    Row key | Data   | Lock
    Bob     | 3: $10 | State:STABLE CommitTimestamp:3
    Joe     | 3: $2  | State:STABLE CommitTimestamp:3
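As a rough picture of what the Lock column holds, the sketch below mirrors the fields that appear in the slides (State, CommitTimestamp, PrewriteTimestamp, Secondaries, Primary). `RowLockSketch` and its factory methods are hypothetical names chosen for illustration; this is not the encoding Haeinsa actually stores in HBase.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical value class mirroring the Lock column fields shown on the slides.
final class RowLockSketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    final State state;
    final long commitTimestamp;
    final Long prewriteTimestamp;    // null unless the row is PREWRITTEN
    final List<String> secondaries;  // filled on the primary row only
    final String primary;            // set on secondary rows only

    private RowLockSketch(State state, long commitTimestamp, Long prewriteTimestamp,
                          List<String> secondaries, String primary) {
        this.state = state;
        this.commitTimestamp = commitTimestamp;
        this.prewriteTimestamp = prewriteTimestamp;
        this.secondaries = secondaries;
        this.primary = primary;
    }

    static RowLockSketch stable(long commitTimestamp) {
        return new RowLockSketch(State.STABLE, commitTimestamp, null,
                Collections.<String>emptyList(), null);
    }

    static RowLockSketch prewrittenPrimary(long commitTs, long prewriteTs, String... secondaries) {
        return new RowLockSketch(State.PREWRITTEN, commitTs, prewriteTs,
                Arrays.asList(secondaries), null);
    }

    static RowLockSketch prewrittenSecondary(long commitTs, long prewriteTs, String primary) {
        return new RowLockSketch(State.PREWRITTEN, commitTs, prewriteTs,
                Collections.<String>emptyList(), primary);
    }

    // Only STABLE rows may be read or written by a new transaction.
    boolean isStable() {
        return state == State.STABLE;
    }
}
```

For example, `RowLockSketch.stable(3)` corresponds to `State:STABLE CommitTimestamp:3` in the table above.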
  13. Why lock column for each row?

    • The lock column contains transactional metadata
      • Percolator keeps transactional metadata per cell
    • The lock column is the basic unit of locking
      • Haeinsa stores metadata in each row
      • The unit of locking is wider than Percolator’s
      • This increases the probability of conflict but has less overhead
      • But we can presume that row-level locking is fine-grained enough to achieve a low conflict rate
  14. Two-Phase Commit Protocol

    • 2PC is one of the atomic commitment protocols
    • Commit-request phase: the coordinator collects votes from the participants. If all participants vote YES, it starts the commit phase; otherwise it aborts.
    • Commit phase: the coordinator sends the decision to the participants. Each participant follows the decision and sends an ack to the coordinator.
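The two phases can be sketched generically as follows. This is textbook 2PC rather than Haeinsa's implementation, the names `TwoPhaseCommitSketch`, `Participant` and `Recording` are hypothetical, and failure handling (timeouts, lost acks, coordinator crashes) is deliberately left out.

```java
import java.util.List;

final class TwoPhaseCommitSketch {
    // A participant votes in phase one and obeys the decision in phase two.
    interface Participant {
        boolean voteYes();
        void decide(boolean commit);
    }

    // Simple participant that records the decision it was given.
    static final class Recording implements Participant {
        private final boolean vote;
        Boolean decision;   // null until the coordinator decides

        Recording(boolean vote) {
            this.vote = vote;
        }

        public boolean voteYes() {
            return vote;
        }

        public void decide(boolean commit) {
            decision = commit;
        }
    }

    // Coordinator: commit only if every participant votes YES.
    static boolean run(List<? extends Participant> participants) {
        boolean commit = true;
        for (Participant p : participants) {    // commit-request phase
            if (!p.voteYes()) {
                commit = false;
                break;
            }
        }
        for (Participant p : participants) {    // commit phase
            p.decide(commit);
        }
        return commit;
    }
}
```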
  15. How read operation works

    1. Read the Lock column
    2. If the row is not in a stable state, abort the transaction or recover the row if possible
    3. Read data from the cell

    Row key | Bal    | Lock
    Bob     | 3: $10 | State:STABLE CommitTimestamp:3
    Joe     | 3: $2  | State:STABLE CommitTimestamp:3
  16. How write operation works

    1. Read the Lock column
    2. If the row is not in a stable state, abort the transaction or recover the row if possible
    3. Store the new value in a client-side buffer
    4. Write the value to HBase only if the Lock column has not changed (this can be executed atomically with the checkAndPut operation)
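The four write steps can be illustrated with an in-memory stand-in for the table. Everything here is an assumption made for the sketch: `CheckAndPutSketch` is not the HBase client API, the lock is reduced to a single timestamp, and `ConcurrentMap.replace` plays the role of HBase's atomic checkAndPut.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory stand-in for one HBase table; not the HBase client API.
final class CheckAndPutSketch {
    // Immutable row: the lock is reduced to its commit timestamp.
    static final class Row {
        final long lockTimestamp;
        final int balance;

        Row(long lockTimestamp, int balance) {
            this.lockTimestamp = lockTimestamp;
            this.balance = balance;
        }
    }

    private final ConcurrentMap<String, Row> rows = new ConcurrentHashMap<String, Row>();

    void seed(String key, Row row) {
        rows.put(key, row);
    }

    Row get(String key) {
        return rows.get(key);
    }

    // Atomically replaces the row only if it is still the one that was read.
    boolean checkAndPut(String key, Row expected, Row update) {
        return rows.replace(key, expected, update);
    }

    // Reads, buffers and commits one write; `interfere` simulates a concurrent commit in between.
    static boolean writeBalance(CheckAndPutSketch table, String key, int newBalance, boolean interfere) {
        Row seen = table.get(key);      // steps 1-2: read the lock (assumed stable here)
        int buffered = newBalance;      // step 3: keep the new value on the client
        if (interfere) {
            // Another transaction commits first and changes the lock.
            table.seed(key, new Row(seen.lockTimestamp + 1, 99));
        }
        Row update = new Row(seen.lockTimestamp + 1, buffered);
        return table.checkAndPut(key, seen, update);   // step 4: write only if unchanged
    }
}
```

When `interfere` is true the final `checkAndPut` returns false, which is the case in which the transaction has to abort.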
  17. How it works?

    • Let’s trace how Haeinsa works by studying a transaction in which Bob gives $7 to Joe.

    << Before Transaction >>

    HBase-side
      Row key | Bal    | Lock
      Bob     | 3: $10 | State:STABLE CommitTimestamp:3
      Joe     | 3: $2  | State:STABLE CommitTimestamp:3

    Client-side
      BeginTransaction()
      bobBal = Read(Bob, bal)
      Write(Bob, bal, bobBal-$7)
      joeBal = Read(Joe, bal)
      Write(Joe, bal, joeBal+$7)
      Commit()
  18. How it works?

    C represents the client, and Rbob and Rjoe are the rows holding the balances of Bob and Joe. We will trace how Haeinsa works during the transaction.

    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()

    State of Transaction: Writes = [], Locks = {}
  19. Nothing to do. Haeinsa just creates a Transaction instance in client memory.

    Step: BeginTransaction()
    State of Transaction: Writes = [], Locks = {}
  20. Read Bob’s Lock column first, and then read Bob’s Balance column. The lock holds the state of the row and its valid commit timestamp.

    Step: bobBal = Read(Bob, bal)
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3)
  21. Bob’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.

    Step: Write(Bob, bal, bobBal-$7)
    State of Transaction: Writes = [(Bob, bal, $3)], Locks[Bob] = (STABLE, 3)
  22. Read Joe’s Lock column first, and then read Joe’s Balance column.

    Step: joeBal = Read(Joe, bal)
    State of Transaction: Writes = [(Bob, bal, $3)], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  23. Joe’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.

    Step: Write(Joe, bal, joeBal+$7)
    State of Transaction: Writes = [(Bob, bal, $3), (Joe, bal, $9)], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  24. Now it is time to explain the commit operation. Let’s assume that Bob’s row is the primary row and Joe’s is the secondary. We will use the 2PC protocol for commit.

    Step: Commit()
    State of Transaction: Writes = [(Bob, bal, $3), (Joe, bal, $9)], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  25. Prewrite Bob’s new balance to Bob’s row. The row enters the prewritten state, so Haeinsa prevents any other transaction from accessing it.

    Remember: checkAndPut is atomic and ensures that the value of the row has not been modified since it was read.
    Step: Commit()
    Row states: Rbob << prewritten >>
    State of Transaction: Writes = [(Bob, bal, $3), (Joe, bal, $9)], Locks[Bob] = (PREWRITTEN, 6, 4, [Joe]), Locks[Joe] = (STABLE, 3)
  26. Prewrite Joe’s new balance to Joe’s row.

    Remember: checkAndPut is atomic and ensures that the value of the row has not been modified since it was read.
    Step: Commit()
    Row states: Rbob << prewritten >>, Rjoe << prewritten >>
    State of Transaction: Writes = [(Joe, bal, $9)], Locks[Bob] = (PREWRITTEN, 6, 4, [Joe]), Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)
  27. Change the state of Bob’s row to COMMITTED. The transaction can be treated as successful from now on.

    Step: Commit()
    Row states: Rbob << committed >>, Rjoe << prewritten >>
    State of Transaction: Writes = [], Locks[Bob] = (COMMITTED, 6, , [Joe]), Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)
  28. Change the state of Joe’s row to STABLE. Now other transactions can access the row.

    Step: Commit()
    Row states: Rbob << committed >>, Rjoe << stable >>
    State of Transaction: Writes = [], Locks[Bob] = (COMMITTED, 6, , [Joe]), Locks[Joe] = (STABLE, 6)
  29. Change the state of Bob’s row to STABLE. Now other transactions can access the row.

    Step: Commit()
    Row states: Rbob << stable >>, Rjoe << stable >>
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 6), Locks[Joe] = (STABLE, 6)
  30. Transaction completed. All rows are in the stable state.

    Row states: Rbob << stable >>, Rjoe << stable >>
    State of Transaction: Writes = [], Locks = {}
  31. Correctness of the algorithm

    Let's check whether the sequence of operations really ensures an ACID transaction across multiple rows!
    (Diagram: timeline of the client C and the rows Rbob and Rjoe, with the operations get, write, get, checkAndPut, write)
  32. checkAndPut is the atomic operation provided by HBase.

    So we can say that the row has not been modified since the get operation executed: checkAndPut ensures that the value of the row has not been modified since it was read.
    Remember: every modification via Haeinsa also modifies the Lock column.
  33. Haeinsa doesn't allow any operation to access unstable rows.

    That means Haeinsa locks the participating rows during the commit operation. Since the row is not in the STABLE state, other transactions can't access it during this interval. And each checkAndPut operation ensures that the row has not been accessed by another transaction.
  34. The atomicity of the transaction is ensured by a single checkAndPut operation.

    This checkAndPut operation (the one that moves the primary row to << committed >>) determines whether the whole transaction succeeds or not. The success of the transaction is determined by an atomic operation.
  35. If any checkAndPut operation fails, all rows can be recovered to the STABLE state.

    If the state of the primary row is COMMITTED, the transaction can be treated as successful, so apply the mutations to each row. If not, delete the prewritten values from all rows.
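The recovery rule stated on slide 35 can be written down as a small decision procedure. This is a sketch of the rule as the slide describes it, under the assumption that each row keeps its prewritten value next to its visible value; `RecoverySketch`, `Row` and `settle` are hypothetical names, and the real library's recovery path may involve more checks.

```java
final class RecoverySketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    static final class Row {
        State state;
        int value;            // visible value
        Integer prewritten;   // value written during prewrite, or null

        Row(State state, int value, Integer prewritten) {
            this.state = state;
            this.value = value;
            this.prewritten = prewritten;
        }
    }

    // Brings every row back to STABLE; returns true if the transaction was rolled forward.
    static boolean recover(Row primary, Iterable<Row> secondaries) {
        // The primary row alone decides the outcome of the whole transaction.
        boolean rollForward = primary.state == State.COMMITTED;
        for (Row r : secondaries) {
            settle(r, rollForward);
        }
        settle(primary, rollForward);
        return rollForward;
    }

    private static void settle(Row r, boolean rollForward) {
        if (rollForward && r.prewritten != null) {
            r.value = r.prewritten;   // apply the mutation
        }
        r.prewritten = null;          // otherwise the prewritten value is simply dropped
        r.state = State.STABLE;
    }
}
```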
  36. Detailed trace of the transaction

    There are rows representing the balances of Bob and Joe. Let’s trace how Haeinsa works by studying a transaction in which Bob gives $7 to Joe.

    << Before Transaction >>

    HBase-side
      Row key | Bal    | Lock
      Bob     | 3: $10 | State:STABLE CommitTimestamp:3
      Joe     | 3: $2  | State:STABLE CommitTimestamp:3

    Client-side
      BeginTransaction()
      bobBal = Read(Bob, bal)
      Write(Bob, bal, bobBal-$7)
      joeBal = Read(Joe, bal)
      Write(Joe, bal, joeBal+$7)
      Commit()
  37. Nothing to do. Writes and Locks are client-side memory structures.

    Step: BeginTransaction()
    HBase-side
      Row key | bal    | lock
      Bob     | 3: $10 | State:STABLE CommitTimestamp:3
      Joe     | 3: $2  | State:STABLE CommitTimestamp:3
    Client-side State of Transaction: Writes = [], Locks = {}
  38. Read Bob’s Lock column first, and then read Bob’s Balance column.

    Step: bobBal = Read(Bob, bal)
    HBase-side
      Row key | bal    | lock
      Bob     | 3: $10 | State:STABLE CommitTimestamp:3
      Joe     | 3: $2  | State:STABLE CommitTimestamp:3
    Client-side State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3)
  39. Bob’s new balance is put into Writes, which is kept in client-side memory. It will be written to HBase on commit.

    Step: Write(Bob, bal, bobBal-$7)
    HBase-side
      Row key | bal    | lock
      Bob     | 3: $10 | State:STABLE CommitTimestamp:3
      Joe     | 3: $2  | State:STABLE CommitTimestamp:3
    Client-side State of Transaction: Writes = [(Bob, bal, $3)], Locks[Bob] = (STABLE, 3)
  40. Read Joe’s Lock column first, and then read Joe’s Balance column.

    Step: joeBal = Read(Joe, bal)
    HBase-side
      Row key | bal    | lock
      Bob     | 3: $10 | State:STABLE CommitTimestamp:3
      Joe     | 3: $2  | State:STABLE CommitTimestamp:3
    Client-side State of Transaction: Writes = [(Bob, bal, $3)], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  41. Joe’s new balance is put into Writes.

    Step: Write(Joe, bal, joeBal+$7)
    HBase-side
      Row key | bal    | lock
      Bob     | 3: $10 | State:STABLE CommitTimestamp:3
      Joe     | 3: $2  | State:STABLE CommitTimestamp:3
    Client-side State of Transaction: Writes = [(Bob, bal, $3), (Joe, bal, $9)], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  42. Prewrite the value on the primary row. The primary row is selected by Haeinsa using a particular algorithm.

    Step: Commit()
    HBase-side
      Row key | bal           | lock
      Bob     | 4: $3, 3: $10 | State:PREWRITTEN CommitTimestamp:6 PrewriteTimestamp:4 Secondaries:[Joe]
      Joe     | 3: $2         | State:STABLE CommitTimestamp:3
    Client-side State of Transaction: Writes = [(Bob, bal, $3), (Joe, bal, $9)], Locks[Bob] = (PREWRITTEN, 6, 4, [Joe]), Locks[Joe] = (STABLE, 3)
  43. Prewrite the value on the secondary row. A secondary row is any row that is not the primary row.

    Step: Commit()
    HBase-side
      Row key | bal           | lock
      Bob     | 4: $3, 3: $10 | State:PREWRITTEN CommitTimestamp:6 PrewriteTimestamp:4 Secondaries:[Joe]
      Joe     | 4: $9, 3: $2  | State:PREWRITTEN CommitTimestamp:6 PrewriteTimestamp:4 Primary:Bob
    Client-side State of Transaction: Writes = [(Joe, bal, $9)], Locks[Bob] = (PREWRITTEN, 6, 4, [Joe]), Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)
  44. If all prewrites succeed, change the state of the primary row to COMMITTED. The transaction can be treated as successful from this moment.

    Step: Commit()
    HBase-side
      Row key | bal           | lock
      Bob     | 4: $3, 3: $10 | State:COMMITTED CommitTimestamp:6 Secondaries:[Joe]
      Joe     | 4: $9, 3: $2  | State:PREWRITTEN CommitTimestamp:6 PrewriteTimestamp:4 Primary:Bob
    Client-side State of Transaction: Writes = [], Locks[Bob] = (COMMITTED, 6, , [Joe]), Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)
  45. Change the state of the secondary row to STABLE.

    Step: Commit()
    HBase-side
      Row key | bal           | lock
      Bob     | 4: $3, 3: $10 | State:COMMITTED CommitTimestamp:6 Secondaries:[Joe]
      Joe     | 4: $9, 3: $2  | State:STABLE CommitTimestamp:6
    Client-side State of Transaction: Writes = [], Locks[Bob] = (COMMITTED, 6, , [Joe]), Locks[Joe] = (STABLE, 6)
  46. Change the state of the primary row to STABLE.

    Step: Commit()
    HBase-side
      Row key | bal           | lock
      Bob     | 4: $3, 3: $10 | State:STABLE CommitTimestamp:6
      Joe     | 4: $9, 3: $2  | State:STABLE CommitTimestamp:6
    Client-side State of Transaction: Writes = [], Locks[Bob] = (STABLE, 6), Locks[Joe] = (STABLE, 6)
  47. Transaction completed. All rows are in the stable state.

    HBase-side
      Row key | bal           | lock
      Bob     | 4: $3, 3: $10 | State:STABLE CommitTimestamp:6
      Joe     | 4: $9, 3: $2  | State:STABLE CommitTimestamp:6
    Client-side State of Transaction: Writes = [], Locks = {}
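The five commit steps traced above (prewrite primary, prewrite secondary, commit primary, stabilize secondary, stabilize primary) can be replayed with a small single-threaded model. This is a sketch only: `CommitTraceSketch` and its members are hypothetical names, every step is assumed to succeed, and the checkAndPut conflict checks that guard each step in Haeinsa are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Single-threaded replay of the commit steps; conflict checks are left out.
final class CommitTraceSketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    static final class Row {
        State state = State.STABLE;
        long commitTimestamp;
        int bal;             // committed, visible value
        Integer prewritten;  // value staged by prewrite, or null

        Row(long commitTimestamp, int bal) {
            this.commitTimestamp = commitTimestamp;
            this.bal = bal;
        }
    }

    final Map<String, Row> table = new LinkedHashMap<String, Row>();
    final List<String> log = new ArrayList<String>();

    // Records the state of every row after a step.
    private void record(String step) {
        StringBuilder sb = new StringBuilder(step);
        for (Map.Entry<String, Row> e : table.entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue().state);
        }
        log.add(sb.toString());
    }

    private void prewrite(Row r, int value, long commitTs) {
        r.state = State.PREWRITTEN;
        r.prewritten = value;
        r.commitTimestamp = commitTs;
    }

    private void stabilize(Row r) {
        r.bal = r.prewritten;
        r.prewritten = null;
        r.state = State.STABLE;
    }

    // Moves `amount` from the primary row to the secondary row.
    void transfer(String primaryKey, String secondaryKey, int amount, long commitTs) {
        Row primary = table.get(primaryKey);
        Row secondary = table.get(secondaryKey);
        prewrite(primary, primary.bal - amount, commitTs);
        record("prewrite-primary");
        prewrite(secondary, secondary.bal + amount, commitTs);
        record("prewrite-secondary");
        primary.state = State.COMMITTED;   // from here on the transaction counts as successful
        record("commit-primary");
        stabilize(secondary);
        record("stabilize-secondary");
        stabilize(primary);
        record("stabilize-primary");
    }
}
```

Seeding the table with Bob at $10 and Joe at $2 and calling `transfer("Bob", "Joe", 7, 6)` walks through the same row states as slides 42 to 46.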
  48. Performance Optimization

    • Read-only transaction
      • No Put or Delete operation.
      • No checkAndPut operation. Just get the Lock column and check whether it has been modified.
    • Single-row transaction
      • Only a single row participates in the transaction.
      • Just one checkAndPut operation. (STABLE → STABLE)
  49. Read Only Transaction

    C represents the client, and Rbob and Rjoe are the rows holding the balances of Bob and Joe. We will trace how a read-only transaction works.

    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()

    State of Transaction: Writes = [], Locks = {}
  50. Nothing to do. Haeinsa just creates a Transaction instance in client memory.

    Step: BeginTransaction()
    State of Transaction: Writes = [], Locks = {}
  51. Read Bob’s Lock column and then read Bob’s Balance column.

    Step: bobBal = Read(Bob, bal)
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3)
  52. Read Joe’s Lock column and then read Joe’s Balance column.

    Step: joeBal = Read(Joe, bal)
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  53. Now we will see what happens in the commit operation if the transaction contains read operations only.

    Step: Commit()
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  54. Get Bob’s lock and check whether it has been modified since the first read operation executed. If the lock has been modified, the transaction will be aborted.

    Step: Commit()
    Row states: Rbob << stable >>
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  55. Get Joe’s lock and check whether it has been modified since the first read operation executed. If the lock has been modified, the transaction will be aborted. If it has not, the transaction is treated as successful.

    Step: Commit()
    Row states: Rbob << stable >>, Rjoe << stable >>
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3), Locks[Joe] = (STABLE, 3)
  56. The transaction completed successfully. This series of operations ensures that the rows have not been modified by other concurrent transactions.

    Row states: Rbob << stable >>, Rjoe << stable >>
    State of Transaction: Writes = [], Locks = {}
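The commit check for a read-only transaction amounts to re-reading each lock and comparing it with what was seen earlier. In the sketch below the lock is reduced to its commit timestamp, and `ReadOnlyCommitSketch.canCommit` is a hypothetical helper rather than a Haeinsa method.

```java
import java.util.Map;

final class ReadOnlyCommitSketch {
    // Returns true if every lock seen during the transaction is unchanged at commit time.
    // Both maps go from row key to the commit timestamp stored in that row's lock.
    static boolean canCommit(Map<String, Long> locksSeen, Map<String, Long> locksNow) {
        for (Map.Entry<String, Long> seen : locksSeen.entrySet()) {
            Long now = locksNow.get(seen.getKey());
            if (!seen.getValue().equals(now)) {
                return false;   // the row changed after it was read: abort
            }
        }
        return true;            // nothing changed: no checkAndPut is needed
    }
}
```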
  57. Single Row Transaction

    C represents the client, and Rbob is Bob’s row. Let’s trace how Haeinsa works with a single-row transaction.

    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()

    State of Transaction: Writes = [], Locks = {}
  58. Nothing to do. Haeinsa just creates a Transaction instance in client memory.

    Step: BeginTransaction()
    State of Transaction: Writes = [], Locks = {}
  59. Read Bob’s Lock column first, and then read Bob’s Balance column. The lock contains the state of the row.

    Step: bobBal = Read(Bob, bal)
    State of Transaction: Writes = [], Locks[Bob] = (STABLE, 3)
  60. Bob’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.

    Step: Write(Bob, bal, bobBal+$7)
    State of Transaction: Writes = [(Bob, bal, $17)], Locks[Bob] = (STABLE, 3)
  61. Read Bob’s total column. We already have Bob’s lock, so we do not read it again this time.

    Step: bobTot = Read(Bob, total)
    State of Transaction: Writes = [(Bob, bal, $17)], Locks[Bob] = (STABLE, 3)
  62. Bob’s new total is stored in the client’s memory. This new value will be applied to HBase on commit.

    Step: Write(Bob, total, bobTot+$7)
    State of Transaction: Writes = [(Bob, bal, $17), (Bob, total, $17)], Locks[Bob] = (STABLE, 3)
  63. Now it is time to explain the commit operation. The commit operation of a single-row transaction is much simpler than that of a multi-row transaction.

    Step: Commit()
    State of Transaction: Writes = [(Bob, bal, $17), (Bob, total, $17)], Locks[Bob] = (STABLE, 3)
  64. Only one checkAndPut operation is needed for a single-row transaction. All new values are applied to HBase with a single HBase operation.

    Step: Commit()
    Row states: Rbob << stable >>
    State of Transaction: Writes = [(Bob, bal, $17), (Bob, total, $17)], Locks[Bob] = (STABLE, 4)
  65. Transaction completed. The single checkAndPut operation ensures that the values were not modified by any other concurrent transaction.

    Row states: Rbob << stable >>
    State of Transaction: Writes = [], Locks = {}
  66. Limitation of Haeinsa

    • No control over timestamps
      • Timestamps in HBase are used by Haeinsa internally
      • No timestamp interface exists in the Haeinsa APIs
    • Bounded transaction size
      • Targeted at transactions across a handful of rows (approx. from 1 to 100s of rows)
      • Not for transactions against thousands of rows
  67. Performance of Haeinsa

    • The performance of Haeinsa can vary with the combination of operations in the transaction
    • Performance is better when the operations are concentrated in a small number of rows
  68. Measuring Performance

    • Tested on AWS (c1.xlarge)
    • Practical performance
      • Transaction = (3 writes + 1 read) * 2 rows + 1 read * 1 row
      • Simulates most of the transactions in our service
      • Better performance than raw HBase
        • Because raw HBase does more RPCs than Haeinsa (Haeinsa applies writes on commit with checkAndPut)
    • Worst-case performance
      • Transaction = 1 write * 2 rows + 1 read * 1 row
      • 2 to 3 times worse than raw HBase
      • But it is much better than other transaction libraries
  69. Practical Performance (Linear Scalability)

    (Chart: Tx/Sec against ECU of HBase Cluster, for Haeinsa and HBase)
  70. Practical Performance (Latency)

    (Chart: latency in ms against ECU of HBase Cluster, for Haeinsa and HBase)
  71. Practical Performance (Throughput)

    (Chart: Tx/ECU against ECU of HBase Cluster, for Haeinsa and HBase)
  72. Worst-case Performance (Linear Scalability)

    (Chart: Tx/Sec against ECU of HBase Cluster, for Haeinsa and HBase)
  73. Worst-case Performance (Latency)

    (Chart: latency in ms against ECU of HBase Cluster, for Haeinsa and HBase)
  74. Worst-case Performance (Throughput)

    (Chart: Tx/ECU against ECU of HBase Cluster, for Haeinsa and HBase)
  75. Conflict Rate

    • If a conflict occurs during the commit operation, Haeinsa throws ConflictException
    • If ConflictException is caught, our server retries the request with backoff
    • If the maximum retry count is exceeded, the request fails
    • We measured the conflict rate in our real service
      • Conflict rate: 0.004%~0.010%
      • Retry fail rate: 0.0003%~0.0010%
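A retry loop of the kind described on this slide might look like the sketch below. It is an assumption-laden illustration, not the server code used by the authors: `RetrySketch`, `Tx` and `withRetry` are hypothetical names, the nested `ConflictException` is a local stand-in for the exception Haeinsa throws, and the doubling backoff is one possible policy since the slide does not say which backoff is used.

```java
// Hypothetical retry helper; ConflictException here is a local stand-in class.
final class RetrySketch {
    static final class ConflictException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    // One transactional request that may hit a conflict at commit time.
    interface Tx<T> {
        T run() throws ConflictException;
    }

    // Runs `tx` up to maxAttempts times, doubling the sleep after each conflict.
    static <T> T withRetry(Tx<T> tx, int maxAttempts, long baseBackoffMs)
            throws ConflictException, InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                return tx.run();
            } catch (ConflictException e) {
                if (attempt >= maxAttempts) {
                    throw e;   // retry budget exhausted: the request fails
                }
                Thread.sleep(baseBackoffMs << (attempt - 1));
            }
        }
    }

    // Demo: a request that conflicts `conflicts` times before succeeding.
    // Returns the number of attempts used, or -1 if the request failed.
    static int demo(final int conflicts, int maxAttempts) {
        final int[] attempts = {0};
        try {
            withRetry(new Tx<Boolean>() {
                public Boolean run() throws ConflictException {
                    if (++attempts[0] <= conflicts) {
                        throw new ConflictException();
                    }
                    return Boolean.TRUE;
                }
            }, maxAttempts, 1L);
            return attempts[0];
        } catch (ConflictException e) {
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }
}
```

In real use the body of `run` would begin a fresh transaction on every attempt, since an aborted transaction cannot be committed again.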
  76. Use Case

    • Haeinsa is currently used in a real service
    • Between
      • Mobile service for couples
      • Processes 300M+ transactions per day with Haeinsa
      • http://appbetween.us
  77. Links

    • Haeinsa source code: https://github.com/vcnc/haeinsa
    • Haeinsa wiki: https://github.com/vcnc/haeinsa/wiki
    • VCNC, the company that maintains Haeinsa: http://www.vcnc.co.kr
    • Between, the service that uses Haeinsa: http://appbetween.us
  78. How to Reach us

    • Email: haeinsa_dev@vcnc.co.kr
    • You can report bugs and suggest improvements at https://github.com/vcnc/haeinsa/issues
    • We are hiring: http://engineering.vcnc.co.kr/jobs