Haeinsa Overview (HBase Transaction Library)

VCNC
October 10, 2013

Haeinsa is a linearly scalable multi-row, multi-table transaction library for HBase. It uses two-phase locking and optimistic concurrency control to implement transactions. The isolation level of its transactions is serializable. Let's briefly see how Haeinsa works.
https://github.com/vcnc/haeinsa


Transcript

  1. Haeinsa Overview
    Design Principles, Transaction Algorithms
    and Performance Evaluation of Haeinsa
    Copyright 2013 VCNC Inc. All rights reserved

  2. There is no transaction in NoSQL
    • "ACID transaction" in this document means:
      • It can be a series of multiple operations
      • There are no restrictions on operations (such as single row only)
      • The transaction guarantees full ACID
    • No full ACID transaction in NoSQL
      • HBase provides row-level ACID semantics only
      • Cassandra provides row-level ACID semantics only
      • MongoDB operations are atomic only at the level of a single document
      • Other NoSQL stores are not much different

  3. Why no ACID for NoSQL?
    • Providing reliable and fast transactions for a distributed system is hard
    • Don’t misunderstand the CAP theorem!

  4. ACID Transaction for NoSQL
    • Google’s Percolator, Megastore, Spanner
      • Provide full ACID properties for distributed systems
      • Run on Google’s closed systems
    • Attempts to provide ACID transactions for NoSQL
      • HAcid, Omid, HBaseSI and so on
      • But most of them were not linearly scalable
    • Haeinsa is an ACID transaction library
      • Provides full ACID properties for HBase
      • Linearly scalable throughput and fault tolerance
      • Battle tested (used in a real service)

  5. About Haeinsa
    • Multi-row, multi-table transaction library for HBase
    • Provides strong ACID properties
    • Linearly scalable throughput
    • Fault-tolerant against both client and HBase failures
    • Isolation level is serializable
    • Provides all the basic operations: Get, Put, Delete, Scan
    • Used successfully in a real service
    • Open source
    • https://github.com/vcnc/haeinsa

  6. Sample Code
    HaeinsaTransaction tx = tm.begin();
    HaeinsaPut put1 = new HaeinsaPut(rowKey1);
    put1.add(family, qualifier, value1);
    table.put(tx, put1);
    HaeinsaPut put2 = new HaeinsaPut(rowKey2);
    put2.add(family, qualifier, value2);
    table.put(tx, put2);
    tx.commit();

  7. Design Principles
    • No modification to HBase
    • Optimistic concurrency control
    • Lock column storing metadata of each row
    • Two-phase commit protocol

  8. Why no modification to HBase?
    • Haeinsa is a client library, so it can be used with an HBase cluster as it is.
    • Easy to implement, and easy to migrate an existing HBase cluster to Haeinsa
    (Diagram: a client using Haeinsa talks to a bare-bones HBase cluster.)

  9. Pessimistic concurrency control
    • Wait until other concurrent transactions have completed, then execute.
    • Locks are held for the entire transaction
    (Diagram: T2 can start only after T1 completes, due to locking.)

  10. Optimistic concurrency control
    • Proceed without locking, and check for conflicts with other concurrent transactions before commit
    • Locks are held only during the conflict check
    (Diagram: T2 can start even if T1 has not completed, since nothing is locked at this stage. T2 is aborted if it conflicts with T1.)
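
The optimistic scheme above can be sketched in a few lines. This is only an illustration, not Haeinsa code: the class `OccSketch`, the `Versioned` snapshot and the single shared `cell` are hypothetical names, and an in-memory compare-and-set stands in for the conflict check.

```java
import java.util.concurrent.atomic.AtomicReference;

public class OccSketch {
    /** Immutable snapshot of a value together with the version it was read at. */
    static final class Versioned {
        final long version;
        final int value;
        Versioned(long version, int value) { this.version = version; this.value = value; }
    }

    // Shared cell standing in for a row; starts at version 0 with balance 10.
    static final AtomicReference<Versioned> cell =
            new AtomicReference<>(new Versioned(0, 10));

    /** Commit succeeds only if nobody replaced the snapshot since it was read. */
    static boolean commit(Versioned snapshot, int newValue) {
        return cell.compareAndSet(snapshot, new Versioned(snapshot.version + 1, newValue));
    }

    public static void main(String[] args) {
        Versioned t1 = cell.get(); // T1 reads without taking a lock
        Versioned t2 = cell.get(); // T2 starts before T1 has completed
        System.out.println("T1 committed: " + commit(t1, t1.value - 7));
        // T2 still holds the old snapshot, so its conflict check fails and it aborts.
        System.out.println("T2 committed: " + commit(t2, t2.value + 1));
    }
}
```

Nothing blocks while the transactions run; the only synchronized step is the compare-and-set at commit time.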

  11. Why OCC?
    • Better concurrency and performance in a low-conflict environment.
    • Typical HBase schema design for OLTP leads to a low-conflict environment.
    • E.g. each row stores all the data of a single user.
    Row key  Data    Lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Accessing only the user’s own row is the most common kind of transaction in a general OLTP service.

  12. Lock column
    • Special column used by Haeinsa internally
    • Represents the locking state of the row
    • Stores the transactional metadata of each row
    • The metadata contains the state of the row, the mutations of the transaction and the timestamps of the row
    Row key  Data    Lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3

  13. Why lock column for each row?
    • The lock column contains transactional metadata
      • Percolator keeps transactional metadata per cell
    • The lock column is the basic unit of locking
      • Haeinsa stores metadata once per row
      • The unit of locking is wider than Percolator’s
      • This increases the probability of conflict but has less overhead
      • We can presume that row-level locking is fine-grained enough to achieve a low conflict rate

  14. Two-Phase Commit Protocol
    • 2PC is an atomic commitment protocol
    • Commit-request phase: the coordinator collects votes from the participants. If all participants vote YES, it starts the commit phase; otherwise it aborts.
    • Commit phase: the coordinator sends the decision to the participants. Each participant follows the decision and sends an ack to the coordinator.
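
The two phases can be sketched as a small in-memory program. This is a generic illustration of 2PC, not Haeinsa's implementation; `Participant`, `FixedVote` and `run` are hypothetical names, and network failures and acks are left out.

```java
import java.util.Arrays;
import java.util.List;

public class TwoPhaseCommitSketch {
    /** A participant votes in phase one and follows the decision in phase two. */
    interface Participant {
        boolean vote();              // commit-request phase: true means YES
        void finish(boolean commit); // commit phase: apply or abort
    }

    /** Coordinator logic: collect every vote, then broadcast the decision. */
    static boolean run(List<? extends Participant> participants) {
        boolean allYes = true;
        for (Participant p : participants) {
            allYes = p.vote() && allYes; // every participant is asked
        }
        for (Participant p : participants) {
            p.finish(allYes);
        }
        return allYes;
    }

    /** Trivial participant with a fixed vote, used only for the demo. */
    static final class FixedVote implements Participant {
        final boolean vote;
        Boolean outcome = null; // decision received in the commit phase
        FixedVote(boolean vote) { this.vote = vote; }
        public boolean vote() { return vote; }
        public void finish(boolean commit) { outcome = commit; }
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(new FixedVote(true), new FixedVote(true))));
        System.out.println(run(Arrays.asList(new FixedVote(true), new FixedVote(false))));
    }
}
```

A single NO vote is enough to make every participant receive an abort decision.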

  15. How read operation works
    1. Read the Lock column
    2. If the row is not in the stable state, abort the transaction or recover the row if possible
    3. Read the data from the cell
    Row key  Bal     Lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
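
The read steps above can be sketched against a toy in-memory table. Everything here is an assumption made for illustration: `Row`, `Tx` and `read` are hypothetical names rather than Haeinsa classes, the lock is reduced to a state plus a commit timestamp, and the "recover the row" branch is omitted so the read simply aborts.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadSketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    /** Hypothetical stand-in for one HBase row: a balance cell plus its lock column. */
    static final class Row {
        State state = State.STABLE;
        long commitTimestamp;
        int bal;
        Row(long commitTimestamp, int bal) {
            this.commitTimestamp = commitTimestamp;
            this.bal = bal;
        }
    }

    /** Client-side transaction state: the lock timestamp seen at the first read of each row. */
    static final class Tx {
        final Map<String, Long> locks = new HashMap<>();
    }

    static int read(Tx tx, Map<String, Row> table, String rowKey) {
        Row row = table.get(rowKey);
        // Steps 1 and 2: inspect the lock column and abort if the row is not stable.
        if (row.state != State.STABLE) {
            throw new IllegalStateException("row " + rowKey + " is " + row.state + ", aborting");
        }
        tx.locks.putIfAbsent(rowKey, row.commitTimestamp); // remember the lock for commit time
        // Step 3: read the data cell.
        return row.bal;
    }

    public static void main(String[] args) {
        Map<String, Row> table = new HashMap<>();
        table.put("Bob", new Row(3, 10));
        table.put("Joe", new Row(3, 2));
        Tx tx = new Tx();
        System.out.println("bobBal = " + read(tx, table, "Bob"));
        System.out.println("locks  = " + tx.locks);
    }
}
```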

  16. How write operation works
    1. Read the Lock column
    2. If the row is not in the stable state, abort the transaction or recover the row if possible
    3. Store the new value in a client-side buffer
    4. Write the value to HBase only if the Lock column has not changed (this can be executed atomically with the checkAndPut operation)
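
Step 4 depends on a conditional write. The sketch below imitates that idea in memory, with `ConcurrentHashMap.replace` playing the role of the atomic check. It is not the HBase client API: the `checkAndPut` method, the `Row` class and the lock timestamps here are hypothetical stand-ins.

```java
import java.util.concurrent.ConcurrentHashMap;

public class CheckAndPutSketch {
    /** Immutable row: a lock timestamp plus a balance. */
    static final class Row {
        final long lockTs;
        final int bal;
        Row(long lockTs, int bal) { this.lockTs = lockTs; this.bal = bal; }
    }

    static final ConcurrentHashMap<String, Row> table = new ConcurrentHashMap<>();

    /** Writes newBal only if the row's lock still carries the timestamp we read. */
    static boolean checkAndPut(String key, long expectedLockTs, long newLockTs, int newBal) {
        Row current = table.get(key);
        if (current == null || current.lockTs != expectedLockTs) {
            return false; // lock changed since our read, so the write is rejected
        }
        // replace() is atomic and succeeds only if the row is still the object we inspected.
        return table.replace(key, current, new Row(newLockTs, newBal));
    }

    public static void main(String[] args) {
        table.put("Bob", new Row(3, 10));
        long seenLock = table.get("Bob").lockTs;  // lock read at the start
        int buffered = table.get("Bob").bal - 7;  // new value kept in a client-side buffer
        System.out.println(checkAndPut("Bob", seenLock, 4, buffered)); // lock unchanged
        System.out.println(checkAndPut("Bob", seenLock, 5, 0));        // stale lock
    }
}
```

The second call fails because the first one already moved the lock timestamp from 3 to 4.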

  17. How does it work?
    • Let’s trace how Haeinsa works by following a transaction in which Bob gives $7 to Joe.
    << Before Transaction >>
    HBase-side
    Row key  Bal     Lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()

  18. How does it work?
    C represents the client, and Rbob and Rjoe are the rows holding the balances of Bob and Joe. We will trace how Haeinsa works during the transaction.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  19. Nothing to do. Haeinsa just creates a Transaction instance in client memory.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  20. Read Bob’s Lock column first, then read Bob’s Balance column. The lock holds the state of the row and its valid commit timestamp.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)

  21. Bob’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)

  22. Read Joe’s Lock column first, then read Joe’s Balance column.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  23. Joe’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  24. Now it is time to explain the commit operation. Let’s assume that Bob’s row is the primary row and Joe’s is the secondary. We will use the 2PC protocol for commit.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  25. Prewrite Bob’s new balance to Bob’s row. The row enters the prewritten state, so Haeinsa prevents any other transaction from accessing it. Remember: checkAndPut is atomic and ensures that the value of the row has not been modified since it was read.
    Rbob: << prewritten >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (STABLE, 3)

  26. Prewrite Joe’s new balance to Joe’s row. Remember: checkAndPut is atomic and ensures that the value of the row has not been modified since it was read.
    Rbob: << prewritten >>
    Rjoe: << prewritten >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

  27. Set the state of Bob’s row to COMMITTED. The transaction can be treated as successful from now on.
    Rbob: << committed >>
    Rjoe: << prewritten >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

  28. Set the state of Joe’s row to STABLE. Now other transactions can access the row.
    Rbob: << committed >>
    Rjoe: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (STABLE, 6)

  29. Set the state of Bob’s row to STABLE. Now other transactions can access the row.
    Rbob: << stable >>
    Rjoe: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 6)
    Locks[Joe] = (STABLE, 6)

  30. Transaction completed. All rows are in the stable state.
    Rbob: << stable >>
    Rjoe: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  31. Correctness of the algorithm
    Let's check whether this sequence of operations really ensures ACID transactions across multiple rows!
    (Diagram: timelines for Rbob, Rjoe and C showing the get, write and checkAndPut operations.)

  32. checkAndPut is an atomic operation provided by HBase, so we can say that the row has not been modified since the get operation was executed.
    (Diagram: timelines for Rbob, Rjoe and C showing the get, write and checkAndPut operations.)
    checkAndPut ensures that the value of the row has not been modified since it was read. Remember: every modification made via Haeinsa also modifies the Lock column.

  33. Haeinsa does not allow any operation to access unstable rows. That means Haeinsa locks the participating rows during the commit operation.
    (Diagram: timelines for Rbob, Rjoe and C showing the get, write and checkAndPut operations.)
    Since the row is not in the STABLE state, other transactions can't access it during this interval. And each checkAndPut operation ensures that the row has not been accessed by another transaction.

  34. Atomicity of the transaction is ensured by a single checkAndPut operation.
    (Diagram: timelines for Rbob, Rjoe and C showing the get, write and checkAndPut operations, with the row marked << committed >>.)
    This checkAndPut operation determines whether the whole transaction succeeds or not. The success of the transaction is decided by one atomic operation.

  35. If any checkAndPut operation fails, all rows can be recovered to the STABLE state. If the state of the primary row is COMMITTED, the transaction can be treated as successful, so apply the mutations to each row. If not, delete the prewritten values from all rows.
    (Diagram: timelines for Rbob, Rjoe and C showing the get, write and checkAndPut operations.)
    If any of these operations fails, the state of the rows can be recovered to STABLE.
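
The recovery rule can be sketched as follows. This is a simplified in-memory model with hypothetical names (`Row`, `settle`, `recover`); it only encodes the decision described on this slide (roll forward if the primary row is COMMITTED, roll back otherwise) and leaves out anything else a real recovery path might need, such as deciding when a transaction should be considered failed.

```java
import java.util.Arrays;
import java.util.List;

public class RecoverySketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    /** Hypothetical row: lock state, committed balance and an optional prewritten value. */
    static final class Row {
        State state;
        int bal;
        Integer prewritten; // null when nothing is prewritten
        Row(State state, int bal, Integer prewritten) {
            this.state = state;
            this.bal = bal;
            this.prewritten = prewritten;
        }
    }

    /** Applies or discards the prewritten value, then returns the row to STABLE. */
    static void settle(Row r, boolean commit) {
        if (commit && r.prewritten != null) {
            r.bal = r.prewritten;
        }
        r.prewritten = null;
        r.state = State.STABLE;
    }

    /** The primary row's state alone decides the outcome; returns true on roll-forward. */
    static boolean recover(Row primary, List<Row> secondaries) {
        boolean commit = primary.state == State.COMMITTED;
        for (Row r : secondaries) {
            settle(r, commit);
        }
        settle(primary, commit); // the primary is settled last, as in the trace
        return commit;
    }

    public static void main(String[] args) {
        Row bob = new Row(State.COMMITTED, 10, 3);
        Row joe = new Row(State.PREWRITTEN, 2, 9);
        System.out.println("rolled forward: " + recover(bob, Arrays.asList(joe)));
        System.out.println("Bob=" + bob.bal + " Joe=" + joe.bal);
    }
}
```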

  36. Detailed trace of the transaction
    There are rows representing the balances of Bob and Joe. Let’s trace how Haeinsa works by following a transaction in which Bob gives $7 to Joe.
    << Before Transaction >>
    HBase-side
    Row key  Bal     Lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()

  37. Nothing to do. Writes and Locks are client-side memory structures.
    HBase-side
    Row key  bal     lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  38. Read Bob’s Lock column first, then read Bob’s Balance column.
    HBase-side
    Row key  bal     lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)

  39. Bob’s new balance is put into Writes and stored in client-side memory. It will be written to HBase on commit.
    HBase-side
    Row key  bal     lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)

  40. Read Joe’s Lock column first, then read Joe’s Balance column.
    HBase-side
    Row key  bal     lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  41. Joe’s new balance is put into Writes.
    HBase-side
    Row key  bal     lock
    Bob      3: $10  State:STABLE, CommitTimestamp:3
    Joe      3: $2   State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  42. Prewrite the value on the primary row. The primary row is selected by Haeinsa using a particular algorithm.
    HBase-side
    Row key  bal            lock
    Bob      4: $3, 3: $10  State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Secondaries:[Joe]
    Joe      3: $2          State:STABLE, CommitTimestamp:3
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $3), (Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (STABLE, 3)

  43. Prewrite the value on the secondary row. A secondary row is any row that is not the primary row.
    HBase-side
    Row key  bal            lock
    Bob      4: $3, 3: $10  State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Secondaries:[Joe]
    Joe      4: $9, 3: $2   State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Primary:Bob
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = [(Joe, bal, $9)]
    Locks[Bob] = (PREWRITTEN, 6, 4, [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

  44. If all prewrites succeed, change the state of the primary row to COMMITTED. The transaction can be treated as successful at this moment.
    HBase-side
    Row key  bal            lock
    Bob      4: $3, 3: $10  State:COMMITTED, CommitTimestamp:6, Secondaries:[Joe]
    Joe      4: $9, 3: $2   State:PREWRITTEN, CommitTimestamp:6, PrewriteTimestamp:4, Primary:Bob
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (PREWRITTEN, 6, 4, , Bob)

  45. Change the state of the secondary row to STABLE.
    HBase-side
    Row key  bal            lock
    Bob      4: $3, 3: $10  State:COMMITTED, CommitTimestamp:6, Secondaries:[Joe]
    Joe      4: $9, 3: $2   State:STABLE, CommitTimestamp:6
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (COMMITTED, 6, , [Joe])
    Locks[Joe] = (STABLE, 6)

  46. Change the state of the primary row to STABLE.
    HBase-side
    Row key  bal            lock
    Bob      4: $3, 3: $10  State:STABLE, CommitTimestamp:6
    Joe      4: $9, 3: $2   State:STABLE, CommitTimestamp:6
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 6)
    Locks[Joe] = (STABLE, 6)

  47. Transaction completed. All rows are in the stable state.
    HBase-side
    Row key  bal            lock
    Bob      4: $3, 3: $10  State:STABLE, CommitTimestamp:6
    Joe      4: $9, 3: $2   State:STABLE, CommitTimestamp:6
    Client-side
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal-$7)
    joeBal = Read(Joe, bal)
    Write(Joe, bal, joeBal+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}
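
The commit sequence traced above (prewrite, mark the primary COMMITTED, then return the secondary and primary rows to STABLE) can be replayed in a single-threaded sketch. It is a simplified model, not Haeinsa's code: `Row`, `settle` and `commit` are hypothetical, a plain comparison stands in for each checkAndPut, the first entry of `writes` is assumed to be the primary row, and the commit timestamp is only recorded when a row is settled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CommitTraceSketch {
    enum State { STABLE, PREWRITTEN, COMMITTED }

    /** Hypothetical in-memory row: lock state, commit timestamp, balance, staged value. */
    static final class Row {
        State state = State.STABLE;
        long commitTs;
        int bal;
        Integer staged; // prewritten value, null when nothing is staged
        Row(long commitTs, int bal) { this.commitTs = commitTs; this.bal = bal; }
    }

    /** Applies the staged value and returns the row to STABLE. */
    static void settle(Row r, long commitTs) {
        r.bal = r.staged;
        r.staged = null;
        r.commitTs = commitTs;
        r.state = State.STABLE;
    }

    /** Commits the buffered writes; returns false (abort) if any lock changed since it was read. */
    static boolean commit(Map<String, Row> table, LinkedHashMap<String, Integer> writes,
                          Map<String, Long> locksSeen, long commitTs) {
        List<Row> prewritten = new ArrayList<>();
        for (Map.Entry<String, Integer> w : writes.entrySet()) {
            Row r = table.get(w.getKey());
            // Stand-in for checkAndPut: continue only if the lock is as we read it.
            if (r.state != State.STABLE || r.commitTs != locksSeen.get(w.getKey())) {
                for (Row p : prewritten) { p.staged = null; p.state = State.STABLE; } // roll back
                return false;
            }
            r.staged = w.getValue();
            r.state = State.PREWRITTEN;
            prewritten.add(r);
        }
        if (prewritten.isEmpty()) return true;       // nothing to write
        prewritten.get(0).state = State.COMMITTED;   // the decision point
        for (int i = 1; i < prewritten.size(); i++) settle(prewritten.get(i), commitTs);
        settle(prewritten.get(0), commitTs);         // the primary returns to STABLE last
        return true;
    }

    public static void main(String[] args) {
        Map<String, Row> table = new HashMap<>();
        table.put("Bob", new Row(3, 10));
        table.put("Joe", new Row(3, 2));
        LinkedHashMap<String, Integer> writes = new LinkedHashMap<>();
        writes.put("Bob", 3);
        writes.put("Joe", 9);
        Map<String, Long> locksSeen = new HashMap<>();
        locksSeen.put("Bob", 3L);
        locksSeen.put("Joe", 3L);
        System.out.println("committed: " + commit(table, writes, locksSeen, 6));
        System.out.println("Bob=" + table.get("Bob").bal + " Joe=" + table.get("Joe").bal);
    }
}
```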

  48. Performance Optimization
    • Read-only transaction
      • No Put or Delete operations.
      • No checkAndPut operation. Just get the Lock column and check whether it has been modified.
    • Single-row transaction
      • Only a single row participates in the transaction.
      • Just one checkAndPut operation (STABLE → STABLE).

  49. Read-Only Transaction
    C represents the client, and Rbob and Rjoe are the rows holding the balances of Bob and Joe. We will trace how a read-only transaction works.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  50. Nothing to do. Haeinsa just creates a Transaction instance in client memory.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  51. Read Bob’s Lock column and then read Bob’s Balance column.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)

  52. Read Joe’s Lock column and then read Joe’s Balance column.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  53. Now we will see what happens in the commit operation if the transaction contains read operations only.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  54. Get Bob’s lock and check whether it has been modified since the first read operation. If the lock has been modified, the transaction is aborted.
    Rbob: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  55. Get Joe’s lock and check whether it has been modified since the first read operation. If the lock has been modified, the transaction is aborted. If it has not, the transaction is treated as successful.
    Rbob: << stable >>
    Rjoe: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)
    Locks[Joe] = (STABLE, 3)

  56. The transaction completed successfully. This series of operations ensures that the rows have not been modified by other concurrent transactions.
    Rbob: << stable >>
    Rjoe: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    joeBal = Read(Joe, bal)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}
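
The read-only commit boils down to re-reading each lock and comparing it with what the transaction saw first. A minimal sketch, assuming each lock can be summarized by its commit timestamp; `commitReadOnly` and both maps are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadOnlyCommitSketch {
    /**
     * Returns true when every lock still has the commit timestamp recorded
     * at the first read; any difference means the transaction must abort.
     */
    static boolean commitReadOnly(Map<String, Long> currentLocks, Map<String, Long> locksSeen) {
        for (Map.Entry<String, Long> seen : locksSeen.entrySet()) {
            Long now = currentLocks.get(seen.getKey());
            if (!seen.getValue().equals(now)) {
                return false; // lock modified (or row gone) since our read
            }
        }
        return true; // no checkAndPut needed, only gets
    }

    public static void main(String[] args) {
        Map<String, Long> locksSeen = new HashMap<>();
        locksSeen.put("Bob", 3L);
        locksSeen.put("Joe", 3L);

        Map<String, Long> current = new HashMap<>(locksSeen);
        System.out.println("unchanged: " + commitReadOnly(current, locksSeen));

        current.put("Joe", 6L); // another transaction committed on Joe's row
        System.out.println("after concurrent write: " + commitReadOnly(current, locksSeen));
    }
}
```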

  57. Single-Row Transaction
    C represents the client, and Rbob is Bob’s row. Let’s trace how Haeinsa works with a single-row transaction.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  58. Nothing to do. Haeinsa just creates a Transaction instance in client memory.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  59. Read Bob’s Lock column first, then read Bob’s Balance column. The lock contains the state of the row.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks[Bob] = (STABLE, 3)

  60. Bob’s new balance is stored in the client’s memory. This new value will be applied to HBase on commit.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $17)]
    Locks[Bob] = (STABLE, 3)

  61. Read Bob’s total column. We already have Bob’s lock, so we do not read it again this time.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $17)]
    Locks[Bob] = (STABLE, 3)

  62. Bob’s new total is stored in the client’s memory. This new value will be applied to HBase on commit.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $17), (Bob, total, $17)]
    Locks[Bob] = (STABLE, 3)

  63. Now it is time to explain the commit operation. The commit operation of a single-row transaction is much simpler than that of a multi-row transaction.
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $17), (Bob, total, $17)]
    Locks[Bob] = (STABLE, 3)

  64. Only one checkAndPut operation is needed for a single-row transaction. All new values are applied to HBase with a single HBase operation.
    Rbob: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = [(Bob, bal, $17), (Bob, total, $17)]
    Locks[Bob] = (STABLE, 4)

  65. Transaction completed. The single checkAndPut operation ensures that the values were not modified by another concurrent transaction.
    Rbob: << stable >>
    BeginTransaction()
    bobBal = Read(Bob, bal)
    Write(Bob, bal, bobBal+$7)
    bobTot = Read(Bob, total)
    Write(Bob, total, bobTot+$7)
    Commit()
    State of Transaction
    Writes = []
    Locks = {}

  66. Limitation of Haeinsa
    • No control over timestamps
      • Timestamps in HBase are used by Haeinsa internally
      • There is no timestamp interface in the Haeinsa APIs
    • Bounded transaction size
      • Targeted at transactions across a handful of rows (approximately 1 to hundreds of rows)
      • Not for transactions against thousands of rows

  67. Performance of Haeinsa
    • Haeinsa's performance varies with the
    combination of operations in the transaction
    • Performance is better when the operations are
    concentrated in a small number of rows

  68. Measuring Performance
    • Tested on AWS (c1.xlarge)
    • Practical Performance
    • Transaction = (3 writes + 1 read) * 2 rows + 1 read * 1 row
    • Simulates most transactions in our service
    • Better performance than raw HBase
    • Reason: raw HBase issues more RPCs than Haeinsa (Haeinsa
    applies writes at commit with checkAndPut)
    • Worst-case Performance
    • Transaction = 1 write * 2 rows + 1 read * 1 row
    • 2 to 3 times worse than raw HBase
    • But still much better than other transaction libraries
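
The two transaction shapes above can be expanded into logical operation counts with a short calculation. The helper name `operations_per_transaction` and the tuple layout are illustrative. The sketch counts logical reads and writes only; it does not model how many RPCs Haeinsa or raw HBase actually issue.

```python
def operations_per_transaction(groups):
    """Sum (writes + reads) * rows over the row groups of one transaction."""
    return sum((writes + reads) * rows for (writes, reads, rows) in groups)


# Each tuple is (writes per row, reads per row, number of rows).
practical = [(3, 1, 2), (0, 1, 1)]   # (3 writes + 1 read) * 2 rows + 1 read * 1 row
worst_case = [(1, 0, 2), (0, 1, 1)]  # 1 write * 2 rows + 1 read * 1 row

practical_ops = operations_per_transaction(practical)    # 9 logical operations
worst_case_ops = operations_per_transaction(worst_case)  # 3 logical operations
```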

  69. Practical Performance (Linear Scalability)
    [Chart: Tx/Sec (y-axis, 0 to 50,000) versus ECU of HBase Cluster (x-axis, 0 to 1,200); series: Haeinsa, HBase]

  70. Practical Performance (Latency)
    [Chart: latency in ms (y-axis, 0 to 35) versus ECU of HBase Cluster (x-axis, 0 to 1,200); series: Haeinsa, HBase]

  71. Practical Performance (Throughput)
    [Chart: Tx/ECU (y-axis, 0 to 60) versus ECU of HBase Cluster (x-axis, 0 to 1,200); series: Haeinsa, HBase]

  72. Worst-case Performance (Linear Scalability)
    [Chart: Tx/Sec (y-axis, 0 to 120,000) versus ECU of HBase Cluster (x-axis, 0 to 1,200); series: Haeinsa, HBase]

  73. Worst-case Performance (Latency)
    [Chart: latency in ms (y-axis, 0 to 30) versus ECU of HBase Cluster (x-axis, 0 to 1,200); series: Haeinsa, HBase]

  74. Worst-case Performance (Throughput)
    [Chart: Tx/ECU (y-axis, 0 to 140) versus ECU of HBase Cluster (x-axis, 0 to 1,200); series: Haeinsa, HBase]

  75. Conflict Rate
    • If a conflict occurs during the commit operation, Haeinsa
    throws ConflictException
    • If ConflictException is caught, our server retries the
    request with backoff
    • If the maximum retry count is exceeded, the request fails
    • We measured the conflict rate in our real service
    • Conflict rate: 0.004%~0.010%
    • Retry fail rate: 0.0003%~0.0010%
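
The retry policy described above might look roughly like the following sketch. The `ConflictException` class here is a local stand-in for the exception Haeinsa throws. The helper `run_with_retry`, its default parameters, and the exponential backoff with jitter are assumptions, since the slides only say "retries the request with backoff".

```python
import random
import time


class ConflictException(Exception):
    """Local stand-in for the exception Haeinsa throws on a commit conflict."""


def run_with_retry(transaction, max_retries=3, base_delay=0.01, sleep=time.sleep):
    """Run transaction(); on ConflictException, retry with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return transaction()
        except ConflictException:
            if attempt == max_retries:
                raise  # retry budget exhausted: the request fails
            # Assumed policy: exponential backoff with random jitter.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))


# Usage: a transaction that conflicts twice and then commits.
attempts = {"count": 0}


def flaky_transaction():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConflictException("simulated conflict")
    return "committed"


result = run_with_retry(flaky_transaction, sleep=lambda seconds: None)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.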

  76. Use Case
    • Haeinsa is currently used in a production service
    • Between
    • Mobile service for couples
    • Processes 300M+ transactions per day with Haeinsa
    • http://appbetween.us

  77. Links
    • Haeinsa Source Code
    https://github.com/vcnc/haeinsa
    • Haeinsa Wiki
    https://github.com/vcnc/haeinsa/wiki
    • VCNC: the company that maintains Haeinsa
    http://www.vcnc.co.kr
    • Between: the service that uses Haeinsa
    http://appbetween.us

  78. How to Reach Us
    • Email
    [email protected]
    • Report bugs and suggest improvements:
    https://github.com/vcnc/haeinsa/issues
    • We are hiring:
    http://engineering.vcnc.co.kr/jobs