HBase in Between

by VCNC

Slide 1

Slide 1 text

HBase in Between
Myungbo Kim, VCNC

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Agenda
• HBase experience
  – Between Service (OLTP)
  – Between Log Analysis (OLAP)
• Haeinsa
  – Open-source HBase transaction library
  – Made by VCNC
• Summary

Slide 4

Slide 4 text

Between Architecture
[Diagram components: ELB (HTTP), ELB #1 to #3 (TCP), API #1 to #3, ZooKeeper, HBase (Cluster); connections labeled HTTP and TCP]

Slide 5

Slide 5 text

HBase in OLTP
• Between has used HBase as its main DB since the beginning of the service
• HBase in AWS
  – Uses CDH 4.4.0 (HBase 0.94.6)
  – EC2 instances, managed manually
  – HDFS with replication x3
    • HA namenode
  – m2.4xlarge instances
    • 68.4GB RAM

Slide 6

Slide 6 text

Why choose HBase?
• Messaging is the key feature
• No need for complex schemas or queries
• Prepared for scale (+ AWS)
• High write throughput

Slide 7

Slide 7 text

Was it worth doing?

Slide 8

Slide 8 text

The answer is yes, but … we made a lot of mistakes!

Slide 9

Slide 9 text

Mistake list
• Hot Region / Cold Region
• Major compaction storm
• Long log splitting in RS crash
• TCP no delay
• Long latency in region balancing
• AWS storage issues
• …

Slide 10

Slide 10 text

Mistake – (1)
• Hot Region / Cold Region
  – Regions are split by file size
  – Tables grow at different speeds
  – Manual split is recommended
[Diagram: regions of tables T1 and T2 spread across RS 1 and RS 2]
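One way to split manually is to choose the split keys up front rather than relying on size-based splits. The sketch below assumes row keys begin with a uniformly distributed, fixed-width hex prefix (an assumption made for illustration; the slides do not describe Between's key design) and computes evenly spaced split points. `hex_split_points` is a hypothetical helper, not part of HBase.

```python
def hex_split_points(num_regions, width=8):
    """Return num_regions - 1 evenly spaced split keys over a hex key space.

    Assumes row keys start with `width` uniformly distributed hex digits.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    space = 16 ** width
    return [
        format(space * i // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]


# Four regions over a two-digit hex prefix need three split keys.
points = hex_split_points(4, width=2)
print(points)
```

The resulting keys could then be supplied when a table is created or split, so that each table starts with a region count chosen by the operator.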

Slide 11

Slide 11 text

Mistake – (2)
• Major compaction storm
  – Run major compaction manually during off-peak hours
[Diagram: old files merged into a compacted file; timeline labeled Peak and Off-peak]

Slide 12

Slide 12 text

Mistake – (3)
• Long latency in region balancing
[Diagram: distribution of T1 regions across RS 1 and RS 2]

Slide 13

Slide 13 text

What we learned
• You have to understand HBase to operate it correctly!
• HBase is not yet optimized for latency in many cases

Slide 14

Slide 14 text

HBase in OLAP
• Between analyzes user action logs
  – 300M+ per day
• HDFS + HBase + MapReduce + MySQL

Slide 15

Slide 15 text

How we analyze
• Cluster in office (NOT AWS)
  – We don’t have a lot of money
  – Cheap PCs
[Diagram components: API, S3, Log Aggregator, HBase (Cluster), MapReduce, MySQL, SQL; arrows labeled Upload, Download, Import]

Slide 16

Slide 16 text

What to analyze
• User retention
• Activity across country, device, and gender
• Activity patterns depending on the length of the relationship
• Data-driven decision making!

Slide 17

Slide 17 text

Haeinsa

Slide 18

Slide 18 text

Haeinsa – Why we made it
• HBase only supports ACID semantics for a single row
  – Only supports checkAndPut, checkAndDelete
• OLTP without transactions was a NIGHTMARE
• No good alternatives were available
• Google built transactions on top of Bigtable
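As a rough illustration of the single-row guarantee mentioned on this slide, the sketch below models checkAndPut as a compare-and-set on one in-memory row. The `Row` class and its method names are hypothetical stand-ins written for this example, not the HBase client API; only the semantics (check one cell, then apply the put atomically within the same row) are taken from HBase.

```python
import threading


class Row:
    """In-memory stand-in for a single row (hypothetical, not the HBase API)."""

    def __init__(self):
        self._cells = {}
        self._mutex = threading.Lock()  # models per-row atomicity

    def get(self, column):
        with self._mutex:
            return self._cells.get(column)

    def check_and_put(self, check_column, expected, updates):
        """Apply `updates` only if `check_column` still holds `expected`."""
        with self._mutex:
            if self._cells.get(check_column) != expected:
                return False
            self._cells.update(updates)
            return True


row = Row()
row.check_and_put("balance", None, {"balance": 100})  # column absent, so this succeeds

seen = row.get("balance")
# The first writer's check matches the value it read, so its put is applied.
ok_first = row.check_and_put("balance", seen, {"balance": seen - 7})
# A second writer still holding the stale value is rejected.
ok_stale = row.check_and_put("balance", seen, {"balance": seen + 7})
print(ok_first, ok_stale, row.get("balance"))
```

The guarantee stops at the row boundary: nothing here lets two rows change together, which is the gap a transaction layer has to fill.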

Slide 19

Slide 19 text

Haeinsa
• Haeinsa is an open-source transaction library for HBase
• Made & maintained by VCNC
• Implemented using only the basic HBase library
  – Does not use coprocessors
  – Does not change HBase

Slide 20

Slide 20 text

Haeinsa
• Haeinsa is a layer between the application and the HBase client library
[Diagram: stacked layers labeled Application, Haeinsa, HBase Client Library]

Slide 21

Slide 21 text

Haeinsa Mechanism – (1)
[Diagram: rows row1 and row2, each with columns Col1, Col2, Col3 and a Lock column; labels: CheckAndPut, Check]

Slide 22

Slide 22 text

Haeinsa Mechanism – (2)
[Diagram: lock state transitions]
Successful transaction: Stable -> Prewritten -> Committed -> Stable
Failed transaction: Stable -> Prewritten -> Aborted -> Stable

Slide 23

Slide 23 text

Haeinsa – example
BeginTransaction()
bobBalance = Read(Bob, balance)
Write(Bob, balance, bobBalance-$7)
joeBalance = Read(Joe, balance)
Write(Joe, balance, joeBalance+$7)
Commit()

Slide 24

Slide 24 text

Haeinsa Mechanism – (3)
[Diagram: timeline of the example transaction with steps R bob, R joe, C; underlying operations labeled get, write, get, checkAndPut, write]

Slide 25

Slide 25 text

checkAndPut is an atomic operation provided by HBase, so we can be sure that the row has not been modified since the get operation was executed.
[Diagram: timeline R bob, R joe, C; operations get, write, get, checkAndPut, write]
checkAndPut ensures that the value of the row has not been modified since it was read. Remember: every modification via Haeinsa also modifies the Lock column.

Slide 26

Slide 26 text

Haeinsa does not allow any operation to access unstable rows. That is, Haeinsa locks the participating rows during the commit operation.
[Diagram: timeline R bob, R joe, C; operations get, write, get, checkAndPut, write]
Since the row is not in the STABLE state, other transactions cannot access it during this interval, and each checkAndPut operation ensures that the row has not been accessed by another transaction.

Slide 27

Slide 27 text

The atomicity of the transaction is ensured by a single checkAndPut operation.
[Diagram: timeline R bob, R joe, C; operations get, write, get, checkAndPut, write; one checkAndPut marked << committed >>]
This checkAndPut operation determines whether the whole transaction succeeds or not, so the success of the transaction is decided by one atomic operation.

Slide 28

Slide 28 text

If any checkAndPut operation fails, all rows can be recovered to the STABLE state. If the state of the primary row is COMMITTED, the transaction is treated as successful, so the mutations are applied to each row. If not, the prewritten values are deleted from all rows.
[Diagram: timeline R bob, R joe, C; operations get, write, get, checkAndPut, write]
If any of these operations fails, the state of each row can be recovered to STABLE.
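The commit and recovery steps described on these slides can be sketched with a toy, single-process model. This is not Haeinsa's implementation or API: `Table`, `Transaction`, `finish`, the `pending` cell, and the `(state, version)` lock encoding are all assumptions made for illustration, and the sketch only validates rows that are written (read-only rows, timeouts, and concurrent recovery are left out). It follows the state transitions named above: prewrite every row with a check-and-put, flip the primary row to COMMITTED with one atomic check-and-put, then return every row to STABLE.

```python
STABLE, PREWRITTEN, COMMITTED, ABORTED = "STABLE", "PREWRITTEN", "COMMITTED", "ABORTED"

class Conflict(Exception):
    """A row was changed or locked by another transaction."""

class Table:
    """Toy in-memory table; check_and_put mimics an atomic single-row check-and-put."""
    def __init__(self):
        self.rows = {}

    def get(self, key):
        return dict(self.rows.setdefault(key, {"lock": (STABLE, 0)}))

    def check_and_put(self, key, expected_lock, updates):
        row = self.rows.setdefault(key, {"lock": (STABLE, 0)})
        if row["lock"] != expected_lock:
            return False
        row.update(updates)
        return True

def finish(table, primary, keys):
    """Return the given rows to STABLE; the primary row's lock decides the outcome."""
    lock = table.get(primary)["lock"]
    if lock[0] == PREWRITTEN:
        # Not committed: record the abort decision on the primary row first.
        table.check_and_put(primary, lock, {"lock": (ABORTED, lock[1] + 1)})
    committed = table.get(primary)["lock"][0] == COMMITTED
    for key in [k for k in keys if k != primary] + [primary]:
        row = table.get(key)
        state, version = row["lock"]
        if state == STABLE:
            continue
        # Apply the prewritten values on success, drop them on failure.
        updates = dict(row.get("pending") or {}) if committed else {}
        updates.update({"lock": (STABLE, version + 1), "pending": None})
        table.check_and_put(key, row["lock"], updates)
    return committed

class Transaction:
    def __init__(self, table):
        self.table = table
        self.seen = {}    # row key -> lock observed on first access
        self.writes = {}  # row key -> {column: value}, buffered until commit

    def _observe(self, key):
        row = self.table.get(key)
        if row["lock"][0] != STABLE:
            raise Conflict("row %r is not STABLE" % key)
        self.seen.setdefault(key, row["lock"])
        return row

    def read(self, key, column):
        row = self._observe(key)
        return self.writes.get(key, {}).get(column, row.get(column))

    def write(self, key, column, value):
        self._observe(key)
        self.writes.setdefault(key, {})[column] = value

    def commit(self):
        keys = sorted(self.writes)
        if not keys:
            return
        primary, done = keys[0], []
        for key in keys:  # prewrite: lock each row and stash its new values
            lock = (PREWRITTEN, self.seen[key][1] + 1)
            prewrite = {"lock": lock, "pending": self.writes[key]}
            if not self.table.check_and_put(key, self.seen[key], prewrite):
                if done:
                    finish(self.table, primary, done)
                raise Conflict("row %r changed since it was read" % key)
            done.append(key)
        # Commit point: a single atomic check-and-put on the primary row.
        version = self.seen[primary][1] + 1
        self.table.check_and_put(primary, (PREWRITTEN, version),
                                 {"lock": (COMMITTED, version + 1)})
        if not finish(self.table, primary, keys):
            raise Conflict("transaction was aborted")

table = Table()
table.check_and_put("bob", (STABLE, 0), {"balance": 10, "lock": (STABLE, 1)})
table.check_and_put("joe", (STABLE, 0), {"balance": 2, "lock": (STABLE, 1)})
tx = Transaction(table)
tx.write("bob", "balance", tx.read("bob", "balance") - 7)
tx.write("joe", "balance", tx.read("joe", "balance") + 7)
late = Transaction(table)  # competing transaction that reads bob before tx commits
late.write("bob", "balance", late.read("bob", "balance") + 100)
tx.commit()
try:
    late.commit()
    late_committed = True
except Conflict:
    late_committed = False
balances = (table.get("bob")["balance"], table.get("joe")["balance"])
```

Because every change bumps the version stored in the lock cell, the competing transaction's prewrite on `bob` fails its check and the transfer is applied exactly once. The same `finish` routine serves both the normal path and recovery, since the outcome is read from the primary row alone.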

Slide 29

Slide 29 text

Haeinsa – Linearly scalable
[Chart: throughput in Tx/Sec (axis 0 to 50000) vs. ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 30

Slide 30 text

Haeinsa – Latency
[Chart: latency in ms (axis 0 to 35) vs. ECU of HBase Cluster (axis 0 to 1200); series: Haeinsa, HBase]

Slide 31

Slide 31 text

Haeinsa
• Pros
  – Linearly scalable
  – Serializability
  – Low overhead
  – Fault-tolerant
  – Not intrusive to original HBase cluster
  – Proven in practice

Slide 32

Slide 32 text

http://github.com/vcnc/haeinsa

Slide 33

Slide 33 text

Summary
• Why use HBase?
  – If you don’t need complex SQL queries
  – Scalability
    • $ is the bottleneck, not storage
  – Auto-sharding
  – Good write throughput
  – More structured data for analysis

Slide 34

Slide 34 text

Summary
• Haeinsa
  – Transaction library for OLTP
    • Inspired by Google Percolator
  – Cross-table, cross-row transactions
  – Low overhead
  – No consistency issues over 3 months
  – Open source on GitHub
    • http://github.com/vcnc/haeinsa