
HBase Archetypes

Matteo Bertozzi
June 08, 2015


Transcript

  1. © Cloudera, Inc. All rights reserved.
     Matteo Bertozzi | Apache HBase Committer & PMC member
     Apache HBase Archetypes
  2. What is Apache HBase?
     • An open-source, non-relational storage engine
     • Architecture
       • Key-values are sorted and partitioned by key
       • A Master coordinates admin operations and balances partitions across machines
       • The client sends and receives data directly from the machine hosting the partition
     [Diagram: tables T1 and T2 split by start key into row ranges served by machine1.host, machine2.host, and machine3.host; the Master oversees the Region Servers]
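The partitioning described above can be sketched in a few lines. This is a minimal illustration (not the HBase client API): each region is defined by a sorted start key and a hosting machine, and a row key is routed to the rightmost region whose start key is not greater than it. The region and host names are taken from the slide's diagram.

```python
import bisect

# Illustrative region map: (start key, hosting machine), sorted by start key,
# mirroring the diagram on the slide. In HBase this metadata lives in hbase:meta.
REGIONS = [
    ("T1:Row 00", "machine1.host"),
    ("T1:Row 50", "machine2.host"),
    ("T1:Row 70", "machine3.host"),
    ("T2:Row A0", "machine1.host"),
    ("T2:Row F0", "machine2.host"),
]

def locate(row_key):
    """Return the machine hosting the region whose key range contains row_key."""
    starts = [start for start, _ in REGIONS]
    # rightmost region whose start key is <= row_key
    idx = bisect.bisect_right(starts, row_key) - 1
    return REGIONS[idx][1]
```

With this map, `locate("T1:Row 52")` routes to machine2.host and `locate("T2:Row A7")` to machine1.host, which is why the client can talk directly to the right machine without going through the Master.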
  3. An Apache HBase Timeline (2008-2015)
     • HBase becomes a Hadoop sub-project
     • Summer ’09: StumbleUpon goes production on HBase ~0.20
     • HBase becomes a top-level project
     • Apr ’11: CDH3 GA, HBase 0.90.1
     • Summer ’11: Web Crawl Cache; Messages on HBase
     • Sep ’11: HBase TDG published
     • Nov ’11: Cassini on HBase
     • May ’12: v0.94
     • Nov ’12: HBase in Action published
     • Jan ’13: Phoenix on HBase
     • Aug ’13: Flurry 1k-1k node cluster replication
     • Dec ’13: v0.96
     • Jan ’14: ~20k nodes under management
     • Feb ’14: v0.98
     • Feb ’15: v1.0
     • May ’15: v1.1
  4. So you want to use HBase?
     • What data is being stored?
       • Entity data
       • Event data
     • Why is the data being stored?
       • Operational use cases
       • Analytical use cases
     • How does the data get in and out?
       • Real time vs batch
       • Random vs sequential
     There are primarily two kinds of “big data” workloads, and they have different storage requirements.
  5. Entity Centric Data
     • Entity data is information about current state
     • Generally real-time reads and writes
     • Examples:
       • Accounts
       • Users
       • Geolocation points
       • Click counts and metrics
       • Current sensor readings
     • Scales up with # of humans and # of machines/sensors
     • Billions of distinct entities
  6. Event Centric Data
     • Event-centric data are time-series data points recorded at successive intervals over time
     • Generally real-time writes, with some combination of real-time or batch reads
     • Examples:
       • Sensor data over time
       • Historical stock ticker data
       • Historical metrics
       • Click time-series
     • Scales up due to finer-grained intervals, retention policies, and the passage of time
  7. Why are you storing the data?
     • So what kind of questions are you asking the data?
     • Entity-centric questions
       • Give me everything about entity E
       • Give me the most recent event V about entity E
       • Give me the N most recent events V about entity E
       • Give me all events V about E between time [t1, t2]
     • Event- and time-centric questions
       • Give me an aggregate on each entity between time [t1, t2]
       • Give me an aggregate on each time interval for entity E
       • Find events V that match some other given criteria
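The entity-centric questions above all reduce to range scans over a sorted key space. A minimal sketch, assuming an illustrative `"<entity>#<timestamp>"` row-key layout (not anything prescribed by the slides), with a sorted dict standing in for an HBase table:

```python
# Toy stand-in for a sorted HBase table: row key -> value.
# Fixed-width timestamps keep lexicographic order equal to numeric order.
events = {
    "user1#1000": "login",
    "user1#1010": "click",
    "user1#1025": "logout",
    "user2#1005": "login",
}

def scan(start, stop):
    """Return (key, value) pairs with start <= key < stop, in key order."""
    return [(k, v) for k, v in sorted(events.items()) if start <= k < stop]

# "Give me everything about entity E": scan the entity's key prefix.
# '$' is the character after '#' in ASCII, so "user1$" bounds the prefix.
everything = scan("user1#", "user1$")

# "Give me all events V about E between time [t1, t2]": narrow the range.
window = scan("user1#1005", "user1#1020")
```

Because the keys are sorted, each question touches only a contiguous slice of the table, which is what makes these queries cheap in HBase.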
  8. How does data get in and out of HBase?
     [Diagram: data in via HBase client (Put, Incr, Append), Bulk Import, and HBase Replication; data out via HBase client (Gets, Short-Scans), Full Scan / Map-Reduce, and HBase Replication]
  9. How does data get in and out of HBase?
     [Same diagram, annotated: Put/Incr/Append and Gets/Short-Scans are the low-latency paths; Bulk Import and Full Scan / Map-Reduce are the high-throughput paths]
  10. What system is most efficient?
      • It is all physics
      • You have a limited I/O budget (IOPS per disk)
      • Use all your I/O by parallelizing access and reading/writing sequentially
      • Choose the system and features that reduce I/O in general
      Pick the system that is best for your workload!
  11. The physics of Hadoop Storage Systems
      Workload          | HBase                        | HDFS
      Low Latency       | ms, cached                   | min (MR); seconds (Impala)
      Random Read       | primary index                | index? small-files problem
      Short Scan        | sorted                       | partition
      Full Scan         | live table (MR on snapshots) | MR, Hive, Impala
      Random Write      | log structured               | not supported
      Sequential Write  | HBase overhead; Bulk Load    | minimal overhead
      Updates           | log structured               | not supported
  14. HBase Application use cases
      • There are a lot of HBase applications
        • some successful, some less so
      • They have common architecture patterns
      • They have common trade-offs
      • Archetypes are common architecture patterns
        • common across multiple use cases
        • extracted to be repeatable
      • The Good: Simple Entities, Messaging Store, Graph Store, Metrics Store
      • The Maybe: Time series DB, Combined workloads
      • The Bad: Large Blobs, Naïve RDBMS port, Analytic Archive
  15. Archetype: Simple Entities
      • Purely entity data, no relation between entities
      • Batch or real-time, random writes
      • Real-time, random reads
      • Could be a well-done denormalized RDBMS port
      • Often from many different sources, with poly-structured data
      • Schema
        • Row per entity
        • Row key => entity ID, or hash of entity ID
        • Column qualifier => property / field, possibly timestamp
      • Examples:
        • Geolocation data
        • Search index building
          • Use Solr to make text data searchable
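The "entity ID, or hash of entity ID" row-key choice above can be sketched as follows. This is an illustrative helper, not HBase API: hashing matters when entity IDs are assigned sequentially, because a hash prefix spreads otherwise-adjacent IDs across regions.

```python
import hashlib

def entity_row_key(entity_id, hashed=False):
    """Row key for a simple entity: the ID itself, or a hash-prefixed ID.

    The hash prefix spreads sequentially assigned IDs across the key space;
    keeping the original ID after the prefix keeps the key debuggable.
    """
    if not hashed:
        return entity_id
    prefix = hashlib.md5(entity_id.encode()).hexdigest()[:8]
    return prefix + "-" + entity_id
```

The trade-off: hashed keys kill write hotspots for monotonic IDs, but you lose the ability to scan a meaningful range of entities, so point Gets become the only access path.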
  16. Simple Entities Access Pattern
      [Diagram: same in/out paths as before, with Solr attached on the read side; Put/Incr/Append and Gets/Short-Scans low latency, Bulk Import and Full Scan / Map-Reduce high throughput]
  17. Archetype: Messaging Store
      • Messaging data:
        • Real-time random writes: email, SMS, MMS, IM
        • Real-time random updates: msg read, starred, moved, deleted
        • Reading of top-N entries, sorted by time
        • Records are of varying size
        • Some time series, but mostly random read/write
      • Schema
        • Row = user/feed/inbox
        • Row key = UID or UID + time
        • Column qualifier = time or conversation ID + time
      • Examples
        • Facebook Messages, Xiaomi Messages
        • Telco SMS/MMS services
        • Feeds like Tumblr, Pinterest
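For the "top-N entries, sorted by time" read above, a common HBase trick (not spelled out on the slide) is to store a *reversed* timestamp in the key, so the newest message sorts first and a short scan from the row prefix returns newest-first. A minimal sketch with an illustrative `MAX_TS` bound:

```python
# MAX_TS is an arbitrary illustrative bound larger than any timestamp used;
# subtracting from it inverts the sort order of the fixed-width suffix.
MAX_TS = 10**13

def msg_key(uid, ts):
    """Row key that sorts newest message first for a given user."""
    return "%s#%013d" % (uid, MAX_TS - 1 - ts)

# Messages written at ts 1000, 2000, 1500 ...
inbox = sorted(msg_key("alice", ts) for ts in (1000, 2000, 1500))
# ... come back in order 2000, 1500, 1000: a scan of the first N keys
# is exactly the "N most recent messages" query.
```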
  18. Messages Access Pattern
      [Diagram: same in/out paths; Put/Incr/Append and Gets/Short-Scans low latency, Bulk Import and Full Scan / Map-Reduce high throughput]
  19. Archetype: Graph Data
      • Graph data: all entities and relations
      • Batch or real-time, random writes
      • Batch or real-time, random reads
      • It’s an entity with relation edges
      • Schema
        • Row: node
        • Row key: node ID
        • Column qualifier: edge ID, or property:values
      • Examples
        • Web caches - Yahoo!, Trend Micro
        • Titan Graph DB with HBase storage backend
        • Sessionization (financial transactions, click streams, network traffic)
        • Government (connect the bad guy)
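The schema above (row per node, column qualifier per edge) can be sketched with a plain dict standing in for the table; the `edge:`/`prop:` qualifier prefixes are illustrative, not prescribed by the slides. The point is that one row read returns a node plus all its adjacency.

```python
# Toy stand-in for an HBase table: row key -> {column qualifier: value}.
table = {
    "nodeA": {"edge:nodeB": "weight=1", "edge:nodeC": "weight=4", "prop:label": "start"},
    "nodeB": {"edge:nodeC": "weight=2"},
}

def neighbours(node):
    """All outgoing edges of a node, recovered from its edge:* qualifiers."""
    row = table.get(node, {})
    return sorted(q.split(":", 1)[1] for q in row if q.startswith("edge:"))
```

A single Get on "nodeA" yields both its properties and its edges, which is why graph traversals map naturally onto row-centric random reads.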
  20. Graph Data Access Pattern
      [Diagram: same in/out paths; Put/Incr/Append and Gets/Short-Scans low latency, Bulk Import and Full Scan / Map-Reduce high throughput]
  21. Archetype: Metrics
      • Frequently updated metrics
        • Increments
        • Roll-ups generated by MR and bulk loaded into HBase
      • Schema
        • Row: entity for a time period
        • Row key: entity-<yymmddhh> (granular time)
        • Column qualifier: property -> count
      • Examples
        • Campaign impression/click counts (ad tech)
        • Sensor data (energy, manufacturing, auto)
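The `entity-<yymmddhh>` key and per-property counters above can be sketched as below. A dict of counters stands in for the table; in HBase the increment would be a server-side atomic Increment rather than client-side arithmetic.

```python
from collections import defaultdict
from datetime import datetime

# Toy stand-in for counter cells: (row key, column qualifier) -> count.
counters = defaultdict(int)

def incr(entity, when, prop, amount=1):
    """Bump a property counter in the entity's hourly row."""
    row_key = "%s-%s" % (entity, when.strftime("%y%m%d%H"))  # entity-yymmddhh
    counters[(row_key, prop)] += amount

incr("campaign42", datetime(2015, 6, 8, 14), "clicks")
incr("campaign42", datetime(2015, 6, 8, 14), "clicks")
incr("campaign42", datetime(2015, 6, 8, 15), "clicks")
# Two clicks land in the 14:00 row, one in the 15:00 row.
```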
  22. Metrics Access Pattern
      [Diagram: same in/out paths; Put/Incr/Append and Gets/Short-Scans low latency, Bulk Import and Full Scan / Map-Reduce high throughput]
  23. Archetypes: The Bad
      These are not the droids you are looking for
  24. Current HBase weak spots
      • HBase’s architecture can handle a lot
        • Engineering trade-offs optimize for some use cases and against others
        • HBase can still do things it is not optimal for
      • However, other systems are fundamentally more efficient for some workloads
        • We’ve seen folks forcing apps into HBase
        • If there is only one workload on the data, consider another system
        • If there is a mixed workload, some cases become “maybes”
      Just because it is not good today doesn’t mean it can’t be better tomorrow!
  25. Bad Archetype: Large Blob Store
      • Saving large objects > 3MB per cell
      • Schema
        • Normal entity pattern, but with some columns holding large cells
      • Examples
        • Raw photo or video storage in HBase
        • Large, frequently updated structs as a single cell
      • Problems:
        • Write amplification when reoptimizing data for read (compactions on large unchanging data)
        • Write amplification when large structs are rewritten to update subfields (cells are atomic, and HBase must rewrite an entire cell)
      • NOTE: Medium Binary Object (MOB) support coming (lots of 100KB-10MB cells)
  26. Bad Archetype: Naïve RDBMS port
      • A naïve port of an RDBMS into HBase, directly copying the schema
      • Schema
        • Many tables, just like an RDBMS schema
        • Row key: primary key or auto-incrementing key, like the RDBMS schema
        • Column qualifiers: field names
        • Manually do joins, or secondary indexes (not consistent)
      • Solution:
        • HBase is not a SQL database
        • No multi-region/multi-table transactions in HBase (yet)
        • No built-in join support; you must denormalize your schema to use HBase
  27. Bad Archetype: Analytic archive
      • Store purely chronological data, partitioned by time
        • Real-time writes, chronological time as primary index
        • Column-centric aggregations over all rows
        • Bulk reads out, generally for generating periodic reports
      • Schema
        • Row key: date + xxx or salt + date + xxx
        • Column qualifiers: properties with data or counters
      • Example
        • Machine logs organized by date (causes write hotspotting)
        • Full-fidelity clickstream organized by date (as opposed to campaign)
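The hotspot in the `date + xxx` key above, and the `salt + date + xxx` mitigation, can be sketched as follows. This is illustrative (`source` stands in for the slide's `xxx`): every same-day write shares one key prefix and so lands on one region, while a deterministic salt spreads the same day across several buckets.

```python
import zlib

def salted_key(date, source, buckets=4):
    """salt + date + xxx row key: same-day writes spread over `buckets` ranges.

    crc32 is used (rather than Python's hash()) so the salt is stable
    across processes, which matters for later reads.
    """
    salt = zlib.crc32(source.encode()) % buckets
    return "%d|%s|%s" % (salt, date, source)

keys = [salted_key("20150608", "host%02d" % i) for i in range(100)]
prefixes = {k.split("|")[0] for k in keys}
# Without the salt every key would start with "20150608" (one hot region);
# with it, writes for one day target several row ranges.
```

The cost: a time-range read must now issue one scan per salt bucket, which is part of why the slide still files this under "bad" for a pure analytic archive.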
  28. Bad Archetype: Analytic archive problems
      • HBase is not optimal as the primary store for this use case
        • Will get crushed by frequent full table scans
        • Will get crushed by large compactions
        • Will get crushed by write-side region hot spotting
      • Solution
        • Store in HDFS; use Parquet columnar data storage + Hive/Impala
        • Build roll-ups in HDFS+MR; store and serve roll-ups in HBase
  29. Archetypes: The Maybe
      And this is crazy | But here’s my data | serve it, maybe!
  30. The Maybe’s
      • For some applications, doing it right gets complicated
      • More sophisticated or nuanced cases
      • Require considering these questions:
        • When do you choose HBase vs HDFS storage for time-series data?
        • Are there times where bad archetypes are OK?
  31. Time Series: in HBase or HDFS?
      • Time-series I/O pattern physics:
        • Reads: colocate related data (make reads cheap and fast)
        • Writes: spread writes out as much as possible (maximize write throughput)
      • HBase: tension between these goals
        • Spreading writes spreads data, making reads inefficient
        • Colocating on write causes hotspots and underutilizes resources by limiting write throughput
      • HDFS: the sweet spot
        • Sequential writes and sequential reads
        • Just write more files in date-dirs; physically spreads writes but logically groups data
        • Reads for time-centric queries: just read the files in the date-dir
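The "date-dirs" layout above is just a path convention; a minimal sketch (the base path and directory shape here are illustrative, not prescribed by the slides):

```python
from datetime import datetime

def event_path(base, source, when):
    """Date-partitioned HDFS-style path: appends are sequential writes into
    the current day's directory, and a time-centric query only has to list
    one directory instead of scanning everything."""
    return "%s/%s/%s/events" % (base, when.strftime("%Y/%m/%d"), source)

p = event_path("/data/ts", "sensor7", datetime(2015, 6, 8))
```

Many writers appending to their own file under the same date-dir is what "physically spreads writes but logically groups data" means: the files are scattered across the cluster, but the directory listing still groups the day.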
  32. Time Series: data flow
      • Ingest
        • Flume or similar direct tool via app
      • HDFS for historical
        • No real-time serving
        • Batch queries and roll-up generation in Hive/MR
        • Faster queries in Impala
      • HBase for recent
        • Serve individual events
        • Serve pre-computed aggregates
  33. Maybe Archetype: Entity Time Series
      • Full-fidelity historical record of metrics
      • Random write to event data; random read of a specific event or aggregate data
      • Schema
        • Row key: entity-timestamp or hash(entity)-timestamp, possibly with a salt added after the entity
        • Column qualifier: granular timestamp -> value
        • Use custom aggregation to consolidate old data
        • Use TTLs to bound and age off old data
      • Examples:
        • OpenTSDB is a system on HBase that handles this for numeric values
          • Lazily aggregates cells for better performance
        • Facebook Insights, ODS
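The key layout above (hash prefix, entity, coarse row timestamp, granular column qualifier) can be sketched as below. The bucket size, prefix width, and formats are illustrative assumptions; OpenTSDB does something similar, with numeric IDs instead of names.

```python
import zlib

BUCKET = 3600  # one row per entity per hour (illustrative choice)

def ts_key(entity, ts):
    """Return (row key, column qualifier) for one data point.

    The row key holds a stable hash prefix (spreads entities across regions),
    the entity, and the hour-aligned timestamp; the qualifier holds the
    second offset within the hour, so one row collects an hour of points.
    """
    prefix = zlib.crc32(entity.encode()) % 100
    row = "%02d-%s-%010d" % (prefix, entity, ts - ts % BUCKET)
    qualifier = "%04d" % (ts % BUCKET)
    return row, qualifier

row, qual = ts_key("sensor7", 7203)  # 3 seconds into the 7200 bucket
```

Packing many points into one row is what makes "give me entity E over [t1, t2]" a short scan over a handful of rows rather than thousands of point reads.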
  34. Entity Time Series Access Pattern
      [Diagram: same in/out paths, with Flume feeding writes and OpenTSDB or a custom app serving reads]
  35. Maybe Archetype: Hybrid Entity Time Series
      • Essentially a combo of the Metrics archetype and Entity Time Series, with bulk loads of roll-ups via HDFS
      • Land data in HDFS and HBase
      • Keep all data in HDFS for future use
      • Aggregate in HDFS and write to HBase
        • HBase can do some aggregates too (counters)
      • Keep serve-able data in HBase
      • Use TTL to discard old values from HBase
  36. Hybrid Time Series Access Pattern
      [Diagram: same in/out paths, with Flume feeding both HDFS and HBase]
  37. Meta Archetype: Combined workloads
      • In these cases, the use of HBase depends on the workload
      • Cases where we have multiple workload styles
      • In many cases we want to do multiple things with the same data
        • Primary use case (real-time, random access)
        • Secondary use case (analytical)
      • Pick for your primary; here are some patterns on how to do your secondary
  38. Operational with Analytical access pattern
      [Diagram: Put/Incr/Append, Bulk Import, Gets/Short-Scans, and Map-Reduce all hit one cluster; full-scan interference makes latency poor while throughput stays high]
  39. Operational with Analytical access pattern
      [Diagram: the Map-Reduce scanner path is moved to a second cluster fed by HBase Replication; Gets/Short-Scans stay low latency, isolated from full scans, and both clusters sustain high throughput]
  40. MR over Table Snapshots (0.98+)
      • Previously, Map-Reduce jobs over HBase required an online full table scan
      • Take a snapshot and run the MR job over the snapshot files
        • Doesn’t use the HBase client (or any RPC against the RegionServers)
        • Avoids affecting HBase caches
        • 3-5x perf boost
      • Still requires more IOPS than raw HDFS files
      [Diagram: map and reduce tasks reading from a table snapshot instead of the live table]
  41. Analytic Archive Access Pattern
      [Diagram: same in/out paths; Put/Incr/Append and Gets/Short-Scans low latency, Bulk Import and Full Scan / Map-Reduce high throughput]
  42. Analytic Archive Snapshot Access Pattern
      [Diagram: as before, but the Full Scan / Map-Reduce path reads a Table Snapshot instead of the live table, giving higher throughput]
  43. Request Scheduling
      • We want MR for analytics while serving low-latency requests in one cluster
      • Table isolation (proposed, HBASE-6721)
        • Avoid having the load on table X impact table Y
      • Request prioritization and scheduling
        • Current default is FIFO; Deadline was added
        • Prioritize short requests before long scans
        • Separate RPC handlers for writes / short reads / long reads
      • Throttling
        • Limit the request throughput of an MR job
      [Diagram: in a mixed workload, short requests get delayed behind long scan requests; in an isolated workload they are rescheduled so new requests get priority]
  44. Conclusions
      Pick the system that is best for your workload!
  45. “Big Data” Workloads
      [Chart: axes run from Low Latency to Batch and from Random Access / Short Scan to Full Scan; HBase sits at low-latency random access, HDFS + Impala at low-latency full scan, HBase + MR at batch random access, and HDFS + MR (Hive/Pig) with HBase + Snapshots (HDFS + MR) at batch full scan]
      Pick the system that is best for your workload!
  46. HBase is evolving to be an Operational Database
      • Excels at consistent row-centric operations
      • Dev efforts aimed at using all machine resources efficiently, reducing MTTR, and improving latency predictability
      • Projects built on HBase enable secondary indexing and multi-row transactions
      • Apache Phoenix and others provide a SQL skin for simplified application development
      • Evolution towards OLTP workloads
      • Analytic workloads?
        • Can be done, but will be beaten by direct HDFS + MR/Spark/Impala