Slide 1

Slide 1 text

HBase 2.0 (The next Major Release) Apache

Slide 2

Slide 2 text

Apache HBase Timeline (axis: 2008 to 2015)
• HBase becomes Hadoop sub-project
• HBase becomes top-level project
• May ’12: v0.94
• Dec ’13: v0.96
• Feb ’14: v0.98
• Feb ’15: v1.0
• May ’15: v1.1

Slide 3

Slide 3 text

HBase 2.0

Main Goals
• Rolling upgradable from 1.x
• Fix known design issues
• Code simplification
• 2016 Release

Highlights
• HBASE-14350: New Assignment Manager (proc-v2 based)
• HBASE-14090: RedoFs: Fix 1M Region and files moving around
• HBASE-14123: Better Snapshots & Backups
• HBASE-14379: Replication v2: Streaming, Better tooling
• HBASE-14070: Hybrid Logical Clocks
• HBASE-11425: Off Heaping
• HBASE-13936: Dynamic, Online Configuration

Slide 4

Slide 4 text

HBASE-14350: New Assignment Manager
• Code simplification, using the “proc-v2” state machine (HBASE-12439)
  • no more hbck or rmr znodes for “region in transition”
• Simplify rolling upgrade and “migrations”
  • “once all the machines are on v2, execute the migration steps”
• Designed for future master/meta splits
• More cooperation between operations
  • no more failures due to concurrent split/merge/balance operations
• …
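
The state-machine idea behind “proc-v2” can be illustrated with a toy example. This is only a sketch of the general pattern (persist each state transition so a restarted master can resume an operation instead of needing manual cleanup). The class name, the state names, and the dict standing in for a durable procedure log are all hypothetical and are not the HBase proc-v2 API.

```python
from enum import Enum


class AssignState(Enum):
    # Hypothetical states, not the actual HBase proc-v2 state names.
    QUEUED = 1
    OPENING = 2
    OPENED = 3


class AssignRegionProcedure:
    """Toy state machine that records every transition as it happens."""

    def __init__(self, region, store):
        self.region = region
        self.store = store  # stands in for a durable procedure log
        # Resume from the last persisted state, or start from the beginning.
        self.state = store.get(region, AssignState.QUEUED)

    def step(self):
        """Advance one state; return False once the procedure has finished."""
        if self.state is AssignState.QUEUED:
            self._persist(AssignState.OPENING)
        elif self.state is AssignState.OPENING:
            self._persist(AssignState.OPENED)
        return self.state is not AssignState.OPENED

    def _persist(self, new_state):
        # Write the new state to the store, then adopt it in memory.
        self.store[self.region] = new_state
        self.state = new_state


def run(procedure):
    """Drive a procedure until it reports completion."""
    while procedure.step():
        pass
    return procedure.state


store = {}
first = AssignRegionProcedure("region-a", store)
first.step()  # QUEUED -> OPENING, then imagine the master crashes here

# A new master reloads the procedure from the store and finishes it.
resumed = AssignRegionProcedure("region-a", store)
final_state = run(resumed)
```

Because the resumed procedure starts from the persisted OPENING state, the region does not stay stuck “in transition” after the simulated crash.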

Slide 5

Slide 5 text

HBASE-14090: Redo Fs-Layout
• Fix scaling issues: 1M regions, lots of snapshots, …
• Remove small files (Split Reference, HFileLink)
• Proper (and less expensive) atomicity by avoiding create /tmp && rename
• Online snapshot w/o need of flush; no need for RSs coordination
• HFile Cleaner without fs scan
• Table rename support!
• …

Slide 6

Slide 6 text

HBASE-14123: Better Snapshot & Backups
• Improve fault tolerance of snapshots (based on the new AM work)
• Improve performance of snapshots (based on the redofs work)
• New Backup “packages”: single table or multiple tables
• Incremental backups (WAL based)
• Tooling and API to manage backups
• …

Slide 7

Slide 7 text

HBASE-14379: Replication v2
• Allow Replication v1 and v2 to coexist
• No state in ZooKeeper. Introduce a new system table for tracking peers, queues, and log positions.
• Admin actions mediated by the master, with support for Security
• Streaming data transfer
• Support for Bulk-Load
• Better metrics
• Better tools (e.g. “status check”)
• …

Slide 8

Slide 8 text

HBASE-14070: Hybrid Logical Clocks
• Fix clock problems that impact row ordering
  • Leap seconds
  • Clock precision (the Windows clock does not update for ~16 ms)
• Open the possibility for:
  • simplification of mvcc/seqid
  • “global” point-in-time snapshots
• …
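
As a rough illustration of why a hybrid logical clock helps with the clock problems listed above, here is a minimal sketch of the generic HLC update rules (a physical component plus a logical counter). It is not the HBase implementation. The class and method names are hypothetical, and the physical clock is injected as a plain function so that a stalled or backward-jumping clock can be simulated.

```python
class HybridLogicalClock:
    """Generic hybrid logical clock: timestamps are (l, c) tuples."""

    def __init__(self, physical_clock):
        self._now = physical_clock  # callable returning an integer time
        self.l = 0  # largest physical time seen so far
        self.c = 0  # logical counter, breaks ties within the same l

    def tick(self):
        """Timestamp a local or send event."""
        previous = self.l
        self.l = max(previous, self._now())
        # If physical time did not advance, bump the counter instead.
        self.c = self.c + 1 if self.l == previous else 0
        return (self.l, self.c)

    def update(self, remote):
        """Timestamp a receive event carrying a remote (l, c) timestamp."""
        remote_l, remote_c = remote
        previous = self.l
        self.l = max(previous, remote_l, self._now())
        if self.l == previous and self.l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif self.l == previous:
            self.c += 1
        elif self.l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0
        return (self.l, self.c)


# Simulated physical clock: stalls at 100, steps back to 99, then moves on.
readings = iter([100, 100, 99, 101])
clock = HybridLogicalClock(lambda: next(readings))
stamps = [clock.tick() for _ in range(4)]

# A second node whose physical clock lags far behind the sender.
lagging = HybridLogicalClock(lambda: 50)
received = lagging.update(stamps[-1])
```

Even though the simulated physical clock stalls and steps backwards, the timestamps in `stamps` are strictly increasing, and the lagging node's timestamp is ordered after the message it received.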

Slide 9

Slide 9 text

HBASE-11425: Off Heaping
• Reduce GC pauses by “managing memory directly”
• Reduce the number of “cell copies” from HFile-Block to RPC
• Performance improvements and fewer GC stalls
• …

Slide 10

Slide 10 text

HBASE-13936: Dynamic, Online Configuration
• New API for configuration
  • Single place and definition for each property
  • Enforce types, min, max and some basic validation
  • no more copy-paste and typos like “hbase.nax.value”
• Online configuration changes
• Better tooling
  • which conf properties are available
  • which conf properties can be changed at runtime
  • which conf properties have a non-default value
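
The configuration goals above can be sketched as a small typed registry. This is an illustrative sketch only and not the HBase configuration API. `ConfigProperty`, `ConfigRegistry`, and the `example.*` property names are all hypothetical. Each property is defined once with a type, a default, optional bounds, and a flag saying whether it may be changed online.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigProperty:
    # Single definition for one property (all names here are hypothetical).
    name: str
    value_type: type
    default: object
    minimum: object = None
    maximum: object = None
    dynamic: bool = False  # True if it may be changed without a restart


class ConfigRegistry:
    def __init__(self, properties):
        self._properties = {p.name: p for p in properties}
        self._values = {}

    def set(self, name, value, online=False):
        prop = self._properties.get(name)
        if prop is None:
            # Catches mistyped names instead of silently ignoring them.
            raise KeyError(f"unknown property: {name}")
        if online and not prop.dynamic:
            raise ValueError(f"{name} cannot be changed at runtime")
        if not isinstance(value, prop.value_type):
            raise TypeError(f"{name} expects {prop.value_type.__name__}")
        if prop.minimum is not None and value < prop.minimum:
            raise ValueError(f"{name} must be >= {prop.minimum}")
        if prop.maximum is not None and value > prop.maximum:
            raise ValueError(f"{name} must be <= {prop.maximum}")
        self._values[name] = value

    def get(self, name):
        prop = self._properties[name]
        return self._values.get(name, prop.default)

    def available(self):
        return sorted(self._properties)

    def dynamic_names(self):
        return sorted(n for n, p in self._properties.items() if p.dynamic)

    def non_default(self):
        return {n: v for n, v in self._values.items()
                if v != self._properties[n].default}


registry = ConfigRegistry([
    ConfigProperty("example.max.value", int, 10, minimum=1, maximum=100,
                   dynamic=True),
    ConfigProperty("example.data.dir", str, "/tmp/example"),
])
registry.set("example.max.value", 42, online=True)
```

With this shape, a misspelled name such as `example.nax.value` raises an error at the point of use, and the three tooling questions (what is available, what is dynamic, what is non-default) become simple lookups.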

Slide 11

Slide 11 text

HBase 2.0

Main Goals
• Rolling upgradable from 1.x
• Fix known design issues
• Code simplification
• 2016 Release

Highlights
• HBASE-14350: New Assignment Manager (proc-v2 based)
• HBASE-14090: RedoFs: Fix 1M Region and files moving around
• HBASE-14123: Better Snapshots & Backups
• HBASE-14379: Replication v2: Streaming, Better tooling
• HBASE-14070: Hybrid Logical Clocks
• HBASE-11425: Off Heaping
• HBASE-13936: Dynamic, Online Configuration

Q&A