HBase 2.0 SF Meetup Oct '15

Matteo Bertozzi
October 08, 2015

Transcript

  1. Apache HBase Timeline (2008 to 2015)

    • HBase becomes Hadoop sub-project
    • HBase becomes top-level project
    • May ’12: v0.94
    • Dec ’13: v0.96
    • Feb ’14: v0.98
    • Feb ’15: v1.0
    • May ’15: v1.1
  2. HBase 2.0

    Main Goals
    • Rolling upgradable from 1.x
    • Fix known design issues
    • Code simplification
    • 2016 release

    Highlights
    • HBASE-14350: New Assignment Manager (proc-v2 based)
    • HBASE-14090: RedoFs: Fix 1M Region and files moving around
    • HBASE-14123: Better Snapshots & Backups
    • HBASE-14379: Replication v2: Streaming, Better tooling
    • HBASE-14070: Hybrid Logical Clocks
    • HBASE-11425: Off Heaping
    • HBASE-13936: Dynamic, Online Configuration
  3. HBASE-14350: New Assignment Manager

    • Code simplification, using the “proc-v2” state machine (HBASE-12439)
        • no more hbck or rmr znodes for “region in transition”
    • Simplify rolling upgrade and “migrations”
        • “once all the machines are on v2, execute the migration steps”
    • Designed for future master/meta splits
    • More cooperation between operations
        • no more failures due to concurrent split/merge/balance operations
    • …
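The idea behind a procedure-based state machine can be illustrated with a minimal sketch: each step persists the next state before moving on, so a restarted master resumes where the previous one stopped instead of leaving a region stuck in transition. This is not the HBase proc-v2 code; the state names, `ProcedureStore`, and `AssignRegionProcedure` are hypothetical and exist only for illustration.

```python
import enum


class AssignState(enum.Enum):
    # Hypothetical steps of a region assignment; not the real HBase states.
    MARK_OPENING = 1
    SEND_OPEN_REQUEST = 2
    MARK_OPEN = 3
    DONE = 4


class ProcedureStore:
    """Stand-in for a durable log: remembers the next state per procedure."""

    def __init__(self):
        self._states = {}

    def save(self, proc_id, state):
        self._states[proc_id] = state

    def load(self, proc_id, default):
        return self._states.get(proc_id, default)


class AssignRegionProcedure:
    def __init__(self, proc_id, region, store):
        self.proc_id = proc_id
        self.region = region
        self.store = store
        self.log = []  # steps executed by this instance

    def run(self, fail_at=None):
        # Resume from the persisted state, or start from the first step.
        state = self.store.load(self.proc_id, AssignState.MARK_OPENING)
        while state is not AssignState.DONE:
            if state is fail_at:
                raise RuntimeError(f"simulated crash before {state.name}")
            self.log.append(state.name)            # "execute" the step
            state = AssignState(state.value + 1)   # move to the next step
            self.store.save(self.proc_id, state)   # persist before continuing


store = ProcedureStore()
first = AssignRegionProcedure(1, "region-a", store)
try:
    first.run(fail_at=AssignState.SEND_OPEN_REQUEST)
except RuntimeError:
    pass  # the "master" died mid-assignment

# A new instance with the same procedure id picks up the remaining steps.
resumed = AssignRegionProcedure(1, "region-a", store)
resumed.run()
```

Because the store is the only source of truth for progress, no external repair step is needed after the simulated crash: the second instance runs only the steps that were never completed.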
  4. HBASE-14090: Redo Fs-Layout

    • Fix scaling issues, 1M region, lots of snapshots, …
    • Remove small files (Split Reference, HFileLink)
    • Proper (and less expensive) atomicity by avoiding create /tmp && rename
    • Online Snapshot w/o need of flush; no need for RSs coordination
    • HFile Cleaner without fs scan
    • Table rename support!
    • …
  5. HBASE-14123: Better Snapshot & Backups

    • Improve fault tolerance of snapshots (based on the new AM work)
    • Improve performance of snapshots (based on the redofs work)
    • New Backup “packages”. Single Table or Multiple Tables
    • Incremental backups (WAL based)
    • Tooling and API to manage backups
    • …
  6. HBASE-14379: Replication v2

    • Allow Replication v1 and v2 to coexist
    • No state in ZooKeeper. Introduce a new system table for tracking peers, queues, and log positions.
    • Admin actions mediated by the master, with support for Security
    • Streaming data transfer
    • Support for Bulk-Load
    • Better metrics
    • Better tools (e.g. “status check”)
    • …
  7. HBASE-14070: Hybrid Logical Clocks

    • Fix clock problems that impact row ordering
        • Leap seconds
        • Clock precision (the Windows clock does not update for ~16ms)
    • Open the possibility for:
        • simplification of mvcc/seqid
        • “global” point-in-time snapshots
    • …
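To show how a hybrid logical clock keeps timestamps ordered when the wall clock stalls or steps backwards, here is a minimal sketch of the general HLC algorithm: a physical component plus a logical counter that breaks ties. It is not the HBASE-14070 implementation; `Timestamp`, `HybridLogicalClock`, and the injected `wall_clock` callable are names assumed for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Timestamp:
    physical: int  # wall-clock component, e.g. milliseconds
    logical: int   # counter that breaks ties within one physical value


class HybridLogicalClock:
    def __init__(self, wall_clock):
        self._wall_clock = wall_clock  # callable returning the current time
        self._last = Timestamp(0, 0)

    def now(self):
        """Timestamp for a local or send event."""
        pt = self._wall_clock()
        if pt > self._last.physical:
            self._last = Timestamp(pt, 0)
        else:
            # Wall clock stalled or went backwards: bump the counter instead.
            self._last = Timestamp(self._last.physical, self._last.logical + 1)
        return self._last

    def update(self, remote):
        """Timestamp for a receive event carrying a remote timestamp."""
        pt = self._wall_clock()
        physical = max(pt, self._last.physical, remote.physical)
        if physical == self._last.physical and physical == remote.physical:
            logical = max(self._last.logical, remote.logical) + 1
        elif physical == self._last.physical:
            logical = self._last.logical + 1
        elif physical == remote.physical:
            logical = remote.logical + 1
        else:
            logical = 0
        self._last = Timestamp(physical, logical)
        return self._last


# A coarse clock that reports the same value three times, then steps back.
ticks = iter([100, 100, 100, 99])
clock = HybridLogicalClock(lambda: next(ticks))

a = clock.now()                       # first reading at 100
b = clock.now()                       # same physical time, counter advances
c = clock.update(Timestamp(100, 5))   # message from a peer that is "ahead"
d = clock.now()                       # wall clock went backwards to 99
```

In this run the four timestamps are strictly increasing even though the wall clock never advanced, which is the property that matters for ordering.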
  8. HBASE-11425: Off Heaping

    • Reduce GC pauses by “managing memory directly”
    • Reduce the number of “cell copies” from HFile-Block to RPC
    • Performance improvements and fewer GC stalls
    • …
  9. HBASE-13936: Dynamic, Online Configuration

    • New API for configuration
    • Single place and definition for each property
    • Enforce Types, Min, Max and some basic validation
        • no more copy paste and bad typing “hbase.nax.value”
    • Online configuration changes
    • Better tooling
        • which conf properties are available
        • which conf properties can be changed at runtime
        • which conf properties have a non default value
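The goals on this slide (one definition per property, enforced types and bounds, and tooling that can list what is changeable at runtime) can be sketched as a small typed registry. This is an illustration only, not the API proposed in HBASE-13936; `Property`, `ConfigRegistry`, and the `example.*` property names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Property:
    name: str
    value_type: type
    default: object
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    dynamic: bool = False  # True if it may be changed on a running server


class ConfigRegistry:
    def __init__(self, properties):
        self._defs = {p.name: p for p in properties}
        self._values = {}

    def set(self, name, value, online=False):
        prop = self._defs.get(name)
        if prop is None:
            # A mistyped key fails loudly instead of being silently ignored.
            raise KeyError(f"unknown property: {name}")
        if not isinstance(value, prop.value_type):
            raise TypeError(f"{name} expects {prop.value_type.__name__}")
        if prop.minimum is not None and value < prop.minimum:
            raise ValueError(f"{name} must be >= {prop.minimum}")
        if prop.maximum is not None and value > prop.maximum:
            raise ValueError(f"{name} must be <= {prop.maximum}")
        if online and not prop.dynamic:
            raise PermissionError(f"{name} cannot be changed at runtime")
        self._values[name] = value

    def get(self, name):
        return self._values.get(name, self._defs[name].default)

    def dynamic_properties(self):
        return sorted(n for n, p in self._defs.items() if p.dynamic)

    def non_default(self):
        return {n: v for n, v in self._values.items()
                if v != self._defs[n].default}


registry = ConfigRegistry([
    Property("example.handler.count", int, 30, minimum=1, maximum=1000,
             dynamic=True),
    Property("example.max.value", int, 10, minimum=0),
])

registry.set("example.handler.count", 50, online=True)  # valid online change
```

With such a registry, a call like `registry.set("example.nax.value", 1)` raises `KeyError`, and the three tooling questions on the slide map to the set of definitions, `dynamic_properties()`, and `non_default()`.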
  10. HBase 2.0

    Main Goals
    • Rolling upgradable from 1.x
    • Fix known design issues
    • Code simplification
    • 2016 release

    Highlights
    • HBASE-14350: New Assignment Manager (proc-v2 based)
    • HBASE-14090: RedoFs: Fix 1M Region and files moving around
    • HBASE-14123: Better Snapshots & Backups
    • HBASE-14379: Replication v2: Streaming, Better tooling
    • HBASE-14070: Hybrid Logical Clocks
    • HBASE-11425: Off Heaping
    • HBASE-13936: Dynamic, Online Configuration

    Q&A