
HBase Archetypes

Matteo Bertozzi
June 08, 2015

Transcript

  1. Apache HBase Archetypes

     Matteo Bertozzi | Apache HBase Committer & PMC member
     © Cloudera, Inc. All rights reserved.
  2. What is Apache HBase?

     • An open-source, non-relational storage engine
     • Architecture
       • Key-values are sorted and partitioned by key
       • A Master coordinates admin operations and balances partitions across machines
       • The client sends and receives data directly from the machine hosting the partition

     Table  Start key  Machine
     T1     Row 00     machine1.host
     T1     Row 50     machine2.host
     T1     Row 70     machine3.host
     T2     Row A0     machine1.host
     T2     Row F0     machine2.host

     [Diagram: a Master and three Region Servers. machine1.host serves T1:Row 00.. and T2:Row A0..; machine2.host serves T1:Row 50.. and T2:Row F0..; machine3.host serves T1:Row 70..]
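The routing idea on this slide (sorted keys, partitions identified by start key, clients talking directly to the hosting machine) can be sketched as a binary search over region start keys. This is an illustrative model only: the `REGIONS` dictionary and the `locate` function are hypothetical names, the layout is copied from the slide's example table, and a real HBase client resolves regions through its own metadata lookup rather than code like this.

```python
from bisect import bisect_right

# Region start keys per table, copied from the slide's example layout.
# Assumption: each list is already sorted by start key.
REGIONS = {
    "T1": [("Row 00", "machine1.host"),
           ("Row 50", "machine2.host"),
           ("Row 70", "machine3.host")],
    "T2": [("Row A0", "machine1.host"),
           ("Row F0", "machine2.host")],
}


def locate(table, row_key):
    """Return the machine hosting the region whose key range contains row_key."""
    regions = REGIONS[table]
    start_keys = [start for start, _ in regions]
    # The owning region is the last one whose start key is <= row_key.
    idx = bisect_right(start_keys, row_key) - 1
    if idx < 0:
        raise KeyError(f"{row_key!r} sorts before the first region of {table}")
    return regions[idx][1]
```

For example, `locate("T1", "Row 51")` falls in the region starting at `Row 50`, so the client would go straight to `machine2.host` without involving the Master.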
  3. An Apache HBase Timeline

     Timeline axis: 2008 to 2015

     • HBase becomes Hadoop sub-project
     • Summer ’09: StumbleUpon goes production on HBase ~0.20
     • HBase becomes top-level project
     • Apr ’11: CDH3 GA, HBase 0.90.1
     • Summer ’11: Web Crawl Cache
     • Summer ’11: Messages on HBase
     • Sep ’11: HBase TDG published
     • Nov ’11: Cassini on HBase
     • May ’12: v0.94
     • Nov ’12: HBase in Action published
     • Jan ’13: Phoenix on HBase
     • Aug ’13: Flurry 1k-1k node cluster replication
     • Dec ’13: v0.96
     • Jan ’14: ~20k nodes under management
     • Feb ’14: v0.98
     • Feb ’15: v1.0
     • May ’15: v1.1
  4. So you want to use HBase?

     • What data is being stored?
       • Entity data
       • Event data
     • Why is the data being stored?
       • Operational use cases
       • Analytical use cases
     • How does the data get in and out?
       • Real time vs. batch
       • Random vs. sequential

     There are primarily two kinds of “big data” workloads: entities and events. They have different storage requirements.
  5. Entity Centric Data

     • Entity data is information about current state
     • Generally real-time reads and writes
     • Examples:
       • Accounts
       • Users
       • Geolocation points
       • Click counts and metrics
       • Current sensor readings
     • Scales up with # of humans and # of machines/sensors
       • Billions of distinct entities
  6. Event Centric Data

     • Event-centric data are time-series data points, recorded at successive points spaced over time intervals
     • Generally real-time writes, with some combination of real-time reads and batch reads
     • Examples:
       • Sensor data over time
       • Historical stock ticker data
       • Historical metrics
       • Click time series
     • Scales up due to finer-grained intervals, retention policies, and the passage of time
  7. Why are you storing the data?

     • So what kind of questions are you asking the data?
     • Entity-centric questions
       • Give me everything about entity E
       • Give me the most recent event V about entity E
       • Give me the N most recent events V about entity E
       • Give me all events V about E between time [t1, t2]
     • Event- and time-centric questions
       • Give me an aggregate on each entity between time [t1, t2]
       • Give me an aggregate on each time interval for entity E
       • Find events V that match some other given criteria
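To make the entity-centric questions concrete, here is a minimal sketch of how they reduce to range lookups when keys are kept sorted by (entity, timestamp). The in-memory `events` list, the sample data, and both function names are hypothetical stand-ins for a sorted key-value store; this is not HBase client code.

```python
from bisect import bisect_left, bisect_right

# Hypothetical event store: keys are (entity, timestamp) tuples kept sorted,
# mimicking how a sorted key-value store orders composite row keys.
events = sorted([
    (("E1", 100), "login"),
    (("E1", 200), "click"),
    (("E1", 300), "logout"),
    (("E2", 150), "login"),
])
keys = [k for k, _ in events]


def events_between(entity, t1, t2):
    """All events about `entity` with t1 <= timestamp <= t2 (one range scan)."""
    lo = bisect_left(keys, (entity, t1))
    hi = bisect_right(keys, (entity, t2))
    return [v for _, v in events[lo:hi]]


def most_recent(entity, n=1):
    """The n most recent events about `entity`, newest first."""
    lo = bisect_left(keys, (entity, float("-inf")))
    hi = bisect_right(keys, (entity, float("inf")))
    return [v for _, v in events[lo:hi]][::-1][:n]
```

Both questions touch only one contiguous slice of the key space. The event- and time-centric questions (an aggregate across every entity) cannot be answered that way with this key order, which is why they push toward scans or a different layout.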
  8. How does data get in and out of HBase?

     [Diagram: data flows into HBase through an HBase client (Put, Incr, Append), HBase replication, and an HBase client doing bulk import; it flows out through an HBase client (Gets, short scans), HBase replication, and an HBase client running full scans or Map-Reduce.]
  9. How does data get in and out of HBase?

     [Diagram: the in and out paths split by goal. Low latency: Put, Incr, Append on the way in; Gets and short scans on the way out. High throughput: bulk import on the way in; full scans and Map-Reduce on the way out. HBase replication moves data both in and out.]
  10. What system is most efficient?

     • It is all physics
     • You have a limited I/O budget
     • Use all your I/O by parallelizing access and reading/writing sequentially
     • Choose the system and features that reduce I/O in general

     [Chart axis: IOPs/s/disk]

     Pick the system that is best for your workload!
  11. The physics of Hadoop Storage Systems

     Workload          HBase                         HDFS
     Low latency       ms, cached                    min (MR); seconds (Impala)
     Random read       primary index                 index? small files problem
     Short scan        sorted                        partition
     Full scan         live table (MR on snapshots)  MR, Hive, Impala
     Random write      log structured                not supported
     Sequential write  HBase overhead; bulk load     minimal overhead
     Updates           log structured                not supported
  14. HBase Application use cases

     • There are a lot of HBase applications
       • Some successful, some less so
       • They have common architecture patterns
       • They have common trade-offs
     • Archetypes are common architecture patterns
       • Common across multiple use cases
       • Extracted to be repeatable
     • The Good
       • Simple Entities
       • Messaging Store
       • Graph Store
       • Metrics Store
     • The Maybe
       • Time series DB
       • Combined workloads
     • The Bad
       • Large Blobs
       • Naïve RDBMS port
       • Analytic Archive
  15. Archetype: Simple Entities

     • Purely entity data, no relation between entities
       • Batch or real-time, random writes
       • Real-time, random reads
       • Could be a well-done denormalized RDBMS port
       • Often from many different sources, with poly-structured data
     • Schema
       • Row per entity
       • Row key => entity ID, or hash of entity ID
       • Column qualifier => property / field, possibly timestamp
     • Examples:
       • Geolocation data
       • Search index building
         • Use Solr to make text data searchable
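The "hash of entity ID" row key from this schema can be sketched as follows. This is one possible variant, not the deck's prescribed layout: it prefixes the entity ID with a few hex digits of its MD5 digest so that keys spread across the key space while the original ID stays readable. The function name `entity_row_key`, the `prefix_len` parameter, and the sample row are assumptions made for illustration.

```python
import hashlib


def entity_row_key(entity_id: str, prefix_len: int = 4) -> bytes:
    """Build a row key as <hash prefix>-<entity ID>.

    The hash prefix spreads sequential IDs (user1, user2, ...) across the
    sorted key space; keeping the ID in the key makes rows self-describing.
    """
    digest = hashlib.md5(entity_id.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{entity_id}".encode("utf-8")


# One row per entity; column qualifiers are the entity's properties.
row = {
    "row_key": entity_row_key("user42"),
    "columns": {b"name": b"Ada", b"city": b"Rome"},
}
```

Because the prefix is derived from the ID, a point read can recompute the full key from the ID alone. The trade-off is that scanning entities in ID order is no longer a single contiguous range.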
  16. Simple Entities Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (full scan, Map-Reduce), HBase replication. Axes: low latency vs. high throughput. Also shown: Solr.]
  17. Archetype: Messaging Store

     • Messaging data:
       • Real-time random writes: email, SMS, MMS, IM
       • Real-time random updates: message read, starred, moved, deleted
       • Reading of top-N entries, sorted by time
       • Records are of varying size
       • Some time series, but mostly random read/write
     • Schema
       • Row = user/feed/inbox
       • Row key = UID or UID + time
       • Column qualifier = time or conversation ID + time
     • Examples
       • Facebook Messages, Xiaomi Messages
       • Telco SMS/MMS services
       • Feeds like Tumblr, Pinterest
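A common way to serve "top-N entries, sorted by time" from a sorted store is to put an inverted timestamp after the UID, so the newest message sorts first and top-N becomes a short forward scan. The slide only says "UID + time"; the inversion, the `|` separator, and the names `MAX_TS`, `message_row_key`, and `top_n` are assumptions of this sketch, and the dictionary stands in for a table.

```python
MAX_TS = 2**63 - 1  # largest signed 64-bit value, used to invert timestamps


def message_row_key(uid: str, ts_millis: int) -> bytes:
    """Row key = UID + separator + (MAX_TS - timestamp) as 8 big-endian bytes.

    Big-endian bytes sort like the numbers they encode, so a larger
    (newer) timestamp yields a smaller suffix and sorts earlier.
    """
    reverse_ts = MAX_TS - ts_millis
    return uid.encode("utf-8") + b"|" + reverse_ts.to_bytes(8, "big")


def top_n(store: dict, uid: str, n: int) -> list:
    """Return the n newest messages for uid by reading keys in sorted order."""
    prefix = uid.encode("utf-8") + b"|"
    matching = sorted(k for k in store if k.startswith(prefix))
    return [store[k] for k in matching[:n]]


# Usage: three messages for one inbox, written in arbitrary order.
inbox = {
    message_row_key("alice", 2000): "second",
    message_row_key("alice", 1000): "first",
    message_row_key("alice", 3000): "third",
    message_row_key("bob", 5000): "other inbox",
}
```

With this layout, `top_n(inbox, "alice", 2)` reads the first two keys under the `alice|` prefix and never touches older messages or other users' rows.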
  18. Messages Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (full scan, Map-Reduce), HBase replication. Axes: low latency vs. high throughput.]
  19. Archetype: Graph Data

     • Graph data: all entities and relations
       • Batch or real-time, random writes
       • Batch or real-time, random reads
       • It’s an entity with relation edges
     • Schema
       • Row: node
       • Row key: node ID
       • Column qualifier: edge ID, or property:values
     • Examples
       • Web caches: Yahoo!, Trend Micro
       • Titan Graph DB with HBase storage backend
       • Sessionization (financial transactions, click streams, network traffic)
       • Government (connect the bad guy)
  20. Graph Data Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (full scan, Map-Reduce), HBase replication. Axes: low latency vs. high throughput.]
  21. Archetype: Metrics

     • Frequently updated metrics
       • Increments
       • Roll-ups generated by MR and bulk loaded to HBase
     • Schema
       • Row: entity for a time period
       • Row key: entity-<yymmddhh> (granular time)
       • Column qualifier: property -> count
     • Examples
       • Campaign impression/click counts (ad tech)
       • Sensor data (energy, manufacturing, auto)
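The metrics schema above (one row per entity per hour, one counter per property) can be sketched with plain Python dictionaries. The `counters` structure, the function names, and the sample entity `campaign7` are hypothetical; in HBase the increment would be an atomic server-side counter operation rather than a dictionary update.

```python
from collections import defaultdict
from datetime import datetime, timezone


def metrics_row_key(entity: str, when: datetime) -> str:
    """Row key in the slide's entity-<yymmddhh> form (hour granularity)."""
    return f"{entity}-{when.strftime('%y%m%d%H')}"


# row key -> {property: count}; a stand-in for a table of counters.
counters = defaultdict(lambda: defaultdict(int))


def increment(entity: str, prop: str, when: datetime, amount: int = 1) -> int:
    """Add `amount` to the counter for (entity, hour, property) and return it."""
    key = metrics_row_key(entity, when)
    counters[key][prop] += amount
    return counters[key][prop]


# Usage: two clicks in the same hour land in the same row and column.
ts = datetime(2015, 6, 8, 14, 30, tzinfo=timezone.utc)
increment("campaign7", "clicks", ts)
increment("campaign7", "clicks", ts)
```

All updates for one entity within an hour hit a single row, so reading the hour's metrics is one Get, and a day is a short scan over 24 adjacent keys.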
  22. Messages Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (full scan, Map-Reduce), HBase replication. Axes: low latency vs. high throughput.]
  23. Archetypes: The Bad

     These are not the droids you are looking for
  24. Current HBase weak spots

     • HBase’s architecture can handle a lot
       • Engineering tradeoffs optimize for some use cases and against others
       • HBase can still do things it is not optimal for
       • However, other systems are fundamentally more efficient for some workloads
     • We’ve seen folks forcing apps into HBase
       • If there is only one workload on the data, consider another system
       • If there is a mixed workload, some cases become “maybes”

     Just because it is not good today doesn’t mean it can’t be better tomorrow!
  25. Bad Archetype: Large Blob Store

     • Saving large objects > 3MB per cell
     • Schema
       • Normal entity pattern, but with some columns with large cells
     • Examples
       • Raw photo or video storage in HBase
       • Large, frequently updated structs stored as a single cell
     • Problems:
       • Write amplification when reoptimizing data for read (compactions on large unchanging data)
       • Write amplification when large structs are rewritten to update subfields (cells are atomic, and HBase must rewrite an entire cell)
     • NOTE: Medium Binary Object (MOB) support coming (lots of 100KB-10MB cells)
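The second problem, rewriting a whole cell to change a subfield, is easy to quantify. The sketch below is back-of-the-envelope arithmetic only: the function name and the example sizes (a 3 MB cell, a 100-byte field) are hypothetical, and it ignores compaction and replication, which add further writes on top.

```python
def rewrite_amplification(cell_bytes: int, changed_bytes: int) -> float:
    """Bytes written per byte of logical change when the whole cell is rewritten."""
    if changed_bytes <= 0:
        raise ValueError("changed_bytes must be positive")
    return cell_bytes / changed_bytes


# Hypothetical example: updating a 100-byte field inside a 3 MB cell.
factor = rewrite_amplification(3 * 1024 * 1024, 100)
```

Under these assumed sizes the store writes tens of thousands of bytes for every byte that actually changed. Splitting the struct into one cell per subfield brings the factor close to 1 for the same update.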
  26. Bad Archetype: Naïve RDBMS port

     • A naïve port of an RDBMS into HBase, directly copying the schema
     • Schema
       • Many tables, just like an RDBMS schema
       • Row key: primary key or auto-incrementing key, like the RDBMS schema
       • Column qualifiers: field names
       • Manually do joins, or secondary indexes (not consistent)
     • Solution:
       • HBase is not a SQL database
       • No multi-region/multi-table transactions in HBase (yet)
       • No built-in join support. You must denormalize your schema to use HBase
  27. Bad Archetype: Analytic archive

     • Store purely chronological data, partitioned by time
       • Real-time writes, chronological time as primary index
       • Column-centric aggregations over all rows
       • Bulk reads out, generally for generating periodic reports
     • Schema
       • Row key: date + xxx or salt + date + xxx
       • Column qualifiers: properties with data or counters
     • Example
       • Machine logs organized by date (causes write hotspotting)
       • Full-fidelity clickstream organized by date (as opposed to campaign)
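The "salt + date + xxx" key can be sketched as below, which also shows its cost on the read side. The bucket count, the CRC32-based salt, the key format, and both function names are assumptions for illustration; the slide does not specify how the salt is computed.

```python
import zlib

NUM_BUCKETS = 8  # hypothetical number of salt buckets


def salted_key(date: str, record_id: str) -> str:
    """Row key = <salt>-<date>-<record id>, with the salt derived from the ID.

    Without the salt, every write for today's date lands at the same end of
    the sorted key space, which is the write hotspot the slide mentions.
    """
    salt = zlib.crc32(record_id.encode("utf-8")) % NUM_BUCKETS
    return f"{salt:02d}-{date}-{record_id}"


def scan_prefixes_for_date(date: str) -> list:
    """Key prefixes a reader must scan to collect one day's data."""
    return [f"{salt:02d}-{date}-" for salt in range(NUM_BUCKETS)]
```

The salt spreads one day's writes over `NUM_BUCKETS` key ranges, but a reader who wants that day now has to issue one scan per bucket and merge the results. That read fan-out is part of why this layout struggles as an analytic archive.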
  28. Bad Archetype: Analytic archive Problems

     • HBase is non-optimal when this is the primary use case
       • Will get crushed by frequent full table scans
       • Will get crushed by large compactions
       • Will get crushed by write-side region hot spotting
     • Solution
       • Store in HDFS. Use Parquet columnar data storage + Hive/Impala
       • Build rollups in HDFS + MR. Store and serve rollups in HBase
  29. Archetypes: The Maybe

     And this is crazy | But here’s my data | serve it, maybe!
  30. The Maybes

     • For some applications, doing it right gets complicated
     • More sophisticated or nuanced cases
     • Require considering these questions:
       • When do you choose HBase vs. HDFS storage for time series data?
       • Are there times where bad archetypes are OK?
  31. Time Series: in HBase or HDFS?

     • Time series I/O pattern physics:
       • Reads: colocate related data (make reads cheap and fast)
       • Writes: spread writes out as much as possible (maximize write throughput)
     • HBase: tension between these goals
       • Spreading writes spreads data, making reads inefficient
       • Colocating on write causes hotspots and underutilizes resources by limiting write throughput
     • HDFS: the sweet spot
       • Sequential writes and sequential reads
       • Just write more files in date-dirs; this physically spreads writes but logically groups data
       • Reads for time-centric queries: just read files in the date-dir
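The date-dir layout mentioned for HDFS can be sketched as a path-building helper. The base path `/data/events`, the `yyyy/mm/dd` nesting, and the function names are hypothetical choices; the slide only says files are written into date directories.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath

BASE = PurePosixPath("/data/events")  # hypothetical base directory


def date_dir(day: date) -> PurePosixPath:
    """Directory that holds all files written for one calendar day."""
    return BASE / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


def dirs_for_range(start: date, end: date) -> list:
    """Directories a time-centric query must read, both ends inclusive."""
    days = (end - start).days
    return [date_dir(start + timedelta(days=i)) for i in range(days + 1)]
```

Many writers can each append their own file inside today's directory, so writes stay sequential and spread out, while a query for a time range only has to list and read the matching directories.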
  32. Time Series: data flow

     • Ingest
       • Flume or similar direct tool via app
     • HDFS for historical
       • No real-time serving
       • Batch queries and generate rollups in Hive/MR
       • Faster queries in Impala
     • HBase for recent
       • Serve individual events
       • Serve pre-computed aggregates
  33. Maybe Archetype: Entity Time Series

     • Full-fidelity historical record of metrics
       • Random writes of event data, random reads of a specific event or of aggregate data
     • Schema
       • Row key: entity-timestamp or hash(entity)-timestamp, possibly with a salt added after the entity
       • Column qualifier: granular timestamp -> value
       • Use custom aggregation to consolidate old data
       • Use TTLs to bound and age off old data
     • Examples:
       • OpenTSDB is a system on HBase that handles this for numeric values
         • Lazily aggregates cells for better performance
       • Facebook Insights, ODS
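One way to read the schema above is a coarse timestamp in the row key and a fine-grained offset in the column qualifier, so each row holds a bounded window of points for one entity. The sketch below assumes one-hour rows, second-resolution offsets, and zero-padded decimal timestamps; those choices and all names are hypothetical, and it is not a description of OpenTSDB's actual on-disk format.

```python
ROW_SPAN = 3600  # seconds covered by one row; hypothetical choice

table = {}  # row key -> {qualifier (offset in seconds): value}


def cell_coordinates(entity: str, ts: int):
    """Map (entity, unix timestamp) to (row key, column qualifier)."""
    base = ts - ts % ROW_SPAN           # start of the hour containing ts
    row_key = f"{entity}-{base:010d}"   # zero-padded so keys sort by time
    return row_key, ts - base           # qualifier = offset within the row


def write(entity: str, ts: int, value: float) -> None:
    """Store one data point in the row for its hour."""
    row_key, qualifier = cell_coordinates(entity, ts)
    table.setdefault(row_key, {})[qualifier] = value


# Usage: two points in the same hour share a row; the third starts a new one.
write("sensor1", 7205, 20.5)
write("sensor1", 7260, 20.7)
write("sensor1", 10800, 21.0)
```

Reading an hour for one entity is then a single-row Get, and a longer window is a short scan over adjacent rows for that entity.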
  34. Entity Time Series Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (full scan, Map-Reduce), HBase replication. Axes: low latency vs. high throughput. Also shown: Flume, OpenTSDB, custom app.]
  35. Maybe Archetype: Hybrid Entity Time Series

     • Essentially a combo of the Metrics archetype and Entity Time Series, with bulk loads of rollups via HDFS
       • Land data in HDFS and HBase
       • Keep all data in HDFS for future use
       • Aggregate in HDFS and write to HBase
       • HBase can do some aggregates too (counters)
       • Keep serve-able data in HBase
       • Use TTL to discard old values from HBase
  36. Hybrid Time Series Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (full scan, Map-Reduce), HBase replication. Axes: low latency vs. high throughput. Also shown: Flume, HDFS.]
  37. Meta Archetype: Combined workloads

     • In these cases, the use of HBase depends on the workload
     • Cases where we have multiple workload styles
       • In many cases we want to do multiple things with the same data
       • Primary use case (real time, random access)
       • Secondary use case (analytical)
       • Pick for your primary; here are some patterns for handling your secondary
  38. Operational with Analytical access pattern

     [Diagram: a single cluster serving both workloads. HBase clients issue Put, Incr, Append, bulk import, and Gets and short scans, while an HBase scanner feeds Map-Reduce at high throughput. Full-scan interference causes poor latency. HBase replication is also shown.]
  39. Operational with Analytical access pattern

     [Diagram: the analytical path (HBase scanner, Map-Reduce, high throughput) is separated through HBase replication from the operational read path (HBase client, Gets, short scans), which keeps low latency isolated from full scans. Put, Incr, Append and bulk import remain the write paths.]
  40. MR over Table Snapshots (0.98+)

     • Previously, Map-Reduce jobs over HBase required an online full table scan
     • Take a snapshot and run the MR job over the snapshot files
       • Doesn’t use the HBase client (or any RPC against the RSs)
       • Avoids affecting HBase caches
       • 3-5x perf boost
       • Still requires more IOPs than HDFS raw files

     [Diagram: map and reduce tasks reading the live table vs. reading a snapshot]
  41. Analytic Archive Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (full scan, Map-Reduce), HBase replication. Axes: low latency vs. high throughput.]
  42. Analytic Archive Snapshot Access Pattern

     [Diagram: HBase access paths. In: HBase client (Put, Incr, Append), HBase client (bulk import), HBase replication. Out: HBase client (Gets, short scans), HBase client (snapshot scan, Map-Reduce over a table snapshot), HBase replication. Axes: low latency vs. higher throughput.]
  43. Request Scheduling

     • We want MR for analytics while serving low-latency requests in one cluster
     • Table isolation (proposed, HBASE-6721)
       • Avoid having the load on Table X impact Table Y
     • Request prioritization and scheduling
       • Current default is FIFO; a Deadline scheduler was added
       • Prioritize short requests before long scans
       • Separate RPC handlers for writes, short reads, and long reads
     • Throttling
       • Limit the request throughput of an MR job

     [Diagram: mixed workload vs. isolated workload. In one queue, requests are delayed by long scan requests; in the other, they are rescheduled so new requests get priority.]
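The idea of prioritizing short requests before long scans can be illustrated with a small priority queue. This is a toy model and not HBase's scheduler: the request kinds, the `PRIORITY` table, and the `RequestQueue` class are all hypothetical, and it ignores the starvation of long scans that a real deadline-based scheduler has to bound.

```python
import heapq
from itertools import count

# Lower number = served first. Writes and short reads outrank long scans.
PRIORITY = {"write": 0, "short_read": 0, "long_scan": 1}


class RequestQueue:
    """Serve short requests before long scans; FIFO within a priority class."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # arrival order, used as a tie-breaker

    def submit(self, kind: str, name: str) -> None:
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), name))

    def next_request(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


# Usage: a scan arrives first, but the Get and the Put are served ahead of it.
q = RequestQueue()
q.submit("long_scan", "scan-1")
q.submit("short_read", "get-1")
q.submit("write", "put-1")
order = [q.next_request() for _ in range(len(q))]
```

With plain FIFO the Get and the Put would wait behind `scan-1`; here they are served in arrival order ahead of it, which is the effect the slide's "rescheduled" queue depicts.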
  44. Conclusions

     Pick the system that is best for your workload!
  45. “Big Data” Workloads

     [Chart: workloads arranged by latency (low latency vs. batch) and by access pattern (random access, short scan, full scan). Systems shown: HBase; HBase + MR; HBase + Snapshots (HDFS + MR); HDFS + MR (Hive/Pig); HDFS + Impala.]

     Pick the system that is best for your workload!
  46. HBase is evolving to be an Operational Database

     • Excels at consistent row-centric operations
       • Dev efforts aimed at using all machine resources efficiently, reducing MTTR, and improving latency predictability
     • Projects built on HBase that enable secondary indexing and multi-row transactions
     • Apache Phoenix and others provide a SQL skin for simplified application development
     • Evolution towards OLTP workloads
     • Analytic workloads?
       • Can be done, but will be beaten by direct HDFS + MR/Spark/Impala