HBaseCon 2016 West - Containerizing Apache HBase Clusters

At Facebook, all production HBase clusters run in a containerized environment, with every daemon running inside its own LXC container. Containerization allows us to ensure isolation between services running on the same host and to simplify operations, but sometimes abstractions leak and problems can't be addressed inside the container. In this talk, we will discuss how Facebook runs HBase as a stateful service inside containers, along with some of the issues we've found when doing so.

https://hbase.apache.org/hbasecon-archives.html#HBaseCon_West_2016

Javier Maestro

May 24, 2016

Transcript

  1. Containerizing Apache HBase Clusters
    David Pope & Javier Maestro, Production Engineers - HBase
  2. Container Overview

  3. None
  4. 1964 IBM 360 • 1979 UNIX chroot • 1982 BSD chroot • 1999 FreeBSD jail • 2007-8 cgroups, LXC • 2013 Docker
  5. Container Platforms

  6. None
  7. HBase Containers @ Facebook

  8. LEASE BUY

  9. Buy vs. Lease
    • Our scale
    • Timing
    • Control / Full Ownership
    • Financial
    (Diagram: Infrastructure (os, network); Container Platform (Tupperware); Application Services; Physical (data center, hardware))
  10. Tupperware Overview
    (Diagram: config.tw, scheduler, server, db, host1, host2, host3, host4)

  11. Tupperware Spec

  12. Tupperware Benefits
    • Configuration Spec
    • Deployments
    • Scheduler
    • Health Monitor
    • Logging
    • Canary
    • Web UI / CLI / API
    • Elasticity (auto-scaling)
  13. HBase Cell
    (Diagram: rack, rack, rack, rack)

  14. Types of servers: controllers, nodes

  15. High Availability

  16. Server Pools
    (Diagram: controller pool (jobs), node pool (jobs); regionserver, datanode, master, zk)
  17. Stateful Elastic “cloud”

  18. Behind the Container

  19. The “Noisy Neighbor”

  20. The “Noisy Neighbor”: From the Container
    • High iops on /dev/sda
    • Synchronous logging
    (Diagram: HBase Container, /dev/sda)
  21. The “Noisy Neighbor”: From the Host System
    • Configuration Management putting load on /dev/sda
    • Memory pressure forcing paging on /dev/sda
    • Large configuration subscriptions
    • Bloated packages
    (Diagram: HBase Container, Host System, /dev/sda)
  22. Performance & The Bug

  23. Performance & The Bug: From the Container
    • Increased latency and timeouts
    • Cyclical spikes in io-wait across all of the Region Servers / Datanodes
  24. Performance & The Bug: From the Host System
    • Cyclical spikes across all disks hitting 100% utilization
    • No sign of any applications accessing the disks (iops == 0)
  25. None
  26. Performance & The Bug: From the Hardware RAID Controller
    • Log entries of a “learning cycle” every few minutes
    • Correlation to the drives locking up
    • Configuration mode to enable this “learning cycle”
    (Diagram: learning cycle, HBase Container, Host System)
  27. None
  28. The Scheduler Apocalypse

  29. The Scheduler Apocalypse
    (Diagram: scheduler, host1, host2, host3, host4)

  30. Conclusions

  31. Conclusions
    • Containers provide a rich suite of tools and technologies to create standard, consistent and repeatable services
    • However, there are critical decisions to be made:
        • Buy vs. Lease
        • What parts of the Container Platform to use
    • You still need to be aware of what is happening behind the container
    • The leverage of the container goes both ways
  32. Q&A

  33. Thanks!
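
Slides 23 and 24 describe a symptom that is easy to misread: disks at 100% utilization while no I/O completes (iops == 0). The slides do not say which tooling was used to see this, so the following is only a minimal sketch of how that signature could be detected on a Linux host from two snapshots of `/proc/diskstats`. The function names, the 95% threshold, and the sample counter values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    ios_completed: int  # reads + writes completed since boot
    in_flight: int      # I/Os currently in progress
    io_ticks_ms: int    # milliseconds the device has had I/O in flight


def parse_diskstats(text: str, device: str) -> DiskSample:
    """Extract one device's counters from /proc/diskstats-formatted text."""
    for line in text.splitlines():
        fields = line.split()
        # Layout: major minor name reads ... writes ... in_flight io_ticks ...
        if len(fields) >= 13 and fields[2] == device:
            reads, writes = int(fields[3]), int(fields[7])
            return DiskSample(reads + writes, int(fields[11]), int(fields[12]))
    raise ValueError(f"device {device!r} not found")


def utilization_and_iops(before: DiskSample, after: DiskSample, interval_s: float):
    """Return (percent busy, completed I/Os per second) over the interval."""
    busy_ms = after.io_ticks_ms - before.io_ticks_ms
    util = min(100.0 * busy_ms / (interval_s * 1000.0), 100.0)
    iops = (after.ios_completed - before.ios_completed) / interval_s
    return util, iops


def looks_stalled(before, after, interval_s, util_threshold=95.0) -> bool:
    """True when the device is busy the whole interval yet completes nothing."""
    util, iops = utilization_and_iops(before, after, interval_s)
    return util >= util_threshold and iops == 0


# Made-up snapshots taken one second apart (illustrative values only).
SAMPLE_BEFORE = "8 0 sda 1000 0 8000 500 2000 0 16000 900 0 1200 1400"
SAMPLE_STALLED = "8 0 sda 1000 0 8000 500 2000 0 16000 900 4 2200 5400"
SAMPLE_HEALTHY = "8 0 sda 1100 0 8800 550 2150 0 17200 960 1 1500 1700"
```

On a real host the two snapshots would come from reading `/proc/diskstats` twice, a known interval apart. A device that stays busy while its completion counters do not move suggests the stall is below the application layer, which matches the talk's point that the cause was only visible from the host and the hardware RAID controller, not from inside the container.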