Apache HBase Operations @ Pinterest

HBase Operations on EC2

Jeremy Carroll

June 13, 2013

Transcript

  1. Operations
    Jeremy Carroll
    Operations Engineer
    HBaseCon 2013

  2. We help people discover things they love
    and inspire them to do those things…

  3. HBase in Production
    Overview
    • All running on Amazon Web Services
    • 5 production clusters and growing
    • Mix of SSD and SATA clusters
    • Billions of page views per month

  4. With lots of patches
    Designing for EC2
    • CDH 4.2.x with patches: HDFS-3912, HDFS-4721, HDFS-3703, HDFS-9503
    • HBase 0.94.7 with patches: HBASE-8284, HBASE-8389, HBASE-8434, HBASE-7878
    • One zone per cluster / no rack locality
    • RegionServers - Ephemeral disk only
    • Redundant clusters for availability

  5. Configuration
    Cluster Setup
    • Managed splitting w/pre split tables
    • Bloom filters for pretty much everything
    • Manual / Rolling major compactions
    • Reverse DNS on EC2
    • 3 ZooKeepers in quorum
    • 1 NameNode / Sec-NameNode / Master
    • 1 EBS volume for fsImage / 1 Elastic IP
    • 10-50 nodes per cluster
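
    A rough sketch of the first three bullets (table name 'pins', family 'd', and the split points are assumptions, not Pinterest's actual schema):

    # Pre-split table with a row bloom filter, created once up front
    echo "create 'pins', {NAME => 'd', BLOOMFILTER => 'ROW'}, {SPLITS => ['25', '50', '75']}" | hbase shell

    # hbase-site.xml: disable time-based major compactions so they can be rolled
    # manually, and raise the region max filesize for managed splitting
    #   hbase.hregion.majorcompaction = 0
    #   hbase.hregion.max.filesize    = <large value, e.g. 100 GB>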

  6. Fact-driven “Fry” method using Puppet
    Provisioning
    • User-data passed in to drive config management
    • Repackaged modifications to HDFS / HBase
    • Ubuntu .deb packages created with FPM
    • Synced to S3, nodes configured with s3-apt plugin
    • Mount + format ephemerals on boot
    • Ext4 / nodiratime / nodelalloc / lazy_itable_init
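
    A boot-time sketch of the last two bullets (the instance-store device /dev/xvdb and mount point /mnt/d0 are assumptions):

    # Format the ephemeral disk with lazy inode-table init to keep first boot fast
    mkfs.ext4 -E lazy_itable_init=1 /dev/xvdb
    mkdir -p /mnt/d0
    # Mount with the options from the bullet above
    mount -t ext4 -o defaults,nodiratime,nodelalloc /dev/xvdb /mnt/d0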

  7. Puppet Module
    ---- HBASE MODULE ----
    class { 'hbase':
      cluster          => 'feeds_e',
      namenode         => 'ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com',
      zookeeper_quorum => 'zk1,zk2,zk3',
      hbase_site_opts  => {
        'hbase.replication'                      => true,
        'hbase.snapshot.enabled'                 => true,
        'hbase.snapshot.region.timeout'          => '35000',
        'replication.sink.client.ops.timeout'    => '20000',
        'replication.sink.client.retries.number' => '3',
        'replication.source.size.capacity'       => '4194304',
        'replication.source.nb.capacity'         => '100',
        ...
      },
    }

    ---- FACT BASED VARIABLES ----
    $hbase_heap_size = $ec2_instance_type ? {
      'hi1.4xlarge' => '24000',
      'm2.2xlarge'  => '24000',
      'm2.xlarge'   => '11480',
      'm1.xlarge'   => '11480',
      'm1.large'    => '6500',
      ...
    }

  8. Designed for EC2
    Service Monitoring
    • Wounded (dying) vs Operational
    • High value metrics first
    • Overall health
    • Alive / dead nodes
    • Service up/down
    • Fsck / Blocks / % Space
    • Replication status
    • Regions needing splits
    • fsImage checkpoint
    • Zookeeper quorum
    • Synthetic transactions (get / put)
    • Queues (flush / compaction / rpc)
    • Latency (client / filesystem)
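
    A few of these checks sketched as plain CLI probes (host names, the 'canary' table, and its 'd' family are assumptions, not the actual monitoring code):

    # Synthetic transaction: write then read back a canary cell
    echo "put 'canary', 'probe', 'd:ts', '$(date +%s)'" | hbase shell
    echo "get 'canary', 'probe'" | hbase shell

    # HDFS health: fsck summary plus live/dead nodes and space
    hdfs fsck / | tail -n 20
    hdfs dfsadmin -report | grep -E 'Datanodes|DFS Used%'

    # ZooKeeper quorum: every member should answer "imok"
    for zk in zk1 zk2 zk3; do echo ruok | nc -w 1 "$zk" 2181; echo; done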

  9. Designed for EC2
    Service Monitoring

  10. Instrumentation
    Metrics
    • OpenTSDB for high cardinality metrics
    • Per-region stats collection
    • tcollector
    • RegionServer HTTP JMX
    • HBase REST
    • GangliaContext for hadoop-metrics
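
    A minimal sketch of that pipeline (60030 and 4242 are the usual default ports; the metric name, host names, and tags here are made up):

    # Dump the RegionServer's metrics from the JSON JMX servlet on its info port
    curl -s http://regionserver:60030/jmx -o /tmp/rs-jmx.json

    # Push a single data point into OpenTSDB over the TSD's telnet-style interface
    echo "put hbase.regionserver.compactionQueueSize $(date +%s) 0 host=regionserver" | nc -w 1 tsd 4242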

  11. OpenTSDB
    Slicing and Dicing
    [Charts sliced by Table, RegionServer, and Region]

  12. Using R
    [Plots: Tables, Regions, StoreFiles]

  13. Tuning Performance
    Compaction + Logs

  14. Operational Intelligence
    Dashboards

  15. S3 + HBase Snapshots
    Backups
    • Full NameNode backup every 60 mins
    • EBS volume as a name.dir for crash recovery
    • HBase snapshots + ExportSnapshot
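
    The snapshot half of this, sketched (snapshot name, table 'pins', bucket, and mapper count are assumptions; the S3 URI scheme depends on the Hadoop S3 connector in use):

    # Take a named snapshot, then ship it off-cluster with the bundled ExportSnapshot job
    echo "snapshot 'pins', 'pins_$(date +%Y%m%d)'" | hbase shell
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot "pins_$(date +%Y%m%d)" \
      -copy-to s3n://hbase-backups/snapshots \
      -mappers 16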

  16. Additional Tuning
    Solid State Clusters
    • Lower the block size from 32k
    • Something a lot smaller: 8-16k
    • Placement groups for 10Gb networking
    • Increase DFSBandwidthPerSec
    • Kernel tuning for TCP
    • Compaction threads
    • Disk elevator to noop
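
    Sketching the kernel-side bullets (device name and buffer sizes are assumptions and need benchmarking per instance type):

    # Pass requests straight through to the SSD: noop elevator
    echo noop > /sys/block/xvdb/queue/scheduler

    # TCP buffer tuning for the 10Gb placement-group network
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"

    # hbase-site.xml: more compaction threads to keep up with SSD write throughput
    #   hbase.regionserver.thread.compaction.small
    #   hbase.regionserver.thread.compaction.large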

  17. Process
    Planning for Launch
    • Asynchronous reads / writes through a Pyres queue
    • Allows for tuning a system before it goes live
    • Tuning
    • Schema
    • Hot spots
    • Compaction
    • Canary roll out to new users
    • 10% -> 30% -> 80% -> 100%
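
    The canary gate itself isn't in the deck; a toy sketch of percentage-based gating could look like this (hashing scheme and threshold are assumptions):

    # Send a user down the new HBase-backed path only if their id hashes under the rollout percentage
    ROLLOUT_PCT=10
    USER_ID="$1"
    BUCKET=$(( $(printf '%s' "$USER_ID" | cksum | cut -d' ' -f1) % 100 ))
    if [ "$BUCKET" -lt "$ROLLOUT_PCT" ]; then
      echo "user $USER_ID -> new HBase-backed feature"
    else
      echo "user $USER_ID -> existing path"
    fi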

  18. Pssstt. We’re hiring
