Druid Ecosystem @ Yahoo!

Imply
April 02, 2019

Presentation by Niketh Sabbineni, Principal Engineer @ Yahoo!, for the San Francisco Bay Area Druid Meetup at Unity.

Transcript

  1. 2.

    Who am I
    Yahoo Confidential & Proprietary
    ▪ Principal Engineer @ Yahoo
    ▪ CTO @ Bookpad
    ▪ SDE @ Amazon
  2. 3.

    Flurry Overview
    ▪ Measure → Analyse → Insights → Action
    ▪ 1M Apps
    ▪ 2.1B Devices
    ▪ 100B+ Events (daily)
    ▪ 10B Sessions (daily)
    ▪ Raw data well over 20PB
  3. 4.

    Features
    ▪ Realtime
    ▪ Crash
    ▪ Technical
    ▪ Audience
    ▪ Retention
    ▪ ....
    ▪ ....
    ▪ Free Free Free!
  4. 5.

    Why Druid?
    ▪ Realtime + Batch
    ▪ Horizontally Scalable
    ▪ Sub-Second Query Latency
    ▪ Resilient to failures
    ▪ Custom plugins
  5. 7.

    Architecture
    [Diagram. Component labels: Collectors, Kafka, Storm, HBase, MapReduce, Druid, Druid Metrics Cluster, UI, Programmatic, Alerts, Hive, Pivot, External]
  6. 8.

    Architecture
    [Diagram. Component labels: Collectors, Kafka, Storm, HBase, MapReduce, Druid, Druid Metrics Cluster, UI, Programmatic, Alerts, Hive, Pivot, External. The Collectors / Kafka / Storm / HBase / MapReduce / Druid group appears twice, with a "Replication" label.]
  7. 9.

    Architecture
    • 300 Historicals - 256GB RAM / 7TB SSD
    • 80 Middle Managers
    • HDFS / Kafka
    • 5 clusters in Flurry
    • 18 clusters in Yahoo/Oath
    • Imply Pivot / Superset
    • Hive / SQL
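To put the Historical tier in perspective, the per-node figures on this slide can be multiplied out. This is only back-of-the-envelope arithmetic: it assumes decimal units (1 TB = 1000 GB), which the slide does not specify, and it ignores segment replication, so usable capacity for unique data would be lower.

```python
# Figures taken from the slide.
HISTORICALS = 300
RAM_GB_PER_NODE = 256
SSD_TB_PER_NODE = 7

# Assumption: decimal units (1 TB = 1000 GB); the slide does not say which.
GB_PER_TB = 1000

total_ram_tb = HISTORICALS * RAM_GB_PER_NODE / GB_PER_TB
total_ssd_tb = HISTORICALS * SSD_TB_PER_NODE
ram_to_disk = RAM_GB_PER_NODE / (SSD_TB_PER_NODE * GB_PER_TB)

print(f"Total RAM across Historicals: {total_ram_tb:.1f} TB")
print(f"Total SSD across Historicals: {total_ssd_tb} TB")
print(f"RAM/disk ratio per node: {ram_to_disk:.1%}")
```

The per-node RAM/disk ratio is the quantity the Querying slide suggests keeping constant when nodes are heterogeneous.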
  8. 11.

    Querying
    ▪ Column Types - String/Float/Double/Long/Custom
    ▪ Heterogeneous Nodes - Ensure constant RAM/disk ratio
    ▪ Cost Balancer - diskNormalized
    ▪ Partitioning - Broker uses ShardSpec & less paging
    ▪ Spill to Disk
    ▪ Sketch (Count Distinct) Size - Adjust sketch sizes
    ▪ LimitSpec
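Two of the items above, sketch size and LimitSpec, are set per query. As an illustration only, the following builds a Druid native groupBy query as a Python dict, with a `thetaSketch` aggregator (from the DataSketches extension) whose `size` is adjustable and a `limitSpec` that keeps the top rows. The datasource, dimension, and column names are hypothetical, and the slides do not say which sketch type or sizes were used at Yahoo.

```python
import json


def build_top_distinct_query(datasource, dimension, id_column, interval,
                             sketch_size=16384, limit=10):
    """Build a groupBy query that ranks `dimension` values by distinct `id_column`."""
    # The thetaSketch aggregator requires a power-of-two size; smaller sizes
    # use less memory per sketch at the cost of a larger estimation error.
    if sketch_size < 1 or sketch_size & (sketch_size - 1):
        raise ValueError("sketch_size must be a power of two")
    return {
        "queryType": "groupBy",
        "dataSource": datasource,
        "intervals": [interval],
        "granularity": "all",
        "dimensions": [dimension],
        "aggregations": [
            {
                "type": "thetaSketch",
                "name": "unique_devices",  # hypothetical output name
                "fieldName": id_column,
                "size": sketch_size,
            }
        ],
        # limitSpec sorts the grouped rows and keeps only the first `limit`.
        "limitSpec": {
            "type": "default",
            "limit": limit,
            "columns": [
                {"dimension": "unique_devices", "direction": "descending"}
            ],
        },
    }


# Hypothetical names throughout; none of these come from the talk.
query = build_top_distinct_query(
    datasource="app_events",
    dimension="app_id",
    id_column="device_id",
    interval="2019-04-01/2019-04-02",
    sketch_size=4096,
    limit=5,
)
print(json.dumps(query, indent=2))
```

The resulting JSON would be sent to a Broker's native query endpoint. Whether a smaller sketch is acceptable depends on how much count-distinct error the use case tolerates.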
  9. 12.

    Ingestion
    ▪ Query / Segment Granularity (50% space)
    ▪ Staggered Runs with Replication (30% compute)
    ▪ Partitioning - Can result in smaller segment sizes
    ▪ Reindexing (Dimensions)
    ▪ Late Arriving Data - Periodic Backfills
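The first bullet refers to Druid's `granularitySpec`, where `queryGranularity` controls how finely timestamps are kept when rows are rolled up. The sketch below is a toy simulation of rollup in plain Python, not Druid code. It shows why a coarser query granularity stores fewer rows. The event data and the granularity values in `granularity_spec` are made up for illustration, and the example makes no attempt to reproduce the 50% figure from the slide.

```python
from collections import Counter
from datetime import datetime, timezone

# Shape of a Druid granularitySpec; the chosen values are assumptions.
granularity_spec = {
    "type": "uniform",
    "segmentGranularity": "DAY",   # time span covered by one segment
    "queryGranularity": "HOUR",    # timestamp precision kept after rollup
    "rollup": True,
}


def truncate(ts, granularity):
    """Floor a timestamp to the given granularity."""
    if granularity == "minute":
        return ts.replace(second=0, microsecond=0)
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity}")


def rollup(events, granularity):
    """Sum counts for rows sharing the same truncated timestamp and dimension."""
    totals = Counter()
    for ts, app_id, count in events:
        totals[(truncate(ts, granularity), app_id)] += count
    return dict(totals)


def utc(hour, minute, second):
    return datetime(2019, 4, 2, hour, minute, second, tzinfo=timezone.utc)


events = [
    (utc(10, 0, 5), "app_a", 1),
    (utc(10, 0, 40), "app_a", 1),
    (utc(10, 17, 0), "app_a", 1),
    (utc(10, 17, 30), "app_b", 1),
]

by_minute = rollup(events, "minute")
by_hour = rollup(events, "hour")
print(len(events), len(by_minute), len(by_hour))  # 4 3 2
```

Coarser granularity trades time resolution in queries for storage, so it only helps when dashboards do not need the finer buckets.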
  10. 13.

    Monitoring
    ▪ Health Checks - Processes / Disks / RAM
    ▪ Querying - Time, Failure counts, GC Time, Paging Time
    ▪ Ingestion Tasks - Ingest lag, Waiting task counts
    ▪ Coordinator - Load Queue Size, Disk Size
    ▪ SSL Certificate Expiry
    ▪ Metrics Cluster
    ▪ Use Pivot / Turnilo for root cause analysis
    ▪ Kafka / HDFS - Name node storage
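Of the checks listed, SSL certificate expiry is the simplest to sketch in isolation. The following is a minimal example using Python's standard `ssl` module. The 30-day warning threshold and the function names are assumptions, not something the talk specifies, and the slides do not describe how Yahoo implements this check.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Return days remaining before a certificate's notAfter time.

    `not_after` uses the format found in the 'notAfter' field of
    SSLSocket.getpeercert(), e.g. 'Jan  5 09:34:43 2018 GMT'.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / SECONDS_PER_DAY


def needs_renewal(not_after, warn_days=30, now=None):
    """True when the certificate expires in fewer than `warn_days` days."""
    return days_until_expiry(not_after, now) < warn_days


# Fixed reference time so the example is reproducible.
reference = ssl.cert_time_to_seconds("Jan  1 00:00:00 2030 GMT")
print(days_until_expiry("Jan 31 00:00:00 2030 GMT", now=reference))  # 30.0
print(needs_renewal("Jan 11 00:00:00 2030 GMT", now=reference))      # True
```

In a live check, the `notAfter` string would come from the certificate presented by each Druid endpoint, and the result would feed the same alerting path as the other metrics.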
  11. 14.

    Q/A
    ▪ Niketh Sabbineni niketh@apache.org niketh.sabbineni@gmail.com
    ▪ Ankit Kothari ankitkothari@verizonmedia.com