Elastic Hadoop Clusters on Mesos with Apache Myriad (incubating)

mohit
May 12, 2016

Transcript

  1. Apache Myriad (incubating)

    ApacheCon NA, May 2016 • Mohit Soni, Adam Bordelon • mohit@ me@ • © 2016 Mesosphere, Inc. All Rights Reserved.
  2. AGENDA

    • Why Myriad?
    • What is Myriad?
    • Scheduling Modes
    • Demo!

    Apache Mesos, Hadoop, and Myriad logos are trademarks of the Apache Software Foundation.
  3. WHY MYRIAD?

    [Diagram: a Mesos cluster of six agents beside a YARN cluster of six NodeManagers (NMs), labeled "Isolated clusters".]
  4. UTILIZATION FLUCTUATES

    Server sprawl; excess capacity.
  5. WHY MYRIAD?

    [Diagram: a Mesos cluster of six agents beside a YARN cluster of six NodeManagers (NMs), labeled "Isolated clusters".]
  6. BREAKING SILOS

    [Diagram: Mesos and YARN drawn over one shared pool of agents.]
  7. THE PROBLEM WITH RESOURCE SILOS

    • Overprovision to handle maximum capacity
    • No elasticity to handle spikes/drops in load
    • Hadoop and non-Hadoop services cannot co-locate
      ◦ If they do, they need strong isolation
    • Even test/prod Hadoop clusters don’t co-locate
      ◦ Makes it difficult to share the data layer
  8. AGENDA

    • Why Myriad?
    • What is Myriad?
    • Coarse vs Fine grained scheduling
    • Demo!
  9. WHAT IS MYRIAD?

    • A framework that delegates resources between Apache Mesos and Apache YARN
      ◦ Implements both the Mesos Scheduler interface and the YARN scheduler interface
      ◦ Takes Mesos resource offers and launches/grows NMs
      ◦ Can kill/shrink NMs to give resources back to Mesos
    • With Myriad, Mesos is your datacenter’s kernel, and YARN is a managed service running on top of it
  10. MYRIAD ARCHITECTURE

    [Diagram components: YARN RM, Mesos Scheduler, Web Server.]
  11. MYRIAD ARCHITECTURE

    [Diagram: Myriad/RM and ZK above three Mesos agents, each running an NM and HDFS. Task labels, in agent order: A, B, C; B, C1, C2; A, B, C3.]
  12. AGENDA

    • Why Myriad?
    • What is Myriad?
    • Scheduling Modes
    • Demo!
  13. SCHEDULING MODES

    • Coarse-grained
    • Fine-grained
  14. COARSE GRAINED

    [Diagram: Myriad/RM; Mesos; two agents, each with HDFS. Myriad API call: /api/flexup {"profile": "medium", "instances": 1}]
  15. COARSE GRAINED

    [Diagram: as slide 14, with an "Offer" added. Myriad API call: /api/flexup {"profile": "medium", "instances": 1}]
  16. COARSE GRAINED

    [Diagram: Myriad/RM; Mesos; two agents, each with HDFS; an Executor and NM now running.]
  17. COARSE GRAINED

    [Diagram: as slide 16, with the NM labeled "Resources allocated for future YARN containers".]
  18. FINE GRAINED

    [Diagram: Myriad/RM; Mesos; two agents, each with HDFS; an Executor and NM labeled "Resources allocated for future YARN containers". Myriad API call: /api/flexup {"profile": "zero", "instances": 1}]
  19. FINE GRAINED

    [Diagram: as slide 18, with a second Executor and NM labeled "No upfront resource allocation". Myriad API call: /api/flexup {"profile": "zero", "instances": 1}]
  20. SCHEDULING YARN TASKS

    [Diagram: Myriad/RM; Mesos; two agents, each with HDFS; two Executor/NM pairs; a YARN App.]
  21. COARSE GRAINED

    [Diagram: as slide 20, with containers C1, C2, C3. Label: "Containers launched using previously allocated resources".]
  22. FINE GRAINED

    [Diagram: as slide 21, with containers C1, C2, C3 and an "Offer".]
  23. FINE GRAINED

    [Diagram: as slide 22, with containers C4 and C5 added. Label: "Containers launched when resources from this agent are offered to Myriad".]
  24. TASKS COMPLETE

    [Diagram: Myriad/RM; Mesos; two agents, each with HDFS; two Executor/NM pairs; containers C3 and C4 still running.]
  25. TASKS COMPLETE

    [Diagram: as slide 24, with two labels: "Freed resources remain allocated for future use" and "Freed resources reclaimed by Mesos".]
  26. ALL TASKS COMPLETE

    [Diagram: Myriad/RM; Mesos; two agents, each with HDFS; two Executor/NM pairs; no containers running. Labels: "Freed resources remain allocated for future use" and "Freed resources reclaimed by Mesos".]
  27. AGENDA

    • Why Myriad?
    • What is Myriad?
    • Scheduling Modes
    • Demo!
  28. APACHE MESOS LOVES MYRIAD

    • Support Hadoop2: Run any YARN app, e.g. Hive, Pig
    • Sharing: Remove static partitioning, resource silos
      ◦ Borrow YARN resources when Tier-1 services spike
      ◦ Backfill unused capacity with best-effort Hadoop jobs
    • Portable: Works with unmodified Mesos distro
  29. APACHE HADOOP LOVES MYRIAD

    • Elastic scaling: NM count and capacity
    • Fault-tolerant: Maintain RM/NM capacity
    • Cluster utilization: Colocated services, shared resources
    • Isolation: CPU, memory, disk, network, GPUs, etc.
    • Multitenancy: Many Hadoop clusters on one Mesos cluster
    • Portable: Works with any modern Hadoop2 distro
  30. FEATURES IN MYRIAD 0.1 (DEC 2015)

    • Scale up/down NM capacity via REST API
    • Remote Distribution of RM/NM binaries
    • RM failure/discovery using Marathon/Mesos-DNS
    • Myriad HA, Task reconciliation
    • Job history server, Timeline server
    • UX: REST API and WebUI
  31. FEATURES IN MYRIAD 0.2 (~MAY 2016)

    • Improved Fine-Grained Scaling
    • CGroups fixes
    • Dockerized NM
    • Support for Mesosphere DC/OS
  32. UPCOMING FEATURES

    • Improved multitenancy
    • Revocable resources, reservations
    • Security integration
    • Autoscaling policies (data locality, node drain)
    • Docker networking, IP-per-container, etc.
    • Queuing theory scheduling
  33. RESOURCES

    • http://myriad.incubator.apache.org
    • dev@myriad.incubator.apache.org
    • https://github.com/apache/incubator-myriad
    • https://issues.apache.org/jira/browse/MYRIAD
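
The flexup calls on slides 14 to 19 differ only in the profile they name: "medium" for a coarse-grained NM and "zero" for a fine-grained one. The sketch below builds those two requests without sending them. The path and JSON fields are taken from the slides; the base URL, the helper name `build_flexup_request`, and the use of HTTP PUT are assumptions made for illustration and may not match a real Myriad deployment.

```python
import json
from urllib.request import Request

# Hypothetical base URL: the slides give no host or port.
MYRIAD_BASE_URL = "http://myriad.example:8192"


def build_flexup_request(profile, instances, base_url=MYRIAD_BASE_URL):
    """Build (but do not send) a flexup request like the one on the slides.

    `profile` names an NM size ("medium" and "zero" appear in the deck);
    `instances` is how many NMs to add.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    body = json.dumps({"profile": profile, "instances": instances}).encode("utf-8")
    return Request(
        url=base_url + "/api/flexup",  # path as shown on the slides
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",  # assumption: the slides do not state the HTTP method
    )


# Coarse-grained: the NM starts with a fixed allocation for future containers.
coarse = build_flexup_request("medium", 1)
# Fine-grained: the NM starts with no upfront allocation and grows from offers.
fine = build_flexup_request("zero", 1)

for req in (coarse, fine):
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing either object to `urllib.request.urlopen` would send it; that step is left out because the endpoint above is a placeholder.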