Managing Resources at Scale with Apache Mesos

Slide deck from my talk at the Large Scale Production Engineering (LSPE) Meetup.

Other talks at



June 14, 2014


  1. Managing Resources at Scale with Apache Mesos. Dharmesh Kakadia (@dharmeshkakadia). Large Scale Production Engineering Meetup, June 2014
  2. whoami •  Research Assistant @ Microsoft Research India •  Have been stuck with schedulers •  Working on predicting resource requirements and execution time of distributed jobs/queries, to improve resource management @ MSR •  Love large-scale data/cloud/distributed-* •  Writing a book on Apache Mesos
  3. Mesos is a data center kernel

  4. Why? •  Because distributed systems ◦  everything fails ◦  everything needs to scale, linearly ◦  are hard to get right •  Because Murphy’s law •  Lamport got a Turing Award for a reason
  5. Symptoms •  I have a lot of data or I have a lot of applications •  They are dynamic •  I have low resource utilization
  6. Mesos: Analytics, ML, Schedulers, Graph Processing, Databases, Web frameworks

  7. Why now? •  Single Machine → VMs → Containers •  More powerful machines but even more data •  One kind of analysis → all kinds of analytics •  Static → Dynamic •  Everything connected
  8. Why now? •  Can’t afford static partitioning anymore •  Can’t afford to be inaccessible •  Can’t afford to wait to release the next feature
  9. What do you care about? •  Scalable •  Fault tolerant •  High resource utilization •  Isolation
  10. Bonus •  Mesos-ify anything: extremely easy to port. •  Battle-tested in the field. •  Great community. •  Awesome UI.
  11. Who is using Mesos?

  12. Popular?

  13. Give it a try •  Mesos has always been good at tooling. It's becoming even easier. •  Run on AWS. Now also Elastic Mesos •  Vagrant scripts •  Chef cookbooks •  Binary packages, debs, ...