Managing Resources at Scale with Apache Mesos

Slide 1

Slide 1 text

Managing Resources at Scale with Apache Mesos
Dharmesh Kakadia (@dharmeshkakadia)
Large Scale Production Engineering Meetup, June 2014

Slide 2

Slide 2 text

whoami
●  Research Assistant @ Microsoft Research India
●  Have been stuck with schedulers
●  Working on predicting resource requirements and execution time of distributed jobs and queries, to improve resource management @ MSR
●  Love large scale data/cloud/distributed-*
●  Writing a book on Apache Mesos

Slide 3

Slide 3 text

Mesos is a data center kernel

Slide 4

Slide 4 text

Why?
•  Because distributed systems
   ○  everything fails
   ○  everything needs to scale, linearly
   ○  are hard to get right
•  Because Murphy’s law
•  Lamport got a Turing award for a reason

Slide 5

Slide 5 text

Symptoms
•  I have a lot of data or I have a lot of applications
•  They are dynamic
•  I have low resource utilization

Slide 6

Slide 6 text

Mesos: Analytics, ML, Schedulers, Graph Processing, Databases, Web frameworks

Slide 7

Slide 7 text

Why now?
●  Single Machine → VMs → Containers
●  More powerful machines, but even more data
●  One kind of analysis → all kinds of analytics
●  Static → Dynamic
●  Everything connected

Slide 8

Slide 8 text

Why now?
•  Can’t afford static partitioning anymore
•  Can’t afford to be inaccessible
•  Can’t afford to wait to release the next feature
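
The cost of static partitioning can be illustrated with a small calculation. The sketch below is not Mesos code and does not model the Mesos allocator; the framework names, the four time slots, and the per-slot core demands are all invented for illustration. It only compares how many cores are needed when each framework gets a fixed partition sized for its own peak versus when all frameworks share one pool sized for the combined peak.

```python
# Hypothetical demand (in CPU cores) per framework over four time slots.
# All numbers are invented for illustration; they are not measurements.
demand = {
    "analytics": [8, 2, 1, 6],
    "web":       [2, 7, 8, 3],
    "ml":        [1, 3, 2, 8],
}


def combined_demand(demand):
    """Total cores requested by all frameworks in each time slot."""
    return [sum(slot) for slot in zip(*demand.values())]


def static_cores(demand):
    """Static partitioning: each framework owns a partition sized for its own peak."""
    return sum(max(slots) for slots in demand.values())


def shared_cores(demand):
    """Shared pool: one cluster sized for the peak of the combined demand."""
    return max(combined_demand(demand))


def utilization(demand, capacity):
    """Average fraction of the given capacity in use across all time slots."""
    totals = combined_demand(demand)
    return sum(totals) / (capacity * len(totals))


if __name__ == "__main__":
    for label, cores in (("static", static_cores(demand)),
                         ("shared", shared_cores(demand))):
        print(f"{label}: {cores} cores, {utilization(demand, cores):.0%} average utilization")
```

With these made-up numbers, the static layout needs 24 cores and averages about 53% utilization, while the shared pool needs 17 cores and averages 75%. The gap comes entirely from the frameworks peaking at different times, which is the "dynamic" behaviour the earlier slides describe.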

Slide 9

Slide 9 text

What do you care about?
•  Scalable
•  Fault tolerant
•  High resource utilization
•  Isolation

Slide 10

Slide 10 text

Bonus
•  Mesos-ify anything: porting is extremely easy.
•  Battle tested in the field.
•  Great community.
•  Awesome UI.

Slide 11

Slide 11 text

Who is using Mesos?

Slide 12

Slide 12 text

Popular?

Slide 13

Slide 13 text

Give it a try
•  Mesos has always been good at tooling. It’s becoming even easier.
•  Run over AWS. Now also, Elastic Mesos
•  Vagrant scripts
•  Chef cookbooks
•  Binary packages, debs, ...