Slide 1

Slide 1 text

Benjamin Hindman – @benh
Apache Mesos
incubator.apache.org/mesos
@ApacheMesos

Slide 2

Slide 2 text

history
Berkeley research project including Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica
http://incubator.apache.org/mesos/research.html

Slide 3

Slide 3 text

Mesos aims to make it easier to build distributed applications/frameworks and share datacenter resources

Slide 4

Slide 4 text

applications/frameworks
services
analytics

Slide 5

Slide 5 text

analytics
services
applications/frameworks

Slide 6

Slide 6 text

deploying things today: static partitioning
[diagram: two nodes labeled Hadoop; two nodes labeled service; …]

Slide 7

Slide 7 text

static partitioning considered harmful
[diagram: two nodes labeled Hadoop; two nodes labeled service; …]

Slide 8

Slide 8 text

static partitioning considered harmful
hard to fully utilize machines (e.g., 72 GB RAM and 24 CPUs)
[diagram: two nodes labeled Hadoop; two nodes labeled service; …]

Slide 9

Slide 9 text

static partitioning considered harmful
harder to deal with failures
[diagram: two nodes labeled Hadoop; two nodes labeled service; …; one marked X]

Slide 10

Slide 10 text

static partitioning considered harmful
harder to scale elastically
[diagram: two nodes labeled Hadoop; two nodes labeled service; …; three additional nodes]

Slide 11

Slide 11 text

Mesos
[diagram: two nodes labeled Hadoop; two nodes labeled service; …]

Slide 12

Slide 12 text

level of indirection
[diagram: Mesos over four nodes, with Hadoop, service, …; shown with the earlier layout of two nodes labeled Hadoop, two nodes labeled service, …]

Slide 13

Slide 13 text

a “kernel” for the datacenter
[diagram: Mesos over four nodes, with Hadoop, service, …; shown with the earlier layout of two nodes labeled Hadoop, two nodes labeled service, …]

Slide 14

Slide 14 text

Twitter’s “kernel” for the datacenter
[diagram: Mesos over four nodes, with Hadoop, service, …; shown with the earlier layout of two nodes labeled Hadoop, two nodes labeled service, …]

Slide 15

Slide 15 text

architecture
[diagram: Mesos master; Mesos slave; Mesos slave]

Slide 16

Slide 16 text

architecture
[diagram: three Mesos masters; Mesos slave; Mesos slave]

Slide 17

Slide 17 text

applications/frameworks
1. scheduler

Slide 18

Slide 18 text

architecture
[diagram: service Y scheduler; Mesos master; Mesos slave; Mesos slave]
service Y scheduler: requests resources, assigns tasks

Slide 19

Slide 19 text

frameworks
1. scheduler
2. executor (optional, if you don’t just want to run a single command)

Slide 20

Slide 20 text

architecture
[diagram: service Y scheduler; Mesos master; Mesos slave; Mesos slave; service Y executor; service Y task (Netty server)]
service Y executor: runs tasks, reports status updates

Slide 21

Slide 21 text

architecture
[diagram: service X scheduler; service Y scheduler; Mesos master with allocation module; Mesos slave; Mesos slave; service Y executor; service Y task (Netty server)]
allocation module: decides how to allocate resources

Slide 22

Slide 22 text

“two-level scheduling”
Mesos: controls resource allocations to applications/frameworks
applications/frameworks: make decisions about what to run
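
The split described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the Mesos API: the class and method names (`Master`, `Framework`, `Offer`, `Task`, `resource_offer`, `offer_round`) are hypothetical, and the master's policy here is a naive "offer everything free to each framework in turn".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Offer:
    """Resources on one slave that the master offers (hypothetical type)."""
    slave: str
    cpus: float
    mem_gb: float


@dataclass
class Task:
    """A unit of work a framework wants to run (hypothetical type)."""
    name: str
    cpus: float
    mem_gb: float


class Framework:
    """Second level: decides what to run on the resources it is offered."""

    def __init__(self, name: str, pending: List[Task]):
        self.name = name
        self.pending = list(pending)

    def resource_offer(self, offer: Offer) -> List[Task]:
        """Return the pending tasks that fit in the offer, first fit."""
        cpus, mem = offer.cpus, offer.mem_gb
        launched = []
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_gb <= mem:
                cpus -= task.cpus
                mem -= task.mem_gb
                launched.append(task)
                self.pending.remove(task)
        return launched


class Master:
    """First level: decides which framework is offered which resources."""

    def __init__(self, free: Dict[str, Offer]):
        self.free = free  # free resources per slave

    def offer_round(self, frameworks: List[Framework]) -> Dict[str, List[str]]:
        """Offer each slave's free resources to each framework in turn."""
        placements: Dict[str, List[str]] = {}
        for fw in frameworks:
            for slave, free in self.free.items():
                if free.cpus <= 0 or free.mem_gb <= 0:
                    continue  # nothing left to offer on this slave
                for task in fw.resource_offer(Offer(slave, free.cpus, free.mem_gb)):
                    free.cpus -= task.cpus
                    free.mem_gb -= task.mem_gb
                    placements.setdefault(slave, []).append(f"{fw.name}/{task.name}")
        return placements
```

The master never inspects a framework's task queue, and a framework never sees resources it was not offered; that separation is the point of the two levels.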

Slide 23

Slide 23 text

dominant resource-fairness
default allocation policy (see incubator.apache.org/mesos/research.html for more info)
help us write new allocators!
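
A minimal sketch of the dominant resource fairness idea, assuming every framework runs identical tasks with a fixed per-task demand. It repeatedly gives one task to the framework whose dominant share (its largest fractional use of any single resource) is lowest. The function names and the example numbers are illustrative assumptions, not taken from the Mesos allocator, and dropping a framework once its next task no longer fits is a simplification chosen for this sketch.

```python
from typing import Dict

Resources = Dict[str, float]


def dominant_share(allocated: Resources, total: Resources) -> float:
    """Largest fraction of any one resource that this framework holds."""
    return max(allocated[r] / total[r] for r in total)


def drf_allocate(total: Resources, demands: Dict[str, Resources]) -> Dict[str, int]:
    """Return how many tasks each framework gets under progressive filling."""
    used = {r: 0.0 for r in total}
    alloc = {f: {r: 0.0 for r in total} for f in demands}
    tasks = {f: 0 for f in demands}
    active = set(demands)
    while active:
        # Pick the framework with the lowest dominant share (ties: by name).
        f = min(sorted(active), key=lambda name: dominant_share(alloc[name], total))
        demand = demands[f]
        if all(used[r] + demand[r] <= total[r] for r in total):
            for r in total:
                used[r] += demand[r]
                alloc[f][r] += demand[r]
            tasks[f] += 1
        else:
            # Simplification for this sketch: stop considering a framework
            # once its next task no longer fits.
            active.discard(f)
    return tasks
```

With 9 CPUs and 18 GB in total, a framework "A" needing 1 CPU and 4 GB per task and a framework "B" needing 3 CPUs and 1 GB per task, `drf_allocate` gives A three tasks and B two, so both end up with a dominant share of 2/3.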

Slide 24

Slide 24 text

architecture
[diagram: service X scheduler; service Y scheduler; Mesos master with allocation module; Mesos slave with service X executor and task; Mesos slave with service Y executor and service Y task (Netty server); arrows labeled request and offer]
Mesos slave: launches, isolates, and monitors tasks and executors

Slide 25

Slide 25 text

“kernel” primitives for building frameworks
messaging (unreliable)
mechanisms for high-availability
fault-detection
resource isolation (cgroups)
resource monitoring

Slide 26

Slide 26 text

Mesos
[diagram: Mesos over eight nodes, with Hadoop, Spark, …]

Slide 27

Slide 27 text

Mesos at Twitter
[diagram: Mesos over eight nodes, with Hadoop, Spark, Storm, …]

Slide 28

Slide 28 text

Twitter framework
a framework that makes deploying and managing production servers easy
jobs/servers are submitted to the framework via a configuration file
provides mechanisms:
» rolling restarts/updates
» relaunching processes after failures (if requested)
» and more!
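
To make the "rolling restarts/updates" mechanism concrete, here is a small sketch of the general technique: restart instances a batch at a time and stop as soon as a batch reports a failure, so a bad update cannot take down every instance. The slides do not describe how the Twitter framework implements this; `rolling_restart`, the `restart` callback, and the instance names are all hypothetical.

```python
from typing import Callable, List


def rolling_restart(
    instances: List[str],
    restart: Callable[[str], bool],
    batch_size: int = 1,
) -> List[str]:
    """Restart instances batch by batch; return those restarted successfully.

    `restart` is a caller-supplied function (an assumption of this sketch)
    that restarts one instance and returns True if it came back healthy.
    The rollout halts after the first batch that contains a failure.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    restarted: List[str] = []
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        results = [restart(name) for name in batch]
        restarted.extend(name for name, ok in zip(batch, results) if ok)
        if not all(results):
            break  # leave the remaining instances untouched
    return restarted
```

A smaller `batch_size` keeps more capacity online during the rollout at the cost of a longer restart.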

Slide 29

Slide 29 text

demo  

Slide 30

Slide 30 text

details
50,000+ lines of C++
libprocess for asynchronous actor-style concurrency (github.com/libprocess)
APIs in C++, Java, Python
protobuf for data transport, data types
ZooKeeper support for high-availability
Linux control groups support (LXC/cgroups)

Slide 31

Slide 31 text

frameworks   •  Hadoop  (0.20.205  and  0.20.2-­‐cdh3u3)   •  MPICH2  (Open  Source  MPI  framework)   •  Spark  (github.com/mesos/spark)   •  DPark  (github.com/douban/dpark)   •  Storm  (github.com/nathanmarz/storm)  

Slide 32

Slide 32 text

genomics researchers using Hadoop and Spark
Building a new framework for job workflows, wants to use Spark and Hadoop too
Built DPark (a Python clone of Spark), also running MPI
Hadoop and Spark used by machine learning researchers

Slide 33

Slide 33 text

future
smarter allocator support (priority, weighted fair-sharing, etc.)
better resource monitoring/collection
other primitives for building applications/frameworks systems?
other frameworks!?

Slide 34

Slide 34 text

try it out!
run on bare-metal or virtual machines – develop against Mesos API and run in private datacenter, or the cloud, or both!

Slide 35

Slide 35 text

questions?
incubator.apache.org/mesos
@ApacheMesos