Apache Mesos (Twitter Open Source Open House)

benh
July 26, 2012

Short talk on motivation behind Apache Mesos and the resulting architecture as well as a look into how it's being used at Twitter.


Transcript

  1. history: Berkeley research project including Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica. http://incubator.apache.org/mesos/research.html
  2. Mesos aims to make it easier to build distributed applications/frameworks and share datacenter resources
  3. deploying things today: static partitioning (diagram labels: Node, Hadoop, service, …)
  4. static partitioning considered harmful (diagram labels: Node, Hadoop, service, …)
  5. static partitioning considered harmful: hard to fully utilize machines (e.g., 72 GB RAM and 24 CPUs) (diagram labels: Node, Hadoop, service, …)
  6. static partitioning considered harmful: harder to deal with failures (diagram labels: Node, Hadoop, service, …, X)
  7. static partitioning considered harmful: harder to scale elastically (diagram labels: Node, Hadoop, service, …, with additional Nodes)
  8. Mesos (diagram labels: Node, Hadoop, service, …)
  9. level of indirection (diagram labels: Mesos, Node, Hadoop, service, …)
  10. a “kernel” for the datacenter (diagram labels: Mesos, Node, Hadoop, service, …)
  11. Twitter’s “kernel” for the datacenter (diagram labels: Mesos, Node, Hadoop, service, …)
  12. architecture (diagram labels: three Mesos masters, two Mesos slaves)
  13. architecture: service Y scheduler requests resources, assigns tasks (diagram labels: Mesos master, Mesos slave, service Y scheduler)
  14. frameworks: (1) scheduler; (2) executor (optional, if you don’t just want to run a single command)
  15. architecture: service Y executor runs tasks, reports status updates (diagram labels: Mesos master, Mesos slave, service Y scheduler, service Y executor, service Y task (Netty server))
  16. architecture: allocation module decides how to allocate resources (diagram labels: service X scheduler, service Y scheduler, allocation module, Mesos master, Mesos slave, service Y executor, service Y task (Netty server))
  17. “two-level scheduling”: Mesos controls resource allocations to applications/frameworks; applications/frameworks make decisions about what to run
  18. architecture: Mesos slave launches, isolates, and monitors tasks and executors (diagram labels: service X scheduler, service Y scheduler, allocation module, Mesos master, Mesos slave, service X executor, service Y executor, task, service Y task (Netty server); arrows labeled request and offer)
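The request/offer flow on slides 17 and 18 can be sketched as a toy model in Python. This is not the Mesos API: `ToyMaster`, `ToyFramework`, `Offer`, `Task`, and the round-robin allocation policy are all hypothetical names and assumptions made for illustration. Only the division of labor comes from the slides: Mesos decides which framework is offered resources, and the framework decides what to run on them.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    slave: str
    cpus: int
    mem_gb: int


@dataclass
class Task:
    name: str
    cpus: int
    mem_gb: int


class ToyFramework:
    """Second level (hypothetical): picks which pending tasks to run on an offer."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)

    def resource_offer(self, offer):
        cpus, mem_gb = offer.cpus, offer.mem_gb
        accepted = []
        for task in list(self.pending):  # iterate over a copy while removing
            if task.cpus <= cpus and task.mem_gb <= mem_gb:
                accepted.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem_gb -= task.mem_gb
        return accepted


class ToyMaster:
    """First level (hypothetical): decides which framework gets each offer."""

    def __init__(self, slaves):
        self.free = dict(slaves)  # slave name -> (free cpus, free mem_gb)
        self.frameworks = []

    def register(self, framework):
        self.frameworks.append(framework)

    def allocate_round(self):
        placements = []
        for i, slave in enumerate(sorted(self.free)):
            cpus, mem_gb = self.free[slave]
            # Stand-in allocation policy: plain round-robin over frameworks.
            framework = self.frameworks[i % len(self.frameworks)]
            for task in framework.resource_offer(Offer(slave, cpus, mem_gb)):
                cpus -= task.cpus
                mem_gb -= task.mem_gb
                placements.append((slave, framework.name, task.name))
            self.free[slave] = (cpus, mem_gb)
        return placements


# Two machines sized like the example on slide 5 (24 CPUs, 72 GB RAM).
master = ToyMaster({"node-1": (24, 72), "node-2": (24, 72)})
master.register(ToyFramework("hadoop", [Task("map-1", 8, 16), Task("map-2", 8, 16)]))
master.register(ToyFramework("service-y", [Task("netty-1", 4, 8)]))
placements = master.allocate_round()
print(placements)
print(master.free)
```

The master never inspects a framework's pending tasks, and a framework never sees resources it was not offered. That separation is what lets different frameworks share one pool of nodes.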
  19. “kernel” primitives for building frameworks: messaging (unreliable); mechanisms for high-availability; fault-detection; resource isolation (cgroups); resource monitoring
  20. Mesos (diagram labels: Mesos, Node, Hadoop, Spark, …)
  21. Mesos at Twitter (diagram labels: Mesos, Node, Hadoop, Spark, Storm, …)
  22. Twitter framework: a framework that makes deploying and managing production servers easy; jobs/servers are submitted to the framework via a configuration file; provides mechanisms:
      » rolling restarts/updates
      » relaunching processes after failures (if requested)
      » and more!
  23. details: 50,000+ lines of C++; libprocess for asynchronous actor-style concurrency (github.com/libprocess); APIs in C++, Java, Python; protobuf for data transport, data types; zookeeper support for high-availability; linux control groups support (LXC/cgroups)
  24. frameworks:
      • Hadoop (0.20.205 and 0.20.2-cdh3u3)
      • MPICH2 (Open Source MPI framework)
      • Spark (github.com/mesos/spark)
      • DPark (github.com/douban/dpark)
      • Storm (github.com/nathanmarz/storm)
  25. genomics researchers using Hadoop and Spark; Building a new framework for job workflows, wants to use Spark and Hadoop too; Built DPark (a Python clone of Spark), also running MPI; Hadoop and Spark used by machine learning researchers
  26. future: smarter allocator support (priority, weighted fair-sharing, etc.); better resource monitoring/collection; other primitives for building applications/frameworks systems?; other frameworks!?
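As a rough illustration of what "weighted fair-sharing" on slide 26 could mean, here is a minimal Python sketch. It is an assumption for illustration only, not the Mesos allocator: the function names, the weights, and the single-resource (CPU only) model are hypothetical. Each CPU goes to the framework whose current allocation is smallest relative to its weight.

```python
def pick_next_framework(allocated, weights):
    """Return the framework with the smallest allocation-to-weight ratio.

    Ties are broken by name so the result is deterministic.
    """
    return min(sorted(allocated), key=lambda fw: allocated[fw] / weights[fw])


def share_cpus(total_cpus, weights):
    """Hand out CPUs one at a time using weighted fair sharing."""
    allocated = {fw: 0 for fw in weights}
    for _ in range(total_cpus):
        allocated[pick_next_framework(allocated, weights)] += 1
    return allocated


# Hypothetical weights: the service counts twice as much as Hadoop.
shares = share_cpus(24, {"hadoop": 1, "service": 2})
print(shares)
```

With these weights the 24 CPUs split in a 1:2 ratio. A priority scheme would instead serve the highest-priority framework first before considering the others.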
  27. try it out! run on bare-metal or virtual machines: develop against the Mesos API and run in a private datacenter, or the cloud, or both!