Apache Mesos (Twitter Intern Open House)

benh
August 01, 2012

Tech talk on Apache Mesos for the Twitter Intern Open House.

Transcript

  1. Benjamin Hindman – @benh, Jie Yu – @jie_yu. Apache Mesos, incubator.apache.org/mesos, @ApacheMesos
  2. history: Berkeley research project including Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica (incubator.apache.org/mesos/research.html)
  3. motivation: static partitioning [diagram: nodes statically assigned to analytics and to services]
  4. static partitioning considered harmful [diagram: nodes statically assigned to Hadoop and to a service]
  5. static partitioning considered harmful: hard to fully utilize machines (e.g., 72 GB RAM and 24 CPUs) [diagram: nodes statically assigned to Hadoop and to a service]
  6. static partitioning considered harmful: harder to deal with failures [diagram: nodes statically assigned to Hadoop and to a service, with a failure marked X]
  7. static partitioning considered harmful: harder to scale elastically [diagram: nodes statically assigned to Hadoop and to a service, plus three additional nodes]
  8. Mesos [diagram: Hadoop and a service sharing four nodes through Mesos, shown alongside the statically partitioned nodes]
  9. level of indirection [diagram: Hadoop and a service sharing four nodes through Mesos, shown alongside the statically partitioned nodes]
  10. better utilization [diagram: one node shared by Hadoop tasks and a service] need per-machine isolation!
  11. easier to deal with failures [diagram: Hadoop and a service on four nodes under Mesos, with a failure marked X]
  12. Mesos: 1) efficiently share datacenter resources 2) make it easier to build distributed services and analytics frameworks
  13. a “kernel” for the datacenter [diagram: Hadoop and a service sharing four nodes through Mesos, shown alongside the statically partitioned nodes]
  14. architecture [diagram: three Mesos masters and two Mesos slaves]
  15. architecture [diagram: Mesos master, two Mesos slaves, service Y scheduler] scheduler: requests resources, assigns tasks
  16. services and frameworks: 1. scheduler 2. executor (optional, if you don’t just want to run a single command)
  17. architecture [diagram: Mesos master, two Mesos slaves, service Y scheduler, service Y executor running a service Y task (Netty server)] executor: runs tasks, reports status updates
  18. architecture [diagram: service X scheduler, service Y scheduler, Mesos master with allocation module, two Mesos slaves, service Y executor running a service Y task (Netty server)] allocation module: decides how to allocate resources
  19. “two-level scheduling” Mesos: controls resource allocations to applications/frameworks; applications/frameworks: make decisions about what to run
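
The split described on slide 19 can be sketched as a toy simulation. This is not the Mesos API: `Offer`, `GreedyScheduler`, and `allocate` are hypothetical names, the rotating allocation policy is an assumption made for illustration, and each accepted offer is consumed whole by a single task to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Resources on one node that the allocator offers to a framework."""
    node: str
    cpus: int
    mem_gb: int


class GreedyScheduler:
    """Level two: a framework decides what to run on the offers it receives."""

    def __init__(self, name, task_cpus, task_mem_gb, pending_tasks):
        self.name = name
        self.task_cpus = task_cpus
        self.task_mem_gb = task_mem_gb
        self.pending_tasks = pending_tasks
        self.launched = []  # (node, task index) pairs

    def resource_offer(self, offer):
        """Accept the offer if a pending task fits, otherwise decline it."""
        fits = offer.cpus >= self.task_cpus and offer.mem_gb >= self.task_mem_gb
        if self.pending_tasks > 0 and fits:
            self.pending_tasks -= 1
            self.launched.append((offer.node, len(self.launched)))
            return True
        return False


def allocate(offers, schedulers):
    """Level one: decide which framework is offered which resources.

    Assumed policy: offer each node to the frameworks in rotating order
    until one of them accepts.
    """
    for i, offer in enumerate(offers):
        for j in range(len(schedulers)):
            scheduler = schedulers[(i + j) % len(schedulers)]
            if scheduler.resource_offer(offer):
                break


offers = [Offer(f"node{i}", cpus=4, mem_gb=12) for i in range(4)]
hadoop = GreedyScheduler("hadoop", task_cpus=4, task_mem_gb=8, pending_tasks=3)
service = GreedyScheduler("service", task_cpus=2, task_mem_gb=4, pending_tasks=1)
allocate(offers, [hadoop, service])
print(hadoop.launched)
print(service.launched)
```

The allocator never inspects what a framework wants to run; it only decides who sees which offer. The framework, in turn, never sees the whole cluster; it only answers the offers it is given.
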
  20. architecture [diagram: service X scheduler, service Y scheduler, Mesos master with allocation module, two Mesos slaves, service X executor with a task, service Y executor running a service Y task (Netty server), arrows labeled “request” and “offer”] Mesos slave: launches, isolates, and monitors tasks and executors
  21. “kernel” primitives for building frameworks: messaging (unreliable), mechanisms for high-availability, fault-detection, resource isolation (cgroups), resource monitoring
  22. resource isolation in Mesos: summer intern project (May – August) • why important? • how to achieve it? • current status
  23. w/o isolation [diagram: hadoop scheduler, service scheduler, Mesos master with allocation module, two Mesos slaves, service executor running a service task (Netty server), hadoop executor running an analytic task] Total: 8 CPUs, 24G memory. Each executor offered 4 CPUs, 12G memory; actual 4 CPUs, 12G memory. Memory leak! Each executor still offered 4 CPUs, 12G memory; actual 4 CPUs, 2G memory for one and 4 CPUs, 22G memory for the other.
  24. w/ isolation [diagram: hadoop scheduler, service scheduler, Mesos master with allocation module, two Mesos slaves, service executor running a service task (Netty server), hadoop executor running an analytic task] Total: 8 CPUs, 24G memory. Memory leak! Each executor offered 4 CPUs, 12G memory; actual 4 CPUs, 12G memory.
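
The numbers on slides 23 and 24 can be worked through in a few lines. This is a sketch under stated assumptions: the transcript does not say which executor leaks, so the hadoop executor is picked arbitrarily, and `actual_usage` is a hypothetical helper that models the leak as simply crowding out its neighbor.

```python
TOTAL_MEM_GB = 24  # from the slides: 8 CPUs, 24G memory in total
OFFERED_MEM_GB = {"service": 12, "hadoop": 12}  # 12G offered to each executor


def actual_usage(demand_gb, isolated):
    """Return the memory each executor ends up with, in GB.

    demand_gb maps executor name to the memory it tries to use.
    With isolation, each executor is capped at what it was offered.
    Without isolation, executors take memory in descending order of demand
    (an assumed model of a leak crowding out its neighbor).
    """
    if isolated:
        return {name: min(demand, OFFERED_MEM_GB[name])
                for name, demand in demand_gb.items()}
    remaining = TOTAL_MEM_GB
    usage = {}
    for name, demand in sorted(demand_gb.items(), key=lambda kv: -kv[1]):
        usage[name] = min(demand, remaining)
        remaining -= usage[name]
    return usage


# Assumption: the hadoop executor is the one that leaks, growing to 22G.
demand = {"service": 12, "hadoop": 22}
print(actual_usage(demand, isolated=False))
print(actual_usage(demand, isolated=True))
```

Without isolation the leaking executor ends up with 22G and its neighbor with the remaining 2G, matching slide 23. With isolation both stay at the 12G they were offered, matching slide 24.
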
  25. how to achieve it? virtual machines. pros: ✔ strong isolation ✔ security; cons: ✘ performance ✘ deployment ✘ debugging
  26. how to achieve it? OS containers. pros: ✔ performance ✔ deployment; cons: ✘ weak isolation ✘ security. what we use in Mesos: Linux control groups
  27. Linux control groups (cgroups): isolation for CPU, memory, disk I/O, network I/O; supported by existing Linux kernel; low performance cost; easy resource usage monitoring; event notification mechanism; support pause / resume; simple interface to control
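
The “simple interface to control” on slide 27 is a filesystem: limits are set by writing values into control files. A minimal sketch, assuming the cgroup v1 layout with its `cpu.shares` and `memory.limit_in_bytes` files. The group name `mesos/executor-1`, the helper names, and the choice of 1024 shares per CPU are illustrative assumptions, not taken from the talk. The sketch writes into a scratch directory rather than the real mount point.

```python
import os
import tempfile

SHARES_PER_CPU = 1024  # assumed convention: 1024 is the default cpu.shares value


def cgroup_settings(group, cpus, mem_gb):
    """Map cgroup v1 control files (relative paths) to the values to write."""
    return {
        os.path.join("cpu", group, "cpu.shares"): str(int(cpus * SHARES_PER_CPU)),
        os.path.join("memory", group, "memory.limit_in_bytes"): str(mem_gb * 1024 ** 3),
    }


def apply_settings(root, settings):
    """Write each value under root, creating the group directories as needed."""
    for rel_path, value in settings.items():
        path = os.path.join(root, rel_path)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(value)


settings = cgroup_settings("mesos/executor-1", cpus=4, mem_gb=12)

# Dry run against a scratch directory instead of the real cgroup mount point
# (commonly /sys/fs/cgroup, which requires root to modify).
with tempfile.TemporaryDirectory() as scratch:
    apply_settings(scratch, settings)
    with open(os.path.join(scratch, "memory", "mesos/executor-1",
                           "memory.limit_in_bytes")) as f:
        written_limit = f.read()

print(settings)
print(written_limit)
```
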
  28. current status: support isolation for CPUs and memory -- easily extensible to support disk I/O; support out-of-memory event notification -- admin can define policies (e.g. kill, pause); support pausing and resuming executors; support monitoring actual resource usage -- including a new front-end UI; ready to be checked in!
  29. Mesos [diagram: Hadoop, …, Spark running on nodes shared through Mesos]
  30. Mesos at Twitter [diagram: Hadoop, …, Spark, Storm running on nodes shared through Mesos]
  31. analytics • Hadoop (0.20.205 and 0.20.2-cdh3u3) • MPICH2 (Open Source MPI framework) • Spark (github.com/mesos/spark) • DPark (github.com/douban/dpark) • Storm (github.com/nathanmarz/storm)
  32. details: built in C++, APIs in C++, Java, Python; uses libprocess for asynchronous actor-style concurrency (github.com/libprocess)
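
For readers unfamiliar with the “actor-style concurrency” mentioned on slide 32, here is a minimal sketch of the general idea using only the Python standard library. It does not show the libprocess API (libprocess is a C++ library). `CounterActor` and `send_many` are hypothetical names.

```python
import queue
import threading


class CounterActor:
    """A minimal actor: private state, a mailbox, and one thread draining it."""

    _STOP = object()  # sentinel message that ends the actor's loop

    def __init__(self):
        self.count = 0  # only touched by the actor's own thread while running
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, amount):
        """Asynchronous: enqueue a message and return immediately."""
        self._mailbox.put(amount)

    def stop(self):
        """Ask the actor to finish queued messages, then wait for it to exit."""
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                return
            self.count += message  # messages are handled one at a time


actor = CounterActor()


def send_many():
    for _ in range(100):
        actor.send(1)


senders = [threading.Thread(target=send_many) for _ in range(4)]
for t in senders:
    t.start()
for t in senders:
    t.join()
actor.stop()
print(actor.count)
```

Four threads send concurrently, yet the counter needs no lock: only the actor's own thread ever modifies it, and it handles one message at a time.
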
  33. genomics researchers using Hadoop and Spark; Building a new framework for job workflows, wants to use Spark and Hadoop too; Built DPark (a Python clone of Spark), also running MPI; Hadoop and Spark used by machine learning researchers
  34. try it out! run on bare-metal or virtual machines – develop against Mesos API and run in private datacenter, or the cloud, or both!
  35. Twitter Open Source, twitter.github.com, @TwitterOSS