
Apache Mesos (Twitter Intern Open House)

benh
August 01, 2012


Tech talk on Apache Mesos for the Twitter Intern Open House.


Transcript

  1. Benjamin Hindman (@benh), Jie Yu (@jie_yu)
     Apache Mesos, incubator.apache.org/mesos, @ApacheMesos
  2. history: Berkeley research project including Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, Ion Stoica (incubator.apache.org/mesos/research.html)
  3. motivation: static partitioning
     [diagram: nodes statically partitioned between analytics and services]
  4. frameworks, services

  5. frameworks, services

  6. static partitioning considered harmful
     [diagram: nodes statically partitioned between Hadoop and a service]
  7. static partitioning considered harmful: hard to fully utilize machines (e.g., 72 GB RAM and 24 CPUs)
  8. static partitioning considered harmful: harder to deal with failures
     [diagram: statically partitioned nodes, one marked failed (X)]
  9. static partitioning considered harmful: harder to scale elastically
     [diagram: statically partitioned nodes, with additional nodes]
  10. Mesos
      [diagram: Hadoop and a service sharing one set of nodes through Mesos, next to the statically partitioned layout]
  11. level of indirection: Mesos
      [diagram: as on slide 10]
  12. Mesos: 1) efficiently share datacenter resources

  13. better utilization
      [diagram: Hadoop and a service sharing nodes through Mesos]
  14. better utilization
      [diagram: a single node]
  15. better utilization
      [diagram: one node running a service alongside several Hadoop tasks]
  16. better utilization: need per-machine isolation!
      [diagram: one node running a service alongside several Hadoop tasks]
  17. easier to deal with failures
      [diagram: Hadoop and a service sharing nodes through Mesos, one node marked failed (X)]
  18. enables elasticity
      [diagram: Hadoop and a service sharing nodes through Mesos]
  19. Mesos: 1) efficiently share datacenter resources, 2) make it easier to build distributed services and analytics frameworks
  20. a “kernel” for the datacenter
      [diagram: as on slide 10]
  21. architecture
      [diagram: one Mesos master, two Mesos slaves]
  22. architecture
      [diagram: three Mesos masters, two Mesos slaves]
  23. services and frameworks: 1. scheduler
  24. architecture
      [diagram: Mesos master, two Mesos slaves, and the service Y scheduler, which requests resources and assigns tasks]
  25. services and frameworks: 1. scheduler, 2. executor (optional, if you don’t just want to run a single command)
  26. architecture
      [diagram: Mesos master, two Mesos slaves, service Y scheduler, and a service Y executor with a service Y task (Netty server); the executor runs tasks and reports status updates]
  27. architecture
      [diagram: as on slide 26, plus a service X scheduler and an allocation module at the Mesos master, which decides how to allocate resources]
  28. “two-level scheduling”
      Mesos: controls resource allocations to applications/frameworks
      applications/frameworks: make decisions about what to run
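
The split on slide 28 can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Mesos API: `Offer`, `FrameworkScheduler`, `Master`, and `offer_round` are hypothetical names, and the round-robin choice of which framework receives an offer is a stand-in for a real allocation policy. The master only decides who is offered what; each framework decides which of its own tasks to run on the offer.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources the master offers to one framework (hypothetical type)."""
    slave: str
    cpus: float
    mem_gb: float


class FrameworkScheduler:
    """Second level: the framework decides what to run on an offer."""

    def __init__(self, name, task_cpus, task_mem_gb, pending):
        self.name = name
        self.task_cpus = task_cpus
        self.task_mem_gb = task_mem_gb
        self.pending = list(pending)

    def resource_offer(self, offer):
        """Return the pending tasks this framework launches on the offer."""
        launched = []
        cpus, mem = offer.cpus, offer.mem_gb
        while self.pending and cpus >= self.task_cpus and mem >= self.task_mem_gb:
            launched.append(self.pending.pop(0))
            cpus -= self.task_cpus
            mem -= self.task_mem_gb
        return launched


class Master:
    """First level: the master decides which framework is offered what."""

    def __init__(self, slaves):
        self.free = dict(slaves)  # slave name -> (cpus, mem_gb)

    def offer_round(self, frameworks):
        """Offer each slave's free resources to one framework (round-robin)."""
        placements = []
        for i, (slave, (cpus, mem)) in enumerate(sorted(self.free.items())):
            fw = frameworks[i % len(frameworks)]
            tasks = fw.resource_offer(Offer(slave, cpus, mem))
            # Resources the framework did not use stay free on the slave.
            self.free[slave] = (cpus - len(tasks) * fw.task_cpus,
                                mem - len(tasks) * fw.task_mem_gb)
            placements.extend((fw.name, task, slave) for task in tasks)
        return placements


hadoop = FrameworkScheduler("hadoop", task_cpus=2, task_mem_gb=4,
                            pending=["map-0", "map-1", "map-2"])
service = FrameworkScheduler("service", task_cpus=1, task_mem_gb=2,
                             pending=["netty-0"])
master = Master({"slave1": (4, 12), "slave2": (4, 12)})
placements = master.offer_round([hadoop, service])
for framework, task, slave in placements:
    print(framework, task, slave)
```

In this toy run the hadoop framework fits two of its three tasks into the offer from slave1, and the service framework launches its single task on slave2, leaving the rest of that offer unused.
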
  29. dominant resource fairness: default allocation policy (see incubator.apache.org/mesos/research.html for more info)
      help us write new allocators!
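
For readers who have not seen dominant resource fairness, a minimal sketch of the idea follows. It is not the Mesos allocator: `drf_allocate` and its arguments are hypothetical, and dropping a user once its next task no longer fits is a simplification chosen for this example. A user's dominant share is its largest share of any single resource, and the next task always goes to the user with the smallest dominant share.

```python
def drf_allocate(capacity, demands):
    """Toy dominant resource fairness.

    capacity: resource name -> total amount in the cluster
    demands:  user name -> {resource name -> amount needed per task}
    Returns user name -> number of tasks allocated.
    """
    used = {r: 0 for r in capacity}
    allocated = {u: {r: 0 for r in capacity} for u in demands}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    def dominant_share(user):
        # Largest fraction of any one resource currently held by the user.
        return max(allocated[user][r] / capacity[r] for r in capacity)

    while active:
        # Sorting first makes ties deterministic (alphabetical by user).
        user = min(sorted(active), key=dominant_share)
        demand = demands[user]
        if all(used[r] + demand[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += demand[r]
                allocated[user][r] += demand[r]
            tasks[user] += 1
        else:
            # Simplification: a user whose next task does not fit is done.
            active.discard(user)
    return tasks


capacity = {"cpus": 9, "mem_gb": 18}
demands = {
    "A": {"cpus": 1, "mem_gb": 4},  # memory is A's dominant resource
    "B": {"cpus": 3, "mem_gb": 1},  # CPU is B's dominant resource
}
result = drf_allocate(capacity, demands)
print(result)  # {'A': 3, 'B': 2}
```

With these numbers both users end up with a dominant share of 2/3: A holds 12 of 18 GB and B holds 6 of 9 CPUs.
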
  30. architecture
      [diagram: as on slide 27, plus a service X executor and task on a Mesos slave; the slave launches, isolates, and monitors tasks and executors; labels “request” and “offer”]
  31. “kernel” primitives for building frameworks
      •  messaging (unreliable)
      •  mechanisms for high-availability
      •  fault-detection
      •  resource isolation (cgroups)
      •  resource monitoring
  32. resource isolation in Mesos: summer intern project (May to August)
      •  why important?
      •  how to achieve it?
      •  current status
  33. w/o isolation
      [diagram: Mesos master with allocation module, hadoop and service schedulers, and Mesos slaves running a service executor with a service task (Netty server) and a hadoop executor with an analytic task; total: 8 CPUs, 24G memory]
      before the memory leak, each executor: offered 4 CPUs, 12G memory; actual 4 CPUs, 12G memory
      after the memory leak, one executor: offered 4 CPUs, 12G memory; actual 4 CPUs, 22G memory
      the other executor: offered 4 CPUs, 12G memory; actual 4 CPUs, 2G memory
  34. w/ isolation
      [diagram: as on slide 33; total: 8 CPUs, 24G memory]
      despite the memory leak, each executor: offered 4 CPUs, 12G memory; actual 4 CPUs, 12G memory
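
The numbers on slides 33 and 34 can be reproduced with a small model. This is a sketch, not how a kernel actually arbitrates memory: `actual_memory` is a hypothetical helper, the executor names `leaky` and `other` are placeholders (the transcript does not say which executor leaks), and "the biggest consumer is served first" is a crude assumption standing in for unisolated behavior.

```python
def actual_memory(total_gb, offered_gb, wanted_gb, isolation):
    """Return executor name -> memory (GB) it actually ends up with.

    offered_gb: what each executor was offered
    wanted_gb:  what each executor tries to use (a leak wants more)
    """
    if isolation:
        # With isolation, nobody can exceed the amount it was offered.
        return {name: min(wanted_gb[name], offered_gb[name])
                for name in offered_gb}
    # Without isolation (crude model): the largest consumer grabs memory
    # first and everyone else gets whatever is left on the machine.
    actual = {}
    remaining = total_gb
    for name in sorted(wanted_gb, key=wanted_gb.get, reverse=True):
        actual[name] = min(wanted_gb[name], remaining)
        remaining -= actual[name]
    return actual


offered = {"leaky": 12, "other": 12}
wanted = {"leaky": 22, "other": 12}  # the leak pushes one executor to 22G

without_isolation = actual_memory(24, offered, wanted, isolation=False)
with_isolation = actual_memory(24, offered, wanted, isolation=True)
print(without_isolation)  # {'leaky': 22, 'other': 2}
print(with_isolation)     # {'leaky': 12, 'other': 12}
```
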
  35. how to achieve it? virtual machines
      pros: ✔ strong isolation, ✔ security
      cons: ✘ performance, ✘ deployment, ✘ debugging
  36. how to achieve it? OS containers
      pros: ✔ performance, ✔ deployment
      cons: ✘ weak isolation, ✘ security
      what we use in Mesos: Linux control groups
  37. Linux control groups (cgroups)
      •  isolation for CPU, memory, disk I/O, network I/O
      •  supported by existing Linux kernel
      •  low performance cost
      •  easy resource usage monitoring
      •  event notification mechanism
      •  support pause / resume
      •  simple interface to control
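
The “simple interface to control” on slide 37 is a filesystem: limits are set by writing values into control files inside a cgroup's directory. The sketch below shows the shape of that interface using the cgroup v1 file names `cpu.shares` and `memory.limit_in_bytes`. The slides do not say which files Mesos writes or how it maps CPUs to shares, so `cgroup_settings`, `apply_settings`, and the 1024-shares-per-CPU mapping are assumptions. The demo writes into a temporary directory standing in for a cgroup directory, so it runs without root and changes no real limits.

```python
import os
import tempfile

# Assumption: treat the kernel's default cpu.shares value (1024) as "one CPU".
# cpu.shares is a relative weight under contention, not a hard cap.
CPU_SHARES_PER_CPU = 1024


def cgroup_settings(cpus, mem_gb):
    """Map a resource allocation to cgroup v1 control-file contents."""
    return {
        "cpu.shares": str(int(cpus * CPU_SHARES_PER_CPU)),
        "memory.limit_in_bytes": str(int(mem_gb * 1024 ** 3)),
    }


def apply_settings(cgroup_dir, settings):
    """Write each value into the matching control file of one cgroup.

    On a real system cgroup_dir would be a directory inside a mounted
    cgroup hierarchy; the cpu and memory controllers may be mounted in
    separate hierarchies, which this sketch ignores.
    """
    for filename, value in settings.items():
        with open(os.path.join(cgroup_dir, filename), "w") as f:
            f.write(value)


# Demo against a temporary directory instead of a real cgroup.
with tempfile.TemporaryDirectory() as fake_cgroup:
    settings = cgroup_settings(cpus=4, mem_gb=12)
    apply_settings(fake_cgroup, settings)
    with open(os.path.join(fake_cgroup, "memory.limit_in_bytes")) as f:
        written_limit = f.read()

print(settings)
```
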
  38. current status
      •  support isolation for CPUs and memory
         -- easily extensible to support disk I/O
      •  support out-of-memory event notification
         -- admin can define policies (e.g. kill, pause)
      •  support pausing and resuming executors
      •  support monitoring actual resource usage
         -- including a new front-end UI
      •  ready to be checked in!
  39. monitoring realtime resource usage for each executor

  40. Mesos
      [diagram: Hadoop, Spark, and other frameworks sharing nodes through Mesos]
  41. Mesos at Twitter
      [diagram: Hadoop, Spark, Storm, and other frameworks sharing nodes through Mesos]
  42. demo  

  43. analytics
      •  Hadoop (0.20.205 and 0.20.2-cdh3u3)
      •  MPICH2 (Open Source MPI framework)
      •  Spark (github.com/mesos/spark)
      •  DPark (github.com/douban/dpark)
      •  Storm (github.com/nathanmarz/storm)
  44. details
      built in C++, APIs in C++, Java, Python
      uses libprocess for asynchronous actor-style concurrency (github.com/libprocess)
  45. genomics researchers using Hadoop and Spark
      building a new framework for job workflows, wants to use Spark and Hadoop too
      built DPark (a Python clone of Spark), also running MPI
      Hadoop and Spark used by machine learning researchers
  46. try it out!
      run on bare-metal or virtual machines; develop against the Mesos API and run in a private datacenter, or the cloud, or both!
  47. questions? incubator.apache.org/mesos, @ApacheMesos

  48. Twitter Open Source: twitter.github.com, @TwitterOSS