Apache Mesos with Amazon EC2 SpotFleet

Mesos Meetup Tokyo #2
https://mesos.connpass.com/event/58545/

ref1. Production deployment of the Docker container with Marathon
https://speakerdeck.com/kotatsu360/production-deployment-of-the-docker-container-with-marathon

ref2. 5 Tips for Running Apache Mesos / Marathon in Production
http://tech.vasily.jp/entry/apache-mesos-and-marathon-tips

Tatsuro Mitsuno

July 25, 2017

Transcript

  1. © 2017 VASILY,Inc. Apache Mesos with Amazon EC2 SpotFleet
     Mesos Meetup Tokyo #2
     2017/07/25 Tue. Tatsuro MITSUNO
  2. I am ...
     ▸ Tatsuro Mitsuno
     ▸ 2012/04 Yahoo Japan Corporation
     ▸ 2016/04 VASILY, Inc.
     ▸ Infrastructure Engineer
     ▸ Twitter / GitHub / Qiita: @kotatsu360
     Icon illustrated by YOSHI https://ja-jp.facebook.com/yoshi.yone.7
     Yukihiro "Matz" Matsumoto (VASILY technical advisor)
  3. TOC
     ▸ I am ...
     ▸ Motivation to use Mesos
     ▸ IQON Crawler
     ▸ Apache Mesos with Amazon EC2 SpotFleet
     ▸ AutoScaling
     ▸ Deployment
     ▸ Tips
  4. Motivation to use Apache Mesos

  5. ▸ Lists products gathered from a large number of fashion EC sites
     ▸ One of Japan's largest fashion sites by monthly users

  6. Google Play Best App
     ▸ iOS app
       ▸ App Store BEST
       ▸ Essential certified
     ▸ Android app
       ▸ Google Play Best App (awarded in three years)
       ▸ Best Innovative App grand prize
       ▸ Certified as a Top Developer by Google
     ▸ VASILY is the only company in the world to win Best App in consecutive years
  7. IQON Overview
     [Diagram: EC sites, IQON Crawler (core part of IQON), IQON]
  8. Scale of crawl
     ▸ more than 300 EC sites per day
     ▸ more than 1 million fashion items per day
     Icon credits: online-shop by Vectors Market and dress by Freepik, from www.flaticon.com, licensed under CC BY 3.0
  9. Problem: increasing cost of the crawler (as of Dec 2016)
     [Chart: Items, Cost]
  10. What caused it?
      ▸ Application ...?
      ▸ No. The application is flexible enough.
      ▸ The infrastructure is too legacy.
  11. IQON Crawler Application Overview
      ▸ Parallel distributed processing application (called `worker`)
      ▸ flexible
      [Diagram: workers A, B, C, and D (Ruby), independent from each other via SQS]
  12. IQON Crawler Infrastructure Overview
      ▸ EC2 instances without AutoScaling
      ▸ Manual construction using Chef
      ▸ Workers managed by supervisord
      ▸ inflexible ...
      [Diagram: two EC2 instances, each running workers B, A, A, C, D, A]
  13. Constraint of crawl
      ▸ As a matter of fact, cost reduction is easy: reduce instances.
      ▸ But it is the worst way.
      ▸ It is meaningless if the crawl takes longer.
      Example: “Today only, 50% off!!”
      Icon credits: 24 hours by Vectors Market, Sunny by Madebyoliver, Sleep by Swifticons, and smartphone by Flat Icons, from www.flaticon.com, licensed under CC BY 3.0
  14. Find a good way...
      [Chart: Items, Cost]
      + Avoid lowering the speed of crawling
  15. Mesos is a good way
      [Chart: Items, Cost]

  16. Effect
      ▸ crawler cost: 70% down
      ▸ crawler operating time: 67% down
      Icon credits: money-bag from www.flaticon.com; racing by Freepik from www.flaticon.com; both licensed under CC BY 3.0
  17. TOC
      ▸ I am ...
      ▸ Motivation to use Mesos
      ▸ IQON Crawler
      ▸ Apache Mesos with Amazon EC2 SpotFleet
      ▸ AutoScaling
      ▸ Deployment
      ▸ Tips
  18. Apache Mesos with Amazon EC2 SpotFleet

  19. New infrastructure
      ▸ Scalable
      ▸ Flexible
      ▸ Low cost
  20. New infrastructure
      [Diagram: Deploy; Mesos & AWS Integration]
  21. New infrastructure (summary)
      [Diagram labels: update request; scaling request; control; SpotFleet Instance; scaling request; Lambda; CloudWatch / CloudWatch Events]
  22. New infrastructure (summary)
      [Same diagram, divided into two areas: Application (IQON Crawler) AutoScaling and Infrastructure (Mesos cluster) AutoScaling]
  23. New infrastructure (summary)
      [Same diagram, divided into two areas: Application (IQON Crawler) AutoScaling and Infrastructure (Mesos cluster) AutoScaling]
      Sorry, Application AutoScaling is outside the scope of this presentation.
  24. New infrastructure key factors
      ▸ Infrastructure (Mesos cluster) AutoScaling
      ▸ Separation of source code and container by fetcher
  25. Infrastructure (Mesos cluster) AutoScaling
      ▸ Mesos Slave(s)
        ▸ custom AMI based on Ubuntu 16.04.x LTS
        ▸ mesos-slave and Docker installed
        ▸ hostname, IP, and crontab set using UserData
        ▸ EC2 SpotFleet / AutoScaling
      ▸ Mesos Master
        ▸ custom AMI based on Ubuntu 16.04.x LTS
        ▸ Marathon master, ZooKeeper
        ▸ EC2 Instance / not AutoScaling
  26. Infrastructure (Mesos cluster) AutoScaling
      ▸ The IQON Mesos cluster does not use SpotFleet AutoScaling
      ▸ AutoScaling is triggered by the resources required by containers
      1. An event fires every few minutes
      2. Check the resources required by containers
      3. Check the resources of the Mesos cluster
      4. Change the target capacity
  27. Reason for not using SpotFleet AutoScaling
      ▸ At first I was using SpotFleet AutoScaling.
      ▸ trigger: load average, CPU usage, memory usage, network traffic ...
      [Diagram: a request to increase crawler tasks is accepted ("Increase crawler" / "OK"), but the resources of the cluster are insufficient, so Marathon waits until resources can be reserved. The waiting tasks never run, so instance metrics don't change and the number of instances doesn't increase.]
  28. Reason for not using SpotFleet AutoScaling
      ▸ AutoScaling is triggered by the resources required by containers
      [Diagram: cloudwatch event: 1. check, 2. compare, 3. request. Speech bubbles: "wait!", "I increase container"]
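The check, compare, request loop can be sketched roughly as follows. This is a minimal illustration, not the author's Lambda code: the function names, the per-instance sizes, and the fleet ID are hypothetical, and the script only prints the AWS CLI call it would make (`aws ec2 modify-spot-fleet-request`) instead of running it. How the required resources are collected from Marathon and Mesos is left out.

```shell
#!/bin/sh
# Sketch only: names and numbers below are assumptions, not from the slides.

# required_instances REQ_CPUS REQ_MEM_MB CPUS_PER_INSTANCE MEM_MB_PER_INSTANCE
# Prints how many instances are needed to fit the resources that containers
# ask for, taking the larger of the CPU-based and memory-based counts.
required_instances() {
  awk -v rcpu="$1" -v rmem="$2" -v icpu="$3" -v imem="$4" 'BEGIN {
    by_cpu = rcpu / icpu
    by_mem = rmem / imem
    n = (by_cpu > by_mem) ? by_cpu : by_mem
    c = int(n)
    if (c < n) c++          # round up: a partial instance is a whole instance
    print c
  }'
}

# scale_command FLEET_ID CURRENT_CAPACITY WANTED_CAPACITY
# Prints the AWS CLI call that would change the target capacity, or nothing
# when the fleet already has the wanted size.
scale_command() {
  if [ "$2" -ne "$3" ]; then
    echo "aws ec2 modify-spot-fleet-request --spot-fleet-request-id $1 --target-capacity $3"
  fi
}

# Example: containers need 10 CPUs and 4096 MB in total;
# each instance is assumed to offer 4 CPUs and 16384 MB.
wanted=$(required_instances 10 4096 4 16384)
scale_command "sfr-hypothetical-id" 2 "$wanted"
```

Sizing by requested resources rather than by instance metrics is the point of the slide: tasks that are still waiting in Marathon consume nothing, so load-based triggers never see them.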
  29. AutoScaling Support Script: Check Termination
      if [ "$(curl -s -LI http://169.254.169.254/latest/meta-data/spot/termination-time -o /dev/null -w "%{http_code}")" -eq 404 ]; then
        :  # 404: the instance is not marked for termination
      else
        systemctl stop mesos-slave
        docker ps -q | xargs -r docker stop  # SIGTERM, and after a grace period, SIGKILL.
        # stop monitoring
        # ...
      fi
      ▸ If an instance is marked for termination, stop its tasks gracefully (best effort)
      ▸ The script is installed into /etc/cron.d/ using UserData
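Installing the check from UserData could look roughly like this. The file name `spot-termination-check`, the script path, and the one-minute interval are assumptions; the slide only says the script is placed under /etc/cron.d/ via UserData. Cron cannot run more often than once a minute, so against the two-minute Spot termination notice this stays best effort.

```shell
#!/bin/sh
# Sketch only: file name, script path, and schedule are hypothetical.

# install_termination_cron CRON_DIR SCRIPT_PATH
# Writes a cron.d entry that runs the termination check every minute as root.
install_termination_cron() {
  cat > "$1/spot-termination-check" <<EOF
# Check the Spot termination notice once a minute (the finest cron granularity).
* * * * * root $2
EOF
}

# In UserData the target would be /etc/cron.d; a temporary directory is used
# here so the sketch can be run without root.
cron_dir=$(mktemp -d)
install_termination_cron "$cron_dir" /usr/local/bin/check-spot-termination.sh
cat "$cron_dir/spot-termination-check"
```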
  30. Separation of source code and container by fetcher
      ▸ IQON Docker container images do not include source code
      ▸ Source code is fetched at deployment time using the Mesos fetcher
      1. Copy source code to Amazon S3
      2. Update request
      3. Fetch source code (Amazon S3) and container image (container registry service)
  31. Separation of source code and container by fetcher
      ▸ A container image containing source code is very simple and comfortable.
      ▸ but ...
        ▸ A large number of tags are registered in the container registry
        ▸ Cannot separate application and environment
        ▸ Container image caching does not work at deployment time
      ▸ Instead, fetch source code at deployment time using the Mesos fetcher
      Icon credit: icons made by Roundicons from www.flaticon.com, licensed under Creative Commons BY 3.0
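As an illustration of the fetcher approach, a Marathon app definition can list the source archive under `fetch`, so the image tag stays fixed while the code changes per deployment. Everything specific here is a placeholder rather than the IQON configuration: the app id, image name, archive URL, command, and resource sizes. The sketch assumes the archive is reachable over HTTPS, that it unpacks into a directory named `app`, and that the fetched files are available in the task sandbox (`$MESOS_SANDBOX`).

```shell
#!/bin/sh
# Sketch only: all ids, URLs, and sizes are placeholders.

# marathon_app_json IMAGE SOURCE_ARCHIVE_URL
# Prints a Marathon app definition whose container image carries only the
# runtime environment; the source archive is downloaded by the Mesos fetcher.
marathon_app_json() {
  cat <<EOF
{
  "id": "/crawler/worker-a",
  "cmd": "cd \$MESOS_SANDBOX/app && ruby worker.rb",
  "cpus": 0.5,
  "mem": 512,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "$1" }
  },
  "fetch": [
    { "uri": "$2", "extract": true }
  ]
}
EOF
}

marathon_app_json "registry.example.com/crawler-env:stable" \
  "https://example.com/releases/app.tar.gz"

# Saved as app.json, the definition could be sent to Marathon with, for example:
#   curl -X PUT -H 'Content-Type: application/json' \
#        -d @app.json http://MARATHON_HOST:8080/v2/apps/crawler/worker-a
```

Only the archive URL changes between releases in this layout, which keeps the registry free of per-release tags and lets slaves reuse the cached image.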
  32. More Detail
      Production deployment of the Docker container with Marathon
      https://speakerdeck.com/kotatsu360/production-deployment-of-the-docker-container-with-marathon
  33. TOC
      ▸ I am ...
      ▸ Motivation to use Mesos
      ▸ IQON Crawler
      ▸ Apache Mesos with Amazon EC2 SpotFleet
      ▸ AutoScaling
      ▸ Deployment
      ▸ Tips
  34. Tips

  35. Proxy request to Sandbox
      ▸ Mesos Web UI is very comfortable
      ▸ check resources for the cluster
      ▸ check resources for each task
  36. Proxy request to Sandbox
      Direct HTTP Request: Browser <-> Mesos Slave
      [Diagram: Mesos Master, Mesos Slave; 1. http(s) request, 2. http(s) response, 3. http(s) request, 4. http(s) response]
  37. Proxy request to Sandbox
      Direct HTTP Request is incompatible with AWS Private Subnet.
      [Diagram: ELB, private subnet, Mesos Master, Mesos Slave; ❌ ⭕]
  38. Proxy request to Sandbox
      Direct HTTP Request: Browser <- nginx -> Mesos Slave
      [Diagram: ELB, nginx proxy, private subnet, Mesos Slave]
  39. Proxy request to Sandbox
      ▸ step1: set a unique hostname for mesos-slave using UserData
          echo "$(hostname).mesos-slave.xxxxx.yyyyy" > /etc/mesos-slave/hostname
      ▸ step2: set a CNAME (or Alias) record pointing to the ELB
          *.mesos-slave.xxxxx.yyyyy -> ELB
      ▸ step3: configure nginx
          server {
            # [NOTE] Web UI -> 5051 ELB -> 15051 nginx -> 5051 Mesos Slave
            listen 15051;
            server_name .mesos-slave.xxxxx.yyyyy;
            location / {
              # [NOTE] Set a resolver so that names are re-resolved periodically.
              # A hostname written directly in proxy_pass is only resolved when nginx (re)starts.
              resolver 10.0.0.2;
              if ($host ~* (.*)\.mesos-slave\.xxxxx\.yyyyy) {
                set $mesos_slave_server "$1.YOUR_AWS_REGION.compute.internal";
              }
              proxy_pass http://${mesos_slave_server}:5051;
            }
          }
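The nginx rule maps the public name of a slave to its VPC-internal name. The same mapping, written as a small shell function for illustration (the function name and the sample hostname are hypothetical; `xxxxx.yyyyy` is the placeholder domain from the slide, and unlike the nginx regex this version is case-sensitive):

```shell
#!/bin/sh
# Sketch only: roughly mirrors the nginx "if ($host ~* ...)" rule.

# upstream_for HOST REGION
# Prints the internal upstream address for a *.mesos-slave.xxxxx.yyyyy host,
# or fails when the host does not belong to that domain.
upstream_for() {
  suffix=".mesos-slave.xxxxx.yyyyy"
  case "$1" in
    ?*"$suffix")
      host_part=${1%"$suffix"}   # strip the public suffix, keep the slave hostname
      echo "$host_part.$2.compute.internal:5051"
      ;;
    *)
      return 1
      ;;
  esac
}

upstream_for "ip-10-0-1-23.mesos-slave.xxxxx.yyyyy" "ap-northeast-1"
```

Because the slave's own hostname is embedded in the public name, no per-instance configuration is needed when SpotFleet replaces instances.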
  40. Other Tips
      5 Tips for Running Apache Mesos / Marathon in Production
      http://tech.vasily.jp/entry/apache-mesos-and-marathon-tips
  41. Summary
  42. IQON Crawler x Mesos Cluster
      ▸ IQON Crawler is running on a Mesos cluster
      ▸ The Mesos cluster is running on AWS EC2 SpotFleet
      ▸ Scalable, flexible, and low cost
  43. We are Hiring
      https://www.wantedly.com/companies/vasily