Slide 1

Slide 1 text

© 2017 VASILY,Inc.
Apache Mesos with Amazon EC2 SpotFleet
Mesos Meetup Tokyo #2
2017/07/25 Tue. Tatsuro MITSUNO

Slide 2

Slide 2 text

I am ...
▸ Tatsuro Mitsuno / 光野 達朗
▸ 2012/04 Yahoo Japan Corporation
▸ 2016/04 VASILY, Inc.
▸ Infrastructure Engineer
▸ Twitter / GitHub / Qiita: @kotatsu360
Icon illustrated by YOSHI https://ja-jp.facebook.com/yoshi.yone.7
Yukihiro "Matz" Matsumoto (VASILY technical advisor)

Slide 3

Slide 3 text

TOC
▸ I am ...
▸ Motivation to use Mesos
  ▸ IQON Crawler
▸ Apache Mesos with Amazon EC2 SpotFleet
  ▸ AutoScaling
  ▸ Deployment
▸ Tips

Slide 4

Slide 4 text

Motivation to use Apache Mesos

Slide 5

Slide 5 text

Lists products from fashion EC sites
One of Japan's largest fashion sites

Slide 6

Slide 6 text

Google Play Best App
iOS app
・App Store BEST
・Essentials certified
Android app
・Google Play Best App, awarded in three separate years
・Best Innovative App grand prize
・Top Developer certification from Google
VASILY is the only company in the world to win Best App in consecutive years

Slide 7

Slide 7 text

IQON Overview
[Diagram: EC sites, IQON Crawler (core part of IQON)]

Slide 8

Slide 8 text

Scale of crawl
more than 300 EC sites per day
more than 1 million fashion items per day
Icons: online-shot by Vectors Market, dress by Freepik (www.flaticon.com, licensed by CC 3.0 BY)

Slide 9

Slide 9 text

Problem: the cost of the crawler had increased (as of Dec 2016)
[Chart: Items, Cost]

Slide 10

Slide 10 text

What caused it?
▸ The application ...?
  ▸ No. The application is flexible enough.
▸ The infrastructure is too outdated.

Slide 11

Slide 11 text

IQON Crawler Application Overview
▸ Parallel distributed processing application (each process is called a `worker`)
▸ Flexible
[Diagram: workers A, B, C, D (Ruby), two of each, independent from each other via SQS]

Slide 12

Slide 12 text

IQON Crawler Infrastructure Overview
▸ EC2 instances without AutoScaling
▸ Manual construction using Chef
▸ Workers managed by supervisord
▸ Inflexible ...
[Diagram: two EC2 instances, each running a fixed set of workers (A, A, A, B, C, D)]

Slide 13

Slide 13 text

Constraint of crawl
▸ As a matter of fact, cost reduction is easy: reduce instances.
▸ But that is the worst way.
▸ It is meaningless if the crawl takes longer.
“Today only: 50% off!!”
Icons: 24 hours by Vectors Market, Sunny by Madebyoliver, Sleep by Swifticons, smartphone by Flat Icons (www.flaticon.com, licensed by CC 3.0 BY)

Slide 14

Slide 14 text

Find a good way...
[Chart: Items, Cost]
+ Avoid lowering the speed of crawling

Slide 15

Slide 15 text

Mesos is a good way
[Chart: Items, Cost]

Slide 16

Slide 16 text

Effect
crawler cost: 70% down
crawler operating time: 67% down
Icons: money-bag, racing by Freepik (www.flaticon.com, licensed by CC 3.0 BY)

Slide 17

Slide 17 text

TOC
▸ I am ...
▸ Motivation to use Mesos
  ▸ IQON Crawler
▸ Apache Mesos with Amazon EC2 SpotFleet
  ▸ AutoScaling
  ▸ Deployment
▸ Tips

Slide 18

Slide 18 text

Apache Mesos with Amazon EC2 SpotFleet

Slide 19

Slide 19 text

New infrastructure
▸ Scalable
▸ Flexible
▸ Low cost

Slide 20

Slide 20 text

New infrastructure
[Diagram: Deploy, Mesos & AWS Integration]

Slide 21

Slide 21 text

New infrastructure (summary)
[Diagram: CloudWatch / CloudWatch Events, Lambda, SpotFleet, Instance; arrows labeled "scaling request", "control", "scaling request", "update request"]

Slide 22

Slide 22 text

New infrastructure (summary)
[Diagram: CloudWatch / CloudWatch Events, Lambda, SpotFleet, Instance; arrows labeled "scaling request", "control", "scaling request", "update request"]
▸ Application (IQON Crawler) AutoScaling
▸ Infrastructure (Mesos cluster) AutoScaling

Slide 23

Slide 23 text

New infrastructure (summary)
[Diagram: CloudWatch / CloudWatch Events, Lambda, SpotFleet, Instance; arrows labeled "scaling request", "control", "scaling request", "update request"]
▸ Application (IQON Crawler) AutoScaling
▸ Infrastructure (Mesos cluster) AutoScaling
Sorry, Application AutoScaling is outside the scope of this presentation

Slide 24

Slide 24 text

New infrastructure key factors
▸ Infrastructure (Mesos cluster) AutoScaling
▸ Separation of source code and container by fetcher

Slide 25

Slide 25 text

Infrastructure (Mesos cluster) AutoScaling
▸ Mesos Slave(s)
  ▸ custom AMI based on Ubuntu 16.04.x LTS
  ▸ mesos-slave and Docker installed
  ▸ hostname, IP, and crontab set using UserData
  ▸ EC2 SpotFleet / AutoScaling
▸ Mesos Master
  ▸ custom AMI based on Ubuntu 16.04.x LTS
  ▸ marathon master, zookeeper
  ▸ EC2 Instance / not AutoScaling

Slide 26

Slide 26 text

Infrastructure (Mesos cluster) AutoScaling
▸ The IQON Mesos cluster does not use SpotFleet AutoScaling
▸ AutoScaling is triggered by the resources required by containers
1. An event fires every few minutes
2. Check the resources required by containers
3. Check the resources of the Mesos cluster
4. Change the target capacity
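The four steps above can be sketched in Python as a minimal illustration, not the code that actually runs in the Lambda function. The names `Resources`, `total_required`, and `plan_target_capacity`, the per-instance sizes, and the capacity bounds are all assumptions made for this example. In a real setup the inputs would come from Marathon and the Mesos master, and the result would be sent to SpotFleet as the new target capacity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: float
    mem_mb: float


def total_required(apps):
    """Step 2: sum the resources requested by all container instances.

    `apps` is a list of dicts with the Marathon-style keys
    "cpus", "mem" (MB) and "instances".
    """
    cpus = sum(app["cpus"] * app["instances"] for app in apps)
    mem_mb = sum(app["mem"] * app["instances"] for app in apps)
    return Resources(cpus, mem_mb)


def plan_target_capacity(required, cluster, per_instance, current,
                         minimum=1, maximum=20):
    """Steps 3 and 4: compare required and cluster resources and
    return the new target capacity (number of instances).

    `minimum` and `maximum` are hypothetical safety bounds.
    A negative shortfall means the cluster can shrink.
    """
    cpu_delta = math.ceil((required.cpus - cluster.cpus) / per_instance.cpus)
    mem_delta = math.ceil((required.mem_mb - cluster.mem_mb) / per_instance.mem_mb)
    # The scarcer resource decides how many instances to add or remove.
    delta = max(cpu_delta, mem_delta)
    return max(minimum, min(maximum, current + delta))


if __name__ == "__main__":
    apps = [
        {"cpus": 0.5, "mem": 512, "instances": 10},
        {"cpus": 1, "mem": 1024, "instances": 4},
    ]
    per_instance = Resources(cpus=4, mem_mb=8192)   # assumed instance size
    cluster = Resources(cpus=8, mem_mb=16384)       # two such instances
    print(plan_target_capacity(total_required(apps), cluster, per_instance, current=2))
```

Because the decision is based on what containers request rather than on instance metrics, the cluster grows even while new tasks are still waiting and generating no load.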

Slide 27

Slide 27 text

Why we do not use SpotFleet AutoScaling
▸ At first I was using SpotFleet AutoScaling.
▸ Triggers: load average, CPU usage, memory usage, network traffic ...
[Diagram: a request to increase crawler tasks gets stuck]
▸ Operator: "Increase crawler tasks."
▸ Marathon: "OK ... The resources of the cluster are insufficient. I will wait until resources can be reserved!"
▸ Marathon waits, so instance metrics don't change
▸ Instance metrics don't change, so the number of instances doesn't increase

Slide 28

Slide 28 text

Why we do not use SpotFleet AutoScaling
▸ AutoScaling is triggered by the resources required by containers
[Diagram: while Marathon waits ("wait!"), a CloudWatch event 1. checks, 2. compares, 3. requests; then Marathon says "I increase container"]

Slide 29

Slide 29 text

AutoScaling Support Script: Check Termination
▸ If an instance is marked for termination, stop its tasks gracefully (best effort)
▸ Installed in /etc/cron.d/ using UserData

    # The termination-time endpoint returns 404 until the instance is marked for termination.
    if [ "$(curl -s -LI http://169.254.169.254/latest/meta-data/spot/termination-time -o /dev/null -w "%{http_code}")" -eq 404 ]; then
      :  # not marked; nothing to do
    else
      systemctl stop mesos-slave
      docker ps -q | xargs docker stop  # SIGTERM, and after a grace period, SIGKILL.
      # stop monitoring
      # ...
    fi

Slide 30

Slide 30 text

Separation of source code and container by fetcher
▸ IQON Docker container images do not include source code
▸ Source code is fetched at deployment time using the Mesos fetcher
1. Copy source code to Amazon S3
2. Update request
3. Fetch source code (from Amazon S3) and container image (from the container registry service)

Slide 31

Slide 31 text

Separation of source code and container by fetcher
▸ A container image containing source code is very simple and comfortable.
▸ but ...
  ▸ A large number of tags are registered in the container registry
  ▸ Cannot separate application and environment
  ▸ Container image caching does not work at deployment time
▸ So source code is fetched at deployment time using the Mesos fetcher
Icons made by Roundicons from www.flaticon.com is licensed by Creative Commons BY 3.0
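As a rough illustration of this separation, the Python sketch below builds a Marathon app definition in which the Docker image stays fixed and only the `fetch` URI changes per release. The helper name `build_marathon_app`, the app id, the image name, the archive URLs, the command, and the resource sizes are all hypothetical. The `container` and `fetch` fields follow the Marathon app definition format, and `$MESOS_SANDBOX` is the sandbox directory where fetched files are placed.

```python
import json


def build_marathon_app(app_id, image, source_uri, instances=1):
    """Return a Marathon app definition whose source code is fetched at deploy time."""
    return {
        "id": app_id,
        # Hypothetical start command: run the code that the fetcher extracted.
        "cmd": "cd $MESOS_SANDBOX/app && ./run-worker.sh",
        "cpus": 0.5,
        "mem": 512,
        "instances": instances,
        # The image only carries the runtime environment, so its tag rarely changes.
        "container": {"type": "DOCKER", "docker": {"image": image}},
        # The source archive is downloaded and extracted into the sandbox.
        "fetch": [
            {"uri": source_uri, "extract": True, "executable": False, "cache": False}
        ],
    }


if __name__ == "__main__":
    image = "registry.example.com/crawler-env:stable"  # hypothetical image
    v1 = build_marathon_app("/crawler/worker-a", image,
                            "https://example.com/releases/crawler-v1.tar.gz")
    v2 = build_marathon_app("/crawler/worker-a", image,
                            "https://example.com/releases/crawler-v2.tar.gz")
    # A new release changes only the fetch URI; the cached image is reused.
    print(json.dumps(v2, indent=2))
```

With this layout a deployment is an update request that swaps the archive URI, so the registry does not accumulate one image tag per release and the slaves can keep using their cached image.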

Slide 32

Slide 32 text

More Detail
Production deployment of the Docker container with Marathon
https://speakerdeck.com/kotatsu360/production-deployment-of-the-docker-container-with-marathon

Slide 33

Slide 33 text

TOC
▸ I am ...
▸ Motivation to use Mesos
  ▸ IQON Crawler
▸ Apache Mesos with Amazon EC2 SpotFleet
  ▸ AutoScaling
  ▸ Deployment
▸ Tips

Slide 34

Slide 34 text

Tips

Slide 35

Slide 35 text

Proxy request to Sandbox
▸ Mesos Web UI is very comfortable
  ▸ check resources for the cluster
  ▸ check resources for each task

Slide 36

Slide 36 text

Proxy request to Sandbox
[Diagram: Browser, Mesos Master, Mesos Slave]
1. http(s) request (Browser to Mesos Master)
2. http(s) response
3. http(s) request (Browser to Mesos Slave)
4. http(s) response
Direct HTTP Request: Browser <-> Mesos Slave

Slide 37

Slide 37 text

Proxy request to Sandbox
Direct HTTP requests are incompatible with an AWS private subnet.
[Diagram: Mesos Master and Mesos Slave in a private subnet; ⭕ request through the ELB, ❌ direct request]

Slide 38

Slide 38 text

Proxy request to Sandbox
[Diagram: Browser, ELB, nginx proxy, Mesos Slave (private subnet)]
HTTP request via proxy: Browser <-> nginx <-> Mesos Slave

Slide 39

Slide 39 text

Proxy request to Sandbox
▸ step1: set a unique hostname on mesos-slave using UserData

    echo "$(hostname).mesos-slave.xxxxx.yyyyy" > /etc/mesos-slave/hostname

▸ step2: set a CNAME (or Alias) record pointing to the ELB

    *.mesos-slave.xxxxx.yyyyy -> ELB

▸ step3: configure nginx

    server {
      # [NOTE] Web UI -> 5051 ELB -> 15051 nginx -> 5051 Mesos Slave
      listen 15051;
      server_name .mesos-slave.xxxxx.yyyyy;
      location / {
        resolver 10.0.0.2;
        # [NOTE] Set a resolver so that names are re-resolved on a regular basis.
        #        A hostname written directly in proxy_pass is only resolved when nginx restarts.
        if ($host ~* (.*)\.mesos-slave\.xxxxx\.yyyyy) {
          set $mesos_slave_server "$1.YOUR_AWS_REGION.compute.internal";
        }
        proxy_pass http://${mesos_slave_server}:5051;
      }
    }
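To make the host rewriting easier to follow, here is a small Python sketch of the same mapping that the nginx `if` block performs. It is only an illustration: the function name `upstream_for_host` is hypothetical, and `xxxxx.yyyyy` and `YOUR_AWS_REGION` are the placeholders used on the slide.

```python
import re

# Same pattern as the nginx `if ($host ~* ...)` test (case-insensitive, unanchored).
HOST_PATTERN = re.compile(r"(.*)\.mesos-slave\.xxxxx\.yyyyy", re.IGNORECASE)


def upstream_for_host(host, region="YOUR_AWS_REGION", port=5051):
    """Map the public hostname seen by nginx to the slave's internal URL.

    Returns None when the host does not belong to the mesos-slave domain.
    """
    match = HOST_PATTERN.search(host)
    if match is None:
        return None
    # The captured prefix is the hostname that step1 wrote on the slave.
    internal_name = f"{match.group(1)}.{region}.compute.internal"
    return f"http://{internal_name}:{port}"


if __name__ == "__main__":
    print(upstream_for_host("ip-10-0-1-23.mesos-slave.xxxxx.yyyyy"))
```

The public name only has to carry the slave's own hostname as its first label; nginx can then derive the private DNS name without a per-slave configuration entry.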

Slide 40

Slide 40 text

Other Tips
5 Tips for running Apache Mesos / Marathon in production
http://tech.vasily.jp/entry/apache-mesos-and-marathon-tips

Slide 41

Slide 41 text

Summary

Slide 42

Slide 42 text

IQON Crawler x Mesos Cluster
▸ IQON Crawler is running on a Mesos cluster
▸ The Mesos cluster is running on AWS EC2 SpotFleet
▸ Scalable, flexible, and low cost

Slide 43

Slide 43 text

We are Hiring
https://www.wantedly.com/companies/vasily