Apache Mesos 
with Amazon EC2 SpotFleet

Mesos Meetup Tokyo #2
https://mesos.connpass.com/event/58545/

ref1. Production deployment of the Docker container with Marathon
https://speakerdeck.com/kotatsu360/production-deployment-of-the-docker-container-with-marathon

ref2. Apache Mesos / Marathon を本番で運用するための5つのTips (5 tips for running Apache Mesos / Marathon in production)
http://tech.vasily.jp/entry/apache-mesos-and-marathon-tips

Tatsuro Mitsuno

July 25, 2017

Transcript

  1. © 2017 VASILY,Inc.
    Apache Mesos

    with Amazon EC2 SpotFleet
    Mesos Meetup Tokyo #2

    2017/07/25 Tue. Tatsuro MITSUNO

  2. I am ...
    ▸ Tatsuro Mitsuno / 光野 達朗
    ▸ 2012/04 Yahoo Japan Corporation
    ▸ 2016/04 VASILY, Inc.
    ▸ Infrastructure Engineer
    ▸ Twitter GitHub Qiita: @kotatsu360
    Icon illustrated by YOSHI (https://ja-jp.facebook.com/yoshi.yone.7)
    Yukihiro "Matz" Matsumoto (VASILY technical advisor)

  3. TOC
    ▸ I am ...
    ▸ Motivation to use Mesos
    ▸ IQON Crawler
    ▸ Apache Mesos with Amazon EC2 SpotFleet
    ▸ AutoScaling
    ▸ Deployment
    ▸ Tips

  4. Motivation to use Apache Mesos

  5. Features more than …0,000 products in total from over … fashion EC sites
    One of Japan's largest fashion sites, used by more than …0,000 people every month

  6. Google Play Best App
    iOS app
    ▸ App Store BEST
    ▸ Essential certification
    Android app
    ▸ … Google Play Best App
    ▸ … Google Play Best App
    ▸ … Google Play Best App, Best Innovative App Grand Prize
    ▸ Certified as a Top Developer by Google
    VASILY is the only company in the world to win Best App for … consecutive years

  7. IQON Overview
    EC sites
    core part of IQON
    IQON
    Crawler


  8. Scale of crawl
    ▸ more than 300 EC sites per day
    ▸ more than 1 million fashion items per day
    online-shot: Icons made by Vectors Market from www.flaticon.com is licensed by CC 3.0 BY
    dress: Icons made by Freepik from www.flaticon.com is licensed by CC 3.0 BY

  9. Problem: crawler cost had increased as of December 2016
    [Chart: Items and Cost]

  10. What caused it?
    ▸ The application ... ?
    ▸ No. The application is flexible enough.
    ▸ The infrastructure is too outdated.

  11. IQON Crawler Application Overview
    ▸ Parallel distributed processing application (each process is called a `worker`)
    ▸ Flexible
    ▸ Workers (Ruby) are independent of each other and are decoupled via SQS
    [Diagram: two workers each of types A, B, C, and D, plus SQS]

  12. IQON Crawler Infrastructure Overview
    ▸ EC2 instances without AutoScaling
    ▸ Manual construction using Chef
    ▸ Workers managed by supervisord
    ▸ Inflexible ...
    [Diagram: two EC2 instances, each running the same fixed set of workers: B, A, A, C, D, A]

  13. Constraint of crawl
    ▸ In fact, reducing cost is easy: just reduce the number of instances.
    ▸ But that is the worst way.
    ▸ It is meaningless if the crawl takes longer: time-limited offers would be picked up too late.
    “Today only: 50% off!!”
    24 hours: Icons made by Vectors Market from www.flaticon.com is licensed by CC 3.0 BY
    Sunny: Icons made by Madebyoliver from www.flaticon.com is licensed by CC 3.0 BY
    Sleep: Icons made by Swifticons from www.flaticon.com is licensed by CC 3.0 BY
    smartphone: Icons made by Flat Icons from www.flaticon.com is licensed by CC 3.0 BY

  14. Find a good way...
    [Chart: Items and Cost]
    + Avoid lowering the crawl speed

  15. Mesos is a good way
    [Chart: Items and Cost]

  16. Effect
    ▸ crawler cost: 70% down
    ▸ crawler operating time: 67% down
    money-bag: Icons made by from www.flaticon.com is licensed by CC 3.0 BY
    racing: Icons made by Freepik from www.flaticon.com is licensed by CC 3.0 BY

  17. TOC
    ▸ I am ...
    ▸ Motivation to use Mesos
    ▸ IQON Crawler
    ▸ Apache Mesos with Amazon EC2 SpotFleet
    ▸ AutoScaling
    ▸ Deployment
    ▸ Tips

  18. Apache Mesos with Amazon EC2 SpotFleet

  19. New infrastructure
    ▸ Scalable
    ▸ Flexible
    ▸ Low cost

  20. New infrastructure
    ▸ Deploy
    ▸ Mesos & AWS Integration

  21. New infrastructure (summary)
    [Diagram labels: scaling request, control, SpotFleet Instance, scaling request, Lambda, CloudWatch / CloudWatch Events, update request]

  22. New infrastructure (summary)
    [Same diagram as slide 21, annotated with:]
    ▸ Application (IQON Crawler) AutoScaling
    ▸ Infrastructure (Mesos cluster) AutoScaling

  23. New infrastructure (summary)
    [Same diagram as slide 21, annotated with:]
    ▸ Application (IQON Crawler) AutoScaling
    ▸ Infrastructure (Mesos cluster) AutoScaling
    Sorry, Application AutoScaling is outside the scope of this presentation

  24. New infrastructure key factors
    ▸ Infrastructure (Mesos cluster) AutoScaling
    ▸ Separation of source code and container by fetcher

  25. Infrastructure (Mesos cluster) AutoScaling
    ▸ Mesos Slave(s)
      ▸ custom AMI based on Ubuntu 16.04.x LTS
      ▸ mesos-slave and Docker installed
      ▸ hostname, IP, and crontab set using UserData
      ▸ EC2 SpotFleet / AutoScaling
    ▸ Mesos Master
      ▸ custom AMI based on Ubuntu 16.04.x LTS
      ▸ marathon master, zookeeper
      ▸ EC2 Instance / not AutoScaling

  26. Infrastructure (Mesos cluster) AutoScaling
    ▸ The IQON Mesos cluster does not use SpotFleet AutoScaling
    ▸ AutoScaling is triggered by the resources required by containers
    1. An event fires every few minutes
    2. Check the resources required by containers
    3. Check the resources of the Mesos cluster
    4. Change the target capacity
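The capacity decision in this loop can be pictured with a small shell sketch. This is not the Lambda code from the talk. The function name `required_capacity`, the milli-CPU unit, the min/max bounds, and the example numbers are all assumptions made for illustration. In a real setup the required CPU could be read from Marathon (for example its `/v2/queue` endpoint) and the cluster total from the Mesos master (for example `/metrics/snapshot`), and the result could be applied with `aws ec2 modify-spot-fleet-request`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a SpotFleet target capacity from the CPU that
# containers require. Values are in milli-CPUs so that integer math suffices.

# required_capacity REQUIRED_MILLICPUS MILLICPUS_PER_UNIT MIN_UNITS MAX_UNITS
# Prints the number of capacity units needed, clamped to [MIN_UNITS, MAX_UNITS].
required_capacity() {
  local required=$1 per_unit=$2 min=$3 max=$4
  # Ceiling division with integer arithmetic only.
  local units=$(( (required + per_unit - 1) / per_unit ))
  if (( units < min )); then units=$min; fi
  if (( units > max )); then units=$max; fi
  echo "$units"
}

# Example: containers need 9.5 CPUs and each capacity unit offers 4 CPUs.
required_capacity 9500 4000 1 20   # prints 3

# Applying the result (not executed here; the fleet id is a placeholder):
# aws ec2 modify-spot-fleet-request \
#   --spot-fleet-request-id "$FLEET_ID" --target-capacity "$units"
```

Clamping to a minimum and maximum is a design choice of this sketch, meant to keep a bad reading from scaling the fleet to zero or without bound.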

  27. Why SpotFleet AutoScaling is not used
    ▸ At first I was using SpotFleet AutoScaling.
    ▸ Triggers: load average, CPU usage, memory usage, network traffic ...
    ▸ Request: "Increase the crawler tasks."
    ▸ Marathon: "OK. The resources of the cluster are insufficient... I will wait until resources can be reserved!"
    ▸ Marathon waits -> instance metrics don't change -> the number of instances doesn't increase

  28. Why SpotFleet AutoScaling is not used
    ▸ AutoScaling is triggered by the resources required by containers
    [Diagram: "I increase container" / "wait!"; a CloudWatch event then runs 1. check, 2. compare, 3. request]

  29. AutoScaling Support Script: Check Termination
    # The metadata endpoint returns 404 until the instance is marked for termination.
    # Act only on HTTP 200 so that a failed request does not stop the slave.
    status=$(curl -s -o /dev/null -w "%{http_code}" http://169.254.169.254/latest/meta-data/spot/termination-time)
    if [ "$status" = "200" ]; then
      systemctl stop mesos-slave
      docker ps -q | xargs -r docker stop  # SIGTERM, and after a grace period, SIGKILL.
      # stop monitoring
      # ...
    fi
    If an instance is marked for termination, stop its tasks gracefully (best effort).
    The script is installed in /etc/cron.d/ using UserData.

  30. Separation of source code and container by fetcher
    ▸ IQON Docker container images do not include source code
    ▸ Source code is fetched at deployment time using the Mesos fetcher
    1. Copy source code (Amazon S3)
    2. Update request
    3. Fetch source code and container image (container registry service)
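As a rough illustration of this separation, the shell sketch below renders a Marathon app definition in which the Docker image carries only the runtime environment and the source archive is listed under `fetch`, so the Mesos fetcher downloads it into the task sandbox at launch. It is a sketch under assumptions, not the configuration used in the talk: the app id, image name, archive URL, resource sizes, and start command are hypothetical placeholders, and it assumes the archive is reachable over HTTPS.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: a Marathon app definition whose source code is pulled
# by the Mesos fetcher instead of being baked into the container image.

# render_app_json APP_ID IMAGE SOURCE_ARCHIVE_URL
render_app_json() {
  local app_id=$1 image=$2 source_url=$3
  # \$MESOS_SANDBOX stays literal here; it is expanded inside the container.
  cat <<EOF
{
  "id": "${app_id}",
  "cmd": "cd \$MESOS_SANDBOX && ./run-worker.sh",
  "cpus": 0.5,
  "mem": 512,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "${image}" }
  },
  "fetch": [
    { "uri": "${source_url}", "extract": true, "cache": false }
  ]
}
EOF
}

# Placeholder values for illustration only.
render_app_json "/crawler/worker-a" \
  "registry.example.com/crawler-env:stable" \
  "https://example-bucket.s3.amazonaws.com/crawler/app.tar.gz"
```

With this layout a code release only changes the archive URL (or the object behind it), while the image tag changes only when the environment changes. The rendered JSON could be sent to Marathon's `PUT /v2/apps/{appId}` endpoint as the update request.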

  31. Separation of source code and container by fetcher
    ▸ A container image that contains the source code is very simple and convenient.
    ▸ But ...
      ▸ A large number of tags are registered in the container registry
      ▸ Application and environment cannot be separated
      ▸ Container image caching does not work at deployment time
    ▸ So the source code is fetched at deployment time using the Mesos fetcher
    Icons made by Roundicons from www.flaticon.com is licensed by Creative Commons BY 3.0

  32. More Detail
    Production deployment of the Docker container with Marathon
    https://speakerdeck.com/kotatsu360/production-deployment-of-the-docker-container-with-marathon

  33. TOC
    ▸ I am ...
    ▸ Motivation to use Mesos
    ▸ IQON Crawler
    ▸ Apache Mesos with Amazon EC2 SpotFleet
    ▸ AutoScaling
    ▸ Deployment
    ▸ Tips

  34. Tips

  35. Proxy request to Sandbox
    ▸ The Mesos Web UI is very convenient
      ▸ check resources for the cluster
      ▸ check resources for each task

  36. Proxy request to Sandbox
    Direct HTTP request: Browser <-> Mesos Slave
    1. http(s) request (Browser -> Mesos Master)
    2. http(s) response
    3. http(s) request (Browser -> Mesos Slave)
    4. http(s) response

  37. Proxy request to Sandbox
    Direct HTTP requests are incompatible with an AWS private subnet.
    [Diagram: Mesos Master and Mesos Slave in a private subnet; ⭕ ELB]

  38. Proxy request to Sandbox
    HTTP request via proxy: Browser <- nginx -> Mesos Slave
    [Diagram: ELB -> nginx proxy -> Mesos Slave in the private subnet]

  39. Proxy request to Sandbox
    ▸ step1: set a unique hostname for mesos-slave using UserData
        echo "$(hostname).mesos-slave.xxxxx.yyyyy" > /etc/mesos-slave/hostname
    ▸ step2: set a CNAME (or Alias) record pointing to the ELB
        *.mesos-slave.xxxxx.yyyyy -> ELB
    ▸ step3: configure nginx
        server {
          # [NOTE] Web UI -> 5051 ELB -> 15051 nginx -> 5051 Mesos Slave
          listen 15051;
          server_name .mesos-slave.xxxxx.yyyyy;
          location / {
            resolver 10.0.0.2;
            # [NOTE] The resolver plus a variable in proxy_pass lets nginx
            # re-resolve the name at run time. A hostname written directly
            # in proxy_pass is resolved only when nginx (re)starts.
            if ($host ~* (.*)\.mesos-slave\.xxxxx\.yyyyy) {
              set $mesos_slave_server "$1.YOUR_AWS_REGION.compute.internal";
            }
            proxy_pass http://${mesos_slave_server}:5051;
          }
        }

  40. Other Tips
    Apache Mesos / Marathon を本番で運用するための5つのTips (5 tips for running Apache Mesos / Marathon in production)

    http://tech.vasily.jp/entry/apache-mesos-and-marathon-tips

  41. Summary

  42. IQON Crawler x Mesos Cluster
    ▸ IQON Crawler is running on Mesos cluster
    ▸ Mesos cluster is running on AWS EC2 SpotFleet
    ▸ Scalable, Flexible and Low cost

  43. We are Hiring
    https://www.wantedly.com/companies/vasily
