The Missing Pieces of Amazon ECS (for me)

Takumi Sakamoto

September 21, 2016

Transcript

  1. The Missing Pieces of Amazon ECS (for me)

    Takumi Sakamoto, 2016.09.21
  2. Who am I?

    @takus, Site Reliability Engineer at SmartNews

  3. http://press.forkwell.com/post/150381780957/interview-takus

  4. What is SmartNews?

    • News discovery app for mobile
    • Algorithm-driven article selection
    • 18M+ downloads worldwide
    https://www.smartnews.com/en/
  5. How does SmartNews use Amazon ECS? (Digest ver.)

  6. Amazon ECS in SmartNews

    • Running dozens of production services
        • origin server for CDN
        • internal API
        • web crawler
    • Provides internal deployment tool for developers
        • heroku-style CLI tool
        • service discovery with Consul
        • monitoring with Datadog
        • logging with Fluentd logging driver
  7. Deploy w/ Heroku-style CLI

    Amazon ECR
        $ spaas images:init --repository myapp
        $ docker build -t NAMESPACE/myapp:0.0.1 .
        $ docker push NAMESPACE/myapp:0.0.1
    Amazon ECS
        $ spaas create --service myapp --image NAMESPACE/myapp --tag 0.0.1
        $ spaas deploy --service myapp 0.0.2
        $ spaas rollback --service myapp
        $ spaas ps:scale --service myapp 2
        $ spaas ps:scale:cpu --service myapp 1024
        $ spaas ps:scale:memory --service myapp 2048
        $ spaas ps:role --service myapp MYAPP_IAM_ROLE
        $ spaas config:set SERVICE_TAGS=web
  8. Monitoring w/ Datadog

    • By Container Instance
    • By ECS Task Family
    • By ECS Task Revision
  9. The Missing Pieces

  10. 1. Run a task on every instance

    • Similar to DaemonSet in Kubernetes
        • run an ECS task on every container instance
        • define a RestartPolicy for each task
    • Use case
        • run a log collection daemon (fluentd)
        • run a monitoring daemon (dd-agent)
        • run a cluster storage daemon (glusterd)
  11. ECS Create Daemon API (wished-for; not an existing ECS API as of this talk)

        // Running dd-agent on every container instance in mycluster
        // Restart dd-agent on unexpected failure
        $ aws ecs create-daemon \
            --cluster mycluster \
            --daemon-name dd-agent \
            --task-definition dd-agent:10 \
            --restart-policy always
  12. Workaround

    • Start an ECS task with a user data script
        • the task will not be restarted when it dies
        • hard to deploy a new version
        • other tasks will be deployed without the required daemon
    • Write an ECS scheduler
        • yes, we can, but I want a managed one
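The user data workaround on this slide could look roughly like the sketch below. It is not the script used at SmartNews. It assumes the AWS CLI is installed on the instance, that the instance role allows `ecs:StartTask`, that a default region is configured, and that the ECS agent is already running so its introspection endpoint (`http://localhost:51678/v1/metadata`) answers. The cluster and task definition names are the slide's example values, and the function names are hypothetical.

```shell
#!/bin/bash
# Sketch of the user-data workaround: start the daemon task on this instance.
# CLUSTER and TASK_DEF reuse the example values from the slides (assumptions).
CLUSTER="mycluster"
TASK_DEF="dd-agent:10"

# Read the ECS agent's introspection JSON on stdin and print the value of
# the ContainerInstanceArn field. sed keeps the sketch dependency-free;
# jq would be more robust where it is available.
container_instance_arn() {
  sed -n 's/.*"ContainerInstanceArn"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Ask the local ECS agent for this instance's ARN, then start the task on it.
start_daemon_task() {
  local arn
  arn="$(curl -s http://localhost:51678/v1/metadata | container_instance_arn)"
  if [ -z "$arn" ]; then
    echo "ECS agent not ready; could not determine container instance ARN" >&2
    return 1
  fi
  aws ecs start-task \
    --cluster "$CLUSTER" \
    --task-definition "$TASK_DEF" \
    --container-instances "$arn"
}
```

In user data, `start_daemon_task` would be called once at the end of the script, after the agent has registered. The drawbacks listed on the slide still apply: nothing restarts the task if it dies, and other tasks can be placed on the instance before the daemon is up.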
  13. 2. Mark an instance unschedulable

    • Similar to kubectl cordon in Kubernetes
        • instance will be marked unschedulable
        • new tasks will not be scheduled in this mode
    • Use case
        • container instance replacement for security updates
        • shrink cluster size
  14. ECS Maintenance API (wished-for; not an existing ECS API as of this talk)

        // Mark instance as unschedulable
        $ aws ecs disable-container-instance \
            --cluster mycluster \
            --container-instance 04ad4550-f218-4d91-86f9-e8012b239261
        // Wait until long-running tasks have finished
        // Do maintenance tasks or terminate the instance
        // Mark instance as schedulable
        $ aws ecs enable-container-instance \
            --cluster mycluster \
            --container-instance 04ad4550-f218-4d91-86f9-e8012b239261
  15. Workaround

    • Terminate hook in AutoScaling
        • new tasks will be assigned while waiting for the currently running tasks to finish
        • SpotFleet doesn't support terminate hooks
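A terminate-hook handler along the lines of this slide might be sketched as follows. The function names, the polling interval, and the way the hook name, group name, and instance IDs reach the script are all assumptions. Only the `aws ecs list-tasks` and `aws autoscaling complete-lifecycle-action` calls are existing CLI commands.

```shell
#!/bin/bash
# Poll until no running tasks remain on the given container instance.
# Arguments: cluster name, container instance ID or ARN, poll interval (s).
wait_for_tasks_to_finish() {
  local cluster="$1" instance="$2" interval="${3:-30}" count
  while true; do
    count="$(aws ecs list-tasks \
      --cluster "$cluster" \
      --container-instance "$instance" \
      --query 'length(taskArns)' \
      --output text)"
    if [ "$count" -eq 0 ]; then
      return 0
    fi
    sleep "$interval"
  done
}

# Tell Auto Scaling that the lifecycle hook is done and termination can proceed.
# Arguments: hook name, Auto Scaling group name, EC2 instance ID.
finish_termination() {
  local hook="$1" asg="$2" ec2_instance_id="$3"
  aws autoscaling complete-lifecycle-action \
    --lifecycle-hook-name "$hook" \
    --auto-scaling-group-name "$asg" \
    --lifecycle-action-result CONTINUE \
    --instance-id "$ec2_instance_id"
}
```

A handler would call `wait_for_tasks_to_finish` and then `finish_termination`, with a hook timeout long enough to cover the wait. The limitation from the slide remains: ECS may keep placing new tasks on the instance while the loop is waiting, so the loop may never see an empty instance on a busy cluster.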
  16. Summary

    • The missing pieces for me
        • run a task on every container instance
        • mark an instance unschedulable
    • Hope these features will be released soon