The Missing Pieces of Amazon ECS (for me)

The Missing Pieces of Amazon ECS (for me)
Takumi Sakamoto
2016.09.21

Who am I?
• @takus
• Site Reliability Engineer at SmartNews

http://press.forkwell.com/post/150381780957/interview-takus

What is SmartNews?
• News discovery app for mobile
• Algorithm-driven article selection
• 18M+ downloads worldwide
https://www.smartnews.com/en/

How does SmartNews use Amazon ECS? (Digest version)

Amazon ECS in SmartNews
• Running dozens of production services
  • origin server for CDN
  • internal API
  • web crawler
• Provides an internal deployment tool for developers
  • Heroku-style CLI tool
  • service discovery with Consul
  • monitoring with Datadog
  • logging with the Fluentd logging driver

Deploy w/ Heroku-style CLI
Amazon ECR
$ spaas images:init --repository myapp
$ docker build -t NAMESPACE/myapp:0.0.1 .
$ docker push NAMESPACE/myapp:0.0.1
Amazon ECS
$ spaas create --service myapp --image NAMESPACE/myapp --tag 0.0.1
$ spaas deploy --service myapp 0.0.2
$ spaas rollback --service myapp
$ spaas ps:scale --service myapp 2
$ spaas ps:scale:cpu --service myapp 1024
$ spaas ps:scale:memory --service myapp 2048
$ spaas ps:role --service myapp MYAPP_IAM_ROLE
$ spaas config:set SERVICE_TAGS=web

Monitoring w/ Datadog
• By container instance
• By ECS task family
• By ECS task revision

The Missing Pieces

1. Run a task on every instance
• Similar to DaemonSet in Kubernetes
  • run an ECS task on every container instance
  • define a RestartPolicy for each task
• Use case
  • run a log collection daemon (fluentd)
  • run a monitoring daemon (dd-agent)
  • run a cluster storage daemon (glusterd)

ECS Create Daemon API (proposed)
// Run dd-agent on every container instance in mycluster
// Restart dd-agent on unexpected failure
$ aws ecs create-daemon \
    --cluster mycluster \
    --daemon-name dd-agent \
    --task-definition dd-agent:10 \
    --restart-policy always

Workaround
• Start an ECS task with a user data script
  • the task will not be restarted when it dies
  • hard to deploy a new version
  • other tasks can be deployed without the required daemon
• Write an ECS scheduler
  • yes, we can, but I want a managed one
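The core of the "write an ECS scheduler" option is a reconcile step: compare the container instances in the cluster with the tasks running on each one, and start the daemon wherever it is missing. The sketch below shows only that comparison as a pure function. The function name, the instance IDs, and the input shapes are hypothetical; a real scheduler would presumably fill them from the ECS API (for example boto3's `list_container_instances` and `list_tasks`) and then call `start_task` for each missing instance.

```python
from typing import Dict, List, Set


def instances_missing_daemon(
    instance_ids: List[str],
    families_by_instance: Dict[str, Set[str]],
    daemon_family: str,
) -> List[str]:
    """Return the instances that have no running task of the daemon family.

    instance_ids: every container instance registered in the cluster.
    families_by_instance: task families currently running on each instance
        (instances with no tasks may be absent from the mapping).
    """
    return [
        instance_id
        for instance_id in instance_ids
        if daemon_family not in families_by_instance.get(instance_id, set())
    ]


# Hypothetical cluster state: dd-agent runs only on i-a.
cluster = ["i-a", "i-b", "i-c"]
running = {"i-a": {"dd-agent", "myapp"}, "i-b": {"myapp"}}

# A scheduler loop would start dd-agent on each of these instances.
print(instances_missing_daemon(cluster, running, "dd-agent"))
```

Running this step periodically also covers the restart case: a daemon task that died simply shows up as missing on the next pass.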

2. Mark an instance unschedulable
• Similar to kubectl cordon in Kubernetes
  • the instance is marked unschedulable
  • new tasks are not scheduled on it in this mode
• Use case
  • container instance replacement for security updates
  • shrinking the cluster size

ECS Maintenance API (proposed)
// Mark instance as unschedulable
$ aws ecs disable-container-instance \
    --cluster mycluster \
    --container-instance 04ad4550-f218-4d91-86f9-e8012b239261
// Wait until long-running tasks have finished
// Do maintenance tasks or terminate the instance
// Mark instance as schedulable
$ aws ecs enable-container-instance \
    --cluster mycluster \
    --container-instance 04ad4550-f218-4d91-86f9-e8012b239261
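To make the intended semantics concrete, here is a minimal sketch of how a placement step could honor an "unschedulable" flag: tasks already running are left alone, but the instance is skipped when a new task is placed. The `Instance` type, the `pick_instance` function, and the "most free memory" strategy are all assumptions for illustration, not how the ECS scheduler is implemented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    instance_id: str
    free_memory: int          # MiB still available on the instance
    schedulable: bool = True  # False once the instance is marked unschedulable


def pick_instance(instances: List[Instance], required_memory: int) -> Optional[Instance]:
    """Pick the schedulable instance with the most free memory, if any fits."""
    candidates = [
        inst
        for inst in instances
        if inst.schedulable and inst.free_memory >= required_memory
    ]
    return max(candidates, key=lambda inst: inst.free_memory, default=None)


# Hypothetical cluster: i-a has the most room but is under maintenance.
instances = [
    Instance("i-a", free_memory=4096, schedulable=False),
    Instance("i-b", free_memory=1024),
    Instance("i-c", free_memory=2048),
]

chosen = pick_instance(instances, required_memory=512)
print(chosen.instance_id if chosen else None)
```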

Workaround
• Termination hook in Auto Scaling
  • new tasks can still be assigned while waiting for currently running tasks to finish
  • Spot Fleet doesn't support termination hooks
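The waiting part of the termination-hook workaround can be sketched as a bounded polling loop: keep checking how many tasks still run on the instance and give up at a deadline. The function name and the `count_running_tasks` callable are hypothetical; in practice the callable would wrap an ECS API query, and the timeout would have to fit within the hook's own timeout. Because the instance cannot be marked unschedulable, the count may never reach zero, which is exactly the weakness noted above.

```python
import time
from typing import Callable


def wait_until_drained(
    count_running_tasks: Callable[[], int],
    timeout_s: float,
    poll_s: float = 1.0,
) -> bool:
    """Poll until no tasks run on the instance; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if count_running_tasks() == 0:
            return True   # safe to terminate the instance
        if time.monotonic() >= deadline:
            return False  # still busy, e.g. new tasks kept being assigned
        time.sleep(poll_s)


# Simulated instance whose tasks finish one by one.
remaining = iter([2, 1, 0])
print(wait_until_drained(lambda: next(remaining), timeout_s=5, poll_s=0))

# Simulated instance that keeps receiving new tasks and never drains.
print(wait_until_drained(lambda: 1, timeout_s=0, poll_s=0))
```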

Summary
• The missing pieces for me
  • run a task on every container instance
  • mark an instance unschedulable
• I hope these features will be released soon
