Operating ECS in production

Michael Wittig

April 12, 2018

Transcript

  1. Operating ECS in production (https://github.com/widdix/aws-cf-templates)

  2. Hello! I am Michael Wittig. AWS in Action (2nd ed), cloudonaut.io, AWS Community Hero, independent AWS consultant. Twitter: @hellomichibye
  3. ECS: orchestrates Docker containers for you; manages network and per-task security.
  4. ECS Cluster: a cluster contains ECS instances. ECS instance = EC2 instance running ecs-agent (ECS-optimized AMI).
  5. Task Definition: the task definition (image, ...) describes a task. aws ecs run-task --count 2 launches two ECS tasks from it in the ECS cluster, each with 1..N containers.
  6. ECS Scheduling: ECS tasks (1..N containers each) are placed on the ECS instances of the cluster according to placement constraints and strategies.
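As a rough illustration of placement strategies and constraints, the sketch below builds the arguments for the ECS RunTask API (exposed in boto3 as `run_task`). The cluster name, the task definition name, and the instance-type expression are hypothetical and not taken from the talk. The actual API call is left as a comment so the snippet runs without AWS credentials.

```python
def build_run_task_request(cluster, task_definition, count=2):
    """Build keyword arguments for the ECS RunTask API (boto3: ecs.run_task)."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "count": count,
        # Strategies are applied in order: spread tasks across Availability
        # Zones first, then pack them by memory on as few instances as possible.
        "placementStrategy": [
            {"type": "spread", "field": "attribute:ecs.availability-zone"},
            {"type": "binpack", "field": "memory"},
        ],
        # Constraints limit which ECS instances are eligible at all.
        "placementConstraints": [
            {"type": "memberOf",
             "expression": "attribute:ecs.instance-type =~ t2.*"},
        ],
    }


# Hypothetical names, for illustration only.
request = build_run_task_request("demo-cluster", "demo-task:1")

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("ecs").run_task(**request)
```

Keeping the request as plain data makes it easy to review the placement rules before anything is scheduled.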
  7. ECS Networking
    ◎ Public/private load balancer
    ◎ Elastic Network Interface (ENI) per task
      ◦ Public IP
      ◦ Private IP
      ◦ Per-task security group
  8. ECS Service: runs ECS tasks from a task definition in the ECS cluster.
    ◎ Observer
    ◎ ENI
    ◎ Load Balancer
    ◎ Deployment
  9. Operating ECS: Challenges

  10. 1. Spinning up a cluster (demo)

  11. Fault tolerant: Auto Scaling Group, Availability Zones (demo)

  12. 2. Updating a cluster: new ECS-optimized AMIs are released frequently!
  13. Rolling Update: CloudFormation replaces EC2 instances in Auto Scaling Groups in small batches. (demo)
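A rolling update of this kind is typically configured through the `UpdatePolicy` attribute of the Auto Scaling Group resource in CloudFormation. The sketch below builds such a resource as a Python dictionary and prints it as a JSON template fragment. The batch size, pause time, instance counts, and the referenced logical IDs (`LaunchConfiguration`, `SubnetA`, `SubnetB`) are assumptions for illustration and are not taken from the talk's templates.

```python
import json


def rolling_update_asg(max_batch_size=1, min_in_service=2):
    """Return a CloudFormation Auto Scaling Group resource with a rolling update policy."""
    return {
        "Type": "AWS::AutoScaling::AutoScalingGroup",
        "Properties": {
            "MinSize": "2",
            "MaxSize": "4",
            # Hypothetical logical IDs defined elsewhere in the template.
            "LaunchConfigurationName": {"Ref": "LaunchConfiguration"},
            "VPCZoneIdentifier": [{"Ref": "SubnetA"}, {"Ref": "SubnetB"}],
        },
        "UpdatePolicy": {
            "AutoScalingRollingUpdate": {
                # Replace at most this many instances at a time.
                "MaxBatchSize": max_batch_size,
                # Keep at least this many instances running during the update.
                "MinInstancesInService": min_in_service,
                # Wait up to 10 minutes for each batch to signal success.
                "PauseTime": "PT10M",
                "WaitOnResourceSignals": True,
            }
        },
    }


template = {"Resources": {"AutoScalingGroup": rolling_update_asg()}}
print(json.dumps(template, indent=2))
```

A small `MaxBatchSize` combined with `MinInstancesInService` keeps most of the cluster's capacity available while instances are replaced.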
  14. But what about in-flight requests?

  15. Instance Draining: move all tasks off an ECS instance before the instance is terminated. (demo)
  16. Implementing Instance Draining
    ◎ Auto Scaling lifecycle hook
      ◦ Drain instance
      ◦ Wait until drained
      ◦ Complete lifecycle hook
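The three steps of the lifecycle hook can be sketched as a single function. This is a minimal sketch, not the implementation from the talk's templates: it assumes `ecs` and `autoscaling` are boto3 clients (or objects with the same methods), and the shape of the `hook` dictionary, the polling interval, and the timeout are hypothetical.

```python
import time


def drain_and_complete(ecs, autoscaling, cluster, container_instance_arn,
                       hook, poll_seconds=15, max_polls=40):
    """Drain one ECS instance, wait until it is empty, then complete the hook.

    `hook` is a dict with the hypothetical keys name, asg, token, instance_id.
    Returns True if the instance was empty before the polling limit was hit.
    """
    # 1. Drain instance: ECS stops placing new tasks on it, and services
    #    start replacement tasks on other instances.
    ecs.update_container_instances_state(
        cluster=cluster,
        containerInstances=[container_instance_arn],
        status="DRAINING",
    )

    # 2. Wait until drained: poll until no running or pending tasks remain.
    drained = False
    for _ in range(max_polls):
        described = ecs.describe_container_instances(
            cluster=cluster, containerInstances=[container_instance_arn]
        )
        instance = described["containerInstances"][0]
        if instance["runningTasksCount"] + instance["pendingTasksCount"] == 0:
            drained = True
            break
        time.sleep(poll_seconds)

    # 3. Complete lifecycle hook: let Auto Scaling terminate the instance.
    #    This sketch continues even after a timeout; a stricter version
    #    might extend the hook's timeout instead.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=hook["name"],
        AutoScalingGroupName=hook["asg"],
        LifecycleActionToken=hook["token"],
        LifecycleActionResult="CONTINUE",
        InstanceId=hook["instance_id"],
    )
    return drained
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects before it is pointed at a real cluster.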
  17. Tasks are not rescheduled once placed! Your last batch of ECS instances will end up with 0 tasks!
  18. 3. Scaling a cluster, or adding/removing EC2 instances. (demo)

  19. We don’t know how many tasks we can schedule!

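  20. ECS Cluster
    ◎ Instance available: CPU 100, Memory 200
    ◎ Instance available: CPU 100, Memory 200
    ◎ Instance available: CPU 100, Memory 200
    ◎ Cluster total available: CPU 300, Memory 600
    ◎ Task: CPU 200, Memory 200 (fits the cluster total, but no single instance)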
  21. Schedulable Containers
    1. Define the largest possible task (CPU/memory).
    2. For each instance:
      a. Calculate how many largest possible tasks would fit.
      b. Report to CloudWatch.
    3. Scale based on the sum of this metric.
    Credits: http://garbe.io/blog/2017/04/12/a-better-solution-to-ecs-autoscaling/
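The calculation in step 2a can be sketched in a few lines. The numbers are the ones shown on slide 20 (three instances with CPU 100 and memory 200 available, largest task CPU 200 and memory 200); the function and variable names are hypothetical. Reporting the value to CloudWatch (step 2b) is left out here; it would be one custom metric data point per instance.

```python
def schedulable_containers(available_cpu, available_memory, task_cpu, task_memory):
    """How many copies of the largest task fit on a single instance."""
    # The scarcer resource limits how many tasks fit.
    return min(available_cpu // task_cpu, available_memory // task_memory)


largest_task = {"cpu": 200, "memory": 200}
instances = [
    {"cpu": 100, "memory": 200},
    {"cpu": 100, "memory": 200},
    {"cpu": 100, "memory": 200},
]

# Step 2a: one value per instance.
per_instance = [
    schedulable_containers(i["cpu"], i["memory"],
                           largest_task["cpu"], largest_task["memory"])
    for i in instances
]

# Step 3: the cluster-wide metric is the sum of the per-instance values.
cluster_metric = sum(per_instance)

# For comparison: the same calculation on the cluster totals is misleading,
# because a task cannot be split across instances.
naive = schedulable_containers(
    sum(i["cpu"] for i in instances),
    sum(i["memory"] for i in instances),
    largest_task["cpu"], largest_task["memory"],
)

print(per_instance, cluster_metric, naive)  # [0, 0, 0] 0 1
```

The cluster totals suggest that one more task fits, while the per-instance sum correctly reports zero, which is the signal to scale out.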
  22. ECS Cluster (largest task: CPU 200, Memory 200)
    ◎ Instance available: CPU 100, Memory 200, Schedulable 0
    ◎ Instance available: CPU 100, Memory 200, Schedulable 0
    ◎ Instance available: CPU 100, Memory 200, Schedulable 0
    ◎ Cluster total available: CPU 300, Memory 600, Schedulable 0
  23. No CloudWatch Events are emitted when a task launch fails due to a capacity shortage.
  24. 4. Public load balancing

  25. Public load balancing: DNS resolves to a Load Balancer (ALB), which routes requests to ECS tasks (1..N containers each) on the ECS instances of the cluster.
    ◎ Path based
    ◎ Host based
  26. 5. Internal service discovery / load balancing

  27. Internal load balancing: each service has its own DNS name, load balancer, and ECS tasks.
    ◎ Frontend LB (internet-facing)
    ◎ Catalog (internal)
    ◎ Shopping cart (internal)
  28. Internal Route 53 (with per-task ENI). Diagram: Frontend LB (internet-facing), Catalog, ECS tasks, DNS.
  29. 6. Logging & Monitoring: CloudWatch.

  30. Credits: special thanks to all the people who made and released these awesome resources for free:
    ◎ Presentation template by SlidesCarnival
    ◎ Photographs by Pexels
  31. Thanks! http://bit.ly/amazon-web-services-in-action-2nd-edition https://github.com/widdix/aws-cf-templates https://cloudonaut.io Twitter: @hellomichibye Mail: michael@widdix.de