Zero-Downtime Delivery with Blue/Green Automatic Testing

It's not unusual for a DevOps team to inherit vast infrastructures of servers with applications and configurations that have accumulated over the years.
In this case study, blue/green deployment techniques, traditionally used for application releases, were applied to server configuration management using AWS CodeDeploy on Amazon ECS. This approach enabled automated testing pipelines that validate new configurations before live traffic is routed to them. By subjecting configurations to the same testing practices as code, potential issues could be identified and mitigated before release, minimizing the risk of downtime and enabling fast rollback when needed.

Monica Colangelo

October 21, 2024

Transcript

  1. DEVOPS MODEL
    “DevOps is the combination of cultural philosophies, practices, and tools that increases an organization’s ability to deliver applications and services at high velocity: evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes.”
  2. THE SCALE OF OUR CHALLENGE
    Diverse application portfolio: a mix of active and legacy systems, each with unique requirements and quirks
    High-volume news CMS: dozens of new articles daily and constant SEO-driven URL change requests from journalists
    120,000 customer websites: each with unique domains and custom rules requiring specific configurations
  3. CONFIGURATION CHAOS
    Years of accumulated changes, patches, and quick fixes
    Modifying even a single line could cause unpredictable failures
    Test environment drastically different from production
    Critical security risks
  4. Release duration: 3-4 hours
    Rollback time: 1-2 hours
    Error rate: 40% of deployments
    Uptime: 99.5% (downtime of about 1 hour per week, or about 2 days per year)
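The downtime figures on slide 4 follow from the uptime percentage. A minimal sketch of that arithmetic is below; the function name `downtime_hours` is hypothetical, and a 365-day year is assumed.

```python
HOURS_PER_WEEK = 7 * 24
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime implied by an uptime percentage over a period."""
    return (1 - uptime_percent / 100) * period_hours


# 99.5% uptime, as reported on the slide
weekly_hours = downtime_hours(99.5, HOURS_PER_WEEK)
yearly_days = downtime_hours(99.5, HOURS_PER_YEAR) / 24

print(f"{weekly_hours:.2f} hours per week, {yearly_days:.2f} days per year")
```

The result is a little under one hour per week and a little under two days per year, which the slide rounds up.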
  5. THE REALITY CHECK
    No time or budget for backend overhaul
    Development teams focused on other business priorities
    Lack of confidence
  6. TECHNOLOGICAL UPGRADE
    VERSION CONTROL (GIT), CONTAINERIZATION (DOCKER), INFRASTRUCTURE AS CODE (TERRAFORM)
    Track changes, collaborate on configurations, and roll back if needed
    Ability to always use the latest, most secure base image
    Abstraction from machine management
    Infrastructure management more accessible and transparent
  7. BLUE/GREEN DEPLOYMENT
    Initiation: The live version runs in the blue environment
    Planning: A new version is installed and tested in the green environment
    Execution: If tests pass, traffic is gradually shifted from blue to green
    Monitoring: Once the shift is complete, green becomes the new live environment
    Closure: The old blue environment is now free to become the next 'green' for future updates
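The Execution phase on slide 7 (gradual shift, with blue kept as the fallback) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the talk does not say which shifting strategy was used, so equal linear steps are assumed here, and `linear_shift`, `run_shift`, and the `green_is_healthy` callback are hypothetical names, not part of any AWS API.

```python
from typing import Callable, Iterator, Tuple


def linear_shift(step_percent: int) -> Iterator[Tuple[int, int]]:
    """Yield (blue_weight, green_weight) pairs, moving traffic to green in equal steps."""
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be between 1 and 100")
    green = 0
    while green < 100:
        green = min(100, green + step_percent)
        yield 100 - green, green


def run_shift(step_percent: int, green_is_healthy: Callable[[int], bool]) -> Tuple[int, int]:
    """Shift traffic step by step; send everything back to blue if a check fails."""
    for _blue, green in linear_shift(step_percent):
        if not green_is_healthy(green):
            return 100, 0  # rollback: blue still runs the previous version
    return 0, 100  # green is now the live environment


if __name__ == "__main__":
    for weights in linear_shift(25):
        print(weights)
```

Because blue stays untouched until the shift completes, rollback in this model is just a weight change, which is consistent with the "immediate" rollback time reported later in the deck.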
  8. TECHNOLOGICAL UPGRADE
    Amazon Elastic Container Service (ECS)
    AWS Fargate for Amazon ECS
    AWS CodeDeploy
    Elastic Load Balancing
  9. [Architecture diagram] Developer, AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, Amazon ECR (registry), Application Load Balancer, AWS Fargate for Amazon ECS; service in the “Blue” target group
  10. [Architecture diagram] Same pipeline with AWS CodeDeploy added; services in both the “Blue” and “Green” target groups
  11. [Architecture diagram] Same components as slide 10
  12. [Architecture diagram] Same components as slide 10
  13. [Architecture diagram] Same pipeline; only the service in the “Green” target group remains
  14. REQUIREMENTS FOR NO-REGRESSION TESTS
    1 Cover URLs generated dynamically by the backend (like those from the CMS)
    2 Include important but rarely visited URLs
    3 Provide a representative sample without making the test execution time unreasonable
    4 Adapt to the constantly changing nature of our content
  15. TESTING URLS COMPILATION
    Prioritizing frequently accessed URLs to ensure we're testing the most critical paths
    Incorporating a random selection of less frequently accessed URLs to catch edge cases
    Ensuring representation from all major sections of our websites and applications
    Including URLs that had caused issues in the past
    [Diagram] ELB access logs bucket, AWS Batch, URL lists bucket
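The selection rules on slide 15 can be sketched as a small function over access-log lines. This is an illustration, not the speaker's AWS Batch job: the names `extract_paths` and `compile_test_urls`, the parameters `top_n` and `rare_n`, and the choice to keep only the URL path are all assumptions, and the regular expression only looks at the quoted request field that load balancer access logs contain. Per-section representation is left out for brevity.

```python
import random
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the quoted request field, e.g. "GET https://example.com:443/news/1 HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def extract_paths(log_lines):
    """Return the URL path of every GET/HEAD request found in the log lines."""
    paths = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            paths.append(urlsplit(match.group(1)).path or "/")
    return paths


def compile_test_urls(log_lines, known_problem_urls, top_n, rare_n, seed=None):
    """Combine the most requested paths, a random sample of the rest, and past offenders."""
    counts = Counter(extract_paths(log_lines))
    ranked = [path for path, _ in counts.most_common()]
    top, tail = ranked[:top_n], ranked[top_n:]
    rare = random.Random(seed).sample(tail, min(rare_n, len(tail)))
    # dict.fromkeys removes duplicates while keeping first-seen order
    return list(dict.fromkeys(top + rare + list(known_problem_urls)))
```

Rebuilding the list from fresh logs on a schedule is one way to meet the "adapt to constantly changing content" requirement, since new CMS URLs enter the sample as soon as they receive traffic.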
  16. [Architecture diagram] Developer, AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, AWS CodeDeploy, Amazon ECR (registry), Application Load Balancer, AWS Fargate for Amazon ECS with services in the “Blue” and “Green” target groups, plus AWS Lambda
  17. [Architecture diagram] Same components as slide 16
  18. [Architecture diagram] Same components as slide 16
  19. [Architecture diagram] Same components as slide 16
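The diagrams on slides 16 to 19 add an AWS Lambda function next to CodeDeploy. A plausible reading, and only an assumption since the transcript does not show the function, is that it runs the no-regression tests as a CodeDeploy lifecycle hook while green is reachable for test traffic. The sketch below compares HTTP status codes between blue and green; the environment variable names `BLUE_BASE_URL`, `GREEN_BASE_URL`, and `TEST_PATHS` and the helper names are hypothetical, while `put_lifecycle_event_hook_execution_status` is the CodeDeploy call used to report a hook result.

```python
import os
from urllib.error import HTTPError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET request, or 0 if the connection fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code
    except OSError:  # covers URLError and timeouts
        return 0


def find_regressions(paths, blue_base, green_base, fetch=http_status):
    """Return the paths whose status on green differs from the status on blue."""
    return [p for p in paths if fetch(green_base + p) != fetch(blue_base + p)]


def handler(event, context):
    """Hypothetical lifecycle hook: fail the deployment if green regresses."""
    import boto3  # imported lazily so the helpers above run without the AWS SDK

    paths = [p for p in os.environ.get("TEST_PATHS", "").split(",") if p]
    regressions = find_regressions(
        paths, os.environ["BLUE_BASE_URL"], os.environ["GREEN_BASE_URL"]
    )
    boto3.client("codedeploy").put_lifecycle_event_hook_execution_status(
        deploymentId=event["DeploymentId"],
        lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
        status="Failed" if regressions else "Succeeded",
    )
    return {"regressions": regressions}
```

Reporting `Failed` stops the deployment before production traffic is shifted, so a bad configuration never reaches live users. A real hook would more likely load the URL list from the S3 bucket shown on slide 15 than from an environment variable.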
  20. Release duration: 3-4 hours → 15 minutes
    Rollback time: 1-2 hours → immediate
    Error rate: 40% of deployments → ZERO to production
    Uptime: 99.5% → 99.99999% (downtime not measurable)
  21. TAKEAWAYS
    This transformation has changed not just how we work, but how we think about our work
    Testing is fundamental
    Bridging the skills gap
    Shared Language of Infrastructure
    Continuous learning culture
    Focus on High-Value Tasks
    Automation as an enabler