
DeveloperWeek 2020 Progressive Delivery: Patterns & Benefits

Dave Karow
February 13, 2020

Progressive Delivery allows you to switch from high-stakes “big bang” releases to the gradual exposure of code changes in production. The goal is to observe changes in the health of your systems and user behavior before ramping up to your entire user population.

Early adopters of CD invented their own Progressive Delivery tooling and practices, freeing up their teams to move fast, limit the blast radius of issues found in production and focus engineering cycles on work that delivers more impact, not just more releases.

What can we learn from the ways these pioneers of Progressive Delivery implemented gradual release mechanisms and automated repeatable and trustworthy reporting of KPIs?

In this fast-moving talk, we’ll introduce the ideas and look at specific examples. We’ll sum up what we’ve seen by documenting the patterns in checklists you can take away to establish or extend these proven patterns in your own environment.


Transcript

  1. The future is already here — it's just not very evenly distributed. William Gibson @davekarow
  2. Carlos Sanchez (Sr. Cloud Software Engineer @ Adobe)
    https://blog.csanchez.org/2019/01/22/progressive-delivery-in-kubernetes-blue-green-and-canary-deployments/
    Progressive Delivery is the next step after Continuous Delivery, where new versions are deployed to a subset of users and are evaluated in terms of correctness and performance before rolling them to the totality of the users and rolled back if not matching some key metrics.
  3. Potential Benefits of Progressive Delivery
    • Avoid Downtime
    • Limit the Blast Radius
    • Limit WIP / Achieve Flow
    • Learn During the Process
  4. How You Roll Matters
    Approach: Blue/Green Deployment, Canary Release, Feature Flag Rollout, Feature Delivery Platform
    Benefits: Avoid Downtime, Limit The Blast Radius, Limit WIP / Achieve Flow, Learn During The Process
    [Table: each approach rated against each benefit with Harvey balls]
    https://www.split.io/blog/learn-the-four-shades-of-progressive-delivery/
    Harvey Balls by Sschulte at English Wikipedia [CC BY-SA (https://creativecommons.org/licenses/by-sa/3.0)]
  9. Let’s Venture Into the Wild!
    Bruce Turner from AustinTX https://www.flickr.com/people/66994844@N00 [CC BY (https://creativecommons.org/licenses/by/2.0)]
  10. Booking.com’s experience with Manage: “asynchronous feature release”
    • Deploying has no impact on user experience
    • Deploy more frequently with less risk to business and users
    • The big win is Agility
  11. Monitoring the needle in the haystack
    If you roll out a change to just 5% of your population, and 20% (1 in 5) of the exposed users get an error, that’s a HUGE problem! But what % of your total user population is getting that error? Just 1% (5% × 20%).
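
The arithmetic on this slide can be checked with a few lines of Python. This is only an illustrative sketch; the function name `overall_error_rate` is hypothetical and not from the talk.

```python
def overall_error_rate(exposure_fraction: float, error_rate_among_exposed: float) -> float:
    """Share of the *total* user population that sees the error."""
    return exposure_fraction * error_rate_among_exposed


# Slide example: 5% of users exposed, 20% of those exposed hit an error.
rate = overall_error_rate(0.05, 0.20)
print(f"{rate:.1%}")  # prints "1.0%"
```

A severe problem inside the rollout group shows up as only a faint signal in global metrics, which is why the later slides recommend monitoring at the feature rollout level.
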
  12. Booking.com’s experience with Monitor: “Experimentation as a safety net”
    • Each new feature is wrapped in its own experiment
    • Allows: monitoring and stopping of individual changes
    • The developer or team responsible for the feature can enable and disable it...
    • ...regardless of who deployed the new code that contained it.
  13. Booking.com safety net automated: “circuit breaker”
    • Active for the first three minutes of feature release
    • Severe degradation → automatic abort of that feature
    • Acceptable divergence from core value of local ownership and responsibility where it’s a “no brainer” that users are being negatively impacted
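
A circuit breaker of this kind might look roughly like the sketch below. Only the three-minute window comes from the slide. The class name, the error-ratio threshold, and the minimum sample size are assumptions made for illustration, not Booking.com's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    """Aborts a feature that degrades severely right after release."""
    released_at: float                # release time, in seconds
    window_seconds: float = 180.0     # "first three minutes" (from the slide)
    max_error_ratio: float = 3.0      # assumption: treatment vs. control error rate
    min_requests: int = 100           # assumption: avoid tripping on tiny samples
    aborted: bool = False

    def check(self, now: float,
              treatment_errors: int, treatment_requests: int,
              control_errors: int, control_requests: int) -> bool:
        """Return True if the feature is (or already was) aborted."""
        if self.aborted:
            return True
        if now - self.released_at > self.window_seconds:
            return False  # outside the window: the owning team decides
        if min(treatment_requests, control_requests) < self.min_requests:
            return False  # not enough traffic to judge yet
        treatment_rate = treatment_errors / treatment_requests
        control_rate = control_errors / control_requests
        # Floor the baseline so a zero-error control does not divide by zero.
        baseline = max(control_rate, 1 / control_requests)
        if treatment_rate / baseline >= self.max_error_ratio:
            self.aborted = True
        return self.aborted


breaker = CircuitBreaker(released_at=0.0)
# One minute in: 5% errors with the feature on vs. 0.5% with it off.
print(breaker.check(60.0, 50, 1000, 5, 1000))  # prints "True"
```
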
  14. Booking.com’s experience with Experimentation: A way to validate ideas
    • Measure (in a controlled manner) the impact changes have on user behaviour
    • Every change has a clear objective (explicitly stated hypothesis on how it will improve user experience)
    • Measuring allows validation that desired outcome is achieved
  15. Booking’s Big Takeaway
    The quicker we manage to validate new ideas the less time is wasted on things that don’t work and the more time is left to work on things that make a difference.
  16. LinkedIn early days: a modest start for XLNT
    • Built a targeting engine that could “split” traffic between existing and new code
    • Impact analysis was by hand only (and took ~2 weeks), so nobody did it :-(
    Essentially just feature flags without automated feedback
  17. LinkedIn XLNT Today
    • A controlled release with standardized KPI calculation launched every 5 minutes
    • 100 releases per day
    • 6000 metrics that can be “followed” by any stakeholder: “What releases are moving the numbers I care about?”
  18. Lessons learned at LinkedIn
    • Build for scale: no more coordinating over email
    • Make it trustworthy: targeting and analysis must be rock solid
    • Design for diverse teams, not just data scientists
    Ya Xu, Head of Data Science, LinkedIn (Decisions Conference, 10/2/2018)
  19. Summing it up: The patterns are proven
    Maturity hasn’t come easily or fast for the pioneers
    • Step 1: Feature flags
    • Step 2: Sensors / Correlation
    • Step 3: Stats Engine / Causation
    • “Holy Grail”: Mgmt console, System of record, Alerting
    [Chart axes: Increasing functionality & company adoption; Cost to build and maintain]
  20. Checklists to DIY or Buy
    • Foundational Capabilities You’ll Need
    • How-To’s: Monitor & Experiment
  21. Foundational Capability #1: Decouple deploy from release
    ❏ Allow changes of exposure w/o new deploy or rollback
    ❏ Support targeting by UserID, attribute (population), random hash
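
The three targeting modes on this checklist could be sketched as follows. The flag layout (`user_ids`, `attributes`, `percentage`), the function names, and the rule order are all hypothetical choices for illustration and do not describe any particular product's API.

```python
import hashlib
from typing import Optional


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def evaluate(flag: dict, user_id: str, attributes: Optional[dict] = None) -> tuple:
    """Return (treatment, reason). Rule order: user ID, attribute, random hash."""
    attributes = attributes or {}
    if user_id in flag.get("user_ids", ()):
        return "on", "targeted by user ID"
    for key, allowed in flag.get("attributes", {}).items():
        if attributes.get(key) in allowed:
            return "on", f"matched attribute {key}"
    if bucket(flag["name"], user_id) < flag.get("percentage", 0):
        return "on", "random hash within rollout percentage"
    return "off", "default"


# The flag definition is data, not code, so exposure can change without a deploy.
flag = {
    "name": "new_checkout",
    "user_ids": {"dev-1"},
    "attributes": {"plan": {"internal"}},
    "percentage": 5,
}
print(evaluate(flag, "dev-1"))                      # ('on', 'targeted by user ID')
print(evaluate(flag, "u-42", {"plan": "internal"}))  # ('on', 'matched attribute plan')
```

Hashing the flag name together with the user ID keeps each user's assignment stable across requests while making rollouts of different flags independent of one another.
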
  22. Foundational Capability #2: Automate a reliable and consistent way to answer, “Who have we exposed this to so far?”
    ❏ Record who hit a flag, which way they were sent, and why
    ❏ Confirm that targeting is working as intended
    ❏ Confirm that expected traffic levels are reached
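
A minimal in-memory sketch of this exposure record is shown below. The `ImpressionLog` class and its methods are hypothetical, and a real system would write to durable storage rather than a Python list.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ImpressionLog:
    """Records who hit a flag, which way they were sent, and why."""
    records: list = field(default_factory=list)

    def record(self, flag_name: str, user_id: str, treatment: str, reason: str) -> None:
        self.records.append((flag_name, user_id, treatment, reason))

    def exposure(self, flag_name: str) -> Counter:
        """Count unique users per treatment (the latest treatment wins)."""
        latest = {}
        for name, user_id, treatment, _reason in self.records:
            if name == flag_name:
                latest[user_id] = treatment
        return Counter(latest.values())

    def on_share(self, flag_name: str) -> float:
        """Fraction of unique users who received the 'on' treatment."""
        counts = self.exposure(flag_name)
        total = sum(counts.values())
        return counts["on"] / total if total else 0.0


log = ImpressionLog()
log.record("new_checkout", "dev-1", "on", "targeted by user ID")
log.record("new_checkout", "u-1", "off", "default")
log.record("new_checkout", "u-2", "off", "default")
log.record("new_checkout", "u-3", "off", "default")
print(log.on_share("new_checkout"))  # prints "0.25"
```

Comparing `on_share` with the configured rollout percentage is one simple way to confirm that targeting behaves as intended and that expected traffic levels are reached.
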
  23. Foundational Capability #3: Automate a reliable and consistent way to answer, “How is it going for them (and us)?”
    ❏ Automate comparison of system health (errors, latency, etc…)
    ❏ Automate comparison of user behavior (business outcomes)
    ❏ Make it easy to include “Guardrail Metrics” in comparisons to avoid the local optimization trap
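
As a rough illustration of a guardrail comparison, the sketch below computes the relative change of each metric between control and treatment and lists the guardrails that were exceeded. The metric names, the limits, and the `compare` function are assumptions. It compares point values only, so a real system would also need a statistical test.

```python
def compare(control: dict, treatment: dict, guardrails: dict) -> dict:
    """Relative change per metric, plus the guardrail metrics that were exceeded.

    `guardrails` maps a metric name to the maximum tolerated relative increase.
    """
    report = {"changes": {}, "violations": []}
    for metric, base in control.items():
        if metric not in treatment or base == 0:
            continue  # nothing comparable for this metric
        change = (treatment[metric] - base) / base
        report["changes"][metric] = change
        limit = guardrails.get(metric)
        if limit is not None and change > limit:
            report["violations"].append(metric)
    return report


control = {"error_rate": 0.010, "p95_latency_ms": 200.0, "conversion": 0.040}
treatment = {"error_rate": 0.011, "p95_latency_ms": 260.0, "conversion": 0.044}
guardrails = {"error_rate": 0.20, "p95_latency_ms": 0.10}  # hypothetical limits

report = compare(control, treatment, guardrails)
print(report["violations"])  # prints "['p95_latency_ms']"
```

In this made-up example conversion improves by about 10%, but latency regresses by 30%. Without the guardrail, the feature would look like a clear win.
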
  24. How-To: Release Faster With Less Risk
    Limit the blast radius of unexpected consequences so you can replace the “big bang” release night with more frequent, less stressful rollouts. Build on the foundational capabilities to:
    ❏ Ramp in stages, starting with dev team, then dogfooding, then % of public
    ❏ Monitor at feature rollout level, not just globally (vivid facts vs faint signals)
    ❏ Alert at the team level (build it/own it)
    ❏ Kill if severe degradation detected (stop the pain now, triage later)
    ❏ Continue to ramp up healthy features while “sick” are ramped down or killed
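
The staged ramp on this checklist can be sketched as a tiny state machine. The order of dev team, dogfooding, then public comes from the slide. The specific public percentages, the `Health` levels, and the `next_stage` function are assumptions for illustration.

```python
from enum import Enum

# Public percentages are hypothetical; the slide only says "% of public".
STAGES = ["dev team", "dogfooding", "1% of public", "10% of public",
          "50% of public", "100% of public"]
KILLED = -1


class Health(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    SEVERE = "severe"


def next_stage(current: int, health: Health) -> int:
    """Return the index of the next stage, or KILLED."""
    if health is Health.SEVERE:
        return KILLED                        # stop the pain now, triage later
    if health is Health.DEGRADED:
        return max(current - 1, 0)           # ramp a "sick" feature down
    return min(current + 1, len(STAGES) - 1)  # keep ramping a healthy one


stage = 0
for observed in [Health.HEALTHY, Health.HEALTHY, Health.DEGRADED, Health.HEALTHY]:
    stage = next_stage(stage, observed)
print(STAGES[stage])  # prints "1% of public"
```

Each feature would carry its own stage, so a healthy feature keeps ramping up while a sick one is ramped down or killed.
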
  25. How-To: Engineer for Impact (Not Output)
    Focus precious engineering cycles on “what works” with experimentation, making statistically rigorous observations about what moves KPIs (and what doesn’t). Build on the foundational capabilities to:
    ❏ Target an experiment to a specific segment of users
    ❏ Ensure random, deterministic, persistent allocation to A/B/n variants
    ❏ Ingest metrics chosen before the experiment starts (not cherry-picked after)
    ❏ Compute statistical significance before proclaiming winners
    ❏ Design for diverse audiences, not just data scientists (buy-in needed to stick)
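
Two items on this checklist lend themselves to a short sketch: deterministic allocation to A/B/n variants and a significance check before declaring a winner. The talk does not name a statistical method. The pooled two-proportion z-test below is one common choice, swapped in here as an assumption, and the function names and sample numbers are hypothetical.

```python
import hashlib
import math


def assign_variant(experiment: str, user_id: str, variants: list) -> str:
    """Deterministic allocation: the same user always gets the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest[:8], 16) % len(variants)]


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


variant = assign_variant("checkout_button", "u-42", ["A", "B", "C"])
print(variant == assign_variant("checkout_button", "u-42", ["A", "B", "C"]))  # prints "True"

# Made-up counts: 200 of 10,000 convert on A, 260 of 10,000 convert on B.
p_value = two_proportion_p_value(200, 10_000, 260, 10_000)
print(p_value < 0.05)  # prints "True"
```

The 0.05 threshold and the metric should be fixed before the experiment starts, in line with the checklist's warning against cherry-picking afterwards.
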
  26. Whatever you are, try to be a good one. William Makepeace Thackeray