DeveloperWeek 2020 Progressive Delivery: Patterns & Benefits

Progressive Delivery Patterns & Benefits Dave Karow Continuous Delivery Evangelist
Split.io @davekarow

The future is already here — it's just not very
evenly distributed. William Gibson

• What is Progressive Delivery?
• Patterns In The Wild
• Summing It Up

What is Progressive Delivery and what are the potential benefits?

“Progressive Delivery is the next step after Continuous Delivery, where new versions are deployed to a subset of users and are evaluated in terms of correctness and performance before rolling them to the totality of the users and rolled back if not matching some key metrics.”
Carlos Sanchez (Sr. Cloud Software Engineer @ Adobe)
https://blog.csanchez.org/2019/01/22/progressive-delivery-in-kubernetes-blue-green-and-canary-deployments/

Potential Benefits of Progressive Delivery
• Avoid Downtime
• Limit the Blast Radius
• Limit WIP / Achieve Flow
• Learn During the Process

How You Roll Matters
Approaches (rows): Blue/Green Deployment, Canary Release, Feature Flag Rollout, Feature Delivery Platform
Benefits (columns): Avoid Downtime, Limit The Blast Radius, Limit WIP / Achieve Flow, Learn During The Process
[Table: each approach is rated against each benefit with a Harvey ball]
https://www.split.io/blog/learn-the-four-shades-of-progressive-delivery/
Harvey Balls by Sschulte at English Wikipedia [CC BY-SA (https://creativecommons.org/licenses/by-sa/3.0)]


Feature Delivery Platform Capabilities

Let’s Venture Into the Wild!
Bruce Turner from AustinTX https://www.flickr.com/people/66994844@N00 [CC BY (https://creativecommons.org/licenses/by/2.0)]

Booking.com

Booking.com: a well-documented example of the pattern

https://medium.com/booking-com-development/moving-fast-breaking-things-and-fixing-them-as-quickly-as-possible-a6c16c5a1185


Feature Flag: Managing exposure like a dimmer or light board
0% 10% 20% 50% 100%
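One common way to get the "dimmer" behaviour is to hash each user into a stable bucket and compare the bucket against the current rollout percentage. The sketch below is an illustration only, not Split's or Booking.com's implementation; the names `bucket` and `is_exposed` and the flag name `new-checkout` are hypothetical.

```python
import hashlib


def bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    # Salting with the flag name keeps buckets independent across flags.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_exposed(user_id: str, flag_name: str, rollout_percent: int) -> bool:
    """True if the user falls inside the current rollout percentage."""
    return bucket(user_id, flag_name) < rollout_percent


# Turning the dimmer up only changes the number, not the code that is deployed.
for percent in (0, 10, 20, 50, 100):
    exposed = sum(is_exposed(f"user-{i}", "new-checkout", percent) for i in range(1000))
    print(f"{percent:>3}% target -> {exposed} of 1000 sample users exposed")
```

Because a user's bucket never changes, anyone exposed at 10% stays exposed at 20%, 50% and 100%, so ramping up does not flip users back and forth.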

Booking.com’s experience with Manage: “asynchronous feature release”
• Deploying has no impact on user experience
• Deploy more frequently with less risk to business and users
• The big win is Agility


Monitoring the needle in the haystack
If you roll out a change to just 5% of your population ...and 20% (1 in 5) of the exposed users get an error, that’s a HUGE problem! But, what % of your total user population is getting that error? Just 1% (5% × 20%).
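The dilution can be worked through in a few lines. This is a small sketch of the arithmetic only; the function name and the optional `baseline_error_rate` parameter are assumptions added for illustration.

```python
def global_error_rate(exposed_fraction: float,
                      error_rate_among_exposed: float,
                      baseline_error_rate: float = 0.0) -> float:
    """Error rate seen across ALL users when only a fraction sees the change."""
    exposed_part = exposed_fraction * error_rate_among_exposed
    unexposed_part = (1 - exposed_fraction) * baseline_error_rate
    return exposed_part + unexposed_part


# The slide's numbers: 5% of users exposed, 1 in 5 of them hit an error.
overall = global_error_rate(exposed_fraction=0.05, error_rate_among_exposed=0.20)
print(f"Error rate among exposed users: {0.20:.0%}")
print(f"Error rate across all users:    {overall:.0%}")
```

A global dashboard sees only the second number, which is why monitoring per feature rollout, rather than only globally, is what makes the problem visible.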


Booking.com’s experience with Monitor: “Experimentation as a safety net”
• Each new feature is wrapped in its own experiment
• Allows: monitoring and stopping of individual changes
• The developer or team responsible for the feature can enable and disable it...
• ...regardless of who deployed the new code that contained it.

Booking.com safety net automated: “circuit breaker”
• Active for the first three minutes of feature release
• Severe degradation → automatic abort of that feature
• Acceptable divergence from core value of local ownership and responsibility where it’s a “no brainer” that users are being negatively impacted
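The circuit-breaker idea can be sketched as follows. This is not Booking.com's implementation: the three-minute window comes from the slide, while the class name, the error-rate threshold and the minimum sample size are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    released_at: float             # time (seconds) the feature was switched on
    window_seconds: float = 180.0  # "first three minutes" from the slide
    max_error_rate: float = 0.5    # hypothetical threshold for "severe degradation"
    min_requests: int = 20         # hypothetical: do not trip on a tiny sample
    requests: int = 0
    errors: int = 0
    aborted: bool = False

    def record(self, now: float, is_error: bool) -> None:
        """Count one request; abort the feature if degradation is severe."""
        if self.aborted or now - self.released_at > self.window_seconds:
            return  # outside the watch window, humans own the decision
        self.requests += 1
        self.errors += int(is_error)
        if (self.requests >= self.min_requests
                and self.errors / self.requests >= self.max_error_rate):
            self.aborted = True

    def feature_enabled(self) -> bool:
        return not self.aborted


breaker = CircuitBreaker(released_at=0.0)
for second in range(30):
    breaker.record(now=float(second), is_error=True)  # every request fails
print("feature still enabled:", breaker.feature_enabled())
```

Limiting the automatic abort to a short window keeps the default of team ownership intact: after three minutes, stopping a feature is again a decision for the people responsible for it.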

Booking.com Circuit Breaker (Automatic Stopping)

Guardrail metrics


Booking.com’s experience with Experimentation: A way to validate ideas
• Measure (in a controlled manner) the impact changes have on user behaviour
• Every change has a clear objective (explicitly stated hypothesis on how it will improve user experience)
• Measuring allows validation that desired outcome is achieved

Feature Flag Experimentation example: 50% / 50%

Booking’s Big Takeaway
“The quicker we manage to validate new ideas the less time is wasted on things that don’t work and the more time is left to work on things that make a difference.”

LinkedIn XLNT/LIX

LinkedIn early days: a modest start for XLNT
• Built a targeting engine that could “split” traffic between existing and new code
• Impact analysis was by hand only (and took ~2 weeks), so nobody did it :-(
Essentially just feature flags without automated feedback

LinkedIn XLNT Today
• A controlled release with standardized KPI calculation launched every 5 minutes
• 100 releases per day
• 6000 metrics that can be “followed” by any stakeholder: “What releases are moving the numbers I care about?”

Lessons learned at LinkedIn
• Build for scale: no more coordinating over email
• Make it trustworthy: targeting and analysis must be rock solid
• Design for diverse teams, not just data scientists
Ya Xu, Head of Data Science, LinkedIn (Decisions Conference, 10/2/2018)

Summing it up: The patterns are proven
Maturity hasn’t come easily or fast for the pioneers
• Step 1: Feature flags
• Step 2: Sensors (correlation)
• Step 3: Stats engine (causation), the “Holy Grail”
• Also shown: Mgmt console, system of record, alerting
• Axes: increasing functionality & company adoption; cost to build and maintain

Split implements these patterns as a service: split.io


Checklists to DIY or Buy
• Foundational Capabilities You’ll Need
• How-To’s: Monitor & Experiment

Foundational Capability #1: Decouple deploy from release
❏ Allow changes of exposure w/o new deploy or rollback
❏ Support targeting by UserID, attribute (population), random hash
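The three targeting modes in this checklist can be sketched as one evaluation function that reads its rules from configuration rather than from deployed code. Everything here (`FlagConfig`, `evaluate`, the attribute names) is hypothetical and only illustrates the shape of such a capability.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FlagConfig:
    """Rules live in config, so exposure changes without a deploy or rollback."""
    name: str
    allowed_user_ids: set = field(default_factory=set)
    allowed_attributes: dict = field(default_factory=dict)  # e.g. {"group": {"employee"}}
    rollout_percent: int = 0


def evaluate(flag: FlagConfig, user_id: str, attributes: dict) -> bool:
    # 1. Targeting by UserID (for example, the dev team).
    if user_id in flag.allowed_user_ids:
        return True
    # 2. Targeting by attribute (a population such as internal dogfooders).
    for key, allowed_values in flag.allowed_attributes.items():
        if attributes.get(key) in allowed_values:
            return True
    # 3. Random but deterministic hash for a percentage of everyone else.
    digest = hashlib.sha256(f"{flag.name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < flag.rollout_percent


flag = FlagConfig(name="new-search",
                  allowed_user_ids={"dev-1"},
                  allowed_attributes={"group": {"employee"}})
print(evaluate(flag, "dev-1", {}))                    # targeted by UserID
print(evaluate(flag, "u-42", {"group": "employee"}))  # targeted by attribute
flag.rollout_percent = 100                            # a config change, not a deploy
print(evaluate(flag, "u-99", {}))                     # now covered by the hash rule
```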

Foundational Capability #2: Automate a reliable and consistent way to answer, “Who have we exposed this to so far?”
❏ Record who hit a flag, which way they were sent, and why
❏ Confirm that targeting is working as intended
❏ Confirm that expected traffic levels are reached
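A minimal sketch of such a record, assuming an in-memory log: each evaluation stores who hit the flag, which treatment they got, and the rule that decided it. The `Impression` and `ImpressionLog` names and the reason strings are hypothetical; a real system would write to a durable store.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Impression:
    flag: str
    user_id: str    # who hit the flag
    treatment: str  # which way they were sent, e.g. "on" or "off"
    reason: str     # why, e.g. "allowlist" or "hash bucket below 10%"


class ImpressionLog:
    def __init__(self):
        self._records = []

    def record(self, impression: Impression) -> None:
        self._records.append(impression)

    def exposed_users(self, flag: str, treatment: str = "on") -> set:
        """Answer: who have we exposed this to so far?"""
        return {r.user_id for r in self._records
                if r.flag == flag and r.treatment == treatment}

    def treatment_share(self, flag: str) -> dict:
        """Share of unique users per treatment, to confirm traffic levels."""
        latest = {r.user_id: r.treatment for r in self._records if r.flag == flag}
        counts = Counter(latest.values())
        total = sum(counts.values())
        return {t: n / total for t, n in counts.items()} if total else {}


log = ImpressionLog()
log.record(Impression("new-search", "dev-1", "on", "allowlist"))
log.record(Impression("new-search", "u-2", "off", "hash bucket above 10%"))
print(log.exposed_users("new-search"))
print(log.treatment_share("new-search"))
```

Comparing `treatment_share` with the configured percentage is one simple check that targeting works as intended.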

Foundational Capability #3: Automate a reliable and consistent way to answer, “How is it going for them (and us)?”
❏ Automate comparison of system health (errors, latency, etc…)
❏ Automate comparison of user behavior (business outcomes)
❏ Make it easy to include “Guardrail Metrics” in comparisons to avoid the local optimization trap
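One way to picture a guardrail comparison, as a sketch under assumptions: metrics for users with the feature off (control) and on (treatment) are already aggregated, and each guardrail has a maximum tolerated relative increase. The function name, metric names and limits below are all hypothetical.

```python
def breached_guardrails(control: dict, treatment: dict,
                        max_relative_increase: dict) -> list:
    """Return the guardrail metrics that got worse by more than their limit."""
    breached = []
    for metric, limit in max_relative_increase.items():
        base, new = control[metric], treatment[metric]
        if base == 0:
            # No baseline to compare against: treat any increase as a breach.
            if new > 0:
                breached.append(metric)
            continue
        if (new - base) / base > limit:
            breached.append(metric)
    return breached


# The feature may lift its target metric while hurting something else.
control = {"error_rate": 0.010, "p95_latency_ms": 200.0}
treatment = {"error_rate": 0.011, "p95_latency_ms": 260.0}
limits = {"error_rate": 0.50, "p95_latency_ms": 0.20}
print(breached_guardrails(control, treatment, limits))
```

Here the error rate stays within its limit but latency rises by 30%, which is the kind of side effect a team optimizing only its own metric would miss.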

How-To: Release Faster With Less Risk
Limit the blast radius of unexpected consequences so you can replace the “big bang” release night with more frequent, less stressful rollouts. Build on the foundational capabilities to:
❏ Ramp in stages, starting with dev team, then dogfooding, then % of public
❏ Monitor at feature rollout level, not just globally (vivid facts vs faint signals)
❏ Alert at the team level (build it/own it)
❏ Kill if severe degradation detected (stop the pain now, triage later)
❏ Continue to ramp up healthy features while “sick” are ramped down or killed
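The ramp-and-kill checklist can be sketched as a small state transition. The stage names, the three health labels and the `next_stage` function are hypothetical; how health is judged would come from the per-feature monitoring described in the checklist.

```python
RAMP_STAGES = ["dev_team", "dogfood", "public_5", "public_25", "public_100"]


def next_stage(current: str, health: str) -> str:
    """Decide where a feature goes next based on its own health signal."""
    if health not in ("healthy", "sick", "severe"):
        raise ValueError(f"unknown health status: {health}")
    if current == "killed" or health == "severe":
        return "killed"  # stop the pain now, triage later
    index = RAMP_STAGES.index(current)
    if health == "sick":
        return RAMP_STAGES[max(index - 1, 0)]                 # ramp down
    return RAMP_STAGES[min(index + 1, len(RAMP_STAGES) - 1)]  # ramp up


# Each feature moves independently of the others.
features = {"new-search": ("dogfood", "healthy"),
            "new-checkout": ("public_25", "sick"),
            "new-login": ("public_5", "severe")}
for name, (stage, health) in features.items():
    print(f"{name}: {stage} -> {next_stage(stage, health)}")
```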

How-To: Engineer for Impact (Not Output)
Focus precious engineering cycles on “what works” with experimentation, making statistically rigorous observations about what moves KPIs (and what doesn’t). Build on the foundational capabilities to:
❏ Target an experiment to a specific segment of users
❏ Ensure random, deterministic, persistent allocation to A/B/n variants
❏ Ingest metrics chosen before the experiment starts (not cherry-picked after)
❏ Compute statistical significance before proclaiming winners
❏ Design for diverse audiences, not just data scientists (buy-in needed to stick)
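Two items from this checklist lend themselves to a short sketch: deterministic allocation to A/B/n variants, and a significance check before declaring a winner. The two-proportion z-test is one standard choice, swapped in here for illustration; the deck does not say which test any of the named platforms use. Function names and the sample numbers are hypothetical.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str, variants: list) -> str:
    """Deterministic, persistent allocation: same user, same variant, every time."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


print(assign_variant("user-1", "search-ranking", ["A", "B", "C"]))

# Made-up counts: 100 of 1000 conversions in A versus 150 of 1000 in B.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```

The metric and the significance threshold should be fixed before the experiment starts, which is the point of the "not cherry-picked after" item.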

“Whatever you are, try to be a good one.” William Makepeace Thackeray

Progressive Delivery Resources: tinyurl.com/pd4u2020 (just posted to Twitter as well)
