
Deliver Results, Not Just Releases (Manage, Monitor, Experiment)

Dave Karow
November 19, 2019


Here is the deck I presented at #GOTOcph on the Global Supertrends track.

How do companies like Netflix, LinkedIn and Booking.com crush it year after year? Yes, they release early and often—but they also build control and observability into their CD pipeline to turn releases into results.

Progressive delivery and the statistical observation of real users (sometimes known as “shift right testing” or “feature experimentation”) are essential CD practices. They free teams up to move fast, control risk and focus engineering cycles on work that delivers results, not just releases.



Transcript

  1. Deliver Results, Not Just Releases: How Leaders Do Continuous Delivery to Get Ahead, Not Just Go Faster @davekarow
  2. "The future is already here — it's just not very evenly distributed." William Gibson @davekarow
  3. Manage is about control of exposure: ...grant early access ...limit the blast radius ...define surface area for learning. How do we decouple deploy from release? And decouple revert from rollback? @davekarow
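
To make the deploy/release decoupling concrete, here is a minimal sketch (my own illustration, not from the deck; the flag client and flag name are hypothetical). The new code path ships to production dark, release is a flag flip, and "revert" is flipping it back rather than rolling back a deploy:

# Deploy ships the code dark; release is flipping a flag, not a deploy.
class FeatureFlags:
    def __init__(self, enabled=None):
        self.enabled = enabled or set()

    def is_enabled(self, flag_name):
        return flag_name in self.enabled

def search(query, flags):
    if flags.is_enabled("new-search-ranking"):  # release = enable the flag
        return f"new ranking for {query!r}"     # new code, already deployed
    return f"old ranking for {query!r}"         # revert = disable flag, no rollback

flags = FeatureFlags(enabled={"new-search-ranking"})
print(search("hotels in copenhagen", flags))
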
  4. Feature flag: managing exposure like a dimmer or light board. 0% 10% 20% 50% 100% @davekarow
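
One common way to implement the dimmer (a sketch under assumptions; real flag systems do the bucketing for you) is to hash each user into a stable bucket from 0 to 99 and compare it against the rollout percentage:

import hashlib

def in_rollout(user_id: str, flag_name: str, percentage: int) -> bool:
    # Deterministic bucket 0..99: the same user gets the same answer at a
    # given dial setting, so turning 10% up to 50% only ever adds users.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percentage

for pct in (0, 10, 20, 50, 100):
    print(pct, in_rollout("user-42", "new-search-ranking", pct))
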
  5. Booking.com’s experience with “asynchronous feature release” • Deploying has no impact on user experience • Deploy more frequently with less risk to business and users • The big win is agility
  6. Booking.com monitoring: “Experimentation as a safety net” • Each new feature is wrapped in its own experiment • Allows monitoring and stopping of individual changes • The developer or team responsible for the feature can enable and disable it... • ...regardless of who deployed the new code that contained it. @davekarow
  7. Monitoring the needle in the haystack: if you roll out a change to just 5% of your population, and 20% (1 in 5) of the exposed users get an error, that’s a HUGE problem! But what % of your total user population is getting that error? Just 1%. @davekarow
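
The arithmetic behind that slide, spelled out: the error is severe inside the exposed slice but nearly invisible in whole-population metrics.

exposed_share = 0.05       # change rolled out to 5% of users
error_rate_exposed = 0.20  # 1 in 5 exposed users hit the error

overall = exposed_share * error_rate_exposed
print(f"{overall:.0%} of ALL users are erroring")  # -> 1%
# A global error dashboard sees a ~1% blip and may never alert;
# a per-feature view of the exposed 5% sees a 20% failure rate.
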
  8. Booking.com safety net, automated: the “circuit breaker” • Active for the first three minutes of a feature release • Severe degradation → automatic abort of that feature • An acceptable divergence from the core value of local ownership and responsibility where it’s a “no brainer” that users are being negatively impacted @davekarow
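
A rough sketch of that circuit breaker (the three-minute window is from the slide; the threshold, names, and structure are illustrative assumptions, not Booking.com's code):

import time

ABORT_WINDOW_SECONDS = 3 * 60  # breaker armed for the first 3 minutes (per the slide)
MAX_ERROR_RATE = 0.05          # "severe degradation" threshold (illustrative)

def check_circuit_breaker(release_started_at, error_rate, disable_flag):
    # Auto-abort a feature release that is clearly hurting users.
    armed = time.time() - release_started_at < ABORT_WINDOW_SECONDS
    if armed and error_rate > MAX_ERROR_RATE:
        disable_flag()  # flip the feature's own flag off; no rollback deploy
        return "aborted"
    return "healthy"

print(check_circuit_breaker(time.time() - 60, error_rate=0.20,
                            disable_flag=lambda: None))  # -> "aborted"
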
  9. Booking.com experimentation: “Experimentation as a way to validate ideas” • Measure (in a controlled manner) the impact changes have on user behaviour • Every change has a clear objective (an explicitly stated hypothesis for how it will improve the user experience) • Measuring allows validation that the desired outcome is achieved @davekarow
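
As a sketch of what "measuring in a controlled manner" typically means in practice, here is a standard two-proportion z-test on a conversion metric (the numbers and metric are made up, and this is a textbook technique, not necessarily Booking.com's machinery):

from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    # Two-sided p-value for H0: control and treatment convert at the same rate.
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # via normal CDF
    return p_b - p_a, p_value

# Hypothesis: the new checkout flow lifts conversion.
lift, p = two_proportion_z(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"observed lift {lift:+.2%}, p = {p:.3f}")
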
  10. “The quicker we manage to validate new ideas, the less time is wasted on things that don’t work, and the more time is left to work on things that make a difference.” https://medium.com/booking-com-development/moving-fast-breaking-things-and-fixing-them-as-quickly-as-possible-a6c16c5a1185 @davekarow
  11. LinkedIn early days: a modest start for XLNT • Built a targeting engine that could “split” traffic between existing and new code • Impact analysis was by hand (~2 weeks), so nobody did it :-( • Essentially just feature flags without automated feedback @davekarow
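
The core of such a targeting engine is deterministic assignment: hash the user and experiment together so each user consistently lands in the same variant. A sketch (names are hypothetical, not LinkedIn's implementation):

import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("existing", "new")) -> str:
    # Same user + same experiment -> same variant, no coordination needed.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

print(assign_variant("member-1001", "homepage-v2"))
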
  12. LinkedIn XLNT today: a controlled release (with built-in observability) every 5 minutes • 100 releases per day • 6,000 metrics that can be “followed” by any stakeholder: “What releases are moving the numbers I care about?”
  13. Lessons learned at LinkedIn • Build for scale: no more coordinating over email • Make it trustworthy: targeting & analysis must be rock solid • Design for diverse teams, not just data scientists (Ya Xu, Head of Data Science, LinkedIn; Decisions Conference, 10/2/2018) @davekarow
  14. Experimentation platforms are pervasive in modern development, but homegrown and costly. The typical progression: Step 1: feature flags → Step 2: sensors (correlation) → Step 3: stats engine (causation) → the “Holy Grail”: management console, system of record, alerting. The cost to build and maintain rises with increasing functionality and company adoption, with cited figures of > $50M, > $30M, and > $25M in annual cost. @davekarow
  15. That’s why we created Split. Adil’s story: 1. Used XLNT at LinkedIn 2. Left LinkedIn & missed XLNT 3. Built MVP @ RelateIQ 4. Saved white screen of death 5. October 2015: Founded Split @davekarow
  16. “Whatever you are, try to be a good one.” William Makepeace Thackeray @davekarow
  17. “Our success at Amazon is a function of how many experiments we do per year, per month, per week, per day.”
  18. Leaders innovate in a fast, safe, and smart iteration loop • Iterate fast: code deployment is fast, with code commits deployed to prod within the day (1 deploy per second across hundreds of services) • Iterate safe: releases inevitably fail; they invest in detecting and recovering quickly from failure (a 3-minute window to detect and kill bad code) • Iterate smart: many new features are ineffective; they measure which features work and focus there (20% of features move the business forward)