Slide 1

Deliver Results, Not Just Releases: How Leaders Do Continuous Delivery to Get Ahead, Not Just Go Faster

Slide 2

The future is already here — it's just not very evenly distributed.
William Gibson

Slide 3

What Do These Leaders All Have In Common?

Slide 4

Progressive Delivery:
- Control of release
- Observability of impact
...at the user level

Slide 5

Three Pillars of Progressive Delivery

Slide 6

Booking.com: a well-documented example of this pattern.

Slide 7

Booking.com: Let’s see how they describe it.

Slide 8

Great read: https://medium.com/booking-com-development/moving-fast-breaking-things-and-fixing-them-as-quickly-as-possible-a6c16c5a1185

Slide 9

Manage

Slide 10

Manage is about control of exposure:
...grant early access
...limit the blast radius
...define surface area for learning
How do we decouple deploy from release? And decouple revert from rollback?

Slide 11

Feature flag: managing exposure like a dimmer or light board (0% → 10% → 20% → 50% → 100%)
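A minimal sketch of the dimmer idea in Python (this is not Split's SDK; the flag name "new-checkout" and the hashing scheme are illustrative assumptions): hash each user into a stable bucket from 0 to 99, and expose the feature only to buckets below the current rollout percentage.

```python
import hashlib

def bucket(user_id: str, flag: str) -> int:
    """Deterministically map a user to a bucket in [0, 100)."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def is_enabled(user_id: str, flag: str, rollout_pct: int) -> bool:
    """True if this user falls inside the current exposure percentage."""
    return bucket(user_id, flag) < rollout_pct

# Turning the dimmer up: 0% -> 10% -> 20% -> 50% -> 100%.
# Because buckets are stable, a user exposed at 10% stays exposed at 20%.
for pct in (0, 10, 20, 50, 100):
    exposed = sum(is_enabled(f"user-{i}", "new-checkout", pct) for i in range(10_000))
    print(f"{pct:3d}% rollout -> {exposed} of 10000 users exposed")
```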

Slide 12

https://medium.com/booking-com-development/moving-fast-breaking-things-and-fixing-them-as-quickly-as-possible-a6c16c5a1185

Slide 13

Booking.com’s experience with “asynchronous feature release”:
● Deploying has no impact on user experience
● Deploy more frequently with less risk to business and users
● The big win is agility

Slide 14

Monitor

Slide 15

Booking.com monitoring: “Experimentation as a safety net”
● Each new feature is wrapped in its own experiment
● Allows monitoring and stopping of individual changes
● The developer or team responsible for the feature can enable and disable it...
● ...regardless of who deployed the new code that contained it.
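A hedged sketch of "wrapped in its own experiment" as a per-feature kill switch (the names FLAGS, new_ranking, and render_search are hypothetical, not Booking.com's actual system): each feature checks its own runtime switch, so the owning team can revert the feature without rolling back the deploy that shipped it.

```python
def new_ranking(results):
    # Stand-in for the new feature's code path.
    return sorted(results)

def render(results):
    return ", ".join(results)

# Imagine this dict backed by a runtime-updatable flag service.
FLAGS = {"new-ranker": True}

def render_search(results):
    # The feature is gated by its own switch, independent of other changes
    # that rode along in the same deployment.
    if FLAGS.get("new-ranker", False):
        results = new_ranking(results)
    return render(results)

print(render_search(["hotel-b", "hotel-a"]))
# If the feature misbehaves, its owners set FLAGS["new-ranker"] = False:
# the feature is reverted while the deployed code stays in production.
```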

Slide 16

Monitoring the needle in the haystack:
If you roll out a change to just 5% of your population and 20% (1 in 5) of the exposed users get an error, that’s a HUGE problem!
But what % of your total user population is getting that error? Only 1% (20% of 5% = 1%).

Slide 17


Slide 18


Slide 19

Booking.com safety net, automated: the “circuit breaker”
● Active for the first three minutes of a feature release
● Severe degradation → automatic abort of that feature
● An acceptable divergence from the core value of local ownership and responsibility when it’s a “no brainer” that users are being negatively impacted
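A sketch of that circuit-breaker behavior (the window length, threshold, and all names here are illustrative assumptions, not Booking.com's actual values): during a short watch window after release, severe degradation triggers an automatic abort of just that feature.

```python
import time

WINDOW_SECONDS = 180      # "active for the first three minutes of feature release"
ABORT_ERROR_RATE = 0.20   # degradation level treated as a "no brainer" abort

def circuit_breaker(flag: str, released_at: float, errors: int, requests: int,
                    kill_switch) -> bool:
    """Abort the feature if it is severely degrading inside the watch window."""
    in_window = time.time() - released_at < WINDOW_SECONDS
    error_rate = errors / requests if requests else 0.0
    if in_window and error_rate >= ABORT_ERROR_RATE:
        kill_switch(flag)  # automatically turn off just this feature
        return True
    return False

# Example: 30 errors in 100 requests, 60 seconds after release -> abort.
tripped = circuit_breaker("new-checkout", time.time() - 60, errors=30, requests=100,
                          kill_switch=lambda f: print(f"aborting feature: {f}"))
print("tripped:", tripped)
```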

Slide 20

Booking.com Circuit Breaker

Slide 21

Experiment

Slide 22

Booking.com experimentation: “Experimentation as a way to validate ideas”
● Measure (in a controlled manner) the impact changes have on user behaviour
● Every change has a clear objective: an explicitly stated hypothesis of how it will improve the user experience
● Measuring allows validation that the desired outcome is achieved

Slide 23

Feature flag experimentation example: a 50/50 split
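A minimal sketch of that 50/50 experiment in Python (the experiment name, conversion metric, and effect size are invented for illustration): assign each user deterministically to control or treatment, then compare a success metric between the two groups.

```python
import hashlib
import random

def variant(user_id: str, experiment: str) -> str:
    """Deterministic 50/50 assignment to control or treatment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 100 < 50 else "control"

# Simulate outcomes, assuming treatment converts slightly better.
random.seed(7)
stats = {"control": [0, 0], "treatment": [0, 0]}  # [conversions, exposures]
for i in range(20_000):
    v = variant(f"user-{i}", "new-checkout-test")
    rate = 0.105 if v == "treatment" else 0.100
    stats[v][0] += random.random() < rate
    stats[v][1] += 1

for v, (hits, n) in stats.items():
    print(f"{v}: {hits}/{n} = {hits / n:.3f}")
# A real platform layers a stats engine (significance tests, guardrail
# metrics) on top before trusting a difference like this.
```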

Slide 24

Guardrail metrics

Slide 25

“The quicker we manage to validate new ideas, the less time is wasted on things that don’t work, and the more time is left to work on things that make a difference.”
https://medium.com/booking-com-development/moving-fast-breaking-things-and-fixing-them-as-quickly-as-possible-a6c16c5a1185

Slide 26

An example of the journey from MVP to trusted system: LinkedIn XLNT

Slide 27

LinkedIn early days: a modest start for XLNT
● Built a targeting engine that could “split” traffic between existing and new code
● Impact analysis was by hand (~2 weeks), so nobody did it :-(
Essentially just feature flags without automated feedback.

Slide 28

LinkedIn XLNT today:
● A controlled release (with built-in observability) every 5 minutes
● 100 releases per day
● 6,000 metrics that can be “followed” by any stakeholder: “What releases are moving the numbers I care about?”

Slide 29

Lessons learned at LinkedIn
● Build for scale: no more coordinating over email
● Make it trustworthy: targeting & analysis must be rock solid
● Design for diverse teams, not just data scientists
Ya Xu, Head of Data Science, LinkedIn (Decisions Conference, 10/2/2018)

Slide 30

Experimentation platforms are pervasive in modern development, but homegrown and costly.
Step 1: Feature flags
Step 2: Sensors → correlation
Step 3: Stats engine → causation
“Holy Grail”: management console, system of record, alerting
(Chart shows cost to build and maintain rising with increasing functionality & company adoption; examples cited at > $50M, > $30M, and > $25M annual cost.)

Slide 31

That’s why we created Split. Adil’s story:
1. Used XLNT at LinkedIn
2. Left LinkedIn & missed XLNT
3. Built MVP @ RelateIQ
4. Saved white screen of death
5. October 2015: founded Split

Slide 32


Slide 33


Slide 34


Slide 35

Community… this is bigger than any of us

Slide 36

https://www.youtube.com/watch?v=Y_D9t98dGh4
Google “Talia Nassi Testing In Production”

Slide 37

https://www.split.io/blog/how-to-avoid-lying-to-yourself-with-statistics/

Slide 38

Sustainability: We are killing the release night
https://www.split.io/blog/on-a-mission-to-kill-release-nights/

Slide 39

Whatever you are, try to be a good one.
William Makepeace Thackeray

Slide 40

Appendix

Slide 41

“Our success at Amazon is a function of how many experiments we do per year, per month, per week, per day.”
Jeff Bezos

Slide 42

Leaders innovate in a fast, safe, and smart iteration loop
Iterate safe: releases inevitably fail; they invest in detecting and recovering quickly from failure (3-minute window to detect and kill bad code)
Iterate smart: many new features are ineffective; they measure which features work and focus there (20% of features move the business forward)
Iterate fast: code deployment is fast, with code commits deployed to prod within the day (1 deploy/second across hundreds of services)