Slide 1

Slide 1 text

DevOps Topologies 10 years on: what have we learned about silos, collaboration, and flow? Matthew Skelton, Conflux - co-author of Team Topologies. Eficode DEVOPS Conference, London | 2024-03-14 | K26. Image: CC BY-NC, Duncan Rawlinson, https://duncan.co/land-of-plenty/

Slide 2

Slide 2 text

Matthew Skelton, Founder at Conflux, co-author of Team Topologies. “Ecosystem engineering”. LinkedIn: linkedin.com/in/matthewskelton Mastodon: mastodon.social/@matthewskelton 2

Slide 3

Slide 3 text

Team Topologies Organizing business and technology teams for fast flow Matthew Skelton & Manuel Pais IT Revolution Press, September 2019 Order via stores worldwide: teamtopologies.com/book 3

Slide 4

Slide 4 text

Ten years after SODOR and DevOps Topologies, what is holding back lower-performing organizations from improving their IT delivery performance? 4

Slide 5

Slide 5 text

10 years on: DOTs and SODOR | The cost of tangled software | Accelerate metrics + TT for flow | Success with fast flow 5

Slide 6

Slide 6 text

Almost all decisions need to use the twin ‘lenses’ of fast flow and team cognitive load 👓 6

Slide 7

Slide 7 text

[key point] 7

Slide 8

Slide 8 text

10 years on: DevOps Topologies and State of DevOps Reports 8

Slide 9

Slide 9 text

9 2013: Epic battles between Dev and Ops

Slide 10

Slide 10 text

10 https://blog.matthewskelton.net/2013/10/22/what-team-structure-is-right-for-devops-to-flourish/

Slide 11

Slide 11 text

11 https://web.devopstopologies.com/

Slide 12

Slide 12 text

12 https://web.devopstopologies.com/ Used by Netflix, Condé Nast, Accenture, etc. etc.

Slide 13

Slide 13 text

Thanks to Gene Kim and IT Revolution Press (and many other inspiring people) 13

Slide 14

Slide 14 text

14

Slide 15

Slide 15 text

15 https://confluxhq.com/devops-topologies

Slide 16

Slide 16 text

16 Image taken from devopstopologies.com - licensed under CC BY-SA Example: Anti-Type B (DevOps Team Silo)

Slide 17

Slide 17 text

17 Image taken from devopstopologies.com - licensed under CC BY-SA Example: Anti-Type E (Rebranded SysAdmin)

Slide 18

Slide 18 text

18 Image taken from devopstopologies.com - licensed under CC BY-SA Example: Anti-Type H (Fake SRE)

Slide 19

Slide 19 text

19 Image taken from devopstopologies.com - licensed under CC BY-SA Example: Type 1 (Dev and Ops Collaboration)

Slide 20

Slide 20 text

20 Image taken from devopstopologies.com - licensed under CC BY-SA. [Diagram labels: DevOps Topologies (2013), Team Topologies (2019+), Platform grouping, Stream-aligned team, Collaboration, Flow of value]

Slide 21

Slide 21 text

21 Image taken from devopstopologies.com - licensed under CC BY-SA Example: Type 3 (Ops as Infrastructure-as-a-Service (Platform))

Slide 22

Slide 22 text

22 Image taken from devopstopologies.com - licensed under CC BY-SA. [Diagram labels: DevOps Topologies (2013), Team Topologies (2019+), Platform grouping, Stream-aligned team (x2), XaaS (x2), Flow of value]

Slide 23

Slide 23 text

23 Image taken from devopstopologies.com - licensed under CC BY-SA Example: Type 5 (DevOps Team with an Expiry Date)

Slide 24

Slide 24 text

24 Image taken from devopstopologies.com - licensed under CC BY-SA. [Diagram labels: DevOps Topologies (2013), Team Topologies (2019+), Platform grouping, Stream-aligned team, Enabling team, Facilitating, Flow of value]

Slide 25

Slide 25 text

25 [Diagram labels: Platform grouping, Stream-aligned team (xN), Complicated Subsystem team, XaaS, Collaboration, Flow of change] 💡 Team Topologies diagrams are always just “snapshots in time”, never fixed designs

Slide 26

Slide 26 text

26 0 - Success lies in addressing dynamic interactions, not static structure [alone]

Slide 27

Slide 27 text

27 State of DevOps Reports (SODOR)

Slide 28

Slide 28 text

28 State of DevOps reports: 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023. Annual survey of 1000-5000 IT professionals worldwide using rigorous statistical methods. https://www.puppet.com/resources/history-of-devops-reports

Slide 29

Slide 29 text

29 State of DevOps 2019 “The use of cloud… is predictive of software delivery performance and availability.” “High performers favor strategies that create community structures at both low and high levels in the organization...”

Slide 30

Slide 30 text

30 State of DevOps 2019 “Heavyweight change approval processes, such as change approval boards, negatively impact speed and stability. In contrast, having a clearly understood process for changes drives speed and stability, as well as reductions in burnout.”

Slide 31

Slide 31 text

31 State of DevOps 2019 “We use a structural equation model (SEM), which is a predictive model used to test relationships” → Improvements in practices predict improved organizational performance

Slide 32

Slide 32 text

33 Team Topologies practices are predictive of higher organizational performance - SODOR https://www.puppet.com/resources/state-of-devops-report “Highly evolved organizations tend to follow the Team Topologies model”

Slide 33

Slide 33 text

34 Team Topologies ideas are now a key part of the AWS ‘Well-architected’ guidance https://docs.aws.amazon.com/wellarchitected/latest/devops-guidance/oa.std.1-organize-teams-into-distinct-topology-types-to-optimize-the-value-stream.html “[OA.STD.1] Organize teams into distinct topology types to optimize the value stream”

Slide 34

Slide 34 text

35 Team Topologies ideas are now a key part of the Azure ‘Cloud Adoption Framework’ https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/considerations/devops-teams-topologies “Consider establishing an enabling team that can … support applications and platforms … ”

Slide 35

Slide 35 text

36 2013 2024+

Slide 36

Slide 36 text

37 2013 2024+ Practitioner-led co-evolution

Slide 37

Slide 37 text

38 Financial Conduct Authority (FCA) in the UK ‘Implementing Technology Change’ - 2021 report “Overall, we found that firms that deployed smaller, more frequent releases had higher change success rates than those with longer release cycles. Firms that made effective use of agile delivery methodologies were also less likely to experience a change incident.” https://www.fca.org.uk/publications/multi-firm-reviews/implementing-technology-change

Slide 38

Slide 38 text

The cost of tangled software 39

Slide 39

Slide 39 text

40 If each engineer in the organization is blocked for 1 hour per working day, how much does this cost?

Slide 40

Slide 40 text

41
● Fully-loaded cost: €160k per year
● 260 paid days per year
● Total of 400 engineers

Slide 41

Slide 41 text

42
Engineer fully weighted cost per year: €160,000.00
Engineer cost per day: €615.38
Hours blocked per 8-hour day: 1
Days blocked per 260-day year: 32.5
Cost of blockers per engineer per year: €20,000.00
Number of engineers: 400
Total cost of blockers per year: €8,000,000.00
€8 million per year 💥
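The arithmetic behind these figures can be reproduced with a short sketch. The function name, parameter names, and dictionary keys are illustrative assumptions, not something from the talk; the inputs are the example figures from the slide.

```python
def blocker_cost(annual_cost, paid_days, hours_per_day, hours_blocked_per_day, engineers):
    """Return the intermediate and total figures for the cost of blocked time.

    All names here are hypothetical; the steps mirror the rows of the slide's table.
    """
    # Cost of one engineer for one paid day.
    cost_per_day = annual_cost / paid_days
    # Blocked hours per day, converted into whole blocked days per year.
    days_blocked = paid_days * hours_blocked_per_day / hours_per_day
    # Yearly cost of blocked time for a single engineer.
    cost_per_engineer = cost_per_day * days_blocked
    # Scale up to the whole engineering organization.
    total = cost_per_engineer * engineers
    return {
        "cost_per_day": cost_per_day,
        "days_blocked_per_year": days_blocked,
        "cost_per_engineer_per_year": cost_per_engineer,
        "total_cost_per_year": total,
    }


# Example inputs from the slide: EUR 160k, 260 paid days, 8-hour day, 1 hour blocked, 400 engineers.
figures = blocker_cost(160_000, 260, 8, 1, 400)
for name, value in figures.items():
    print(f"{name}: {value:,.2f}")
```

Changing `hours_blocked_per_day` or `engineers` shows how quickly the total scales with organization size.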

Slide 42

Slide 42 text

Untangling ≃ Decoupling 43

Slide 43

Slide 43 text

Decoupling: separating things that do not need to be together 44

Slide 44

Slide 44 text

45

Slide 45

Slide 45 text

Decoupling enables shorter time-to-value 46

Slide 46

Slide 46 text

Decoupling enables multiple, independent flows of change, each with its own cadence 47

Slide 47

Slide 47 text

48 Multiple, independent flows

Slide 48

Slide 48 text

Decoupling of teams & technology vs Mingling of ideas for learning 49

Slide 49

Slide 49 text

Decoupling: teams, software, technology, deployments, data, business concepts, ... 50

Slide 50

Slide 50 text

Mingling: principles, practices, learning, techniques, approaches, ... 51

Slide 51

Slide 51 text

52 Cost of Delay https://blackswanfarming.com/four-steps-to-quantifying-cost-of-delay/

Slide 52

Slide 52 text

53 1 - Organizing for fast flow helps us to remove inter-team dependencies, improving financial efficiency and time-to-value

Slide 53

Slide 53 text

Accelerate metrics + TT for flow 54

Slide 54

Slide 54 text

🔍 Use the 4 Key Metrics from Accelerate and add “blocker count” 55

Slide 55

Slide 55 text

Accelerate Building and Scaling High Performing Technology Organizations Nicole Forsgren, Jez Humble, Gene Kim IT Revolution Press, 2018 Order via stores worldwide: https://itrevolution.com/book/accelerate/ 56

Slide 56

Slide 56 text

4 key metrics (‘Accelerate’): lead time, deployment frequency, Mean Time To Restore, change fail percentage

Slide 57

Slide 57 text

4 key metrics (‘Accelerate’): lead time, deployment frequency, Mean Time To Restore, change fail percentage. Encourage fast flow

Slide 58

Slide 58 text

4 key metrics (‘Accelerate’): lead time, deployment frequency, Mean Time To Restore, change fail percentage. Encourage operability
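As a rough illustration of how the four key metrics listed above might be derived from deployment records, here is a minimal sketch. The `Deployment` record, its field names, and the simplified definitions (lead time as commit-to-deploy, restore time as deploy-to-restore) are assumptions made for this example; they are not taken from the talk or from any specific tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def four_key_metrics(deployments, period_days):
    """Compute simplified versions of the four key metrics for one period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    # Lead time: hours from commit to running in production.
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    # Restore time: hours from a failed deployment to service restoration.
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "median_lead_time_hours": median(lead_hours),
        "deployments_per_day": len(deployments) / period_days,
        "change_fail_percentage": len(failures) / len(deployments) * 100,
        "mean_time_to_restore_hours": mean(restore_hours) if restore_hours else None,
    }


# Made-up sample data covering four days.
t0 = datetime(2024, 3, 1, 9, 0)
day = timedelta(days=1)
sample = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0 + day, t0 + day + timedelta(hours=8), failed=True,
               restored_at=t0 + day + timedelta(hours=10)),
    Deployment(t0 + 2 * day, t0 + 2 * day + timedelta(hours=6)),
    Deployment(t0 + 3 * day, t0 + 3 * day + timedelta(hours=2)),
]
metrics = four_key_metrics(sample, period_days=4)
print(metrics)
```

Tracking these per service or per team, rather than only organization-wide, makes it easier to see where a boundary is hurting flow.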

Slide 59

Slide 59 text

Flow Efficiency: what % of lead time is actual work? Example: 120 hours / (120 + 630) hours x 100 = 16% 60
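The flow efficiency example can be written as a small function. This is a sketch only; the function and parameter names are assumptions, and the 120 and 630 hour figures are the ones from the slide (active work and waiting, respectively).

```python
def flow_efficiency(active_hours: float, waiting_hours: float) -> float:
    """Percentage of total lead time spent on actual work."""
    lead_time = active_hours + waiting_hours
    if lead_time <= 0:
        raise ValueError("lead time must be positive")
    return active_hours / lead_time * 100


# Slide example: 120 hours of work, 630 hours of waiting.
efficiency = flow_efficiency(120, 630)
print(f"Flow efficiency: {efficiency:.0f}%")
```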

Slide 60

Slide 60 text

Measure ‘wait time’? https://www.isixsigma.com/methodology/lean-methodology/identify-constraints-and-reduce-wait-time-processes/ 61

Slide 61

Slide 61 text

Measuring ‘wait time’ is hard Count the number of blocking waits as a proxy ✨ 62

Slide 62

Slide 62 text

4 key metrics & ‘blocker count’: lead time, deployment frequency, Mean Time To Restore, change fail percentage, plus ‘blocker count’ as a proxy for flow efficiency 63
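One way to keep a ‘blocker count’ is to log each blocking wait and tally the waits per pair of teams. The sketch below assumes a hypothetical `BlockingWait` record and made-up team names; the talk does not prescribe a data format or a tool.

```python
from collections import Counter
from typing import NamedTuple


class BlockingWait(NamedTuple):
    """Hypothetical log entry: one occasion where a team had to wait on another."""
    work_item: str
    waiting_team: str
    blocking_team: str


def blocker_counts(waits):
    """Count blocking waits per (waiting team, blocking team) pair."""
    return Counter((w.waiting_team, w.blocking_team) for w in waits)


# Made-up example log.
waits = [
    BlockingWait("ITEM-1", "payments", "platform"),
    BlockingWait("ITEM-2", "payments", "platform"),
    BlockingWait("ITEM-3", "search", "payments"),
]
counts = blocker_counts(waits)
for (waiting, blocking), n in counts.most_common():
    print(f"{waiting} waited on {blocking}: {n} time(s)")
```

Pairs with the highest counts are candidates for a closer look at the service and team boundary between them.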

Slide 63

Slide 63 text

Use 4 key metrics 📊 + “blocker count” to assess and find better service & team boundaries for flow 64

Slide 64

Slide 64 text

“If we adjusted the service & team boundary here, would it improve the 4 key metrics?” 💓 65

Slide 65

Slide 65 text

“If we adjusted the service & team boundary here, would it reduce the blocker count?” 📉 66

Slide 66

Slide 66 text

67 TeamForm teamform.co

Slide 67

Slide 67 text

TeamOS 68 teamos.is Disclosure: Matthew Skelton has invested personally in the company behind TeamOS

Slide 68

Slide 68 text

Techniques from the Team Topologies community 69

Slide 69

Slide 69 text

70 Independent Service Heuristics (ISH)

Slide 70

Slide 70 text

71 “The Independent Service Heuristics (ISH) are rules-of-thumb (clues) for identifying candidate value streams and domain boundaries by seeing if they could be run as a separate SaaS/cloud product.” https://teamtopologies.com/ish

Slide 71

Slide 71 text

72 User Needs Mapping (UNM)

Slide 72

Slide 72 text

73 “User Needs Mapping attempts to capture the first 4 steps of the Wardley Mapping process … for identifying potential team [and service] boundary” https://teamtopologies.com/unm

Slide 73

Slide 73 text

74 Team Interaction Modeling (TIM)

Slide 74

Slide 74 text

75 “[Team Interaction Modeling helps] to describe how to re-organize … teams and their interactions to achieve better flow and deliver value faster.” https://teamtopologies.com/tim

Slide 75

Slide 75 text

76 2 - Evolving at speed requires a set of core principles and practices, with people trained up and engaged

Slide 76

Slide 76 text

Success with fast flow 77

Slide 77

Slide 77 text

Case Study JP Morgan 78

Slide 78

Slide 78 text

Case Study “How JP Morgan Applied Team Topologies to Improve Flow in a Market Leading Enterprise Platform” 79 https://www.youtube.com/watch?v=y3OL7dv2l48 Fast Flow Conf 💖 https://www.fastflowconf.com/

Slide 79

Slide 79 text

Case Study 80

Slide 80

Slide 80 text

Case Study “60% of dependencies reduced through better team design” 💥 81

Slide 81

Slide 81 text

82 Remember this? [Note: these are not figures from JP Morgan]
Engineer fully weighted cost per year: €160,000.00
Engineer cost per day: €615.38
Hours blocked per 8-hour day: 1
Days blocked per 260-day year: 32.5
Cost of blockers per engineer per year: €20,000.00
Number of engineers: 400
Total cost of blockers per year: €8,000,000.00
€8 million per year 💥

Slide 82

Slide 82 text

83 3 - Unblock, not coordinate.

Slide 83

Slide 83 text

Case Study GOV.UK Home Office 84

Slide 84

Slide 84 text

Case Study “How the Home Office’s Immigration Technology department reduced its cloud costs by 40%” 85 https://www.gov.uk/government/case-studies/how-the-home-offices-immigration-technology-department-reduced-its-cloud-costs-by-40

Slide 85

Slide 85 text

86 https://www.gov.uk/government/case-studies/how-the-home-offices-immigration-technology-department-reduced-its-cloud-costs-by-40

Slide 86

Slide 86 text

Case Study Making service owners accountable for the $ spend for their service helps to clarify service boundaries 💡 87

Slide 87

Slide 87 text

Use cost metrics as a “financial scalpel” 🔪 to split services apart for fast flow 88

Slide 88

Slide 88 text

Principles from Team Topologies that suggest certain architectures 95

Slide 89

Slide 89 text

96 Multiple, independent flows, fractally

Slide 90

Slide 90 text

Respect Conway’s Law (aka ‘sociotechnical mirroring’) 97

Slide 91

Slide 91 text

Clear ongoing ownership of services and systems 98

Slide 92

Slide 92 text

Stream-aligned teams have end-to-end responsibility for a service (You Build It, You Run It) 99

Slide 93

Slide 93 text

Platforms improve flow and reduce extraneous cognitive load 100

Slide 94

Slide 94 text

Teams are small (~9), slowly changing, with ‘aligned autonomy’ 101

Slide 95

Slide 95 text

Teams are empowered to sense and adjust boundaries to improve flow on a frequent basis 102

Slide 96

Slide 96 text

103 Architecture for fast flow resembles an ecosystem of loosely-coupled independently-viable services with clear boundaries and ownership aligned to the flow of business value.

Slide 97

Slide 97 text

104 Adaptive Systems with Domain-Driven Design, Wardley Mapping, and Team Topologies: Architecture for Flow – 7 July 2024 Susanne Kaiser See https://www.infoq.com/presentations/ddd-wardley-mapping-team-topology/

Slide 98

Slide 98 text

105 4 - Multiple, loosely-coupled flows of value, with significant automation & helper tooling.

Slide 99

Slide 99 text

Team Topologies is a set of coherent patterns to encourage emergent behaviours for fast flow 106 Matthew Skelton

Slide 100

Slide 100 text

TT pattern #1: 4 team types (well, 3 + 1 grouping). Image © 2019 Matthew Skelton and Manuel Pais. Used with permission. 107

Slide 101

Slide 101 text

Each team type (or grouping) has specific expected behaviors 108

Slide 102

Slide 102 text

Each team type (or grouping) has a specific relation to flow and team cognitive load 109

Slide 103

Slide 103 text

TT pattern #2: 3 team interaction modes. Image © 2019 Matthew Skelton and Manuel Pais. Used with permission. 110

Slide 104

Slide 104 text

The constraints on interactions provide signals to tell us when boundaries are not good for fast flow 111

Slide 105

Slide 105 text

The constraints on interactions provide signals to tell us about intent/mission, capabilities, skills, strategy, and lots more… 112

Slide 106

Slide 106 text

TT pattern #3: fast flow 113

Slide 107

Slide 107 text

Organizing for fast flow means we are happy with: duplication, (a few) different versions, async + eventual consistency, ‘internal marketplace’, etc. 114

Slide 108

Slide 108 text

TT pattern #4: team cognitive load 115

Slide 109

Slide 109 text

Using team cognitive load as a key architectural and design principle means we have a humane, compassionate, and realistic workplace 116

Slide 110

Slide 110 text

TT pattern #5: thinnest viable platform (TVP) 117

Slide 111

Slide 111 text

TVP avoids ‘platform bloat’ by focusing on enhancing flow and reducing team cognitive load - rather than technology 118

Slide 112

Slide 112 text

TT pattern #6: empower teams to adjust boundaries to enhance flow 119

Slide 113

Slide 113 text

Empowering teams to adjust boundaries for flow uses local knowledge for regular incremental gains, avoiding a dreaded ‘Re-org’ every 5 years 120

Slide 114

Slide 114 text

121 “The work is delivered in many small changes that are uncoordinated to enable flow. … Management’s job is to provide context, prioritization and to coordinate across teams. Lending resources if needed across teams to unblock things. … It works well within a high trust culture.” Adrian Cockcroft, Technology strategy advisor, Partner at OrionX.net (ex Amazon Sustainability, AWS, Battery Ventures, Netflix, eBay, Sun Microsystems, CCL) https://mastodon.social/@adrianco/111174832280576410

Slide 115

Slide 115 text

122 5 - Team Cognitive Load as a key design heuristic

Slide 116

Slide 116 text

123 6 - Long-lived value streams, not short-lived projects

Slide 117

Slide 117 text

124 Re-aligned architecture

Slide 118

Slide 118 text

125 Fast feedback via deployment pipelines

Slide 119

Slide 119 text

126 Good technical practices (TDD, …)

Slide 120

Slide 120 text

127 Team ownership of software & services

Slide 121

Slide 121 text

128 Configuration in version control (Git)

Slide 122

Slide 122 text

129 Cloud-native: transparent in operation

Slide 123

Slide 123 text

130 Cloud-native: designed for automation

Slide 124

Slide 124 text

131 Continuous: testing, performance, scanning, deployment, monitoring, right-sizing, integration

Slide 125

Slide 125 text

132 Psychological safety

Slide 126

Slide 126 text

133 Active diffusion of learning across team boundaries

Slide 127

Slide 127 text

134 7 - Strong technical and social practices as a foundation

Slide 128

Slide 128 text

10 years on: DOTs and SODOR | The cost of tangled software | Accelerate metrics + TT for flow | Success with fast flow 135

Slide 129

Slide 129 text

Ten years after SODOR and DevOps Topologies, what is holding back lower-performing organizations from improving their IT delivery performance? 136

Slide 130

Slide 130 text

organizing for flow feels alien to many 👾 137

Slide 131

Slide 131 text

138 limited mindset shifts

Slide 132

Slide 132 text

CxO concerns ignored 139

Slide 133

Slide 133 text

‘silver bullet’ re-org 140

Slide 134

Slide 134 text

“structure” will fix it 141

Slide 135

Slide 135 text

tight coupling in time 142

Slide 136

Slide 136 text

limited diffuse learning 143

Slide 137

Slide 137 text

limited psych safety 144

Slide 138

Slide 138 text

145 success factors

Slide 139

Slide 139 text

146 0 - Address the dynamic interactions between teams and groups, not just static structure

Slide 140

Slide 140 text

147 1 - Organizing for fast flow helps us to remove inter-team dependencies, improving financial efficiency and time-to-value

Slide 141

Slide 141 text

148 2 - Evolving at speed requires a set of core principles and practices, with people trained up and engaged

Slide 142

Slide 142 text

149 3 - Unblock, not coordinate.

Slide 143

Slide 143 text

150 4 - Multiple, loosely-coupled flows of value, with significant automation & helper tooling.

Slide 144

Slide 144 text

151 5 - Team Cognitive Load as a key design heuristic

Slide 145

Slide 145 text

152 6 - Long-lived value streams, not short-lived projects

Slide 146

Slide 146 text

153 7 - Strong technical and social practices as a foundation

Slide 147

Slide 147 text

Almost all decisions need to use the twin ‘lenses’ of fast flow and team cognitive load 👓 154

Slide 148

Slide 148 text

155 What’s your perspective? Which aspects resonate? Which things feel incorrect?

Slide 149

Slide 149 text

156 expert-led coaching and group learning for adopting fast flow and Team Topologies confluxhq.com

Slide 150

Slide 150 text

thank you confluxhq.com Copyright (c) 2017-2024 Conflux group of companies. All Rights Reserved. The name “Conflux” and the filled C device are Registered Trademarks ® in multiple jurisdictions.