I recently worked for 5 years at a large financial institution, in the technical architecture/dev tooling team, where I spent a lot of time solving their software delivery problems. I already spoke about this experience at the past devopsdays in London in March: it consisted of implementing a release management solution to better visualize all the dependencies between dev teams and ops teams, and between the dev teams themselves. The solution was tactical in the sense that it considered the existing organizational structure as a given. Today I will talk about the more strategic options that become available when we are allowed to adapt the organizational structure to our needs.

But first let me give you some background on the whole context. When I started there about 6 years ago, there was a lightweight software development process that most teams did not even follow. It was not very disciplined, more of a hero culture. But it was a very people-friendly atmosphere where most people knew each other personally and everyone was co-located (200 people in IT). Things got done, although maybe not in the most efficient way. The environment grew organically.

In general, management was a bit frustrated about the lack of documentation: all information was in people’s heads. They saw this as a problem for outsourcing, for example. Business users were quite spoiled and most of the time got what they requested. It was not very clear what the priorities were, so most of the time the loudest voices were served first. It was quite easy to start new projects, but it also happened that projects were abandoned because the end users didn’t really like the result. This was considered a waste of money, something we should avoid.

During the 5 years I worked there (which covered two big mergers) I saw this mostly organic environment gradually change into one with a more hierarchical organization chart and more heavyweight processes, with more focus on planning and up-front thinking.
The thinking behind it was that we needed to put more effort into planning, estimation and control, and become more “mature” in building software, in order to get more value out of our IT investments. Software delivery also gradually got more complex, and more and more control was added to the process. It started with one person coordinating the deployments for the most central team. He then moved into an enterprise release manager role, which later became mandatory for all teams, and gradually more templates and reviews were added. The complexity was tackled in a top-down way.
Slide diagram: Database, Storage, Server, OS, App Support, Scheduling, Monitoring, Network, Firewall, Service Desk, Release Management, Dev tools.

Technical dependencies: frameworks, app servers, middleware, databases, servers, scheduling, monitoring, networks, firewalls, ... All of these are managed by different teams in ops/infra, spread over the world, sometimes with different teams per environment, and with lots of changes to team structures and processes. The result: unreliable dependencies and difficult communication between devs and ops, because they live in different worlds.
Slide diagram: apps, dev teams 1 to 6, and the change requests between them.

The solution was to bring release management under control by making it consistent and automating it where possible, in order to make the dependencies more visible. But after doing some research I realized that this was only a tactical solution. The dependencies are indeed visible now, but they are still present. If one team misses its deadline, it drags a whole chain of dependent teams with it.

There are other, more fundamental solutions to this problem of dependencies, and maybe we can simply avoid these painful orchestrated releases. To understand these solutions we must take a look at those who seem to have solved the problem of software delivery: the modern companies.
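The chain effect in an orchestrated release, where one late team delays every team that waits on it, can be made concrete with a small sketch. The team names and the dependency map below are hypothetical and only serve as an illustration; they are not taken from the actual release management solution.

```python
from collections import deque

# Hypothetical release dependencies: each team waits on the teams it lists.
DEPENDS_ON = {
    "team2": ["team1"],
    "team3": ["team2"],
    "team4": ["team2"],
    "team5": [],
    "team6": ["team5"],
}


def delayed_by(late_team, depends_on):
    """Return every team whose release slips when late_team misses its deadline."""
    # Invert the map so we can look up who is waiting on a given team.
    waiting_on = {}
    for team, deps in depends_on.items():
        for dep in deps:
            waiting_on.setdefault(dep, []).append(team)

    # Breadth-first walk over the chain of waiting teams.
    delayed = set()
    queue = deque([late_team])
    while queue:
        current = queue.popleft()
        for waiter in waiting_on.get(current, []):
            if waiter not in delayed:
                delayed.add(waiter)
                queue.append(waiter)
    return delayed
```

In this made-up map, a slip by team1 reaches team2 directly and team3 and team4 through team2, even though those two never talk to team1. Making the map visible does not shorten that chain.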
Spotify, GitHub, Facebook, Netflix, Etsy, ... The biggest example, because it is also a traditional enterprise, is GDS (Government Digital Service), which builds gov.uk, the sites of the UK government. It is still very early, but the results that are out look very promising.

So why are they able to deliver new features of their services all the time without any disruption, and at the same time scale at an exponential rate? All of this on top of a complex infrastructure, and very securely. Where exactly do they differ from the traditional enterprises?
fact that the reality consists of complex technology and the unpredictable nature of human behavior and rapid rate of change. For these reasons it is impossible to predict the required behavior of a large system. Instead it must be built gradually, using a trial-and-error approach with a very fast feedback loop between the design and the reality. The people close to the reality must be involved as soon as possible in the decisions on the design. The organization structure must allow for this need. As we will see this is the key to solving many of the problems with dependencies. Not taking into account the need for this feedback loop in the organization structure will make the dependencies much harder to solve.
Slide diagram: Dev team 1, Dev team 2, Dev team 3, and Ops: Middleware, Database, Storage, Server, OS, App Support, Scheduling, Monitoring, Network, Firewall, Service Desk, Release Management, Dev tools.

These modern companies all have cross-functional teams. The departments and teams in traditional enterprises, on the other hand, are divided into functional silos, split up by technology.

Bad reasoning:
1) The assumption is that the process that leads to a working system is known in advance and can therefore be split up into stages, with each team responsible for one stage. Once its work is finished, a team passes its results on to the next team. In this approach there is no need for a feedback loop, because the assumption is that the reality is fully known up front.
2) Because the reality is considered fixed, and to improve efficiency, the work is done in large batches, so the whole system is built in one go. Time-consuming hand-offs between the teams are fine because they happen only once per team.

Summarized: the focus is on optimizing resources instead of on speed of delivery and a quick feedback loop.

Following our fundamental assumption, in reality we need this feedback cycle to get a working system. So the “hand-off” communication channel is used a lot more than it was intended for, which leads to these unreliable dependencies. To avoid time-consuming and high-effort hand-offs between teams, all stages needed to build the system should be grouped into one team.

Example of time-consuming communication between teams: when setting up a new developer, it takes a month to open the firewall to his dev database.
Slide diagram: Database, Storage, Server, OS, Scheduling, Monitoring, Network, Firewall, Service Desk, Release Management, Technical services, Service team 2, Service team 3, Dev tools.

Solution: group all experience in one team and make that team build AND support the app. Pushing the technical dependencies inside one team makes them easier to digest, so the first problem of dependencies is mostly taken care of. Example: Amazon’s two-pizza teams.

Advantages:
- Focus on the end-to-end service for the client.
- Decisions that involve a trade-off between new features and operational stability can be made within the team.
- Cross-pollination: devs will be influenced by ops to add better logging, monitoring and deployability, and ops will be influenced by devs to automate their work.

Small technology-specific teams are still useful to build shared tooling and best practices around cross-cutting needs, but they should be advisory and opt-in for the teams, not imposed. More on this later.
The organization chart in a traditional organization is typically very hierarchical, with many levels of managers between upper management and the people on the work floor. Information (in the form of work assignments and the coordination that must happen between the different teams or individuals that do the work) flows mostly downward.

Bad reasoning: because of the complex environment we need a lot of coordination, so we need a lot of managers with small teams and many hierarchical layers.

Problems:

1) Long feedback loop
- Because of the large number of hubs the information has to pass through before it reaches the work floor, chances are high that it no longer reflects the original intent.
- The feedback flowing back up is very limited or gets changed along the way, so the requirements cannot be adapted to the reality.
- Management gets isolated from the reality: there is a gap between the place where the decision is taken and the reality.
- Politics and diplomacy can also blur the quality of the information flowing back up, “to please the boss”.
This can result in ivory tower management: decisions are made high up in the organization structure and imposed on the lower layers. The people on the work floor simply have to execute what they are told, even if the instructions don’t make any sense to them. The work floor becomes a very unattractive place to work.

2) Promotion path
Competent technical people are promoted by moving them up in the organization chart, either as manager or as architect, away from the action and into a role that they may be less comfortable with. As a result the roles of developer and ops person become very unsexy. This effect reinforces itself, with the strong and critical minds moving either up or away and the followers staying at the bottom of the hierarchy. These jobs are also often considered for outsourcing, which makes the functional division problem even bigger. So less and less feedback flows up. The closer you are to the reality, the less you have to say.
Consequences:
- New features are as much a product of user needs as of technological capabilities. In an environment where this technological knowledge doesn’t reach the decision makers there will only be conventional features, not the disruptive ones that take advantage of the reality, in this case the new technologies.
- There is a risk of having inexperienced and/or unmotivated teams that deliver poorly designed systems. Integration with other systems and backward compatibility are very hard to get right in the first place. This results in more dependencies and coupling than necessary.
Slide diagram: Service teams 1, 3, 4, 5, 6 and 7.

The solution is to remove most of these layers of management and to give power to the teams: to create self-organizing teams. Higher management should only give them direction and let them decide on the rest. They are closest to the reality, so they are best placed to make the decisions.

Advantages:
1) The feedback loop doesn’t have to travel so far each time.
2) Empowering people and giving them responsibility is the best way to unlock their true potential. They will be more passionate and more creative in their solutions, so we will be able to make better use of people’s capacities.
3) This empowerment will attract competent people. We need them first and foremost for the disruptive features they can build, but also to come up with solutions that are decoupled (which avoids dependencies) or backward compatible (keep multiple versions in the code, use feature flags, ...). This makes the functional dependencies less harmful.

People who were previously in middle management can become coordinators or go back to their previous roles as technical specialists. Also: they can solve the coordination problems more organically and ad hoc, based on the reality.

The new organization chart now looks more like a collection of small, independent mini-startups within the big organization.

By now we have decreased the impact of dependencies. We have decreased the technical dependencies and the communication problems between devs and ops by creating cross-functional teams. And we have reduced the functional dependencies to a minimum and made the remaining ones unidirectional by creating highly skilled self-organizing teams that can deliver decoupled software that also supports backward compatibility (using feature flags and supporting multiple versions of the API at the same time).
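The two backward-compatibility techniques named here, feature flags and several API versions served side by side, can be sketched as follows. This is a minimal illustration under assumptions: the flag name, the customer fields and the handler functions are all hypothetical, and a real service would read its flags from configuration instead of a module-level dict.

```python
# Hypothetical flag store; in practice this would come from configuration.
FEATURE_FLAGS = {"new_invoice_layout": False}


def get_customer_v1(customer_id):
    """Original response shape, kept alive for consumers that have not migrated."""
    return {"id": customer_id, "name": "Jane Doe"}


def get_customer_v2(customer_id):
    """New response shape: the name is split into two fields."""
    return {"id": customer_id, "first_name": "Jane", "last_name": "Doe"}


# Both versions stay deployed, so each consuming team upgrades on its own schedule.
HANDLERS = {"v1": get_customer_v1, "v2": get_customer_v2}


def handle(version, customer_id):
    """Dispatch a request to the handler for the API version the caller asked for."""
    return HANDLERS[version](customer_id)


def render_invoice(amount):
    """The new layout ships switched off and is enabled by the flag, not by a release."""
    if FEATURE_FLAGS["new_invoice_layout"]:
        return f"INVOICE | total: {amount:.2f}"
    return f"Total: {amount:.2f}"
```

With this shape the dependency runs in one direction only: a consumer of `v1` keeps working when `v2` appears, and the team that owns the service does not need a coordinated release to ship its change.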
Will we get a global optimization at the organization level with only this local optimization at the team level? Will this global optimization emerge automatically? In nature we see this emergence of higher-order properties a lot, but they needed millions of years of evolution to come into place. We don’t have that much time, so we need to put in some effort and creativity to influence the self-organizing teams so that they align to a level where they give us a similar higher-order benefit. This is the domain of enterprise architecture.
Striving for global optimization at the organizational level is the domain of enterprise architecture. This is done by putting pressure on the teams to increase their alignment.

There are two main reasons why the organization wants better alignment:
- to add value:
a) alignment of the business processes, and alignment of the apps to these business processes
b) alignment of the data: making sure that the same exchange rate is used by financial instruments, using the same customer id in all apps, using the same granularity, ...
Business processes with no holes and not too many overlaps, and data that can easily be integrated, give the customer a better end-to-end user experience. (Counterexamples: a customer who receives a separate invoice from each team, or has to deal with separate service desks, while what the user wants is to see all his relevant information in one integrated view.)
- to reduce costs:
a) standardization of the software development processes: this is, like line management, a way of coordinating the work
b) standardization of the technologies used
c) re-use of shared tools and/or code to avoid duplication of work
There is a trade-off between the direct business-specific needs and the more general organizational alignment needs. Putting too much effort into the business-specific needs will harm the organization, because it will not get its alignment. And putting too much effort into the alignment needs will make the teams miss out on interesting business opportunities. To make the right decision the teams need organizational awareness; the purpose is to reach the balance that brings the most value to the company. This approach will lead to a global optimization for the organization. Unfortunately this is very difficult to calculate, and there is a tendency to put more weight on the aspects that are easiest to measure.

Enterprises typically have a separate team for enterprise architecture, and sometimes it has the power to impose its rules on the teams. This approach suffers from the same problems as managers imposing their work: the decisions are taken far away from the action, and there is no feedback loop back to the EA team on the consequences of its decisions. Each situation is different, and choosing an architecture of full alignment “by design” will not bring us to this global optimization but will instead optimize for the local needs of the EA team.

Example: process alignment with templates that are so generic that only 20% applies to each project.
My own experience: build re-usable architectural building blocks based on the needs that all teams have in common. This creates a dependency on a different team (something we have tried to avoid as much as possible so far). But if the re-used building block is not considered a core feature by the team, and the support is decent (you make it the easiest way for them to get their need fulfilled), they will be happy to use it so they can concentrate on what really matters to them. Ideally the building block has extension points so the teams can build upon it if they have special needs. Alternatively, accept patches or forking.

This stuff should always be opt-out: if it doesn’t bring value in your specific situation, feel free to build your own solution. This also ensures that the architecture team stays customer friendly and stays away from the ivory tower. It requires organizational awareness from both the dev team and the architecture team, with a focus on getting the global optimization.

Having these architectural building blocks for common functionality also makes it faster for new projects to get up to speed. They can focus straight away on their business problems.

Example: build and deployment infra + problem of lack of review that turned out to be good.
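A building block with an extension point could look roughly like the sketch below. The class names, the step strings and the compliance scan are hypothetical and chosen only for illustration; they do not describe the actual build and deployment infrastructure mentioned above.

```python
class DeploymentPipeline:
    """Hypothetical shared building block offered by the architecture team."""

    def run(self, app):
        """Return the ordered steps for one deployment of the given app."""
        steps = [f"build {app}"]
        steps += self.extra_checks(app)  # extension point for team-specific needs
        steps.append(f"deploy {app}")
        return steps

    def extra_checks(self, app):
        """Extension point: teams may override this; the default adds nothing."""
        return []


class PaymentsPipeline(DeploymentPipeline):
    """A team with special needs extends the block instead of rebuilding it."""

    def extra_checks(self, app):
        return [f"run compliance scan on {app}"]
```

A team without special needs uses `DeploymentPipeline` unchanged, a team like the hypothetical payments team overrides one method, and a team for which the block brings no value remains free to ignore it and build its own.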
Erik Dörnenburg <> Jeanne Ross
www.infoq.com/presentations/Questions-for-an-Enterprise-Architect
http://www.youtube.com/watch?v=feI6_-v10Dk

They are both right! They just work in a different context. EA is where the needs of traditional enterprises and modern companies start to diverge. See the videos of Erik Dörnenburg (head of technology at Thoughtworks) and Jeanne Ross (Director of the Center for Information Systems Research at MIT).
Slide diagram: “The IT systems are the rock stars of the organization” <> “The IT systems must be team players to support the rock stars”. Axes: total value, autonomy.

Rock star teams vs team players:

Innovative IT systems are the rock stars of the organization. The future of the company depends on their success, so they should not be slowed down by any efforts to align, because we will always lose more than we gain.

Mature IT does not provide enough value to the organization on its own. Maybe these systems were rock stars in their prime, but since then their value has decreased, and now they remain only to support the new rock stars (either innovative IT or non-IT). The mature systems should be well aligned, to facilitate their use by the rock stars and because it reduces costs. Here individual freedom adds less value than alignment.
Slide diagram: layers of applications (new, 10 years old, 20 years old, 30 years old), labelled mature and innovative.

Traditional enterprises have more mature IT than modern ones. They may also have innovative IT, but less of it than mature IT. The system is built from layers of mature applications, which may have been innovative IT in their time. But as time passed their value decreased, and more and more other applications started to depend on them.
Slide diagram: hierarchical organization chart, self-organizing teams, focus on alignment ≠ focus on autonomy, mostly mature IT, mostly innovative IT.

Properties of mature apps:
- mature business domain, so:
  - no need to quickly adapt to changing business needs -> low pressure on frequent delivery
  - no need for exponential scalability in a matter of weeks
- more dependencies on them because they have existed longer, so the risk of changing them is higher; the effort to change the dependent apps is also higher when they are changed without backward compatibility -> high pressure on visibility of dependencies, high pressure to keep them stable, to decrease risk-taking, to stick with what you know
- deliver little value, so cost effectiveness is important -> high pressure to standardize