I have a background as a developer, technical lead and software architect in traditional enterprises. In my last assignment I was also responsible for the company-wide software delivery process, at least the development side, in a financial institution.
This is my first presentation to a public audience, so it feels like my personal release into the wild. Because I want to be a good student of DevOps, I have added some monitoring. Let us have a look at the results at the end of the presentation!
Today I will talk about my experience in improving the traditional software delivery process of this financial institution. First I will describe how the IT department was organized. Then I will take a closer look at the biggest problems that existed. And finally I will explain the first steps we took to solve them. But first of all, let me give you a bit of the theory behind my view.
Business seems to have conflicting requirements regarding software delivery:
(click) they want their features implemented as fast as possible, and for this they put pressure on the development teams,
and (click) they want to keep the existing environment as reliable as possible, and for this they put pressure on the operations teams.
But this is not possible with the current software delivery process: if you want the features implemented fast, this will have an impact on the stability of the environment; if you want to keep a reliable environment, you will have to slow down the changes.
(click) The solution is to redefine the software delivery process, to DevOps-ify it, so that it can support both requirements.
Let us have a look at the three levels that make up the software delivery process:
The process itself: it should be as simple and consistent as possible to get the job done, and it should be clear to everyone.
The tools that implement it: the process must be automated where possible, especially where things must happen fast or in high volume.
The culture, i.e. the people that use the process and the tools: it is well known that people resist change, and therefore this level is the most difficult and time-consuming one. But it is also the most rewarding one: if you can convince people of the value that the change brings, you will likely succeed; if not, even the best processes and tools won't get you anywhere.
Let us now apply the theory to the problems that existed in the software delivery process of the traditional enterprise I worked for. But first let me give you some insight into how the IT department was organized. Let me start with the area of Strategy and Architecture. The general rule was "buy before build": first look on the market for a commercial application, and only if nothing suitable exists build it in-house. This rule had a big impact on the whole organization of the IT department.
The result was a heterogeneous environment with many different technologies, modern and not so modern. Although there were some rules these applications had to comply with, each application had its own technology stack, mode of installation, way of configuration, way of testing, and so on.
The consequence was that a lot of focus had to be put on the integration of these applications, to make sure they provide a consistent service to the business. This was done with development tools specialized in ETL and EAI, and a central data hub was created between the applications to simplify the integration problems.
The IT department was driven by "top-down" process frameworks like TOGAF, CMMI and ITIL. It was less aware of the more modern, lightweight, bottom-up processes like agile and DevOps.
In the area of development and operations the teams were quite heterogeneous: young people and older people, passionate and less passionate, experienced people and newcomers. Generally accepted practices like unit testing and logging were certainly not used by all teams, let alone more advanced practices like continuous integration, feature flags, and so on.
There were company-wide monthly (and for the core applications quarterly) releases, with cold deployments during the weekend.
There was a lot of manual work involved in the software delivery process.
There was a change management tool, shared version control tools, and build and deployment scripts for the most used development technologies, but all the other activities were manual. Configuration management, like knowing which versions of the software components belonged together and which versions were installed in which environment, had to be tracked manually. The developers filled in their deployment requests in Word templates and mailed them to the release coordinator, who validated them and added them to the release plan. On the day of deployment he sent them to the ops teams for execution. Testing was also a manual process that happened at the end of the cycle, in a shared environment using a full copy of the production data.
One important thing that you may have noticed is that traditional enterprises are very different from the modern, younger companies like Flickr, Amazon and Facebook: those companies have lightweight bottom-up processes, one or a few decoupled products, modern technologies, a more homogeneous community and a higher frequency of releases. In modern companies automation is built in from the ground up.
And because of this difference, it seems obvious that applying DevOps to a traditional company cannot be done in the same way the modern companies do it.
In traditional companies we have to take the existing structure into account, gradually add the modern extensions to it and integrate them with what is already there. We should also be very cautious, starting with the biggest and easiest problems and each time verifying that what we are trying to do actually makes sense in our context. That is how we evolve to a better situation.
Making changes to the software delivery process is a huge undertaking because it covers many people and many departments, especially as a lot of activities are still very manual.
With the dev and ops teams gradually growing over time and the increasing complexity of the technologies, release coordination started taking more and more time. This in turn put more and more pressure on the deployment window and on the downstream processes.
Because the deployment instructions had to be written in free text, they were very error-prone. First of all they lacked standardization: every team had a different way of requesting their deployments. Sometimes they were too vague for the ops team to understand, or steps were missing, or there was a small typo... All of this caused a lot of friction between the teams, loss of time and a higher risk of creating snowflake servers.
Configuration management was too vague: there was no clear list of all business applications, and it was difficult to know which components belonged to which application. The information had to be tracked manually, and people regularly made mistakes or simply forgot to update it. This caused a lack of confidence in the information.
Testing happened at the end of the process, so any delays that occurred upstream immediately shortened the period that was foreseen for testing. Testing was manual, so even with a full testing period it was impossible to do a full regression test. For these reasons it happened regularly that change requests could not be signed off by the testers and therefore had to be removed from the release. Because of all the integrations, a lot of changes depended on one another, so this usually caused a chain reaction. Worst of all: after the impacted changes were removed, there was no time left to retest the remaining environment.
Referring back to the theory of needing a fast and reliable environment: in the short term there was a higher need for a more reliable environment than for delivering the changes faster, so that is what we had to focus on. And it was very clear to me that the solution should start with bringing configuration management under control. It is the core of the system, a lot of building blocks were already available, and with a small effort a big improvement could be made.
There were three levels of configuration items: business applications, software components and the source code, of which the changes had to be versioned. First, a list of business applications was created and shared between Enterprise Architecture and the Service Desk, with the teams responsible for each application clearly listed. Change requests and deployment requests could only refer to one application. Then each application was linked, statically or in a slowly changing way, to the components it was made up of. And finally there was the source code, which contained all the files necessary to build a component.
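To make these three levels a bit more concrete, here is a minimal sketch of how such configuration items could be modeled. The class, field and repository names are my own illustration, not the actual tool we used.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class SoftwareComponent:
    # A deployable unit; its source code lives in one version-controlled repository.
    name: str
    source_repository: str                              # location of the source code
    versions: list[str] = field(default_factory=list)   # built, deployable versions

@dataclass
class BusinessApplication:
    # The level that is shared between Enterprise Architecture and the Service Desk.
    name: str
    responsible_team: str
    components: list[SoftwareComponent] = field(default_factory=list)

# Change requests and deployment requests refer to exactly one application.
payments = BusinessApplication(
    name="Payments",
    responsible_team="Team Payments",
    components=[
        SoftwareComponent("payments-backend", "svn://vcs/payments/backend"),
        SoftwareComponent("payments-web", "svn://vcs/payments/web"),
    ],
)
```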
A configuration management tool was created to store the information about the configuration items, and a software repository was created to store the deployable files that are related to the software components.
In fact this was one tool that implemented both pieces. It was integrated with the build automation and received the built files (exe's, dll's, config files, etc.) as well as the relevant metadata: who executed the build, the commit messages (including a reference to the change requests) and the file diffs.
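As a sketch of what this integration could look like from the build side, assuming the repository exposes a simple HTTP API (the endpoint and field names below are hypothetical):

```python
import json
import requests  # assumes the software repository exposes an HTTP API (illustration only)

def publish_build(component, version, artifact_paths, built_by, commit_messages):
    """Push the built files and their metadata to the software repository (hypothetical endpoint)."""
    metadata = {
        "component": component,        # e.g. "payments-backend"
        "version": version,            # e.g. "1.4.2"
        "built_by": built_by,          # who executed the build
        "commits": commit_messages,    # each message should reference a change request, e.g. "CR-1234: ..."
    }
    files = [("artifact", open(path, "rb")) for path in artifact_paths]
    try:
        response = requests.post(
            "https://softwarerepo.example.com/api/builds",  # hypothetical URL
            data={"metadata": json.dumps(metadata)},
            files=files,
        )
        response.raise_for_status()
    finally:
        for _, handle in files:
            handle.close()
```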
The next step was the implementation of a release management tool that was acquired on the market (it is now part of BMC and is called RPM: Release Process Manager). This tool helped the developers with the creation of their deployment requests, and the release coordinator with the planning and coordination of them.
The tool was integrated with the configuration management tool to get the available components, and with the change management tool to get the change requests.
To create a deployment request, the developer selects an application and a release. A list of the applicable change requests is automatically shown, including their statuses, which made it easier for the release coordinator to validate the deployment requests. For each deployment of a component, the developer appends a step and selects the applicable version number, and the relevant deployment-related information is shown automatically. The tool was also able to show the versions of a component by environment, both current and historical.
The configuration management tool received the deployed versions from the release management tool, and this allowed it to provide overviews like these. It was also possible to do automated consistency checks, like giving a warning when a certain change request was not implemented by any component, or when a component implemented change requests for multiple releases.
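Such consistency checks can be as simple as a few queries over the configuration data. A minimal sketch, assuming the data shapes described in the docstrings (they are my own illustration):

```python
def unimplemented_change_requests(planned_crs, component_versions):
    """Return change requests that are not implemented by any component version.

    planned_crs:        set of change request ids planned for the release, e.g. {"CR-101", "CR-102"}
    component_versions: dict mapping a component name to a list of (version, set of CR ids) tuples
    """
    implemented = set()
    for versions in component_versions.values():
        for _version, crs in versions:
            implemented |= crs
    return planned_crs - implemented

def components_spanning_releases(component_versions, cr_to_release):
    """Warn when one component version implements change requests of multiple releases."""
    warnings = []
    for component, versions in component_versions.items():
        for version, crs in versions:
            releases = {cr_to_release[cr] for cr in crs if cr in cr_to_release}
            if len(releases) > 1:
                warnings.append((component, version, releases))
    return warnings
```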
There was no fully automated deployment of the components yet: an ops person still had to execute the command of the deployment script. The reason for this was simply that the security requirements for the release management tool were way stricter if it needed a connection to all production servers, so this was postponed to a later moment.
With test automation we move from the relatively simple problems to the more complex ones. But the advantage here is that we can start small, focusing on the most complex and volatile areas, and gradually extend the test coverage to the rest of the environment. Ideally this effort should also include the gradual automation of environment creation and server provisioning. But the code base is huge, and there are many business applications and even development languages that have limited or no support for automated testing, so it will be a long and costly journey.
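Starting small could, for example, mean a handful of smoke tests against one volatile application, run automatically after each deployment. A sketch of what that could look like (the URLs and endpoints are hypothetical):

```python
import requests  # hypothetical smoke tests for one application, runnable with pytest

BASE_URL = "https://test.example.com/payments"  # test environment of one volatile application

def test_application_is_up():
    # The most basic regression check: the application still answers after a deployment.
    response = requests.get(f"{BASE_URL}/health", timeout=10)
    assert response.status_code == 200

def test_core_integration_flow():
    # One business-critical flow that passes through the central data hub.
    response = requests.get(f"{BASE_URL}/accounts/12345/balance", timeout=10)
    assert response.status_code == 200
    assert "balance" in response.json()
```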
As the software delivery process gets more and more controlled and automated, I expect the release frequency to rise, and two-week releases should definitely become possible. It may also become interesting to allow applications to be released on their own schedule, as long as they are not too tightly integrated with other applications.
Will we ever go as far as the modern companies in a traditional enterprise? I'm not so sure, because this would require changing the core of the enterprise and the culture of the people. Maybe this is something that has to be built in from the start, and adding it as an afterthought would take too much effort for what it delivers.