A DevOps journey at ABN AMRO @ DevOpsDays oNLine 2020

ABN AMRO recently embarked on its DevOps journey, recognising that it needs to stay on the path of continuous improvement. Market demands are high, and the bank continues to build knowledge and new insights on top of its previous transformations.

João Rosa

July 09, 2020

Transcript

  1. A DevOps journey at ABN AMRO

    João Rosa (Strategic Software Delivery Consultant) - Xebia
  2. Learnings

    - Provide CLEAR GUIDANCE for teams, with a minimum set of requirements
    - Not the program, but LINE ORGANISATION OWNS the transformation
    - Set clear MILESTONES to push delivery
    - Experiment, fail and ADAPT
    - CONTINUOUS DIALOGUE with 2nd & 3rd line parties
    - Mindful of culture and SUB-CULTURES – Everyone is unique!
    - Make it VISUAL and fun, to strive for TRANSPARENCY
  3. ABN AMRO Agile transformation done in 2017

    - A GRID is where blocks are grouped together within the same business area
    - A BLOCK is a small team that owns a certain part of functionality end-to-end
    - A CIRCLE is a group working within a special subject area or with unique skills
    - A TRIANGLE is a community of members with shared interests
  4. Operating model outlined for moving to Cloud & DevOps

    Key principles:

    - A DevOps block and its PO are responsible for all change & run tasks for the application(s) it manages
    - A DevOps block can independently release functionality into production
    - A DevOps block has an automate everything mindset
    - A DevOps block will use easily consumable standardized services (e.g., infra, security)
    - All team members in a DevOps block contribute to change (80% of their time) and application related run work
    - CADM & CISO set standards for enterprise & security architecture and ensure adherence from blocks (incl. signoff on pipeline and other automation)
    - The tower monitors & signals integrity and currency of app and infra landscape, with intervention mandate for major incidents
    - The helpdesk and bridge execute SOPs defined by the DevOps blocks or route incidents towards the DevOps blocks
    - Scarce expert resources join blocks temporarily in flow-to-work mode
    - The broker sets offering, pricing and SLAs for standardized services (public cloud, on-premise infra, security & CI/CD)

    Diagram labels: DevOps toolchain; DevOps blocks; Tower; Security & Architecture Standards; Shared services; Service catalogue with API consumable infra, platform, DevOps & security services (+95%); Tailor made (<5%); SOC; Helpdesk; Bridge; Business grids (distribution, product & enabling); RET; Infra & platform broker; Public cloud; Tools (e.g. monitoring); Security & compliance services; Specialty infra services (e.g. CMS, Mainframe); Security tools; Infra managed services; Infra components
  5. Good building blocks, but also still a lot of detailing to do for the IT Transformation powered by the Apollo program

    - Buy-in of the transformation on ExBo and ExCo level
    - Planning, planning, planning
    - How to cope with the change in a heavily outsourced environment
    - HR and restructuring and process design incl. risk control framework
    - Define our change management approach for teams
    - Engagement of ABN AMRO on all levels and dimensions and facilitate trainings
    - Build the strategic platform incl. engineering system
  6. Define what DevOps is at ABN AMRO

    DEVOPS AT ABN AMRO. DevOps is a way of working that emphasizes collaboration between business, software development and operations. DevOps extends the Agile principles by further streamlining and automating the product lifecycle and enabling cross-functional teams to take ownership of their product from an end-to-end perspective.

    Principles:

    - Keep learning
    - You build it, you run it, you fix it, you own it
    - Automate everything mindset
    - All team members contribute to change and run work
    - Create flow
    - Use easily consumable and standardized services
    - Team autonomy
    - Nail agile
    - Everything as code

    Lifecycle stages: Backlog management; Development; Test; Monitoring; Events; Change; Deploy & release; Incidents

    Lifecycle activities: analyse and prioritize work; build code; measure code quality; application monitoring; validation of acceptance; deploy code into production; solve incidents; event creation
  7. “Beau’s hell” – filtering and aligning capability requirements to a bare minimum but leaving room for inspiration

    Columns: Req_ID | Capability | Type | Requirements | Assessment | Comments | Reference Material | Control reference

    - CO_L1_001 | Level 1 | Control
      Requirements: The integrity of the configuration items used for my applications and services is guaranteed by a fully accurate and timely updated Configuration Management Database (CMDB) by my team in ServiceNow.
      Assessment: To be filled in
      Comments / reference material: CBSP reference information will be shared during CBSP QuickScan sessions. https://intranet.nl.eu.abnamro.com/nl/assets/108-48-20-IT-Configuration-Management-Policy-July2019_tcm582-1743557.pdf
      Control reference: C-00006187 - EC_ISO-04 Application inventory

    - CO_L1_002 | Level 1 | Control
      Requirements: Service recovery plans must be available for CIA rating Availability = 1 and for CIA rating Availability = 2 and must be updated at least once a year. Disaster Recovery tests are defined and scheduled for all our application(s) with CIA Availability = 1. Results are registered in DR Dashboard on connections. Disaster Recovery test is performed at least every 12 months for our applications with Recovery Time Objective (RTO) 0-1, and at least every 24 months for all our applications with RTO 2-4.
      Assessment: To be filled in
      Comments / reference material: CBSP reference information will be shared during CBSP QuickScan sessions. https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=2690c749-1ae7-403f-9a99-a32b6e59fe5e
      Control reference:
        C-00007725 - Service Recovery Plans are available
        C-00007726 - Quarterly the BCO verifies that the frequency of DR testing matches the frequency required by the BCM policy
        C-00007727 - Quarterly the IT-SCM SPoC monitors timely execution of planned DR and, if applicable, follow-up
        C-00007728 - Quarterly the Business Continuity Officer verifies recording of DR related issues, including recording of proper follow-up

    - CO_L1_003 | Level 1 | Control
      Requirements: All applications owned by my team are registered in the One Application Referential (OAR).
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=b627c98d-7e64-4ed1-8d38-2f8d002ec03a
      Control reference: C-00012952 - The application data in the Asset Inventory is complete

    - CO_L1_004 | Level 1 | Control
      Requirements: Block administration is up-to-date with all required information (see Guidelines on connections page). Please pay attention to:
        - all necessary information on what your block is supporting: owned OARs, block email, email addresses of team members, phone numbers.
        - correct administration of your team's DevOps roles, as this will define your team's rights in ServiceNow (Product Owner, Scrum Master, IT Engineer etc.)
        - update AGF with relevant team information.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=3b27a50e-d57c-46ee-9d91-8c5fd94e695a
      Control reference:
        C-00014834 - Quarterly I&A performs a check on the manual part of the JoMoLea process
        C-00015173 AWS - IAM setup and Monitoring

    - CO_L1_005 | Level 1 | Requirement
      Requirements: Service Administration in ServiceNow is up-to-date:
        - All relations of your applications are defined in ServiceNow (upstream & downstream relations) and understood by the entire team.
        - All stacks/resource groups/Configuration Items are tagged to the correct Business Application of your Business Service.
        - All end users are subscribed to the service, in order to be able to raise calls via the Self-serving portal (if applicable).
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=37ce103e-1701-4ab1-9954-a1a635967946
      Control reference: Not Applicable

    - CO_L1_006 | Level 1 | Requirement
      Requirements: Roles & responsibilities:
        - Process roles to handle Major and Complex Incidents including the communication via the prescribed channels are formally recognized, defined and assigned in the DevOps team.
        - Process roles to approve Root Cause Analysis documents and the underlying SIP actions are formally recognized, defined and assigned in the DevOps team.
        - Segregation of accountability and responsibility between the Product Owner, IT Lead and the DevOps team with regard to the execution of the Incident and Problem process is fully implemented.
        - Segregation of accountability and responsibility between the Product Owner, IT Lead and the DevOps team and between Dev-engineers & Ops-engineers with regard to the execution of the Change Management process is fully implemented (e.g. there is a single owner who is responsible for assessing Major and Emergency changes).
      Assessment: To be filled in
      Control reference: Not Applicable

    - CO_L1_007 | Level 1 | Control
      Requirements: Only the central IT service management tooling (ServiceNow) is used for core processes Incident, Problem, Change and Call management.
      Assessment: To be filled in
      Comments / reference material: https://ibmaabpr.service-now.com/u_published_documents_dms_revision_list.do?sysparm_userpref_module=6a46d7c04f385300feb3d19f0310c75d&sysparm_view=OPS%20Manual&sysparm_query=dms_type=ee5155444ffc9340a300d2ff0310c797^ORdms_type=d7649d804f30d340a300d2ff0310c7a9^ORdms_type=75e459c04f30d340a300d2ff0310c76c^ORdms_type=565595044f30d340a300d2ff0310c74c^ORdms_type=70f5d1c44f30d340a300d2ff0310c754^ORdms_type=4f36d5c44f30d340a300d2ff0310c7bd^EQ^GROUPBYdms_u_record^ORDERBYdms_type^ORDERBYrev_attachment&sysparm_clear_stack=true
      Control reference:
        C-00010849 Incidents are registered correctly
        C-00010877 Yearly the effectiveness of automated controls for change management in the Service Management Application is tested

    - CO_L1_008 | Level 1 | Control
      Requirements: On call duty for DevOps team members during and outside office hours is in place for owned critical applications and business services chain(s) with impact (CIA for Availability = 1).
      Assessment: To be filled in
      Comments / reference material: Will be worked out by Apollo program and published when available. Check this Apollo page: https://social.connect.abnamro.com/wikis/home?lang=nl#!/wiki/W7a3dfeeec2fa_4143_a0dc_1ac023f65e31/page/Organisational%20Design
      Control reference: C-xxxxxxxx - to be provided

    - CO_L1_009 | Level 1 | Control
      Requirements: Service Commitments (e.g. SLA) are defined for Availability for each application with Availability level 1 for team and vendor performance.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=b9928e38-76df-4863-8e69-3c66d72e370a
      Control reference: C-00014804 - IT Incident Management resolution times are met

    - CO_L1_010 | Level 1 | Control
      Requirements: For every off-premises application, the team has delivered an exit plan according to the existing ABN AMRO exit strategy. This plan fits the use and nature of the application concerned and is approved by the responsible DAO and BAO.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/wikis/home?lang=nl#!/wiki/W4a15ff48670e_4510_a692_e52743f8cd78/page/Set%20up%20Exit%20strategy
      Control reference: C-00015176 AWS - Secure disposal of data

    - CO_L2_011 | Level 2 | Requirement
      Requirements: When responsibilities have changed (e.g. due to higher maturity or changes in your team) block administration is updated.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=3b27a50e-d57c-46ee-9d91-8c5fd94e695a (also contributes to RCF control: C-00014834 - Quarterly I&A performs a check on the manual part of the JoMoLea process)
      Control reference: Not Applicable

    - CO_L2_012 | Level 2 | Requirement
      Requirements: The team knows where to find the change calendar and how to use it to speed up the Root Cause Analysis (e.g. technical analysis) process in case of disturbances.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=aa328a19-ed02-49be-801b-c2b6c39d3883
      Control reference: Not Applicable

    - CO_L2_013 | Level 2 | Requirement
      Requirements: To reduce the number and impact of future incidents, Problem Management is used by the team to identify the actual cause of one or more incidents through recurring incident analysis.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=1890cce5-6a88-4140-931d-192330bed0ad
      Control reference: Not Applicable

    - CO_L2_015 | Level 2 | Control
      Requirements: A status change of a Configuration Item stored in the ServiceNow configuration management database can only be done following the change management process.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=b325f6cb-c415-4849-80f4-1ab83ed7e255
      Control reference:
        C-00011327 LC_IT-04 Deltas of the reconciliation between the CMDB and the daily infrastructure scan are discussed and followed up
        C-00011328 LC_IT-04 Differences between changed CIs and registered CI changes in ServiceNow Blue are discussed with IBM and monitored
        C-00015869 LC_IT-04 Differences between changed CIs and registered CI changes in CMDB are discussed with DevOps teams

    - CO_L2_016 | Level 2 | Control
      Requirements: Retention/backup services are in place according to RTO - RPO requirements agreed with Business.
      Assessment: To be filled in
      Comments / reference material: CBSP reference information will be shared during CBSP QuickScan sessions. For AWS refer to: https://social.connect.abnamro.com/wikis/home?lang=en#!/wiki/Wbb310a1c98f8_4ed8_97fb_ed4d14b3a06d/page/Standards%20%26%20Guidelines For IBM environments, refer to TSM in your team.
      Control reference:
        C-00015172 - AWS - Backup and retention of data
        C-00015177 Azure - Secure disposal of data

    - CO_L2_017 | Level 2 | Requirement
      Requirements: Root cause analyses (RCAs) are drawn up on major incidents by all suppliers including Cloud and SaaS service providers.
      Assessment: To be filled in
      Control reference: Not Applicable

    - CO_L2_018 | Level 2 | Requirement
      Requirements: Knowledge articles in ServiceNow for user support are created and published.
      Assessment: To be filled in
      Comments / reference material: https://aabsiampr.service-now.com/myit?id=myit_kb_article&sys_id=10ef229ddb29d3480f4416d15b961983&knowledge_base=678ec474db9ddf80bd2c83305b961966
      Control reference: Not Applicable

    - CO_L2_019 | Level 2 | Requirement
      Requirements: Availability, incident and change handling is regularly discussed with stakeholders including Cloud or SaaS service providers.
      Assessment: To be filled in
      Control reference: Not Applicable

    - CO_L3_020 | Level 3 | Requirement
      Requirements: To sustain the required Business level of availability our team uses the Mean Time Between Failure (MTBF) indicator to make reliability improvements for components that have failed after a breakdown and to shorten maintenance and repair time.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/wikis/home?lang=nl#!/wiki/W894ba23ada96_4868_883b_d28d07865797/page/D2C%20-%20Detect%20to%20Correct%20-%20Value%20Stream
      Control reference: Not Applicable

    - CO_L3_021 | Level 3 | Requirement
      Requirements: An effective capacity management plan, including forecast, for all our used IT components is in place to deliver the highest quality service at the lowest possible cost.
      Assessment: To be filled in
      Comments / reference material: CBSP reference information will be shared during CBSP QuickScan sessions. https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=d6a3b434-f3a7-4df2-b6ee-f46d14809ed5&ftHelpTip=true
      Control reference: Not Applicable

    - CO_L3_022 | Level 3 | Requirement
      Requirements: Continual Service Improvement is embedded in the DevOps way of working, and improvement initiatives, derived from relevant measurements and KPIs, are recorded in ServiceNow while the actions themselves are put in the Backlog.
      Assessment: To be filled in
      Comments / reference material: https://social.connect.abnamro.com/communities/service/html/communitystart?communityUuid=692949e8-0718-40be-9385-d8b2306b4547
      Control reference: Not Applicable

    - CO_L3_023 | Level 3 | Requirement
      Requirements: The CMDB is automatically updated when changes occur in the IT Landscape.
      Assessment: To be filled in
      Comments / reference material: CBSP reference information will be shared during CBSP QuickScan sessions.
      Control reference: Not Applicable

    Callouts on the slide:

    - What is mandatory and what is an efficiency requirement?
    - What compliancy is referenced?
    - Where can you find detailed information?
    - What does my team need to do and how do I score myself on it?
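Requirement CO_L1_002 on the slide above states a concrete rule: a Disaster Recovery test at least every 12 months for applications with RTO 0-1, and at least every 24 months for RTO 2-4. As an illustration of how a team might self-check that rule, here is a minimal Python sketch. It is not ABN AMRO tooling: the function names, the representation of RTO classes as the integers 0 to 4, and the choice to count in calendar months are all assumptions made for this example.

```python
import calendar
from datetime import date

# Maximum interval between Disaster Recovery tests, in months, keyed by RTO class.
# The 12/24-month values come from requirement CO_L1_002; modelling RTO classes
# as the integers 0-4 is an assumption made for this sketch.
MAX_MONTHS_BETWEEN_DR_TESTS = {0: 12, 1: 12, 2: 24, 3: 24, 4: 24}


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the end of the target month, so 31 August plus
    six months lands on the last day of February.
    """
    month_index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def next_dr_test_due(rto_class: int, last_test: date) -> date:
    """Latest date by which the next DR test must have been performed."""
    if rto_class not in MAX_MONTHS_BETWEEN_DR_TESTS:
        raise ValueError(f"unknown RTO class: {rto_class}")
    return add_months(last_test, MAX_MONTHS_BETWEEN_DR_TESTS[rto_class])


def dr_test_overdue(rto_class: int, last_test: date, today: date) -> bool:
    """True when `today` is past the due date for the next DR test."""
    return today > next_dr_test_due(rto_class, last_test)
```

A team could call `dr_test_overdue(1, last_test, date.today())` for each application it owns and flag the ones that return `True`. Keeping the interval table separate from the date arithmetic means a policy change only touches `MAX_MONTHS_BETWEEN_DR_TESTS`.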
  8. Channel all the knowledge towards the teams onshore and offshore where required

    Diagram labels: SECURITY WIZARD CI/CD ENABLER PLATFORM SUPPORT MONITOR ENGINEERING SUPPORT RUN SUPPORT INTEGRATOR TRANSFORMATION FACILITATOR MIGRATION; Outcomes; Grid Journey

    - Planning in 2-month cycles
    - 6 CoEs provide experts
    - One uniform channel
    - Build-up support offshore

    Transformation support team ready
  9. Results for the grid

    - Cost optimization with a pragmatic approach to business-critical applications
    - Chain improvements, with a drop in calls (~x10) per week at the call center
    - Time-to-market reduction
    - Release anytime for mobile applications
    - Extreme automation for the software lifecycle
    - First teams to adopt the new risk and compliance processes
    - Cloud approval process
    - Dutch and European Central Banks verification tests
  10. Learnings

    - Provide CLEAR GUIDANCE for teams, with a minimum set of requirements
    - Not the program, but LINE ORGANISATION OWNS the transformation
    - Set clear MILESTONES to push delivery
    - Experiment, fail and ADAPT
    - CONTINUOUS DIALOGUE with 2nd & 3rd line parties
    - Mindful of culture and SUB-CULTURES – Everyone is unique!
    - Make it VISUAL and fun, strive for TRANSPARENCY