
The Rise, the ruin and the rescue

Developers and architects spend most of their time adding features to their products (so-called maintenance) and are often annoyed about the many deficits of these systems: even supposedly simple things are becoming incredibly difficult with these legacy systems, and the time to market is getting worse and worse as business calls for more and more features. It’s rare to find time to reduce technical debt and clean up increasingly messy dependencies.

Gernot Starke explores ways to systematically escape legacy hell and reduce technical and other debt in your systems. You’ll learn strategic and tactical improvement approaches you can scale to fit your actual situation. Gernot conducts a breadth-first search for existing problems, issues, and risks within your system, identifies technical, organizational, and communicative debts, and determines their severity in order to concentrate on the worst of them. You’ll start to improve the situation using a number of strategic approaches: brain size, where you systematically simplify, reduce, and migrate toward self-contained systems or microservices; change by split and change by extraction to reduce dependencies; restructure to domain, where you improve domain focus and incrementally introduce domain-driven design practices in legacy systems; and improved modularization. Gernot explains each of the approaches based on a (not very hypothetical) large-scale ecommerce system. You’ll hear about the rise, decline, and rescue of that system.

Dr. Gernot Starke

November 07, 2019

Transcript

  1. The Rise, the Ruin & the Rescue. Improving a large

    eCommerce system Dr. Gernot Starke
  2. Similarities … • to existing companies are intended. • Everything

    (except 3 slides) has really happened... • All persons and companies mentioned here are fictitious © Dr. Gernot Starke.
  3. about the company. Internationally operating sales company, administration located in

    Colorado (USA), headquarters in Berlin (Germany). Bundles the expertise of over 500 specialists across the entire sales and technical spectrum: from efficient procurement and partner support to the design and introduction of forward-looking configuration and sales processes to customer-oriented sales and service. With more than 1,900,000 customers worldwide, 900,000 of them in Europe, SAMM Inc. is positioned as international and innovative while remaining traditional and focused on sustainable customer and partner orientation.
  4. 1992 1999 2004 2009 2015 Atlas Software GmbH & Co.

    KG Dr. Blue & Partner Red Inc. Hodor KG Engineering WebDev Inc. (London + Munich) Red Holding + Red Europa SAMM International Yellow Finance Inc Subsidiaries in Hungary & Pakistan Grey Inc 2019
  5. How big should your shelf be? Width: 60 cm, Height: 60 cm, Depth: 25 cm, Shelves: 1.

    next… Dimensions, Material, Colors, Order, Advice. http://samm24.com/de/shop/furniture/config
  6. 1992 1997 2003 2008 2013 „Atlas 1“ (Host) „Atlas 2“

    (AS/400, Cobol) Backoffice catalogue (Java + Host) eGov (Python) Web-Catalogue (Java) Pricemaster (Smalltalk) ComSuite (Java & Python) SAMM Sales (Java & Co) VENOM (Java & Co) Campaigner (Java, PHP) various competitors „WaWi“ (Host, Cobol)
  7. Example: 1992 1997 2003 2008 2013 „Atlas 1“ (Host) „Atlas

    2“ (AS/400, Cobol) Backoffice catalogue (Java + Host) eGov (Python) Web-Catalogue (Java) Pricemaster (Smalltalk) ComSuite (Java & Python) SAMM Sales (Java & Co) VENOM (Java & Co) Campaigner (Java, PHP) various competitors „WaWi“ (Host, Cobol)
  8. Integration project / disaster • Fixed end date, fixed budget •

    Requirements only roughly defined • Management only pursues project goals „Atlas 2“ (AS/400, Cobol) SAMM Sales (Java & Co)
  9. … Quick‘n Dirty … Wrapper-Code for two-way transformations. „Atlas 2“

    (AS/400, Cobol) SAMM Sales (Java & Co) Atlas2SAMM Mapping SAMM2Atlas Mapping Atlas2SAMM Glue SAMM2Atlas Glue
  10. Technical Debt… „Atlas 2“ (AS/400, Cobol) SAMM Sales (Java &

    Co) Atlas2SAMM Mapping SAMM2Atlas Mapping Atlas2SAMM Glue SAMM2Atlas Glue
  11. Donald. • CTO • Formerly: M&A Org chart: Board of Directors (Vorstand),

    CEO Dr. Taler. Departments: Finance & Controlling (FC), Delivery (DY), Research (RS), Information Technology (IT), External Relations (ER). Units: Supply, Sales, HR, Internal Systems, Customer Systems, Accounting, Marketing, Partner Relations, Innovation, Market Research. Site labels: BER, DUR. Headcounts: 5 2 4 4 35 5 15 4 2 5 6
  12. The situation. • strong decline in revenue from private customers

    • Time-to-Market unacceptable • so far no market entry in mobile internet • Hardly any innovation in products • Unused portfolio of patents
  13. Donald's conclusion: Current system makes innovation impossible: • Code

    generally too poor • Clear know-how deficits in VENOM team
  14. Big Bang ... Requirements specification Data migration

    Architecture + Development Plan: approx. 30 people, 3+ Scrum teams individual employees
  15. 100 Days • Highly promising prototypes • Ramp-Up delayed •

    so far only 6 (planned: 16) employees • Workaround: more external consultants
  16. 6 months • VENOM: Strong decline in sales and earnings

    • Complex clarification of central requirements delays features in new system • byzz.io: first employees quit to find better jobs
  17. 12 months. Management decides: • New system goes live

    (prematurely) • Data migration declared „finished“
  18. Hours later • New system shows disastrous errors in operation

    • byzz.io Administration massively overwhelmed Results lead to: • Rollback to previous system (VENOM) • Rollback of data migration
  19. Breaking News. Donald H. becomes Minister of Technology Donald H.,

    long-time CIO of SAMM Inc., will take over the leadership of the new Ministry of Technology and Digitization at the beginning of the year. The spokeswoman of SAMM Inc. praised the good cooperation and the successful implementation of even critical projects – with Donald H., SAMM Inc. lost an extraordinary leader. Fake news #1
  20. aim42... • identifies real problems and pains in systems

    • helps prioritize („cost of pain“) • suggests remedies
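One possible reading of prioritizing by „cost of pain“ is to compare what a problem costs while it stays unfixed with what its remedy would cost, and to tackle the fastest payback first. The sketch below is an illustration only, not aim42's own tooling: the `Problem` class, the payback formula, and all numbers (in person-days) are assumptions, and only the problem names are borrowed from the stakeholder list in this deck.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Problem:
    name: str
    pain_per_month: float  # assumed recurring cost of living with the problem
    remedy_cost: float     # assumed one-off cost of fixing it

    def payback_months(self) -> float:
        """Months until the remedy has paid for itself."""
        if self.pain_per_month <= 0:
            raise ValueError("pain_per_month must be positive")
        return self.remedy_cost / self.pain_per_month


def prioritize(problems: List[Problem]) -> List[Problem]:
    """Order problems so that the fastest-paying remedies come first."""
    return sorted(problems, key=lambda p: p.payback_months())


# Hypothetical estimates in person-days, for illustration only.
backlog = [
    Problem("Releases require manual work", pain_per_month=8.0, remedy_cost=40.0),
    Problem("Build process slow and unreliable", pain_per_month=10.0, remedy_cost=20.0),
    Problem("Wrong data in archive", pain_per_month=3.0, remedy_cost=60.0),
]

for problem in prioritize(backlog):
    print(f"{problem.payback_months():5.1f} months  {problem.name}")
```

Ranking by payback time is only one option; a team could just as well rank by absolute pain or by risk. The point is that both estimates are written down and compared.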
  21. Cathy, Developer • „DRAG“ – too many dependencies • Build

    and deploy: horror • Increasing extra-work-hours • Lack of overview • Hardly any innovation
  22. Michael, UI-Developer • Small changes take longer and longer to finish

    • Increasing number of (self-introduced) bugs • Too much technical debt
  23. Stakeholders know many problems. Wrong data in archive Releases require

    manual work Releases depend on certain people Failing communication with marketing department Build process slow and unreliable Too many dependencies between subsystems
  24. Analyze: Architecture ... • Internal structure • Code: • Implementation

    • Metrics • Concepts Stakeholder Architecture Context Quality Data Processes breadth search
  25. Static (Code) Analysis • Different languages complicate analysis... • Conventional

    metrics fail! Sales Frontend Private + Corporate Configurator Shell Client Contracts UDS (User Data Service) Order & Fulfillment Vouchers & Rebates eGov Shop Sales & Contracts Archive External Partners Price Management Data Warehouse Marketing & Sales Campaigns Atlas Customs & Logistics Pricing Engine Sales Backend Private+Corp Hodor Optical Archive Post-Sales Services Security Extensions Sales Backend eGovernment Legend: Java PHP Python C/C++ Haskell Cobol PL/SQL Flash HTML/JS
  26. Analyze: Tactical Tornado. UDS (User Data Service) Order & Fulfillment

    Vouchers & Rebates Sales & Contracts Archive Price Management Marketing & Sales Campaigns Atlas Customs & Logistics Pricing Engine Sales Backend Private+Corp Sales Backend eGovernment Haskell, complicated domain logic Middleware (Message-Queue)
  27. ... in price calculation. UDS (User Data Service) Order &

    Fulfillment Vouchers & Rebates Sales & Contracts Archive Price Management Marketing & Sales Campaigns Atlas Customs & Logistics Pricing Engine Sales Backend Private+Corp Sales Backend eGovernment Haskell, complicated domain logic Middleware (Message-Queue) Pricing calculations (partially) implemented within middleware
  28. Level-1: Component Disorder Sales Frontend Private + Corporate Configurator Shell

    Client Contracts UDS (User Data Service) Order & Fulfillment Vouchers & Rebates eGov Shop Sales & Contracts Archive External Partners Price Management Data Warehouse Marketing & Sales Campaigns Atlas Customs & Logistics Pricing Engine Sales Backend Private+Corp Hodor Optical Archive Post-Sales Services Security Extensions Sales Backend eGovernment Legend: Java PHP Python C/C++ Haskell Cobol PL/SQL Flash HTML/JS
  29. Level-1: Component Disorder „Heisenbugs“ in price calculation Haskell developer rarely

    present Overly complicated interaction between validation and calculation Highly volatile external interfaces NGO, pharmacy & government pricing overly dependent Self-made middleware
  30. Analyze - Concepts ... Configurator Shell Meta Configurator Configuration Data

    Acquisition Legend: Java Prolog Python Drools Flash HTML/JS Configuration Expert Configuration Validator Specific Configurator Sales Frontend Voucher generate use (compile-time) Pricing Engine User Mgmt Sales Backend use (runtime) Client Contracts breadth search Stakeholder Architecture Context Quality Data Processes Fake news #2
  31. Analyze: Quality ... How well are quality requirements achieved? •

    Performance • Flexibility • Security • etc... e.g. ATAM Software Product Quality Functional Suitability Reliability Performance efficiency Operability Security Compatibility Maintainability Transferability Appropriateness Accuracy Compliance Availability Fault tolerance Recoverability Compliance Time-behaviour Resource-utilisation Compliance Appropriateness Recognisability Learnability Ease-of-use Helpfulness Attractiveness Technical accessibility Compliance Confidentiality Integrity Non-repudiation Accountability Authenticity Compliance Replaceability Co-existence Interoperability Compliance Modularity Reusability Analyzability Changeability Modification stability Testability Compliance Portability Adaptability Installability Compliance breadth search Stakeholder Architecture Context Quality Data Processes
  32. Analyze: Quality … Prio / Attribute / Scenario: (1) Performance: configurator displays

    potential add-ons in < 5 sec; (1) Operability: configurator (private customers) usable on iOS & Android; (2) Performance: price calculation finished in < 10 sec; (2) Changeability: new product category live in < 30 days; ... <etc>. Architectural approach: on-demand loading of add-ons; multiple data sources for add-ons, only serialized queries possible; add-on queries partially hindered by wrong data in optical archive; configuration implemented in Flash; parts of pricing data need to be retrieved from optical archive; <and much more>. Findings: add-on calculation takes 10-180 sec due to time-consuming queries; Adobe Flash not usable on iOS and Android; data required for pricing located in >4 different data sources.
  33. Analyze: Data ... • Structure • Data model • Content

    • Distribution / Replication • Security / Privacy Stakeholder Architecture Context Quality Data Processes breadth search
  34. Data model horror • Massive performance issues • Data model

    (based upon former AS/400 DB2) • 5 tables, >400 columns (!!) each • Massively (!) coupled ... (500) ... (400) ... (400) ... (300) ... (400) Poorly performing and complicated data model Row count exceeds standard DB capabilities (re-load required every 6 months)
  35. Analyze: Processes ... Requirements / Business Analysis • Development • Test •

    Rollout • Operations Stakeholder Architecture Context Quality Data Processes breadth search
  36. Organizational Disorder • 40% Development Inhouse, • 30% (external) Contractors

    • 30% Near-/Offshore with external support Heterogeneous: • Contracting • Development and release processes • Environments for sources, build, test + deploy Opaque decision processes Lack of consistency within source code Need for coordination across teams & companies Chaos
  37. Requirements chaos: Conflicts between business departments Requirements chaos: Always changing

    priorities Management optimizing „projects“ instead of product
  38. Analysis identified problems… • Flash configurator too costly to maintain • Wrong

    data in the archive • Tight coupling of the Pricing Engine • Releases take too long • Releases require many manual interventions • Too much manual testing • Handover to operations depends on individual people • Haskell developer too rarely present • Actual prices depend on too many parameters • Know-how bottlenecks in development and operations • Purchasing & product design completely non-agile • Scrum in development collides with planning in business departments • Product data spread across two sales backends • Insufficient quality (performance, robustness, availability, security) of sales and contract data • „Heisenbugs“ in price calculation • „Complicated“ interaction between validation and price determination • Expert knowledge about configuration spread across Prolog and Drools • Flash as a security risk • Too many data sources • Pricing interface(s) highly volatile • Optical archive contains wrong data breadth search
  39. Conclusion: • Many parts are „good enough“ • Overall structure

    („big picture“) really bad • Code quality partially bad • High competence in development teams • Data migration extremely risky -> postpone
  40. Analysis identifies problems & solution ideas. Problems: • Flash configurator too costly to

    maintain • Wrong data in the archive • Tight coupling of the Pricing Engine • Releases take too long • Releases require many manual interventions • Too much manual testing • Handover to operations depends on individual people • Haskell developer too rarely present • Actual prices depend on too many parameters • Know-how bottlenecks in development and operations • Purchasing & product design completely non-agile • Scrum in development collides with planning in business departments • Product data spread across two sales backends • Insufficient quality (performance, robustness, availability, security) of sales and contract data • „Heisenbugs“ in price calculation • „Complicated“ interaction between validation and price determination • Expert knowledge about configuration spread across Prolog and Drools • Flash as a security risk • Too many data sources • Pricing interface(s) highly volatile • Optical archive contains wrong data. Solution ideas: • Pricing engine into one single component • Introduce PO (product owner) as role • Factor out SCS for NPOs • Create executable specifications/tests with Cucumber
  41. Systematic Modernisation. • Keep treasures: continuous restructuring • Rapid removal

    of critical hot spots • Transformation to SCS („High-Level Modularization“) • Transition to agile organization
  42. Integrate improvements into day-to-day development... day-to-day development time tactical improvement

    long-term improvement tactical improvement tactical improvement tactical improvement
  43. Agile Practices. gently but stringently introduce „agile“ in: • Development

    • IT-Operations • Sales • Organization • ... ✓
  44. Further organizational change. • Specification-by-Example • Fully automated acceptance tests

    for „domains“: • Pharmacy • Government / Embassies • Private customers • NGO/NPO
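Specification by example can be sketched as a table of concrete cases that domain experts can read and that a test runner executes unchanged. The sketch below is a minimal illustration in plain Python; the deck itself names Cucumber as the tool. The discount rates, the `net_price` function, and the example rows are invented for illustration and are not the talk's actual pricing rules; only the customer domains come from the slide.

```python
from decimal import Decimal
from typing import List

# Hypothetical discount per customer domain (assumption, not from the talk).
DISCOUNT_BY_DOMAIN = {
    "private": Decimal("0.00"),
    "pharmacy": Decimal("0.05"),
    "government": Decimal("0.10"),
    "ngo": Decimal("0.15"),
}


def net_price(list_price: Decimal, domain: str) -> Decimal:
    """Apply the domain discount and round to cents."""
    discount = DISCOUNT_BY_DOMAIN[domain]
    return (list_price * (1 - discount)).quantize(Decimal("0.01"))


# The specification: concrete examples, one row per case.
EXAMPLES = [
    # (domain, list price, expected net price)
    ("private", "100.00", "100.00"),
    ("pharmacy", "100.00", "95.00"),
    ("government", "200.00", "180.00"),
    ("ngo", "200.00", "170.00"),
]


def check_examples() -> List[str]:
    """Run every example and return human-readable failures (empty if all pass)."""
    failures = []
    for domain, list_price, expected in EXAMPLES:
        actual = net_price(Decimal(list_price), domain)
        if actual != Decimal(expected):
            failures.append(f"{domain}: expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    failures = check_examples()
    print("all examples pass" if not failures else "\n".join(failures))
```

Keeping the examples as data rather than as code makes it cheap to add a row per domain and to review the table with the business departments.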
  45. Architecture-Modernisation (1). Change-by-Extract. Client Flawed (incohesive) System Client „other“

    other features 2 Client Flawed System Client „other“ other features 1 3 Better other features Client (reduced) Flawed System Client „other“
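One way to read the change-by-extract diagram in code: a feature is pulled out of the flawed system into a better component, while clients keep calling the same interface. The sketch below is an assumption-laden illustration; the `Facade` in front of both systems is an addition that the slide does not show, and all class and method names are hypothetical.

```python
from typing import Optional


class FlawedSystem:
    """Stand-in for the legacy system, which mixes unrelated features."""

    def place_order(self, item: str) -> str:
        return f"legacy order for {item}"

    def generate_voucher(self, customer: str) -> str:
        return f"legacy voucher for {customer}"


class VoucherService:
    """The extracted feature, rebuilt as a separate component."""

    def generate_voucher(self, customer: str) -> str:
        return f"new voucher for {customer}"


class Facade:
    """Keeps the client-facing interface stable while features move out."""

    def __init__(self, legacy: FlawedSystem,
                 vouchers: Optional[VoucherService] = None) -> None:
        self._legacy = legacy
        self._vouchers = vouchers

    def place_order(self, item: str) -> str:
        # Not extracted: still served by the (reduced) legacy system.
        return self._legacy.place_order(item)

    def generate_voucher(self, customer: str) -> str:
        # Once the extracted component is wired in, it takes over.
        if self._vouchers is not None:
            return self._vouchers.generate_voucher(customer)
        return self._legacy.generate_voucher(customer)


before = Facade(FlawedSystem())
after = Facade(FlawedSystem(), VoucherService())
print(before.generate_voucher("customer-1"))
print(after.generate_voucher("customer-1"))
print(after.place_order("shelf"))
```

After the extraction the legacy voucher code is dead and can be deleted, which is what shrinks the flawed system.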
  46. Architecture-Modernisation (2). Change-by-Split. Step 1: Client Type 1, Client Type 2, Flawed System.

    Step 2: Client Type 1, Reduced to Type 1; Client Type 2, Reduced to Type 2. Step 3: Client Type 1, New Type 1 System; Client Type 2, New Type 2 System.
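The split can be pictured as routing by client type: every type that has been split off goes to its own smaller system, and everything else still reaches the old one. This is a minimal sketch under assumptions; the `Router` class and the handler functions are hypothetical, and the client type names are borrowed from the VENOM split slides.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def flawed_system(request: str) -> str:
    """Step 1: one system serves every client type."""
    return f"flawed system handled {request}"


def ngo_system(request: str) -> str:
    """Step 3: a new, smaller system for a single client type."""
    return f"NGO system handled {request}"


class Router:
    """Sends each client type to its own system, falling back to the old one."""

    def __init__(self, fallback: Handler) -> None:
        self._fallback = fallback
        self._by_client_type: Dict[str, Handler] = {}

    def split_off(self, client_type: str, handler: Handler) -> None:
        """Register a dedicated system for one client type."""
        self._by_client_type[client_type] = handler

    def handle(self, client_type: str, request: str) -> str:
        handler = self._by_client_type.get(client_type, self._fallback)
        return handler(request)


router = Router(fallback=flawed_system)
router.split_off("ngo", ngo_system)
print(router.handle("ngo", "order-1"))
print(router.handle("private", "order-2"))
```

Splitting one client type at a time keeps the change incremental: the fallback shrinks in importance with every `split_off` call until it can be retired.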
  47. Venom-Splits (1). Venom, Split-1 NGOs, User Groups Venom, Split-2 Private

    User Corporate Users Government Users Operations Internal Users Operations Internal Users
  48. Venom-Splits (2): NGO Specifics. NGOs, User Groups Operations Internal Users can

    be dropped Security Extensions Sales Frontend Private + Corporate Configurator Shell Client Contracts UDS (User Data Service) Order & Fulfillment Vouchers & Rebates eGov Shop Sales & Contracts Archive External Partners Price Management Data Warehouse Marketing & Sales Campaigns Atlas Customs & Logistics Pricing Engine Sales Backend Private+Corp Hodor Optical Archive Post-Sales Services Sales Backend eGovernment
  49. Venom-Splits (3): Commons. Estimated reduction: from 2 million LOC (VENOM)

    down to <200 kLOC in NGO-Split common NGOs, User Groups Operations Internal Users Sales Frontend Client Data NGO User Management Inventory Sales Order & Contracts Archive Pricing Engine Sales Backend Security Extensions DB Commons Client Contract Common Price Mgmt Commons Common User Data reduced
  50. 3 months later... Retrospective for NGO Split: • time-to-market: 5

    days (instead of >30 in VENOM) • Production bugs reduced to <2/week (formerly >10) • Developer happiness: ++ • Inter-team coordination required more effort • Scrum of Scrums for Common Services
  51. 9 months later... Re-Architecture Pricing-Engine (Java replaces Haskell) • JBoss

    Drools with modular rule sets • Merge Price-Management with Pricing-Engine
  52. 18 months later... • Stable increase of business revenue and

    turnover • MVP „VENOM2Go“ very successful • (nearly) zero employee attrition
  53. From the press. Apple brings in expert for software modernization

    Aaron Schwarz, until recently responsible for the internal VENOM system of SAMM Inc., disclosed his immediate move to Apple Inc. in Cupertino, California. Schwarz helped SAMM Inc. out of a severe crisis within only 18 months by re-architecting the (huge) internal e-commerce system. The VENOM system is now regarded as a model for successful modernization strategies in e-business. Schwarz writes that the VENOM teams are now mature enough and no longer need his ongoing support, so he could move on to new endeavours. At Apple, Schwarz will be responsible for the software architecture of the Apple iBank, based upon various legacy systems of some international banks Apple recently acquired. Fake news #3