Slide 1

Slide 1 text

Systematic Software Improvement Dr. Gernot Starke

Slide 2

Slide 2 text

Promise

Slide 3

Slide 3 text

Reality

Slide 4

Slide 4 text

Thesis: Education focuses on „build-from-scratch“ development of software systems

Slide 5

Slide 5 text

Thesis: Business requires more maintenance competence

Slide 6

Slide 6 text

Thesis: Improvement of systems is more than refactoring of single classes

Slide 7

Slide 7 text

Thesis: Improvement is more than refactoring. „Large“ restructurings (often) mean: • restructuring the DB structure, data migration • replacing software infrastructure (e.g. frameworks) • extensive changes to internal processes • massive changes to internal interfaces

Slide 8

Slide 8 text

Thesis: Management responsible for budget ignores architecture principles

Slide 9

Slide 9 text

Architecture Improvement Method

Slide 10

Slide 10 text

analyze evaluate improve

Slide 11

Slide 11 text

analyze / evaluate / improve. Analyze: • architecture • code • runtime • organization

Slide 12

Slide 12 text

analyze / evaluate / improve. Evaluate: determine the „value“ of problems / risks / issues and their remedies

Slide 13

Slide 13 text

analyze / evaluate / improve. Improve: • define improvement strategy • refactor • re-architect • re-organize • remove debt

Slide 14

Slide 14 text

Common Wording (analyze / evaluate / improve, crosscutting practices & principles)
Terms: Issue (Problem), Improvement (Remedy), Cost, Risk, Cause
• Issue: has a cost (cost of issue); solve issue with improvement(s)
• Improvement: solves issue(s); has a cost (cost of improvement); has risks or consequences; resolves causes
• Cause: (root) causes of issues
• Risk: might result in an issue; has a (potential) cost of risk

Slide 15

Slide 15 text

Groundwork (1): analyze, evaluate, improve, plus crosscutting practices & principles. Collect… Iterate!

Slide 16

Slide 16 text

Groundwork (2): analyze, evaluate, improve, plus crosscutting practices & principles. Collect issues! Collect improvements! Issues and improvements map m:n.

Slide 17

Slide 17 text

Groundwork (3): analyze, evaluate, improve, plus crosscutting practices & principles
• Issue List: collect issues, keep explicit list or table
• Improvement Backlog: collect opportunities for improvement, keep explicit list or table
• m:n mapping between issues and improvements
• Explicit Assumption: helps understand
Relation label: create from
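
The m:n mapping between issue list and improvement backlog can be kept in a very small data structure. The following is only a sketch: the class names, the keys (I1, M1, ...) and the mapping itself are assumptions made for illustration; the issue texts are borrowed from the case studies later in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """One entry of the issue list."""
    key: str
    description: str


@dataclass
class Improvement:
    """One entry of the improvement backlog."""
    key: str
    description: str
    solves: set[str] = field(default_factory=set)  # keys of the issues it addresses


def improvements_for(issue_key: str, backlog: list[Improvement]) -> list[str]:
    """Return the keys of all improvements that address the given issue."""
    return [imp.key for imp in backlog if issue_key in imp.solves]


def unaddressed(issues: list[Issue], backlog: list[Improvement]) -> list[str]:
    """Return the keys of issues that no improvement addresses yet."""
    covered = set().union(*(imp.solves for imp in backlog))
    return [issue.key for issue in issues if issue.key not in covered]


# Hypothetical example data: one improvement may solve several issues,
# and one issue may be solved by several improvements (m:n).
issues = [
    Issue("I1", "time-to-market too long"),
    Issue("I2", "low code quality"),
    Issue("I3", "scattered knowledge"),
]
backlog = [
    Improvement("M1", "introduce automated tests", {"I1", "I2"}),
    Improvement("M2", "refactoring plan", {"I2"}),
]
```

Calling `unaddressed(issues, backlog)` lists the issues that still lack any improvement, which is one simple way to keep the trace from solution back to problem explicit.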

Slide 18

Slide 18 text

Groundwork (4): analyze, evaluate, improve, plus crosscutting practices & principles
• Issue List: collect issues, keep explicit list or table
• Improvement Backlog: collect opportunities for improvement, keep explicit list or table
• m:n mapping between issues and improvements
• Impact Analysis: change has impact, might create new problems
• Expect Denial: stakeholders deny problems
• Explicit Assumption: helps understand
• Fail Fast
• Fast Feedback
• Separate Cause From Effect: root cause analysis
• Slide or Write: presentation or written report
• Traceability: keep trace to problem, solution to what problem(s), traces help prove your points
Legend: fundamental, Artifact. Relation label: create from

Slide 19

Slide 19 text

Expect Denial

Slide 20

Slide 20 text

„Analysis“ Overview (diagram; phase: analyze)
Practices: Qualitative Analysis, Context Analysis, Stakeholder Analysis, Stakeholder Interview, Quantitative Analysis, Development Process Analysis
Relation labels: prepares, validates external stakeholder, finds risks and non-risks, gives overview, part of, find input for
Legend: fundamental, crosscutting, collect issues, collect improvement opportunities

Slide 21

Slide 21 text

„Analysis“ Details (diagram; phase: analyze)
Practices: Qualitative Analysis, ATAM, Context Analysis, Issue Tracker Analysis, Data Analysis, Documentation Analysis, Runtime Analysis, Stakeholder Analysis, Stakeholder Interview, Requirements Analysis, Quantitative Analysis, Questionnaire, Software Archeology, Static Code Analysis, Development Process Analysis, Infrastructure Analysis, Instrument System
Relation labels: prepares, foundation for, part of, validates external stakeholder, finds risks and non-risks, identify risk areas, gives overview, supported by, measure at runtime, measure code, supports, find input for, provide better information
Legend: fundamental, crosscutting, collect issues, collect improvement opportunities

Slide 22

Slide 22 text

„Analysis“ Overview (diagram as on slide 20): Talk to the right people!

Slide 23

Slide 23 text

„Analysis“ Overview (diagram as on slide 20): Understand the neighbourhood!

Slide 24

Slide 24 text

Context Example

Slide 25

Slide 25 text

Context Exercise > Create a context diagram of your SuD > Briefly describe the external interfaces > Analyse risks

Slide 26

Slide 26 text

„Analysis“ Overview (diagram as on slide 20): Systemic issues with the organization?

Slide 27

Slide 27 text

„Analysis“ Overview (diagram as on slide 20): Quality issues?

Slide 28

Slide 28 text

Qualitative Analysis
Preparation
• Identify the relevant stakeholders
Kickoff
• Present the ATAM method
• Present the business objectives and architecture goals
• Present the architecture of the system
Evaluation
• Explain the architecture approaches in detail
• Create a quality tree and scenarios
• Analyze architecture approaches with respect to the scenarios
Follow-up
• Present the results

Slide 29

Slide 29 text

Qualitative Analysis: Software Product Quality Attributes (ISO 25010)
• Functional Suitability: Appropriateness, Accuracy, Compliance
• Reliability: Availability, Fault tolerance, Recoverability, Compliance
• Performance efficiency: Time-behaviour, Resource-utilisation, Compliance
• Operability: Appropriateness recognisability, Learnability, Ease-of-use, Helpfulness, Attractiveness, Technical accessibility, Compliance
• Security: Confidentiality, Integrity, Non-repudiation, Accountability, Authenticity, Compliance
• Compatibility: Replaceability, Co-existence, Interoperability, Compliance
• Maintainability: Modularity, Reusability, Analyzability, Changeability, Modification stability, Testability, Compliance
• Transferability: Portability, Adaptability, Installability, Compliance

Slide 30

Slide 30 text

„Analysis“ Overview (diagram as on slide 20): Measure!

Slide 31

Slide 31 text

Stakeholder Analysis: Identify the right people!
Candidate stakeholders: top-management, business-management, project-management, product-management, process-management, client, subject-matter-expert, business-experts, business-development, enterprise-architect, IT-strategy, lead-architect, developer, tester, qa-representative, configuration-manager, release-manager, maintenance-team, external service provider, hardware-designer, rollout-manager, infrastructure-planner, infrastructure-provider, IT-administrator, DB-administrator, system-administrator, security- or safety-representative, end-user, hotline, service-technician, scrum-master, product-owner, business-controller, marketing, related-projects, public or government agency, authorities, standard-bodies, external service- or interface providers, industry- or business associations, trade-groups, competitors
Table columns: Role / Name | Description | Intention | Contribution | Contact

Slide 32

Slide 32 text

Stakeholder Analysis (II): who MIGHT have problems or know things...
• use (pre-interview) questionnaire
• conduct personal interviews, e.g. what are your top-3 issues with...
1. the system
2. the development / maintenance process
3. operation / infrastructure of the system
4. ...

Slide 33

Slide 33 text

Stakeholder Exercise > Identify stakeholder groups for your system > What's their contribution to analysis?

Slide 34

Slide 34 text

Stakeholder Exercise-2 > Select 2 important stakeholder groups > For each of those, sketch 2-3 specific questions

Slide 35

Slide 35 text

analyze evaluate improve Static Code Analysis (here: SonarQube dashboard / Apache PDFbox)

Slide 36

Slide 36 text

analyze evaluate improve Static Code Analysis (here: afferent coupling)
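
Afferent coupling (Ca) counts how many other modules depend on a given module. The sketch below computes it from a plain dependency map; the module names and the map itself are hypothetical, and real tools such as the dashboard shown on the slide derive the dependencies from the code base.

```python
from collections import defaultdict


def afferent_coupling(dependencies: dict[str, set[str]]) -> dict[str, int]:
    """Count, for each module, how many other modules depend on it (Ca)."""
    incoming: dict[str, set[str]] = defaultdict(set)
    for source, targets in dependencies.items():
        incoming[source]  # make sure modules without incoming edges appear too
        for target in targets:
            if target != source:  # ignore self-references
                incoming[target].add(source)
    return {module: len(sources) for module, sources in incoming.items()}


# Hypothetical dependency map: module -> modules it uses.
deps = {
    "ui": {"service", "util"},
    "service": {"persistence", "util"},
    "persistence": {"util"},
    "util": set(),
}
coupling = afferent_coupling(deps)
```

In this made-up example `util` has the highest afferent coupling, so a change to it potentially affects every other module.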

Slide 37

Slide 37 text

Perishable Food Packaging > Embedded software + information systems > Regulated domain -> safety critical > Goal: Decrease SW development cost

Slide 38

Slide 38 text

Food: Analysis > Stakeholder analysis and -interviews > Development Process Analysis > Qualitative Analysis + View-Based-Understanding > Quantitative Analysis, Static Code Analysis > Central problem areas: > Lack of overview („knowledge islands“) > Low code quality > ad-hoc development: No systematic processes

Slide 39

Slide 39 text

Food: Root-Cause Analysis > Company focus primarily on hardware > Software development scattered in various departments > No (planned) software architecture

Slide 40

Slide 40 text

Food: Analysis (excerpt)
issue (problem) | description | problem-cost
time-to-market | > 6 months (!) from business or government requirement to production | sales loss might be > 1M$
production log data loss | architecture does not ensure complete production logs - data records might get lost! Large volumes of perishable food could be at risk | > 10-100k $ per incident
scattered knowledge + low code quality | no synergy effects, no conceptual integrity, no re-use between departments, ... | >5-50k $ per maintenance update
self-developed OR-mapper | expensive maintenance, high know-how requirements, high deviation in performance | 5-10k $ per maintenance update

Slide 41

Slide 41 text

Food: System Overview > C# / .NET as development & production platform Machine Operational Support Sales Support Database Machine Sensors Message Queue Legend: COTS C# Data Storage & Reporting Machine Configuration Frontend Machine Configuration Backend

Slide 42

Slide 42 text

Food: Safety Risk. Wrong usage of Message Queue: > Steps 1-3 have to be transactional > Reporting „commits“ to MQ after step 2! (too early!) > Problem in reporting leads to lost data! (Diagram: system overview from the previous slide, with steps 1, 2 and 3 marked.)
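
The risk above can be illustrated with a toy consumer that acknowledges a message only after all processing steps have succeeded. This is a sketch under assumptions: the slide does not say what steps 1 to 3 are, so "receive, store, report" is a guess, and `InMemoryQueue` is a hypothetical stand-in, not the API of any real message queue product.

```python
from collections import deque


class InMemoryQueue:
    """Toy stand-in for a message queue with explicit acknowledgement."""

    def __init__(self, messages):
        self._pending = deque(messages)

    def peek(self):
        """Return the next message without removing it, or None if empty."""
        return self._pending[0] if self._pending else None

    def ack(self):
        """Remove the message that was last returned by peek()."""
        self._pending.popleft()

    def __len__(self):
        return len(self._pending)


def process_next(queue, store, report):
    """Handle one message; acknowledge it only after every step succeeded."""
    message = queue.peek()      # step 1 (assumed): receive
    if message is None:
        return False
    try:
        store(message)          # step 2 (assumed): persist the record
        report(message)         # step 3 (assumed): reporting
    except Exception:
        return False            # no ack: the message stays queued for a retry
    queue.ack()                 # "commit" to the queue only after step 3
    return True


def failing_report(message):
    raise RuntimeError("reporting unavailable")


stored = set()                  # a set keeps the repeated store step idempotent
queue = InMemoryQueue(["record-1"])
first_try = process_next(queue, stored.add, failing_report)
second_try = process_next(queue, stored.add, lambda message: None)
```

Because a failed attempt is retried from the start, the store step runs twice here; this only works if storing is idempotent, which the sketch models with a set.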

Slide 43

Slide 43 text

EU Telecom Provider > Business Intelligence Portfolio to support Marketing & Sales

Slide 44

Slide 44 text

Telco: Analysis > View-Based-Understanding > Data Analysis > (few) stakeholder interviews > Central problem areas: > BI Reporting highly fragmented & diverse > Report implementation details driven by business experts (provided data models + SQL query details as „requirements“) > Implementation partially based upon proprietary meta-model

Slide 45

Slide 45 text

Telco: Analysis (excerpt)
problem / risk | description | problem-cost
high development cost | business benchmarks showed development to be overly expensive (and slow) | per report-type 50-200%
non-transparent software and data architecture | of >50 developers and BI experts, only very few understood whole DWH | (not stated)
vendor-lock-in | proprietary tools implemented to process (proprietary) meta-model, high yearly license cost | 50 k€ license fee / yr, O(1000) dev-hrs wasted
developer exodus | core developers upset as company announced large outsourcing deal, (nearly) annihilating internal development | 6-18 months without new business features

Slide 46

Slide 46 text

Croc: Sales & ERP Provider > Niche provider for sales & ERP „standard“ solution > Origin in „perishable“ market - but growing > 80% of clients: low-margin-high-volume > 20% of clients: low-volume-very-high-margin > Original idea: Universal-Core + Configuration > Starting point: low (dev + runtime) performance Company name changed due to anonymity requirements!

Slide 47

Slide 47 text

Croc: Analysis > Brief stakeholder analysis and -interviews > Static Code Analysis > Runtime Analysis > Data Analysis (including data model) > Central problem areas: > Excellent code quality („clean code“) - but very few unit tests > Extremely high configurability of everything > >150 developers with extremely different opinions

Slide 48

Slide 48 text

Croc: Analysis (2) „Configuration is the sequel to programming, with unsuitable means“ > Configuring UI structure, UI behavior, workflows, business and validation rules, reports and interfaces > Horrible persistent data structures for both runtime and configuration data > Some configuration stored in various XML formats

Slide 49

Slide 49 text

Croc: Analysis (3) > Few key tables with 500-700 columns (!!) each. > Stores complete application state - including cursor position. (Diagram labels: „Clean“ Code, XML Configuration, DB, Table-1, Table-2, Table-3, Table-4, Database, Relational Data. Legend: COTS, Code.)

Slide 50

Slide 50 text

„Analysis“ Details (diagram repeated from slide 21)

Slide 51

Slide 51 text

Collect Issues > For your system, collect a few issues > from various sources: requirements, architecture, implementation/code, operations, development-process. Table columns: issue (problem) | description | ...

Slide 52

Slide 52 text

Collect Improvements > For your issues, collect improvement opportunities. Table columns: issue (problem) | improvement

Slide 53

Slide 53 text

„Evaluate“ Overview (diagram)
Practices: Estimate Issue Cost, Estimate Improvement Cost, Estimate in Interval, Estimate Feature Value, Explicit Assumption
Artifacts: issue list, improvement backlog
Relation labels: create from, analyse impact, requires, based upon
Legend: fundamental, crosscutting

Slide 54

Slide 54 text

„Evaluate“ Overview (diagram as on slide 53): Map problems to „business“ terms!

Slide 55

Slide 55 text

„Evaluate“ Concepts Estimate Assumption Unit (Measure) Parameter Observation Probability based upon how certain? Correction Factors can observe require / allow Interval time, money, etc Subject what do we estimate? tackle uncertainty by based upon

Slide 56

Slide 56 text

Rail Transport Provider > Heterogeneous IT landscape > Problem areas: > 6-12 months from initial business requirement to production („time-to-market“) > Stability, reliability > Performance

Slide 57

Slide 57 text

Rail - aim42 Analysis > Stakeholder Analysis + -Interviews > yielded several problems + problem-areas > Issue Tracker Analysis + Software Archeology > Qualitative (ATAM-like) Analysis > Static Code Analysis > Development Process Analysis

Slide 58

Slide 58 text

Rail (1): Overview Ticket Sales Frontend Cash Management Client Personalization Client Data / Contract User Management Rail Itinerary Vouchers Rebate and Reduction Cards Inter-European Connections (HAFAS) External Partners Booking Office Ticket Price Management Data Warehouse Marketing & Sales Campaigns Travel Agents API & UI Pricing Engine Ticket Sales Backend Legend: Java PHP Python C/C++ Web Server Extensions Pricing Data Store Haskell Cobol Security Extensions PL/ SQL bad!

Slide 59

Slide 59 text

Rail (2): Challenges > Embrace new sales channels (mobile) > requires (much) higher availability > Marketing demands rapid price adjustments

Slide 60

Slide 60 text

Rail (4): Analysis (excerpt)
issue (problem) | description | problem-cost
time-to-market | 6-12 months (!) from business requirement to production | (not stated)
configuration of certain ticket types crashes backend | when either end-users or sales-clerks configure specific ticket-types (groups > 5 persons, more than one rebate reason, border crossing or >2 train changes), several backend processes crash | (not stated)
know-how drain in development | many dissatisfied developers and business experts leave (development) organization, migration from internal to external development, fix-price projects | (not stated)

Slide 61

Slide 61 text

Rail (5): Evaluation (excerpt)
Chart: Cost Distribution for Software. Categories: Requirements, Design / Architecture, (initial) Programming, Integration, Maintenance. Values: 7%, 6%, 12%, 8%, 67%.
What's the (additional) cost of „heterogeneity“?
1. Explicit assumptions
• Heterogeneity „costs“ in all phases
• Phase effort is known
http://courses.cs.vt.edu/~csonline/SE/Lessons/LifeCycle/

Slide 62

Slide 62 text

Rail (6)... Collected tasks in which additional effort might occur. http://courses.cs.vt.edu/~csonline/SE/Lessons/LifeCycle/

Slide 63

Slide 63 text

Estimate Issue Cost > For your issues, estimate cost intervals > determine appropriate parameters & influences > explicit assumptions. Table columns: issue (problem) | description | cost (in interval)
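
An estimate in intervals with explicit assumptions might be recorded as follows. This is a sketch only: the `Interval` and `Estimate` types are made up for this example, the 5-50k $ per maintenance update figure is taken from the Food case study above, and the number of updates per year is a purely hypothetical assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [low, high] of non-negative values."""
    low: float
    high: float

    def __post_init__(self):
        if not 0 <= self.low <= self.high:
            raise ValueError("expected 0 <= low <= high")

    def times(self, other: "Interval") -> "Interval":
        """Multiply two non-negative intervals bound by bound."""
        return Interval(self.low * other.low, self.high * other.high)


@dataclass(frozen=True)
class Estimate:
    """What is estimated, in which unit, and based upon which assumptions."""
    subject: str
    unit: str
    value: Interval
    assumptions: tuple[str, ...]


cost_per_update = Interval(5_000, 50_000)   # from the Food analysis excerpt
updates_per_year = Interval(4, 6)           # hypothetical parameter

yearly_cost = Estimate(
    subject="scattered knowledge + low code quality",
    unit="$ per year",
    value=cost_per_update.times(updates_per_year),
    assumptions=(
        "each maintenance update costs 5-50k $ extra",
        "4 to 6 maintenance updates per year (hypothetical)",
    ),
)
```

Keeping the assumptions next to the numbers makes it easy to revise the estimate when an assumption turns out to be wrong.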

Slide 64

Slide 64 text

„Improve“ Overview

Slide 65

Slide 65 text

„Improve“ Practices > Anticorruption Layer > Assertions > Automated-Tests > Branch-For-Improvement > Extract-Reusable-Component > Front-End-Switch > Group-Improvement-Actions > Handle-If-Else-Chains > Improve-Code-Layout > Improve Logging > Interface Segregation Principle > Introduce Boy Scout Rule > Introduce-Layering > Isolate-Changes > Keep-Data-Toss-Code > Manage Complex Client Dependencies With Facade > Measure-Everything > Never-Change-Running-System > Never-Rewrite-Running-System > Quality-Driven-Software-Architecture > Refactoring > Refactoring-Plan > Remove-Nested-Control-Structures > Sample-For-Improvement > Schedule-Work > Untangle-Code > Use Invariants To Kill Zombies

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

http://paulhammant.com/2013/07/14/legacy-application-strangulation-case-studies/ Strangulation
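
The idea behind strangulation (and the Front-End-Switch practice listed earlier) is to route requests for already migrated functionality to the new system while everything else still reaches the legacy system. A minimal sketch, in which the path prefixes and both handlers are hypothetical:

```python
from typing import Callable, Iterable

Handler = Callable[[str], str]


def make_router(migrated_prefixes: Iterable[str],
                legacy_handler: Handler,
                new_handler: Handler) -> Handler:
    """Build a switch that sends migrated paths to the new system."""
    prefixes = tuple(migrated_prefixes)

    def route(path: str) -> str:
        if path.startswith(prefixes):   # str.startswith accepts a tuple of prefixes
            return new_handler(path)
        return legacy_handler(path)

    return route


# Hypothetical handlers standing in for the two systems.
route = make_router(
    migrated_prefixes=["/orders"],
    legacy_handler=lambda path: f"legacy:{path}",
    new_handler=lambda path: f"new:{path}",
)
```

Migrating another area then only means adding its prefix to the list; once the list covers everything, the legacy system can be retired.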

Slide 68

Slide 68 text

Propose Strategy > For your issues, propose an improvement strategy > e.g. what improvements to „cluster“ > major improvement/migration steps/phases > What additional information do you need?

Slide 69

Slide 69 text

Automated Tests > Risk: Changes break existing processes in production > Put this into numbers: > Which processes are impacted by the new feature's code changes? > Estimate the hourly cost of those processes failing in production > Estimate the probability of each process's failure
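
Putting the risk into numbers can be as simple as an expected-loss calculation. The sketch below multiplies hourly failure cost by failure probability per process; the process names, all figures and the assumed outage duration are hypothetical and only show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactedProcess:
    name: str
    hourly_failure_cost: float   # cost per hour while the process is down
    failure_probability: float   # chance that the change breaks it (0..1)


def expected_loss(processes: list[ImpactedProcess], outage_hours: float) -> float:
    """Sum of cost x probability x assumed outage duration over all processes."""
    return sum(
        p.hourly_failure_cost * p.failure_probability * outage_hours
        for p in processes
    )


# Hypothetical figures for two processes touched by a new feature.
impacted = [
    ImpactedProcess("order intake", hourly_failure_cost=2_000, failure_probability=0.05),
    ImpactedProcess("invoicing", hourly_failure_cost=500, failure_probability=0.10),
]
loss = expected_loss(impacted, outage_hours=4)
```

Comparing this number with the effort for writing the automated tests gives a business argument for (or against) the investment.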

Slide 70

Slide 70 text

Unit Tests > If you don’t have any, start with new features > Reproduce each bug as a unit test > Write small tests > Use self-explaining test case names
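
As an illustration of "reproduce each bug as a unit test" with small tests and self-explaining names, consider the sketch below. The function `parse_quantity` and the bug it once had are invented for this example.

```python
import unittest


def parse_quantity(text: str) -> int:
    """Parse a quantity field; blank input counts as zero."""
    stripped = text.strip()
    return int(stripped) if stripped else 0


class ParseQuantityTest(unittest.TestCase):
    def test_plain_number_is_parsed(self):
        self.assertEqual(parse_quantity("12"), 12)

    def test_blank_input_counts_as_zero(self):
        # Reproduces a (hypothetical) bug report: blank input raised ValueError.
        self.assertEqual(parse_quantity("   "), 0)


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically and return the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseQuantityTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The second test was written first, failed against the buggy version, and now guards against the bug coming back.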

Slide 71

Slide 71 text

Integration Tests > Priority: Test your API! > cheaper than UI testing > usually not acceptance tested > Don’t use mocks if you’re not forced to! > only if 3rd-party regularly blocks you
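
Testing an API without mocks can mean starting the real server and talking to it over HTTP. The following sketch uses only the Python standard library; the `/health` endpoint and its payload are assumptions made for the example, not part of any system described here.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Tiny API under test with a single (hypothetical) endpoint."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def call_health_endpoint() -> tuple[int, dict]:
    """Start the real server on a free port, call it, shut it down again."""
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        connection.request("GET", "/health")
        response = connection.getresponse()
        result = response.status, json.loads(response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

The test exercises routing, serialization and the HTTP layer together, which a mocked handler call would not.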

Slide 72

Slide 72 text

No content

Slide 73

Slide 73 text

Thesis: Logging is the most underestimated task in IT

Slide 74

Slide 74 text

State of Logging > Growing number of user transactions > much larger log files > log files distributed across multiple systems > Increasing demand for real-time analysis

Slide 75

Slide 75 text

Information Types > Operational data > Actions or state of the application > User interaction

Slide 76

Slide 76 text

Stakeholders
• Developer: failure analysis after weeks
• Operations: real-time health information
• Product Owner: weekly usage reports
• ???: some complex daily report?

Slide 77

Slide 77 text

Improve Logging > Diagnostic contexts > Filters > Defined log format > Log aggregation > CorrelationID
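
Several of these points (diagnostic context, filter, defined log format, CorrelationID) can be combined with Python's standard `logging` module. This is a sketch, not a recommendation for a specific setup: the logger name `shop`, the log format and the request handler are assumptions for illustration.

```python
import contextvars
import io
import logging
import uuid

# Diagnostic context: holds the correlation ID of the request being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def build_logger(stream):
    logger = logging.getLogger("shop")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(CorrelationIdFilter())
    # Defined log format: level, correlation ID, message.
    handler.setFormatter(logging.Formatter("%(levelname)s cid=%(correlation_id)s %(message)s"))
    logger.addHandler(handler)
    return logger


def handle_request(logger, cid=None):
    """Log two steps of one request under the same correlation ID."""
    token = correlation_id.set(cid or uuid.uuid4().hex)
    try:
        logger.info("request received")
        logger.info("request completed")
    finally:
        correlation_id.reset(token)


stream = io.StringIO()
log = build_logger(stream)
handle_request(log, cid="B6F51C")
lines = stream.getvalue().splitlines()
```

Every line written while the request is handled carries the same `cid`, so log aggregation can later collect all entries that belong to one transaction.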

Slide 78

Slide 78 text

Customer Story > A well-known German bank > Web application for customer self-service > Customers call support hotline for failures > Hotline shall track state of transactions

Slide 79

Slide 79 text

Customer Story Support??? Customer Developer

Slide 80

Slide 80 text

Customer Story CorrelationID! New CorrelationID: B6F51C-6324-AC336-6339 CID: B6F51C... CID: B6F51C... CID: B6F51C... CID: B6F51C... Customer Developer Corre... what?

Slide 81

Slide 81 text

Customer Story Customer

Slide 82

Slide 82 text

Thesis: Each piece of relevant information is actually an event

Slide 83

Slide 83 text

“An event is anything that we can observe occurring at a particular point in time.” — Alexander Dean, Unified Log Processing, Manning

Slide 84

Slide 84 text

Event Streams Taking the next step to continuous reporting
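
Treating each piece of information as an event observed at a point in time allows a report to be updated continuously instead of once per day or week. A minimal sketch, in which the event kinds and timestamps are hypothetical:

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """Something observed at a particular point in time."""
    occurred_at: datetime
    kind: str


def running_counts(events: Iterable[Event]) -> Iterator[dict[str, int]]:
    """Yield an updated count per event kind after every event."""
    counts: Counter[str] = Counter()
    for event in events:
        counts[event.kind] += 1
        yield dict(counts)


# Hypothetical stream of three events.
events = [
    Event(datetime(2015, 1, 1, 12, 0, tzinfo=timezone.utc), "login"),
    Event(datetime(2015, 1, 1, 12, 1, tzinfo=timezone.utc), "transfer"),
    Event(datetime(2015, 1, 1, 12, 2, tzinfo=timezone.utc), "login"),
]
snapshots = list(running_counts(events))
```

Each snapshot is the report as of the latest event; the last one covers the whole stream so far.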

Slide 85

Slide 85 text

aim42.org

Slide 86

Slide 86 text

github.com/aim42

Slide 87

Slide 87 text

Questions? Comments?
Dr. Gernot Starke, @gernotstarke, [email protected], http://gernotstarke.de
innoQ Deutschland GmbH, Krischerstr. 100, 40789 Monheim am Rhein, Germany, Phone: +49 2173 3366-0
innoQ Schweiz GmbH, [email protected], Gewerbestr. 11, CH-6330 Cham, Switzerland, Phone: +41 41 743 0116
www.innoq.com
Offices in: Berlin, Darmstadt, München