Slide 1

Slide 1 text

Valuable Software Engineering Arie van Deursen, TU Delft MODELS 2024, Linz, Austria @[email protected] 1 Adele Bloch-Bauer I, Gustav Klimt, 1907. Wikipedia

Slide 2

Slide 2 text

David Notkin 1955-2013 “ … the intent is to make the engineering of software more effective so that society can benefit even more from the amazing potential of software.” 2 “If we care about influence, as I hope we do, then adding value to society is the real measure we should pursue.” ACM TOSEM Editorial, 2013

Slide 3

Slide 3 text

My Journey 3 1990 2000 2010 2020 RESEARCH SOCIETY Domain models Reconstructed models Variability models Software process models Language models

Slide 4

Slide 4 text

The Financial Sector • Data intensive • Software intensive • High stakes • Highly regulated • Long (system, data) lifetimes 4 High impact societal sector, with critical software engineering challenges (and modeling tradition in finance)

Slide 5

Slide 5 text

ING Bank Global bank based in The Netherlands Five year collaboration with TU Delft: • Explainable AI • Human-AI decision making • Data integration • Incident management and AIOps • Release planning • Search-based testing and repair 5

Slide 6

Slide 6 text

Agile at Scale at ING • ING Bank: 15,000 IT staff • Self-organizing teams (5-9 developers) • Short iterations (1-4 weeks) • User stories, features, epics • Delivered in releases (2-6 months) • Quarterly planning of all releases 6 Years of high-quality data available at ING

Slide 7

Slide 7 text

7 Elvan Kula, PhD thesis, TU Delft, 2025 Next month, @ ASE 2024! Today

Slide 8

Slide 8 text

Why is My Project Late? What are factors affecting timely epic delivery? • Let’s ask! How do these factors impact schedule deviation? • Let’s measure and model! 8 Elvan Kula et al IEEE TSE 2022

Slide 9

Slide 9 text

Timely Epic Delivery: Perceived Factors Survey 1: Which factors? • 289 responses • 25 factors; 5 dimensions Survey 2: Factor importance? • 337 responses • Rated impact level per factor Factor top 10: 1. Requirements refinement 2. Task dependencies 3. Organizational alignment 4. Organizational politics 5. Geographic distribution 6. Technical dependencies 7. Agile maturity 8. Regular delivery 9. Team stability 10. Skills and knowledge 9

Slide 10

Slide 10 text

Measuring Delay: Balanced Relative Error (BRE) • If actual delivery date after estimated date (“late”, pos%): • If actual delivery date before estimated date (“early”, neg%): • Collected BRE from 3,771 epics (273 teams), for 3 years 10
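The two formulas on this slide did not survive extraction. As a minimal sketch, assuming the signed form of Balanced Relative Error that is common in the estimation literature (deviation divided by the smaller of the actual and estimated duration), the measure could be computed as below. The function name, parameter names, and the use of durations in days are illustrative assumptions, not taken from the slide.

```python
def balanced_relative_error(actual_days: float, estimated_days: float) -> float:
    """Signed BRE: positive when an epic is late, negative when it is early.

    Assumption: BRE = (actual - estimated) / min(actual, estimated).
    For a late epic the divisor is the estimate; for an early epic it is
    the actual duration, so over- and underruns are penalized evenly.
    """
    if actual_days <= 0 or estimated_days <= 0:
        raise ValueError("durations must be positive")
    deviation = actual_days - estimated_days
    return deviation / min(actual_days, estimated_days)


# Hypothetical epics, both estimated at 100 days.
late = balanced_relative_error(actual_days=120, estimated_days=100)   # 0.2
early = balanced_relative_error(actual_days=80, estimated_days=100)   # -0.25
```

Dividing by the smaller duration is what makes the measure "balanced": a plain relative error divided by the estimate can never fall below -100% for early delivery, while late delivery is unbounded.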

Slide 11

Slide 11 text

13 Predictor Variables • 35 metrics for 20 factors • 13 metrics explain 67% of variation (MARS model) • Match with perception? ▪ Underestimated: size ▪ Agreed effect: dependencies, seniority, stability ▪ Overestimated: refinement, geography ▪ Agreed little effect: coverage, code smells, … 11
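For readers unfamiliar with MARS (multivariate adaptive regression splines): such a model is a weighted sum of hinge functions, which keeps it interpretable because each term switches on only past a knot. The sketch below shows the shape of such a model only. The two predictors, the knots, and all coefficients are invented for illustration and are not the fitted model from the study.

```python
def hinge(x: float, knot: float) -> float:
    """Hinge basis function max(0, x - knot): zero up to the knot, linear after."""
    return max(0.0, x - knot)


def mirrored_hinge(x: float, knot: float) -> float:
    """Mirrored hinge max(0, knot - x): linear before the knot, zero after."""
    return max(0.0, knot - x)


def predicted_delay(size_points: float, dependencies: int) -> float:
    """MARS-style prediction of schedule deviation (hypothetical coefficients)."""
    return (
        0.05                                        # intercept (assumed)
        + 0.004 * hinge(size_points, 50.0)          # large epics add delay
        - 0.002 * mirrored_hinge(size_points, 50.0) # small epics reduce it
        + 0.03 * hinge(dependencies, 2.0)           # extra dependencies add delay
    )


# A hypothetical epic of 100 story points with 4 dependencies: about 0.31.
example = predicted_delay(size_points=100, dependencies=4)
```

Each term can be read on its own, for example "every story point above 50 adds 0.004 to the expected deviation", which is the kind of interpretability the talk values in learned models.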

Slide 12

Slide 12 text

Dynamic Delay Prediction • Delay knowledge increases as epic unfolds (in milestones) • Mobility literature: Delay adheres to patterns, which can be learned by clustering delay time series • Is epic delay subject to patterns? • Can patterns improve delay prediction? 12
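To make "clustering delay time series" concrete, here is a minimal sketch that groups per-milestone delay series with plain k-means and Euclidean distance. The slide does not name a clustering algorithm, so k-means is an assumption, and the four series are made-up data rather than ING epics.

```python
from math import dist


def kmeans(series, k, iterations=20):
    """Plain k-means over equal-length series, seeded with the first k series."""
    centroids = [list(s) for s in series[:k]]
    labels = [0] * len(series)
    for _ in range(iterations):
        # Assignment step: each series joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(s, centroids[c])) for s in series]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [s for s, label in zip(series, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical delay (fraction of schedule slipped) at four milestones per epic.
delay_series = [
    [0.0, 0.0, 0.1, 0.6],  # on time at first, delay peak at the end
    [0.5, 0.4, 0.3, 0.2],  # delayed early, recovering later
    [0.0, 0.1, 0.1, 0.7],
    [0.6, 0.4, 0.2, 0.2],
]
labels, centroids = kmeans(delay_series, k=2)  # labels == [0, 1, 0, 1]
```

The centroids play the role of the delay patterns: a new epic's partial series can be matched against them at each milestone, and the matched pattern used as an extra input to the delay predictor.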

Slide 13

Slide 13 text

Epic Delay Patterns 13 Elvan Kula et al FSE 2023 Dataset: 4,040 epics of at least 10 sprints from 270 teams, 2017-2022 % epics in category: 36% 44% 14% 6%

Slide 14

Slide 14 text

Delay Patterns Improve Delay Prediction 14

Slide 15

Slide 15 text

Epic Conclusions • There are measurable factors contributing to epic delay ▪ Size, project dependencies, past performance • Delay follows patterns ▪ Largest pattern is timely at start with delay peak at end, due to security and incidents • Factors + patterns predict delay, dynamically ▪ Beats the global and iterative SoTA baselines 15

Slide 16

Slide 16 text

Midway Reflection Secrets to success? • A well-chosen meta-model (schema) of the data collected • Carefully collected multi-year data • Involvement of people (surveys) to give meaning to results • Learned models that are interpretable 16 Alhambra in winter

Slide 17

Slide 17 text

The Public Sector • Government digitalization affects all aspects of society • Taxes, permits, pensions, social benefits, health insurance, … • Infrastructure, traffic, sector regulation, open government, … 17

Slide 18

Slide 18 text

Public Sector Challenges • Complex (political) decision making • High demands on privacy, availability, transparency, inclusion, accessibility,… • Long (system, data) lifetimes • Accountability to minister, parliament, voter • Poor government digitalization undermines trust and democracy 18

Slide 19

Slide 19 text

Advisory Council for IT Assessment (AcICT) • Dutch independent council (2015) • Advises ministers and parliament on risks and chances of success in complex government IT systems • Enshrined in law since 2024: ▪ Govt obliged to submit systems for assessment, collaborate, and respond • All reports are public 19 https://www.adviescollegeicttoetsing.nl/

Slide 20

Slide 20 text

Council Organization • Five cabinet-appointed council members ▪ Experts from society, industry and academia • Supported by office of ~25 assessors • Assessment takes around six months • Data collection, interviews, analysis and advice formulation, fact checking, response from minister, … • Outcome: 8-10 page advice to minister 20

Slide 21

Slide 21 text

A Word of Caution • Council focuses on large (> €5M) projects ▪ From these, council selects high risk projects ▪ This gives biased, distorted view of government IT • Other sectors have failures too, but these are less … public • The public sector also has plenty of successes • The public sector is full of hard-working, amazing professionals dedicated to making society better 21 Council members, 2023

Slide 22

Slide 22 text

10-20 Assessments per Year; Spread over Ministries 22 Dataset of > 100 public reports. Analysis WIP

Slide 23

Slide 23 text

Project Cost 23 Total: €12.8 billion Median: €34 million Maximum: Defense, €3.2 billion

Slide 24

Slide 24 text

Assessment Framework Risk Areas 1. Business case, benefits, finance 2. Project organization and ownership 3. Risk management and project dependencies 4. Alignment business processes and IT solution 5. Scope control 6. Architecture, functional feasibility, technical realizability 7. Planning and realization 8. Procurement, tendering 9. Acceptance and transition to line 24 https://www.adviescollegeicttoetsing.nl/onze-werkwijze/documenten/publicaties/2021/12/01/toetskader-acict

Slide 25

Slide 25 text

Risk area prevalence in 100 reports 25

Slide 26

Slide 26 text

Levels of Impact Manual inspection of 100 reports Identified 3 types of impact 1. MINOR: Continue, with suggestions for improvement 2. REVISE: Continue, with urgent interventions 3. MAJOR: Abort or major interventions 26

Slide 27

Slide 27 text

Project Types • Build new system • Replace existing system • Adjust existing system substantially • Engage in major new procurement agreement 27

Slide 28

Slide 28 text

Impact Differences per Project Type 28 Green field / replacement: • 35-40% revise • 35-40% major Evolution: 72% (13/18) minor impact

Slide 29

Slide 29 text

Example 1: OpenVMS • Unemployment benefit systems: ▪ From: Cobol + Codasyl on Itanium/OpenVMS ▪ To: Java + relational database on Linux • Automated code conversion • 2020-2025, budget €36M • Assessment in 2023 ▪ Half (€19M) of budget spent 29

Slide 30

Slide 30 text

Advice Halfway assessment organization itself decided to terminate project • Hard to maintain code explosion: ”JOBOL” Advice: 1. Build multi-disciplinary re-platforming team 2. Develop multiple alternative scenarios 3. Draw lessons from failed conversion 30 https://www.adviescollegeicttoetsing.nl/documenten/publicaties/2023/12/21/bit-advies-programma-openvms

Slide 31

Slide 31 text

Example 2: Traffic Management • Replace 26 traffic management systems • Adopt & customize COTS solution • 2015-2026, budget €166M (originally €35M) • Assessment in 2022: Half (€83M) of budget spent 31

Slide 32

Slide 32 text

Advice 1. Align ambition and capabilities 2. Take lead over supplier 3. Prioritize moving maintenance and operations to line 4. Organize fallback scenario 32

Slide 33

Slide 33 text

Example 3: Funding Education • Distribute €32B/year over all Dutch educational institutions • Modernize current .NET systems ▪ Target: Rule-based platform + Java • 2019-2024, budget €18M ▪ €12M spent in 2022 • Assessments in 2021 and 2023 33

Slide 34

Slide 34 text

Advice • Professionalize culture: ▪ Governance & finance ▪ Development & maintenance • Terminate use of (niche) rule-based platform • Invest in current .NET systems to safeguard continuity • Plan for stepwise modernization 34 https://www.adviescollegeicttoetsing.nl/onderzoeken/documenten/publicaties/2023/09/25/bit-advies-doorontwikkelen-applicatielandschap-bekostiging-2

Slide 35

Slide 35 text

[ MODELS in Assessments? ] Prevalence Low code: • Blueriq, Mendix, Oracle Apex, … Domain-specific frameworks • Rule, case, law, document management and archiving • Real estate, traffic, … • ERP, SAP, MS Dynamics, … General modeling / UML, SysML Risks Identified Investments in ‘economies of scope’ shared with too few stakeholders • Vendor lock-in • Niche technology • Lifetime of technology Use of models not a clear differentiator between success and failure 35

Slide 36

Slide 36 text

Recurring Advice Based on manual analysis of reports: 1. Reduce risk: Make it smaller 2. Articulate needs 3. Strengthen governance 4. Define mitigation measures 5. Invest in own capabilities 36 Egon Schiele, Leutnant Heinrich Wagner, Wikipedia

Slide 37

Slide 37 text

Contrasting the two Studies Release planning @ ING • Problem well-scoped • Rich, well-structured data • Good fit with quantitative research methods • Clear research results • More research has potential to further optimize solutions Assessments @ AcICT • Problem broad & messy • Thick, but unstructured data • Must resort to qualitative research methods • Hard to get clear results • Tempting to move problems out of (SE) research scope 37

Slide 38

Slide 38 text

Models to the Rescue? “ ... the most direct benefits of MDE can be summarized as: • Increase of communication effectiveness of stakeholders • Increase in the productivity of the development team thanks to the (partial) automation of the development process” 38 2012

Slide 39

Slide 39 text

Increase in Productivity? • Of little use if you don’t know what you want … • Benefits come when combined with shorter feedback cycles • More work for the problem owner! ▪ Give feedback, take into production, manage organizational change • Done well, the benefits then become: ▪ Value delivered earlier; lower costs of failure 39

Slide 40

Slide 40 text

Increase of Communication Effectiveness? • “Tactical” communication: ▪ Domain model, bound to code ▪ Within Eric Evans’s “bounded context” • “Strategic” communication: ▪ High level, coarse-grained “features” ▪ Reason about feasibility, cost, progress ▪ Devise alternative roadmaps ▪ Align with (conflicted) stakeholders 40 Substantial body of work on model-driven approaches that support this. In practice, strong reliance on scaled agile frameworks like SAFe.

Slide 41

Slide 41 text

AI to the Rescue? • Capabilities of foundation models are mind blowing • This will affect many aspects of software engineering • The ambitions of Artificial General Intelligence reach even higher • Will generative AI solve our problems? 41 Sam Altman (img src Wikipedia)

Slide 42

Slide 42 text

I Large Language Models for Code 42 When to invoke code completion? (AIware 2024) Benchmark for long-context tasks, with JetBrains 600,000 actual code completions, ICSE 2024 Summarizing binaries, SANER 2023 Memorization in LLMs, ICSE 2024 Ambition to boost developer productivity!

Slide 43

Slide 43 text

Generative AI for the Government? Please let’s not have: Given that government is document heavy And that complex projects have a large document footprint The last thing we need is continuous generation of plausible-sounding text no one feels responsible for Needed instead: • Clear sense of direction • To the point documents • Crisp communication • Consensus building • Accountability 43 Supporting this is hard: Models might help; LLMs won’t.

Slide 44

Slide 44 text

Focused AI Timnit Gebru (SaTML 2023): We should build smaller-scale systems (that are well-scoped and well-defined) for which we can provide specifications for expected behavior, tolerance and safety protocols. 44 Timnit Gebru, img Wikipedia https://x.com/NicolasPapernot/status/1623885641380425728

Slide 45

Slide 45 text

Model-Driven Data-Driven Tycho Brahe (1546-1601) Meticulous data collection of planet positions 45 Johannes Kepler (1571-1630) Models / laws of planet movement

Slide 46

Slide 46 text

We need to: • Think about the value we want to bring to society • Have the courage to attack hard / urgent problems • Strengthen strategic communication in IT • Let models learn from data • Focus the use of AI: explainable and with guarantees For each of these, modeling is indispensable 46
