
Softwarearchäologie mit KI: Vom Scherbenhaufen zum klaren Gesamtbild (OOP 2026)

Markus Harrer
February 13, 2026



Transcript

  1. Software Archaeology with AI: From a Pile of Shards to a Clear Overall Picture (Softwarearchäologie mit KI: Vom Scherbenhaufen zum klaren Gesamtbild)

    OOP 2026, 13.02.2026, Munich. Markus Harrer, AI-assisted Software Evolutionist
  2. The fast-moving AI world

    Devstral VibeCoding ChatGPT Deep Research GPT o3 / o4-mini Veo 3 Claude 4 Gemini CLI Context Engineering Agents GLM-4.5 KimiK2 Qwen3-235B-A22B Skills Kilo Code GPT-5 Qwen3-Next gpt-oss DeepSeek-OCR MiniMax M2 gpt-oss-safeguard Gemini 3 GPT-5.3
  3. An abundance of AI tools: generic, agentic, assistive, specialized

    Codex CLI. Landscape of AI coding assistants, corrected and extended by Markus Harrer, based on the ideas of Bilgin Ibryam (https://generativeprogrammer.com/p/ai-coding-assistants-landscape/); I gave up updating it in July 2025 …
  4. My view on the AI topic: away from stroking wood / code, toward working wood / code

    Earlier, today, tomorrow? (probably not*) Image left: © Deutsche Fotothek, CC BY-SA 3.0. Image center: © Firma Altendorf, CC BY-SA 3.0 Unported. Image right: © Firma Anthon GmbH, fair use. *because assembly-line work != custom development
  5. AI as the savior of legacy systems?

    Legacy system: Wisdom Blower, LogOmagic, SellChef, Die Zentrale, The Sourcerer
  6. Just let the AI do it? Claude Code: „Refactor everything“

    Similar version: https://www.youtube.com/watch?v=MAnQ5u6JqdI (this was meant as a joke). This note is here because, after this demo, some people thought it really was that easy. Nope!
  7. 10

  8. Archaeologist: detectives of the past

    „Archaeologists look for clues that people before us left behind, and try to understand what they mean.“ Source: „Archaeology at work", English Heritage Education Service (https://www.youtube.com/watch?v=TFejIkYDH9Q)
  9. Modern archaeology techniques: Excavation, Typology, Chaîne Opératoire¹

    Extent of AI usage: from low to high. Questions asked: What is there? How was it used? What is it? ¹ pronounced roughly „schenn opörätuar“ (German phonetic spelling); roughly "operational chain" (German: Ablaufkette)
  10. I already had my fun back then (and LLMs know it!)

    My software analytics repositories on GitHub: pandas, matplotlib, Plotly, Jupyter Notebook
  11. Ephemeral Jupyter Notebooks = understandable throwaway analyses as excavation tools

    „A tool for creating a computational narrative: a document that combines computer code, its output (text, plots, etc.) and human-readable text to explain the logic.“ Fernando Pérez & Brian Granger: „Project Jupyter: Computational Narratives as the Engine of Collaborative Data Science“ (2015).
  12. Less black-box-magic AI, more „Guided AI“

    ✗ Legacy code, Agentic AI, instructions for action
    ✓ Legacy code, Guided AI, ideas for action
  13. Typology: sorting can become very tedious and repetitive

    Image: https://pixabay.com/de/photos/%C3%A4gypten-tempel-hieroglyphen-pharao-1197835/ (NadineDoerle)
  14. Typology creation with LLMs: using existing naming schemes

    Repository pattern: repository/*Repository.java
    Abstracts data access by encapsulating the logic needed to retrieve, store and query data.

    .
    ├── model
    │   ├── Owner.java
    │   ├── Person.java
    │   ├── Pet.java
    │   ├── Specialty.java
    │   ├── Vet.java
    │   └── Visit.java
    ├── repository
    │   ├── OwnerRepository.java
    │   ├── PetRepository.java
    │   ├── VetRepository.java
    │   └── VisitRepository.java
    └── web
        ├── OwnerController.java
        ├── PetController.java
        ├── PetValidator.java
        ├── VetController.java
        └── VisitController.java
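The naming-scheme idea from this slide can be tried out in a few lines. The following is only a minimal sketch, not the speaker's tooling: it matches the file listing from the slide against the `repository/*Repository.java` pattern with Python's standard `fnmatch` module. The names `FILES` and `files_for_concept` are made up for this illustration.

```python
from fnmatch import fnmatchcase

# File listing copied from the slide's directory tree.
FILES = [
    "model/Owner.java",
    "model/Person.java",
    "model/Pet.java",
    "model/Specialty.java",
    "model/Vet.java",
    "model/Visit.java",
    "repository/OwnerRepository.java",
    "repository/PetRepository.java",
    "repository/VetRepository.java",
    "repository/VisitRepository.java",
    "web/OwnerController.java",
    "web/PetController.java",
    "web/PetValidator.java",
    "web/VetController.java",
    "web/VisitController.java",
]


def files_for_concept(paths, pattern):
    """Return all paths that match a shell-style pattern (case-sensitive)."""
    # Note: in fnmatch, "*" also matches "/", unlike most shell globs.
    return [path for path in paths if fnmatchcase(path, pattern)]


# Hypothetical helper call: collect every file of the "Repository" concept.
repositories = files_for_concept(FILES, "repository/*Repository.java")
print(repositories)
```

The same helper works for other naming conventions in the listing, for example `web/*Controller.java`, which leaves out `PetValidator.java`.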
  15. Typology prompt for Claude Code

    Analyze the production Java code in this codebase and extract distinct concepts.
    Categorize them into two groups:
    - technical_concepts: architectural patterns, design tactics, or technical structures
    - business_concepts: domain-relevant ideas, rules, or terms that represent key business logic
    For each concept, provide:
    - name: a short, descriptive name
    - explanation: a concise description of what the concept is
    - rationale: why this concept likely exists in the codebase (technical or domain motivation)
    - file_globs: glob-style patterns of the used naming conventions to identify where this concept appears in the codebase (e.g., **/**Service.java, **/invoicing/**.java)
    Output the result as a well-structured YAML file with two top-level sections: technical_concepts and business_concepts.
    Focus only on Java production code (exclude test files, scripts, and configuration files).
  16. The expected result from the LLM

    technical_concepts:
      - name: "Boundary"
        description: "Defines the interfaces for communication between the core business logic (Interactors) and the outer layers (e.g., UI, web services). It includes Request and Response Models."
        rationale: "Separates the core application from the delivery mechanism, allowing the presentation layer to change independently of the business rules."
        file_globs:
          - "**/boundary/*.java"
      ...
    business_concepts:
      - name: "Site"
        description: "A 'Site' represents a distinct container or context for content like comments, files, and schedules. Most other business concepts are scoped within a specific site."
        rationale: "The concept of a 'Site' allows for multi-tenancy or partitioning of data, where different users or groups can have their own isolated space within the application."
        file_globs:
          - "**/site/**/*.java"

    Glob patterns + some manual editing…
  17. Typology assessment: what is the benefit? How many files can be assigned to concepts?

    Distribution for a small software system (~300 source code files). Legacy code. If you know one file that implements a concept, you know all the other files that implement the same concept!
  18. Code inventory: not just „what is there“, but „how much of what is there?“

    Technical concepts
    - Boundary: 66 file(s)
    - Interactor: 45 file(s)
    - Entity: 62 file(s)
    - Gateway: 17 file(s)
    - Delivery: 19 file(s)
    - RESTful API: 15 file(s)
    - Dependency Injection: 1 file(s)
    - Request Model: 30 file(s)
    - Response Model: 22 file(s)
    - POJO Entities: 10 file(s)
    - Validation: 8 file(s)

    Business concepts
    - Site: 23 file(s)
    - Comment: 28 file(s)
    - Creator: 9 file(s)
    - File: 22 file(s)
    - Scheduling: 49 file(s)
    - Todo List: 43 file(s)
    - Mail Notification: 8 file(s)

    + website / forms + tables + integrations … + relationships
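An inventory like the one on this slide can be derived from concept-to-glob mappings. The sketch below is an assumption about how such a count could be done, not the code used in the talk: `TYPOLOGY`, `inventory` and the example paths are hypothetical, and the patterns are simplified because Python's `fnmatch` gives `**` no special meaning (a plain `*` already matches across `/`).

```python
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical typology excerpt: concept name -> simplified glob patterns.
TYPOLOGY = {
    "Boundary": ["*/boundary/*.java"],
    "Site": ["*/site/*.java"],
}


def inventory(paths, typology):
    """Count files per concept and collect files that no concept claims."""
    counts = Counter()
    unassigned = []
    for path in paths:
        matched = [
            name
            for name, globs in typology.items()
            if any(fnmatchcase(path, glob) for glob in globs)
        ]
        counts.update(matched)  # a file may belong to several concepts
        if not matched:
            unassigned.append(path)
    return counts, unassigned


# Made-up example paths, only to exercise the function.
paths = [
    "src/site/boundary/SiteRequest.java",
    "src/site/boundary/SiteResponse.java",
    "src/site/SiteInteractor.java",
    "src/util/Strings.java",
]
counts, unassigned = inventory(paths, TYPOLOGY)
for name, number in counts.most_common():
    print(f"{name}: {number} file(s)")

# Share of files that at least one concept explains.
coverage = 1 - len(unassigned) / len(paths)
print(f"coverage: {coverage:.0%}")
```

Because one file can implement a technical and a business concept at the same time, the per-concept counts may add up to more than the number of files; the list of unassigned files shows what the typology does not yet explain.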
  19. Extended typology: assessing conceptual integrity

    Source: L. Adams Gilmour, Early Medieval Pottery from Flaxengate, Lincoln. Image: https://pixabay.com/de/photos/arch%C3%A4ologie-arch%C3%A4ologische-ausgrabung-59150/
  20. Extended typology: detect files within concepts that do not do what all the others do

    DService, KService, BService, CService
  21. Extended typology: assessing conceptual integrity with LLMs

    [...] Please analyze the following source code and assess how well it implements the concept specified here. [...] https://github.com/feststelltaste/software-analytics/tree/master/demos/20260213_OOP_2026
  22. Extended typology: assessing conceptual integrity with LLMs

    The code [...] matches the concept [...] perfectly. Confidence: 1.0 https://github.com/feststelltaste/software-analytics/tree/master/demos/20260213_OOP_2026
  23. Extended typology: detect files within concepts that do not do what all the others do

    Example: the concept of a service for serving beer. BService, CService, DService, KService
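The talk evaluates conceptual integrity with LLMs. As a rough cross-check, the same question ("which file does not do what all the others do?") can be approximated without a model. The sketch below swaps in a plain heuristic and is explicitly not the speaker's method: it extracts method names with a crude regular expression and flags files whose average Jaccard similarity to their peers is low. The Java sources, the threshold and the choice of `KService` as the odd one out are invented for illustration.

```python
import re
from statistics import mean

# Crude pattern for Java method declarations: visibility, return type, name, "(".
METHOD = re.compile(r"\b(?:public|protected|private)\s+[\w<>\[\]]+\s+(\w+)\s*\(")


def method_names(source):
    """Return the set of method names declared in a Java source string."""
    return set(METHOD.findall(source))


def jaccard(a, b):
    """Similarity of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def outliers(sources, threshold=0.5):
    """Names of files whose mean similarity to all other files is below threshold."""
    features = {name: method_names(src) for name, src in sources.items()}
    flagged = []
    for name, own in features.items():
        others = [f for other, f in features.items() if other != name]
        score = mean(jaccard(own, f) for f in others) if others else 1.0
        if score < threshold:
            flagged.append(name)
    return flagged


# Hypothetical sources: three services serve beer, one does something else.
serving = "public void serve(Order o) {} public void refill() {}"
sources = {
    "BService": "public class BService { " + serving + " }",
    "CService": "public class CService { " + serving + " }",
    "DService": "public class DService { " + serving + " }",
    "KService": "public class KService { public void sendMail(String to) {} }",
}
print(outliers(sources))
```

A heuristic like this only compares surface structure; judging whether a file really implements a concept is the part the slides hand to the LLM.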
  24. „Ich habe gar keine Auto Struktur“ (literally: "I don't have any car structure at all")

    Outlook: neurosymbolic pattern mining? AST/CST (Abstract/Concrete Syntax Tree) + LLMs
  25. The operational chain (Ablaufkette): the chaining of all steps in the lifecycle of an artifact

    For example: 1. Creation 2. Use 3. Maintenance 4. Repair 5. Disposal ... Image: https://fr.wikipedia.org/wiki/Cha%C3%AEne_op%C3%A9ratoire#/media/Fichier:Cha%C3%AEne_op%C3%A9ratoire.png
  26. Chaîne Opératoire for code? Why?

    „It gives us a certain impression of the complexity of these societies. [...] takes us into the mindset of these societies.“ Source: https://www.youtube.com/watch?v=MNp5q3pqkmQ Jason Cohen, https://www.intothedustarchaeology.com/ (not an archaeologist, but plays one on TV)
  27. Chaîne Opératoire: the chain of all steps in the lifecycle of an artifact

    Along the time axis t: create public class Customer, add testNameCheck(), change to BusinessPartner, fix tech debt, add calculateBonus(), refactor testNameCheck(), delete BusinessPartner

    public class BusinessPartner {
        private String name;
        private double bonus;
        public BusinessPartner(String name) { this.name = name; }
        public double calculateBonus() { ...
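For code, the raw material of such a lifecycle chain is usually the version history of a file. The following sketch assumes a Git repository and is not taken from the talk: `read_history` shells out to `git log` with the standard `--follow`, `--date=short` and `--format` options, and `parse_history` turns that output into an oldest-first timeline. The function names, the `|` separator and the dates in `SAMPLE_LOG` are hypothetical; the commit messages reuse events from the slide.

```python
import subprocess


def read_history(path):
    """Return 'date|subject' lines for one file, newest first (needs a Git repo)."""
    result = subprocess.run(
        ["git", "log", "--follow", "--date=short", "--format=%ad|%s", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_history(log_text):
    """Turn 'date|subject' lines into an oldest-first list of (date, subject)."""
    events = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        date, _, subject = line.partition("|")
        events.append((date, subject))
    # git log prints the newest commit first; a lifecycle reads oldest first.
    return list(reversed(events))


# Made-up log output with invented dates, shaped like read_history() output.
SAMPLE_LOG = """\
2024-05-02|delete BusinessPartner
2023-11-20|add calculateBonus()
2022-03-14|change to BusinessPartner
2021-01-05|create public class Customer
"""

timeline = parse_history(SAMPLE_LOG)
for date, subject in timeline:
    print(date, subject)
```

In a real repository, `parse_history(read_history("path/to/File.java"))` would produce the same structure; `--follow` lets the history continue across renames such as Customer to BusinessPartner, as far as Git can detect them.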
  28. Breaking the Magician's Code: Magic's Biggest Secrets Finally Revealed

    „... using a combination of glob and grep. Claude Code is making use of agentic search“ Anthropic: Transform Legacy Systems into Strategic Assets - Code Modernization with AI, https://www.youtube.com/watch?v=8qtSeQuNv0o
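The quoted "combination of glob and grep" is easy to demystify with a toy version. This is a minimal sketch of the general idea only and makes no claim about how Claude Code implements its search: `glob_grep` is a hypothetical helper that first narrows the candidate files by a glob pattern and then scans them with a regular expression. The two Java files are created in a temporary directory just for the demonstration.

```python
import re
import tempfile
from pathlib import Path


def glob_grep(root, file_glob, pattern):
    """Find (relative path, line number, line) for regex hits in globbed files."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).glob(file_glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits


with tempfile.TemporaryDirectory() as tmp:
    # Made-up mini code base with one repository and one controller.
    repo_dir = Path(tmp, "repository")
    repo_dir.mkdir()
    (repo_dir / "OwnerRepository.java").write_text(
        "public interface OwnerRepository {\n    Owner findById(int id);\n}\n",
        encoding="utf-8",
    )
    web_dir = Path(tmp, "web")
    web_dir.mkdir()
    (web_dir / "OwnerController.java").write_text(
        "public class OwnerController {\n    Owner findById(int id) { return null; }\n}\n",
        encoding="utf-8",
    )

    # Step 1 (glob): only *Repository.java files. Step 2 (grep): lines with findById.
    hits = glob_grep(tmp, "**/*Repository.java", r"\bfindById\b")
    print(hits)
```

The glob step is what keeps the search cheap: the controller also contains `findById`, but it is never opened because its name does not match the pattern.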
  29. 48

  30. What do I get out of this? Less fear

    Oh crap, so much code!!!! → OK, this much I could know → These are all just repositories
  31. What do I get out of this? New options

    Code → Concept → Idea: Rethink? Standardize? Improve? See also: https://www.innoq.com/de/blog/2025/10/modern-legacy-dank-ki/

    sub encrypt{my($p)=@_;my$a="s";my$b="x";my$c=reverse($p.$a.$b.$p);my$d=0;for(split//,$c.$p.$a.uc($b).reverse($p)){$d+=ord($_)*3+length($c)%7}return "MEGA".$d."END"}print encrypt($ARGV[0]);

    Oh, so they actually just wanted to hash passwords
  32. More on the topic II

    My collection on „Software Analytics“: https://github.com/feststelltaste/awesome-software-analytics
  33. AI for Legacy Modernization: When and How to Use (or not)

    Light Version 1.2, Markus Harrer

    Approaches compared: Manual Work, Transformation Tools, Guided AI, AI assistants, AI agents

    General Idea
    - Manual Work: Developers manually analyze, reason about, and fix issues (based on deep domain and system knowledge)
    - Transformation Tools: Human-based creation of formal rules and recipes to perform consistent, automated code transformations
    - Guided AI: Human-led detection of issues or anti-patterns, followed by localized AI-generated fixes within defined areas
    - AI assistants: Human-guided AI-based task execution for fixing code in smaller areas / clearly scoped contexts
    - AI agents: Autonomous systems orchestrate analysis, transformation and validation of modernization workflows

    Possible Use Cases
    - Manual Work: Special issues like redesign of critical parts of business logic or performance optimization
    - Transformation Tools: Framework migrations, API upgrades, bulk renames, restructurings
    - Guided AI: Identifying systemic issues and using AI to propose or apply localized solutions
    - AI assistants: Summarizing code, generating tests & comments, renaming identifiers, writing code snippets
    - AI agents: Cleanup ideation, multistep refactoring planning, smaller bug fixing across code bases

    Ratings (in the order Manual Work / Transformation Tools / Guided AI / AI assistants / AI agents)
    - Control (how much humans can be in the loop): ++ / + / o / - / --
    - Risk (how likely changes go wrong): -- / - / - / + / ++
    - Breadth (how wide the method can operate): -- / - / - / o / ++
    - Accuracy (how well problematic spots are addressed): ++ / ++ / + / o / o
    - Traceability (how well actions can be tracked): o / ++ / ++ / o / -
    - Efforts (how much work setup and use need): ~ / - / o / o / o
    - Volume (how much can be processed): -- / ++ / o / - / +
  34. AI for Legacy Modernization: When and How to Use (or not)

    Full Version 1.2, Markus Harrer

    Approaches compared: Manual Work, Transformation Tools¹, Guided AI, AI assistants, AI agents

    General Idea
    - Manual Work: Developers manually analyze, reason about, and fix issues (based on deep domain and system knowledge)
    - Transformation Tools: Human-based creation of formal rules and recipes to perform consistent, automated code transformations
    - Guided AI: Human-led detection of issues or anti-patterns, followed by localized AI-generated fixes within defined areas
    - AI assistants: Human-guided AI-based task execution for fixing code in smaller areas / clearly scoped contexts
    - AI agents: Autonomous systems orchestrate analysis, transformation and validation of modernization workflows

    Possible Use Cases
    - Manual Work: Special issues like redesign of critical parts of business logic or performance optimization
    - Transformation Tools: Framework migrations, API upgrades, bulk renames, restructurings
    - Guided AI: Identifying systemic issues and using AI to propose or apply localized solutions
    - AI assistants: Summarizing code, generating tests & comments, renaming identifiers, writing code snippets
    - AI agents: Cleanup ideation, automated, multistep refactoring, bug fixing across multiple code bases

    Control (how much humans can be in the loop)
    - Manual Work: Very High (humans drive everything)
    - Transformation Tools: High (humans define transformation logic, execution is automatic)
    - Guided AI: Medium (humans guide focus, agents generate and apply solutions)
    - AI assistants: Low (humans initiate, roughly guide and review AI’s results)
    - AI agents: Very Low (agents make decisions and act with minimal intervention)

    Risk (how likely changes go wrong)
    - Manual Work: Low to Medium (may suffer from outdated assumptions, overconfidence or unclear goals)
    - Transformation Tools: Medium (when creating recipes) to none (during execution, but also depends on recipe quality)
    - Guided AI: Low (with good problem localization that allows suggestions in limited contexts)
    - AI assistants: High to medium (depends on scope and tasks)
    - AI agents: Very high to high (esp. with broad tasks and high autonomy + wrong tool use)

    Breadth (how wide the method can operate)
    - Manual Work: Very narrow (limited by developers’ cognitive capacities)
    - Transformation Tools: Narrow (limits defined by AST, LST or recipe capabilities)
    - Guided AI: Narrow (scoped to recognizable patterns or metrics)
    - AI assistants: Limited (current file, code block or interaction context)
    - AI agents: Very broad (across files, services and task types)

    Accuracy (how well problematic spots are addressed)
    - Manual Work: Human-level quality (varies by experience)
    - Transformation Tools: High (precise and deterministic)
    - Guided AI: High (during analysis), medium (during fixing)
    - AI assistants: Medium (but error-prone outside narrow, familiar contexts / training)
    - AI agents: Medium (depends on prompt quality, feedback loops, available tools)

    Traceability (how well actions can be tracked)
    - Manual Work: High (with peer review and diffs)
    - Transformation Tools: Very high (rules, recipes, diffs)
    - Guided AI: High (analysis steps, reports³, diffs)
    - AI assistants: Medium (prompt history, diffs)
    - AI agents: Medium (prompts, execution paths, diffs)

    Efforts (how much work setup and use require)
    - Manual Work: Variable (depends on task difficulty)
    - Transformation Tools: Low to medium⁴ (depends on reusing existing recipes or creating new ones)
    - Guided AI: Medium (because data analysis needed)
    - AI assistants: Medium (prompt writing, instruction definition, model tuning)
    - AI agents: First low (“it’s just prompts”), later high (MCPs, skills, orchestration, validation, security, …)

    Volume (how much can be processed)
    - Manual Work: Limited due to the need for deep contextual understanding
    - Transformation Tools: High-volume, homogeneous code bases
    - Guided AI: Mid-sized codebases (with structural issues)
    - AI assistants: Localized impact (limited by context window)
    - AI agents: Large, heterogeneous systems (with recurring issues)

    ¹ e.g. Codemods, OpenRewrite, Rector
    ² e.g. using jQAssistant, Semgrep, CodeScene
    ³ e.g. using Jupyter Notebooks
    ⁴ for new recipes, AI might be used

    MCP: Model Context Protocol. AST: Abstract Syntax Tree. LST: Lossless Semantic Tree.