stack’ to be a bit more high level. The PM job will move ‘down the stack’ and a bit closer to code. We will meet in the middle and swap a lot of accountabilities”
Ulloa, Butler, Haniyur, Miller, Amos, Sarkar, Storey: Product Manager Practices for Delegating Work to Generative AI: "Accountability must not be delegated to non-human actors". ICSE SEIP 2026. Thurs, 4:45 in Oceania V.
Blurring of software team roles
reusable and helped us progress a lot faster
UI looked exactly how I had imagined it
it taught me (a business student) so much…
Fall 2025: Four weeks in…
features into existence without reviewing their implementations… I've found myself getting lost in my own projects. I no longer have a firm mental model of what they can do and how they work”
– Simon Willison, https://simonwillison.net/2026/Feb/15/cognitive-debt/
Not just in my Startup course…
conversation I’ve had with practitioners about cognitive and intent debt, and proposed triple debt model¹
Acknowledgements: Arty Starr, Adam Tornhill, Keith Mann, Kent Beck, Dave Thomas, and many others…
1. From Technical Debt to Cognitive and Intent Debt: Rethinking Software Health in the Age of AI. M. Storey. To appear, ACM Queue, 2026. https://margaretstorey.com/blog/2026/02/18/cognitive-debt-revisited/
out
AI MAKES IT WORSE: Poor quality generated code. Semi-approval already happening.
AI CAN HELP: Automated refactoring, test generation, code explanation, skills.
CODE MAY DISAPPEAR: Some argue code may be hidden infrastructure managed by agents. Code is “accidental complexity”.
THE DEEPER RISK: AI is a multiplier, and right now it's multiplying the debts we're worst at managing.
developers’ minds, not just source code.
What is software?
1. WORLD-TO-PROGRAM MAPPING: How real-world concepts map to program structure. What's included, what's left out.
2. DESIGN JUSTIFICATION: Why structural decisions were made: data models, integration points, architecture choices.
3. MODIFICATION CAPACITY: Can the team reason about safe change paths? (Biggest risk with AI-generated code.)
understood by teams
PEOPLE: Developers, PMs, agents
ARTIFACTS: Code, docs, tests, specs
INTERACTIONS: Reviews, retros, conversations
Understanding is distributed across all three. Today, teams include both people and AI agents.
shared understanding of the software across a software team over time.¹
Persistent gaps in shared mental models, transactive memory, common ground
Different from momentary cognitive load and individual comprehension debt² at the level of the source code
With AI, cognitive debt accumulates faster due to cognitive surrender³
2. Alakmeh, Anderson, Jackson, Vaz Pereira, Akirmak, Estey, Prikladnicki, van der Hoek, Storey & Fritz. Grasping AI Reliance in Program Comprehension and Coding through the AIRELI Persona Taxonomy. ICPC 2026.
3. Shaw, S. D. & Nave, G. “Thinking-Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender.” Available at SSRN 6097646 (2026).
of half-truths and, in this age of AI assisted coding… Not the brazen lie of complete competence. Not the comic lie of the demo that “mostly works.” The polished, rational lie borne of urgency: “We understand it well enough.”
“In software, we call this ‘moving fast.’ In reality, we are borrowing against cognition…. Eventually pay interest in confusion”
– Russell Miles (response to Cognitive Debt post⁴)
4. https://www.softwareenchiridion.com/p/on-cognitive-debt-and-the-care-of
fatigue trying to figure it out.⁵ Reluctance to modify code written by AI.
TEAM SIGNALS: People ask each other basic questions. Slow onboarding. Low bus factor.⁶ No awareness of who knows what.
5. A. Starr & M.A. Storey. Theory of Troubleshooting: The Developer's Cognitive Experience of Overcoming Confusion. To appear, ACM TOSEM, 2026.
6. Tornhill, A. 2015. Your Code as a Crime Scene. Pragmatic Programmers.
and regrouping when things break
System Walkthroughs: for the purpose of shared understanding of architecture
Habitat Thinking⁴: design the environment for (re)building shared understanding across the team
Design for Troubleshooting⁵: ensure “diagnostic legibility”⁴ is preserved, as developers lack familiarity with code they didn’t write but may need to troubleshoot
SCORE⁸
Five questions to ask in a retro:
1. Explain modules without looking at code?
2. Can you debug at 2am without AI?
3. Do you know why this implementation?
4. Long time to onboard someone new?
5. Is code maintainable if AI goes away?
Yes < 3/5 → Cognitive Debt!
FEATURE COMPREHENSION SCORE⁷
AI-generated quiz on Naur's 3 layers:
1. world-program mapping
2. design justification
3. modification capacity
Scored at team level per feature, in retros.
7. Leonid Sokolovskiy: https://levelup.gitconnected.com/we-can-measure-how-fast-we-ship-can-we-measure-how-well-we-understand-what-we-built-a5ad2749c8fa
8. Adham Sersour: https://www.linkedin.com/feed/update/urn:li:activity:7442198356598239232/
But caution is needed when we create new metrics, as we may damage what we can’t see but value…
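The five-question retro check above boils down to a simple tally, which can be sketched as follows. This is a minimal illustration, not the cited author's tooling: the function name `cognitive_debt_flag` and the yes/no recording format are assumptions, and the sketch counts raw "yes" answers exactly as the slide states the rule. How a team treats question 4, where a "yes" reads as a warning sign, is left open here.

```python
# The five retro questions, as worded on the slide.
QUESTIONS = [
    "Explain modules without looking at code?",
    "Can you debug at 2am without AI?",
    "Do you know why this implementation?",
    "Long time to onboard someone new?",
    "Is code maintainable if AI goes away?",
]

# Slide rule: fewer than 3 "yes" answers out of 5 signals cognitive debt.
THRESHOLD = 3


def cognitive_debt_flag(answers: list[bool]) -> tuple[int, bool]:
    """Return (number of yes answers, True if the slide's rule flags debt).

    `answers` holds one yes/no value per question, in slide order.
    The name and input format are hypothetical, chosen for illustration.
    """
    if len(answers) != len(QUESTIONS):
        raise ValueError(f"expected {len(QUESTIONS)} answers, got {len(answers)}")
    yes_count = sum(answers)  # True counts as 1, False as 0
    return yes_count, yes_count < THRESHOLD


if __name__ == "__main__":
    yes_count, flagged = cognitive_debt_flag([True, False, True, False, False])
    print(f"yes answers: {yes_count}/5, cognitive debt flagged: {flagged}")
```

A single boolean per question is deliberately coarse. As the slide cautions, turning a retro conversation into a number risks damaging what the conversation was for.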
existing architecture” → because the architecture's why was never externalized
“Developers spend significant time on post-generation editing” → paying the interest on that missing rationale
“Context established early in a conversation is lost as session lengthens” → intent lives nowhere persistent
“Quality varies depending on which team member is prompting” → because intent is locked in people's heads, not artifacts
9. R. Garg, Patterns for Reducing Friction in AI-Assisted Development, https://martinfowler.com/articles/reduce-friction-ai/, April 8, 2026
and constraints that guide how humans and AI agents evolve the system. It lives in artifacts, or more often, in the absence of them.¹
“AI is a loudspeaker, not an asymmetric force. A team with strong intent practices finds AI amplifies clarity. A team with weak practices finds AI fills gaps with plausible guesses.”
– Sandro Ponticelli (comment to Triple Debt Model post)
can explain why, users are unhappy
Agents struggle, they make plausible but wrong decisions, use more tokens
Nothing is written down, forgotten requirements, constraints
capture what was decided and why. Domain-Driven Design makes domain intent explicit before it is encoded
EXECUTABLE INTENT: BDD specifications and tests that verify purpose can show whether the system drifts from its intent
CONTEXT ARTIFACTS: Context and harness engineering (skills, agent instructions, and playbooks) produce feedforward and feedback guidance for both humans and agents
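As a rough illustration of "executable intent", the sketch below writes a Given/When/Then style test whose name and comments record why a rule exists, so a later change that breaks the rule also fails a test that explains the intent. Everything domain-specific here is hypothetical and invented for the example: the refund rule, the limit value, and the `route_refund` function do not come from the talk.

```python
# Hypothetical domain rule, invented for illustration only:
# refunds above a limit must be routed to a human for approval.
REFUND_AUTO_APPROVE_LIMIT = 100.0


def route_refund(amount: float) -> str:
    """Return 'auto' for refunds up to the limit, 'manual' above it."""
    return "auto" if amount <= REFUND_AUTO_APPROVE_LIMIT else "manual"


def test_large_refunds_need_human_approval() -> None:
    # Given a refund request just above the auto-approval limit
    amount = REFUND_AUTO_APPROVE_LIMIT + 0.01
    # When the refund is routed
    decision = route_refund(amount)
    # Then it goes to a human. Intent (hypothetical): large payouts are
    # never approved without review.
    assert decision == "manual"


def test_small_refunds_are_approved_automatically() -> None:
    # Given a refund request at the auto-approval limit
    amount = REFUND_AUTO_APPROVE_LIMIT
    # When the refund is routed
    decision = route_refund(amount)
    # Then no human is involved. Intent (hypothetical): routine refunds
    # should not wait on a reviewer.
    assert decision == "auto"


if __name__ == "__main__":
    test_large_refunds_need_human_approval()
    test_small_refunds_are_approved_automatically()
    print("intent checks passed")
```

The functions follow the `test_*` naming that pytest collects by default, but the sketch uses plain asserts and needs no test framework. The point is that the rationale sits next to the check, where both people and agents will find it.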
several times in my career, didn't know it had a name.” – Alistair Cockburn, LinkedIn post
Storey, Russo, Novielli, Kobayashi, and Wang. A Disruptive Research Playbook for Studying Disruptive Innovations. TOSEM 2024.
in the code. Observable, measurable, refactorable. Focus of this conference and still critical.
COGNITIVE DEBT: Lives in people. Erosion of shared understanding across a team over time. Invisible until it paralyzes.
INTENT DEBT: Lives in artifacts, or their absence. Missing rationale that humans and agents need to evolve the system safely.
AI tools are very good at producing code and measuring technical debt. Very few are optimized for improving understanding or preserving intent. This may be where the real risk lies.
understanding erodes, capturing intent becomes harder because no one knows what rationale to document.
TECHNICAL → COGNITIVE: Messy code is harder to build shared understanding around. Poor architecture fragments the team's mental model.
INTENT → TECHNICAL: Can the team reason about safe change paths? (Biggest risk with AI-generated code.)
this way of thinking does make a fair bit of sense. The article includes useful sections to diagnose and mitigate each kind of debt. The three interact with each other, and the article outlines some general activities teams should do to keep it all under control”
– Martin Fowler, Fragments Newsletter, April 2, 2026
“We have no time to understand or document intent”
“We will understand it later, and document our decisions then”
“Can anyone predict what happens if we make this change?”
“We must rollback and recreate what we decided and implemented”
habitat
Tools may not be enough to decrease frustration, fatigue and burnout. Cognitive debt isn't just a project risk; it's a physiological one (Starr & Storey)
How to balance the three types of debt: Impact on project outcomes? Acceptable amounts of debt for sustainability? What tradeoffs are ok?
How to reason about these three types of debt: Can we measure and not introduce harmful metrics? Can we sense them without measurement?
Other types of debt? Social debt, process debt, org. debt, skill debt, community debt…
What role will AI play in causing and mitigating all of these?