AI coding state of play, Agile meets Architecture, March 10 2026

© 2026 Thoughtworks What AI augmentation means for technical leaders
March 2026 1

© 2026 Thoughtworks AI relevance for technical leaders Use AI
for your own work Help teams amplify the good, not the bad and the ugly Make it as safe as possible to use AI Use AI to improve and modernise the systems landscape

© 2026 Thoughtworks Context engineering: Beyond rules files and MCP
servers

© 2026 Thoughtworks 4 - The backend code is Python,
it's in ./app - Build system: Poetry - Remember to activate the virtual environment before any Python or Poetry command Simplest form: “Rules” files AGENTS.md

© 2026 Thoughtworks Context Engineering: Beyond MCP servers and rules
files 5 Skills Rules Specs Subagents Plugins Commands MCP Servers

© 2026 Thoughtworks Skills example Modularisation of rules Can be
loaded by LLM just-in-time Can include more files Often refer to already installed CLIs

© 2026 Thoughtworks Context Engineering “in a nutshell” Reusable instructions
and conventions Skills Rules Specs Com- mands Context interfaces MCP Servers Tools Skills “Intelligentlyˮ loaded, just in time Manage and monitor context size

© 2026 Thoughtworks Subagents Main conversation (“orchestratorˮ) Research Implemen- tation
Code Review spawn report spawn report spawn report

© 2026 Thoughtworks Context Engineering: Ask yourself… Coding conventions you
want to amplify? → Skills Workflows to build for modernisation initiatives? → Subagents, Skills Tools that should be available in your org? → CLIs, MCP servers, LSPs, … Versioning and distribution? Is it making things better, or worse? What are practices you want to amplify? → Skills

© 2026 Thoughtworks A familiar beast rears its head: Dev
Sandboxes “X is not installedˮ OutOfMemoryError Internet access: Yes or no, when and where to?

© 2026 Thoughtworks Less supervision: Ask yourself… Where do you
want to experiment with cloud agents? How do you help your teams gauge the appropriate level of supervision?

© 2026 Thoughtworks 25 Probability …that AI gets something wrong
Impact …if AI gets something wrong Detectability …that AI got something wrong → Know the AI tool, know and engineer the context → Know your confidence level in the requirements → Know the use case criticality → Know your feedback loops Which workflow? How much review? How long without supervision?

Impact …if AI gets something wrong Detectability …that AI got something wrong “You have to be this tallˮ to reduce supervision

© 2025 Thoughtworks More autonomy, less supervision: Beware of security
and cost

© 2026 Thoughtworks Security - unwanted command execution > Prompt
Injection > Bypassing allow lists

© 2026 Thoughtworks The lethal trifecta Access to Private Data
Ability to Externally Communicate Exposure to Untrusted Content Simon Willison https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

© 2025 Thoughtworks Cost - “the honeymoon is over” Beginning
of 2024 “Generating 100 lines of code only costs about 12 cents, compare that to a developer salary!ˮ Keynote at Craft Conf 2024 (simplified) Summer 2025 viberank.app

© 2025 Thoughtworks Cost - “the honeymoon is over” Beginning
of 2024 “Generating 100 lines of code only costs about 12 cents, compare that to a developer salary!ˮ Keynote at Craft Conf 2024 (simplified) Summer 2025 viberank.app $20 Flat Rates $200 Flat Rates Request limiting

© 2025 Thoughtworks Why do 100 LOC cost more like
$2,50, not $0,12? Plan Review the plan Research existing code Implement the first task Make changes Run the tests Fix the tests Check lint errors Fix lint errors Check the browser “Itʼs not rendering, debugˮ Fix Code review Improve a method Summarise

© 2026 Thoughtworks Less human supervision: Ask yourself… How to
sandbox coding agents? Does everybody understand the lethal trifecta? Where do you want to experiment with cloud agents? How do you help your teams gauge the appropriate level of supervision?

© 2025 Thoughtworks Experiment: Structural tests as agent feedback “External
SDKs may only be imported by files in server/clientsˮ

© 2025 Thoughtworks Harness Executed by GPU inference Executed by
CPU Coding Agent Principles Rules, Examples Ref Docs How-tos CfRs Guides feed-forward Runtime state Static state Agent-as- judge Sensors Mutation testing feed-back self-correction CPU-executed when possible Shift sensors left Human Steering Loop Uses to build steering loop LSPs CLIs, scripts

Impact …if AI gets something wrong Detectability …that AI got something wrong Harness Our trust in the agent

© 2025 Thoughtworks Data dashboard, Typescript Will harnessable topologies be
the new abstraction layer? Structure Tech Stack Harness template instantiate Guides Sensors

© 2026 Thoughtworks Strong forces are tempting humans out of
the loop Where can your organisation give in to that pull, where is it dangerous? Context engineering Powerful lever of amplification – both good and bad

AI coding state of play, Agile meets Architectu...

AI coding state of play, Agile meets Architecture, March 10 2026

More Decks by Birgitta Boeckeler

Other Decks in Technology

Featured

Transcript