AI coding state of play, Agile meets Architecture, March 10 2026

Overview of the current state of AI coding tools, and questions technical leaders should be asking themselves right now.
- Context engineering
- Increased autonomy, reduced supervision - ups and downs
- Harnesses

Birgitta Boeckeler

March 12, 2026

Transcript

  1. © 2026 Thoughtworks. AI relevance for technical leaders
     - Use AI for your own work
     - Help teams amplify the good, not the bad and the ugly
     - Make it as safe as possible to use AI
     - Use AI to improve and modernise the systems landscape
  2. Simplest form: “Rules” files (AGENTS.md)
     - The backend code is Python, it's in ./app
     - Build system: Poetry
     - Remember to activate the virtual environment before any Python or Poetry command
  3. Context Engineering: Beyond MCP servers and rules files
     - Skills, Rules, Specs, Subagents, Plugins, Commands, MCP Servers
  4. Skills example
     - Modularisation of rules
     - Can be loaded by LLM just-in-time
     - Can include more files
     - Often refer to already installed CLIs
  5. Context Engineering “in a nutshell”
     - Reusable instructions and conventions: Skills, Rules, Specs, Commands
     - Context interfaces: MCP Servers, Tools, Skills
     - “Intelligently” loaded, just in time
     - Manage and monitor context size
  6. Context Engineering: Ask yourself…
     - Coding conventions you want to amplify? → Skills
     - Workflows to build for modernisation initiatives? → Subagents, Skills
     - Tools that should be available in your org? → CLIs, MCP servers, LSPs, …
     - Versioning and distribution?
     - Is it making things better, or worse?
     - What are practices you want to amplify? → Skills
  7. A familiar beast rears its head: Dev Sandboxes
     - “X is not installed”
     - OutOfMemoryError
     - Internet access: Yes or no, when and where to?
  8. Less supervision: Ask yourself…
     - Where do you want to experiment with cloud agents?
     - How do you help your teams gauge the appropriate level of supervision?
  9. Probability …that AI gets something wrong; Impact …if AI gets something wrong; Detectability …that AI got something wrong
     - → Know the AI tool, know and engineer the context
     - → Know your confidence level in the requirements
     - → Know the use case criticality
     - → Know your feedback loops
     - Which workflow? How much review? How long without supervision?
  10. Probability …that AI gets something wrong; Impact …if AI gets something wrong; Detectability …that AI got something wrong
      - “You have to be this tall” to reduce supervision
  11. The lethal trifecta
      - Access to Private Data
      - Ability to Externally Communicate
      - Exposure to Untrusted Content
      - Source: Simon Willison, https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
  12. Cost - “the honeymoon is over”
      - Beginning of 2024: “Generating 100 lines of code only costs about 12 cents, compare that to a developer salary!” (Keynote at Craft Conf 2024, simplified)
      - Summer 2025: viberank.app
  13. Cost - “the honeymoon is over”
      - Beginning of 2024: “Generating 100 lines of code only costs about 12 cents, compare that to a developer salary!” (Keynote at Craft Conf 2024, simplified)
      - Summer 2025: viberank.app
      - $20 Flat Rates, $200 Flat Rates, Request limiting
  14. Why do 100 LOC cost more like $2.50, not $0.12?
      - Plan
      - Review the plan
      - Research existing code
      - Implement the first task
      - Make changes
      - Run the tests
      - Fix the tests
      - Check lint errors
      - Fix lint errors
      - Check the browser
      - “It’s not rendering, debug”
      - Fix
      - Code review
      - Improve a method
      - Summarise
  15. Less human supervision: Ask yourself…
      - How to sandbox coding agents?
      - Does everybody understand the lethal trifecta?
      - Where do you want to experiment with cloud agents?
      - How do you help your teams gauge the appropriate level of supervision?
  16. Experiment: Structural tests as agent feedback
      - “External SDKs may only be imported by files in server/clients”
  17. Harness
      - Executed by GPU inference; executed by CPU
      - Coding Agent
      - Guides (feed-forward): Principles; Rules, Examples; Ref Docs; How-tos; CfRs
      - Sensors (feed-back, self-correction): Runtime state; Static state; Agent-as-judge; Mutation testing
      - CPU-executed when possible
      - Shift sensors left
      - Human Steering Loop; Uses to build steering loop
      - LSPs; CLIs, scripts
  18. Probability …that AI gets something wrong; Impact …if AI gets something wrong; Detectability …that AI got something wrong
      - Harness
      - Our trust in the agent
  19. Will harnessable topologies be the new abstraction layer?
      - Data dashboard, TypeScript
      - Structure; Tech Stack
      - Harness template; instantiate; Guides; Sensors
  20. Strong forces are tempting humans out of the loop
      - Where can your organisation give in to that pull, where is it dangerous?
      - Context engineering: Powerful lever of amplification – both good and bad
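
Slide 11 names three capabilities that become dangerous in combination. A minimal sketch of how a team might flag that combination when reviewing an agent setup is shown below. The `AgentCapabilities` class, its field names, and `has_lethal_trifecta` are hypothetical and do not come from the talk or from any tool; the only thing taken from the slide is the list of three factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what one agent setup is allowed to do."""

    private_data_access: bool          # "Access to Private Data"
    external_communication: bool       # "Ability to Externally Communicate"
    untrusted_content_exposure: bool   # "Exposure to Untrusted Content"


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """Return True only when all three factors are present at the same time."""
    return (
        caps.private_data_access
        and caps.external_communication
        and caps.untrusted_content_exposure
    )


# Illustrative setups (assumptions, not examples from the slides).
offline_sandbox = AgentCapabilities(
    private_data_access=True,
    external_communication=False,
    untrusted_content_exposure=True,
)
open_cloud_agent = AgentCapabilities(
    private_data_access=True,
    external_communication=True,
    untrusted_content_exposure=True,
)

print(has_lethal_trifecta(offline_sandbox))   # False
print(has_lethal_trifecta(open_cloud_agent))  # True
```

Removing any one of the three factors makes the check return False, which is why sandboxing questions such as "Internet access: Yes or no, when and where to?" matter for this risk.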
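
The arithmetic behind slides 12 to 14 can be made explicit. The sketch below uses only the two figures quoted on the slides ($0.12 and $2.50 per 100 lines of code) and the step list from slide 14. Spreading the cost evenly across steps is an illustrative assumption; the slides give no per-step breakdown.

```python
# Figures quoted on the slides, in USD per 100 lines of code.
QUOTED_COST_2024 = 0.12
OBSERVED_COST = 2.50

# Steps of an agentic workflow as listed on slide 14.
AGENT_STEPS = [
    "Plan",
    "Review the plan",
    "Research existing code",
    "Implement the first task",
    "Make changes",
    "Run the tests",
    "Fix the tests",
    "Check lint errors",
    "Fix lint errors",
    "Check the browser",
    "It's not rendering, debug",
    "Fix",
    "Code review",
    "Improve a method",
    "Summarise",
]

multiplier = OBSERVED_COST / QUOTED_COST_2024

# Assumption for illustration only: cost is spread evenly across the steps.
average_per_step = OBSERVED_COST / len(AGENT_STEPS)

print(f"{len(AGENT_STEPS)} steps, about {multiplier:.0f}x the quoted cost")
print(f"average per step if spread evenly: ${average_per_step:.2f}")
```

With these numbers the observed cost is roughly 21 times the 2024 figure, and an even split would put each of the 15 steps at about $0.17, in the same range as the original single-generation estimate.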
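
The rule quoted on slide 16 can be turned into a structural test whose failure messages are fed back to the agent. The following is a minimal sketch, not the implementation used in the experiment: it checks Python sources with the standard `ast` module, whereas the talk does not say which language or tooling was used. The SDK names in `EXTERNAL_SDKS` and the file paths in the usage example are hypothetical.

```python
import ast
from pathlib import PurePosixPath

# Hypothetical SDK package names; the slide does not list any.
EXTERNAL_SDKS = {"stripe", "boto3"}
# Directory taken from the rule quoted on the slide.
ALLOWED_DIR = PurePosixPath("server/clients")


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a Python source string."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which cannot be external SDKs.
            modules.add(node.module.split(".")[0])
    return modules


def violations(files: dict[str, str]) -> list[str]:
    """List one message per SDK import found outside the allowed directory."""
    found = []
    for path, source in sorted(files.items()):
        if ALLOWED_DIR in PurePosixPath(path).parents:
            continue  # files under server/clients may import SDKs
        for sdk in sorted(imported_modules(source) & EXTERNAL_SDKS):
            found.append(
                f"{path}: imports external SDK '{sdk}' outside {ALLOWED_DIR}"
            )
    return found


# Usage with in-memory sources standing in for a repository checkout.
example_files = {
    "server/clients/payments.py": "import stripe\n",
    "server/routes/checkout.py": "from stripe import Charge\n",
    "server/routes/health.py": "import json\n",
}
for message in violations(example_files):
    print(message)
```

Because the check runs on the CPU and returns a precise message naming the file and the offending import, it fits the harness idea of cheap sensors that let the agent correct itself before a human reviews the change.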