Using TDD to Get Better Results From LLMs/AI

Clare Sudbery
June 17, 2026

(This is the slide deck from the workshop I delivered at Training Day for SoCraTes UK 2026.)

How can you use an LLM to build reliable software? Is it even possible? Are the extravagant claims, both for and against, realistic?

If you combine tests and XP rigour with the efforts of an AI coding assistant, you can get something very powerful. This workshop will help you start building a very simple app using tests, process files and AI, and will also show some of the potential pitfalls.

Transcript

  1. Agenda @ClareSudbery

    14:00 – 14:10 INTRO + PAIRS
    14:10 – 14:40 Exercise Part 1
    14:40 – 14:50 BREAK
    14:50 – 15:00 Exercise Part 1 (Continued)
    15:00 – 15:10 Exercise Part 2
    15:10 – 15:15 Let me tell you a story…
    15:15 – 15:30 Final discussion
  2. Download a tool (if you haven’t already)

    One that will work in an IDE:
    • Windsurf – free until you reach your token limit (which could happen during the day!)
    • Cursor – free up to a limit, or if you have your own API keys from OpenAI or Anthropic (Claude)
    • VS Code with GitHub Copilot – free up to a limit
    • Codex (ChatGPT)
    • Claude (reported performance degradation recently)
  3. Write your most optimistic LLM-coding goal on a Post-it (2 minutes)

    • What would be fun to build using LLM-augmented coding?
    • Write clearly – people need to read it
    • Add your name and remember what you wrote!
  4. YOU ARE LIKELY TO BE ABOUT TO MOVE SEATS

    Be prepared! And remember to introduce yourself… Name + role
  5. Walk around, waving your sticky note

    • Find the person who wrote it
    • Form a pair / three with them
    • Introduce yourselves! Name + role
  6. TDD

    • Know what it is?
    • Used it before?
    • Use it regularly?
  7. Exercise, Part One

    Create a music sequencer. The user will be shown a simple grid. When they click on squares in the grid, musical notes are played. Notes can be toggled on and off. When they click a Play button, the notes in the grid are played in sequence.
  8. The Catch

    This half of the room will use TDD, and enforce the AI’s use of TDD. The other half are BANNED from any mention of tests. If the AI suggests tests, follow its suggestions but don’t intervene.
  9. Exercise, Part One

    Create a music sequencer. The user will be shown a simple grid. When they click on squares in the grid, musical notes are played. Notes can be toggled on and off. When they click a Play button, the notes in the grid are played in sequence.
  10. Exercise, Part Two

    Join with another pair. Prove to them that your app does what it should.
  11. LLM-AUGMENTED CODING: GOLDEN RULES

    1. Encourage the LLM to ask clarifying questions.
    2. Move in small steps. Big steps feel fast but they quickly slow you down.
    3. Always start with tests.
       a. Ask the LLM to write tests before code.
       b. Check and refine the tests.
       c. Ask the LLM to make the tests pass by implementing the solution.
       d. Keep running all tests to make sure nothing is broken.
    4. Ask the LLM to summarise conversations. Persist those summaries in files.
    5. Use a starter character in each process file.
    6. Throw things away! Start again from scratch. Beware the sunk cost fallacy.
    7. Regularly start a new chat context.
    8. Be polite. Use "please" and "thank you".
  12. LLM-AUGMENTED CODING: GOLDEN RULES

    9. Give the LLM examples that will help it to understand.
    10. Don't write code (first draft) if the LLM can do it for you.
    11. Beware confirmation bias!
    12. Try having two or more LLMs working in parallel.
    13. Ask the LLM for multiple options, with explanations of pros and cons.
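As a rough illustration of rule 3 applied to the sequencer exercise, the sketch below shows the kind of tests one might ask the LLM to write first, followed by a minimal implementation that makes them pass. The `SequencerGrid` class, its `toggle` and `playback_order` methods, and the row/step layout are hypothetical names invented for this example; none of it comes from the workshop's own code, and audio playback is left out entirely.

```python
from __future__ import annotations

import unittest


# Step a/b: tests written (and reviewed) before any implementation exists.
class SequencerGridTest(unittest.TestCase):
    def test_new_grid_is_silent(self):
        self.assertEqual(SequencerGrid(2, 3).playback_order(), [[], [], []])

    def test_toggle_turns_note_on_then_off(self):
        grid = SequencerGrid(2, 3)
        self.assertTrue(grid.toggle(0, 1))   # first click: note on
        self.assertFalse(grid.toggle(0, 1))  # second click: note off
        self.assertEqual(grid.playback_order(), [[], [], []])

    def test_play_visits_steps_in_sequence(self):
        grid = SequencerGrid(2, 3)
        grid.toggle(1, 2)
        grid.toggle(0, 0)
        grid.toggle(1, 0)
        self.assertEqual(grid.playback_order(), [[0, 1], [], [1]])

    def test_toggle_outside_grid_is_rejected(self):
        with self.assertRaises(IndexError):
            SequencerGrid(2, 3).toggle(2, 0)


# Step c: the smallest implementation that makes the tests above pass.
class SequencerGrid:
    """Hypothetical grid model: rows are pitches, columns are time steps."""

    def __init__(self, rows: int, steps: int) -> None:
        self.rows = rows
        self.steps = steps
        self._active: set[tuple[int, int]] = set()

    def toggle(self, row: int, step: int) -> bool:
        """Flip one square; return True if the note is now on."""
        if not (0 <= row < self.rows and 0 <= step < self.steps):
            raise IndexError("square is outside the grid")
        cell = (row, step)
        if cell in self._active:
            self._active.remove(cell)
            return False
        self._active.add(cell)
        return True

    def playback_order(self) -> list[list[int]]:
        """For each step, left to right, list the rows whose notes are on."""
        return [
            sorted(row for (row, s) in self._active if s == step)
            for step in range(self.steps)
        ]
```

Saved as a module, the tests can be run with `python -m unittest <module name>` after every change (step d). Keeping the grid logic separate from the UI and audio is one possible design choice that keeps each step small and makes it easier to show another pair that the app does what it should.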
  13. RESULTS

    • Who completed part 1?
    • Who completed the extension?
    • What problems did you encounter?
    • Who's happy with the result? What did you like?
    • If you were successful, why do you think that is?
  14. Context windows

    • Context window = number of tokens the LLM can handle
    • Tokens are chunks of text
    • All training text is broken down into tokens
    • LLMs work by predicting the next most likely token
    • Context windows are large - often millions of tokens
    • Every time we start a new chat, we start filling up the effective context
  15. Effective context

    • Every time we start a new chat, we start filling up the effective context
    • This seems to be more like 10,000 - 20,000 tokens
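To make the effective-context idea concrete, here is a minimal sketch of how one might estimate when a chat is approaching that limit and it is time to start a new context (rule 7). It assumes a crude rule of thumb of about four characters per token for English text, which is not a real tokenizer and will differ between models; the function names and the 10,000-token budget (the lower end of the range quoted on the slide) are illustrative choices, not part of any tool's API.

```python
from __future__ import annotations

# Assumption: roughly 4 characters per token for English text.
# Real tokenizers vary by model, so treat this as a ballpark only.
CHARS_PER_TOKEN = 4

# Lower end of the effective-context range quoted on the slide.
EFFECTIVE_CONTEXT_TOKENS = 10_000


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def should_start_new_chat(
    messages: list[str], budget: int = EFFECTIVE_CONTEXT_TOKENS
) -> bool:
    """Return True once the estimated size of the chat reaches the budget."""
    total = sum(estimate_tokens(message) for message in messages)
    return total >= budget
```

For example, `should_start_new_chat(history)` could be checked before each new prompt, where `history` is a list of every message sent and received so far. When it returns True, that is a reasonable moment to ask the LLM for a summary, persist it in a file (rule 4), and open a fresh chat.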