Coding challenge: AI-test vs PI-test

Let's be honest: every developer loves writing tests, right? So we're all diligently practicing Test-Driven Development (TDD), not just preaching it. Or... are we? Maybe that's why we keep asking our AI coding assistants to write tests for us. Just prompt ChatGPT to generate a test for your latest class and voilà: "Of course I can help you write your tests. Here's a complete test suite that achieves 100% line and branch coverage."

But does that really mean the tests are any good?

In this session, Laurens and Frederieke will push AI-generated tests to their limits using mutation testing and the PI-Test framework. Together, we'll explore whether these tests truly hold up under scrutiny, or if they just look good on paper.
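
To make the gap between coverage and test quality concrete, here is a minimal sketch in plain Java. It is not taken from the talk: `AgeCheck`, `isAdultMutated` and the two "suites" are hypothetical names, and the mutant is written by hand to imitate what a conditionals boundary mutation would do. The weak suite exercises both outcomes of the comparison, yet it cannot tell the original from the mutant.

```java
import java.util.function.IntPredicate;

// Hypothetical example class; not part of the talk's code base.
class AgeCheck {
    // Original logic: adults are 18 or older.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Hand-written imitation of a conditionals boundary mutant: >= becomes >.
    static boolean isAdultMutated(int age) {
        return age > 18;
    }
}

class CoverageVersusMutationDemo {
    // A weak suite: it exercises both outcomes of the comparison,
    // but never probes the boundary value 18.
    static boolean weakSuitePasses(IntPredicate isAdult) {
        return isAdult.test(30) && !isAdult.test(5);
    }

    // A stronger suite adds the values directly around the boundary.
    static boolean strongSuitePasses(IntPredicate isAdult) {
        return weakSuitePasses(isAdult) && isAdult.test(18) && !isAdult.test(17);
    }

    public static void main(String[] args) {
        // The weak suite passes for both versions, so the mutant survives.
        System.out.println("weak suite, original: " + weakSuitePasses(AgeCheck::isAdult));
        System.out.println("weak suite, mutant:   " + weakSuitePasses(AgeCheck::isAdultMutated));
        // The strong suite fails on the mutant, so the mutant is killed.
        System.out.println("strong suite, mutant: " + strongSuitePasses(AgeCheck::isAdultMutated));
    }
}
```

A mutation testing tool automates this idea: it generates the mutants itself and reports which ones the existing tests fail to detect.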

By the end, you'll know when to say to your coding assistant, "Thanks, ChatGPT, that was helpful," and when to say, "Thanks, but no thanks. I'll write my own test this time."

Frederieke Scheper

October 10, 2025

Transcript

  1. Coding Challenge: AI-TEST VS PI-TEST
    Devoxx Belgium 2025
    Laurens van der Kooi | Frederieke Scheper
    Designed by Robin Potze
  2. About us
    Laurens van der Kooi, Frederieke Scheper
    CodeSmith, Sopra Steria
    Senior Java Developer, Sopra Steria
  3. PI-test - mutation testing in Java
    “Simple as 1, 2, 3?”
    mvn test-compile pitest:mutationCoverage
    “All the mutants should have been killed 🛸, No survivors allowed ☠ …”
  4. PI-test - example mutators: “Conditionals boundary mutator”
    Source code: if (a < b) { /* do something */ }
    ➡ becomes
    Mutated code: if (a <= b) { /* do something */ }
    Operator mapping (source ➡ mutated): < ➡ <=, <= ➡ <, > ➡ >=, >= ➡ >
    AND THE TEST SHOULD FAIL!
  5. PI-test - example mutators: “Increments mutator”
    Source code: public int method(int i) { i++; return i; }
    ➡ becomes
    Mutated code: public int method(int i) { i--; return i; }
    Mapping (source ➡ mutated): i++ ➡ i--, i-- ➡ i++, i = i+1 ➡ i = i-1, i = i-1 ➡ i = i+1
    AND THE TEST SHOULD FAIL (AGAIN)!
  6. PI-test - example mutators: “Void method calls mutator”
    Source code: public void aVoidMethod(int i) { /* does something */ } public int foo() { int i = 5; aVoidMethod(i); return i; }
    ➡ becomes
    Mutated code: public void aVoidMethod(int i) { /* does something */ } public int foo() { int i = 5; /* method removed !! */ return i; }
    AND THE TEST SHOULD FAIL (AGAIN)!
  7. PI-test - example mutators: “Null returns mutator”
    Source code: public String sayHello() { int i = 5; String foo = method(i); return "hello " + foo; }
    ➡ becomes
    Mutated code: public String sayHello() { int i = 5; String foo = method(i); return null; }
    AND THE TEST SHOULD FAIL (AGAIN)!
  8. PI-test - example mutators: “Empty returns mutator”
    Source code: public Set<String> strings() { int i = 5; Set<String> foo = method(i); return foo; }
    ➡ becomes
    Mutated code: public Set<String> strings() { int i = 5; Set<String> foo = method(i); return Collections.emptySet(); }
    AND THE TEST SHOULD FAIL (AGAIN)!
  9. PI-test - example mutators: “Empty returns mutator” (same source and mutated code as slide 8)
    AND THE TEST SHOULD FAIL (AGAIN)!
    Over to the DJ!
  10. At a party
    - People come together to dance and have fun.
    - DJs perform MixSessions to guide the crowd.
    - Music evolves: warm-up → peak → cool-down.
  11. What does a DJ do at a party?
    - Select Tracks
    - Mixes & Transitions
    - Read the Crowd
    - Respond to Requests
    - Build Atmosphere
    Type, Action
  12. Reading the crowd: crowd events
    - CrowdCheered
    - RequestFromAudienceReceived
    - DanceFloorFilledUp
    - DancefloorEmptied
    - CrowdEnergyDropped
  13. The DJ starts
    TESTS: findTrack, MixSession
    RequestFromAudienceReceived: Play my favourite song!
  14. The dance floor starts filling up
    CrowdCheered: LOW, MEDIUM, HIGH
    CrowdCheered: Sheesh!
  15. The floor is packed
    DanceFloorFilledUpEvent
    Sheesh! Turn it up! Pop off! Straight bangers!
  16. The DJ makes a mistake
    E-8302: DanceFloorEmptied
    CrowdEnergyDropped: Boooooo!
  17. The DJ recovers
    CrowdCheered: Saved!!!
  18. And the winner is … AI-TEST? PI-TEST?
  19. And the winner is … AI-TEST? PI-TEST? THE DEV-COMMUNITY! 🍻
  20. THANK YOU
    Frederieke Scheper | Laurens van der Kooi
    Devoxx Belgium 2025
    Designed by Robin Potze
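
The mutators from slides 5 to 8 can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the speakers' demo code: `SlideMutants`, `Playlist` and the `...Check` methods are hypothetical, and every `...Mutated` method is written by hand to imitate the change the named mutator would make. Each check plays the role of a unit test and returns true when that test passes.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.IntUnaryOperator;
import java.util.function.Supplier;
import java.util.function.ToIntFunction;

// Hypothetical stand-ins for the slide snippets.
class SlideMutants {
    // Small collaborator with an observable side effect (an assumption for this sketch).
    static class Playlist {
        private int tracks;
        void addTrack() { tracks++; }
        int size() { return tracks; }
    }

    // Increments mutator: i++ becomes i--.
    static int increment(int i) { i++; return i; }
    static int incrementMutated(int i) { i--; return i; }

    // Void method calls mutator: the call to addTrack() is removed.
    static int fill(Playlist p) { p.addTrack(); return 5; }
    static int fillMutated(Playlist p) { /* method call removed */ return 5; }

    // Null returns mutator: the return value is replaced by null.
    static String sayHello() { return "hello " + "world"; }
    static String sayHelloMutated() { return null; }

    // Empty returns mutator: the returned set is replaced by an empty set.
    static Set<String> strings() { return Set.of("a", "b"); }
    static Set<String> stringsMutated() { return Collections.emptySet(); }
}

class SlideMutantChecks {
    static boolean incrementCheck(IntUnaryOperator impl) {
        return impl.applyAsInt(5) == 6;
    }

    // Checking only the return value would let the mutant survive;
    // the side effect on the playlist has to be asserted as well.
    static boolean fillCheck(ToIntFunction<SlideMutants.Playlist> impl) {
        SlideMutants.Playlist p = new SlideMutants.Playlist();
        return impl.applyAsInt(p) == 5 && p.size() == 1;
    }

    static boolean helloCheck(Supplier<String> impl) {
        return "hello world".equals(impl.get());
    }

    static boolean stringsCheck(Supplier<Set<String>> impl) {
        return Set.of("a", "b").equals(impl.get());
    }

    public static void main(String[] args) {
        // A mutant counts as killed when the check fails on the mutated code.
        System.out.println("increments killed:    " + !incrementCheck(SlideMutants::incrementMutated));
        System.out.println("void call killed:     " + !fillCheck(SlideMutants::fillMutated));
        System.out.println("null return killed:   " + !helloCheck(SlideMutants::sayHelloMutated));
        System.out.println("empty return killed:  " + !stringsCheck(SlideMutants::stringsMutated));
    }
}
```

The void method call case is the one most likely to slip through: a test that only inspects the returned value passes on both versions, which is the kind of surviving mutant a mutation report would flag.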