Exploratory Unit Testing with and for GenAI

© 2025 CGI Inc.

Slide 2: Took a while to see why testing != testing
Testing as artifact creation
Testing as performance
EXAMPLES of INTENT
EXPERIMENTS of IMPACT

Slide 3: Test-Driven Development vs. Exploratory unit testing

Slide 4 (@maaretp): Who Am I?
Guessing with power to accept
Testing my program or the tool?
Note: …pillaging digital content without consent, compensation and attribution. Even if legal, not ethical.

Slide 5: Computer Assisted Software Authorship
https://github.com/features/copilot/

Slide 6: CTRL+enter for alternatives

Slide 7: CTRL+enter for alternatives

Slide 8: WE are accountable
1. Intent / Implementation
2. Domain for the Layman
3. Domain for the Expert
4. Reference Implementation
5. People Filtering

Slide 10: Some Tests Done?

Slide 12: From One to Many
Asserts
Approvals

Slide 14: In a nutshell…
* Sandi Metz: The Magic Tricks of Testing https://www.youtube.com/watch?v=URSWYvyc42M

Slide 15: Developer intent
Part I. Review for correctness and conciseness
Part II. Input -> Output
Part III. Rules of behavior boundaries
Part IIII. Coverage
Part IIIII. Sampling vs wide nets (approvals)
Part VI. Properties
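
The progression from single input -> output examples to wide nets and properties can be sketched in Python. This is a minimal illustration, not code from the talk: `to_roman` and `from_roman` are hypothetical helpers written for this example, assuming classic subtractive notation and a 1..3999 range, and the approval step is hand-rolled rather than taken from an approvals library.

```python
# Hypothetical converter pair, assumed for illustration only.
SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert 1..3999 to classic (subtractive) Roman numerals."""
    if not isinstance(number, int) or not 1 <= number <= 3999:
        raise ValueError(f"unsupported input: {number!r}")
    parts = []
    for value, symbol in SYMBOLS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(text: str) -> int:
    """Inverse of to_roman, intended only for strings to_roman produced."""
    total, position = 0, 0
    for value, symbol in SYMBOLS:
        while text.startswith(symbol, position):
            total += value
            position += len(symbol)
    return total


# Sampling: a few handcrafted input -> output examples.
assert to_roman(4) == "IV"
assert to_roman(1994) == "MCMXCIV"


def approval_listing() -> str:
    """Wide net: render every supported value as one reviewable artifact."""
    return "\n".join(f"{n} -> {to_roman(n)}" for n in range(1, 4000))


def check_properties() -> None:
    """Properties: statements expected to hold for every supported input."""
    for n in range(1, 4000):
        roman = to_roman(n)
        assert from_roman(roman) == n  # round trip
        assert not any(ch * 4 in roman for ch in "IVXLCDM")  # no 4 in a row


check_properties()
```

In an approvals workflow the text from `approval_listing()` would be reviewed once by a person and then compared against later runs, which moves the decision about what is expected to after the output has been seen.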

Slide 24: Domain rules
IV → IIII in clock design as per orders of King Louis XIV of France
https://www.amalgamsab.com/iiii-or-is-it-iv.html

Slide 25: Domain
Part 1. Be the resident expert. Ask around.
Part II. Rules. More rules.
Part III. Find better experts.
Part IIII. Disagreeing with boundaries.
Part IIIII. Oracles.
Part VI. Find better oracles.
Part VII. No user would do what users would do.

Slide 26: Environment
Part 1. Dependencies.
Part II. Interruptions. Both software and hardware.
Part III. People.
“People are not pure functions; they have all sorts of interesting side effects.” - Engineering Management for the Rest of Us, Sarah Drasner
… nor are pure functions if you grow the boundary of what might fail.

Slide 27: WE are accountable
1. Intent / Implementation
2. Domain for the Layman
3. Domain for the Expert
4. Reference Implementation
5. People Filtering

Slide 28: Answer key to some of the bugs
1…3999 is not the right boundary.
4 is IIII and 5 is IIIII and 50 is XXXXX with right choice of domain.
Entirely different rules from classic to concise to simplified.
Fractions and zero do exist.
Infinite loops, miscalculation at boundaries.
Unimplemented boundary checking.
Reuse without license.
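
Several of these bugs come down to which domain rules and boundaries were chosen. A rough Python sketch of that idea follows, assuming three hypothetical notation tables (the names `CLASSIC`, `CLOCK_FACE` and `TALLY` are invented here, not taken from the talk) and one possible decision to reject unsupported input explicitly rather than loop or miscalculate.

```python
from numbers import Integral

# Hypothetical notation tables; each is one possible reading of "the domain".
CLASSIC = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
           (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
           (5, "V"), (4, "IV"), (1, "I")]
# Assumed clock-face rules: 4 is written IIII while 9 stays IX.
CLOCK_FACE = [(10, "X"), (9, "IX"), (5, "V"), (1, "I")]
# A purely additive table without V, L or D: 5 is IIIII and 50 is XXXXX.
TALLY = [(1000, "M"), (100, "C"), (10, "X"), (1, "I")]


def to_numeral(number, table=CLASSIC, lowest=1, highest=3999):
    """Greedy conversion with an explicit, configurable boundary check."""
    if isinstance(number, bool) or not isinstance(number, Integral):
        # One possible decision; a different domain could support fractions.
        raise TypeError(f"whole numbers only in this sketch, got {number!r}")
    if not lowest <= number <= highest:
        raise ValueError(f"{number} is outside {lowest}..{highest}")
    parts = []
    for value, symbol in table:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


print(to_numeral(4))                          # classic rules
print(to_numeral(4, CLOCK_FACE, highest=12))  # clock-face rules
print(to_numeral(50, TALLY))                  # additive rules
```

Whether zero and fractions should be rejected or supported is itself a domain decision; the sketch only makes that decision visible as a parameter and an error, so a test can disagree with it.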

Slide 29 (diagram labels)
Stakeholders happy, even delighted – Quality Information
Good Team’s Output – Quality Information
Less than Good Team’s Output – Quality Information
Results Gap
Surprise! Results Gap on a Team that thinks Testers == Testing
Pick up the pizza boxes…
”Find (some of) What Others May Have Missed”

Slide 30: A majority of the production failures (77%) can be reproduced by a unit test.
https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf
Through https://www.slideshare.net/Kevlin/the-error-of-our-ways

Slide 31: A vocab walkthrough
From Arrange-Act-Assert to Do-Verify: Structure
From Asserts to Approvals: Timing of expected
From handcrafted examples to generated: Tools to do more
From hard to soft: Decision of fail handling
Tools ideology: Pandora’s box with opportunities
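
The move from hard to soft asserts is a decision about fail handling: a hard assert stops at the first mismatch, while a soft assert records it and keeps checking. A minimal Python sketch, assuming a hand-rolled `SoftAsserts` collector (a hypothetical helper written for this example, not a library API):

```python
class SoftAsserts:
    """Collect mismatches instead of stopping at the first one."""

    def __init__(self):
        self.failures = []

    def check_equal(self, actual, expected, label=""):
        if actual != expected:
            self.failures.append(
                f"{label}: expected {expected!r}, got {actual!r}"
            )

    def assert_all(self):
        # Fail once, at the end, reporting everything that was collected.
        if self.failures:
            raise AssertionError("\n".join(self.failures))


def run_soft_checks(function_under_test, cases):
    """Check every (input, expected) pair and return the list of failures."""
    soft = SoftAsserts()
    for given, expected in cases:
        soft.check_equal(
            function_under_test(given), expected, label=f"input {given!r}"
        )
    return soft.failures


# The second expectation is deliberately wrong; the third case still runs.
failures = run_soft_checks(str.upper, [("a", "A"), ("b", "X"), ("c", "C")])
print(failures)
```

With a hard assert the run would have stopped at the second case and said nothing about the third; the soft version trades that early stop for a fuller picture of what failed.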

Slide 35: Insights you can act on
Founded in 1976, CGI is among the largest IT and business consulting services firms in the world. We are insights-driven and outcomes-based to help accelerate returns on your investments. Across hundreds of locations worldwide, we provide comprehensive, scalable and sustainable IT and business consulting services that are informed globally and delivered locally.
cgi.com
