Slide 1

Slide 1 text

Sociotechnical Guardrails for AI-Driven Application Testing
Maaret Pyhäjärvi
November 2024
© 2024 CGI Inc.

Slide 2

Slide 2 text

Hello, I am an AI practitioner 👋
IVVES (2019): ITEA3 (EU) research project on AI in testing / testing AI
GitHub Copilot (2021): Pair programming job interview with this led to ‘Let’s Do a Thing and Call it Foo’
ChatGPT (2022): GenAI pair testing
CodiumAI: Code reviews to mostly ignore
Microsoft Copilot: Corporate constraints
Models, Integrations, LLMs, Contracts, RAG

Slide 3

Slide 3 text

Time used on warning about test automation is time away from succeeding with it.
Photo by Filip Zrnzević on Unsplash

Slide 4

Slide 4 text

Setting expectations
01 Programmatic tests with GitHub Copilot
02 GenAI pair testing with general purpose LLMs
03 RAGifying with general purpose genAI
04 Sociotechnical guardrails

Slide 5

Slide 5 text

Programmatic tests with GitHub Copilot

Slide 6

Slide 6 text

Guessing with power to accept

Slide 7

Slide 7 text

Who Am I?
Guessing with power to accept

Slide 8

Slide 8 text

CTRL+enter for alternatives

Slide 9

Slide 9 text

CTRL+enter for alternatives

Slide 10

Slide 10 text

Some Tests Done?

Slide 11

Slide 11 text

“I love the extra autocompletion that I get with it, it feels like I never have to write any kind of boilerplate code anymore, and I also find it very useful to just ask stuff directly in the IDE. I used to google stuff all the time, and ended up on Stackoverflow a lot, but nowadays I rarely have to do that.”

Slide 12

Slide 12 text

GenAI Pair Testing with General Purpose LLMs

Slide 13

Slide 13 text

GenAI Pair Testing
Search boundaries: argue for different stances on assumptions
Recognize insufficiency and fix it – creating average text is not *your* goal
Freedom to criticize as the pair takes no offense
Dare to ask things you’d not dare to ask a colleague
Co-piloting allows for repair
Photo by Rajvir Kaur on Unsplash

Slide 14

Slide 14 text

Imagine exploring from generated charters

Slide 15

Slide 15 text

Predictive text generation is useful for recall, not reasoning

Slide 16

Slide 16 text

RAGifying with general purpose genAI

Slide 17

Slide 17 text

Simplest form of larger context

Slide 18

Slide 18 text

RAG + input templating applied over task breakdown
Rohamo, Paavo. Enabling Self-healing Locators for Robot Framework with Large Language Models (thesis)
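A minimal sketch of what "RAG + input templating" could look like for one small task, repairing a broken locator. It is an illustration under stated assumptions, not the approach taken in the cited thesis: the page fragments, the template wording, and the names `DOCUMENTS`, `retrieve`, and `build_prompt` are all hypothetical, retrieval is plain word overlap instead of embeddings, and no model is actually called.

```python
from string import Template

# Hypothetical page fragments standing in for the retrievable context.
DOCUMENTS = [
    "<button id='submit-order' class='btn primary'>Place order</button>",
    "<input id='customer-email' type='email' name='email'>",
    "<a id='help-link' href='/help'>Need help?</a>",
]

# Input template: the fixed task framing with slots for retrieved context.
PROMPT = Template(
    "The locator '$broken_locator' no longer matches.\n"
    "Candidate elements:\n$context\n"
    "Suggest one replacement locator."
)


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def retrieve(query, documents, top_k=2):
    """Rank documents by how many tokens they share with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(broken_locator, documents, top_k=2):
    """Fill the template with the locator and the retrieved fragments."""
    hits = retrieve(broken_locator, documents, top_k)
    context = "\n".join(f"- {doc}" for doc in hits)
    return PROMPT.substitute(broken_locator=broken_locator, context=context)


# The resulting text is what would be sent to a general purpose LLM.
prompt = build_prompt("id:submit-button", DOCUMENTS)
print(prompt)
```

Keeping retrieval and templating as separate small functions mirrors the task breakdown idea: each step can be inspected and tested on its own before any model output is trusted.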

Slide 19

Slide 19 text

Mapping the acronyms
ML: Machine Learning
LLM: Large Language Model
RAG: Retrieval Augmented Generation
CoT: Chain of Thought
Agents: Actors in Flows

Slide 20

Slide 20 text

Sociotechnical Guardrails

Slide 21

Slide 21 text

Core sociotechnical guardrail: data ownership and confidentiality

Slide 22

Slide 22 text

Essential sociotechnical guardrails: task breakdown, balancing attended and unattended, timing test activities, experimentation, and improve-first thinking

Slide 23

Slide 23 text

Boundary-seeking sociotechnical guardrails: costs, compensations, architecture

Slide 24

Slide 24 text

Practice-level guardrails
01 Expected values: Pay attention to the old testing wisdom of oracles and of how we know. Our critical thinking, built on learning through curiosity about the world, is essential.
02 Anti-toolist worldview: Realize that features in tools can be copied. Looking for the one best tool makes little sense. We need to protect our time to a partner of choice.
03 Taskwide learning: Not lifelong learning or life-wide learning, but task-wide learning. Everything we do is a learning activity.
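One way to read the "expected values" guardrail in code, as a small sketch: the expected values come from an oracle a human chose independently, not from whatever an implementation (hand-written or AI-suggested) happens to return. The function under test and the names `ORACLE` and `check_against_oracle` are hypothetical examples, not taken from the talk.

```python
def leap_year(year):
    """Function under test; imagine it was accepted from an AI suggestion."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Expected values chosen by a person from the Gregorian calendar rules,
# not copied from the output of the implementation being tested.
ORACLE = {1900: False, 2000: True, 2023: False, 2024: True}


def check_against_oracle(func, oracle):
    """Return the inputs where the function disagrees with the oracle."""
    return [value for value, expected in oracle.items() if func(value) != expected]


failures = check_against_oracle(leap_year, ORACLE)
print("disagreements with oracle:", failures)
```

A plausible-looking but wrong suggestion, such as one that only checks divisibility by four, would be caught here because the oracle includes the year 1900.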

Slide 25

Slide 25 text

Shared Direction and Results that Show up in Scale
AI-Driven Testing
Measure and assess to baseline productivity and good practices
Experiment together with customers to deliver promise of value in point, application and system solutions
Capture in pipelines and methodology of sociotechnical guardrails
Habitually apply and reflect to instill culture of learning
Teach for co-creation, share and learn to avoid regional divide
Scale with IP on improved service and licensed solutions
Human-centric, Enhancing, Incremental, Impactful at Scale

Slide 26

Slide 26 text

Insights you can act on
Founded in 1976, CGI is among the largest IT and business consulting services firms in the world. We are insights-driven and outcomes-based to help accelerate returns on your investments. Across hundreds of locations worldwide, we provide comprehensive, scalable and sustainable IT and business consulting services that are informed globally and delivered locally.
cgi.com