ATAGTR: Sociotechnical Guardrails for AI-Driven Application Testing

With machine learning, large language models and retrieval augmented generation applied to text, voice, images and video, there are plenty of things called AI out there. Under the 'acting humanly' framing of AI, any software doing things humans could do gets bundled into the new wave of AI-driven application testing. Armed with a mindset of evaluation and amplifying fail signals, we are well educated, with great examples, on the many ways these technologies fail. We have less education on the guardrails, and on how these technologies add to the results of testing and to productivity. We need the warnings to make sense of whether we are OK with all of this, yet time spent listening to and delivering warnings is time away from figuring out the guardrails to succeed with this. Succeed with what is available to us all today, even if not yet perfectly packaged.

In this talk, we go through three demos, framed from the perspective of someone who cares about the results of testing and the value of a testing perspective in software development. We have people around us with these augmented intelligence solutions, and we need a conversation framed by the change in testing and by curiosity rather than fear. We need sociotechnical guardrails that provide us with useful solutions. Given there is a problem here, how do we work with AI so that we recognize and mitigate the problems? How do we move from the value of fun to value at our work, and discuss the use of time in a frame of curiosity, building a future we can and want to be around in? How would you use AI for it to be useful to you and your organization?

Time used on something is time away from something else. You'll use time listening to this, and chances are you'll be able to navigate the waters of AI-driven application testing a little better, today.

Maaret Pyhäjärvi

November 15, 2024

Transcript

  1. © 2024 CGI Inc. Hello, I am an AI practitioner 👋

    IVVES (2019): ITEA3 (EU) research project on AI in testing / testing AI. GitHub Copilot (2021): a pair programming job interview with this led to ‘Let’s Do a Thing and Call it Foo’. ChatGPT (2022): GenAI pair testing. CodiumAI: code reviews to mostly ignore. Microsoft Copilot: corporate constraints. Diagram labels: Models, Integrations, Integrations, LLMs, Contracts, RAG.
  2. Time used on warning about test automation is time away from succeeding with it.

    Photo by Filip Zrnzević on Unsplash
  3. Setting expectations

    01 Programmatic tests with GitHub Copilot; 02 GenAI pair testing with general purpose LLMs; 03 RAGifying with general purpose genAI; 04 Sociotechnical guardrails
  4. “I love the extra autocompletion that I get with it, it feels like I never have to write any kind of boilerplate code anymore, and I also find it very useful to just ask stuff directly in the IDE. I used to google stuff all the time, and ended up on Stackoverflow a lot, but nowadays I rarely have to do that.”
  5. GenAI Pair Testing

    Search boundaries: argue for different stances on assumptions. Recognize insufficiency and fix it – creating average text is not *your* goal. Freedom to criticize as the pair takes no offense. Dare to ask things you’d not dare to ask from a colleague. Co-piloting allows for repair. Photo by Rajvir Kaur on Unsplash
  6. RAG + input templating applied over task breakdown

    Rohamo, Paavo. Enabling Self-healing Locators for Robot Framework with Large Language Models (thesis)
  7. Mapping the acronyms

    ML: Machine Learning; LLM: Large Language Model; RAG: Retrieval Augmented Generation; CoT: Chain of Thought; Agents: Actors in Flows
  8. Essential sociotechnical guardrails: task breakdown, balancing attended and unattended, timing test activities, experimentation and improve first thinking
  9. Practice-level guardrails

    01 Expected values: Pay attention to the old testing wisdom of oracles and how we know. Our critical thinking, built on our learning through curiosity of the world, is essential. 02 Anti-toolist worldview: Realize that features in tools can be copied. Looking for the one best tool makes little sense. We need to protect our time to a partner of choice. 03 Taskwide learning: Not lifelong learning or life wide learning, but task wide learning. Everything we do is a learning activity.
  10. Shared Direction and Results that Show up in Scale

    AI-Driven Testing: Measure and assess to baseline productivity and good practices. Experiment together with customers to deliver promise of value in point, application and system solutions. Capture in pipelines and methodology of sociotechnical guardrails. Habitually apply and reflect to instill culture of learning. Teach for co-creation, share and learn to avoid regional divide. Scale with IP on improved service and licensed solutions. Labels: Human-centric, Enhancing, Incremental, Impactful at Scale.
  11. Insights you can act on

    Founded in 1976, CGI is among the largest IT and business consulting services firms in the world. We are insights-driven and outcomes-based to help accelerate returns on your investments. Across hundreds of locations worldwide, we provide comprehensive, scalable and sustainable IT and business consulting services that are informed globally and delivered locally. cgi.com