Chapter 1 – Introduction to Generative AI for Software Testing (ISTQB® CT-GenAI v1.1). Reading Materials

Slide 1

Slide 1 text

ISTQB® CT-GenAI TRAINING COURSE
Chapter 1. Introduction to Generative AI for Software Testing
Iuliia Emelianova, Dmitrii Degtiarenko
BUILD SOFTWARE TO TEST SOFTWARE
ISTQB® CT-GenAI COURSE 2026, V1.1
exactpro.com

Slide 2

Slide 2 text

Learning Activity Overview

Title: Chapter 1 – Introduction to Generative AI for Software Testing (ISTQB® CT-GenAI v1.1)
Format: Reading Materials (self-study or guided reading)
Estimated Duration: 100 minutes
Target Audience: Software Testers, Test Automation Engineers, Test Analysts, Test Managers, Software Developers; other professionals who need a solid understanding of Generative AI (GenAI) in testing, such as project managers, quality managers, software development managers, business analysts, IT directors and consultants; and professionals preparing for ISTQB® CT-GenAI certification
Programme Context: This learning activity forms part of the ISTQB® CT-GenAI training programme and aligns with syllabus version 1.1.

Engagement: During this chapter, you will:
● Understand what GenAI and Large Language Models (LLMs) are, how they work and when to use them
● See how LLMs support software testing tasks such as requirements analysis, test case creation, and defect detection
● Learn how multimodal LLMs enhance testing through image and text understanding
● Explore how LLMs assist in test data generation, automation, and result analysis

ISTQB® CT-GenAI Training Course | Chapter 1. Introduction to Generative AI for Software Testing Page 2 of 34

Slide 5

Slide 5 text

● Multimodal models. Models that handle more than one kind of data (e.g., text, pictures and audio).

Think of GenAI as an extremely knowledgeable assistant that has read almost everything on the internet. It doesn’t just repeat information; it can generate something new, like a recipe, a poem, or a test case (a set of preconditions, inputs, actions (where applicable), expected results and postconditions, developed based on test conditions). To understand your request, it chops your text into Lego-like bricks called tokens. The context window is how many Lego bricks it can keep on the table while building an answer. A multimodal model is like a person who can read, look at photos, and listen to music all at once, then explain how they fit together.

In software testing, LLMs can support tasks across the entire test process, such as:
● reviewing and improving acceptance criteria (the criteria that a work product must satisfy to be accepted by the stakeholders);
● generating test cases or test scripts (sequences of instructions for the execution of tests);
● identifying potential defects and analysing defect patterns;
● generating synthetic test data (test data being the data needed for test execution);
● supporting documentation generation.

Example 1.1. Using Generative AI to Derive Test Cases from Requirements

Imagine pasting this requirement into an AI tool: “If a user enters an invalid password three times, the system must lock the account for 15 minutes.”

Here’s how GenAI might help a tester:
1. It splits the sentence into tokens so it can “read” the rule accurately.
2. It interprets the meaning: three wrong attempts, lock the account, duration is 15 minutes.
3. It proposes draft test ideas:
   ● Enter a wrong password once: expect no lock.
   ● Enter it twice: still no lock.
   ● Enter it three times: expect a lock message, and the timer starts.
   ● After 15 minutes, try again: login should work.
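The draft test ideas in Example 1.1 can be turned into an executable check. The following is only a minimal sketch: the `Account` class, its `login` method, the return strings, and the use of a plain numeric clock are all hypothetical stand-ins for a real system under test, introduced here purely for illustration.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3          # from the requirement: three invalid attempts
LOCK_SECONDS = 15 * 60    # from the requirement: lock for 15 minutes


@dataclass
class Account:
    """Hypothetical account model, used only to make the test ideas runnable."""
    password: str
    failed_attempts: int = 0
    locked_until: float = 0.0

    def login(self, attempt: str, now: float) -> str:
        # While the lock is active, every attempt is rejected.
        if now < self.locked_until:
            return "locked"
        if attempt == self.password:
            self.failed_attempts = 0
            return "ok"
        self.failed_attempts += 1
        if self.failed_attempts >= MAX_ATTEMPTS:
            # Third invalid attempt: start the 15-minute timer.
            self.locked_until = now + LOCK_SECONDS
            self.failed_attempts = 0
            return "locked"
        return "invalid"


def run_draft_tests() -> list[str]:
    """Replay the four draft test ideas and collect the observed outcomes."""
    account = Account(password="s3cret")
    return [
        account.login("wrong", now=0),                  # once: no lock
        account.login("wrong", now=1),                  # twice: still no lock
        account.login("wrong", now=2),                  # three times: locked
        account.login("s3cret", now=2 + LOCK_SECONDS),  # after 15 minutes
    ]
```

Passing the time in as `now` rather than reading the system clock is a design choice of this sketch: it lets the "after 15 minutes" step run instantly. A tester would still need to review whether the AI-proposed ideas cover the requirement adequately.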

Slide 8

Slide 8 text

● Generative AI is a branch of deep learning that doesn’t just recognise patterns. Instead, it creates new content (text, images, video, audio, code) based on what it has learned. LLMs are the main type. Here we have an inventive chef who takes everything they’ve learned and writes brand-new recipes.

Example 1.5. GenAI in Software Testing

From a user story, GenAI can draft test cases, propose boundary values, or even write an automated test script.

AI has grown from strict rule engines to models that learn, and then to systems that create. Generative AI’s strength is its vast pre-training: you can apply it to testing tasks right away, without building a model from scratch. That power also means you must understand its limits and risks, which we’ll explore in later chapters.

1.1.2 Basics of Generative AI and LLMs (K2)

Generative AI is powered by a family of models called Large Language Models (LLMs). These models are built on a special type of deep-learning architecture called the transformer. They’re trained on enormous collections of text (books, articles, code, websites) so they can learn the structure and meaning of language. Some lighter versions, called Small Language Models (SLMs), use the same principles but have fewer parameters. They’re faster and easier to run, but usually less capable.

Before an LLM can “understand” text, it must translate words into numbers it can work with. It does this in two key steps:

1. Tokenization. The model breaks a sentence into small pieces called tokens. A token might be a whole word (“tokenization”), a part of a word (“token” and “ization”), or even punctuation. As mentioned earlier, think of tokens as Lego bricks: the model doesn’t see a sentence as a smooth wall of text; it sees a row of bricks it can rearrange and analyse.
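The tokenization step described above can be illustrated with a toy example. This is a minimal sketch, not how any particular LLM tokenizes text: the tiny `VOCAB` set and the greedy longest-match rule are assumptions made for illustration, whereas real tokenizers learn a much larger vocabulary from data.

```python
import re

# Hypothetical, hand-picked vocabulary of known word pieces.
VOCAB = {"token", "ization", "test", "ing", "the", "model"}


def tokenize(text: str) -> list[str]:
    """Split text into words and punctuation, then into known word pieces."""
    tokens = []
    for piece in re.findall(r"\w+|[^\w\s]", text.lower()):
        start = 0
        while start < len(piece):
            # Try the longest candidate first, then shorter ones.
            for end in range(len(piece), start, -1):
                if piece[start:end] in VOCAB:
                    break
            else:
                # Nothing matched: fall back to a single character.
                end = start + 1
            tokens.append(piece[start:end])
            start = end
    return tokens


print(tokenize("Tokenization."))  # ['token', 'ization', '.']
```

The output shows the three kinds of token mentioned in the text: here the word “tokenization” is split into two parts, and the full stop becomes a token of its own.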

Slide 16

Slide 16 text

An analogy here would be an investigator who doesn’t just read witness statements but also studies photos, listens to recordings, and watches CCTV footage, and then pieces everything together to solve a case.

Vision-language models are a specialised subset of multimodal LLMs trained mainly on text–image pairs. They learn how visual elements connect to written descriptions and can answer questions or generate text about an image. It’s like a bilingual person fluent in both “picture language” and “written language”, able to translate between them or discuss both at once.

Software testers often work with visual artefacts such as screenshots, mock-ups, wireframes, and charts, as well as textual specs. Multimodal models can bridge the gap:
● GUI analysis: Supply a screenshot of an app and ask, “List any accessibility issues you notice.”
● Wireframes (simple visual outlines of a screen) and acceptance criteria: Give a page mock-up plus a short story; ask the model to propose acceptance criteria that match what’s on the screen (for example, what input fields exist, what happens when you click each button, and which navigation flows need to be tested).
● Image-based defect detection: Compare an expected screen image with an actual run; the model can highlight missing buttons or colour mismatches.
● Hybrid reasoning: Mix logs, a screen capture, and a requirement paragraph to explain why a test failed.

Example 1.11. Using Multimodal LLM for GUI Testing

A tester uploads a screenshot of a login page (with “Username,” “Password,” and a misaligned “Login” button) and a text snippet: “The login button should be horizontally centred.” A multimodal LLM can:
● recognise the button in the image;
● check its position against the requirement;

Slide 20

Slide 20 text

3. Test oracle generation. A test oracle is the source of truth that tells you whether a test passed or failed; in other words, the expected result. Without a reliable oracle, even the best test execution leaves you unsure: “Is this behaviour correct, or is it a bug?”

The oracle problem is a long-standing challenge in software testing. Sometimes requirements are incomplete, ambiguous, or missing edge cases. This uncertainty exists in conventional testing and remains a challenge even when AI is involved, because models themselves don’t magically know the “true” answer.

Imagine you’re grading essays without an answer key. You can guess which ones are good, but you don’t always know with certainty. That’s the everyday life of a tester facing the oracle problem.

Test oracles require interpretation and should be sensitive enough to flag genuinely unusual behaviour without overwhelming testers with minor issues. They function similarly to fraud-detection systems, IT monitoring platforms, or market surveillance tools.

For complex or probabilistic systems, establishing a test oracle may be difficult without access to the “ground truth”, the actual real-world result that the system aims to predict. In some cases, expected results can be defined within limits through expert consultation, though experts may disagree or be unwilling to have their judgment automated. Issues such as varying competence, differing interpretations, and human uncertainty must be considered.

Several testing techniques can help mitigate the oracle problem, including back-to-back testing, A/B testing, and metamorphic testing.
● Back-to-back (or differential) testing: Run the same input on two different implementations (e.g., legacy system vs new system, or two models) and compare outputs. AI can help automate the comparison or highlight suspicious differences.
● A/B testing: Present different user groups with different versions (A vs B), then analyse outcomes. An LLM can assist in designing the test plan, collecting feedback, or spotting anomalies in results.
● Metamorphic testing: Define relationships between inputs and outputs that should always hold true. For example, doubling an input should double an output. LLMs can help generate these relationships or check them against results.
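Two of the techniques above, metamorphic testing and back-to-back testing, can be sketched in a few lines. The functions `total_price` and `legacy_total_price` are hypothetical stand-ins for a new and a legacy implementation; they are assumptions made for illustration only. Neither check needs to know the exact expected total, which is what makes these techniques useful when a precise oracle is missing.

```python
def total_price(unit_price_cents: int, quantity: int) -> int:
    """Hypothetical new implementation (the system under test)."""
    return unit_price_cents * quantity


def legacy_total_price(unit_price_cents: int, quantity: int) -> int:
    """Hypothetical legacy implementation used as the comparison partner."""
    total = 0
    for _ in range(quantity):
        total += unit_price_cents
    return total


def doubling_relation_holds(unit_price_cents: int, quantity: int) -> bool:
    """Metamorphic relation: doubling the quantity must double the total."""
    original = total_price(unit_price_cents, quantity)
    doubled = total_price(unit_price_cents, 2 * quantity)
    return doubled == 2 * original


def back_to_back_mismatches(inputs: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Back-to-back testing: return the inputs on which the two versions disagree."""
    return [
        (price, qty)
        for price, qty in inputs
        if total_price(price, qty) != legacy_total_price(price, qty)
    ]


sample_inputs = [(199, 1), (250, 4), (0, 10), (999, 0)]
```

In practice, an LLM might suggest candidate relations like the doubling rule or help triage the mismatches, while the tester decides which relations genuinely must hold for the system.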

Slide 29

Slide 29 text

● Quickly explore requirements or clarify ambiguities by pasting text and asking questions.
● Brainstorm test ideas: “List edge cases for password validation.”
● Translate or rephrase requirements into Gherkin syntax or structured test steps.
● Summarise logs, reports, or defect descriptions.
● Experiment interactively (refining prompts until the response fits the context).

Chatbots provide flexibility but no built-in control over traceability, history, or integration with test management tools. They’re great for discovery but require human verification and adaptation before outputs are used in real projects.

LLM-Powered Testing Applications

These are specialised testing tools that integrate LLMs into their workflows. Unlike general chatbots, they use predefined prompts, structured templates, or pipelines to deliver repeatable, auditable results. Examples include AI-assisted test case generators, automatic defect classifiers, log analysers, and tools that create synthetic test data.

If a chatbot is like a friendly consultant, an LLM-powered app is like a workshop machine with safety guards and settings pre-configured for testing. It focuses the AI’s power on a specific purpose while maintaining consistency and traceability.

How testers use it:
● Test case generation: Convert user stories or requirements into structured test artefacts.
● Defect analysis: Cluster similar bugs by text similarity or predict severity from descriptions.
● Test data synthesis: Produce realistic but anonymised data.
● Test maintenance support: Detect redundant or overlapping tests.
● Root-cause hints: Analyse failed runs and highlight likely causes.

The benefits of this technology include consistent outputs aligned with project templates, built-in history and traceability, and easier validation within established workflows. At the same time the limitations of this approach are as
