Vehid Geruslu)
AI-powered Software Testing (A hands-on tutorial – 3 hours)
1- Software Testing Strategist / Consultant: Testinium A.Ş., Türkiye; ProSys MMC, Azərbaycan; AzInTelecom, Azərbaycan
2- Professor of Software Engineering: Queen’s University Belfast, UK
vgeruslu
www.vgarousi.com
Testing: Hands-on tutorial (3 hours)
• Introduction
  ◦ About the speaker
  ◦ History and evolution of software testing
  ◦ Testing + AI: Two combinations:
    ▪ Testing AI = Testing the AI itself
    ▪ versus AI Testing = Using AI in software testing, to increase effectiveness and efficiency
    ▪ Two separate issues and fields
• AI-powered Software Testing
  ◦ What testing activities can AI help with, and how?
  ◦ Insights and lessons learned from using AI in testing
  ◦ Prompt engineering: The importance of the commands (‘prompts’) that we give to the AI (LLM)
• Two categories of AI-powered testing tools
  ◦ “General-purpose” AI testing tools: Like ChatGPT
  ◦ “Special-purpose” AI testing tools: 80+ tools already in the market and new tools coming to the market constantly …
• Conclusions:
  ◦ Insights and implications: for software testing jobs, …
  ◦ The future of this field…
Topics will be presented in a hands-on manner and with live examples using ChatGPT
Timing: ~ 15 minutes, ~ 20 minutes, ~ 45 minutes, ~ 100 minutes (≈ 1.7 hours); Total = 180 minutes. Includes practical exercises
in Software Engineering, Carleton University, Ottawa, Canada, 2006
MSc in Software Engineering, University of Waterloo, Canada, 2003
BSc in Software Engineering, Sharif University of Technology, Tehran, 2000
• I have been working in Software Engineering, Testing and QA since 1998
  ◦ (I wrote my first test automation code in 1998 using the IBM Rational Functional Tester tool, to test an e-commerce web app)
• Origin: Azerbaijani Turkish
• Nationality: Canadian, British
• I have lived and worked in 4 different countries since 2000
• Work experience (timeline): 1998-2013, 2013-now, 2019-now, 2022-now
My area of expertise (in short): Improving software engineering processes and activities of software companies, and increasing effectiveness / efficiency / quality
Testing, as a field
Many software companies around the world have started using artificial intelligence (AI) in their testing and quality assurance activities... Let's not be left behind!
Testing of AI
• System / component / model under test = AI / ML models, or AI-enabled software (software with one or more major AI components)
• Covers: testing of AI models (assessing their accuracy / precision) and testing of AI-enabled software
• Purpose: To test the AI and ML models / software and verify that they work correctly
AI Testing: Use of AI in software testing activities
• Also called: AI-powered testing, AI-augmented testing, AI-driven testing, …
• System under test (SUT) = Any software system
• The tests get support from AI / ML models / LLMs
• Purpose: To get support from AI in software testing activities. Mainly: To increase effectiveness and efficiency (and quality?) of testing
assist with and how?
To analyze this, we need to refresh our knowledge of:
• SDLC = Software Development Life Cycle
• STLC = Software Testing Life Cycle
assist with and how?
• In almost all testing activities…
• (but the level of help from AI can vary between test activities)
• (and the level of help could increase over time!)
Let’s add more details to the STLC / process
Planning, etc.)
Important point: As software test engineers, we should always carefully review what we get from AI
Live demo + practice
Creating Test Plans:
• Prompt: My task is to test the following mobile banking app:
  https://play.google.com/store/apps/details?id=com.pozitron.iscep
  https://www.isbank.com.tr/iscep
  For functional manual testing of this app, provide a draft test plan document. (Please answer in Turkish)
Risk Analysis:
• Prompt: For the same mobile app, perform a detailed Risk Analysis, so that I can add it to the test plan document
hands-on practice …
• It’s now your turn …
• Prompt: For functional manual testing of the following mobile app: X, provide a draft test plan
  ◦ One suggestion is given, but feel free to use any mobile app idea
• Lessons learned?
  ◦ What are the advantages / benefits of using GenAI for Test Planning?
  ◦ What are the disadvantages / drawbacks of using GenAI for Test Planning?
• Or (you can practice with these later):
  ◦ Asking for draft test plans for functional “automated” testing
  ◦ Asking for draft test plans for automated performance testing
  ◦ Asking for draft test plans for automated security testing
  ◦ …
• Some other app ideas, as Systems Under Test (SUTs). You can use GenAI to test them later
Planning, etc.)
• In AI-powered Software Testing, we often need to have an iterative interaction with the AI / LLM
• … to get the most help that we want from the AI
• Just like we ask our colleagues (other software engineers)
• At each iteration, we ask ourselves: “Is this what I wanted? Can I get a better result?” … until: “Yes, this is what I wanted!”
planning, effort estimation, …)
Test effort estimation (Live demo)
• Prompt: For the same mobile app, for the Functional Manual Test Plan above, perform test effort estimation
Importance of our prompts to the GenAI:
• If we issue a general command and do not specify the type of effort estimation approach, the AI will pick a method on its own
• But if we ask specific questions, we will get specific and often better / more precise answers from the AI
• Just like we ask our colleagues (other software engineers)
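To review an effort estimate returned by the AI, it helps to be able to recompute it by hand. The sketch below assumes that the prompt named one specific approach, three-point (PERT) estimation; that choice, the activity names, and all hour figures are hypothetical illustrations and are not taken from the live demo.

```python
from dataclasses import dataclass


@dataclass
class ThreePointEstimate:
    """Effort estimate for one test activity, in person-hours."""
    optimistic: float
    most_likely: float
    pessimistic: float

    def expected(self) -> float:
        # PERT weighted mean: (O + 4M + P) / 6
        return (self.optimistic + 4 * self.most_likely + self.pessimistic) / 6

    def std_dev(self) -> float:
        # Common PERT approximation of the standard deviation: (P - O) / 6
        return (self.pessimistic - self.optimistic) / 6


# Hypothetical figures, for illustration only.
activities = {
    "Test planning": ThreePointEstimate(4, 8, 16),
    "Test-case design": ThreePointEstimate(16, 24, 40),
    "Manual test execution": ThreePointEstimate(24, 40, 80),
}

total_expected = sum(e.expected() for e in activities.values())

for name, estimate in activities.items():
    print(f"{name}: {estimate.expected():.1f} h (+/- {estimate.std_dev():.1f} h)")
print(f"Total expected effort: {total_expected:.1f} person-hours")
```

Naming the approach in the prompt (for example, "use three-point estimation and show the optimistic, most likely and pessimistic values per activity") makes the AI's answer checkable against a calculation like this one.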
What is test-case design (or just test design)?
Test case design is a systematic / structured approach to creating test cases, outlining the test inputs / steps / conditions, and expected results to verify if a software feature or functionality is working as intended.
• Various types (forms) of test cases:
  ◦ tests in tabular format, automated test code, etc.
• Example: We can issue the prompt to the GenAI in different ways:
  ◦ Which input fields? We always need to give the RIGHT direction to the AI tool
Conclusions:
• For many years to come, we still need to be in FULL CONTROL of how we benefit from AI in our Software Engineering activities, and how it generates artifacts for our tasks
• … until AI becomes capable enough to make near-zero mistakes!
• We don’t know when that will be: in 5, 10, 20 years?
For test-case design, we also need to consider another important concept: coverage (Live demo)
• How many types of coverage do we have?
• In testing, there are two types of coverage: test coverage and code coverage
  ◦ Test coverage: percentage (%) of requirements tested during testing
  ◦ Code coverage: percentage (%) of code executed during testing
• Prompt: I want you to do test-case design based on test (requirements) coverage. For this, I suggest you generate requirements for the same above mobile app (İşCep) in the form of user stories. Then use those requirements to generate test cases that "cover" (test) all requirements. You can use tabular format for presentation, if that is suitable. Respond in English
  ▪ Option for later: Respond in Turkish.
• Prompt: For the same İşCep Mobile Banking App, as the next activity, I want you to do test-case design based on code coverage, this time.
  ▪ For brevity, let’s only do this for the “Login” feature. To see code coverage in this example, generate back-end code in Java for the Login feature.
  ▪ Then generate test cases in tabular format to fully "cover" (test) the code. Note: I want test cases, and not test code (yet)
  ▪ When you design test cases in the process, provide details of which lines of code are covered by which test cases. And explain how you are adding new test cases to cover the lines of code that are still uncovered. Respond in English
  ▪ Option for later: Respond in Turkish. All code and variables / comments in it should also be in Turkish
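The two percentages defined above can be computed from simple traceability data. A minimal sketch in Python, assuming hypothetical user-story IDs (US-1 …), test-case IDs (TC-1 …) and line numbers; none of these values come from the İşCep app or from the AI's output.

```python
def coverage_percent(covered: set, total: set) -> float:
    """Share of `total` items that appear in `covered`, as a percentage."""
    if not total:
        raise ValueError("total must not be empty")
    return 100.0 * len(covered & total) / len(total)


# Test (requirements) coverage: which user stories does each test case exercise?
requirements = {"US-1", "US-2", "US-3", "US-4"}
tests_to_requirements = {
    "TC-1": {"US-1"},
    "TC-2": {"US-1", "US-2"},
    "TC-3": {"US-4"},
}
covered_requirements = set().union(*tests_to_requirements.values())
test_coverage = coverage_percent(covered_requirements, requirements)

# Code (line) coverage: which executable lines does each test case execute?
executable_lines = set(range(1, 11))  # lines 1..10 of a hypothetical Login method
tests_to_lines = {
    "TC-1": {1, 2, 3, 4},
    "TC-2": {1, 2, 5, 6},
    "TC-3": {1, 2, 7},
}
covered_lines = set().union(*tests_to_lines.values())
code_coverage = coverage_percent(covered_lines, executable_lines)

print(f"Test coverage: {test_coverage:.0f}% (uncovered: {sorted(requirements - covered_requirements)})")
print(f"Code coverage: {code_coverage:.0f}% (uncovered lines: {sorted(executable_lines - covered_lines)})")
```

The "uncovered" lists are the useful part when iterating with the AI: each new test case should be justified by an item it removes from one of them.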
Testing)
• Exploratory Testing: let’s recall what it is…
  ◦ A common approach: using Test Charters
• But Exploratory testing ≠ Random (monkey) testing
• AI-powered Exploratory Testing: Do you think AI could do / support Exploratory Testing?
  ◦ AI cannot yet replace human test engineers in full execution of Exploratory Testing, due to its current limitations in learning “context”
  ◦ But that could change in a few years
  ◦ For now, AI can support test engineers in Exploratory Testing. Let’s ask the AI itself!
• Prompt: How can you help in Exploratory testing in general?
of test-code from designed test-cases (Live demo)
• Prompt: For the same İşCep Mobile Banking App:
  ◦ I want you to “script” (write) automated test code in Java Selenium, using the test cases that you designed in the previous interaction, for the “Login” feature
  ◦ No need for pom.xml or other info. Just give the Java Selenium code
  ◦ (Code elements should all be in Turkish)
• Insights / observations:
each of the above manual or automated test cases on the SUT?
AI-powered Test Execution (Live demo)
• Prompt: For the “Login” feature of the same İşCep Mobile Banking App:
  1. First generate prototype back-end (production) code in Java for the Login feature
  2. Then do white-box test-case design to derive the list of test cases (in tabular format), to have full “branch” coverage on the code. Note: I want test cases, and not test code
  3. Then, generate Selenium Java test code for those designed test cases, clearly matching each test method to each designed test case
  4. Run (in your simulation mode, obviously not on the actual app) the Selenium Java test suite on the prototype back-end code for the Login feature, and report the test output
  Show details of each step as you proceed. Answer entirely in English
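Steps 1, 2 and 4 of the prompt above can be checked by hand on a small scale. The sketch below is written in Python for brevity (the prompt itself asks for Java and Selenium) and uses a hypothetical `login` function with four decisions; the user names, passwords and result codes are illustrative assumptions, not the real İşCep logic or the AI's output.

```python
def login(username, password, user_db, failed_attempts, max_attempts=3):
    """Hypothetical prototype of a Login back end with four decisions (D1..D4)."""
    if not username or not password:                          # D1
        return "MISSING_CREDENTIALS"
    if username not in user_db:                               # D2
        return "UNKNOWN_USER"
    if failed_attempts.get(username, 0) >= max_attempts:      # D3
        return "ACCOUNT_LOCKED"
    if user_db[username] != password:                         # D4
        failed_attempts[username] = failed_attempts.get(username, 0) + 1
        return "WRONG_PASSWORD"
    failed_attempts[username] = 0
    return "SUCCESS"


USERS = {"ayse": "S3cret!"}  # hypothetical test user

# White-box test cases: (id, username, password, prior failed attempts, expected result).
# TC1..TC4 each take the "true" branch of one decision; TC5 takes the last
# remaining "false" branch (D4), so together they reach full branch coverage.
TEST_CASES = [
    ("TC1", "", "S3cret!", {}, "MISSING_CREDENTIALS"),        # D1 true
    ("TC2", "mehmet", "x", {}, "UNKNOWN_USER"),               # D1 false, D2 true
    ("TC3", "ayse", "S3cret!", {"ayse": 3}, "ACCOUNT_LOCKED"),  # D2 false, D3 true
    ("TC4", "ayse", "wrong", {}, "WRONG_PASSWORD"),           # D3 false, D4 true
    ("TC5", "ayse", "S3cret!", {"ayse": 1}, "SUCCESS"),       # D4 false
]


def run_suite():
    """Execute every test case on fresh state and return {test id: passed?}."""
    results = {}
    for tc_id, username, password, attempts, expected in TEST_CASES:
        actual = login(username, password, USERS, dict(attempts))
        results[tc_id] = (actual == expected)
    return results


for tc_id, passed in run_suite().items():
    print(tc_id, "PASS" if passed else "FAIL")
```

Unlike the AI's "simulation mode", this suite is actually executed, which is the kind of cross-check a test engineer can use before trusting a reported test output.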
we saw, as a “general-purpose” AI tool, GPT does not help much in this test activity
• Prompt:
  ◦ Show the names of a few AI-powered Test “Execution” tools and their features
  ◦ Note: I am not referring to AI tools which do test-code maintenance (such as the self-healing feature), but only those AI-powered tools that help in the "execution" of tests, running tests intelligently.
  ◦ Explain each feature briefly
• Note: Two categories of AI-powered testing tools (we will explore this topic later in the tutorial)
  ◦ “General-purpose” AI testing tools: Like ChatGPT
  ◦ “Special-purpose” AI testing tools: 80+ tools already in the market and new tools coming to the market constantly …
mobile banking app as the SUT:
  ◦ If we want to test the real SUT (the mobile app on an actual phone)
  ◦ What kind of other help can you provide for the “Test Evaluation” phase of the STLC?
AI-powered Test Evaluation (Live demo)
• Prompt: For the same İşCep app as the SUT:
  ◦ For the Test Evaluation phase, give one single concrete example of the help that you can provide
Let’s focus on test flakiness…
A flaky test is a software test that yields both passing and failing results despite zero changes to the code or test.
• Prompt: Test flakiness is widely discussed in the testing industry. For the same İşCep app:
  ◦ In the Test Evaluation phase, how would you help us identify and prevent test flakiness?
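The definition of a flaky test above suggests one simple way to detect flakiness: rerun the same test several times without changing anything and compare the outcomes. The sketch below illustrates that idea only; `classify_test` and `make_intermittent_test` are hypothetical helpers, and the intermittent test is a deterministic stand-in for a real timing- or network-dependent test.

```python
from typing import Callable


def classify_test(test_fn: Callable[[], bool], runs: int = 20) -> str:
    """Rerun a test with no code changes and classify it by its set of outcomes."""
    outcomes = {bool(test_fn()) for _ in range(runs)}
    if outcomes == {True}:
        return "stable-pass"
    if outcomes == {False}:
        return "stable-fail"
    return "flaky"  # both passing and failing results were observed


def make_intermittent_test(fail_every: int) -> Callable[[], bool]:
    """Build a test that fails on every `fail_every`-th call (simulated flakiness)."""
    calls = {"n": 0}

    def test() -> bool:
        calls["n"] += 1
        return calls["n"] % fail_every != 0

    return test


print(classify_test(lambda: True))                        # always passes
print(classify_test(lambda: False))                       # always fails
print(classify_test(make_intermittent_test(3), runs=10))  # mixed outcomes
```

A "stable-pass" verdict only means that no failure was observed in the given number of runs; a rarely failing test can still slip through, so the number of reruns is a trade-off between confidence and execution time.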
page of a university online service…
  ◦ I have received the following error message, which I believe is a bug
  ◦ My question: Do you agree that it is a bug?
  ◦ If you agree that it is a bug, generate a detailed bug report.
AI-powered Reporting of Test Results and Defects (Live demo)
Test Data Management (TDM) is the process of …
  ◦ planning, creating, storing, maintaining and deleting datasets used in software testing
• Prompt: For the same İşCep app as the SUT:
  ◦ Explain, using concrete examples, the assistance that you can provide in test-data management
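One concrete form of the "creating" step in TDM is generating synthetic, reproducible test data instead of copying real customer data. The sketch below is an assumption about what such assistance could look like; the `SyntheticAccount` fields, the `TEST-` ID prefix and the boundary values are hypothetical and do not reflect the real İşCep data model.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticAccount:
    customer_id: str    # clearly marked as test data, never a real customer number
    balance_kurus: int  # balance in the smallest currency unit, avoiding float rounding
    is_locked: bool


def generate_accounts(count: int, seed: int = 42) -> list:
    """Create `count` synthetic accounts; the same seed always yields the same data."""
    rng = random.Random(seed)
    boundary_balances = [0, 1, 999_999_999]  # hypothetical boundary values, placed first
    accounts = []
    for i in range(count):
        if i < len(boundary_balances):
            balance = boundary_balances[i]
        else:
            balance = rng.randint(0, 10_000_000)
        accounts.append(
            SyntheticAccount(
                customer_id=f"TEST-{i:05d}",
                balance_kurus=balance,
                is_locked=rng.random() < 0.1,  # roughly one in ten accounts is locked
            )
        )
    return accounts


for account in generate_accounts(5):
    print(account)
```

Seeding the generator keeps the dataset stable across test runs, which makes failures reproducible; deleting the data afterwards is easy because every record carries the `TEST-` prefix.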
Environment Management (TEM) is a crucial process that ensures Test Engineers have access to functional, stable, and usable environments for testing, bug replication, and overall software quality improvement
For the same İşCep app as the SUT:
  ◦ Explain, using concrete examples, the assistance that you can provide in Test-Environment management
• Prompt: Define a comprehensive device matrix, for Test-Environment management
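A device matrix like the one requested in the prompt above is, at its simplest, the set of all combinations of a few environment dimensions. The sketch below builds such a matrix with `itertools.product`; the dimensions and their values are hypothetical examples, not a recommendation for the İşCep app.

```python
from itertools import product


def build_device_matrix(dimensions: dict) -> list:
    """Return one dict per combination of the given dimension values."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


# Hypothetical dimensions, for illustration only.
dimensions = {
    "os": ["Android 13", "Android 14", "iOS 17"],
    "screen": ["small", "large"],
    "network": ["Wi-Fi", "4G"],
}

matrix = build_device_matrix(dimensions)
print(f"{len(matrix)} configurations")
for row in matrix[:3]:
    print(row)
```

The full product grows quickly as dimensions are added, so in practice the matrix returned by the AI (or by a script like this) usually needs to be pruned by risk or usage data before it is used for planning.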
An example
• Self-healing of UI tests: Has become a major trend in the industry
  ◦ Self-healing UI tests use AI to automatically adapt to changes in the application's UI, reducing manual test-code maintenance and ensuring tests remain valid even with UI modifications.
• A few interesting videos:
  ◦ youtube.com/watch?v=cWX2uvygLg4
  ◦ youtube.com/watch?v=O8z_sBDFL5A
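The self-healing idea described above can be illustrated without any real tool. The sketch below is a deliberately simplified assumption of how healing could work: when the primary locator no longer matches, it picks the element that shares the most attributes with the last known version of the target. It does not use Selenium and is not a description of how any specific product works internally; the page model (a list of attribute dicts) and all names are hypothetical.

```python
def find_element(page, locator):
    """Return the first element whose attribute equals the locator value, else None."""
    key, value = locator
    for element in page:
        if element.get(key) == value:
            return element
    return None


def find_with_healing(page, primary_locator, last_known_attrs, min_score=0.5):
    """Try the primary locator; if it fails, fall back to attribute similarity.

    Returns (element or None, healed flag).
    """
    if not last_known_attrs:
        raise ValueError("last_known_attrs must not be empty")
    element = find_element(page, primary_locator)
    if element is not None:
        return element, False
    best, best_score = None, 0.0
    for candidate in page:
        matches = sum(1 for k, v in last_known_attrs.items() if candidate.get(k) == v)
        score = matches / len(last_known_attrs)
        if score > best_score:
            best, best_score = candidate, score
    if best is not None and best_score >= min_score:
        return best, True
    return None, False


# Old UI: the login button has id "login-btn".
old_page = [
    {"tag": "input", "id": "username", "class": "field", "text": ""},
    {"tag": "button", "id": "login-btn", "class": "primary", "text": "Log in"},
]
# Updated UI: the same button was given a new id.
new_page = [
    {"tag": "input", "id": "username", "class": "field", "text": ""},
    {"tag": "button", "id": "btn-sign-in", "class": "primary", "text": "Log in"},
]

last_known = dict(old_page[1])  # attributes recorded while the test still passed
element, healed = find_with_healing(new_page, ("id", "login-btn"), last_known)
print(element["id"], "(healed)" if healed else "(primary locator)")
```

The `min_score` threshold is the key design choice: set too low, the test may silently click the wrong element; set too high, nothing is ever healed. A healed locator should therefore be reported to the test engineer for review.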
An example
• healenium.io
• Prompt: I am reviewing tools such as https://healenium.io/. Can you give an estimate of the extent of effort spent on maintaining "locators" in Selenium tests?
• Figure: Old UI versus updated new UI
Tester AI Software Tester: Will it replace us?
• Human versus AI? Or:
• Human with (and) AI (collaboration)?
• Timeline figure: “We are here”, ~2015, ~2040 ? When?
learned what and where?
• Both can make mistakes in all tasks, including in software testing activities
• What can each do faster and with better quality? (effectiveness and efficiency)
• The World Wide Web contains over 100 billion (100,000,000,000+) indexed pages
typical Software Test Engineer asks herself:
  ◦ “Should I do this testing activity on my own, or should I get help from AI?”
• What can we (test engineers) do faster and with better quality? (effectiveness and efficiency)
• What can AI do faster and with better quality?
• I have observed these in my own 2 years of AI-assisted Software Testing experience
• Prompt: In software testing, I want a table to support my own decision-making of when to use AI and when not to. Create a table showing: what we (test engineers) can do faster and with higher quality (effectiveness and efficiency), and what AI can do faster and with higher quality
Levels of autonomy in software testing with AI
• To what extent can AI help us? A little or a lot?
• First, let's look at the topic in another context: A well-known model in the autonomous vehicle sector (levels of autonomy)...
• We can think of a similar model for software testing. Where do you think we are?
is Here (A hands-on tutorial – 3 hours)
* Also called: AI-driven (AI having a major / active role), AI-powered, AI-enabled, AI-augmented, …
Vahid Garousi (Vehid Geruslu)
1- Software Testing Strategist / Consultant: Testinium A.Ş., Türkiye; ProSys MMC, Azərbaycan; AzInTelecom, Azərbaycan
2- Professor of Software Engineering: Queen’s University Belfast, UK
vgeruslu
www.vgarousi.com