Playwright × AI: Non-Technical QA Team in Practice

KintoTech_Dev

May 18, 2026

Transcript

  1. From 50% Cost Reduction to 90% Coverage

    Playwright × AI: Non-Technical QA Team in Practice
    QA Engineer Wenjia Lu
    ©KINTO Technologies Corporation
  2. Agenda: Presentation Overview

    1. Tasks and Challenges
    2. Learning from Failures: 3 Costly Lessons Learned
    3. Solution: 3-Layer AI-QA Integration Method
    4. Results
    5. Q&A
  3. Tasks and Challenges of Test Automation

    • Manual testing: enormous time spent on repetitive tasks
    • Script modifications are required for each release, so maintenance of automated tests strains resources
    • Complexity of per-environment configuration management: URLs, credentials, test data paths, and other parameters differ per environment
  4. The Team Reality

    • Most QA members have limited coding experience
    • Levels of test automation knowledge vary across the team
    • An automated testing framework that everyone can use is needed
    • Training cost must be balanced against immediate productivity
  5. Failure #1: The AI Translation Disaster

    • TypeScript → Python migration using AI
      - Background: Python was adopted so that operations such as external system integration could be written into the scripts
    • Code that looked perfect but didn't work in production
    • Cost impact: 3x the time of manual rewriting

    3x over estimated cost
    100% human code verification needed
  6. Failure #1: Code Example (1) Missing Popup Handling

    Failure #1: AI-Generated Code That "Looked Perfect"

    AI-Generated Code (Looks OK):

        # AI generated - missing popup context handling
        await page.get_by_role("link", name="Car").click()
        # AI continues on the same page object
        await page.get_by_role("button", name="Apply").click()  # fails!

    After Human Review:

        # Human-verified: link opens new tab, must capture popup context
        async with page.expect_popup() as popup_info:
            await page.get_by_role("link", name="Car").click()
        page1 = await popup_info.value  # new tab context
        await page1.wait_for_load_state("domcontentloaded")
        await page1.get_by_role("button", name="Apply").click()  # works!

    Issue: The AI doesn't know the link opens in a new tab. It keeps operating on the original page object, so the 'Apply' button cannot be found and a timeout error occurs. expect_popup() is needed to capture the new page context.

    Lesson: Always verify AI-generated code in the actual environment
  7. Failure #1: Code Example (2) Locator Instability Issue

    Failure #1: AI-Generated Code That "Looked Perfect"

    AI-Generated Code (Looks OK):

        # Work Phone Number
        await page.locator('input[type="tel"]').nth(2).fill(CELLPHONE_NUMBER_OF_CORPORATE1)
        await page.wait_for_timeout(500)
        await page.locator('input[type="tel"]').nth(3).fill(CELLPHONE_NUMBER_OF_CORPORATE2)
        await page.wait_for_timeout(500)
        await page.locator('input[type="tel"]').nth(4).fill(CELLPHONE_NUMBER_OF_CORPORATE3)
        await page.wait_for_timeout(2000)

    After Fix (Human Verified):

        # Work Phone Number
        await page.locator('xpath=//*[@id="__next"]/main/div/div/div[5]/div[1]/div/div[1]/div[1]/div/input').fill(cellphone_number1)
        await page.wait_for_timeout(1000)
        await page.locator('xpath=//*[@id="__next"]/main/div/div/div[5]/div[1]/div/div[1]/div[2]/div/input').fill(cellphone_number2)
        await page.wait_for_timeout(1000)
        await page.locator('xpath=//*[@id="__next"]/main/div/div/div[5]/div[1]/div/div[1]/div[3]/div/input').fill(cellphone_number3)
        await page.wait_for_timeout(1000)

    Issue: The moment a value is entered in the first text box, the framework reconstructs the DOM, so the indices of the 2nd and 3rd locators change and the text boxes can no longer be identified. Rather than using nth() on a CSS selector, it is more reliable to manually specify the absolute XPath for each text box.

    Lesson: Always verify AI-generated code in the actual environment
  8. Failure #2: The Data Collision Crisis

    • Test data generation collisions in shared environments
    • Parallel test executions destroying each other's data
    • Data corruption (overwriting, etc.) when 2 testers use the same search criteria

    Solution: Precise targeting and isolation strategy

    Lesson: In shared environments, use unique identifiers. Always track exactly which data belongs to your test.

        # check contract status: identify contract by contract number
        await page.goto(backoffice_url)
        await page.get_by_role("textbox", name="Login Name").fill(backoffice_username)
        await page.get_by_role("textbox", name="Password").fill(backoffice_password)
        await page.get_by_role("button", name="Log In").click()
        await page.get_by_role("link", name="Contract Information").click()
        await page.get_by_role("textbox", name="Contract Number").fill(contract_number)
        await page.get_by_role("button", name="Search").click()
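The "use unique identifiers" lesson on slide 8 could be sketched as follows. This is a minimal sketch under stated assumptions, not the team's code: `make_run_id` and `tag_test_record` are hypothetical helpers, and the `mail` field and the `example.com` address are placeholders chosen for illustration.

```python
import uuid
from datetime import datetime, timezone


def make_run_id(tester: str) -> str:
    """Build an ID that differs per tester and per run (hypothetical helper)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    # The random suffix keeps two runs started in the same second apart.
    return f"{tester}-{stamp}-{uuid.uuid4().hex[:8]}"


def tag_test_record(record: dict, run_id: str) -> dict:
    """Return a copy of the record whose searchable fields carry the run ID."""
    tagged = dict(record)  # copy, so the shared source data is never mutated
    tagged["mail"] = f"qa+{run_id}@example.com"  # assumed field name and domain
    tagged["run_id"] = run_id
    return tagged
```

A test that registers data with a tagged record can later search the back office by a value only it created, which is the same idea as identifying the contract by its contract number on the slide.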
  9. Failure #3: The Agile Maintenance Trap

    • Underestimating maintenance costs with frequent releases
      ⇒ Code modifications required with every system update
    • Hidden maintenance costs of "free" AI-generated code
      ⇒ Chasing only automation speed sacrifices sustainability
      ⇒ Automation speed must be balanced against sustainability

    Lesson: Use modularization and the Variant Pattern to minimize maintenance costs
  10. Section 2: Learning from Failures: 3 Costly Lessons Learned

    Failure #1. The AI Translation Disaster
    Failure #2. The Data Collision Crisis
    Failure #3. The Agile Maintenance Trap
  11. 3-Layer AI-QA Integration Method Overview

    Layer 1 (Foundation): Playwright Framework
    Layer 2 (Intelligence): AI-Augmented Workflows
    Layer 3 (Integration): Beyond Browser Testing
  12. Layer 1: Playwright Framework

    ① Modular Architecture
    ② Variant Pattern
    ③ Scenario Matrix
    ④ Centralized Configuration Management
    ⑤ CI/CD Integration
  13. Layer 1: Playwright Framework (1) Modular Architecture

    • Maximize reusability through function-based module separation
      - carSelect: Car Selection Flow
      - agreement: Terms of Service Agreement
      - customerInformationInput: Customer Information Input
      - Web_companyInformationInput: Corporate Information Input
      - csvReader: CSV Data Loading

        # Main script = simple composition
        from carSelect import car_select
        from agreement import agreement
        from customerInformationInput import customer_information_input

        # Each module is one focused function
        await car_select(page, env, category, regex, period)
        await agreement(page)
        await customer_information_input(
            page, env, mail, password,
            surname, name,
            katakana_surname, katakana_name, ...
        )

        # Result: non-technical users compose modules
        # like building blocks
  14. Layer 1: Playwright Framework (2) Variant Pattern

    Cover multiple business scenarios from the same base module.

        Web_companyInformationInput.py    ← Standard Corporate Application
          _shop.py                        Dealership Application
          _guarantor.py                   With Guarantor
          _shop_guarantor.py              Dealership + Guarantor
          _pre_registered.py              Pre-registered User
          _my_number.py                   My Number Card

    By modularizing the common parts of scripts, these derived scenarios become easy to create and maintain.
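One way to read the Variant Pattern on slide 14 is that each variant calls the base flow and then adds only its own steps. The sketch below illustrates that reading; it is an assumption about the structure, not the team's modules. `RecordingPage` is a hypothetical stand-in for a Playwright page so the example runs without a browser, and every field label and data key is invented for illustration.

```python
import asyncio


class RecordingPage:
    """Stand-in for a Playwright page; records what would be filled (hypothetical)."""

    def __init__(self):
        self.filled = []

    async def fill_field(self, label: str, value: str) -> None:
        self.filled.append((label, value))


async def company_information_input(page, data: dict) -> None:
    """Base flow: the fields every corporate application shares."""
    await page.fill_field("Company Name", data["company_name"])
    await page.fill_field("Work Phone", data["work_phone"])


async def company_information_input_guarantor(page, data: dict) -> None:
    """Variant: reuse the base flow, then add the guarantor-only step."""
    await company_information_input(page, data)
    await page.fill_field("Guarantor Name", data["guarantor_name"])


async def company_information_input_shop(page, data: dict) -> None:
    """Variant: reuse the base flow, then add the dealership-only step."""
    await company_information_input(page, data)
    await page.fill_field("Dealership Code", data["shop_code"])
```

With this shape, a UI change to a shared field is fixed once in the base function, and every variant picks it up.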
  15. Layer 1: Playwright Framework (3) Scenario Matrix

                            Individual                         Corporate
    Logged In               Regre_UsedCar_Ind_Logged_A.py      Regre_UsedCar_Corp_Logged_A.py
    Not Logged In           Regre_NewCar_Ind_NotLogged_A.py    Regre_NewCar_Corp_NotLogged_A.py
    Login in the Middle     Regre_UsedCar_Ind_MidLog_A.py      Regre_NewCar_Corp_MidLog_A.py
    Common_B                Regre_UsedCar_Ind_Common_B.py      Regre_UsedCar_Corp_Common_B.py

    8 Scenarios × A/B Flow = Systematic Coverage of All Business Paths

    Benefits:
    1. It is clear what is tested and what is not.
    2. Anyone can understand the test coverage just by looking at the script names.
    3. Before a release, the scenarios to run can be selected from the matrix like a checklist.
    4. If the UI changes, the affected scenarios can be determined immediately from the matrix.
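Because the script names on slide 15 encode the matrix cell, coverage gaps can be found mechanically. The sketch below is one possible way to do that, assuming the naming scheme `Regre_<Car>_<Ind|Corp>_<LoginState>_<A|B>.py` inferred from the file names on the slide; `covered_cells` and `missing_cells` are hypothetical helpers.

```python
import re
from itertools import product

CUSTOMER_TYPES = ("Ind", "Corp")
LOGIN_STATES = ("Logged", "NotLogged", "MidLog", "Common")

# Naming scheme inferred from the slide: Regre_<Car>_<Customer>_<Login>_<Flow>.py
NAME_PATTERN = re.compile(
    r"^Regre_(?P<car>[A-Za-z]+)_(?P<customer>Ind|Corp)_"
    r"(?P<login>Logged|NotLogged|MidLog|Common)_(?P<flow>[AB])\.py$"
)


def covered_cells(script_names):
    """Return the set of (login_state, customer_type) cells that have a script."""
    cells = set()
    for name in script_names:
        match = NAME_PATTERN.match(name)
        if match:  # names that do not follow the scheme are ignored
            cells.add((match["login"], match["customer"]))
    return cells


def missing_cells(script_names):
    """Return the matrix cells with no script, sorted for stable output."""
    all_cells = set(product(LOGIN_STATES, CUSTOMER_TYPES))
    return sorted(all_cells - covered_cells(script_names))
```

Running `missing_cells` over the file names in the test directory before a release gives the "what is not tested" half of the checklist.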
  16. Layer 1: Playwright Framework (4) Centralized Configuration Management

        # playwright_config.py: shared across ALL scripts
        import csv
        from pathlib import Path

        DEFAULT_TIMEOUT = 120000     # 120 s (2 min): between operations
        NAVIGATION_TIMEOUT = 180000  # 180 s (3 min): page transition, loading
        SLOW_MO = 1000               # 1 s per action: insert additional time between operations

        async def configure_page(page):
            page.set_default_timeout(DEFAULT_TIMEOUT)
            page.set_default_navigation_timeout(NAVIGATION_TIMEOUT)
            return page

        # CSV-driven external configuration
        def load_config(config_file: str) -> dict:
            config = {}
            with open(config_file, encoding='utf-8') as f:
                reader = csv.DictReader(f)
                for row in reader:
                    config[row['config_name']] = row['config_value']
            return config

        CONFIG_FILE = Path(__file__).parent / 'config_paths.csv'  # URL, credentials, file paths, etc.
        config = load_config(str(CONFIG_FILE))

    • All scripts import the same configuration module, so one change is reflected across all of them
    • CSV externalization enables environment switching without code changes
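The "environment switching without code changes" point on slide 16 could work along these lines. This is a sketch under assumptions: the one-CSV-per-environment naming (`config_paths_<env>.csv`), the `TEST_ENV` variable, and the `stg` default are all hypothetical; only the `config_name`/`config_value` column names come from the slide.

```python
import csv
import os
from pathlib import Path


def load_config(config_file) -> dict:
    """Read config_name,config_value rows into a dict (CSV shape from the slide)."""
    with open(config_file, encoding="utf-8", newline="") as f:
        return {row["config_name"]: row["config_value"] for row in csv.DictReader(f)}


def config_file_for(base_dir, env=None) -> Path:
    """Pick the CSV for an environment; the file naming is an assumption."""
    env = env or os.environ.get("TEST_ENV", "stg")  # hypothetical variable and default
    return Path(base_dir) / f"config_paths_{env}.csv"
```

Under this scheme, switching from staging to development means setting `TEST_ENV=dev` before the run; the test scripts themselves stay unchanged.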
  17. Layer 1: Playwright Framework (5) CI/CD Integration

        # .github/workflows/S3 upload.yml
        name: S3 Upload
        on:
          schedule:
            - cron: '0 8 * * *'
            - cron: '0 16 * * *'
          workflow_dispatch:
        jobs:
          upload:
            runs-on: ubuntu-latest
            permissions:
              id-token: write   # OIDC authentication
              contents: read    # read repository's code
            steps:
              - uses: actions/checkout@v4
              - uses: aws-actions/configure-aws-credentials@v4
                with:
                  role-to-assume: arn:aws:iam::role/×××-role
                  aws-region: ap-northeast-1
              - run: aws s3 cp tests/input.txt s3://test-data/input/

    • GitHub Actions trigger
    • AWS OIDC (OpenID Connect) authentication (no long-term password needed)
    • Automated test data upload to S3 bucket
  18. Layer 2: AI-Augmented Workflow

    (1) AI Test Data Generation
    (2) AI Script Generation
    (3) Autonomous Code Agent Review
    (4) When AI Works Well vs When AI Fails
  19. Layer 2: AI-Augmented Workflow (1) AI Test Data Generation

    AI prompts generate diverse test data combinations automatically.

        # CSV test data structure (testData4.csv)
        # AI generates diverse combinations:
        password,surname,katakanaSurname,yearOfBirth,
        monthOfBirth,dayOfBirth,sex,postCode1,postCode2,
        cellphoneNumber1,cellphoneNumber2,cellphoneNumber3,
        typeOfHousing,yearsOfResidence,...

        # 20+ proven prompts for:
        # - Individual applicant data
        # - Corporate applicant data
        # - Guarantor information
        # - Edge cases & boundary values

    • External management of test data via CSV files
    • AI prompts that generate input data for individual, corporate, and guarantor information
    • Generation of edge case and boundary value variations
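Since the deck stresses that AI output needs human verification, AI-generated CSV data can be checked mechanically before it reaches a test run. The sketch below is one possible pre-check, not part of the original workflow: the column names come from slide 19, while `validate_rows` and the "ASCII digits only" rule for phone and postcode fields are assumptions for illustration.

```python
import csv
import io

# Column names taken from the slide; which ones are mandatory is an assumption.
REQUIRED_COLUMNS = {
    "password", "surname", "postCode1", "postCode2",
    "cellphoneNumber1", "cellphoneNumber2", "cellphoneNumber3",
}
NUMERIC_COLUMNS = (
    "postCode1", "postCode2",
    "cellphoneNumber1", "cellphoneNumber2", "cellphoneNumber3",
)


def validate_rows(csv_text: str) -> list:
    """Return (row_number, problem) tuples; an empty list means no problem found."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [(0, f"missing columns: {sorted(missing)}")]
    problems = []
    for number, row in enumerate(reader, start=1):
        for column in NUMERIC_COLUMNS:
            value = row[column] or ""  # short rows yield None
            # isascii() rejects full-width digits, which isdigit() alone accepts
            if not (value.isascii() and value.isdigit()):
                problems.append((number, f"{column} is not numeric: {value!r}"))
    return problems
```

Deliberate edge-case rows (for example, an intentionally invalid phone number) would need to be exempted from such a check or kept in a separate file.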
  20. Layer 2: AI-Augmented Workflow (2) AI Script Generation

    AI prompts generate scripts automatically. However, AI-generated code must always be verified in the actual environment.

    Leave the basic structure of page transitions and form inputs to AI, while humans verify and fix timing handling, dynamic content handling, and popup handling.

    AI builds the foundation, humans refine it. This division of roles is the key to maximizing productivity.
  21. Layer 2: AI-Augmented Workflow (3) Autonomous Code Agent Review

    Autonomous Code Agent Review Checkpoints: consistency with existing coding standards, insufficient error handling, missing wait processing, variable naming consistency, etc.
  24. Layer 2: AI-Augmented Workflow (4) When AI Works Well vs When AI Fails

    AI Works Well
    • Test data generation (CSV)
    • Boilerplate code generation (modules)
    • Code review / quality checks
    • Documentation generation
    • Repetitive pattern processing

    AI Fails
    • Wait handling for dynamic content
    • Environment-specific timing issues
    • Complex business logic
    • Migration between frameworks
    • XPath/selector optimization
  25. Layer 3: Beyond Browser Testing

    (1) Playwright + VBA Application Integration (Excel Macro Execution)
    (2) AWS S3 File Processing Within Test Scripts
    (3) Integrated Data-Driven Test Workflow
    (4) Multi-System E2E Flow
    (5) Slack Bidirectional Integration
  26. Layer 3: Beyond Browser Testing (1) Playwright + VBA Application Integration (Excel Macro Execution)

        # openpyxl for data manipulation + win32com for VBA macro execution
        import openpyxl
        import win32com.client

        wb = openpyxl.load_workbook(motas_file_path, keep_vba=True)
        ws = wb['Row 2 onwards']

        # Full-width → half-width digit conversion (Japanese-specific)
        def convert_fullwidth_to_halfwidth(s):
            return ''.join(chr(ord(c) - 0xFEE0) if '０' <= c <= '９' else c for c in s)

        # Update Excel cells with test data
        ws['D2'] = formatted_date    # Application date
        ws['D5'] = chassis_number    # Vehicle chassis number
        ws['D16'] = user_name        # Applicant name
        wb.save(motas_file_path)

        # Execute VBA macros via COM automation
        excel = win32com.client.Dispatch("Excel.Application")
        wb_com = excel.Workbooks.Open(motas_file_path)
        excel.Application.Run('Button1_Click')  # Run macro 1
        excel.Application.Run('Button2_Click')  # Run macro 2
        wb_com.Save()
        wb_com.Close()
        excel.Quit()

    Playwright + openpyxl + win32com = Browser / Excel VBA unified in a single script
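The full-width to half-width conversion on slide 26 can be checked in isolation. The sketch below assumes the comparison range is meant to be the full-width digits ０ to ９ (U+FF10 to U+FF19), which sit exactly 0xFEE0 code points above ASCII 0 to 9; the sample strings are invented.

```python
def convert_fullwidth_to_halfwidth(s: str) -> str:
    """Map full-width digits (U+FF10..U+FF19) to ASCII digits; keep everything else."""
    return "".join(
        chr(ord(c) - 0xFEE0) if "０" <= c <= "９" else c
        for c in s
    )


if __name__ == "__main__":
    # Full-width digits are converted; kanji and ASCII pass through unchanged.
    print(convert_fullwidth_to_halfwidth("２０２６年５月１８日"))
```

The range check matters: applying the 0xFEE0 offset to a character that is already an ASCII digit would produce a negative code point and raise `ValueError`.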
  27. Layer 3: Beyond Browser Testing (2) AWS S3 File Processing Within Test Scripts

        import boto3

        async def damidata_registration():
            # SSO: Specify SSO-authenticated profile
            session = boto3.Session(profile_name="your-sso-profile")
            # S3 Client Creation
            s3 = session.client("s3")
            # Upload Destination Info
            bucket_name = "test-data-ap-northeast-1"
            local_file_path = r"C:\Users\Desktop\input.txt"
            s3_object_key = "input/input.txt"  # S3 path
            try:
                s3.upload_file(local_file_path, bucket_name, s3_object_key)
                print(f"Uploaded {local_file_path} to S3 bucket '{bucket_name}' at '{s3_object_key}'.")
            except Exception as e:
                print(f"Upload failed: {e}")

    Playwright + AWS S3 File Processing = Browser / AWS S3 unified in a single script
  28. Layer 3: Beyond Browser Testing (3) Integrated Data-Driven Test Workflow

    Loop through each row and run the test with different data each time:

        async def main():
            test_data = read_csv(os.path.join(base_path, 'testData4.csv'))
            await test_provisional_application(test_data)

        async def test_provisional_application(test_data):
            async with async_playwright() as p:
                for data in test_data:
                    password = data['password']
                    surname = data['name']
                    year_of_birth = data['yearOfBirth']
                    month_of_birth = data['monthOfBirth']
                    day_of_birth = data['dayOfBirth']
                    sex = data['sex']
                    post_code1 = data['postCode1']
                    post_code2 = data['postCode2']
                    cellphone_number1 = data['cellphoneNumber1']
                    cellphone_number2 = data['cellphoneNumber2']
                    cellphone_number3 = data['cellphoneNumber3']
                    type_of_housing = data['typeOfHousing']
                    years_of_residence = data['yearsOfResidence']
                    ...
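The slide calls `read_csv` without showing it. A minimal sketch of what such a helper might look like is below; the deck only names a `csvReader` module, so this body is an assumption. It returns one dict per row, keyed by the CSV header, which matches how `data['password']` and the other lookups are used in the loop.

```python
import csv


def read_csv(file_path: str) -> list:
    """Load a CSV file into a list of dicts keyed by the header row (assumed helper)."""
    # utf-8-sig also accepts files saved with a byte-order mark, e.g. from Excel
    with open(file_path, encoding="utf-8-sig", newline="") as f:
        return list(csv.DictReader(f))
```

All values arrive as strings, so numeric fields such as `yearOfBirth` need an explicit `int(...)` wherever the test logic does arithmetic on them.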
  29. Layer 3: Beyond Browser Testing (4) Multi-System E2E Flow

    Single Script Traverses 5 Systems

    Front End (User Application) → Back Office1 (Back Office) → Login (Auth) → Back Office2 (Business Ops) → Back Office3 (Operations)

        async def main():
            # Phase 1: User application (Front End, backoffice1)
            await test_user_application(page)  # Provisional application
            await test_backoffice1(page)
            await test_login(page)
            await browser.close()

            # Phase 2: Email verification (Slack integration)
            await slack_main()  # Fetch verification from Slack
            await link_main()   # Open confirmation links

            # Phase 3: Review & Contract (Back Office)
            await test_backoffice2(page)
            await test_backoffice3(page)  # Full application flow
  30. Layer 3: Beyond Browser Testing (5) Slack Bidirectional Integration

        # 1. Webhook: Send test results
        requests.post(SLACK_WEBHOOK_URL, json={
            "text": f'Test succeeded! Email: "{mail}"'
        })

        # 2. SDK: Read email content from Slack
        response = await slack_client.conversations_history(
            channel=SOURCE_CHANNEL_ID, limit=1
        )

        # 3. Regex: Extract completion links (domain masked as *** in the slide)
        pattern = re.compile(
            r'https://***\.com/entry/'
            r'(?:personal|corporation)/complete/[a-zA-Z0-9]+'
        )
        links = pattern.findall(text)

        # 4. Auto-open links in Playwright
        await page.goto(link, timeout=60000)

    • Webhook: Send real-time notification of test results
    • Slack SDK: Retrieve email content from channels
    • Regex: Extract confirmation links
    • Playwright: Automatically open links
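Step 3 on slide 30, extracting completion links, can be tried on its own. The sketch below assumes a placeholder domain, since the real one is masked in the slide: `example.com` and the sample message text are invented, and `extract_completion_links` is a hypothetical helper that also removes duplicates.

```python
import re

# example.com is a placeholder; the real domain is masked in the slide.
LINK_PATTERN = re.compile(
    r"https://example\.com/entry/"
    r"(?:personal|corporation)/complete/[a-zA-Z0-9]+"
)


def extract_completion_links(text: str) -> list:
    """Return completion links in order of appearance, without duplicates."""
    links = []
    for link in LINK_PATTERN.findall(text):
        if link not in links:
            links.append(link)
    return links
```

Because the token part of the pattern only allows letters and digits, a link wrapped in Slack's `<url|label>` markup is cut off cleanly before the `|`.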
  31. Section 3: Solution: 3-Layer AI-QA Integration Method

    Layer 1: Playwright Framework
    Layer 2: AI-Augmented Workflow
    Layer 3: Beyond Browser Testing
  32. AI×Playwright: Results Dashboard

    • 50-60% cost reduction
    • Automated test coverage up to 90%
    • 50%+ execution time reduction
    • 50%+ test creation time reduction
    • Test automation now possible for a non-technical QA team
    • Annual savings per engineer: $18K+
  33. Implementation Roadmap: Where to Start

    Week 1-2: Foundation
      - Playwright setup
      - playwright_config.py creation
      - Basic module design
    Week 3-4: Modularization
      - Major flow separation
      - Variant Pattern introduction
      - CSV externalization
    Month 2: AI Integration
      - AI test data generation adoption
      - Prompt library construction
    Month 3: System Integration
      - VBA/S3/Slack integration
      - CI/CD pipeline construction
  34. AI: Free Tools & Paid Tools

    Free Tools
    • Claude Code (claude.ai) - Code generation, debugging, and architecture advice. Free plan available
    • ChatGPT (OpenAI) - Free plan with GPT-4o mini available. Ideal for coding support
    • GitHub Copilot Free - Free plan available for VS Code (with completion limits)
    • Google Gemini - Free plan available. Integration with Google tools possible

    Paid Tools
    • GitHub Copilot - $10/month (individual). Top-tier auto-completion with rich IDE integration
    • Cursor - $20/month. AI-native code editor based on VS Code
    • Claude Pro / API - $20/month or token-based pricing. High-performance model for complex tasks
    • ChatGPT Plus / API - $20/month or token-based pricing. Access to GPT-4o
    • Devin (Cognition AI) - $500/month. Fully autonomous AI software engineer
  35. Section 4: Results & ROI: The Numbers That Matter

    - AI×Playwright: Results Dashboard
    - Implementation Roadmap
    - AI: Free Tools & Paid Tools