

Shivani Gaba
November 22, 2023

A tester's journey in the world of machine learning

Machine learning (ML) now sits at the center of modern software, and tremendous effort goes into developing ML-based systems. But what about ensuring the quality of such applications? What role can testers play in this context?

Usually, testing takes a back seat here, and testers struggle to find their role and the contributions they could make. The perception remains that “It’s machine learning, it cannot be tested”. But is this really the case? If not, how can we bust this myth? How can testers carve out their space in the ML world?

In this presentation, Shivani explores answers to these questions and many more by sharing her experience of testing ML-based applications without any prior experience of doing so. She reflects on her motivation, the challenges she faced, her learnings, and techniques for testing such systems. Get to know how a tester's skill set, such as product knowledge, attention to detail, out-of-the-box thinking, and customer focus, can be a game changer here.


Transcript

  1. Challenge status quo; Being a Question Asker; Think out of the box; Apply product knowledge; Understand the structure & building blocks (@shivani_gaba)
  2. Machine learning

    Enabling machines to learn from data & algorithms without being explicitly programmed
  3. Supervised learning (for article recommendation)

    The model learns from past labelled input data about which articles the user clicked, and then predicts future recommendations that are likely to be clicked.
  4. Learning process

    Step 1: Collect DATA
    Step 2: Select model FEATURES
    Step 3: Train MODEL
  5. Learning process

    Step 1: Collect DATA
    Step 2: Select model FEATURES
    Step 3: Train MODEL
  6. Missing values; Inconsistent data types; Duplicates; Domain knowledge; Security; Data privacy; Biases (completeness); Outlier data; ......
  7. Table:

    User Id | User Gender | Article id | Article age | User Clicked?
    1       | Male        | 20         | 100000      | Yes
    2       | Male        | 30         | 30          | Yes
    2       | Male        | 30         | 30          | Yes
    0       | Male        | 40         | 20          | No
    4       | Male        | 70         | Null        | No
    5       | Female      | 98         | -2          | Yes
    5       | Female      | 86         | -3          | No

    Duplicates; Missing values; Invalid data; Outlier; Bias; Domain knowledge
  8. Table:

    User Id | User Gender | Article id | Article age | …  | User Clicked?
    1       | Male        | 20         | 100000      | …. | Yes
    2       | Male        | 30         | 30          | …. | Yes
    2       | Male        | 30         | 30          | …. | Yes
    0       | Male        | 40         | 20          | …. | No
    4       | Male        | 70         | Null        | …. | No
    5       | Male        | 98         | -2          | …. | Yes
    5       | Female      | 86         | -3          | …. | No
    …       | ….          | ….         | ….          | …. | ….
  9. Feature engineering

    Features are the independent variables/attributes fed to the models.

    - Review & enhance features
    - Standardize rules of features
    - Challenge the current features
    - Document features
  10. Table:

    User Id | User Gender | Article id | Article age | User Clicked?
    1       | Male        | 20         | 10          | Yes
    2       | Female      | 30         | 30          | Yes
    3       | Non binary  | 55         | 30          | Yes
    …       | …           | …          | …           | …
  11. Table:

    User Id | User Gender | Article id | Article age | Total article views | Total article clicks | User Clicked?
    1       | Male        | 20         | 10          | 90000               | 600                  | Yes
    2       | Female      | 30         | 30          | 80000               | 200                  | Yes
    3       | Non binary  | 55         | 30          | 30000               | 500                  | Yes
    …       | …           | …          | …           | …                   | …                    | …

    → 2 views → 1 click
  12. Table:

    User Id | User Gender | Article id | Article age | Total article views | Total article clicks | User Clicked?
    1       | Male        | 20         | 10          | 3000                | 600                  | Yes
    2       | Female      | 30         | 30          | 1000                | 200                  | Yes
    3       | Non binary  | 55         | 30          | 2000                | 500                  | Yes
    …       | …           | …          | …           | …                   | …                    | …
  13. Table:

    User Id | User Gender | Article id | Article age | Total article views | Total article clicks | Article clicks/views | User Clicked?
    1       | Male        | 20         | 10          | 3000                | 600                  | 0.50                 | Yes
    2       | Female      | 30         | 30          | 1000                | 200                  | 0.20                 | Yes
    3       | Non binary  | 55         | 30          | 2000                | 500                  | 0.40                 | Yes
    …       | …           | …          | …           | …                   | …                    | …                    | …
  14. Table:

    User Id | User Gender | Article id | Article age | Total article views | Total article clicks | Article clicks/views | User Clicked?
    1       | Male        | 20         | 10          | 3000                | 600                  | 0.50                 | Yes
    2       | Female      | 30         | 30          | 1000                | 200                  | 0.20                 | Yes
    3       | Non binary  | 55         | 30          | 2000                | 500                  | 0.40                 | Yes
    …       | …           | …          | …           | …                   | …                    | …                    | …

    Model Features; Target (label)
  15. Step 1: Collect DATA; Step 2: Select model FEATURES; Step 3: Train the MODEL
  16. Table:

    User Id | User Gender | Article id | Article age | Total article impressions | Total article clicks | Article click/view ratio | User Clicked?
    1       | Male        | A4         | 10          | 1000                      | 500                  | 0.50                     | Yes
    2       | Female      | A3         | 30          | 2000                      | 600                  | 0.30                     | Yes
    2       | Female      | A5         | 30          | 3000                      | 600                  | 0.50                     | Yes
    3       | Male        | A4         | 20          | 4000                      | 120                  | 0.03                     | No
    4       | Male        | A2         | 50          | 5000                      | 300                  | 0.06                     | No
    5       | Non binary  | A4         | 18          | 700                       | 80                   | 0.11                     | Yes
    5       | Non binary  | A6         | 2           | 5000                      | 540                  | 0.10                     | Yes
    6       | Female      | A4         | 3           | 3400                      | 340                  | 0.10                     | No
    7       | Male        | A7         | 6           | 800                       | 80                   | 0.10                     | Yes
  17. Table:

    User Id | User Gender | Article id | Article age | Total article impressions | Total article clicks | Article click/view ratio | User Clicked?
    1       | Male        | A4         | 10          | 1000                      | 500                  | 0.50                     | Yes
    2       | Female      | A3         | 30          | 2000                      | 600                  | 0.30                     | Yes
    2       | Female      | A5         | 30          | 3000                      | 600                  | 0.50                     | Yes
    3       | Male        | A4         | 20          | 4000                      | 120                  | 0.03                     | No
    4       | Male        | A2         | 50          | 5000                      | 300                  | 0.06                     | No
    5       | Non binary  | A4         | 18          | 700                       | 80                   | 0.11                     | Yes
    5       | Non binary  | A6         | 2           | 5000                      | 540                  | 0.10                     | Yes
    6       | Female      | A4         | 3           | 3400                      | 340                  | 0.10                     | No
    7       | Male        | A7         | 6           | 800                       | 80                   | 0.10                     | Yes

    Training set; Testing set
  18. Feature importance

    User industry; Total article clicks; Article age; Total article views; User gender; …

    Age least important feature
  19. Out of the box test idea

    10,000 users
    Old version: Top X articles
    New version: Top X articles
    Already seen articles; Average age; # of overlap articles; Diverse sources; Diverse types
  20. Is there a need for testers to be involved in testing AI/ML-based projects in today’s era?
  21. Q&A
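
The data checks called out on slide 7 (duplicates, missing values, invalid data, outliers) can be sketched in plain Python. This is a minimal sketch, not the speaker's tooling: the `find_data_issues` helper, the tuple layout, and the `max_age` threshold are assumptions made for illustration. The rows are the ones shown on the slide.

```python
# Rows from slide 7: (user_id, gender, article_id, article_age, clicked).
ROWS = [
    (1, "Male", 20, 100000, "Yes"),
    (2, "Male", 30, 30, "Yes"),
    (2, "Male", 30, 30, "Yes"),
    (0, "Male", 40, 20, "No"),
    (4, "Male", 70, None, "No"),
    (5, "Female", 98, -2, "Yes"),
    (5, "Female", 86, -3, "No"),
]


def find_data_issues(rows, max_age=3650):
    """Return {issue name: [row indexes]}; max_age is an assumed threshold."""
    issues = {"duplicate": [], "missing": [], "invalid": [], "outlier": []}
    seen = set()
    for index, row in enumerate(rows):
        # A row identical to an earlier one is reported as a duplicate.
        if row in seen:
            issues["duplicate"].append(index)
        seen.add(row)
        age = row[3]
        if any(value is None for value in row):
            issues["missing"].append(index)
        elif age < 0:
            issues["invalid"].append(index)  # an age cannot be negative
        elif age > max_age:
            issues["outlier"].append(index)
    return issues


issues = find_data_issues(ROWS)
print(issues)
```

Generic rules like these do not cover the bias and domain-knowledge callouts, such as the skew towards Male rows or whether a user id of 0 is valid. Those questions need product knowledge.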
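
The derived "Article clicks/views" feature on slide 13 can be recomputed from the two totals it is built on, which gives a tester a simple way to check a standardized feature rule. This is a sketch under assumptions: the function names, the rounding to two decimals, the tolerance, and the handling of zero views are not from the talk.

```python
def click_view_ratio(total_clicks, total_views):
    """Clicks divided by views; None when there are no views to divide by."""
    if total_views <= 0:
        return None
    return round(total_clicks / total_views, 2)


def ratio_matches(total_clicks, total_views, stored_ratio, tolerance=0.01):
    """Check a stored feature value against the recomputed ratio."""
    expected = click_view_ratio(total_clicks, total_views)
    return expected is not None and abs(expected - stored_ratio) <= tolerance


# Second row of slide 13: 1000 views, 200 clicks, stored ratio 0.20.
print(click_view_ratio(200, 1000))
print(ratio_matches(200, 1000, 0.20))
```

Running the same check over every row of a feature table flags rows whose stored ratio disagrees with its inputs.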
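
Slide 17 divides the collected rows into a training set and a testing set. The slide does not say how the split is made, so the following is only one possible sketch: the `split_rows` helper, the 30% test fraction, and the fixed seed are assumptions.

```python
import random


def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# (user_id, article_id) pairs for the nine rows on slide 17.
ROWS = [(1, "A4"), (2, "A3"), (2, "A5"), (3, "A4"), (4, "A2"),
        (5, "A4"), (5, "A6"), (6, "A4"), (7, "A7")]

training_set, testing_set = split_rows(ROWS)
print(len(training_set), len(testing_set))
```

The fixed seed keeps the split reproducible, so a test run can be repeated against the same rows.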
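
Slide 18 lists feature importance and calls out age as the least important feature. The talk does not name the method behind that ranking. One common way to question such a ranking is permutation importance, sketched below as a swapped-in technique: `toy_model` is a hypothetical stand-in for a trained model, the age and ratio values are borrowed from slide 16, and the labels come from the toy model rather than from the slides.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction equals the label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def permutation_importance(model, rows, labels, feature, seed=0):
    """Accuracy lost when one feature's values are shuffled across rows."""
    values = [row[feature] for row in rows]
    random.Random(seed).shuffle(values)
    permuted = [{**row, feature: value} for row, value in zip(rows, values)]
    return accuracy(model, rows, labels) - accuracy(model, permuted, labels)


def toy_model(row):
    """Hypothetical stand-in model: predicts a click from the ratio alone."""
    return row["click_view_ratio"] >= 0.10


# Feature values borrowed from slide 16; labels are produced by toy_model.
ROWS = [
    {"article_age": 10, "click_view_ratio": 0.50},
    {"article_age": 20, "click_view_ratio": 0.03},
    {"article_age": 50, "click_view_ratio": 0.06},
    {"article_age": 18, "click_view_ratio": 0.11},
]
LABELS = [toy_model(row) for row in ROWS]

for feature in ("article_age", "click_view_ratio"):
    print(feature, permutation_importance(toy_model, ROWS, LABELS, feature))
```

A feature the model never reads scores exactly zero here. If product knowledge says that feature should matter, the ranking is worth challenging.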
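
The out of the box test idea on slide 19 appears to compare the top X articles from an old and a new version across many users, looking at overlap, average age, and already seen articles. The sketch below reads the slide that way, which is an assumption. The `compare_recommendations` helper and all sample data are hypothetical, and it covers a single user rather than 10,000.

```python
def compare_recommendations(old_top, new_top, article_age, seen):
    """Summarize how the new version's top X articles differ from the old."""
    overlap = set(old_top) & set(new_top)
    return {
        "overlap_count": len(overlap),
        "old_average_age": sum(article_age[a] for a in old_top) / len(old_top),
        "new_average_age": sum(article_age[a] for a in new_top) / len(new_top),
        # Articles the user has already seen but that are recommended again.
        "new_already_seen": sorted(set(new_top) & set(seen)),
    }


# Hypothetical data for one user.
ARTICLE_AGE = {"A2": 50, "A3": 30, "A4": 10, "A6": 2, "A7": 6}
report = compare_recommendations(
    old_top=["A2", "A3", "A4"],
    new_top=["A4", "A6", "A7"],
    article_age=ARTICLE_AGE,
    seen={"A4"},
)
print(report)
```

Aggregating such reports over a large user sample lets a tester spot shifts between versions without knowing the exact expected output for any single user.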