FrUITeR: A Framework for Evaluating UI Test Reuse

Yixue Zhao
November 02, 2020

Presentation slides of the paper "FrUITeR: A Framework for Evaluating UI Test Reuse" at the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2020.
Presentation link: https://youtu.be/zVWpT5aLyQo


Transcript

  1. FrUITeR: A Framework for Evaluating UI Test Reuse. Yixue Zhao¹, Justin Chen², Adriana Sejfia¹, Marcelo Schmitt Laser¹, Jie Zhang³, Federica Sarro³, Mark Harman³, Nenad Medvidović¹. ESEC/FSE 2020, Virtual Event.
  2. What’s UI Test Reuse? ▪ New and exciting! ▪ Test generation technique ▪ “Usage-based” UI tests ▪ Reuse existing tests ▪ UI similarities. [Figure: source app screens (a1) and target app screens (b1, b2, b3) with their widgets a1-1, a1-2, a1-3, b1-1, b2-1, b3-1, b3-2, b3-3.]
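A rough sketch of the setting, under assumptions not stated on the slide: a “usage-based” UI test is an ordered sequence of events on concrete widgets, and a test-reuse technique tries to map each source-app event to a similar target-app event. The `Event` class, widget IDs, and `match_widget` function below are illustrative, not any specific tool's API.

```python
from dataclasses import dataclass

@dataclass
class Event:
    widget_id: str   # e.g., "a1-1" on a source-app (Wish) screen
    action: str      # e.g., "type" or "click"
    value: str = ""  # text typed into a field, if any

# An existing sign-in test on the source app:
source_test = [
    Event("a1-1", "type", "user@example.com"),
    Event("a1-2", "type", "secret"),
    Event("a1-3", "click"),
]

def reuse_test(source_test, match_widget):
    """Transfer a source-app test by replacing each widget with whatever
    target-app widget the technique's similarity function considers closest."""
    return [Event(match_widget(e.widget_id), e.action, e.value) for e in source_test]
```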
  3. Why an evaluation framework? Ideal: happy Yixue in her proposal (“UI Test Reuse!”). Reality: happier people submitting papers (“I’m done hehe!”).
  4. ▪ Limitations of existing work ▪ Which techniques are better ▪ Compare my new technique. FrUITeR was born: A Framework for Evaluating UI Test Reuse.
  5. So hard! ▪ Identify 5 key challenges ▪ Establish 5 requirements ▪ Design framework ▪ Build benchmark ▪ Migrate existing work
  6. Challenges ▪ Evaluation metrics are different and limited ▪ Significant manual effort ▪ No guidelines for manual inspection ▪ One-off solutions evaluated as a whole ▪ Different benchmarks
  7. [Figure: Wish (source) and Etsy (target) screens with widgets a1-1, a1-2, a1-3, b1-1, b2-1, b3-1, b3-2, b3-3.]
  8. [Figure: the Wish and Etsy screens with their widgets.] Wish → Etsy: 1 test w/ 3 events, 3 times.
  9. Wish → Etsy: 1 test w/ 3 events, 3 times. At scale: 10 apps (100 pairs), 10 tests w/ 10 events each, 100 × 100 times!
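One way to read that scale claim: 10 apps yield 10 × 10 = 100 source-target pairs, and 10 tests of 10 events each give 100 events to transfer per pair, so a fully manual evaluation would require on the order of 100 pairs × 100 events = 10,000 event-level checks.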
  10. [Figure: the Wish and Etsy screens with their widgets.] a1-1 → b3-1???
  11. Address Challenge #1: Metrics. (Challenge 1: evaluation metrics are different and limited.) One of the 2 Utility Metrics: Effort = Levenshtein Distance(transEvents, gtEvents).
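A minimal sketch of the Effort utility metric as stated on the slide, i.e., the Levenshtein (edit) distance between the transferred event sequence and the ground-truth event sequence; the implementation and the example event names are illustrative, not FrUITeR's actual code.

```python
def effort(trans_events, gt_events):
    """How many event insertions, deletions, or substitutions are needed
    to turn the transferred test into the ground-truth test."""
    m, n = len(trans_events), len(gt_events)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if trans_events[i - 1] == gt_events[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # delete an event
                           dp[i][j - 1] + 1,         # insert an event
                           dp[i - 1][j - 1] + cost)  # substitute an event
    return dp[m][n]

# Hypothetical transferred vs. ground-truth sign-in tests on the target app:
trans_events = ["click(b3-1)", "type(b3-2)", "click(b3-3)"]
gt_events    = ["type(b3-1)",  "type(b3-2)", "click(b3-3)"]
print(effort(trans_events, gt_events))  # 1: one event must still be fixed by hand
```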
  12. Address Challenges #2 & #3. (Challenge 2: significant manual effort. Challenge 3: no guidelines for manual inspection.)
  13. [Figure-only slide; no transcript text.]
  14. [Figure-only slide; no transcript text.]
  15. Canonical Maps. [Figure: the Wish and Etsy screens with their widgets.] Wish Canonical Map: a1-1 → email, a1-2 → password, a1-3 → sign in. Etsy Canonical Map: b3-1 → email, b3-2 → password, b3-3 → sign in.
  16. Address Challenge #2: the Canonical Maps are the ONLY manual effort. Wish Canonical Map: a1-1 → email, a1-2 → password, a1-3 → sign in. Etsy Canonical Map: b3-1 → email, b3-2 → password, b3-3 → sign in.
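An illustrative sketch of how Canonical Maps remove per-pair manual effort, assuming a plain dictionary representation; the widget IDs and labels come from the slides, while the data structure and function names are assumptions.

```python
# One Canonical Map per app, written once, independent of any source-target pairing.
wish_canonical_map = {"a1-1": "email", "a1-2": "password", "a1-3": "sign in"}
etsy_canonical_map = {"b3-1": "email", "b3-2": "password", "b3-3": "sign in"}

def canonicalize(events, canonical_map):
    """Translate app-specific widget IDs into canonical labels so that tests
    from different apps can be compared automatically instead of being
    inspected by hand for every source-target pair."""
    return [canonical_map.get(widget_id, widget_id) for widget_id in events]

# Both the Wish source test and a transferred Etsy test reduce to the same
# canonical sequence, ["email", "password", "sign in"]:
print(canonicalize(["a1-1", "a1-2", "a1-3"], wish_canonical_map))
print(canonicalize(["b3-1", "b3-2", "b3-3"], etsy_canonical_map))
```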
  17. Wish → Etsy: 1 test w/ 3 events, 3 times. At scale: 10 apps (100 pairs), 10 tests w/ 10 events each, 100 × 100 times!
  18. Address Challenge #3. [Figure: the Wish and Etsy screens with their widgets.] Wish Canonical Map: a1-1 → email, a1-2 → password, a1-3 → sign in. Etsy Canonical Map: b3-1 → email, b3-2 → password, b3-3 → sign in.
  19. Address Challenge #3. [Figure: the Wish and Etsy screens with their widgets.] Wish Canonical Map: a1-1 → email, a1-2 → password, a1-3 → sign in. Etsy Canonical Map: b3-1 → email / username, b3-2 → password, b3-3 → sign in.
  20. More challenges in reality… ▪ Contact authors ▪ Study implementation ▪ Modify/verify implementation ▪ Establish 239 benchmark tests ▪ Construct 20 ground-truth Canonical Maps
  21. FrUITeR’s Empirical Results ▪ 1,000 source-target app pairs (2 app categories × 100 app pairs in each category × 5 techniques) ▪ 11,917 result entries ▪ 7 fidelity metrics ▪ 2 utility metrics
  22. FrUITeR’s Selected Implications ▪ Perfect isn’t perfect (e.g., fidelity vs. utility) ▪ Source app selection (e.g., app company, code clone) ▪ Testing technique selection (e.g., trade-offs, manual vs. auto)