Avgust: Automating Usage-Based Test Generation from Videos of App Executions

Yixue Zhao
January 27, 2023

Presentation slides of our paper "Avgust: Automating Usage-Based Test Generation from Videos of App Executions" at ESEC/FSE 2022.
Presentation: https://youtu.be/LB-TrLhQcvI

Transcript

  1. AVGUST: Automating Usage-Based
    Test Generation from Videos of
    App Executions
    ESEC/FSE 2022, Singapore
    Yixue Zhao, Saghar Talebipour, Kesina Baral, Hyojae Park, Leon
    Yee, Safwat Ali Khan, Yuriy Brun, Nenad Medvidović, Kevin Moran

  2. ▪ 6.64 billion smartphone users worldwide (80%+)
    ▪ 230 billion downloads in 2021 worldwide
    ▪ Avg American spends over 5h/day on mobile devices
    ▪ Avg American checks phone 96 times/day, or once every 10 min
    Source: Statista, ZIPPIA

  3. App Developer
    Oh man, I need to write a sign-in test again…
    Behold! Do NOT fear! Have you tried AVGUST?

  4. App Developer
    AVGUST, I want a sign-in test for my app! Oh, and also search, add cart,

  5. App Developer
    AVGUST, I want a sign-in test for my app! Oh, and also search, add cart,
    AVGUST
    There you go!
    Tests!

  6. App Developer
    Yay thank you AVGUST!!!! What about deposit money for my banking app?

  7. App Developer
    Yay thank you AVGUST!!!! What about deposit money for my banking app?
    AVGUST
    Just give me similar videos and you shall receive! 🙂

  8. UI Testing

  9. Existing Work
    ▪ Random Testing, e.g., Monkey
    ▪ Model-based Testing (MBT), e.g., Stoat
    ▪ ……

  10. Existing Work
    ▪ Random Testing, e.g., Monkey
    ▪ Model-based Testing (MBT), e.g., Stoat
    ▪ ……
    Goal
    Maximize code coverage

  11. Usage-based Test
    ▪ Test usage scenarios of an app (e.g., sign in, add item to the shopping cart)
    ▪ Highly preferred by developers
    ▪ Mimics realistic user behaviors

  12. Usage-based Test, How?
    ▪ Test Transfer
    □ GTM ISSTA 2018
    □ ATM ASE 2019
    □ CraftDroid ASE 2019
    □ FrUITeR ESEC/FSE 2020 (our work 🙂)
    □ MAPIT ASE 2021 (our work 🙂)
    □ …

  13. Usage-based Test, How?
    ▪ Test Transfer
    □ GTM ISSTA 2018
    □ ATM ASE 2019
    □ CraftDroid ASE 2019
    □ FrUITeR ESEC/FSE 2020 (our work 🙂)
    □ MAPIT ASE 2021 (our work 🙂)
    □ …
    Limitation
    Rely on existing tests
    □ Unavailable
    □ Low quality
    □ Too different

  14. AVGUST
    ▪ Developer-in-the-loop tool
    ▪ AVGUST = App-video-based generation of usage tests
    ▪ Only relies on videos
    □ Easy to get (e.g., public, crowdsourcing)
    □ Vision-only (pixel-based)
    □ Cross-platform (any app!)

  15. AVGUST Overview
    [Diagram labels: Videos; Models (per usage); Developer Assistance; Tests!]

  16. Video Analysis
    Event Frames
    ▪ Video processing
    ▪ Action identification
    ▪ Keyboard detection

  17. Model Generation
    Event Frames (app-specific)
    Model (app-independent)

  18. Model Generation
    Event Frames (app-specific)
    Model (app-independent)
    Image Classification!

  19. Model Generation

  20. Model Generation

  21. Model Generation

  22. Model Generation

  23. Model Generation
    ▪ Screen Classifier (37 categories): “home”, “about”, “account”, “cart”, “search”, ……
    ▪ Widget Classifier (74 categories): “menu”, “password”, “add cart”, “bookmark”, “buy”, ……

  24. Screen Features
    Screen Image
    ▪ Visual: Abstract GUI Screen
    ▪ Textual: OCR Text

  25. Widget Features
    ▪ Visual Features: Widget Image, ResNet
    ▪ Textual Features: OCR + BERT
    ▪ Location: Top-Left
    ▪ Type: ImageButton
    ▪ Context: “Home” Screen
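
Editor's sketch for the Widget Features slide: one plausible way to combine the five feature groups into a single input vector for a widget classifier. This is not taken from the AVGUST implementation. The vocabularies, the `WidgetObservation` class, and the plain concatenation are assumptions made for illustration, and the short lists of numbers stand in for real ResNet and BERT embeddings.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical vocabularies, chosen only for this illustration.
LOCATIONS = ["Top-Left", "Top-Center", "Top-Right",
             "Center-Left", "Center", "Center-Right",
             "Bottom-Left", "Bottom-Center", "Bottom-Right"]
WIDGET_TYPES = ["Button", "ImageButton", "EditText", "TextView"]
SCREEN_TAGS = ["home", "about", "account", "cart", "search"]


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a categorical value as a one-hot vector over `vocabulary`."""
    if value not in vocabulary:
        raise ValueError(f"unknown value: {value!r}")
    return [1.0 if item == value else 0.0 for item in vocabulary]


@dataclass
class WidgetObservation:
    visual: List[float]   # stand-in for a ResNet embedding of the widget image
    textual: List[float]  # stand-in for a BERT embedding of the OCR text
    location: str         # coarse position of the widget on the screen
    widget_type: str      # e.g. "ImageButton"
    screen: str           # tag of the screen the widget appears on


def widget_feature_vector(w: WidgetObservation) -> List[float]:
    """Concatenate all feature groups into one flat vector (assumed design)."""
    return (list(w.visual)
            + list(w.textual)
            + one_hot(w.location, LOCATIONS)
            + one_hot(w.widget_type, WIDGET_TYPES)
            + one_hot(w.screen, SCREEN_TAGS))


example = WidgetObservation(
    visual=[0.12, 0.80, 0.33],  # made-up numbers
    textual=[0.05, 0.91],       # made-up numbers
    location="Top-Left",
    widget_type="ImageButton",
    screen="home",
)
features = widget_feature_vector(example)
print(len(features))
```

The categorical groups (location, type, screen context) are one-hot encoded here so they can sit next to the dense embeddings; a real classifier might weight or embed them differently.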

  26. Model Generation

  27. Model Generation
    App1 …… AppN

  28. Test Generation
    [Diagram labels: Videos; IR Models; Developer Assistance; Tests!]

  29. Current State of Target App
    Top-K actions
    Test(s)! 🙂
    Model (e.g., sign in)
    Image Features
    Textual Features
    IR Classifiers
    Developer
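
Editor's sketch of the "Top-K actions" step: given a score for each candidate action on the current screen, keep the K best and hand them to the developer to choose from. The function name `top_k_actions`, the action strings, and the scores are hypothetical. How AVGUST actually scores candidates is not shown on the slide, so the scores here are simply assumed to exist.

```python
import heapq
from typing import Dict, List, Tuple


def top_k_actions(scores: Dict[str, float], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (action, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Made-up scores for candidate actions on a sign-in screen.
candidate_scores = {
    "tap 'Sign in' button": 0.87,
    "tap 'Menu' icon": 0.41,
    "type into 'Email' field": 0.74,
    "tap 'Cart' icon": 0.12,
}

suggestions = top_k_actions(candidate_scores, k=2)
for action, score in suggestions:
    print(f"{score:.2f}  {action}")
```

Showing a short ranked list rather than a single best guess fits the developer-in-the-loop design: the developer picks the right action when the top suggestion is wrong.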

  30. AVGUST Evaluation
    ▪ 374 videos, 18 usages, 18 apps
    ▪ 51 generated tests for unseen apps
    □ 69% successful!
    □ 80% state precision (avg.)
    □ 70% state recall (avg.)
    Saves effort!
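
Editor's sketch of how state precision and recall could be computed for one generated test, assuming the usual set-based definitions: precision is the share of states the test visits that belong to the intended usage, and recall is the share of intended states that the test visits. The slide does not give the exact definitions used in the paper, so this is an assumption, and the state names are invented.

```python
from typing import Set, Tuple


def state_precision_recall(generated: Set[str],
                           expected: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall over visited states (assumed definition)."""
    matched = generated & expected
    precision = len(matched) / len(generated) if generated else 0.0
    recall = len(matched) / len(expected) if expected else 0.0
    return precision, recall


# Invented example: states visited by a generated test vs. the intended usage.
generated_states = {"home", "sign in", "password", "menu", "account"}
expected_states = {"home", "sign in", "email", "password", "account", "settings"}

precision, recall = state_precision_recall(generated_states, expected_states)
print(f"precision={precision:.2f} recall={recall:.2f}")
```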

  31. AVGUST Evaluation
    Screen IR Classifier’s Accuracy

  32. AVGUST Contributions
    ▪ First usage-based test generation based on videos
    ▪ Effective image classification (videos → formal models)
    ▪ Ready-to-use trained models

  33. AVGUST Contributions
    ▪ First usage-based test generation based on videos
    ▪ Effective image classification (videos → formal models)
    ▪ Ready-to-use trained models
    Crowd Workers
    Community
    Database
    Usage-based Tests, and more!

  34. Thank You
    Saghar Kesina Hyojae Leon
    Safwat Yuriy Neno Kevin

  35. Questions?
    Let’s connect! 🙂
    ▪ Email: [email protected]
    ▪ Twitter: @yixue_zhao
    ▪ LinkedIn: www.linkedin.com/in/yixue-zhao/