Slide 1

Slide 1 text

AVGUST: Automating Usage-Based Test Generation from Videos of App Executions
ESEC/FSE 2022, Singapore
Yixue Zhao, Saghar Talebipour, Kesina Baral, Hyojae Park, Leon Yee, Safwat Ali Khan, Yuriy Brun, Nenad Medvidović, Kevin Moran

Slide 2

Slide 2 text

▪ 6.64 billion smartphone users worldwide (80%+)
▪ 230 billion downloads in 2021 worldwide
▪ Avg American spends over 5h/day on mobile devices
▪ Avg American checks phone 96 times/day, or once every 10 min
Source: Statista, ZIPPIA

Slide 3

Slide 3 text

App Developer: Oh man, I need to write a sign-in test again…
Behold! Do NOT fear! Have you tried AVGUST?

Slide 4

Slide 4 text

App Developer: AVGUST, I want a sign-in test for my app! Oh, and also search, add cart, …

Slide 5

Slide 5 text

App Developer: AVGUST, I want a sign-in test for my app! Oh, and also search, add cart, …
AVGUST: There you go! (Tests!)

Slide 6

Slide 6 text

App Developer: Yay thank you AVGUST!!!! What about deposit money for my banking app?

Slide 7

Slide 7 text

App Developer: Yay thank you AVGUST!!!! What about deposit money for my banking app?
AVGUST: Just give me similar videos and you shall receive! :)

Slide 8

Slide 8 text

UI Testing

Slide 9

Slide 9 text

Existing Work
▪ Random Testing, e.g., Monkey
▪ Model-based Testing (MBT), e.g., Stoat
▪ ……

Slide 10

Slide 10 text

Existing Work
▪ Random Testing, e.g., Monkey
▪ Model-based Testing (MBT), e.g., Stoat
▪ ……
Goal: Maximize code coverage

Slide 11

Slide 11 text

Usage-based Test
▪ Test usage scenarios of an app (e.g., sign in, add item to the shopping cart)
▪ Highly preferred by developers
▪ Mimics realistic user behaviors

Slide 12

Slide 12 text

Usage-based Test, How?
▪ Test Transfer
  □ GTM ISSTA 2018
  □ ATM ASE 2019
  □ CraftDroid ASE 2019
  □ FrUITeR ESEC/FSE 2020 (our work :))
  □ MAPIT ASE 2021 (our work :))
  □ …

Slide 13

Slide 13 text

Usage-based Test, How?
▪ Test Transfer
  □ GTM ISSTA 2018
  □ ATM ASE 2019
  □ CraftDroid ASE 2019
  □ FrUITeR ESEC/FSE 2020 (our work :))
  □ MAPIT ASE 2021 (our work :))
  □ …
Limitation: Relies on existing tests
  □ Unavailable
  □ Low quality
  □ Too different

Slide 14

Slide 14 text

AVGUST
▪ Developer-in-the-loop tool
▪ AVGUST = App-video-based generation of usage tests
▪ Only relies on videos
  □ Easy to get (e.g., public, crowdsourcing)
  □ Vision-only (pixel-based)
  □ Cross-platform (any apps!)

Slide 15

Slide 15 text

AVGUST Overview
[Diagram labels: Videos; Models (per usage); Developer Assistance; Tests!]

Slide 16

Slide 16 text

Video Analysis
Output: Event Frames
▪ Video processing
▪ Action identification
▪ Keyboard detection

Slide 17

Slide 17 text

Model Generation
[Diagram labels: (app-specific) Event Frames; (app-independent) Model]

Slide 18

Slide 18 text

Model Generation
[Diagram labels: (app-specific) Event Frames; (app-independent) Model]
Image Classification!

Slide 19

Slide 19 text

Model Generation

Slide 20

Slide 20 text

Model Generation

Slide 21

Slide 21 text

Model Generation

Slide 22

Slide 22 text

Model Generation

Slide 23

Slide 23 text

Model Generation
▪ Screen Classifier: “home”, “about”, “account”, “cart”, “search”, …… (37)
▪ Widget Classifier: “menu”, “password”, “add cart”, “bookmark”, “buy”, …… (74)

Slide 24

Slide 24 text

Screen Features
[Diagram labels: Screen Image; Visual; Textual; Abstract GUI Screen; OCR Text]

Slide 25

Slide 25 text

Widget Features
[Diagram labels: Widget Image; Visual Features; ResNet; Textual Features; OCR + BERT; Location: Top-Left; Type: ImageButton; Context: “Home” Screen]

Slide 26

Slide 26 text

Model Generation

Slide 27

Slide 27 text

Model Generation
[Diagram labels: App1; ……; AppN]
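A minimal sketch of how event traces from several apps (App1 … AppN) might be merged into one app-independent usage model. This is an illustration under assumptions, not AVGUST's implementation: the function name `build_usage_model`, the trace format of (screen label, widget label) pairs, and the "sign_in" and "email" labels are hypothetical.

```python
from collections import defaultdict


def build_usage_model(traces):
    """Merge per-app traces into a transition-count table.

    Each trace is a list of (screen_label, widget_label) events.
    Returns: screen -> {(widget, next_screen): count}.
    """
    model = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        # Pair each event with the screen reached by the following event.
        for (screen, widget), (next_screen, _) in zip(trace, trace[1:]):
            model[screen][(widget, next_screen)] += 1
    return {screen: dict(edges) for screen, edges in model.items()}


# Hypothetical sign-in traces from two apps; the last event only marks
# the screen where the usage ends.
app1 = [("home", "menu"), ("sign_in", "password"), ("account", "menu")]
app_n = [("home", "menu"), ("sign_in", "email"),
         ("sign_in", "password"), ("account", "menu")]

usage_model = build_usage_model([app1, app_n])
print(usage_model["sign_in"])
```

Because the traces use canonical labels rather than app-specific widget identifiers, transitions seen in different apps fall onto the same edges and their counts add up.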

Slide 28

Slide 28 text

Test Generation
[Diagram labels: Videos; IR Models; Developer Assistance; Tests!]

Slide 29

Slide 29 text

[Diagram labels: Current State of Target App; Image Features; Textual Features; IR Classifiers; Model (e.g., sign in); Top-K actions; Developer; Test(s)! :)]
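The Top-K step can be sketched as follows. This is a hedged illustration, not AVGUST's actual ranking: the function `top_k_actions`, the scoring rule (transition frequency in the usage model times classifier confidence), and the example labels and scores are all assumptions.

```python
def top_k_actions(model, current_screen, widget_scores, k=3):
    """Rank widgets detected on the current screen.

    model: screen -> {(widget, next_screen): count}
    widget_scores: widget label -> classifier confidence (assumed input)
    """
    transitions = model.get(current_screen, {})
    total = sum(transitions.values()) or 1
    ranked = []
    for widget, confidence in widget_scores.items():
        # How often the usage model takes this widget from this screen.
        freq = sum(c for (w, _), c in transitions.items() if w == widget) / total
        ranked.append((freq * confidence, widget))
    # Highest score first; ties broken alphabetically for determinism.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [widget for _, widget in ranked[:k]]


# Hypothetical usage model and classifier output for a sign-in screen.
model = {"sign_in": {("password", "account"): 2, ("email", "sign_in"): 1}}
scores = {"password": 0.9, "email": 0.8, "menu": 0.7}

suggestions = top_k_actions(model, "sign_in", scores, k=2)
print(suggestions)
```

The developer would then pick one of the suggested actions, and the chosen action becomes the next step of the generated test.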

Slide 30

Slide 30 text

AVGUST Evaluation
▪ 374 videos, 18 usages, 18 apps
▪ 51 generated tests for unseen apps
  □ 69% successful!
  □ 80% state precision (avg.)
  □ 70% state recall (avg.)
Saves effort!
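The slide does not define state precision and recall. One plausible reading, shown here purely as an assumption, compares the set of states a generated test visits against the states of a reference test for the same usage. The function name and the example states are hypothetical.

```python
def state_precision_recall(generated_states, expected_states):
    """Set-based precision and recall over visited states (assumed definition)."""
    generated, expected = set(generated_states), set(expected_states)
    overlap = generated & expected
    precision = len(overlap) / len(generated) if generated else 0.0
    recall = len(overlap) / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical example: the generated test visits one extra state.
precision, recall = state_precision_recall(
    ["home", "search", "cart", "about"],
    ["home", "search", "cart"],
)
print(precision, recall)
```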

Slide 31

Slide 31 text

AVGUST Evaluation
[Figure: Screen IR Classifier’s Accuracy]

Slide 32

Slide 32 text

AVGUST Contributions
▪ First usage-based test generation based on videos
▪ Effective image classification (videos → formal models)
▪ Ready-to-use trained models

Slide 33

Slide 33 text

AVGUST Contributions
▪ First usage-based test generation based on videos
▪ Effective image classification (videos → formal models)
▪ Ready-to-use trained models
[Diagram labels: Crowd Workers; Community Database; Usage-based Tests, and more!]

Slide 34

Slide 34 text

Thank You
Saghar, Kesina, Hyojae, Leon, Safwat, Yuriy, Neno, Kevin

Slide 35

Slide 35 text

Questions? Let’s connect! :)
▪ Email: yzhao@isi.edu
▪ Twitter: @yixue_zhao
▪ LinkedIn: www.linkedin.com/in/yixue-zhao/