Evaluating Non-adequate Test-Case Reduction

Rahul Gopinath
September 07, 2016

Mohammad Amin Alipour, August Shi, Rahul Gopinath, Darko Marinov, Alex Groce
IEEE/ACM Conference on Automated Software Engineering
(ASE 2016), pages 16-26, Singapore, Singapore, Sept. 2016


Transcript

  1. Evaluating Non-adequate Test-Case Reduction

    Mohammad Amin Alipour, August Shi, Rahul Gopinath, Darko Marinov, and Alex Groce
    ASE 2016, Singapore, Singapore, September 5, 2016
    CCF-1054876, CCF-1409423, CCF-1421503
  2. Testing Can Be Slow

    T1 T2 T3 T4 ... Tn
  3. Test-Suite Reduction

    T1 T2 T3 T4 ... Tn  →  T1 T3 ... Tn
    Still satisfies all test requirements
  4. Non-adequate Test-Suite Reduction

    T1 T2 T3 T4 ... Tn  →  T1 T3 ... Tn
    Satisfies almost all test requirements
  5. Test-Case Reduction

    T1 T2 T3 T4 ... Tn  →  T1’ T2’ T3’ T4’ ... Tn’
    Each test case still satisfies the same test requirements
  6. Our Work: Non-adequate Test-Case Reduction

    T1 T2 T3 T4 ... Tn  →  T1’ T2’ T3’ T4’ ... Tn’
    Each test case satisfies almost the same test requirements
  7. Non-adequate Test-Case Reduction: Approaches

    •  Reduce test cases without preserving all test requirements
    •  We propose two approaches:
    •  C%-Coverage: coverage-based non-adequate test-case reduction
    •  N-Mutant: mutant-based non-adequate test-case reduction
  8. Non-adequate Test-Case Reduction: Metrics

    •  We evaluate with three metrics:
    •  Size Reduction Rate (SRR): how much a test case is reduced
    •  Coverage Preservation Rate (CPR): how much coverage the reduced test case preserves
    •  Mutant Preservation Rate (MPR): how many killed mutants the reduced test case preserves
  9. Adequate Test-Case Reduction (Coverage)

    Cause Reduction (based on Delta Debugging)*
    [Diagram: To → To’ → To’’ → ... → Tr (1-minimal); every test case is labeled "Covers lines: 1,2,4,7,8"]
    *Groce, A., Alipour, M., Zhang, C., Chen, Y., and Regehr, J. Cause reduction for quick testing. ICST 2014
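The idea of reducing a test case until it is 1-minimal (no single component can be removed without violating the preserved property) can be sketched as below. This is a simplified one-at-a-time removal loop, not the full delta-debugging algorithm used by cause reduction; the names `reduce_one_minimal`, `TOY_COVERAGE`, and `toy_coverage` are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Set

def reduce_one_minimal(test: List[str],
                       keeps: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop components until removing any single one breaks `keeps`."""
    current = list(test)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if keeps(candidate):
                current = candidate   # accept the smaller test and restart
                changed = True
                break
    return current

# Hypothetical toy model: each test component covers a fixed set of lines.
TOY_COVERAGE: Dict[str, Set[int]] = {
    "a": {1, 2}, "b": {2}, "c": {4, 7}, "d": {7, 8}, "e": {8},
}

def toy_coverage(test: List[str]) -> Set[int]:
    covered: Set[int] = set()
    for component in test:
        covered |= TOY_COVERAGE[component]
    return covered

original = ["a", "b", "c", "d", "e"]
target = toy_coverage(original)            # lines 1, 2, 4, 7, 8

# Adequate reduction: the reduced test must cover every originally covered line.
reduced = reduce_one_minimal(original, lambda t: toy_coverage(t) >= target)
print(reduced)                             # ['a', 'c', 'e']
```

Swapping in a weaker `keeps` predicate is what turns this adequate reduction into a non-adequate one.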
  10. Adequate Test-Case Reduction (Mutants)

    [Diagram: To → To’ → To’’ → ... → Tr (1-minimal); every test case is labeled "Kills mutants: M1,M3,M7,M10"]
  11. Non-adequate Test-Case Reduction

    Approach      Non-adequate Reduction                          Adequate Reduction
    C%-Coverage   Preserve at least C% of coverage                C=100
    N-Mutant      Preserve at least N specified mutants killed    N=all killed mutants
  12. C%-Coverage vs. N-Mutant: 3 Differences

    Approach      Test Requirement   Percentage vs. Absolute   Changing vs. Fixed Test Requirements
    C%-Coverage   Lines Covered      Percentage                Any C% lines covered
    N-Mutant      Mutants Killed     Absolute                  Fixed N killed mutants
  13. C%-Coverage

    C = 80
    [Diagram: To → To’ → To’’ → To’’’ → ... → Tr; every test case is labeled "Covers lines: 1,2,4,7,8"]
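One plausible reading of the C%-Coverage acceptance check is sketched below: a candidate is kept if it still covers at least C% of the lines the original test covered, regardless of which lines those are. The function name `preserves_c_percent` and the exact counting rule are assumptions made for illustration, not taken from the slides.

```python
from typing import Set

def preserves_c_percent(original_cov: Set[int],
                        candidate_cov: Set[int],
                        c: int) -> bool:
    """True if the candidate still covers at least c% of the originally covered lines."""
    kept = len(original_cov & candidate_cov)
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return 100 * kept >= c * len(original_cov)

original_cov = {1, 2, 4, 7, 8}
print(preserves_c_percent(original_cov, {1, 2, 4, 7}, 80))   # True: 4 of 5 lines is 80%
print(preserves_c_percent(original_cov, {2, 4, 7, 8}, 80))   # True: a different 80% also passes
print(preserves_c_percent(original_cov, {1, 2, 4}, 80))      # False: only 60% is preserved
print(preserves_c_percent(original_cov, {1, 2, 4, 7}, 100))  # False: C=100 is adequate reduction
```

Because any subset of sufficient size passes, the set of preserved lines can change from one reduction step to the next.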
  14. N-Mutant

    N = 3 {M3,M7,M10}
    [Diagram: To → To’ → To’’ → To’’’ → ... → Tr; every test case is labeled "Kills mutants: M1,M3,M7,M10"]
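A minimal sketch of the N-Mutant acceptance check, assuming N mutants are chosen once from those killed by the original test and then held fixed for the whole reduction. The helper names `pick_required_mutants` and `preserves_n_mutants`, and the seeded sampling, are hypothetical.

```python
import random
from typing import Set

def pick_required_mutants(killed: Set[str], n: int, seed: int = 0) -> Set[str]:
    """Choose a fixed set of n mutants from those the original test kills."""
    rng = random.Random(seed)
    return set(rng.sample(sorted(killed), n))

def preserves_n_mutants(required: Set[str], candidate_killed: Set[str]) -> bool:
    """True if the candidate still kills every one of the required mutants."""
    return required <= candidate_killed

killed_by_original = {"M1", "M3", "M7", "M10"}
required = {"M3", "M7", "M10"}                 # the N = 3 example from the slide

print(preserves_n_mutants(required, {"M3", "M7", "M10"}))  # True: M1 may be lost
print(preserves_n_mutants(required, {"M1", "M3", "M7"}))   # False: M10 is no longer killed

sampled = pick_required_mutants(killed_by_original, 3)
print(len(sampled), sampled <= killed_by_original)         # 3 True
```

Unlike the percentage-based check, the required set here does not change during reduction.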
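  15. Metrics

    •  Size Reduction Rate (SRR)
       \[ \mathrm{SRR}(T_o, T_r) = \frac{\mathrm{size}(T_o) - \mathrm{size}(T_r)}{\mathrm{size}(T_o)} \]
    •  Coverage Preservation Rate (CPR)
       \[ \mathrm{CPR}(T_o, T_r) = \frac{|\mathrm{cov}(T_o) \cap \mathrm{cov}(T_r)|}{|\mathrm{cov}(T_o)|} \]
    •  Mutant Preservation Rate (MPR)
       \[ \mathrm{MPR}(T_o, T_r) = \frac{|\mathrm{killed}(T_o) \cap \mathrm{killed}(T_r)|}{|\mathrm{killed}(T_o)|} \]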
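The three metrics can be read as simple ratios between an original test case and its reduced version. The sketch below assumes test size is a plain count of components and that coverage and killed mutants are available as sets; the function names and example values are illustrative only.

```python
from typing import Set

def srr(size_original: int, size_reduced: int) -> float:
    """Size Reduction Rate: fraction of the original test case that was removed."""
    return (size_original - size_reduced) / size_original

def cpr(cov_original: Set[int], cov_reduced: Set[int]) -> float:
    """Coverage Preservation Rate: fraction of originally covered lines still covered."""
    return len(cov_original & cov_reduced) / len(cov_original)

def mpr(killed_original: Set[str], killed_reduced: Set[str]) -> float:
    """Mutant Preservation Rate: fraction of originally killed mutants still killed."""
    return len(killed_original & killed_reduced) / len(killed_original)

# Illustrative values: a 10-component test reduced to 4 components.
print(srr(10, 4))                                             # 0.6
print(cpr({1, 2, 4, 7, 8}, {1, 2, 4, 7}))                     # 0.8
print(mpr({"M1", "M3", "M7", "M10"}, {"M3", "M7", "M10"}))    # 0.75
```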
  16. Research Questions

    •  RQ1: How much are test cases reduced (SRR)?
    •  RQ2: How much are code coverage and mutants killed preserved (CPR and MPR)?
    •  RQ3: How do SRR, CPR, and MPR trade off?
    •  RQ4: How do CPR and MPR for our approaches compare to CPR and MPR for random test-case reduction?
    See paper for RQ4 evaluation
  17. Experimental Setup

    •  C from {70,80,90,95,100}
    •  Coverage measured using GCov
    •  N from {1,2,4,8,16,32}
    •  Mutants generated using Andrews et al. mutation tool*
    •  Randomly sampled mutants
    •  See paper for evaluation using minimal mutants
    •  Reduction timeout of 30 minutes per test case
    *Andrews, J., Briand, L., and Labiche, Y. Is mutation an appropriate tool for testing experiments? ICSE 2005
  18. Projects

    Project        # Test Cases   What is Removed             # Mutants   Min. Killed   Max. Killed
    SpiderMonkey   99             JavaScript statement        69,067      8,101         12,825
    YAFFS2         99             API call                    15,046      2,071         3,439
    Grep           112            Character in command line   7,591       19            993
    Gzip           73             Byte                        7,175       1,813         2,046

    Experiments use N from 1 to 32, a small percentage of min. killed
  19. RQ1: Size Reduction Rate (SRR)

    [Plot: C%-Coverage]
    Median SRR > 50% for non-adequate
  20. RQ2: Coverage Preservation Rate (CPR)

    [Plot: N-Mutant]
    Median CPR close to 80% with just one mutant!
  21. RQ2: Mutant Preservation Rate (MPR)

    [Plot: N-Mutant]
    Relatively high MPR with even just one mutant!
  22. RQ3: SRR vs. CPR (YAFFS2) [Plots: C%-Coverage, N-Mutant]

  23. RQ3: SRR vs. MPR (SpiderMonkey) [Plots: C%-Coverage, N-Mutant]

  24. RQ3: CPR vs. MPR [Plots: C%-Coverage, N-Mutant]

  25. RQ Highlights

    •  RQ1: High SRR difference from adequate to non-adequate
    •  RQ2: High CPR/MPR even with low non-adequacy, e.g., N=1 for N-Mutant
    •  RQ3: Higher SRR comes at the cost of lower CPR/MPR; high CPR tends to imply high MPR
    •  Trade-offs are less clear in the case of N-Mutant
  26. Conclusions

    •  We propose non-adequate test-case reduction
    •  Non-adequate test-case reduction:
    •  Provides high size reduction and still largely preserves quality
    •  C%-Coverage offers substantial size reduction with controlled loss in coverage
    •  N-Mutant shows that preserving just a small number of mutants can still preserve a large percentage
    •  High dependency among mutants needs more investigation
    awshi2@illinois.edu
  27. Minimal Mutants vs. 1-Mutant

  28. C%-Coverage vs Random (baseline)

  29. N-Mutant vs Random (baseline)

  30. Interdependency between Mutants

  31. Reduction Time