Evaluating Non-adequate Test-Case Reduction

Rahul Gopinath
September 07, 2016

Mohammad Amin Alipour, August Shi, Rahul Gopinath, Darko Marinov, Alex Groce
IEEE/ACM Conference on Automated Software Engineering
(ASE 2016), pages 16-26, Singapore, Singapore, Sept. 2016

Transcript

  1. Evaluating Non-adequate
    Test-Case Reduction
    Mohammad Amin Alipour, August Shi, Rahul Gopinath,
    Darko Marinov, and Alex Groce
    ASE 2016
    Singapore, Singapore
    September 5, 2016
    CCF-1054876
    CCF-1409423
    CCF-1421503

  2. Testing Can Be Slow
    [Figure: a test suite of test cases T1, T2, T3, T4, ..., Tn]

  3. Test-Suite Reduction
    [Figure: test suite T1, T2, T3, T4, ..., Tn reduced to T1, T3, ..., Tn]
    Still satisfies all
    test requirements

  4. Non-adequate Test-Suite Reduction
    [Figure: test suite T1, T2, T3, T4, ..., Tn reduced to T1, T3, ..., Tn]
    Satisfies almost all
    test requirements
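The contrast between adequate and non-adequate test-suite reduction can be sketched with a greedy selection loop. This is only an illustration, not the technique used in the talk: the function `reduce_suite`, the `fraction` parameter, and the toy requirement sets are all assumptions made for the example.

```python
def reduce_suite(requirements_by_test, fraction=1.0):
    """Greedily keep tests until `fraction` of all satisfied requirements is covered.

    fraction=1.0 sketches adequate reduction (all requirements preserved);
    fraction<1.0 sketches non-adequate reduction (almost all preserved).
    """
    all_reqs = set().union(*requirements_by_test.values())
    needed = fraction * len(all_reqs)
    covered, kept = set(), []
    while len(covered) < needed:
        # Pick the test that adds the most not-yet-covered requirements.
        best = max(requirements_by_test,
                   key=lambda t: len(requirements_by_test[t] - covered))
        gain = requirements_by_test[best] - covered
        if not gain:
            break
        kept.append(best)
        covered |= gain
    return kept


# Hypothetical suite: each test maps to the requirements it satisfies.
suite = {"T1": {1, 2, 3}, "T2": {2, 3}, "T3": {4, 5}, "T4": {5}, "Tn": {6}}
adequate = reduce_suite(suite)            # keeps every requirement
non_adequate = reduce_suite(suite, 0.8)   # keeps at least 80% of them
```

With this toy data the adequate run keeps T1, T3, and Tn, while the 80% run drops Tn as well, trading one requirement for a smaller suite.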

  5. Test-Case Reduction
    [Figure: test cases T1, T2, T3, T4, ..., Tn, each reduced to a smaller test case T1', T2', T3', T4', ..., Tn']
    Each test case still satisfies
    same test requirements

  6. Our Work: Non-adequate Test-Case Reduction
    [Figure: test cases T1, T2, T3, T4, ..., Tn, each reduced to a smaller test case T1', T2', T3', T4', ..., Tn']
    Each test case satisfies almost
    same test requirements

  7. Non-adequate Test-Case Reduction: Approaches
    •  Reduce test cases without preserving all test requirements
    •  We propose two approaches:
        •  C%-Coverage: coverage-based non-adequate test-case reduction
        •  N-Mutant: mutant-based non-adequate test-case reduction

  8. Non-adequate Test-Case Reduction: Metrics
    •  We evaluate with three metrics:
        •  Size Reduction Rate (SRR): how much a test case is reduced
        •  Coverage Preservation Rate (CPR): how much coverage the
           reduced test case preserves
        •  Mutant Preservation Rate (MPR): how many killed mutants the
           reduced test case preserves

  9. Adequate Test-Case Reduction (Coverage)
    [Figure: reduction chain To, To', To'', ..., Tr; every test case in the chain covers lines 1,2,4,7,8, and Tr is 1-minimal]
    Cause Reduction (based on Delta Debugging)*
    *Groce, A., Alipour, M., Zhang, C., Chen, Y., and Regehr, J. Cause reduction for quick testing. ICST 2014
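A minimal sketch of adequate, coverage-preserving reduction follows. It uses a simplified remove-one-component-at-a-time loop rather than full Delta Debugging, so it is not the cause reduction algorithm itself; it only shows how a 1-minimal result arises. The names `reduce_test`, `coverage`, `LINES`, and the toy per-step coverage data are hypothetical.

```python
def reduce_test(test, keeps_requirements):
    """Return a 1-minimal sub-list of `test` that still satisfies the predicate.

    1-minimal: removing any single remaining component breaks the predicate.
    """
    current = list(test)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if keeps_requirements(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the smaller test case
    return current


# Toy coverage model (an assumption): each test component covers some lines.
LINES = {"a": {1, 2}, "b": {2}, "c": {4, 7}, "d": {7}, "e": {8}}


def coverage(test):
    """Union of the lines covered by every component of the test case."""
    return set().union(*(LINES[step] for step in test))


original = ["a", "b", "c", "d", "e"]
original_cov = coverage(original)  # lines 1, 2, 4, 7, 8

# Adequate reduction: every originally covered line must stay covered.
reduced = reduce_test(original, lambda t: coverage(t) >= original_cov)
```

Swapping the predicate for a weaker one is all it takes to turn this adequate reducer into a non-adequate one.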

  10. Adequate Test-Case Reduction (Mutants)
    [Figure: reduction chain To, To', To'', ..., Tr; every test case in the chain kills mutants M1,M3,M7,M10, and Tr is 1-minimal]

  11. Non-adequate Test-Case Reduction
                   Non-adequate Reduction                          Adequate Reduction
    C%-Coverage    Preserve at least C% of coverage                C = 100
    N-Mutant       Preserve at least N specified mutants killed    N = all killed mutants

  12. C%-Coverage vs. N-Mutant: 3 Differences
                   Test Requirement    Percentage vs. Absolute    Changing vs. Fixed Test Requirements
    C%-Coverage    Lines Covered       Percentage                 Any C% lines covered
    N-Mutant       Mutants Killed      Absolute                   Fixed N killed mutants

  13. C%-Coverage
    [Figure: reduction chain To, To', To'', To''', ..., Tr with C = 80; annotations read "Covers lines: 1,2,4,7,8"]
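The C%-Coverage acceptance check can be sketched as a predicate over covered-line sets. The function name `preserves_c_percent` is hypothetical, and counting only lines that the original test case also covered is an assumption made here, not something the slides state.

```python
def preserves_c_percent(original_cov, candidate_cov, c):
    """True if the candidate keeps at least c% of the originally covered lines.

    Any c% of the lines will do: the requirement is a percentage,
    not a fixed set of lines.
    """
    preserved = len(original_cov & candidate_cov)
    # Integer comparison avoids floating-point rounding at the threshold.
    return preserved * 100 >= c * len(original_cov)


original_cov = {1, 2, 4, 7, 8}

keeps_four = preserves_c_percent(original_cov, {1, 2, 4, 8}, 80)   # 4 of 5 lines
keeps_three = preserves_c_percent(original_cov, {1, 2, 8}, 80)     # 3 of 5 lines
adequate = preserves_c_percent(original_cov, {1, 2, 4, 8}, 100)    # C = 100
```

With C = 80 a candidate may lose any one of the five lines; with C = 100 the check degenerates to adequate reduction.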

  14. N-Mutant
    [Figure: reduction chain To, To', To'', To''', ..., Tr with N = 3 and fixed mutants {M3,M7,M10}; annotations read "Kills mutants: M1,M3,M7,M10"]
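The N-Mutant check differs from C%-Coverage in that the N mutants are fixed before reduction starts. A minimal sketch, assuming the targets are drawn at random from the mutants the original test case kills; `choose_targets`, `preserves_targets`, and the `seed` parameter are hypothetical names.

```python
import random


def choose_targets(killed_by_original, n, seed=0):
    """Pick a fixed set of n mutants from those the original test case kills.

    Raises ValueError if n exceeds the number of killed mutants.
    """
    rng = random.Random(seed)
    return frozenset(rng.sample(sorted(killed_by_original), n))


def preserves_targets(killed_by_candidate, targets):
    """True if the candidate still kills every one of the fixed target mutants."""
    return targets <= set(killed_by_candidate)


killed_by_original = {"M1", "M3", "M7", "M10"}
sampled = choose_targets(killed_by_original, 3)

# The fixed set shown on the slide for N = 3.
targets = frozenset({"M3", "M7", "M10"})
still_ok = preserves_targets({"M3", "M7", "M10"}, targets)   # M1 may be lost
broken = preserves_targets({"M1", "M3", "M7"}, targets)      # M10 is required
```

Because the targets are fixed, a candidate that kills three other mutants does not qualify, unlike C%-Coverage where any C% of the lines is acceptable.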

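  15. Metrics
    •  Size Reduction Rate (SRR):
       $SRR(T_o, T_r) = \frac{\mathit{size}(T_o) - \mathit{size}(T_r)}{\mathit{size}(T_o)}$
    •  Coverage Preservation Rate (CPR):
       $CPR(T_o, T_r) = \frac{|\mathit{cov}(T_o) \cap \mathit{cov}(T_r)|}{|\mathit{cov}(T_o)|}$
    •  Mutant Preservation Rate (MPR):
       $MPR(T_o, T_r) = \frac{|\mathit{kill}(T_o) \cap \mathit{kill}(T_r)|}{|\mathit{kill}(T_o)|}$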
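The three metrics can be computed directly from sizes and sets. The sketch below assumes SRR is the relative decrease in size from the original test case To to the reduced test case Tr, and that CPR and MPR are the fractions of To's covered lines and killed mutants that Tr retains; the function names and sample values are illustrative only.

```python
def srr(size_original, size_reduced):
    """Size Reduction Rate: fraction of the original size that was removed."""
    return (size_original - size_reduced) / size_original


def cpr(cov_original, cov_reduced):
    """Coverage Preservation Rate: fraction of original coverage retained."""
    return len(cov_original & cov_reduced) / len(cov_original)


def mpr(killed_original, killed_reduced):
    """Mutant Preservation Rate: fraction of originally killed mutants retained."""
    return len(killed_original & killed_reduced) / len(killed_original)


# Hypothetical original/reduced pair.
size_rate = srr(10, 4)
coverage_rate = cpr({1, 2, 4, 7, 8}, {1, 2, 4, 8})
mutant_rate = mpr({"M1", "M3", "M7", "M10"}, {"M3", "M7", "M10"})
```

For this pair the reduced test case is 60% smaller while retaining 80% of the coverage and 75% of the killed mutants.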

  16. Research Questions
    •  RQ1: How much are test cases reduced (SRR)?
    •  RQ2: How much are code coverage and mutants killed
       preserved (CPR and MPR)?
    •  RQ3: How do SRR, CPR, and MPR trade off?
    •  RQ4: How do CPR and MPR for our approaches compare to
       CPR and MPR for random test-case reduction?
    See paper for RQ4 evaluation

  17. Experimental Setup
    •  C from {70,80,90,95,100}
    •  Coverage measured using GCov
    •  N from {1,2,4,8,16,32}
    •  Mutants generated using Andrews et al. mutation tool*
    •  Randomly sampled mutants
    •  See paper for evaluation using minimal mutants
    •  Reduction timeout of 30 minutes per test case
    *Andrews, J., Briand, L., and Labiche, Y. Is mutation an appropriate tool for testing experiments? ICSE 2005

  18. Projects
    Project        # Test Cases   What is Removed             # Mutants   Min. Killed   Max. Killed
    SpiderMonkey   99             JavaScript statement        69,067      8,101         12,825
    YAFFS2         99             API call                    15,046      2,071         3,439
    Grep           112            Character in command line   7,591       19            993
    Gzip           73             Byte                        7,175       1,813         2,046
    Experiments use N from 1 to 32, a small percentage of min. killed

  19. RQ1: Size Reduction Rate (SRR)
    [Figure: SRR for C%-Coverage]
    Median SRR > 50% for non-adequate

  20. RQ2: Coverage Preservation Rate (CPR)
    [Figure: CPR for N-Mutant]
    Median CPR close to 80% with just one mutant!

  21. RQ2: Mutant Preservation Rate (MPR)
    [Figure: MPR for N-Mutant]
    Relatively high MPR with even just one mutant!

  22. RQ3: SRR vs. CPR (YAFFS2)
    [Figure: SRR vs. CPR; panels: C%-Coverage, N-Mutant]

  23. RQ3: SRR vs. MPR (SpiderMonkey)
    [Figure: SRR vs. MPR; panels: C%-Coverage, N-Mutant]

  24. RQ3: CPR vs. MPR
    [Figure: CPR vs. MPR; panels: C%-Coverage, N-Mutant]

  25. RQ Highlights
    •  RQ1: High SRR difference from adequate to non-adequate
    •  RQ2: High CPR/MPR even with low non-adequacy,
       e.g., N=1 for N-Mutant
    •  RQ3: Higher SRR trades off against lower CPR/MPR;
       high CPR tends to imply high MPR
        •  Trade-offs are less clear in the case of N-Mutant

  26. Conclusions
    •  We propose non-adequate test-case reduction
    •  Non-adequate test-case reduction:
        •  Provides high size reduction and still largely preserves quality
        •  C%-Coverage offers substantial size reduction with controlled loss
           in coverage
        •  N-Mutant shows that preserving just a small number of mutants can
           still preserve a large percentage
    •  High dependency among mutants needs more investigation

  27. Minimal Mutants vs. 1-Mutant

  28. C%-Coverage vs Random (baseline)

  29. N-Mutant vs Random (baseline)

  30. Interdependency between Mutants

  31. Reduction Time