ISSTA 2016: Generating focused random tests using directed swarm testing

Arpit Christi

July 18, 2016

Transcript

  1. Generating Focused Random Tests using Directed Swarm Testing
     Mohammad Amin Alipour, Alex Groce, Rahul Gopinath, Arpit Christi
     EECS, Oregon State University
  2. Random Test Generation (diagram): a test generator produces tests for the
     Software Under Test (SUT) from test features (e.g. names of APIs); an
     oracle judges each test's output as OK or NOT OK.
  3. Goal (example): Spidermonkey, a static analysis tool flagging a buggy
     block, jsfunfuzz, and the resulting test suite.
     FOCUS: newly added/changed code, rarely covered code, statements
     suggested by static analysis.
  4. Hitting Fraction (HF), or frequency of coverage: given a test suite TS
     and a coverage target t, the fraction of test cases in TS that cover t
     is called the hitting fraction.
     • If no test in TS covers t, HF(TS, t) = 0
     • If all tests in TS cover t, HF(TS, t) = 1
     (Chart: hitting fraction of statements of YAFFS2 using yaffs2tester.)
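The definition above translates directly to code. A minimal sketch (the suite contents and target names below are made up for illustration):

```python
# Hitting fraction: the fraction of test cases in a suite that cover a target.
# `suite` maps test-case names to the set of coverage targets each one hits.

def hitting_fraction(suite, target):
    """HF(TS, t) = |{tc in TS : tc covers t}| / |TS|."""
    if not suite:
        return 0.0
    hits = sum(1 for covered in suite.values() if target in covered)
    return hits / len(suite)

suite = {
    "tc1": {"stmt_a", "stmt_b"},
    "tc2": {"stmt_b"},
    "tc3": {"stmt_a", "stmt_c"},
    "tc4": {"stmt_b", "stmt_c"},
}
print(hitting_fraction(suite, "stmt_a"))  # 0.5  (tc1, tc3 cover it)
print(hitting_fraction(suite, "stmt_b"))  # 0.75 (tc1, tc2, tc4 cover it)
```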
  5. Focused Random Tests: generating new test suites that increase the
     hitting fraction (HF) of less-frequently covered targets.
     Potential applications:
     • Finding bugs in suspicious parts of code
     • Regression testing
     (Diagram: initial test suite → focused test suites.)
  6. Swarm Testing (diagram): pick a random subset of the test features
     F1, F2, F3, F4, ... (e.g. names of APIs); the random test generator
     exercises the SUT; the oracle judges OK / NOT OK; test cases (with
     output) are saved.
     Alex Groce, Chaoqiang Zhang, Eric Eide, Yang Chen, John Regehr:
     Swarm testing. ISSTA 2012: 78-88
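The "pick a random subset of test features" step can be sketched as an independent coin toss per feature (feature names below are illustrative):

```python
import random

# Plain swarm testing: each configuration is a random subset of the test
# features, each feature included with an independent coin toss.

FEATURES = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9"]

def swarm_configuration(features, p=0.5, rng=random):
    """Return one swarm configuration: each feature kept with probability p."""
    return {f for f in features if rng.random() < p}

rng = random.Random(0)  # seeded for reproducibility
for _ in range(3):
    print(sorted(swarm_configuration(FEATURES, rng=rng)))
```

Each generated configuration then parameterizes one run of the random test generator, so different runs stress different feature combinations.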
  7. Triggers and suppressors
     • Target: a behavior of the SUT produced by test cases; examples are
       faults, coverage entities, and mutants
     • Relationship between a feature f and a target t: trigger, suppressor,
       or irrelevant
     • Wilson score of confidence [ISSRE'13]
     Example (T = trigger, S = suppressor, I = irrelevant):
          F1  F2  F3  F4  ...
     T1   S   T   I   I   I
     T2   I   I   S   I   I
     T3   S   S   I   T   I
     ...  I   I   S   I   T
     Alex Groce, Chaoqiang Zhang, Mohammad Amin Alipour, Eric Eide, Yang Chen,
     John Regehr: Help, help, I'm being suppressed! The significance of
     suppressors in software testing. ISSRE 2013: 390-399
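The slide cites Wilson score confidence [ISSRE'13] for this classification. One plausible sketch of the idea, assuming the classifier compares the target's hit rate among tests that include a feature against tests that exclude it (the paper's exact procedure may differ):

```python
import math

def wilson_interval(hits, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion (95% by default)."""
    if n == 0:
        return (0.0, 1.0)
    phat = hits / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = z * math.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return (center - half, center + half)

def classify(hits_with, n_with, hits_without, n_without):
    """Classify a feature for one target by comparing hit rates with/without it."""
    lo_w, hi_w = wilson_interval(hits_with, n_with)
    lo_wo, hi_wo = wilson_interval(hits_without, n_without)
    if lo_w > hi_wo:
        return "trigger"      # clearly higher hit rate when the feature is present
    if hi_w < lo_wo:
        return "suppressor"   # clearly lower hit rate when the feature is present
    return "irrelevant"       # the intervals overlap: no confident difference

print(classify(90, 100, 10, 100))  # trigger
print(classify(5, 100, 60, 100))   # suppressor
print(classify(50, 100, 48, 100))  # irrelevant
```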
  8. Summary so far (diagram): a program, a swarm random generator over
     features F1, F2, F3, ...; runs S1-S5 and targets T1-T5; and the
     resulting per-feature trigger (T) / suppressor (S) / irrelevant (I)
     table from slide 7.
  9. Directed Swarm Testing (diagram): as in the previous slide, but a
     configuration strategy uses the trigger/suppressor/irrelevant table to
     direct the swarm generator toward chosen features (F1, F2, ...).
  10. Methodology: a swarm testing seed run collects coverage and
      configuration data; a configuration strategy turns these into new
      configurations, which drive further swarm testing.
  11. Half Swarm: with f1, f2 triggers, f3, f4 suppressors, and f5-f9
      irrelevant, the swarm generator emits configurations such as
      {f1, f2, f6, f8}: triggers kept, suppressors dropped, irrelevant
      features coin-tossed.
  12. No Suppressors: with the same feature roles, the suppressors f3, f4
      are always excluded; the generator emits configurations such as
      {f1, f2, f5, f6, f7, f8, f9} or {f5, f6, f7, f8, f9}.
  13. Triggers only: the triggers f1, f2 are always included and the
      suppressors f3, f4 always excluded; the generator emits configurations
      such as {f1, f2, f5, f6, f7, f8, f9}.
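The three single-target strategies on slides 11-13 can be sketched as follows. The inclusion rules are my reading of the diagrams (feature names and probabilities are illustrative, not the paper's exact definitions):

```python
import random

# Single-target configuration strategies, sketched under these assumptions:
#   half_swarm:     triggers always in, suppressors always out,
#                   irrelevant features by the usual coin toss
#   no_suppressors: suppressors always out, everything else coin-tossed
#   triggers_only:  the configuration is exactly the set of triggers

def half_swarm(triggers, suppressors, all_features, rng=random):
    cfg = set(triggers)
    for f in all_features:
        if f not in triggers and f not in suppressors and rng.random() < 0.5:
            cfg.add(f)
    return cfg

def no_suppressors(triggers, suppressors, all_features, rng=random):
    return {f for f in all_features
            if f not in suppressors and rng.random() < 0.5}

def triggers_only(triggers, suppressors, all_features, rng=random):
    return set(triggers)

triggers, suppressors = {"f1", "f2"}, {"f3", "f4"}
features = {f"f{i}" for i in range(1, 10)}
rng = random.Random(1)
print(sorted(half_swarm(triggers, suppressors, features, rng)))
```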
  14. Multi-target Directed Swarm Testing
      Challenge: a trigger for one target can be a suppressor for another,
      and vice versa.
      Merging strategies:
      • Round-robin: no merging
      • Subsumption: merges a configuration into a stricter one
      • Aggressive merging: merges as many non-conflicting configurations as
        possible
      • Optimum: generates a minimum set of configurations; NP-complete [RW]
      [RW] Edward L. Robertson and Catharine M. Wyss: Optimal Tuple Merge is
      NP-Complete. Technical Report TR599, Indiana University
  15. Evaluation - Subjects and experimental parameters
      SUT           LOC    Fuzzer        Description
      YAFFS2        15K    yaffs2tester  Flash file system
      GCC 4.4.7     860K   Csmith        C and C++ compiler
      Spidermonkey  118K   jsfunfuzz     JavaScript engine for Mozilla

      SUT           # Features  Seed time (min)  Run time (min)
      YAFFS         43          15               5
      GCC           25          60               10
      Spidermonkey  171         30               10
  16. Evaluation: directed vs. undirected swarm for a single target
      (Chart: hitting fraction, directed vs. undirected.)
  17. Evaluation of Directed Swarm Testing - YAFFS
      Strategy       % better  HFd/HFu mean  HFd/HFu max
      Half-Swarm     100       2.4           5.01
      No-Suppressor  100       2.56          4.44
      Trigger        100       2.8           7.87
  18. Evaluation of Directed Swarm Testing - GCC
      Strategy       % better  HFd/HFu mean  HFd/HFu max
      Half-Swarm     99        3.56          6.33
      No-Suppressor  94        3.03          5.58
      Trigger        92        3.94          5.29
  19. Evaluation of Directed Swarm Testing - Spidermonkey
      Strategy       % better  HFd/HFu mean  HFd/HFu max
      Half-Swarm     73        1.75          4.39
      No-Suppressor  65        1.15          3.14
      Trigger        84        4.56          8.82
  20. Evaluation of Directed Swarm Testing: Multi-Target

  21. Ability to find real faults (Spidermonkey, 5 real faults, multi-target
      testing)
      Test strategy               Fault #1  Fault #2  Fault #3  Fault #4  Fault #5
      Undirected Swarm            13.6      0.07      0.24      0.26      0.07
      Round-robin Half-swarm      31.9      0.19      0.35      0.56      0.29
      Round-robin No-suppressors  34.2      0.26      0.17      0.46      0.69
      Subsumption Half-swarm      33.0      0.24      0.12      0.10      0.29
      Subsumption No-suppressors  33.1      0.31      0.29      0.31      0.46
  22. Final note
      • Using collected statistics on code coverage and swarm testing, it is
        possible to produce focused random tests.
      • The method can increase the frequency with which tests cover targeted
        code, often by a factor of more than 2x.
      • The approach is readily applicable to existing, industrial-strength
        random testing tools for critical systems software.
  23. None
  24. None
  25. Targeted testing: dynamic symbolic execution, search-based testing

  26. Evaluation of Directed Swarm Testing: single-target and multi-target

  27. None
  28. Why do we need focus? A limitation of plain random testing: its focus
      is the full system, program, or module; it does not naturally focus on
      particular parts of the program.
  29. Swarm testing recap: a fuzzer = generator + features. Example features:
      API calls, grammar terminals and production rules, C semantics. A
      configuration selects which features the test generator may use.
      Pure random testing: configuration = the set of all features.
      Swarm testing: each feature is chosen with a coin toss, reducing the
      feature count per test and increasing the interactions among the
      chosen features.
  30. You want to hammer a piece of code as frequently as possible.