# Topsy Turvy: A Smarter and Faster Parallelization of Mutation Analysis

Mutation analysis is an effective, if computationally expensive, technique that allows practitioners to accurately evaluate the quality of their test suites. To reduce the time and cost of mutation analysis, researchers have looked at parallelizing mutation runs — running multiple mutated versions of the program in parallel, and running through the tests in sequence on each mutated program until a bug is found. While an improvement over sequential execution of mutants and tests, this technique carries a significant overhead cost due to its redundant execution of unchanged code paths. In this paper we propose a novel technique (and its implementation) which parallelizes the test runs rather than the mutants, forking mutants from a single program execution at the point of invocation, which reduces redundancy. We show that our technique can lead to significant efficiency improvements and cost reductions.

January 15, 2016

## Transcript

1. ### Topsy-Turvy: A Smarter and Faster Parallelization of Mutation Analysis

Rahul Gopinath, Carlos Jensen, Alex Groce
2. ### Mutation Analysis: A bird's-eye view

- Requirement: find out how good your tests are.
- Graph coverage metrics have well-known faults.
- They do not take into account the quality of oracles.

```ruby
require 'test/unit'

class TestSimpleNumber < Test::Unit::TestCase
  def setup
    @num = SimpleNumber.new(2)
  end

  def test_simple_add
    assert(@num.add(2) != 0)
  end

  def test_simple_multiply
    assert(@num.multiply(2) != @num)
  end
end
```

```ruby
# Class under test (the slide names it SimpleName, but the test instantiates SimpleNumber).
class SimpleNumber
  def initialize(num)
    @x = num
  end

  def add(y)
    @x + y
  end

  def multiply(y)
    @x * y
  end
end
```

May 15, 2016
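The point about oracle quality can be made concrete with a small sketch. Everything beyond `SimpleNumber#add` is an assumption for illustration: `FaultyNumber`, `weak_oracle`, and `strong_oracle` are hypothetical names, and the input 3 is used instead of the slide's 2 because `2 + 2` and `2 * 2` happen to coincide.

```ruby
# SimpleNumber#add as on the slide.
class SimpleNumber
  def initialize(num)
    @x = num
  end

  def add(y)
    @x + y
  end
end

# Hypothetical seeded fault: '+' replaced by '*'.
class FaultyNumber < SimpleNumber
  def add(y)
    @x * y
  end
end

# The slide's style of oracle: only checks that the result is non-zero.
def weak_oracle(n)
  n.add(2) != 0
end

# A stronger oracle (an assumption, not from the slides): checks the exact value for input 3.
def strong_oracle(n)
  n.add(2) == 5
end

[SimpleNumber, FaultyNumber].each do |klass|
  n = klass.new(3)
  puts "#{klass}: weak=#{weak_oracle(n)} strong=#{strong_oracle(n)}"
end
# SimpleNumber: weak=true strong=true
# FaultyNumber: weak=true strong=false
```

Both oracles execute `add`, so they give the same coverage, yet only the stronger one notices the fault.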
3. ### Mutation Analysis: A bird's-eye view

Instead:

- Seed a large number of faults (mutants).
- Run the test suite against each.
- The effectiveness is the % of faults detected (killed).

Mutants of the discriminant Δ = b² - 4ac:

```
d = b^2 + 4 * a * c;
d = b^2 * 4 * a * c;
d = b^2 / 4 * a * c;
d = b^2 ^ 4 * a * c;
d = b^2 % 4 * a * c;
d = b^2 << 4 * a * c;
d = b^2 >> 4 * a * c;
d = b^2 * 4 + a * c;
d = b^2 * 4 - a * c;
d = b^2 * 4 / a * c;
d = b^2 * 4 ^ a * c;
d = b^2 * 4 % a * c;
d = b^2 * 4 << a * c;
d = b^2 * 4 >> a * c;
d = b^2 * 4 * a + c;
d = b^2 * 4 * a - c;
d = b^2 * 4 * a / c;
d = b^2 * 4 * a ^ c;
d = b^2 * 4 * a % c;
d = b^2 * 4 * a << c;
d = b^2 * 4 * a >> c;
d = b + 2 - 4 * a * c;
d = b - 2 - 4 * a * c;
d = b * 2 - 4 * a * c;
d = b / 2 - 4 * a * c;
d = b % 2 - 4 * a * c;
d = b << 2 - 4 * a * c;
d = b >> 2 - 4 * a * c;
d = b^0 - 4 * a * c;
d = b^1 - 4 * a * c;
d = b^-1 - 4 * a * c;
d = b^MAX - 4 * a * c;
d = b^MIN - 4 * a * c;
d = b - 4 * a * c;
d = b ^ 4 * a * c;
d = b^2 - 0 * a * c;
d = b^2 - 1 * a * c;
d = b^2 - (-1) * a * c;
d = b^2 - MAX * a * c;
d = b^2 - MIN * a * c;
d = b^2 * a * c;
d = b^2 - a * c;
```
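As a minimal sketch of how the kill percentage is computed, the following mutates only the subtraction in the discriminant. The test inputs, `MUTANT_OPS`, and `mutation_score` are assumptions made for illustration and are not taken from the slides. Note that `^` here is Ruby's XOR, whereas the slide uses `^` for exponentiation.

```ruby
# Discriminant with a pluggable operator between b**2 and 4*a*c (original: :-).
def discriminant(a, b, c, op = :-)
  (b**2).public_send(op, 4 * a * c)
end

# Hypothetical test suite: each test returns true when it passes for the given operator.
TESTS = [
  ->(op) { discriminant(1, 2, 1, op) == 0 },  # repeated root
  ->(op) { discriminant(1, 0, -1, op) == 4 }  # two real roots
].freeze

# Hypothetical replacements for the '-' operator.
MUTANT_OPS = %i[+ * / % << >> ^].freeze

# Percentage of mutants for which at least one test fails.
def mutation_score(ops, tests)
  killed = ops.count { |op| tests.any? { |t| !t.call(op) } }
  100.0 * killed / ops.size
end

puts mutation_score(MUTANT_OPS, TESTS.first(1)).round(1)  # 57.1 (4 of 7 killed)
puts mutation_score(MUTANT_OPS, TESTS).round(1)           # 100.0
```

With only the first test, the `%`, `>>`, and `^` mutants also produce 0 and survive; adding the second test kills them.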
4. ### Mutation Analysis: The problem

- Each seeded fault (mutant) requires a full test suite run.
- The number of mutants is huge, even for small programs.
- Observation: assertions often happen after a fairly long execution.

```ruby
class TestSimpleX < Test::Unit::TestCase
  def setup
    @x = SimpleX.new()
  end

  def test_check
    assert(@x.check(1000))
  end
end

class SimpleX
  def initialize()
    ....
  end

  def check(y)
    ....
    ....
    x = ...
    z = x > y
    return z
  end
end
```

Callout on the slide: "mutation point".
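The cost of that observation can be sketched by counting how often the code before the mutation point runs. This is an illustration under assumptions: `expensive_prefix` stands in for the elided body of `check`, `COMPARISONS` is a hypothetical set holding the original `>` plus five replacements, and the shared-prefix variant is simulated in a single process rather than with forks.

```ruby
$prefix_runs = 0 # counts executions of the expensive prefix

# Stand-in for the "fairly long execution" before the mutation point.
def expensive_prefix
  $prefix_runs += 1
  (1..1000).sum
end

# Original comparison '>' plus five hypothetical mutants.
COMPARISONS = %i[> >= < <= == !=].freeze

# Conventional approach: every variant re-runs the whole method from the start.
def check_each_variant(y)
  COMPARISONS.map { |op| expensive_prefix.public_send(op, y) }
end

# Shared prefix: run the common code once, then branch at the mutation point.
def check_shared_prefix(y)
  x = expensive_prefix
  COMPARISONS.map { |op| x.public_send(op, y) }
end

check_each_variant(1000)
puts "separate runs: prefix executed #{$prefix_runs} times"   # 6
$prefix_runs = 0
check_shared_prefix(1000)
puts "shared prefix: prefix executed #{$prefix_runs} time(s)" # 1
```

Both variants yield the same per-mutant results; only the amount of repeated work differs.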
5. ### Observation: Two test cases on three mutants

Mutants executed in parallel (figure: mutants m1, m2, m3).
6. ### We Propose: Topsy Turvy

- Tests are executed in parallel.
- Fork off mutants as they are encountered.

(Figure: tests t1, t2.)
7. ### We Propose: Topsy Turvy

Fork off mutants as they are encountered.

Original:

```ruby
def avg(a,b)
  return (a+b) / 2
end
```

Transformed (pseudocode; the slide passes `x, y` to the inner call, which correspond to the parameters `a, b`):

```ruby
def avg(a,b)
  return μ(:a, μ(:b, a, b, +), 2, /)
end
```

Callouts on the slide: "Library call (!)" and "Mutant id (:b)".

The library function μ() (pseudocode):

```ruby
def μ(id, a, b, op)
  if parent?
    return op(a,b) if has?(id)
    mutations(op).each do |o|
      fork
      if child?
        set(id, o)
        return o(a,b)
      end
    end
    set(id, op)
    return op(a, b)
  else
    o = get(id) || op
    return o(a,b)
  end
end
```
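A runnable approximation of the μ() idea is sketched below, assuming a Unix-like system where Ruby's `fork` is available. It is not the authors' implementation: `mu`, `MUTATIONS`, `mutation_run`, and the globals are hypothetical names, operators are passed as symbols, only first-order mutants are created, and a single test is run at a time rather than several tests in parallel.

```ruby
# Hypothetical, tiny mutation table: original operator => replacement operators.
MUTATIONS = { :+ => %i[- *], :/ => %i[* -] }.freeze

$is_child = false # true only inside a forked mutant process
$chosen   = {}    # site id => operator this process uses at that site
$children = {}    # parent only: pid => "site:operator" label

def mu(id, a, b, op)
  # A mutant applies its single mutation at its own site, the original elsewhere.
  return a.public_send($chosen.fetch(id, op), b) if $is_child

  unless $chosen.key?(id) # first time the parent reaches this site
    MUTATIONS.fetch(op, []).each do |o|
      pid = fork
      if pid.nil? # child: become the mutant "site id uses operator o"
        $is_child = true
        $chosen = { id => o }
        return a.public_send(o, b)
      end
      $children[pid] = "#{id}:#{o}"
    end
    $chosen[id] = op # mark the site as already expanded
  end
  a.public_send(op, b)
end

# Transformed version of avg: each operator becomes a call to mu.
def avg(a, b)
  mu(:a, mu(:b, a, b, :+), 2, :/)
end

# Runs one test in the parent and, from each fork point onward, in every mutant.
def mutation_run(a, b, expected)
  $chosen = {}
  $children = {}
  passed = begin
    avg(a, b) == expected
  rescue StandardError
    false
  end
  exit!(passed ? 0 : 1) if $is_child # mutants report through their exit status

  $children.to_h do |pid, label|
    _, status = Process.wait2(pid)
    [label, status.success? ? :survived : :killed]
  end
end

p mutation_run(4, 6, 5) # all four mutants are killed
p mutation_run(4, 0, 2) # "b:-" and "a:-" survive this weaker input
```

Each mutant inherits the parent's state at the fork point, so the code executed before the mutated site is not repeated per mutant. Using the exit status as the kill signal is one simple design choice; a fuller harness would also need timeouts for mutants that loop forever.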

9. ### Related ideas

- Split Stream Execution (King et al., 1991). Difference: an interpreter-only technique; the interpreter manages the forking.
- MuVM (Tokumoto et al., ICST 2016). Difference: a similar idea, but virtual-machine based. The tests are still serial.

Topsy Turvy (dynamic mutants): applied as a source transformation, and applicable in any environment and language with cheap forking.
10. ### Conclusion

Original:

```ruby
def avg(a,b)
  return (a+b) / 2
end
```

Transformed (pseudocode; the slide passes `x, y` to the inner call, which correspond to the parameters `a, b`):

```ruby
def avg(a,b)
  return μ(:a, μ(:b, a, b, +), 2, /)
end
```

The library function μ() (pseudocode):

```ruby
def μ(id, a, b, op)
  if parent?
    return op(a,b) if has?(id)
    mutations(op).each do |o|
      fork
      if child?
        set(id, o)
        return o(a,b)
      end
    end
    set(id, op)
    return op(a, b)
  else
    o = get(id) || op
    return o(a,b)
  end
end
```

Advantages:

- Applicable to any language
- Simple source transformation (can be done in AWK)
- Involves just a library call
- Only executes mutants relevant to the test cases
- Obtains a significant runtime improvement

Caveats:

- Requires cheap forking (as on Unix)
- Assumes test cases are parallelizable
- Assumes reusable state