Mutation Testing - Ruby Edition

Extended version of my brown bag lightning talk on Mutation Testing. Given at IPRUG.

Chris Sinjakli

July 03, 2012

Transcript

  1. Mutation Testing
    Chris Sinjakli

  2. Testing is a good thing
    But how do we know our tests are
    good?

  3. Code coverage is a start
    But it can give a “good” score with
    really dreadful tests

  4. Really dreadful tests
    class Adder
      def self.add(x, y)
        return x - y
      end
    end

    describe Adder do
      it "should add the two arguments" do
        Adder.add(1, 1)
      end
    end

    Coverage: 100%
    Usefulness: 0

  6. A contrived example
    But how could we detect it?

  7. Mutation Testing!
    “Who watches the watchmen?”

  8. If you can change the code, and a
    test doesn’t fail, either the code is
    never run or the tests are wrong.

  9. How?
    1. Run test suite
    2. Change code (mutate)
    3. Run test suite again
    If tests now fail, mutant dies. Otherwise it
    survives.
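
    The three steps above can be sketched in plain Ruby. This is only a hand-rolled illustration, not how Mutant or any other tool works internally: the helper names (load_adder, run_mutation_testing), the MUTATIONS list and both suites are hypothetical, mutation is done by naive string substitution, and the Adder used here is a correct one (x + y) so that the suite passes before anything is mutated.

```ruby
# Hypothetical sketch: source under test kept as a string so it can be mutated.
ORIGINAL_SOURCE = <<~RUBY
  class Adder
    def self.add(x, y)
      x + y
    end
  end
RUBY

# Illustrative mutations as [from, to] pairs; real tools rewrite the syntax tree.
MUTATIONS = [["x + y", "x - y"], ["x + y", "x * y"], ["x + y", "x"]]

# Evaluate the source inside a throwaway module so mutants never clash.
def load_adder(source)
  sandbox = Module.new
  sandbox.module_eval(source)
  sandbox.const_get(:Adder)
end

def run_mutation_testing(source, mutations, suite)
  # Step 1: the suite has to pass on the unmutated code.
  raise "suite fails on original code" unless suite.call(load_adder(source))

  mutations.map do |from, to|
    mutant = load_adder(source.sub(from, to)) # Step 2: mutate
    passed = begin
      suite.call(mutant)                      # Step 3: run the suite again
    rescue StandardError
      false                                   # an error counts as a failing test
    end
    [to, passed ? :survived : :killed]
  end.to_h
end

# A suite that only calls the code, and one that checks the result.
WEAK_SUITE   = ->(adder) { adder.add(2, 3); true }
STRONG_SUITE = ->(adder) { adder.add(2, 3) == 5 }

weak_results   = run_mutation_testing(ORIGINAL_SOURCE, MUTATIONS, WEAK_SUITE)
strong_results = run_mutation_testing(ORIGINAL_SOURCE, MUTATIONS, STRONG_SUITE)
p weak_results
p strong_results
```

    With the assertion-free suite every mutant survives; with the asserting suite every mutant dies.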

  10. Going with our previous example
    class Adder
      def self.add(x, y)
        return x - y
      end
    end

    describe Adder do
      it "should add the two arguments" do
        Adder.add(1, 1)
      end
    end

    Let’s change something

  11. Going with our previous example
    class Adder
      def self.add(x, y)
        return x + y  # mutated: "-" became "+"
      end
    end

    describe Adder do
      it "should add the two arguments" do
        Adder.add(1, 1)
      end
    end

    This still passes

  12. Success
    We know something is wrong

  13. So what? It caught a really
    rubbish test
    How about something slightly less
    obvious?

  14. Slightly less obvious (and I mean slightly)
    class ConditionChecker
      def self.check(a, b)
        if a && b
          return 42
        else
          return 0
        end
      end
    end

    describe ConditionChecker do
      it "should return 42 when both arguments are true" do
        ConditionChecker.check(true, true).should == 42
      end
      it "should return 0 when both arguments are false" do
        ConditionChecker.check(false, false).should == 0
      end
    end

    Coverage: 100%
    Usefulness: >0
    But still wrong

  15. Slightly less obvious (and I mean slightly)
    class ConditionChecker
      def self.check(a, b)
        if a && b
          return 42
        else
          return 0
        end
      end
    end

    describe ConditionChecker do
      it "should return 42 when both arguments are true" do
        ConditionChecker.check(true, true).should == 42
      end
      it "should return 0 when both arguments are false" do
        ConditionChecker.check(false, false).should == 0
      end
    end

    Mutate

  16. Slightly less obvious (and I mean slightly)
    class ConditionChecker
      def self.check(a, b)
        if a || b  # mutated: "&&" became "||"
          return 42
        else
          return 0
        end
      end
    end

    describe ConditionChecker do
      it "should return 42 when both arguments are true" do
        ConditionChecker.check(true, true).should == 42
      end
      it "should return 0 when both arguments are false" do
        ConditionChecker.check(false, false).should == 0
      end
    end

    Passing tests

  17. Mutation testing caught our
    mistake
    :D
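
    The surviving mutant points at the missing cases: inputs where exactly one argument is true. Below is a small plain-Ruby sketch (no RSpec, so it runs standalone) of how adding those cases kills the mutant. MutantConditionChecker, ORIGINAL_CASES, MIXED_CASES and passes? are hypothetical names introduced for this illustration, and the expected value 0 for mixed inputs is an assumption read off the "&&" in the original code.

```ruby
class ConditionChecker
  def self.check(a, b)
    if a && b
      42
    else
      0
    end
  end
end

# The mutant from the slides: "&&" replaced with "||".
class MutantConditionChecker
  def self.check(a, b)
    if a || b
      42
    else
      0
    end
  end
end

# The two cases the original spec covers.
ORIGINAL_CASES = { [true, true] => 42, [false, false] => 0 }
# The cases it misses; "&&" and "||" only disagree on these.
MIXED_CASES = { [true, false] => 0, [false, true] => 0 }

# True when the checker returns the expected value for every case.
def passes?(checker, cases)
  cases.all? { |args, expected| checker.check(*args) == expected }
end

all_cases = ORIGINAL_CASES.merge(MIXED_CASES)

puts "mutant vs original cases: #{passes?(MutantConditionChecker, ORIGINAL_CASES) ? 'survives' : 'killed'}"
puts "mutant vs all cases:      #{passes?(MutantConditionChecker, all_cases) ? 'survives' : 'killed'}"
puts "real code vs all cases:   #{passes?(ConditionChecker, all_cases) ? 'passes' : 'fails'}"
```

    In the RSpec style used in the slides, the same fix would be two more examples asserting that check(true, false) and check(false, true) return 0.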

  18. Useful technique
    But still has its flaws

  19. The downfall of mutation
    (Equivalent Mutants)
    index = 0
    while index != 100 do
      doStuff()
      index += 1
    end

    Mutates to

    index = 0
    while index < 100 do
      doStuff()
      index += 1
    end

    But the programs are equivalent, so no test will fail

  20. There is no possible test which
    can “kill” the mutant
    The programs are equivalent
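
    To make the equivalence concrete, here is a small sketch that runs both loop variants and compares what an outside observer could see. The method names are hypothetical, and a counter stands in for doStuff(). Running it only illustrates the point; the equivalence itself follows from index starting at 0 and increasing by exactly 1, so it reaches 100 without ever skipping past it.

```ruby
# Loop with the "!=" condition; the counter stands in for doStuff().
def run_with_not_equal
  calls = 0
  index = 0
  while index != 100
    calls += 1
    index += 1
  end
  [calls, index]
end

# Loop with the "<" condition; everything else is identical.
def run_with_less_than
  calls = 0
  index = 0
  while index < 100
    calls += 1
    index += 1
  end
  [calls, index]
end

# Both variants do the same number of iterations and end in the same state,
# so no assertion on observable behaviour can tell them apart.
puts run_with_not_equal == run_with_less_than
```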

  21. Also (potentially)
    • Infinite loops
    • More memory used
    • Compile/run time errors – tools should
    minimise these
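
    One common way to cope with mutants that loop forever is to run each one under a time limit and treat a timeout as a failed run. The sketch below uses Ruby's standard Timeout module; run_with_timeout, looping_mutant and the 0.2 second limit are hypothetical choices for illustration, not taken from any particular tool.

```ruby
require "timeout"

# Run the block under a time limit; report whether it finished in time.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
  :finished
rescue Timeout::Error
  :timed_out
end

# A mutant where "+=" became "-=": index moves away from 100 and never stops.
looping_mutant = lambda do
  index = 0
  index -= 1 while index != 100
end

# The unmutated loop terminates quickly.
original = lambda do
  index = 0
  index += 1 while index != 100
end

puts run_with_timeout(0.2) { original.call }
puts run_with_timeout(0.2) { looping_mutant.call }
```

    A harness built this way would count the timed-out mutant as killed, at the cost of waiting out the limit for each such mutant.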

  22. How bad is it?
    • Good paper assessing the problem [SZ10]
    • Took 7 widely used, “large” projects
    • Found:
    – 15 mins to assess one mutation
    – 45% of uncaught mutations are equivalent
    – Better-tested project -> worse signal-to-noise ratio

  23. Can we detect the equivalents?
    • Not in the general case [BA82]
    • Some specific cases can be detected
    – Using compiler optimisation techniques [BS79]
    – Using mathematical constraints [DO91]
    – Line coverage changes [SZ10]
    • All are heuristic algorithms – not seen any
    claiming to detect all equivalent mutants

  24. Tools
    Some Ruby, then a Java one I liked

  25. Ruby
    • Looked into Heckle
    • Seemed unmaintained (nothing since 2009)
    • Then I saw...

  26. Ruby

  27. Ruby
    • Mutant seems to be the new favourite
    • Runs in Rubinius (1.8 or 1.9 mode)
    • Only supports RSpec
    • Easy to set up
      rvm install rbx-head
      rvm use rbx-head
      gem install mutant
    • And easy to use
      mutate "ClassName#method_to_test" spec

  28. Java
    • Loads of tools to choose from
    • Bytecode vs source mutation
    • Will look at PIT (seems like one of the better
    ones)

  29. PIT - pitest.org
    • Works with “everything”
    – Command line
    – Ant
    – Maven
    • Bytecode level mutations (faster)
    • Very customisable
    – Exclude classes/packages from mutation
    – Choose which mutations you want
    – Timeouts
    • Makes pretty HTML reports (line/mutation coverage)

  30. Summary
    • Can point at weak areas in your tests
    • At the same time, can be prohibitively noisy
    • Try it and see

  31. Questions?

  32. References
    • [BA82] - T. A. Budd and D. Angluin. Two notions of correctness and
    their relation to testing. Acta Informatica, 18(1):31-45, November
    1982.
    • [BS79] - D. Baldwin and F. Sayward. Heuristics for determining
    equivalence of program mutations. Research report 276,
    Department of Computer Science, Yale University, 1979.
    • [DO91] - R. A. DeMillo and A. J. Offutt. Constraint-based automatic
    test data generation. IEEE Transactions on Software Engineering,
    17(9):900-910, September 1991.
    • [SZ10] - D. Schuler and A. Zeller. (Un-)Covering Equivalent Mutants.
    Third International Conference on Software Testing, Verification and
    Validation (ICST), pages 45-54. April 2010.

  33. Also interesting
    • [AHH04] – K. Adamopoulos, M. Harman and R. M. Hierons. How to
    Overcome the Equivalent Mutant Problem and Achieve Tailored
    Selective Mutation Using Co-evolution. Genetic and Evolutionary
    Computation -- GECCO 2004, pages 1338-1349. 2004.
