Slide 1

Slide 1 text

Mutation Testing
Chris Sinjakli

Slide 2

Slide 2 text

Testing is a good thing
But how do we know our tests are good?

Slide 3

Slide 3 text

Code coverage is a start
But it can give a “good” score with really dreadful tests

Slide 4

Slide 4 text

Really dreadful tests

public int addTwoNumbers(int a, int b) {
    return a - b;
}
...
@Test
public void shouldAddTwoNumbers() {
    int result = addTwoNumbers(1, 1);
    assertTrue(true);
}

Coverage: 100%
Usefulness: 0

Slide 6

Slide 6 text

A contrived example
But how could we detect it?

Slide 7

Slide 7 text

Mutation Testing!
“Who watches the watchmen?”

Slide 8

Slide 8 text

If you can change the code and no test fails, either the changed code is never run or the tests do not check what it does.
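The idea can be tried by hand before reaching for a tool. What follows is a minimal sketch, not how any particular tool works: the class name `MutationSketch` and the two "suites" are hypothetical, and the mutant is written manually rather than generated. Each suite is modelled as a method that returns true when all of its checks pass for a given implementation.

```java
import java.util.function.IntBinaryOperator;

// Hand-rolled illustration only; real tools generate and run mutants automatically.
class MutationSketch {

    // A weak "suite": it runs the code (so line coverage is 100%)
    // but never checks the result.
    static boolean weakSuite(IntBinaryOperator add) {
        add.applyAsInt(1, 1);
        return true;
    }

    // A stricter "suite": it checks the values that come back.
    static boolean strictSuite(IntBinaryOperator add) {
        return add.applyAsInt(1, 1) == 2 && add.applyAsInt(2, 3) == 5;
    }

    public static void main(String[] args) {
        IntBinaryOperator original = (a, b) -> a + b;
        IntBinaryOperator mutant = (a, b) -> a - b; // '+' changed to '-'

        // A mutant "survives" if the suite still passes, and is "killed" if it fails.
        System.out.println("original passes weak suite:   " + weakSuite(original));
        System.out.println("mutant survives weak suite:   " + weakSuite(mutant));
        System.out.println("original passes strict suite: " + strictSuite(original));
        System.out.println("mutant survives strict suite: " + strictSuite(mutant));
    }
}
```

The weak suite passes for both versions, so the change goes unnoticed. The strict suite passes for the original and fails for the mutant, which is the signal mutation testing looks for.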

Slide 9

Slide 9 text

Going with our previous example

public int addTwoNumbers(int a, int b) {
    return a - b;
}
...
@Test
public void shouldAddTwoNumbers() {
    int result = addTwoNumbers(1, 1);
    assertTrue(true);
}

Let’s change something

Slide 10

Slide 10 text

Going with our previous example

public int addTwoNumbers(int a, int b) {
    return a + b;
}
...
@Test
public void shouldAddTwoNumbers() {
    int result = addTwoNumbers(1, 1);
    assertTrue(true);
}

This still passes

Slide 11

Slide 11 text

So it caught a really rubbish test
How about something slightly less obvious?

Slide 12

Slide 12 text

Slightly less obvious (and I mean slightly)

public int checkConditions(boolean a, boolean b) {
    if (a && b) {
        return 42;
    } else {
        return 0;
    }
}

@Test
public void testBothFalse() {
    int result = checkConditions(false, false);
    assertEquals(0, result);
}

@Test
public void testBothTrue() {
    int result = checkConditions(true, true);
    assertEquals(42, result);
}

Coverage: 100%
Usefulness: >0
But still wrong

Slide 13

Slide 13 text

Slightly less obvious (and I mean slightly)

public int checkConditions(boolean a, boolean b) {
    if (a && b) {
        return 42;
    } else {
        return 0;
    }
}

@Test
public void testBothFalse() {
    int result = checkConditions(false, false);
    assertEquals(0, result);
}

@Test
public void testBothTrue() {
    int result = checkConditions(true, true);
    assertEquals(42, result);
}

Mutate

Slide 14

Slide 14 text

Slightly less obvious (and I mean slightly)

public int checkConditions(boolean a, boolean b) {
    if (a || b) {
        return 42;
    } else {
        return 0;
    }
}

@Test
public void testBothFalse() {
    int result = checkConditions(false, false);
    assertEquals(0, result);
}

@Test
public void testBothTrue() {
    int result = checkConditions(true, true);
    assertEquals(42, result);
}

Passing tests

Slide 15

Slide 15 text

Mutation testing caught our mistake :D
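The surviving mutant points at the missing cases: `&&` and `||` only disagree when exactly one input is true, and neither existing test covers that. A minimal sketch of the checks that would kill the mutant follows. The class name `CheckConditionsSketch` and the `checkConditionsMutant` copy are hypothetical, and plain comparisons are used instead of JUnit so the snippet stands alone.

```java
// Illustration only: the mutant is a hand-written copy, not tool output.
class CheckConditionsSketch {

    // The original method from the slides.
    static int checkConditions(boolean a, boolean b) {
        if (a && b) {
            return 42;
        } else {
            return 0;
        }
    }

    // The mutated version from the slides ('&&' changed to '||').
    static int checkConditionsMutant(boolean a, boolean b) {
        if (a || b) {
            return 42;
        } else {
            return 0;
        }
    }

    public static void main(String[] args) {
        // The two existing tests cannot tell the versions apart.
        System.out.println("both false agree: "
                + (checkConditions(false, false) == checkConditionsMutant(false, false)));
        System.out.println("both true agree:  "
                + (checkConditions(true, true) == checkConditionsMutant(true, true)));

        // The mixed cases can: the original returns 0, the mutant returns 42.
        System.out.println("true/false agree: "
                + (checkConditions(true, false) == checkConditionsMutant(true, false)));
        System.out.println("false/true agree: "
                + (checkConditions(false, true) == checkConditionsMutant(false, true)));
    }
}
```

In a real suite these would become two more tests, each asserting that a mixed input returns 0.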

Slide 16

Slide 16 text

Useful technique
But still has its flaws

Slide 17

Slide 17 text

The downfall of mutation (Equivalent Mutants)

int index = 0;
while (someCondition) {
    doStuff();
    index++;
    if (index == 100) {
        break;
    }
}

Mutates to

int index = 0;
while (someCondition) {
    doStuff();
    index++;
    if (index >= 100) {
        break;
    }
}

But the programs are equivalent, so no test will fail
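The two loops behave the same because `index` starts at 0 and only ever increases by 1, so it hits exactly 100 before it could exceed it. A minimal sketch that checks this empirically follows. It rests on assumptions not in the slides: `someCondition` is modelled as a hypothetical iteration budget, `doStuff()` as a call counter, and the class name `EquivalentMutantSketch` is made up.

```java
// Illustration only: someCondition and doStuff() are stand-ins.
class EquivalentMutantSketch {

    // Original loop: stop when index == 100. Returns how often "doStuff" ran.
    static int original(int budget) {
        int index = 0;
        int calls = 0;
        while (calls < budget) {   // stands in for someCondition
            calls++;               // stands in for doStuff()
            index++;
            if (index == 100) {
                break;
            }
        }
        return calls;
    }

    // Mutant loop: '==' changed to '>='.
    static int mutant(int budget) {
        int index = 0;
        int calls = 0;
        while (calls < budget) {
            calls++;
            index++;
            if (index >= 100) {
                break;
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        boolean allEqual = true;
        for (int budget = 0; budget <= 500; budget++) {
            if (original(budget) != mutant(budget)) {
                allEqual = false;
            }
        }
        System.out.println("identical behaviour for every budget tried: " + allEqual);
    }
}
```

Since no input separates the two versions, the mutant can never be killed, and a tool will keep reporting it as a surviving mutant until someone inspects it by hand.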

Slide 18

Slide 18 text

Tools
Some Java, then some Ruby

Slide 19

Slide 19 text

Java
• Loads of tools to choose from
• Bytecode vs source mutation
• Will look at PIT (seems like one of the better ones)

Slide 20

Slide 20 text

PIT - pitest.org
• Works with “everything”
  – Command line
  – Ant
  – Maven
• Bytecode level mutations (faster)
• Very customisable
  – Exclude classes/packages from mutation
  – Choose which mutations you want
  – Timeouts
• Makes pretty HTML reports (line/mutation coverage)

Slide 21

Slide 21 text

Ruby

Slide 22

Slide 22 text

Ruby
• Mutant seems to be the new favourite
• Runs in Rubinius (1.8 or 1.9 mode)
• Only supports RSpec
• Easy to set up
    rvm install rbx-head
    rvm use rbx-head
    gem install mutant
• And easy to use
    mutate "ClassName#method_to_test" spec

Slide 23

Slide 23 text

Summary
• Seems like it could identify areas of weakness in our tests
• At the same time, could be very noisy
• Might be worth just trying it against an existing project and seeing what happens

Slide 24

Slide 24 text

Questions?