Mutation testing: How good your tests really are?

Slide 1

Slide 1 text

Mutation testing: How good your tests really are? Marcin Zajączkowski, Wrocław, 2016-01-16. @SolidSoftBlog, www.codearte.io

Slide 2

Slide 2 text

About me
● Java architect
● TDD practitioner
● Team mentor
● Clean code developer
● Software Craftsmanship evangelist
● FOSS developer
● Linux enthusiast
● IT trainer
● Code quality freak
● Blogger

Slide 3

Slide 3 text

Two interesting companies

Slide 4

Slide 4 text

Presentation plan
● What is mutation testing and how does it work?
● Mutation operators
● Why is it almost unknown and rarely used?
● PIT – a Java tool that works
● Support in other languages
● Mutants in action – live coding
● Benefits of using it
● Adoption in a real-life project

Slide 5

Slide 5 text

http://www.erniemort.com/

Slide 6

Slide 6 text

http://www.townofdarienny.com/

Slide 7

Slide 7 text

http://www.censusfinder.com/nebraska-historical-museums.htm

Slide 8

Slide 8 text

http://muratordom.pl/

Slide 9

Slide 9 text

http://www.oldwestlawmansforgottenmemoir.com/memoir_bringsV13.html

Slide 10

Slide 10 text

https://weshouldnamethissoon.wordpress.com/

Slide 11

Slide 11 text

http://www.rustyaccents.com/

Slide 12

Slide 12 text

http://www.nps.gov/fosm/historyculture/executions-at-fort-smith-1873-to-1896.htm

Slide 13

Slide 13 text

https://secure.flickr.com/photos/kingdafy/500117608/

Slide 14

Slide 14 text

Staged crime https://secure.flickr.com/photos/7402220@N02/491093210/

Slide 15

Slide 15 text

2 - http://goo.gl/C8yFe

Slide 16

Slide 16 text

http://publish.illinois.edu/libraryitnews/2012/06/

Slide 17

Slide 17 text

Analogies
● Project – Town
● Bugs in code – Crimes
● Automated tests – Sheriffs
● Code coverage – Patrol paths
● Mutants in code – Provocations
The idea of the analogy to crime and law enforcement is taken from Chris Rimmer.

Slide 18

Slide 18 text

Mutation testing: what is it about?
● Intentionally break a selected line of production code (introduce a mutant)
● Check whether any test detects the modification, that is, whether a test fails (killed mutant)
● Surviving mutants (those that were not detected) are potential bugs that the automated tests would not catch
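The three steps above can be sketched by hand in a few lines of Java. This is only an illustration: the class, the discount rule, and the "test" (a plain boolean method standing in for a real unit test) are hypothetical and do not come from the talk. A real tool generates the mutant automatically.

```java
import java.util.function.IntUnaryOperator;

// Illustrative sketch: all names here are hypothetical, not taken from the talk.
final class KilledMutantSketch {

    // Original production code: orders of 100 or more get a 10% discount.
    static int discountPercent(int orderTotal) {
        return orderTotal >= 100 ? 10 : 0;
    }

    // Hand-written mutant: the condition is negated, as a mutation tool might do.
    static int discountPercentMutant(int orderTotal) {
        return orderTotal < 100 ? 10 : 0;
    }

    // A stand-in for an automated test: returns true when all checks pass.
    static boolean testPasses(IntUnaryOperator implementation) {
        return implementation.applyAsInt(150) == 10
                && implementation.applyAsInt(50) == 0;
    }

    public static void main(String[] args) {
        boolean originalPasses = testPasses(KilledMutantSketch::discountPercent);
        boolean mutantKilled = !testPasses(KilledMutantSketch::discountPercentMutant);
        System.out.println("test passes on original code: " + originalPasses);
        System.out.println("mutant killed by the test:    " + mutantKilled);
    }
}
```

The test passes on the original code and fails on the mutant, so the mutant counts as killed. Had the test passed on both versions, the mutant would have survived and pointed at a gap in the test.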

Slide 19

Slide 19 text

Mutation operators
● The kind of modification to be applied to production code
● Preferred features
  – Generates few equivalent mutants
  – Mutants are not too easy to detect
  – Results in sensible code
1 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-furry-vector-monster-in-illustrator 2 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-vector-monster-from-a-pencil-sketch

Slide 20

Slide 20 text

Mutation operators – example: Conditionals Boundary Mutator
● Replaces a relational operator (<, <=, >, >=) with its boundary counterpart
● Can detect a lack of range boundary testing
Original code: if (a < b) { // ... }
Mutated code: if (a <= b) { // ... }
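A small sketch of why this operator exposes missing boundary tests. Everything here is hypothetical (names, values, and the boolean methods standing in for unit tests); the mutant is written by hand rather than generated.

```java
// Illustrative sketch: the class, interface, and method names are hypothetical.
final class BoundaryMutantSketch {

    interface Comparison {
        boolean isBelow(int a, int b);
    }

    // Original condition from the slide.
    static boolean original(int a, int b) {
        return a < b;
    }

    // Boundary mutant: < replaced with <=.
    static boolean boundaryMutant(int a, int b) {
        return a <= b;
    }

    // Checks only values far from the boundary.
    static boolean weakTest(Comparison c) {
        return c.isBelow(1, 10) && !c.isBelow(20, 10);
    }

    // Additionally checks the boundary case a == b.
    static boolean boundaryTest(Comparison c) {
        return weakTest(c) && !c.isBelow(10, 10);
    }

    public static void main(String[] args) {
        System.out.println("weak test kills mutant:     "
                + !weakTest(BoundaryMutantSketch::boundaryMutant));
        System.out.println("boundary test kills mutant: "
                + !boundaryTest(BoundaryMutantSketch::boundaryMutant));
    }
}
```

Both tests pass on the original code, but only the one that exercises `a == b` fails on the mutant. A surviving boundary mutant is therefore a hint that the edge of the range is untested.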

Slide 21

Slide 21 text

Popular mutation operators
● Conditionals Mutators
  – Boundary, Negation, Removal
● Math Mutator
● Increments Mutator
● Invert Negatives Mutator
● Return Values Mutator
● (Non) Void Method Calls Mutator
1 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-furry-vector-monster-in-illustrator 2 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-vector-monster-from-a-pencil-sketch
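To make a few of these operators concrete, the sketch below pairs small methods with hand-written mutants of the kind each operator is meant to produce. The methods and names are hypothetical, and the exact substitutions a given tool applies may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: hand-written mutants, hypothetical method names.
final class MutationOperatorsSketch {

    // Math Mutator: an arithmetic operator is replaced (+ becomes -).
    static int total(int price, int shipping) { return price + shipping; }
    static int totalMathMutant(int price, int shipping) { return price - shipping; }

    // Increments Mutator: an increment becomes a decrement.
    static int next(int counter) { counter++; return counter; }
    static int nextIncrementMutant(int counter) { counter--; return counter; }

    // Invert Negatives Mutator: the unary minus is dropped.
    static int opposite(int value) { return -value; }
    static int oppositeInvertMutant(int value) { return value; }

    // Void Method Calls Mutator: the call to a void method is removed.
    static int clearAndCount(List<String> items) { items.clear(); return items.size(); }
    static int clearAndCountCallRemoved(List<String> items) { return items.size(); }

    public static void main(String[] args) {
        System.out.println(total(10, 5) + " vs " + totalMathMutant(10, 5));
        System.out.println(next(1) + " vs " + nextIncrementMutant(1));
        System.out.println(opposite(3) + " vs " + oppositeInvertMutant(3));
        System.out.println(clearAndCount(new ArrayList<>(List.of("a", "b")))
                + " vs " + clearAndCountCallRemoved(new ArrayList<>(List.of("a", "b"))));
    }
}
```

Note that a test calling `total(10, 0)` could not tell the original from the Math mutant, since both return 10. The choice of test data matters as much as the assertion itself.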

Slide 22

Slide 22 text

Mutation testing
● Known for decades
  – First paper in 1971 by Richard Lipton
● A very interesting and promising idea
● Why is it not commonly used in the 21st century?

Slide 23

Slide 23 text

Common implementation issues http://bestclipartblog.com/27-tools-clip-art.html/tools-clip-art-2

Slide 24

Slide 24 text

Common issues: long execution time
● AssertJ – a medium-sized project
● ~50000 lines of production code
● ~120000 lines of tests
● Compilation time – ~7 seconds
● Test execution time – ~30 seconds (~7400 unit tests)

Slide 25

Slide 25 text

Long execution time: brute force
● 5500 mutants generated
● 5500 * 7 seconds – compilation
● 5500 * 30 seconds – test execution
● 203500 seconds in total
● ~3400 minutes
● Over 56 hours! (for a not-so-large project)
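The brute-force estimate can be reproduced with a few lines of arithmetic. The sketch below only uses the figures quoted on the slides (5500 mutants, 7 seconds of compilation and 30 seconds of tests per mutant); the class and method names are made up for the illustration.

```java
// Illustrative sketch: reproduces the slide's arithmetic; names are hypothetical.
final class BruteForceCostSketch {

    // Every mutant pays for one full compilation and one full test run.
    static long totalSeconds(long mutants, long compileSeconds, long testSeconds) {
        return mutants * (compileSeconds + testSeconds);
    }

    public static void main(String[] args) {
        long total = totalSeconds(5500, 7, 30);
        System.out.println(total + " seconds");          // 203500 seconds
        System.out.println(total / 60 + " full minutes"); // 3391 full minutes
        System.out.println(total / 3600 + " full hours"); // 56 full hours
    }
}
```

The exact figure is about 3392 minutes, or roughly 56.5 hours, which the slide rounds to ~3400 minutes and "over 56 hours".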

Slide 26

Slide 26 text

Common implementation issues with mutation testing
● Long execution time
● Equivalent mutations
● Small number of tools (many no longer maintained)
● Required production code modification
● Broken mutated code
  – Infinite loops
  – Stack overflow
http://bestclipartblog.com/27-tools-clip-art.html/tools-clip-art-2
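An equivalent mutation changes the code without changing its behaviour, so no test can ever kill it. A minimal sketch, using a hypothetical `max` method: replacing `>` with `>=` only affects the case `a == b`, where both branches return the same value anyway.

```java
// Illustrative sketch of an equivalent mutant; names are hypothetical.
final class EquivalentMutantSketch {

    static int max(int a, int b) {
        return a > b ? a : b;
    }

    // Boundary mutant: > replaced with >=. For a == b both branches
    // yield the same value, so the observable behaviour is unchanged.
    static int maxBoundaryMutant(int a, int b) {
        return a >= b ? a : b;
    }

    // Brute-force comparison over a small range of inputs.
    static boolean behaveIdentically(int from, int to) {
        for (int a = from; a <= to; a++) {
            for (int b = from; b <= to; b++) {
                if (max(a, b) != maxBoundaryMutant(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("identical on [-50, 50]: " + behaveIdentically(-50, 50));
    }
}
```

Such a mutant shows up in a report as "survived" although the tests are not at fault, which is why equivalent mutations have to be reviewed by a human and why operators that produce few of them are preferred.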

Slide 27

Slide 27 text

PIT – fast mutations for Java

Slide 28

Slide 28 text

PIT – mutation generation
● Bytecode manipulation
  – Instead of source code
● No need to recompile modified classes
● Mutation generation in almost no time
● Mutations are harder to implement
● Some cases cannot be covered due to compiler optimizations

Slide 29

Slide 29 text

General test selection techniques
● Running all tests for every mutation – very inefficient
● By naming convention
  – Tends to underestimate test suite effectiveness
  – Not all tests are written that way
  – Typos
● Static call analysis
  – Problematic with polymorphism
  – Misses calls made through reflection

Slide 30

Slide 30 text

PIT – effective test selection
● Standard code coverage measurement first
● Mutants with no coverage are bound to survive
  – No test even executes the given line
● Test prioritization
  – Fast tests first
  – Stop when one fails
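The selection strategy described above can be sketched as follows. This is a simplified model under stated assumptions, not PIT's actual implementation: the `TestCase` and `Outcome` types are hypothetical, and "running" a test is simulated by looking up whether the test really verifies the mutated line.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: hypothetical names, not PIT's actual implementation.
final class TestSelectionSketch {

    enum Status { NO_COVERAGE, KILLED, SURVIVED }

    static final class TestCase {
        final String name;
        final long millis;                // duration measured during the coverage run
        final Set<Integer> coveredLines;  // lines the test executes
        final Set<Integer> verifiedLines; // lines whose behaviour it actually asserts

        TestCase(String name, long millis, Set<Integer> coveredLines, Set<Integer> verifiedLines) {
            this.name = name;
            this.millis = millis;
            this.coveredLines = coveredLines;
            this.verifiedLines = verifiedLines;
        }
    }

    static final class Outcome {
        final Status status;
        final int testsRun;

        Outcome(Status status, int testsRun) {
            this.status = status;
            this.testsRun = testsRun;
        }
    }

    static Outcome analyse(int mutatedLine, List<TestCase> suite) {
        // 1. Keep only the tests that execute the mutated line.
        List<TestCase> candidates = new ArrayList<>();
        for (TestCase test : suite) {
            if (test.coveredLines.contains(mutatedLine)) {
                candidates.add(test);
            }
        }
        // 2. No covering test: the mutant survives without running anything.
        if (candidates.isEmpty()) {
            return new Outcome(Status.NO_COVERAGE, 0);
        }
        // 3. Fast tests first, stop at the first failure.
        candidates.sort(Comparator.comparingLong(test -> test.millis));
        int testsRun = 0;
        for (TestCase test : candidates) {
            testsRun++;
            if (test.verifiedLines.contains(mutatedLine)) { // simulated test failure
                return new Outcome(Status.KILLED, testsRun);
            }
        }
        return new Outcome(Status.SURVIVED, testsRun);
    }

    static List<TestCase> sampleSuite() {
        return List.of(
                new TestCase("slowIntegrationTest", 900, Set.of(10, 11, 12), Set.of(10, 12)),
                new TestCase("fastUnitTest", 5, Set.of(10, 11), Set.of(10)));
    }

    public static void main(String[] args) {
        for (int line : new int[] {10, 11, 12, 99}) {
            Outcome outcome = analyse(line, sampleSuite());
            System.out.println("line " + line + ": " + outcome.status
                    + " after " + outcome.testsRun + " test(s)");
        }
    }
}
```

In the sample suite, the mutant on line 10 is killed by the fast unit test alone, so the slow integration test never runs for it. Line 11 is covered but not verified, so its mutant survives after both tests have run.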

Slide 31

Slide 31 text

PIT – parallel test execution
● Multiple tests run simultaneously
● Modern laptops have 2 or 4 cores
● Ideal for a unit test suite
● Can decrease execution time dramatically
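Because each mutant can be analysed independently of the others, the work spreads naturally over several workers. The sketch below shows the general idea with a standard `ExecutorService`. It is an assumption-laden illustration, not a description of how PIT schedules its work internally: each mutant check is modelled as a hypothetical `Callable<Boolean>` that returns true when the mutant is killed.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch: hypothetical names, not PIT's internal scheduling.
final class ParallelAnalysisSketch {

    // Runs all mutant checks on a fixed-size pool and counts the killed mutants.
    static int countKilled(List<Callable<Boolean>> mutantChecks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int killed = 0;
            for (Future<Boolean> result : pool.invokeAll(mutantChecks)) {
                if (result.get()) {
                    killed++;
                }
            }
            return killed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("analysis interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("mutant check failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Four simulated checks: three mutants are killed, one survives.
    static List<Callable<Boolean>> sampleChecks() {
        return List.of(() -> true, () -> false, () -> true, () -> true);
    }

    public static void main(String[] args) {
        System.out.println("killed: " + countKilled(sampleChecks(), 4));
    }
}
```

This works well for unit tests precisely because they are isolated from each other. Tests that share a database or files would need extra care before being run side by side.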

Slide 32

Slide 32 text

PIT – incremental build
● Mutation testing only for modified code
● Local changes in SCM
  – For developers
● Classes modified since the last execution
  – For the CI server
● Ideal for a large codebase

Slide 33

Slide 33 text

PIT – fast mutations for Java
● Bytecode manipulation
● Mutation only of lines covered by tests
● Execution of related tests only
● Parallel execution
● Incremental analysis
http://carhumor.net/blast-from-the-past/

Slide 34

Slide 34 text

PIT – fast mutations for Java
● Remember AssertJ and the brute-force method?
  – Over 56 hours
● With PIT using 4 threads it takes just ~6 minutes
  – on my 3-year-old laptop

Slide 35

Slide 35 text

PIT – rich ecosystem: TestNG, Spock (logos taken from the home pages of the mentioned projects)

Slide 36

Slide 36 text

PIT – pros
● Fast
● Powerful
● Widely supported

Slide 37

Slide 37 text

PIT – cons
● No mutation of production code written in other JVM languages
  – But tests in Spock are completely fine
● Generated reports do not show how coverage changed between executions
  – The plugin for SonarQube can mitigate that

Slide 38

Slide 38 text

Some Java alternatives
● Javalanche – small ecosystem, last commit in 2012
● µJava – limited access to source code (before April 2015), purely academic project
● Judy – no source code available, small ecosystem
● Jester – first tool for Java, no longer maintained
● Jumble – only occasional updates, small ecosystem

Slide 39

Slide 39 text

Other languages: selected tools
● Mutant – Ruby – actively developed
● grunt-mutation-testing – JavaScript – actively developed
● MutPy – Python – no longer maintained
● Mutagenesis – PHP – low activity (somewhat maintained in the judgedim fork)
● NinjaTurtles – .NET – last commit in 2012
● VisualMutator – .NET – out-of-the-box integration with Visual Studio

Slide 40

Slide 40 text

Mutants in action http://www.adolescentadulthood.com/2013/01/23/how-did-the-teenage-mutant-ninja-turtles-get-their-names/

Slide 41

Slide 41 text

What can you get?
● Better code quality
● Fewer bugs in production
● Job satisfaction
● ... (other benefits of writing testable code)
● Information on how good your tests really are
● Places in the code that are not properly tested
  – Better than with "normal" code coverage
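The last point deserves a small illustration. A test that merely executes a line counts towards line coverage even if it asserts nothing, whereas a mutant on that line survives and reveals the gap. The sketch below is hypothetical throughout: the fee calculation, the hand-written mutant, and the boolean methods standing in for real tests.

```java
import java.util.function.IntUnaryOperator;

// Illustrative sketch: hypothetical names, hand-written mutant.
final class CoverageVersusMutationSketch {

    static int applyFee(int amount) {
        return amount + 5;
    }

    // Math mutant: + replaced with -.
    static int applyFeeMutant(int amount) {
        return amount - 5;
    }

    // Executes the code (full line coverage) but checks nothing.
    static boolean coverageOnlyTest(IntUnaryOperator implementation) {
        implementation.applyAsInt(100);
        return true;
    }

    // Executes the code and verifies the result.
    static boolean assertingTest(IntUnaryOperator implementation) {
        return implementation.applyAsInt(100) == 105;
    }

    public static void main(String[] args) {
        System.out.println("coverage-only test kills mutant: "
                + !coverageOnlyTest(CoverageVersusMutationSketch::applyFeeMutant));
        System.out.println("asserting test kills mutant:     "
                + !assertingTest(CoverageVersusMutationSketch::applyFeeMutant));
    }
}
```

Both tests give the same line coverage for `applyFee`, yet only the asserting one kills the mutant. That difference is exactly what a standard coverage report cannot show.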

Slide 42

Slide 42 text

Is it a MUST for any project?
● Short answer – no
● It is an advanced technique/tool
  – Many projects don't even measure standard code coverage
  – Or code quality at all
● There are other tools/techniques that are easier to introduce (and maintain) to start with
  – Static code analysis
  – Standard code coverage
  – Pair programming or peer review

Slide 43

Slide 43 text

When to use?
● A greenfield project developed with high quality in mind
● High coverage, but still bugs in production that could (and should) be detected by tests
● Doubts about test suite quality
  – For example, an HLD requirement of a 95% minimum code coverage level for a development team with no experience in automated testing
● Improving a legacy system with low code quality and/or without tests

Slide 44

Slide 44 text

Prepare your project
● Write automated tests
● Write fast automated unit tests (not only slow integration ones)
● Separate fast unit tests from slow integration tests
  – Be able to run only a selected group of tests
● Introduce basic code quality measurement techniques

This presentation is available under the terms of Creative Commons Attribution-NonCommercial-ShareAlike 3.0 (excluding the parts created by other people, including photos). Version 1.1.3-frogs.

Slide 45

Slide 45 text

Does anyone use it in a real-life project? Yes :-)
● British Sky Broadcasting
● TheLadders
● Large Hadron Collider at CERN
● Maybe you?
http://www.mysciencework.com/fr/MyScienceNews/10027/de-l-in-opportunite-des-open-spaces-dans-les-labos

Slide 46

Slide 46 text

Summary of benefits
● Verification of the effectiveness of automated tests
● More reliable code
● Less trouble at work
● More time for interesting things
● Increased job satisfaction

Slide 47

Slide 47 text

Thank you for your attention. Marcin Zajączkowski, http://codearte.io/, http://blog.solidsoft.info/, @SolidSoftBlog

Slide 48

Slide 48 text

Before questions... … immediate feedback

Slide 49

Slide 49 text

Questions?