Mutation testing: How good your tests really are?

Mutation testing How good your tests really are? Marcin Zajączkowski
Wrocław, 2016-01-16 @SolidSoftBlog [email protected] www.codearte.io

About me: Java architect, TDD practitioner, Team mentor, Clean code developer, Software Craftsmanship Evangelist, FOSS developer, Linux enthusiast, IT Trainer, Code quality freak, Blogger

Two interesting companies

Presentation plan
• What is mutation testing and how does it work?
• Mutation operators
• Why is it almost unknown and rarely used?
• PIT – a Java tool that works
• Support in other languages
• Mutants in action – live coding
• Benefits of using it
• Adoption in a real-life project

http://www.erniemort.com/

http://www.townofdarienny.com/

http://www.censusfinder.com/nebraska-historical-museums.htm

http://muratordom.pl/

http://www.oldwestlawmansforgottenmemoir.com/memoir_bringsV13.html

https://weshouldnamethissoon.wordpress.com/

http://www.rustyaccents.com/

http://www.nps.gov/fosm/historyculture/executions-at-fort-smith-1873-to-1896.htm

https://secure.flickr.com/photos/kingdafy/500117608/

A staged crime https://secure.flickr.com/photos/7402220@N02/491093210/

2 - http://goo.gl/C8yFe

http://publish.illinois.edu/libraryitnews/2012/06/

Analogies
• Project – Town
• Bugs in code – Crimes
• Automated tests – Sheriffs
• Code coverage – Patrol paths
• Mutants in code – Provocations
The idea of an analogy to crime and law enforcement is taken from Chris Rimmer.

Mutation testing – what's it about?
• Intentionally break a selected line of production code (introduce a mutant)
• Check whether any test detects the modification – if a test fails, the mutant is killed
• Surviving mutants (those not detected) are potential bugs that the automated tests would not catch
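The kill-or-survive check can be pictured with a toy sketch. Everything here is an assumption made for illustration: the addition function, the hand-written mutant and the two "tests" are hypothetical, and a real tool generates mutants automatically instead of having them written by hand.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

class MutantDemo {
    // Hypothetical production code: plain addition.
    static final IntBinaryOperator ORIGINAL = (a, b) -> a + b;
    // Hand-written mutant: '+' replaced with '-'.
    static final IntBinaryOperator MUTANT = (a, b) -> a - b;

    // A "test" is modelled as a predicate that returns true when it passes.
    static final Predicate<IntBinaryOperator> WEAK_TEST = add -> add.applyAsInt(0, 0) == 0;
    static final Predicate<IntBinaryOperator> STRONG_TEST = add -> add.applyAsInt(2, 3) == 5;

    // A mutant is killed when at least one test fails against it.
    static boolean isKilled(IntBinaryOperator code, List<Predicate<IntBinaryOperator>> tests) {
        return tests.stream().anyMatch(test -> !test.test(code));
    }

    public static void main(String[] args) {
        // 0 - 0 is still 0, so the weak test cannot see the mutation.
        System.out.println("weak suite kills mutant: "
                + isKilled(MUTANT, List.of(WEAK_TEST)));
        // 2 - 3 is -1, not 5, so the stronger test fails and kills the mutant.
        System.out.println("full suite kills mutant: "
                + isKilled(MUTANT, List.of(WEAK_TEST, STRONG_TEST)));
    }
}
```

With only the weak test the mutant survives, which points at a gap in the tests rather than a bug in the production code.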

Mutation operators
• The kind of modification applied to production code
• Preferred features
– A small number of generated equivalent mutants
– Not too easy to detect
– Resulting in sensible code
1 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-furry-vector-monster-in-illustrator
2 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-vector-monster-from-a-pencil-sketch

Mutation operators – example: Conditionals Boundary Mutator
• Replaces relational operators (<, <=, >, >=)
• Can detect a lack of range boundary testing
Original:
    if (a < b) {
        // ...
    }
Mutant:
    if (a <= b) {
        // ...
    }
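A small sketch of why this operator exposes missing boundary tests. The two methods are hypothetical stand-ins for the original condition and its mutated form; they are not generated by any tool.

```java
class BoundaryDemo {
    // Hypothetical original condition.
    static boolean original(int a, int b) {
        return a < b;
    }

    // The same condition after a boundary mutation: '<' became '<='.
    static boolean mutant(int a, int b) {
        return a <= b;
    }

    public static void main(String[] args) {
        // Inputs away from the boundary give identical results,
        // so tests using only such inputs cannot kill the mutant.
        System.out.println("a=1, b=5 same result: " + (original(1, 5) == mutant(1, 5)));
        System.out.println("a=9, b=5 same result: " + (original(9, 5) == mutant(9, 5)));
        // Only the boundary case a == b tells the two versions apart.
        System.out.println("a=5, b=5 same result: " + (original(5, 5) == mutant(5, 5)));
    }
}
```

A test suite needs at least one case with a equal to b to kill this mutant.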

Popular mutation operators
• Conditionals Mutators – Boundary, Negation, Removal
• Math Mutator
• Increments Mutator
• Invert Negatives Mutator
• Return Values Mutator
• (Non) Void Method Calls Mutator

Mutation testing
• Known for decades
– First paper in 1971 by Richard Lipton
• A very interesting and promising idea
• Why is it not commonly used in the 21st century?

Common implementation issues http://bestclipartblog.com/27-tools-clip-art.html/tools-clip-art-2

Common issues – long execution time
• AssertJ – a medium-sized project
• ~50000 lines of production code
• ~120000 lines of tests
• Compilation time – ~7 seconds
• Test execution time – ~30 seconds (~7400 unit tests)

Long execution time – brute force
• 5500 mutants generated
• 5500 * 7 seconds – compilation
• 5500 * 30 seconds – test execution
• 203500 seconds in total
• ~3400 minutes
• Over 56 hours! (for a not-so-large project)
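The brute-force estimate above can be reproduced with a few lines of arithmetic. The method name is made up for this sketch; the input figures are the ones quoted for AssertJ.

```java
import java.util.Locale;

class BruteForceCost {
    // Every mutant pays for one full recompilation and one full test run.
    static long totalSeconds(long mutants, long compileSeconds, long testSeconds) {
        return mutants * (compileSeconds + testSeconds);
    }

    public static void main(String[] args) {
        long total = totalSeconds(5500, 7, 30);
        System.out.println("seconds: " + total);
        System.out.println(String.format(Locale.ROOT, "minutes: %.0f", total / 60.0));
        System.out.println(String.format(Locale.ROOT, "hours:   %.1f", total / 3600.0));
    }
}
```

The cost grows linearly with the number of mutants, which is why avoiding recompilation and running fewer tests per mutant matter so much.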

Common implementation issues with mutation testing
• Long execution time
• Equivalent mutations
• Small number of tools (many no longer maintained)
• Required production code modification
• Broken mutated code
– Infinite loops
– Stack overflow
http://bestclipartblog.com/27-tools-clip-art.html/tools-clip-art-2

PIT – fast mutations for Java

PIT – mutation generation
• Bytecode manipulation
– Instead of source code
• No need to recompile modified classes
• Mutation generation in almost no time
• Mutations are harder to implement
• Some cases cannot be covered due to compiler optimizations

General test selection techniques
• Running all tests for every mutation – very inefficient
• By naming convention
– Tends to underestimate test suite effectiveness
– Not all tests are written that way
– Typos
• Static call analysis
– Problematic with polymorphism
– Misses reflective calls

PIT – effective test selection
• Standard code coverage measurement first
• Mutants with no coverage have to survive
– No test even executes the given line
• Test prioritization
– Fast tests first
– Stop when one fails
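A minimal sketch of coverage-based selection with fast-first ordering. The data model (`CoveredTest`, line numbers, timings) is hypothetical and only illustrates the idea; it does not mirror PIT's internal data structures.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class TestSelectionSketch {
    // What a prior coverage run is assumed to tell us about each test.
    static final class CoveredTest {
        final String name;
        final long millis;
        final Set<Integer> coveredLines;

        CoveredTest(String name, long millis, Set<Integer> coveredLines) {
            this.name = name;
            this.millis = millis;
            this.coveredLines = coveredLines;
        }
    }

    // Tests worth running for a mutant on the given line:
    // only those that execute the line, fastest first.
    static List<String> testsFor(int mutatedLine, List<CoveredTest> suite) {
        return suite.stream()
                .filter(t -> t.coveredLines.contains(mutatedLine))
                .sorted(Comparator.comparingLong((CoveredTest t) -> t.millis))
                .map(t -> t.name)
                .collect(Collectors.toList());
    }

    static final List<CoveredTest> SUITE = List.of(
            new CoveredTest("slowIntegrationTest", 900, Set.of(10, 11, 12)),
            new CoveredTest("fastUnitTest", 5, Set.of(10)),
            new CoveredTest("otherUnitTest", 7, Set.of(20)));

    public static void main(String[] args) {
        // Line 10 is covered by two tests; the fast one is tried first.
        System.out.println(testsFor(10, SUITE));
        // Line 99 is covered by nothing: the mutant survives without running any test.
        System.out.println(testsFor(99, SUITE));
    }
}
```

Running the selected tests in this order and stopping at the first failure keeps the cost per killed mutant close to the cost of the fastest relevant test.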

PIT – parallel test execution
• Multiple tests run simultaneously
• Modern laptops have 2 or 4 cores
• Ideal for a unit test suite
• Can decrease execution time dramatically
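One way to picture the parallelism is to spread independent mutant checks across a fixed pool of worker threads. This is a sketch under assumptions, not PIT's implementation: the mutants, the tiny "suite" and the method names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntBinaryOperator;

class ParallelMutantRun {
    // Hypothetical mutants of an addition function.
    static final IntBinaryOperator SUBTRACT = (a, b) -> a - b;
    static final IntBinaryOperator MULTIPLY = (a, b) -> a * b;
    // Swapping operands of '+' changes nothing observable: an equivalent mutant.
    static final IntBinaryOperator SWAPPED = (a, b) -> b + a;

    // A stand-in for a unit test suite: returns true when all checks pass.
    static boolean suitePasses(IntBinaryOperator add) {
        return add.applyAsInt(2, 3) == 5 && add.applyAsInt(0, 0) == 0;
    }

    // Checks every mutant on a fixed-size thread pool and counts the killed ones.
    static int countKilled(List<IntBinaryOperator> mutants, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> jobs = new ArrayList<>();
            for (IntBinaryOperator mutant : mutants) {
                jobs.add(() -> !suitePasses(mutant));
            }
            int killed = 0;
            for (Future<Boolean> result : pool.invokeAll(jobs)) {
                if (result.get()) {
                    killed++;
                }
            }
            return killed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while analysing mutants", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("mutant analysis failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<IntBinaryOperator> mutants = List.of(SUBTRACT, MULTIPLY, SWAPPED);
        System.out.println("killed " + countKilled(mutants, 4) + " of " + mutants.size());
    }
}
```

The approach works because each mutant check is independent and free of shared state, which is also why it suits unit tests better than integration tests that compete for external resources.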

PIT – incremental build
• Mutation testing only for modified code
• Local changes in SCM
– For developers
• Modified classes since the last execution
– For a CI server
• Ideal for a large codebase

PIT – fast mutations for Java
• Bytecode manipulation
• Mutation only of the lines with standard coverage
• Execution of related tests only
• Parallel execution
• Incremental analysis
http://carhumor.net/blast-from-the-past/

PIT – fast mutations for Java
• Remember AssertJ and the brute-force method?
– Over 56 hours
• With PIT using 4 threads it is just ~6 minutes
– on my 3-year-old laptop

PIT – rich ecosystem: TestNG, Spock
Logos – home pages of the mentioned projects

PIT – pros
• Fast
• Powerful
• Widely supported

PIT – cons
• No mutation of production code written in other JVM languages
– But tests written in Spock are completely fine
• Generated reports do not show how coverage changed between executions
– The plugin for SonarQube can mitigate that

Some Java alternatives
• Javalanche – small ecosystem, last commit in 2012
• µJava – limited access to source code (before April 2015), purely academic project
• Judy – no source code available, small ecosystem
• Jester – first tool for Java, no longer maintained
• Jumble – only occasional updates, small ecosystem

Other languages – selected tools
• Mutant – Ruby – actively developed
• grunt-mutation-testing – JavaScript – actively developed
• MutPy – Python – no longer maintained
• Mutagenesis – PHP – low activity (somewhat maintained in the judgedim fork)
• NinjaTurtles – .NET – last commit in 2012
• VisualMutator – .NET – out-of-the-box integration with Visual Studio

Mutants in action http://www.adolescentadulthood.com/2013/01/23/how-did-the-teenage-mutant-ninja-turtles-get-their-names/

What can you get?
• Better code quality
• Fewer bugs in production
• Job satisfaction
• ... (other benefits from writing testable code)
• Information on how good your tests really are
• Places in code that are not properly tested
– Better than with "normal" code coverage

Is it a MUST for any project?
• Short answer – no
• It is an advanced technique/tool
– Many projects don't even measure standard code coverage
– Or code quality at all
• There are other tools/techniques to start with that are easier to introduce (and maintain)
– Static code analysis
– Standard code coverage
– Pair programming or peer review

When to use?
• A greenfield project developed with high quality in mind
• High coverage, but still bugs in production which could (and should) be detected by tests
• Doubts about test suite quality
– e.g. an HLD requirement of a minimum 95% code coverage, with a development team that has no experience in automated testing
• Improving a legacy system with low code quality and/or without tests

Prepare your project
• Write automated tests
• Write fast automated unit tests (not only slow integration ones)
• Separate fast unit tests from slow integration tests
– Be able to run only a selected group of tests
• Introduce basic code quality measurement techniques

This presentation is available under the terms of Creative Commons Attribution-NonCommercial-ShareAlike 3.0 (with exclusion of the parts created by other people – including photos). Version 1.1.3-frogs.

Does anyone use it in a real-life project? Yes :-)
• British Sky Broadcasting
• TheLadders
• Large Hadron Collider at CERN
• Maybe you?
http://www.mysciencework.com/fr/MyScienceNews/10027/de-l-in-opportunite-des-open-spaces-dans-les-labos

Summary of benefits
• Verification of the effectiveness of automated tests
• More reliable code
• Less trouble at work
• More time for interesting things
• Increased job satisfaction

Thank you for your attention Marcin Zajączkowski [email protected] http://codearte.io/ http://blog.solidsoft.info/
@SolidSoftBlog

Before questions... … immediate feedback

Questions?
