
Mutation testing: How good your tests really are?

Standard code coverage analysis shows which execution paths a test suite exercises. Mutation testing takes this a step further: by modifying or removing production code one line at a time, it investigates how much each line actually matters to the tests. The approach has been known for over 30 years, circulating mainly within academic communities, and only recently has it been rediscovered and applied in commercial projects. By examining the code line by line for its influence on the logic under test, mutation testing gives an actual (as opposed to artificial and synthetic) view of test quality.

The presentation will provide solutions and answers to the following:
- What is mutation testing?
- Why use it?
- Limitations and drawbacks
- Why is it only now gaining traction?
- How to prepare your project for mutation testing?
- Is there a case for it in the enterprise?

This presentation is meant for anyone involved in software development who wants to learn more about mutation testing, what advantages it offers over traditional coverage, and how to apply the idea to an existing codebase. Although not required, some knowledge of test automation and coverage would be beneficial to the attendee.

Marcin Zajączkowski

January 16, 2016


  1. About me

    Java architect, TDD practitioner, Team mentor, Clean code developer, Software Craftsmanship Evangelist, FOSS developer, Linux enthusiast, IT Trainer, Code quality freak, Blogger

    @SolidSoftBlog [email protected] www.codearte.io
  2. Presentation plan

    • What is mutation testing and how does it work?
    • Mutation operators
    • Why is it almost unknown and rarely used?
    • PIT – a Java tool that works
    • Support in other languages
    • Mutants in action – live coding
    • Benefits of using it
    • Adoption in a real-life project
  3. Analogies

    • Project – Town
    • Bugs in code – Crimes
    • Automated tests – Sheriffs
    • Code coverage – Patrol paths
    • Mutants in code – Provocations

    The analogy to crime and law enforcement is taken from Chris Rimmer.
  4. Mutation testing: what's it about?

    • Intentionally break a selected line of production code (introduce a mutant)
    • Check whether any test detects the modification, i.e. whether a test fails (killed mutant)
    • Surviving mutants (those not detected) are potential bugs that the automated tests would not catch
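
The killed/survived distinction from slide 4 can be illustrated with a minimal sketch. Everything here is hypothetical and written for illustration only (the `MutantDemo` class, the `add`-style operation and both tests); a real tool generates and runs mutants automatically rather than by hand.

```java
import java.util.function.IntBinaryOperator;

class MutantDemo {
    // Hypothetical production code: adds two numbers.
    static final IntBinaryOperator ORIGINAL = (a, b) -> a + b;

    // Hand-written mutant: '+' replaced with '-'.
    static final IntBinaryOperator MUTANT = (a, b) -> a - b;

    // Weak test: x + 0 equals x - 0, so it passes for the mutant as well.
    static boolean weakTest(IntBinaryOperator add) {
        return add.applyAsInt(2, 0) == 2;
    }

    // Stronger test: 2 + 3 differs from 2 - 3, so it fails for the mutant.
    static boolean strongTest(IntBinaryOperator add) {
        return add.applyAsInt(2, 3) == 5;
    }

    public static void main(String[] args) {
        // A mutant is killed when at least one test fails against it.
        System.out.println("weak test kills mutant: " + !weakTest(MUTANT));     // false
        System.out.println("strong test kills mutant: " + !strongTest(MUTANT)); // true
    }
}
```

With only the weak test in the suite the mutant survives, even though the line is fully covered in the traditional sense.
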
  5. Mutation operators

    • The kind of modification to be applied to production code
    • Preferred features
        – Small number of generated equivalent mutants
        – Not very easy to detect
        – Resulting in sensible code

    Image credits:
    1 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-furry-vector-monster-in-illustrator
    2 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-vector-monster-from-a-pencil-sketch
  6. Mutation operators – example: Conditionals Boundary Mutator

    • Replaces relational operators (<, <=, >, >=) with their boundary counterparts, e.g. < with <=
    • Can detect a lack of range boundary testing

    Original:

        if (a < b) {
            // ...
        }

    Mutated:

        if (a <= b) {
            // ...
        }
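
To see why a boundary mutation exposes missing boundary tests, consider the sketch below. The `isAdult` methods and the age threshold are assumptions made up for this example; the mutant is written by hand to mimic what a conditionals-boundary operator would produce.

```java
class BoundaryDemo {
    // Hypothetical production code.
    static boolean isAdultOriginal(int age) {
        return age >= 18;
    }

    // Hand-written mutant: '>=' replaced with '>'.
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    public static void main(String[] args) {
        // Tests that only use values far from the boundary cannot tell the two apart.
        System.out.println(isAdultOriginal(10) == isAdultMutant(10)); // true
        System.out.println(isAdultOriginal(30) == isAdultMutant(30)); // true
        // Only a test at the boundary itself kills the mutant.
        System.out.println(isAdultOriginal(18) == isAdultMutant(18)); // false
    }
}
```

A suite that checks only ages 10 and 30 reports 100% line coverage for `isAdultOriginal`, yet the mutant survives until a test for age 18 is added.
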
  7. Popular mutation operators

    • Conditionals Mutators – Boundary, Negation, Removal
    • Math Mutator
    • Increments Mutator
    • Invert Negatives Mutator
    • Return Values Mutator
    • (Non) Void Method Calls Mutator
  8. Mutation testing

    • Known for decades
        – First paper in 1971 by Richard Lipton
    • Very interesting and promising idea
    • Why not commonly used in the 21st century?
  9. Common issues: long execution time

    • AssertJ – medium size project
    • ~50000 lines of production code
    • ~120000 lines of tests
    • Compilation time – ~7 seconds
    • Test execution time – ~30 seconds (~7400 unit tests)
  10. Long execution time: brute force

    • 5500 mutants generated
    • 5500 * 7 seconds – compilation
    • 5500 * 30 seconds – test execution
    • 203500 seconds in total
    • ~3400 minutes
    • Over 56 hours! (for a not-so-large project)
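
The brute-force estimate on slide 10 is simple arithmetic: every mutant pays for one full compilation and one full test run. A minimal sketch using the slide's numbers (the class and method names are made up for this example):

```java
class BruteForceCost {
    // Total cost when each mutant requires a full recompilation and a full test run.
    static long totalSeconds(long mutants, long compileSeconds, long testSeconds) {
        return mutants * (compileSeconds + testSeconds);
    }

    public static void main(String[] args) {
        long total = totalSeconds(5500, 7, 30);
        System.out.println(total + " s");          // 203500 s
        System.out.println(total / 60 + " min");   // 3391 min (integer division)
        System.out.println(total / 3600 + " h");   // 56 h (integer division)
    }
}
```
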
  11. Common implementation issues with mutation testing

    • Long execution time
    • Equivalent mutations
    • Small number of tools (many not maintained anymore)
    • Required production code modification
    • Broken mutated code
        – Infinite loops
        – Stack overflow

    Image credit: http://bestclipartblog.com/27-tools-clip-art.html/tools-clip-art-2
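
An equivalent mutation changes the code without changing its observable behavior, so no test can ever kill it. The sketch below is a textbook-style illustration, not a mutation any particular tool is claimed to generate; the class and method names are hypothetical.

```java
class EquivalentMutantDemo {
    // Original: the loop condition uses '<'.
    static int sumOriginal(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    // Mutant: '<' replaced with '!='. Because i starts at 0 and grows by exactly 1,
    // it always reaches values.length, so both loops run the same iterations.
    static int sumMutant(int[] values) {
        int sum = 0;
        for (int i = 0; i != values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] sample = {1, 2, 3};
        System.out.println(sumOriginal(sample) == sumMutant(sample));         // true
        System.out.println(sumOriginal(new int[0]) == sumMutant(new int[0])); // true
    }
}
```

Such mutants survive by definition and have to be recognized and discounted by a human, which is why operators that rarely produce them are preferred.
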
  12. PIT – mutation generation

    • Bytecode manipulation
        – Instead of source code
    • No need to recompile modified classes
    • Mutation generation in almost no time
    • Harder to implement mutations
    • Not possible to cover some cases because of compiler optimizations
  13. General test selection techniques

    • Running all tests for every mutation – very inefficient
    • By naming convention
        – Tends to underestimate test suite effectiveness
        – Not all tests are written in that way
        – Typos
    • Static call analysis
        – Problematic with polymorphism
        – Misses reflective calls
  14. PIT – effective test selection

    • Standard code coverage measurement first
    • Mutants in uncovered code have to survive
        – No test even executes the given line
    • Test prioritization
        – Fast tests first
        – Stop when one fails
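
The selection strategy from slide 14 can be sketched as follows. This is a simplified illustration of the idea, not PIT's actual implementation: the `TestCase` descriptor, the `Outcome` enum and the line-to-tests coverage map are all assumptions introduced for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

class TestSelectionSketch {
    /** Hypothetical test descriptor: name, measured duration, and a way to run it against the mutant. */
    static final class TestCase {
        final String name;
        final long millis;
        final BooleanSupplier passesAgainstMutant;

        TestCase(String name, long millis, BooleanSupplier passesAgainstMutant) {
            this.name = name;
            this.millis = millis;
            this.passesAgainstMutant = passesAgainstMutant;
        }
    }

    enum Outcome { NO_COVERAGE, KILLED, SURVIVED }

    /** Runs only the tests covering the mutated line, fastest first, stopping at the first failure. */
    static Outcome analyse(int mutatedLine, Map<Integer, List<TestCase>> coverage, List<String> executed) {
        List<TestCase> covering = coverage.getOrDefault(mutatedLine, Collections.<TestCase>emptyList());
        if (covering.isEmpty()) {
            return Outcome.NO_COVERAGE; // no test executes the line, so nothing needs to run
        }
        List<TestCase> fastestFirst = new ArrayList<>(covering);
        fastestFirst.sort(Comparator.comparingLong((TestCase t) -> t.millis));
        for (TestCase test : fastestFirst) {
            executed.add(test.name);
            if (!test.passesAgainstMutant.getAsBoolean()) {
                return Outcome.KILLED; // one failing test is enough
            }
        }
        return Outcome.SURVIVED;
    }

    public static void main(String[] args) {
        Map<Integer, List<TestCase>> coverage = new HashMap<>();
        coverage.put(10, Arrays.asList(
                new TestCase("slowPassing", 500, () -> true),
                new TestCase("fastFailing", 5, () -> false)));
        List<String> executed = new ArrayList<>();
        System.out.println(analyse(10, coverage, executed) + " " + executed); // KILLED [fastFailing]
    }
}
```

In the usage above the slow test never runs, because the fast one already kills the mutant.
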
  15. PIT – parallel test execution

    • Multiple tests run simultaneously
    • Modern laptops have 2 or 4 cores
    • Ideal for a unit test suite
    • Can decrease execution time dramatically
  16. PIT – incremental build

    • Mutation testing only for modified code
    • Local changes in SCM
        – For developers
    • Modified classes since last execution
        – For CI server
    • Ideal for a large codebase
  17. PIT – fast mutations for Java

    • Bytecode manipulation
    • Mutation only of lines covered by tests
    • Execution of related tests only
    • Parallel execution
    • Incremental analysis

    Image credit: http://carhumor.net/blast-from-the-past/
  18. PIT – fast mutations for Java

    • Remember AssertJ and the brute force method?
        – Over 56 hours
    • With PIT using 4 threads it takes just ~6 minutes
        – on my 3-year-old laptop
  19. PIT – cons

    • No mutation of production code in other JVM languages
        – But tests in Spock are completely fine
    • Generated reports do not show how coverage changed between executions
        – A plugin for SonarQube can mitigate that
  20. Some Java alternatives

    • Javalanche – small ecosystem, last commit in 2012
    • µJava – limited access to source code (before April 2015), purely academic project
    • Judy – no source code available, small ecosystem
    • Jester – first tool for Java, no longer maintained
    • Jumble – only occasional updates, small ecosystem
  21. Other languages: selected tools

    • Mutant – Ruby – actively developed
    • grunt-mutation-testing – JavaScript – actively developed
    • MutPy – Python – no longer maintained
    • Mutagenesis – PHP – low activity (somewhat maintained in the judgedim fork)
    • NinjaTurtles – .NET – last commit in 2012
    • VisualMutator – .NET – out-of-the-box integration with Visual Studio
  22. What can you get?

    • Better code quality
    • Fewer bugs in production
    • Job satisfaction
    • ... (other benefits from writing testable code)
    • Information on how good your tests really are
    • Places in code that are not properly tested
        – Better than with "normal" code coverage
  23. Is it a MUST for any project?

    • Short answer – no
    • It is an advanced technique/tool
        – Many projects don't even measure standard code coverage
        – Or code quality at all
    • There are other, easier to introduce (and maintain) tools/techniques to start with
        – Static code analysis
        – Standard code coverage
        – Pair programming or peer review
  24. When to use?

    • Greenfield project developed with high quality in mind
    • High coverage, but still bugs in production which could (and should) be detected by tests
    • Doubts about test suite quality
        – HLD requirement: 95% minimum code coverage for a development team with no experience in automated testing
    • Improving a legacy system with low code quality and/or without tests
  25. Prepare your project

    • Write automated tests
    • Write fast automated unit tests (not only slow integration ones)
    • Separate fast unit tests from slow integration tests
        – Be able to run only a selected group of tests
    • Introduce basic code quality measurement techniques

    This presentation is available under the terms of Creative Commons Attribution-NonCommercial-ShareAlike 3.0 (with exclusion of the parts created by other people – including photos). Version 1.1.3-frogs.
  26. Does anyone use it in a real-life project?

    Yes :-)

    • British Sky Broadcasting
    • TheLadders
    • Large Hadron Collider at CERN
    • Maybe you?

    Image credit: http://www.mysciencework.com/fr/MyScienceNews/10027/de-l-in-opportunite-des-open-spaces-dans-les-labos
  27. Summary of benefits

    • Verification of the effectiveness of automated tests
    • More reliable code
    • Less trouble at work
    • More time for interesting things
    • Increased job satisfaction