Mutation testing: How good your tests really are?

Standard code coverage analysis shows which execution paths a test suite exercises. Mutation testing takes this a step further: it modifies or removes production code, line by line, and checks whether the tests notice, which reveals how much each line actually matters to the test suite. The approach has been known for over 30 years, circulating mainly in academic communities, and only recently has it been rediscovered and applied in commercial projects. Because it examines the code line by line with regard to its influence on the logic under test, mutation testing gives an actual (as opposed to artificial and synthetic) view of test quality.

The presentation answers the following questions:
- What is mutation testing?
- Why use it?
- Limitations and drawbacks
- Why is it only now starting to get traction?
- How to prepare your project for mutation testing?
- Is there a case for it in the enterprise?

This presentation is meant for people involved in software development who want to learn more about mutation testing, what advantages it offers over traditional coverage measurement, and how to apply the idea to their existing codebase. Some knowledge of test automation and coverage is helpful but not required.

Marcin Zajączkowski

January 16, 2016

Transcript

  1. Mutation testing How good your tests really are? Marcin Zajączkowski

    Wrocław, 2016-01-16 @SolidSoftBlog m.zajaczkowski@solidsoft.info www.codearte.io
  2. About me
    Java architect, TDD practitioner, team mentor, clean code developer, Software Craftsmanship evangelist, FOSS developer, Linux enthusiast, IT trainer, code quality freak, blogger
  3. Two interesting companies

  4. Presentation plan
    • What is mutation testing and how does it work?
    • Mutation operators
    • Why is it almost unknown and rarely used?
    • PIT – a Java tool that works
    • Support in other languages
    • Mutants in action – live coding
    • Benefits of using it
    • Adoption in a real-life project
  5. http://www.erniemort.com/

  6. http://www.townofdarienny.com/

  7. http://www.censusfinder.com/nebraska-historical-museums.htm

  8. http://muratordom.pl/

  9. http://www.oldwestlawmansforgottenmemoir.com/memoir_bringsV13.html

  10. https://weshouldnamethissoon.wordpress.com/

  11. http://www.rustyaccents.com/

  12. http://www.nps.gov/fosm/historyculture/executions-at-fort-smith-1873-to-1896.htm

  13. https://secure.flickr.com/photos/kingdafy/500117608/

  14. Staged crime https://secure.flickr.com/photos/7402220@N02/491093210/

  15. 2 - http://goo.gl/C8yFe

  16. http://publish.illinois.edu/libraryitnews/2012/06/

  17. Analogies
    Project – Town
    Bugs in code – Crimes
    Automated tests – Sheriffs
    Code coverage – Patrol paths
    Mutants in code – Provocations
    The idea of an analogy to crime and law enforcement is taken from Chris Rimmer.
  18. Mutation testing: what's it about?
    • Intentionally break a selected line of production code (introduce a mutant)
    • Check whether any test detects the modification, i.e. whether at least one test fails (killed mutant)
    • Surviving mutants (those no test detected) point to potential bugs that the automated tests would not catch
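The idea on this slide can be tried out by hand. The sketch below is only an illustration, not how any real tool works: the discount function, the two test suites and the two hand-written mutants are all hypothetical, and mutants are modelled as alternative lambdas rather than generated automatically.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

// Hypothetical illustration: mutants are hand-written lambdas,
// not the output of a real mutation testing tool.
class MutantDemo {

    // Production code under test: price after a discount, never below zero.
    static final IntBinaryOperator ORIGINAL =
            (price, discount) -> Math.max(price - discount, 0);

    // A weak "test suite": returns true when all of its checks pass.
    static boolean weakSuite(IntBinaryOperator code) {
        return code.applyAsInt(100, 0) == 100;
    }

    // A stronger suite that also checks a real discount and the zero floor.
    static boolean betterSuite(IntBinaryOperator code) {
        return code.applyAsInt(100, 0) == 100
                && code.applyAsInt(100, 30) == 70
                && code.applyAsInt(10, 30) == 0;
    }

    // Two intentionally broken variants of the production code.
    static Map<String, IntBinaryOperator> mutants() {
        Map<String, IntBinaryOperator> mutants = new LinkedHashMap<>();
        mutants.put("minus replaced with plus", (p, d) -> Math.max(p + d, 0));
        mutants.put("max call removed", (p, d) -> p - d);
        return mutants;
    }

    // A mutant survives when the suite still passes on it.
    static long countSurvivors(Predicate<IntBinaryOperator> suite) {
        return mutants().values().stream().filter(suite).count();
    }

    public static void main(String[] args) {
        System.out.println("Survivors, weak suite:   " + countSurvivors(MutantDemo::weakSuite));
        System.out.println("Survivors, better suite: " + countSurvivors(MutantDemo::betterSuite));
    }
}
```

Both suites pass on the original code, so both look fine from the outside; only the number of surviving mutants tells them apart.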
  19. Mutation operators
    • The kind of modification to be applied to production code
    • Preferred features
      – Generate a small number of equivalent mutants
      – Not too easy to detect
      – Result in sensible code
    1 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-furry-vector-monster-in-illustrator
    2 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-vector-monster-from-a-pencil-sketch
  20. Mutation operators - example: Conditionals Boundary Mutator
    • Replaces relational operators (<, <=, >, >=)
    • Can detect a lack of range boundary testing
    if (a < b) { // ... } becomes if (a <= b) { // ... }
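To see why a boundary mutant exposes missing boundary tests, consider the following minimal sketch. The limit of 18 and both test suites are assumptions made up for this illustration; the mutant is written by hand to mimic replacing `<` with `<=`.

```java
import java.util.function.IntPredicate;

// Hypothetical example: the mutant is written by hand to mimic what a
// conditionals boundary mutation would produce.
class BoundaryMutantDemo {

    static final int LIMIT = 18;

    // Original condition: a < b
    static final IntPredicate ORIGINAL = age -> age < LIMIT;
    // Mutated condition: a <= b
    static final IntPredicate MUTANT = age -> age <= LIMIT;

    // Checks only values far away from the boundary.
    static boolean suiteWithoutBoundary(IntPredicate belowLimit) {
        return belowLimit.test(10) && !belowLimit.test(30);
    }

    // Adds a check exactly at the boundary value.
    static boolean suiteWithBoundary(IntPredicate belowLimit) {
        return suiteWithoutBoundary(belowLimit) && !belowLimit.test(LIMIT);
    }

    public static void main(String[] args) {
        System.out.println("Mutant survives without boundary test: " + suiteWithoutBoundary(MUTANT));
        System.out.println("Mutant survives with boundary test:    " + suiteWithBoundary(MUTANT));
    }
}
```

The suite without a boundary check passes on both the original and the mutant, so the mutant survives; adding the single check at the limit kills it.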
  21. Popular mutation operators
    • Conditionals Mutators – Boundary, Negation, Removal
    • Math Mutator
    • Increments Mutator
    • Invert Negatives Mutator
    • Return Values Mutator
    • (Non) Void Method Calls Mutator
    1 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-furry-vector-monster-in-illustrator
    2 - http://blog.spoongraphics.co.uk/tutorials/create-a-cute-vector-monster-from-a-pencil-sketch
  22. Mutation testing
    • Known for decades
      – First paper in 1971 by Richard Lipton
    • Very interesting and promising idea
    • Why is it not commonly used in the 21st century?
  23. Common implementation issues http://bestclipartblog.com/27-tools-clip-art.html/tools-clip-art-2

  24. Common issues: long execution time
    • AssertJ – a medium-sized project
    • ~50000 lines of production code
    • ~120000 lines of tests
    • Compilation time – ~7 seconds
    • Test execution time – ~30 seconds (~7400 unit tests)
  25. Long execution time: brute force
    • 5500 mutants generated
    • 5500 * 7 seconds – compilation
    • 5500 * 30 seconds – test execution
    • 203500 seconds in total
    • ~3400 minutes
    • Over 56 hours! (for a not-so-large project)
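The brute force estimate above is plain arithmetic and can be reproduced with the numbers from the slide: one full compilation and one full test run per mutant. The class and constant names below are made up for this sketch.

```java
// Reproduces the brute force estimate using the figures quoted on the slide.
class BruteForceCost {

    static final int MUTANTS = 5500;
    static final int COMPILE_SECONDS = 7;   // one full compilation per mutant
    static final int TEST_SECONDS = 30;     // one full test run per mutant

    static long totalSeconds() {
        return (long) MUTANTS * (COMPILE_SECONDS + TEST_SECONDS);
    }

    public static void main(String[] args) {
        long total = totalSeconds();
        // Integer division rounds down: 203500 s, 3391 min, 56 h.
        System.out.println(total + " s = " + (total / 60) + " min = " + (total / 3600) + " h");
    }
}
```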
  26. Common implementation issues with mutation testing
    • Long execution time
    • Equivalent mutations
    • Small number of tools (many no longer maintained)
    • Required production code modification
    • Broken mutated code
      – Infinite loops
      – Stack overflow
    http://bestclipartblog.com/27-tools-clip-art.html/tools-clip-art-2
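An equivalent mutation changes the code without changing its observable behaviour, so no test can ever kill it. A small hand-made example, assuming a boundary-style change from `>` to `>=` in a hypothetical `max` function: when both arguments are equal, either branch returns the same value.

```java
// Hypothetical example of an equivalent mutant: the code differs,
// the behaviour does not, so no test can tell the two apart.
class EquivalentMutantDemo {

    // Original code.
    static int max(int a, int b) {
        return a > b ? a : b;
    }

    // Mutant: > replaced with >=. For a == b both branches yield the same value.
    static int maxMutant(int a, int b) {
        return a >= b ? a : b;
    }

    // Compares original and mutant on every pair in [from, to].
    static boolean behaveTheSame(int from, int to) {
        for (int a = from; a <= to; a++) {
            for (int b = from; b <= to; b++) {
                if (max(a, b) != maxMutant(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("Same behaviour on [-50, 50]: " + behaveTheSame(-50, 50));
    }
}
```

Such mutants show up as survivors in a report even though no missing test is to blame, which is why operators that generate few of them are preferred.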
  27. PIT – fast mutations for Java

  28. PIT – mutation generation
    • Bytecode manipulation
      – Instead of source code
    • No need to recompile modified classes
    • Mutation generation in almost no time
    • Harder to implement mutations
    • Not possible to cover some cases due to compiler optimizations
  29. General test selection techniques
    • Running all tests for every mutation – very inefficient
    • By naming convention
      – Tends to underestimate test suite effectiveness
      – Not all tests are written that way
      – Typos
    • Static call analysis
      – Problematic with polymorphism
      – Skips reflection calls
  30. PIT – effective test selection
    • Standard code coverage measurement first
    • Mutants with no coverage have to survive
      – No test even executes the given line
    • Test prioritization
      – Fast tests first
      – Stop when one fails
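The selection strategy on this slide can be sketched in a few lines. This is not PIT's code: the `TestCase` model, the line numbers, the timings and the fixed `failsOnMutant` flag are all assumptions invented for the illustration. It only shows the order of decisions: filter by coverage, run the fastest tests first, stop at the first failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of coverage-based test selection; all names and data are hypothetical.
class TestSelectionSketch {

    enum Result { NO_COVERAGE, KILLED, SURVIVED }

    static final class TestCase {
        final String name;
        final long millis;               // measured duration of the test
        final Set<Integer> coveredLines; // lines the test executes
        final boolean failsOnMutant;     // stands in for actually running the test

        TestCase(String name, long millis, boolean failsOnMutant, Integer... coveredLines) {
            this.name = name;
            this.millis = millis;
            this.failsOnMutant = failsOnMutant;
            this.coveredLines = new HashSet<>(Arrays.asList(coveredLines));
        }
    }

    // Analyses one mutant; 'executed' records which tests were actually run.
    static Result analyse(int mutatedLine, List<TestCase> tests, List<String> executed) {
        List<TestCase> candidates = new ArrayList<>();
        for (TestCase test : tests) {
            if (test.coveredLines.contains(mutatedLine)) {
                candidates.add(test);
            }
        }
        if (candidates.isEmpty()) {
            return Result.NO_COVERAGE;   // no test executes the line, nothing to run
        }
        candidates.sort(Comparator.comparingLong((TestCase t) -> t.millis)); // fast tests first
        for (TestCase test : candidates) {
            executed.add(test.name);
            if (test.failsOnMutant) {
                return Result.KILLED;    // stop at the first failing test
            }
        }
        return Result.SURVIVED;
    }

    static List<TestCase> sampleTests() {
        return Arrays.asList(
                new TestCase("slowIntegrationTest", 900, true, 1, 2, 3),
                new TestCase("fastUnitTest", 5, true, 2),
                new TestCase("lenientTest", 10, false, 7));
    }

    public static void main(String[] args) {
        for (int line : new int[] {2, 5, 7}) {
            List<String> executed = new ArrayList<>();
            Result result = analyse(line, sampleTests(), executed);
            System.out.println("Mutant in line " + line + ": " + result + ", tests run: " + executed);
        }
    }
}
```

For the mutant in line 2 only the 5 ms unit test runs; the 900 ms integration test covering the same line is never started because the mutant is already killed.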
  31. PIT – parallel test execution
    • Multiple tests run simultaneously
    • Modern laptops have 2 or 4 cores
    • Ideal for a unit test suite
    • Can decrease execution time dramatically
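Mutants are independent of each other, which is what makes this kind of work easy to spread across cores. The sketch below illustrates the general idea with a standard `ExecutorService`; it is not PIT's implementation, and the doubling function, its one-check suite and the four hand-written mutants are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntUnaryOperator;

// Sketch only: each mutant is analysed in its own task on a fixed thread pool.
class ParallelMutantRunner {

    // Hypothetical suite for a "double the value" function: passes if f(2) == 4.
    static boolean suitePasses(IntUnaryOperator code) {
        return code.applyAsInt(2) == 4;
    }

    // Hand-written mutants of x -> x * 2.
    static List<IntUnaryOperator> sampleMutants() {
        return Arrays.<IntUnaryOperator>asList(
                x -> x + 2,   // survives: 2 + 2 == 4
                x -> x * x,   // survives: 2 * 2 == 4
                x -> x - 2,   // killed
                x -> x / 2);  // killed
    }

    static int countKilled(List<IntUnaryOperator> mutants, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (IntUnaryOperator mutant : mutants) {
                tasks.add(() -> !suitePasses(mutant)); // true means killed
            }
            int killed = 0;
            for (Future<Boolean> outcome : pool.invokeAll(tasks)) {
                if (outcome.get()) {
                    killed++;
                }
            }
            return killed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Mutant analysis failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Killed with 4 threads: " + countKilled(sampleMutants(), 4));
    }
}
```

The result does not depend on the number of threads; only the wall-clock time does, which matters once each task is a real test run rather than a single comparison.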
  32. PIT – incremental build
    • Mutation testing only for modified code
    • Local changes in SCM
      – For developers
    • Classes modified since the last execution
      – For the CI server
    • Ideal for a large codebase
  33. PIT – fast mutations for Java
    • Bytecode manipulation
    • Mutation only of lines with standard coverage
    • Execution of related tests only
    • Parallel execution
    • Incremental analysis
    http://carhumor.net/blast-from-the-past/
  34. PIT – fast mutations for Java
    • Remember AssertJ and the brute force method?
      – Over 56 hours
    • With PIT using 4 threads it is just ~6 minutes
      – On my 3-year-old laptop
  35. PIT – rich ecosystem: TestNG, Spock (logos taken from the home pages of the mentioned projects)
  36. PIT – pros • Fast • Powerful • Widely supported
  37. PIT – cons
    • No mutation of production code in other JVM languages
      – But tests in Spock are completely fine
    • Generated reports do not show coverage changes between executions
      – The plugin for SonarQube can mitigate that
  38. Some Java alternatives
    • Javalanche – small ecosystem, last commit in 2012
    • µJava – limited access to source code (before April 2015), purely academic project
    • Judy – no source code available, small ecosystem
    • Jester – first tool for Java, no longer maintained
    • Jumble – only occasional updates, small ecosystem
  39. Other languages: selected tools
    • Mutant – Ruby – actively developed
    • grunt-mutation-testing – JavaScript – actively developed
    • MutPy – Python – no longer maintained
    • Mutagenesis – PHP – low activity (somewhat maintained in the judgedim fork)
    • NinjaTurtles – .NET – last commit in 2012
    • Visualmutator – .NET – out-of-the-box integration with Visual Studio
  40. Mutants in action http://www.adolescentadulthood.com/2013/01/23/how-did-the-teenage-mutant-ninja-turtles-get-their-names/

  41. What can you get?
    • Better code quality
    • Fewer bugs in production
    • Job satisfaction
    • ... (other benefits of writing testable code)
    • Information on how good your tests really are
    • Places in the code that are not properly tested
      – Better than with "normal" code coverage
  42. Is it a MUST for any project?
    • Short answer – no
    • It is an advanced technique/tool
      – Many projects don't even measure standard code coverage
      – Or code quality at all
    • There are other tools/techniques to start with that are easier to introduce (and maintain)
      – Static code analysis
      – Standard code coverage
      – Pair programming or peer review
  43. When to use it?
    • Greenfield project developed with high quality in mind
    • High coverage, but still bugs in production that could (and should) be detected by tests
    • Doubts about test suite quality
      – HLD requirement – 95% minimum code coverage level, with a development team that has no experience in automated testing
    • Improving a legacy system with low code quality and/or without tests
  44. Prepare your project
    • Write automated tests
    • Write fast automated unit tests (not only slow integration ones)
    • Separate fast unit tests from slow integration tests
      – Be able to run only a selected group of tests
    • Introduce basic code quality measurement techniques
    This presentation is available under the terms of Creative Commons Attribution-NonCommercial-ShareAlike 3.0 (with the exclusion of parts created by other people, including photos). Version 1.1.3-frogs.
  45. Does anyone use it in a real-life project? Yes :-)
    • British Sky Broadcasting
    • TheLadders
    • Large Hadron Collider at CERN
    • Maybe you?
    http://www.mysciencework.com/fr/MyScienceNews/10027/de-l-in-opportunite-des-open-spaces-dans-les-labos
  46. Summary of benefits
    Verification of the effectiveness of automated tests
    More reliable code
    Less trouble at work
    More time for interesting things
    Increased job satisfaction
  47. Thank you for your attention Marcin Zajączkowski m.zajaczkowski@solidsoft.info http://codearte.io/ http://blog.solidsoft.info/

    @SolidSoftBlog
  48. Before questions... … immediate feedback

  49. Questions?