Practical Unit Testing for Existing Codebases

Many applications have less-than-ideal unit test coverage. Some have no unit tests at all. Whatever your situation, this session will provide methods for incremental improvement, with the goal of reaching useful, sustainable unit testing practices that make sense for your team. We will cover practicalities like what to prioritize, how to safely refactor, how to find the time for writing unit tests, and how to minimize test maintenance hassles.

K. Devin McIntyre

October 17, 2019

Transcript

1. Practical Unit Testing for Existing Codebases
K. DEVIN MCINTYRE @KDEVINMCINTYRE
HTTPS://SPEAKERDECK.COM/MIYASUDOKORO
HTTPS://GITHUB.COM/MIYASUDOKORO/UNIT-TEST-DEMO
CONNECT.TECH 2019
2. Overview
◦ Blocker: Time
   ◦ Urgency of features has left quality cut short
   ◦ Team or individual process is too tightly scheduled
◦ Blocker: Practical knowledge
   ◦ Poor/nonexistent training in unit testing
◦ Blocker: Existing codebase
   ◦ What should be prioritized
   ◦ How to refactor safely
   ◦ Examples – based on a real untested app
◦ Blocker: Team & management buy-in – time allowing
3. Example of a unit-test-free process
1. Prototype / First draft
   ◦ QA has nothing to do
2. Trial-and-Error
   ◦ Trial: Run the app / debug the code.
   ◦ Error: See a defect / see functionality missing.
   ◦ Code: Fix the defect / add missing functionality.
3. Confirmation / QA
   ◦ QA starts manual testing
   ◦ Automated tests, unit tests, refactoring, etc. stuffed in here if time allows
4. Habit shift: Test-During
1. Prototype / First draft
   ◦ QA starts writing automated system tests (instead of having nothing to do)
2. Trial-and-Error
   ◦ Trial: Run the app / debug the code, write some unit tests, and run the full suite.
   ◦ Error: See a test failure (rather than a defect) / see functionality missing.
   ◦ Code: Fix the test failure / add missing functionality.
   ◦ Refactor: Make sure coupled tests + code follow best practices.
3. Confirmation / QA
   ◦ QA finishes the automated system tests
   ◦ Manual tests still exist but should be fewer
5. Why use Test-During?
◦ Productivity increases because unit tests are faster than manual runs
◦ You gain experience in unit testing without interrupting your work
◦ It is easier to change habits gradually
◦ You can start testing your logic immediately – find holes faster
◦ Don’t need to keep double-checking your work; it automatically retests itself
6. Test-During -> Test-Driven
1. Prototype / First draft <- throw out this code
   ◦ QA starts writing automated system tests and watches them fail
2. Trial-and-Error <- much shorter increments
   1. Trial: Write one unit test (rather than several) for missing functionality and run the full suite.
   2. Error: See a test failure.
   3. Code: Fix the test failure with just enough code for the test to pass.
   4. Refactor: Make sure coupled tests + code follow best practices.
3. Confirmation / QA
   ◦ The automated tests that were failing now pass
7. Starting Test-Driven Development
1. Decide whether you want to – it’s a design method, not a test method
2. Gain experience with unit testing best practices.
   ◦ Poorly-written unit tests lock in the bad rather than uncovering it
3. Understand how unit tests and code reflect each other.
4. Mentally prepare for continuous refactoring, redesign, and “failure.”
5. Try it with defects first.
Read http://neopragma.com/index.php/2019/09/29/against-tdd/
◦ which is not actually against TDD
◦ and see other resources at the end – both for and against TDD
8. Choosing a test framework
◦ What are other people using for apps similar to yours?
◦ Most frameworks have one or more recommended setups
   ◦ “Misuse” of framework features may make this difficult without refactoring
◦ Personal recommendation: Mocha / Chai / Sinon
   ◦ Flexible enough for any framework, in UI or Node
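Below is a minimal sketch of the recommended Mocha / Chai / Sinon setup, assuming the three packages are installed as dev dependencies (npm install --save-dev mocha chai sinon) and tests live under test/:

    // test/smoke.spec.js – run with: npx mocha test/smoke.spec.js
    const { expect } = require( 'chai' );
    const sinon = require( 'sinon' );

    describe( 'framework smoke test', () => {
        it( 'wires up chai assertions', () => {
            expect( 1 + 1 ).to.equal( 2 );
        } );

        it( 'wires up sinon stubs', () => {
            const stub = sinon.stub().returns( 42 );
            expect( stub() ).to.equal( 42 );
            expect( stub.calledOnce ).to.equal( true );
        } );
    } );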
9. Unit tests in browsers? Maybe not
◦ Tests can stop running due to browser updates
◦ Node is faster because it avoids browser startup time
◦ Considerations:
   ◦ Do you have good functional / system tests against the UI?
   ◦ Does your code handle its own cross-browser support?
   ◦ Did you roll your own UI framework?
◦ If you want to do it, you may need to experiment with test frameworks
10. Testing UI code in Node
◦ Mimic the UI environment with JSDOM
   ◦ Pro: Provides a fake DOM that is good enough for most unit testing
   ◦ Pro: Can load your UI files directly into mock windows
   ◦ Con: Not enough documentation
   ◦ Con: Too many outdated posts in blogs/Stack Overflow using the old API
◦ Support Node natively, then convert files during the build process
   ◦ There are many tools / frameworks for this for a reason
   ◦ “Agnostic” files that run in either environment can exist in any project
◦ Use functional / system tests of the UI to complete your test coverage
   ◦ WDIO, Selenium, CasperJS, Katalon, TestComplete, Browsera …
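A minimal sketch of the JSDOM approach described above, using a hypothetical renderGreeting function as the UI code under test (the JSDOM constructor and its window/document properties are real API):

    const { JSDOM } = require( 'jsdom' );
    const { expect } = require( 'chai' );

    // hypothetical UI code under test
    function renderGreeting( document, name ) {
        document.getElementById( 'greeting' ).textContent = 'Hello, ' + name;
    }

    describe( 'renderGreeting', () => {
        let dom;

        beforeEach( () => {
            // a fresh fake DOM per test keeps state from bleeding between tests
            dom = new JSDOM( '<!DOCTYPE html><div id="greeting"></div>' );
        } );

        it( 'writes the greeting into #greeting', () => {
            renderGreeting( dom.window.document, 'Devin' );
            expect( dom.window.document.getElementById( 'greeting' ).textContent )
                .to.equal( 'Hello, Devin' );
        } );
    } );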
11. What to unit test
1. “Contracts” of public functions
   ◦ If X goes into function A, the output will be Y
   ◦ Given we are in X state, if we call function A, the state will change to Y
2. Logic branches
   ◦ If-else, switch, try-catch, etc.
   ◦ The various paths going through private methods
◦ “Public” = functions called by code not defined in this file
   ◦ This includes callbacks, promises, observables, etc., even if anonymous
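As an illustration of a “contract” test, here is a sketch against a hypothetical pure function: if X goes in, the output will be Y.

    const { expect } = require( 'chai' );

    // hypothetical pure function under test
    function toCelsius( fahrenheit ) {
        return ( fahrenheit - 32 ) * 5 / 9;
    }

    describe( 'toCelsius', () => {
        it( 'converts 212°F to 100°C', () => {
            expect( toCelsius( 212 ) ).to.equal( 100 );
        } );

        it( 'converts 32°F to 0°C', () => {
            expect( toCelsius( 32 ) ).to.equal( 0 );
        } );
    } );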
12. What not to unit test
◦ Every possible combination of parameters, logic, etc.
◦ Third-party anything
   ◦ Trust frameworks, APIs, dependencies, etc. to handle themselves
◦ Theoretical ways things could be used but probably never will be
   ◦ Do test your boundary conditions, though
◦ Typically, the “contracts” of a private method
   ◦ Cost: extra maintenance
   ◦ Benefit: can more easily isolate logic branches
13. Structure
◦ One “setup” file to set up global state
◦ One “spec” file per source code file
◦ One “describe” block per state of app
   ◦ Complex states -> nested describes
◦ One “it” block per call of a function under test
◦ Assert/expect choices:
   A. One “assert/expect” statement per “it”
   B. Use enough “assert/expect” statements to fully query the end state or output
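A sketch of that structure with a hypothetical Cart module: one describe per app state, one it per call of the function under test.

    const { expect } = require( 'chai' );

    // hypothetical module under test
    class Cart {
        constructor() { this.items = []; }
        add( item ) { this.items.push( item ); }
        total() { return this.items.reduce( ( sum, i ) => sum + i.price, 0 ); }
    }

    describe( 'Cart', () => {
        describe( 'when empty', () => {
            let cart;
            beforeEach( () => { cart = new Cart(); } );

            it( 'reports a total of 0', () => {
                expect( cart.total() ).to.equal( 0 );
            } );
        } );

        describe( 'when holding one item', () => {
            let cart;
            beforeEach( () => {
                cart = new Cart();
                cart.add( { name: 'widget', price: 5 } );
            } );

            it( 'reports the item price as the total', () => {
                expect( cart.total() ).to.equal( 5 );
            } );
        } );
    } );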
14. State control
◦ Do not let state bleed between individual tests
   ◦ Will cause random or mysterious test failures
   ◦ Tests don’t necessarily run in the same order every time – this is a feature
◦ Must restore to previous state after each “describe” or “it” finishes
   ◦ Use “beforeEach” and “afterEach” to create and destroy states for each “it”
◦ Avoid pollution of global objects
◦ You may need to add “destroy” / “reset” functions
   ◦ Memory leaks will break your tests; you must truly discard everything
   ◦ Remove all event listeners
   ◦ All timeouts, intervals, and other asynchronous processes must be finished
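A sketch of per-test state control using Mocha’s beforeEach/afterEach with a Sinon sandbox (createSandbox, useFakeTimers, and restore are real Sinon API):

    const sinon = require( 'sinon' );
    const { expect } = require( 'chai' );

    describe( 'state control', () => {
        let sandbox;
        let clock;

        beforeEach( () => {
            sandbox = sinon.createSandbox();
            clock = sandbox.useFakeTimers(); // timeouts/intervals cannot leak out
        } );

        afterEach( () => {
            sandbox.restore(); // undoes every stub, spy, and fake timer
        } );

        it( 'finishes its own asynchronous work', () => {
            const spy = sinon.spy();
            setTimeout( spy, 1000 );
            clock.tick( 1000 ); // the timeout fires now, inside this test
            expect( spy.calledOnce ).to.equal( true );
        } );
    } );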
15. Stubs and mocks
◦ Mock: a whole object pretending to be another object
   ◦ Data being passed around (e.g. ajax call responses, AWS lambda events)
   ◦ A fake version of a third-party library that you swap in (e.g. on global scope)
◦ Stub/spy: temporarily replacing an object’s method
   ◦ Find out what parameters were passed in; return whatever you want
   ◦ Stub/spy libraries apply and remove them for you
      ◦ Jasmine: spyOn( jQuery, 'ajax' ).and.callFake( myFakeAjaxFunction )
      ◦ Sinon: sinon.stub( jQuery, 'ajax' ).callsFake( myFakeAjaxFunction )
   ◦ Easier and more flexible than mocks
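The Sinon one-liner above, in context: a sketch that stubs a method on a hypothetical api object, returns canned data, and queries the recorded call.

    const sinon = require( 'sinon' );
    const { expect } = require( 'chai' );

    // hypothetical object whose method we temporarily replace
    const api = {
        fetchUser( id ) { /* the real version would hit the network */ }
    };

    describe( 'stubbing a method', () => {
        afterEach( () => {
            sinon.restore(); // removes the stub so state cannot bleed
        } );

        it( 'returns whatever you want and records its calls', () => {
            sinon.stub( api, 'fetchUser' ).callsFake( () => ( { name: 'Ada' } ) );

            const user = api.fetchUser( 7 );

            expect( user.name ).to.equal( 'Ada' );
            expect( api.fetchUser.calledWith( 7 ) ).to.equal( true );
        } );
    } );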
16. Schools of thought
LONDON / MOCKIST
◦ Should mock outside resources and usually dependent modules
◦ Tests involving dependencies are …
   ◦ technically integration tests because multiple modules were involved
   ◦ used sparingly to augment normal unit tests
◦ Isolating source code using mocks …
   ◦ facilitates refactoring by narrowing scope
   ◦ prevents redundant code coverage
DETROIT / CLASSICAL
◦ Should mock outside resources but not dependent modules
◦ Tests involving dependencies are …
   ◦ unit tests because only one function call occurred
   ◦ ideal because they are closest to reality
◦ Isolating source code using mocks …
   ◦ increases overhead of test creation
   ◦ risks mocks becoming out-of-date
17. Use the London school
◦ Isolation of files lets you work on one thing at a time
◦ Easier to control state
◦ No need to understand 100% of logic to write each test
◦ No need to force every file in the system to conform to unit test structure
◦ Clearer boundaries of what code has been tested vs covered
   ◦ Code coverage tools only tell you whether lines of code are reached, not tested
18. Basic London method
◦ Stub/mock dependencies, outside or inside
   ◦ Prefer stubs over mocks – easier to remove later, better reflection of true code
   ◦ Prefer open-source mocks over your own
◦ Always query your stubs to be sure they received the right parameters
   ◦ Each stub is tightly coupled; you must update stubs if you update the code under them
◦ Add some cross-file “integration” tests sparingly
   ◦ Use to study tightly-coupled areas
   ◦ Use to double-check that your stubs/mocks are correct
   ◦ Probably not possible until some level of code coverage is reached in both files
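A sketch of the “always query your stubs” rule, using a hypothetical sendReport function whose mailer dependency is injected (tools such as proxyquire do the same for hard-wired require() calls):

    const sinon = require( 'sinon' );
    const { expect } = require( 'chai' );

    // hypothetical unit under test with an injected dependency
    function sendReport( mailer, report ) {
        return mailer.send( 'reports@example.com', JSON.stringify( report ) );
    }

    describe( 'sendReport', () => {
        it( 'passes the right parameters to the mailer', () => {
            const mailer = { send: sinon.stub().returns( true ) };

            sendReport( mailer, { total: 3 } );

            // query the stub for the parameters it received
            expect( mailer.send.calledOnceWith(
                'reports@example.com', '{"total":3}'
            ) ).to.equal( true );
        } );
    } );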
19. Setup file
◦ The test runner loads code files; your code does not
   ◦ Therefore, you can’t test whatever gets your app loaded up
   ◦ Do not attempt to directly test your initial load/config stage file(s) yet
1. Manually re-create the app-wide global state as it is before any of the code to be tested runs (hopefully this is small)
2. Start writing tests for app code; the next step can wait
3. When ready, refactor as much of your initial load/configuration logic as possible into testable pieces, probably in separate file(s)
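A sketch of such a setup file, assuming Mocha 6+ where test/setup.js can be loaded before the suites via the "file" option in .mocharc.json; the APP_CONFIG global is a hypothetical example of app-wide state that exists before any code under test runs:

    // .mocharc.json: { "file": [ "test/setup.js" ] }

    // test/setup.js – manually re-create pre-existing app-wide state
    global.APP_CONFIG = {
        apiBase: 'https://example.test', // hypothetical values
        featureFlags: {}
    };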
20. Helper files
◦ Used for difficult source code files
   ◦ Broad state side effects, e.g. global pollution
   ◦ Cannot reach dependencies to mock/stub them
   ◦ Spaghetti, ball of mud, etc.
◦ Move logic out of the difficult file into this one
   ◦ Write unit tests against the helper file
◦ Only the one difficult file uses the helper as a dependency
   ◦ This is just an extension of a single file; think of it as private to that file
◦ Intended as a temporary refactoring tool
   ◦ Ideally, you refactor the difficult source code file and remove the helper
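A sketch of the helper-file pattern, with a hypothetical hard-to-test legacy.js: pure logic moves into a helper that only legacy.js depends on.

    // legacy-helper.js – logic extracted from legacy.js so it can be unit tested
    function nextRetryDelay( attempt ) {
        // exponential backoff capped at 30 seconds
        return Math.min( 1000 * 2 ** attempt, 30000 );
    }
    module.exports = { nextRetryDelay };

    // legacy.js keeps its public contract and simply delegates:
    //   const { nextRetryDelay } = require( './legacy-helper' );
    //   setTimeout( retry, nextRetryDelay( attempt ) );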
21. Cleanup code
◦ Functions like “destroy,” “reset,” etc. that clear state at the end of a unit test
   ◦ Need to remove listeners, clear timeouts/intervals, set object pointers to undefined
◦ Added into your real source code
   ◦ Benefit: prevention of memory leaks if you call them in real app execution
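A sketch of cleanup code added to real source, assuming a hypothetical widget that registers a listener and an interval:

    // hypothetical widget with a destroy() added for test (and app) cleanup
    class TickerWidget {
        constructor( element ) {
            this.element = element;
            this.onClick = () => this.refresh();
            this.element.addEventListener( 'click', this.onClick );
            this.interval = setInterval( () => this.refresh(), 1000 );
        }

        refresh() { /* ... */ }

        destroy() {
            // remove listeners, clear intervals, drop object pointers
            this.element.removeEventListener( 'click', this.onClick );
            clearInterval( this.interval );
            this.element = undefined;
        }
    }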
22. Refactoring catch-22
A. Few untested legacy systems are written in a testable manner, so they must be refactored so they can be unit tested
B. Refactoring legacy code could break it, and without unit tests, it’s hard to know whether you’ve broken anything
23. Refactoring a file for unit tests
1. Strictly maintain all public contracts within the file
2. Only add or move code; do not remove or change anything (yet)
   ◦ E.g. helper file, cleanup code, extra getters
3. Focus on:
   ◦ control of state
   ◦ splitting code into smaller, testable pieces
24. Be cautious
◦ Don’t rush to delete anything
◦ Avoid “improving” the code until you have good coverage
◦ Track the bugs you find as real defects/issues
   ◦ Management & teammates can track the results of the work
   ◦ You get to choose the right time to tackle the defect (maybe it’s not right now)
   ◦ If you have commented a defect number on a failing unit test in the codebase, your teammates will know that the test is known to be failing and not worry that they broke it
25. (1) “canary”
    describe( 'canary', () => {
        it( 'adds 2 + 2', () => {
            expect( 2 + 2 ).to.equal( 4 );
        } );
    } );
◦ If this test fails, you know you’ve broken your setup, failed to install something, etc.
◦ Keep it around. You never know what could break your test environment.
26. (2) Easy file(s)
◦ Pulling in one or two easy files lets you test whether you’ve set up the test environment correctly for your app.
◦ Some indications of an “easy” file:
   ◦ Other parts of your app call this code, not the other way around.
   ◦ It cares very little about the state of the app outside itself.
   ◦ It has lots of pure functions.
   ◦ It has points where you can easily mock data, such as to/from an API that is outside your code (server or third-party JS).
27. (3) Dependency tree roots
◦ Files that are most depended on should ideally gain test coverage first.
   ◦ Defects in them can affect multiple places in the application.
   ◦ Changing them can cause regression defects across the application.
◦ Look at your dependents (files using this file) to help think of test cases.
   ◦ Make test cases that reflect how the code is called in various ways.
◦ Work your way gradually to files further along the tree, in order of “reach” or importance
28. (4) Mission-critical features
◦ Wherever defect consequences are most severe
◦ High-usage areas
◦ Anything where money changes hands
◦ Anything that would greatly embarrass the business or your team
29. (5) Defect areas
◦ Add coverage for your worst (applicable) defect areas.
   ◦ Not all defects can be unit tested for, so avoid zealotry on this one
◦ As you fix new defects, add unit tests for them (if applicable).
   ◦ It can be helpful to put comments like // DE12121 on the test cases
   ◦ I don’t recommend putting the defect number in the test name; it’s just clutter
◦ If something is difficult to QA effectively, cover it proactively.
30. (6) Error handlers
◦ Those that are user-facing or otherwise important
◦ Those that communicate with third parties
◦ Not those that just log an error
◦ Why? Your integration / system / functional / manual tests will not be able to reach many of these under normal circumstances.
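A sketch of reaching an error handler that system tests rarely hit: force a stubbed dependency to fail (stub().rejects() is real Sinon API; loadProfile and its collaborators are hypothetical).

    const sinon = require( 'sinon' );
    const { expect } = require( 'chai' );

    // hypothetical unit under test with a user-facing error handler
    async function loadProfile( api, notifier ) {
        try {
            return await api.fetchProfile();
        } catch ( err ) {
            notifier.showError( 'Could not load your profile.' );
            return null;
        }
    }

    describe( 'loadProfile error handling', () => {
        it( 'shows a user-facing message when the API fails', async () => {
            const api = { fetchProfile: sinon.stub().rejects( new Error( 'boom' ) ) };
            const notifier = { showError: sinon.stub() };

            const result = await loadProfile( api, notifier );

            expect( result ).to.equal( null );
            expect( notifier.showError.calledWith( 'Could not load your profile.' ) ).to.equal( true );
        } );
    } );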
31. Examples contain …
Practical knowledge section
◦ Framework: Mocha / Chai / Sinon
◦ Using JSDOM for UI code
◦ Structure
◦ State control
◦ Stubs & mocks
Existing codebase section
◦ Use of London school
◦ Setup files
◦ Helper file / refactoring
◦ Cleanup code
◦ Prioritization
   1. Canary file
   2. Easy file
https://github.com/miyasudokoro/unit-test-demo
32. Gaining buy-in
◦ Be clear about both costs and benefits
◦ Discover everyone’s concerns up front and discuss them seriously
   ◦ Yes: “It seems you’re concerned about X.” Then listen!
   ◦ No: “Don’t worry about X.” “Y will handle X.” “Just trust the team/the process.”
◦ Do not assume “unit test” means the same thing to everyone
◦ Together, define a plan of action that ramps up gradually
◦ Read Never Split the Difference by Chris Voss
33. Return on investment: Pros
◦ Helps with some types of defects
   ◦ Logic: prevent most defects of pure logic failure
   ◦ Regression: prevent some regression defects
   ◦ Integration: depends on the reliability of mocks/stubs and understanding of the order of events
   ◦ Cosmetic: nope
34. Return on investment: Pros
◦ Confidence in your code’s logic is a stress reliever
◦ Encourages good practices for code quality
   ◦ KISS
   ◦ Loose coupling
   ◦ Single Responsibility Principle
   ◦ Avoiding state side effects / maximizing number of pure functions
35. Return on investment: Cons
◦ Delay of your product may mean loss of market share
   ◦ Sometimes getting it out there fast and fixing defects later is the right business strategy
◦ Complacency in QA / too much trust of unit tests
◦ Unskilled / untrained developers forced to write unit tests:
   ◦ Loss of productivity
   ◦ Frustration / low morale
   ◦ Poorly-written unit tests => maintenance cost for everyone
◦ Unit test maintenance always adds some overhead
◦ Difficult to study unit testing empirically; mixed results
   ◦ See references
36. Further reading
Brandes, Ross. "London School TDD." 23 August 2019. GitHub / testdouble / contributing-tests. Accessed 7 October 2019. <https://github.com/testdouble/contributing-tests/wiki/London-school-TDD>.
Dalling, Tom. "Wasting Time TDDing The Wrong Things." 11 October 2016. Accessed 30 September 2019. <https://www.rubypigeon.com/posts/wasting-time-tdd-the-wrong-things/>.
Fischer, Tom. "Unit Testing Myths and Practices." 5 January 2012. Accessed 6 October 2019. <https://www.red-gate.com/simple-talk/dotnet/net-framework/unit-testing-myths-and-practices/>.
Fowler, Martin. "Test Coverage." n.d. Accessed 30 September 2019. <https://martinfowler.com/bliki/TestCoverage.html>.
Gren, Lucas and Vard Antinyan. "On the Relation Between Unit Testing and Code Quality." Euromicro Conference on Software Engineering and Advanced Applications (SEAA 2017), Vienna, Austria (2017).
Hassan, Ahmed, Emad Shihab, Zhen Ming Jiang, Bram Adams and Robert Bowerman. "Prioritizing the creation of unit tests in legacy software systems." Software Practice and Experience (2010): 1-22.
Melnik, Grigori and Ron Jeffries. "Guest Editors' Introduction: TDD--The Art of Fearless Programming." IEEE Software 24 (2007): 24-30. <https://www.computer.org/csdl/magazine/so/2007/03/s3024/13rRUygT7kK>.
Moonen, Leon and Arie van Deursen. "The Video Store Revisited -- Thoughts on Refactoring and Testing." 2002. Accessed 12 August 2019. <https://www.academia.edu/31982330/The_Video_Store_Revisited_-_Thoughts_on_Refactoring_and_Testing>.
Nicolette, Dave. "Against TDD." 29 September 2019. Accessed 5 October 2019. <http://neopragma.com/index.php/2019/09/29/against-tdd/>.
Searls, Justin. "Detroit School TDD." 24 August 2015. GitHub / testdouble / contributing-tests. Accessed 7 October 2019. <https://github.com/testdouble/contributing-tests/wiki/Detroit-school-TDD>.
Torkar, R., S. Mankefors, K. Hansson and A. Jonsson. "An Exploratory Study of Component Reliability Using Unit Testing." Proceedings of the 14th International Symposium on Software Reliability Engineering. 2003.
Voss, Chris and Tahl Raz. Never Split the Difference: Negotiating As If Your Life Depended On It. First edition. New York, NY: Harper Business, 2016.
Warne, Henrik. "A Response to 'Why Most Unit Testing is Waste'." 4 September 2014. Accessed 12 August 2019. <https://henrikwarne.com/2014/09/04/a-response-to-why-most-unit-testing-is-waste/>.
Williams, Laurie, Gunnar Kudrjavets and Nachiappan Nagappan. "On the Effectiveness of Unit Test Automation at Microsoft." Empirical Software Engineering 13.3. Accessed 4 October 2019. <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.648.9924&rep=rep1&type=pdf>.
HTTPS://SPEAKERDECK.COM/MIYASUDOKORO