Test code engineering - Workshop at University of Innsbruck (2020)

Topics:

The testing pyramid
Design for testability
** Dependency injection
** Ports and Adapters
Mocking
** To facilitate testing (more control, more observability)
** To explore boundaries (in a TDD fashion)
TDD
Test code best (and bad) practices
** FIRST principles
** Kent Beck’s test desiderata
** Test smells
Test design patterns
** Abstractions
** Builders
** Page Objects

Mauricio Aniche

March 06, 2020

Transcript

  1. 4.

    Topics of today
    • The testing pyramid
    • Design for testability
      • Dependency injection
      • Ports and Adapters
    • Mocking
      • To facilitate testing (more control, more observability)
      • To explore boundaries (in a TDD fashion)
    • TDD
    • Test code best (and bad) practices
      • FIRST principles
      • Kent Beck’s test desiderata
      • Test smells
    • Test design patterns
      • Abstractions
      • Builders
      • Page Objects
  2. 5.

    Experience matters!
    Yu, C. S., Treude, C., & Aniche, M. (2019, July). Comprehending Test Code: An Empirical Study. In 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME) (pp. 501-512). IEEE.
  3. 6.

    Study material
    • Martin Fowler’s wiki: https://martinfowler.com/tags/testing.html
      • Simple to read and full of pragmatic advice!
    • Hevery, Misko. The testability guide. http://misko.hevery.com/attachments/Guide-Writing%20Testable%20Code.pdf
    • Meszaros, Gerard. xUnit test patterns: Refactoring test code. Pearson Education, 2007.
    • Freeman, Steve, and Nat Pryce. Growing object-oriented software, guided by tests. Pearson Education, 2009.
    • Beck, Kent. Test-driven development: by example. Addison-Wesley Professional, 2003.
    • Hunt, Andy, and Dave Thomas. Pragmatic unit testing in Java with JUnit. The Pragmatic Bookshelf, 2003.
    • My lecture notes: https://sttp.site
  4. 7.

    Getting to know you
    • Are you familiar with automated tests?
    • How often do you write automated tests?
    • Now, tell me about your pains… :)
  5. 9.

    Some definitions of unit testing
    • ISTQB: “Searches for defects in, and verifies the functioning of software items (e.g., modules, programs, objects, classes, etc.) that are separately testable”.
    • Osherove: “A unit test is an automated piece of code that invokes a unit of work in the system and then checks a single assumption about the behavior of that unit of work. [...] A unit of work is a single logical functional use case in the system that can be invoked by some public interface (in most cases). A unit of work can span a single method, a whole class or multiple classes working together to achieve one single logical purpose that can be verified.”
  6. 10.

    Advantages
    • Very fast
    • Easy to control
    • Easy to write
    Disadvantages
    • Less real
    • Some bugs can’t be reproduced at such a level
  7. 13.

    How I (Maurício) do the trade-off
    • Unit tests: all business rules should be tested here.
    • Integration tests: complex integrations with external services.
    • System tests: main/risky flow of the app tested.
    • Manual: exploratory tests.
  8. 14.

    The ice-cream cone anti-pattern
    [Figure: the testing pyramid (unit tests, integration tests, system tests, manual) compared with the ice-cream cone (manual, GUI tests, system tests, integration tests, unit tests).]
  9. 20.

    What should we do to test a class without its dependencies?
    [Diagram: classes A, B, and C, a database (DB), and other dependencies (etc.).]
    How to write unit tests for A?
  10. 21.

    We simulate the dependencies!
    • Fast
    • Full control
    [Diagram: A together with B’ and C’.]
    B’ and C’ are (lightweight) simulations of B and C, respectively.
  11. 22.

    Mock Objects
    Mock objects are objects that mimic the behavior of real objects/dependencies, easing their controllability and observability.
  12. 23.

    Why do I want to control my dependencies?
    • To easily simulate exceptions
    • To easily simulate database access
    • To easily simulate the interaction with any other infrastructure
    • To avoid building complex objects
    • To control third-party libraries that I do not own
    [Diagram: classes A, B, and C, a database (DB), and other dependencies (etc.).]
  13. 24.

    Let’s code!
    I want to filter all the invoices whose value is smaller than 100.0. Invoices come from the database.
    Code: https://gist.github.com/mauricioaniche/03ee12e64d734e7ea370eceb68fe6676
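
A minimal sketch of the exercise, written for this transcript and not taken from the linked gist. The names `Invoice`, `InvoiceDao`, `InvoiceFilter` and `InvoiceFilterCheck` are assumptions, and the check uses plain Java instead of JUnit and Mockito so that it stays self-contained. The point it illustrates: because the database access sits behind an injected interface, the test can replace it with a lightweight simulation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical domain class: only the value matters for this example.
class Invoice {
    private final String customer;
    private final double value;

    Invoice(String customer, double value) {
        this.customer = customer;
        this.value = value;
    }

    String getCustomer() { return customer; }
    double getValue() { return value; }
}

// The dependency that would normally talk to the database.
interface InvoiceDao {
    List<Invoice> all();
}

// Class under test: the DAO is injected, so a test can replace it.
class InvoiceFilter {
    private final InvoiceDao dao;

    InvoiceFilter(InvoiceDao dao) {
        this.dao = dao;
    }

    List<Invoice> lowValueInvoices() {
        List<Invoice> result = new ArrayList<>();
        for (Invoice invoice : dao.all()) {
            if (invoice.getValue() < 100.0) {
                result.add(invoice);
            }
        }
        return result;
    }
}

// Plain-Java check standing in for a unit test.
class InvoiceFilterCheck {
    static void filtersInvoicesBelow100() {
        Invoice small = new Invoice("Mauricio", 20.0);
        Invoice boundary = new Invoice("Arie", 100.0);
        Invoice large = new Invoice("Steve", 300.0);

        // Hand-written simulation of the DAO: fast and fully controlled.
        InvoiceDao stubDao = () -> Arrays.asList(small, boundary, large);

        List<Invoice> result = new InvoiceFilter(stubDao).lowValueInvoices();

        if (result.size() != 1 || result.get(0) != small) {
            throw new AssertionError("expected only the invoice of 20.0");
        }
    }
}
```

In a real project the lambda would typically be replaced by a mock created with a mocking library, and the check by a test method in a test framework.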
  14. 25.

    To mock or not to mock?
    • Developers have mixed feelings about the usage of mock objects.
    • Can you see the advantages and the disadvantages of using mocks?
      • Advantages: easy to control dependencies, easy to automate test cases.
      • Disadvantages: not very real, integration problems might go unnoticed.
    • At the end of the day, it’s about using the right tool at the right moment.
  15. 26.

    [Figure: bar chart of the percentage of mocked versus non-mocked dependencies per category: database, web service, external dependencies, domain object, Java libraries, test support.]
    Spadini, D., Aniche, M., Bruntink, M., & Bacchelli, A. (2018). Mock objects for testing Java systems: Why and how developers use them, and how they evolve. Empirical Software Engineering, pp. 1-38.
  16. 27.

    When to mock?
    • We empirically see that infrastructure is often mocked.
    • There was no clear trend on domain objects.
    • Their practice: complicated/complex classes are mocked.
    • No mocks for libraries (e.g., lists or small util methods).
  17. 28.

    Mocks are introduced from the very beginning of the test class!
    Spadini, D., Aniche, M., Bruntink, M., & Bacchelli, A. (2018). Mock objects for testing Java systems: Why and how developers use them, and how they evolve. Empirical Software Engineering, pp. 1-38.
  18. 29.

    50% of changes in a mock occur because the production code changed! Coupling is strong!
    Spadini, D., Aniche, M., Bruntink, M., & Bacchelli, A. (2018). Mock objects for testing Java systems: Why and how developers use them, and how they evolve. Empirical Software Engineering, pp. 1-38.
  19. 30.

    Developers are aware of the trade-offs!
    • They mock databases, but then later write integration tests.
    • Added “complexity” in exchange for testability.
    • They understand how coupled they are when they use mocks.
    Spadini, D., Aniche, M., Bruntink, M., & Bacchelli, A. To Mock or Not To Mock? An Empirical Study on Mocking Practices. MSR 2017.
    Spadini, D., Aniche, M., Bruntink, M., & Bacchelli, A. Mock Objects For Testing Java Systems: Why and How Developers Use Them, and How They Evolve. EMSE, 2018.
  20. 31.

    Mock as a way to explore the boundaries
    Freeman, Steve, and Nat Pryce. Growing object-oriented software, guided by tests. Pearson Education, 2009.
  21. 33.

    Testability and Good Design
    Feathers, Michael. The Deep Synergy between Testability and Good Design. Talk: https://www.youtube.com/watch?v=4cVZvoFGJTU
    Well-designed classes are naturally more testable. Good (class) design is key!
  22. 34.

    Controllability and Observability
    • Controllability determines the work it takes to set up and run test cases and the extent to which individual functions and features of the system under test (SUT) can be made to respond to test cases.
    • Observability determines the work it takes to set up and run test cases and the extent to which the response of the system under test (SUT) to test cases can be verified.
    • See Robert Binder’s post on Testability: https://robertvbinder.com/software-testability-part-2-controllability-and-observability/
  23. 40.

    Encapsulation
    • Behavior should be in the right place.
      • Otherwise, the test gets too complicated.
      • Indirect testing smell.
    • Law of Demeter
      • Avoid a.getB().getC().getD().doSomething();
      • Much harder to arrange the test, i.e., you have to arrange states in A, B, C, and D.
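
A small sketch of the Law of Demeter advice, under invented names (`Wallet`, `Customer`, `Checkout` are hypothetical and not from the slides). Instead of chaining getters, the caller asks its direct collaborator to do the work, so a test only has to arrange that one collaborator.

```java
// Hypothetical classes, invented for illustration.
class Wallet {
    private double balance;

    Wallet(double balance) { this.balance = balance; }

    double getBalance() { return balance; }

    void debit(double amount) {
        if (amount > balance) {
            throw new IllegalStateException("insufficient funds");
        }
        balance -= amount;
    }
}

class Customer {
    private final Wallet wallet;

    Customer(Wallet wallet) { this.wallet = wallet; }

    // Exposing the wallet invites chains such as customer.getWallet().debit(...).
    Wallet getWallet() { return wallet; }

    // Demeter-friendly alternative: the behavior lives next to the data.
    boolean canAfford(double amount) { return wallet.getBalance() >= amount; }

    void pay(double amount) { wallet.debit(amount); }
}

class Checkout {
    // Talks only to its direct collaborator, never to the wallet behind it.
    boolean charge(Customer customer, double price) {
        if (!customer.canAfford(price)) {
            return false;
        }
        customer.pay(price);
        return true;
    }
}
```

A test for `Checkout` now needs nothing beyond a `Customer`; if `Customer` were an interface, a single simulated object would be enough, rather than one per link in the chain.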
  24. 41.

    Should I test a private method?
    • Probably not.
    • You should test the class from its public interface.
    • If you want to test the private method in an isolated way, that’s the test telling you that your code is not cohesive.
  25. 42.

    Static methods
    • Now that you know mocks, what can we do with static methods?
    • Static methods can’t be mocked.
    • Therefore, AVOID THEM.
  26. 43.

    Don’t be afraid of creating layers
    • How can I test date/time related things?
    • How can I test environment-dependent things?
    • Create a layer on top of APIs
    • These layers are easily mockable
    • The Restfulie.NET case:
      • https://github.com/mauricioaniche/restfulie.net/blob/master/Restfulie.Server/Marshalling/UrlGenerators/AspNetMvcUrlGenerator.cs
  27. 45.

    Solution
    • We added a layer on top of Calendar.
    • Again, do not be afraid of adding layers to facilitate testability.
    • Solution: https://gist.github.com/mauricioaniche/2200d8c1dcab41e7c4dddcc46211f4c0
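
What such a layer on top of Calendar might look like is sketched below. This is an illustration written for this transcript, not the code from the gist: the `Clock` interface, `SystemClock`, `ChristmasDiscount` and the 15% discount rule are all assumptions. Production code would receive `SystemClock`; a test passes a fixed date instead.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Thin layer on top of Calendar (hypothetical name).
interface Clock {
    Calendar now();
}

// Production implementation: simply delegates to the real API.
class SystemClock implements Clock {
    public Calendar now() {
        return Calendar.getInstance();
    }
}

// Hypothetical class under test: gives 15% off on Christmas day.
class ChristmasDiscount {
    private final Clock clock;

    ChristmasDiscount(Clock clock) {
        this.clock = clock;
    }

    double applyDiscount(double amount) {
        Calendar today = clock.now();
        boolean isChristmas = today.get(Calendar.MONTH) == Calendar.DECEMBER
                && today.get(Calendar.DAY_OF_MONTH) == 25;
        return isChristmas ? amount * 0.85 : amount;
    }
}

// Plain-Java check standing in for a unit test.
class ChristmasDiscountCheck {
    static void discountOnlyOnChristmas() {
        // The test controls "today" without touching the system clock.
        Clock christmas = () -> new GregorianCalendar(2020, Calendar.DECEMBER, 25);
        Clock ordinaryDay = () -> new GregorianCalendar(2020, Calendar.MARCH, 6);

        double discounted = new ChristmasDiscount(christmas).applyDiscount(100.0);
        double regular = new ChristmasDiscount(ordinaryDay).applyDiscount(100.0);

        if (Math.abs(discounted - 85.0) > 1e-9) {
            throw new AssertionError("expected a 15% discount on Christmas");
        }
        if (Math.abs(regular - 100.0) > 1e-9) {
            throw new AssertionError("expected no discount on an ordinary day");
        }
    }
}
```

The same shape works for environment-dependent things in general: wrap the API in a small interface you own, and simulate that interface in tests.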
  28. 47.

    Some of the flaws listed in the Testability Guide, by Misko Hevery
    • Flaw: Constructor does Real Work
    • Flaw: Digging into Collaborators
    • Flaw: Brittle Global State & Singletons
    • Flaw: Class Does Too Much
    • His tool, Testability Explorer (https://github.com/mhevery/testability-explorer), never became a hit.
    • But the idea is awesome!
  29. 53.

    [F]IRST: Fast
    • Tests should be fast (in practice, hard to draw a line between fast and slow)
    • Minimize the amount of code that ultimately depends on slow things.
  30. 54.

    F[I]RST: Isolated
    • Good unit tests focus on a small chunk of code to verify.
    • Good unit tests also don’t depend on other unit tests.
    • Be careful with shared resources.
  31. 55.

    FI[R]ST: Repeatable
    • A repeatable test is one that produces the same results each time you run it.
    • To achieve that, tests must be well isolated.
  32. 56.

    FIR[S]T: Self-validating
    • Tests aren’t tests unless they assert that things went as expected.
    • Manually verifying the results of tests is time-consuming.
    • They must also be self-arranging.
  33. 57.

    FIRS[T]: Timely
    • Unit testing should become a habit. TEST INFECTED!
    • In practice, chances are low that you’ll find the time to come back and write tests.
  34. 58.

    Kent Beck’s test desiderata
    • Isolated: tests should return the same results regardless of the order in which they are run.
    • Composable: if tests are isolated, then I can run 1 or 10 or 100 or 1,000,000 and get the same results.
    • Fast: tests should run quickly.
    • Inspiring: passing the tests should inspire confidence.
    • Writable: tests should be cheap to write relative to the cost of the code being tested.
    • Readable: tests should be comprehensible for the reader, invoking the motivation for writing this particular test.
    • Behavioural: tests should be sensitive to changes in the behavior of the code under test. If the behavior changes, the test result should change.
    • Structure-insensitive: tests should not change their result if the structure of the code changes.
    • Automated: tests should run without human intervention.
    • Specific: if a test fails, the cause of the failure should be obvious.
    • Deterministic: if nothing changes, the test result shouldn't change.
    • Predictive: if the tests all pass, then the code under test should be suitable for production.
    https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3
  35. 59.
  36. 61.

    Test code duplication
    • The same test code is repeated many times.
    • Can happen in a single test too, as it may contain repeated groups of similar statements.
    • Has all the issues that “copy and paste in production” has.
    • Extract common behavior to a method or a class.
  37. 62.

    Test logic in production
    • The code that is put into production contains logic that should be exercised only during tests.
    • The test code may actually execute in production, leading to bad behavior.
    • Remove the test logic, inject the dependency into the class, and use a mock during the test.
  38. 63.

    Erratic / Flaky tests
    • One or more tests are behaving erratically; sometimes they pass and sometimes they fail. Aka: flaky tests.
    • Less confidence, hard to know whether the test is failing because of a real bug.
    • Use a fresh fixture, check infrastructure-specific dependencies (e.g., the clock).
  39. 64.

    Obscure tests
    • It is difficult to understand the test at a glance.
    • May be caused by general fixtures, assertion roulettes, irrelevant information, etc.
    • Difficult maintenance and increased costs.
    • Refactor according to the root cause.
  40. 65.

    Possible solution: Test Data Builder
    • If you have a complex entity that should be part of a test, instantiating it can obscure the test.
    • Extract the creation of the object to a dedicated class.
    • Builder pattern [GoF]
    • In test world: Test Data Builder.
    • Recommended reading: http://www.natpryce.com/articles/000714.html.
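
As an illustration of the pattern, here is a minimal Test Data Builder sketch. The `Order` entity, its fields, and the default values are assumptions made up for this example. The builder supplies sensible defaults, so each test states only the details it actually cares about.

```java
// Hypothetical entity with several constructor arguments.
class Order {
    private final String customer;
    private final String country;
    private final double total;
    private final boolean paid;

    Order(String customer, String country, double total, boolean paid) {
        this.customer = customer;
        this.country = country;
        this.total = total;
        this.paid = paid;
    }

    String getCustomer() { return customer; }
    String getCountry() { return country; }
    double getTotal() { return total; }
    boolean isPaid() { return paid; }
}

// Test Data Builder: defaults for everything, fluent overrides for what matters.
class OrderBuilder {
    private String customer = "Any Customer";
    private String country = "NL";
    private double total = 10.0;
    private boolean paid = false;

    OrderBuilder forCustomer(String customer) {
        this.customer = customer;
        return this;
    }

    OrderBuilder from(String country) {
        this.country = country;
        return this;
    }

    OrderBuilder withTotal(double total) {
        this.total = total;
        return this;
    }

    OrderBuilder paid() {
        this.paid = true;
        return this;
    }

    Order build() {
        return new Order(customer, country, total, paid);
    }
}
```

A test that only cares about the amount can then write `new OrderBuilder().withTotal(250.0).build()` and leave every other field to the defaults, which keeps irrelevant details out of the test body.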
  41. 66.

    Possible Solution: Page Objects
    • In system testing, navigating through complex web pages might make your test complex.
    • Solution: abstractions that represent the different “pages” of your system.
    • Read:
      • https://www.martinfowler.com/bliki/PageObject.html
      • https://github.com/SeleniumHQ/selenium/wiki/PageObjects
    • The Screenplay pattern (@wakaleo suggested I talk about it)
      • https://www.infoq.com/articles/Beyond-Page-Objects-Test-Automation-Serenity-Screenplay/
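
The idea can be sketched without a real browser. Everything below is hypothetical: `Browser` is a made-up stand-in for a browser driver (in practice something like Selenium WebDriver), `FakeBrowser` exists only to make the sketch runnable, and the URL and element ids are invented. What matters is that `LoginPage` is the only place that knows how the page is built.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a real browser driver.
interface Browser {
    void open(String url);
    void type(String fieldId, String text);
    void click(String elementId);
    String textOf(String elementId);
}

// Page Object: the only class that knows the URL and element ids of the page.
class LoginPage {
    private final Browser browser;

    LoginPage(Browser browser) {
        this.browser = browser;
    }

    LoginPage visit() {
        browser.open("/login");
        return this;
    }

    LoginPage loginAs(String user, String password) {
        browser.type("username", user);
        browser.type("password", password);
        browser.click("submit");
        return this;
    }

    String welcomeMessage() {
        return browser.textOf("welcome");
    }
}

// In-memory fake so that the sketch runs without a real browser.
class FakeBrowser implements Browser {
    private final Map<String, String> elements = new HashMap<>();
    private String currentUrl = "";

    public void open(String url) {
        currentUrl = url;
    }

    public void type(String fieldId, String text) {
        elements.put(fieldId, text);
    }

    public void click(String elementId) {
        // Simulates the server greeting the user after the form is submitted.
        if ("/login".equals(currentUrl) && "submit".equals(elementId)) {
            elements.put("welcome", "Welcome, " + elements.get("username") + "!");
        }
    }

    public String textOf(String elementId) {
        return elements.getOrDefault(elementId, "");
    }
}
```

A system test then reads as `new LoginPage(browser).visit().loginAs("mauricio", "secret").welcomeMessage()`, and a change in the page layout touches `LoginPage` only, not every test.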
  42. 67.

    Assertion Roulette
    • It is hard to tell which of several assertions within the same test method caused a test failure.
    • Harder to understand which assertion failed and why.
    • Break the test in two, make assertions clearer.
  43. 68.

    Clear assertions
    • Assertions are key in test code
    • Clear assertions are fundamental
    • The AssertJ project: https://joel-costigliola.github.io/assertj/
  44. 69.

    Conditional logic in test
    • A test contains code that may or may not be executed.
    • Any loop or if statement in a test deserves attention.
      • To assert a collection?
      • To initialize an infrastructure?
      • To make the test execution different per environment?
    • Harder to understand what a test actually does.
    • Break the test in two, make assertions clearer.
  45. 70.

    Slow tests
    • The tests take too long to run.
    • Reduces the productivity of the developer.
    • Slow component -> mock
    • Expensive fixture -> shared fixture
    • Too many tests -> isolate expensive from faster tests.
  46. 71.

    Mystery guest
    • The test reader is not able to see the cause and effect between fixture and verification logic because part of it is done outside the test method.
    • Hard to see the cause and effect relationship between the test fixture (the pre-conditions of the test) and the expected outcome of the test.
    • Fresh fixture, inline fixture.
  47. 72.

    Resource Optimism
    • A test that depends on external resources has non-deterministic results depending on when/where it is run.
    • Makes the test flaky.
    • Fresh fixture
    • Inline fixture
  48. 73.

    Test Run War
    • Test failures occur at random when several people are running tests simultaneously.
    • Makes the test flaky.
    • Hard to understand why it fails.
    • Fresh fixture
    • Inline fixture
    • Isolated/Independent test
  49. 74.

    General Fixture
    • Fixture is too general and different tests only access part of the fixture.
    • Harder to read and understand.
    • May make tests run more slowly (because they do unnecessary work).
    • Break the fixture into many.
  50. 75.

    Lazy Test
    • Several test methods check the same method using the same fixture.
    • Too many tests, harder to understand the full tested behavior.
    • Join the tests.
  51. 76.

    Indirect Testing
    • Test class contains methods that actually perform tests on other objects.
    • Understanding and debugging is harder.
    • Isolate your units.
  52. 77.

    Sensitive Equality
    • The assertion relies on many irrelevant details.
    • Test may fail for reasons other than a bug.
    • Implement an equals() method.
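
To make the suggested fix concrete, here is a sketch with a made-up `Money` class (the name, fields and `toString` format are assumptions). A test that compares `toString()` output breaks whenever the display format changes; comparing through `equals()` depends only on the fields that define equality.

```java
import java.util.Objects;

// Hypothetical value object, invented for illustration.
class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    // Meant for humans; the format may change without any change in behavior,
    // so assertions on this string are sensitive to irrelevant details.
    @Override
    public String toString() {
        return currency + " " + cents + " cents";
    }

    // Equality is defined by the relevant fields only.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Money)) {
            return false;
        }
        Money that = (Money) other;
        return cents == that.cents && currency.equals(that.currency);
    }

    // Kept consistent with equals, as the Object contract requires.
    @Override
    public int hashCode() {
        return Objects.hash(cents, currency);
    }
}
```

With this in place, a test can compare the actual result against `new Money(1050, "EUR")` directly instead of asserting on a formatted string.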
  53. 78.

    Gets worse over time?
    • Yes: probably Resource Leakage
    • No: probably Non-Deterministic Test
    Happens when the test is run alone?
    • Yes: probably Lonely Test
    • No: probably Interacting Tests
  54. 81.

    Let’s try!
    • Roman Numerals
    • Receives a string, converts to integer
      • “I” -> 1
      • “III” -> 3
      • “VI” -> 6
      • “IV” -> 4
      • “XVI” -> 16
      • “XIV” -> 14
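
One possible end state of this exercise is sketched below. It is an assumption about where the TDD session might land, not the code written in the workshop; the class name `RomanNumeralConverter` is made up, and the sketch does no input validation (an unknown character fails with a NullPointerException).

```java
import java.util.HashMap;
import java.util.Map;

class RomanNumeralConverter {
    private static final Map<Character, Integer> VALUES = new HashMap<>();

    static {
        VALUES.put('I', 1);
        VALUES.put('V', 5);
        VALUES.put('X', 10);
        VALUES.put('L', 50);
        VALUES.put('C', 100);
        VALUES.put('D', 500);
        VALUES.put('M', 1000);
    }

    int convert(String roman) {
        int total = 0;
        for (int i = 0; i < roman.length(); i++) {
            int current = VALUES.get(roman.charAt(i));
            // A symbol placed before a larger one is subtracted (the I in IV).
            boolean nextIsBigger = i + 1 < roman.length()
                    && VALUES.get(roman.charAt(i + 1)) > current;
            total += nextIsBigger ? -current : current;
        }
        return total;
    }
}
```

In a TDD session this would grow one example at a time: “I” first, then “III”, then “VI”, and only then the subtractive cases “IV” and “XIV”, which are what force the look-ahead into the loop.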
  55. 83.

    Baby steps
    • Simplicity: we should do the simplest implementation that solves the problem, start with the simplest possible test, …
    • Do not confuse being simple with being innocent.
    • Kent Beck states in his book: “Do these steps seem too small to you? Remember, TDD is not about taking teensy tiny steps, it’s about being able to take teensy tiny steps. Would I code day-to-day with steps this small? No. But when things get the least bit weird, I’m glad I can.”
  56. 84.

    Refactor!
    • Often, we are so busy making the test pass that we forget about writing good code.
    • After the test is green, you can refactor.
    • The good thing is that, after the refactoring, tests should still be green.
    • Refactoring can be at low-level or high-level.
      • Low-level: rename variables, extract methods.
      • High-level: change the class design, class contracts.
  57. 86.
  58. 87.

    The TDD cycle
    [Diagram: write a failing test -> failing test -> make it pass -> tests passing -> refactor, and back to writing a failing test.]
    What are the advantages?
  59. 88.

    Focus on the requirements
    • Starting with the test means starting with the requirements.
    • It makes us think more about:
      • what we expect from the class.
      • how the class should behave in specific cases.
    • We do not write “useless code”.
      • Go to your codebase right now. How much code have you written that is never used in the real world?
  60. 89.

    Controlling your pace
    • Having a failing test gives us a clear focus: make the test pass.
    • I can write whatever test I want:
      • If I feel insecure, I can write a simpler test.
      • If I feel safe, I can write a more complicated test.
      • If there’s something I do not understand, I can take a tiny baby step.
      • If I understand completely, I can take a larger step.
  61. 90.

    Test from the requirements
    • Starting from the test means starting from the requirements. Meaning your tests derive from the requirements, and not from existing code.
    • If you follow the idea of always having tests, you do not need to test afterwards.
    • Your code is tested already!
  62. 91.

    It’s our first client!
    • The test code is the first client of the class you are constructing.
    • Use it to your advantage.
    • What can you get from the client?
      • Is it hard to make use of your class?
      • Is it hard to build the class?
      • Is it hard to set up the class for use (pre-conditions)?
      • Does the class return what I want?
  63. 92.

    Testable code
    • TDD makes you think about tests from the beginning.
    • This means you will be forced to write testable classes.
    • We discussed it before: a testable class is also an easy-to-use class.
    • Some people call TDD Test-Driven Design.
  64. 93.

    Tests as a draft
    • Changing your class design is cheaper when done at the beginning.
    • Use your tests as a draft: play with the class; if you don’t like the class design, change it.
    • Remember: the test is your first client.
  65. 94.

    Controllability
    • Tests make you think about managing dependencies from the beginning.
    • If your class depends on too many classes, testing gets harder.
    • You should refactor.
  66. 95.

    Listen to your test
    • The test may reveal design problems.
    • You should “listen to it”.
    • Too many tests?
      • Maybe your class does too much.
    • Too many mocks?
      • Maybe your class is too coupled.
    • Complex set up before calling the desired behavior?
      • Maybe rethink the pre-conditions.
  67. 96.

    TDD as a design tool
    • Let’s do it together!
    • Remember the InvoiceFilter example?
  68. 97.

    Is it really effective?
    • 50% more tests, less time debugging [5].
    • 40-50% fewer defects, no impact on productivity [6].
    • 40-50% fewer defects in Microsoft and IBM products [12].
    • Better use of OOP concepts [13].
    • More cohesive, less coupled [15].
    References:
    • Janzen, D. Software Architecture Improvement through Test-Driven Development. Conference on Object Oriented Programming Systems Languages and Applications, ACM, 2005.
    • Maximilien, E. M. and L. Williams. Assessing test-driven development at IBM. IEEE 25th International Conference on Software Engineering, Portland, Oregon, USA, IEEE Computer Society, 2003.
    • Nagappan, N., Bhat, T. Evaluating the efficacy of test-driven development: industrial case studies. Proceedings of the 2006 ACM/IEEE international symposium on Empirical software engineering.
    • Janzen, D., Saiedian, H. On the Influence of Test-Driven Development on Software Design. Proceedings of the 19th Conference on Software Engineering Education & Training (CSEET’06).
    • Steinberg, D. H. The Effect of Unit Tests on Entry Points, Coupling and Cohesion in an Introductory Java Programming Course. XP Universe, Raleigh, North Carolina, USA, 2001.
  69. 98.

    Is it?
    • No difference in code quality [Erdogmus et al., Müller et al.]
    • Siniaalto and Abrahamsson: The differences in the program code, between TDD and the iterative test-last development, were not as clear as expected.
    References:
    • Erdogmus, H., Morisio, M., et al. On the effectiveness of the test-first approach to programming. IEEE Transactions on Software Engineering 31(3): 226-237, 2005.
    • Müller, M. M., Hagner, O. Experiment about test-first programming. IEE Proceedings 149(5): 131-136, 2002.
    • Siniaalto, Maria, and Pekka Abrahamsson. "Does test-driven development improve the program code? Alarming results from a comparative case study." Balancing Agility and Formalism in Software Engineering. Springer Berlin Heidelberg, 2008. 143-156.
  70. 99.

    Is it?
    • “The practice of test-driven development does not drive directly the design, but gives them a safe space to think, the opportunity to refactor constantly, and subtle feedback given by unit tests, are responsible to improve the class design”.
    • “The claimed benefits of TDD may not be due to its distinctive test-first dynamic, but rather due to the fact that TDD-like processes encourage fine-grained, steady steps that improve focus and flow.”
    References:
    • Aniche, M., & Gerosa, M. A. (2015). Does test-driven development improve class design? A qualitative study on developers’ perceptions. Journal of the Brazilian Computer Society, 21(1), 15.
    • Fucci, D., Erdogmus, H., Turhan, B., Oivo, M., & Juristo, N. (2016). A Dissection of Test-Driven Development: Does It Really Matter to Test-First or to Test-Last? IEEE Transactions on Software Engineering.
  71. 100.

    Practical advice on TDD
    • Keep a “test list”.
    • Refactor both production and test code.
    • Always see the test failing.
    • Stop and think.
  72. 101.

    TDD 100% of the time?
    • No silver bullet! :)
    • Maurício: I do not use TDD 100% of the time. I let my experience tell me when I need it.
    • However, I always write tests and I never spend too much time only with production code.