Test code engineering - Workshop at University of Innsbruck (2020)

Test code engineering Maurício Aniche [email protected] @mauricioaniche

TEST ANALYSIS & TEST DESIGN

TEST ANALYSIS & TEST DESIGN Our focus today!

Topics of today
• The testing pyramid
• Design for testability
  • Dependency injection
  • Ports and Adapters
• Mocking
  • To facilitate testing (more control, more observability)
  • To explore boundaries (in a TDD fashion)
• TDD
• Test code best (and bad) practices
  • FIRST principles
  • Kent Beck’s test desiderata
  • Test smells
• Test design patterns
  • Abstractions
  • Builders
  • Page Objects

Experience matters!
Yu, C. S., Treude, C., & Aniche, M. (2019, July). Comprehending Test Code: An Empirical Study. In 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME) (pp. 501-512). IEEE.

Study material
• Martin Fowler’s wiki: https://martinfowler.com/tags/testing.html
  • Simple to read and full of pragmatic advice!
• Hevery, Misko. The testability guide. http://misko.hevery.com/attachments/Guide-Writing%20Testable%20Code.pdf
• Meszaros, Gerard. xUnit test patterns: Refactoring test code. Pearson Education, 2007.
• Freeman, Steve, and Nat Pryce. Growing object-oriented software, guided by tests. Pearson Education, 2009.
• Beck, Kent. Test-driven development: by example. Addison-Wesley Professional, 2003.
• Hunt, Andy, and Dave Thomas. Pragmatic unit testing in Java with JUnit. The Pragmatic Bookshelf, 2003.
• My lecture notes: https://sttp.site

Getting to know you
• Are you familiar with automated tests?
• How often do you write automated tests?
• Now, tell me about your pains… :)

Testing pyramid
[Diagram: pyramid with, from bottom to top, Unit tests, Integration tests, System tests, Manual. Going up the pyramid means more reality and more complexity.]

Some definitions of unit testing
• ISTQB: “Searches for defects in, and verifies the functioning of software items (e.g., modules, programs, objects, classes, etc) that are separately testable”.
• Osherove: “A unit test is an automated piece of code that invokes a unit of work in the system and then checks a single assumption about the behavior of that unit of work. [...] A unit of work is a single logical functional use case in the system that can be invoked by some public interface (in most cases). A unit of work can span a single method, a whole class or multiple classes working together to achieve one single logical purpose that can be verified.”

Advantages
• Very fast
• Easy to control
• Easy to write
Disadvantages
• Less real
• Some bugs can’t be reproduced at this level

BigTest.java We can do System Testing!

Advantages
• Very realistic
• Captures the user perspective
Disadvantages
• Slow
• Hard to write
• Flaky

How I (Maurício) do the trade-off
• Unit tests: All business rules should be tested here.
• Integration tests: Complex integrations with external services.
• System tests: Main/Risky flow of the app tested.
• Manual: Exploratory tests.

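The ice-cream cone anti-pattern
[Diagram: the testing pyramid (Unit tests, Integration tests, System tests, Manual) next to its inversion, the ice-cream cone, where manual and GUI/system tests form the largest layers and unit tests the smallest.]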

The practical test pyramid: https://martinfowler.com/articles/practical-test-pyramid.html

Mock Objects

That’s how it is in OO systems…
[Diagram: class A] A does too much!

That’s how it is in OO systems…
[Diagram: classes A and C] A does too much again!

That’s how it is in OO systems…
[Diagram: classes A, B, and C, a DB, etc.]

What should we do to test a class without its dependencies?
[Diagram: classes A, B, and C, a DB, etc.] How to write unit tests for A?

We simulate the dependencies!
• Fast
• Full control
[Diagram: A with B’ and C’] B’ and C’ are (lightweight) simulations of B and C, respectively.

Mock Objects
Mock objects are objects that mimic the behavior of real objects/dependencies, easing their controllability and observability.

Why do I want to control my dependencies?
• To easily simulate exceptions
• To easily simulate database access
• To easily simulate the interaction with any other infrastructure
• To avoid building complex objects
• To control third-party libraries that I do not own

Let’s code!
I want to filter all the invoices whose value is smaller than 100.0. Invoices come from the database.
Code: https://gist.github.com/mauricioaniche/03ee12e64d734e7ea370eceb68fe6676
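
A minimal sketch of the exercise, not the content of the linked gist. The names `Invoice`, `InvoiceDao`, and `InvoiceFilter` are assumptions made for illustration, and the stub is written by hand so the example needs nothing beyond the JDK; a mocking library would typically generate such a stub for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical domain class; the names are assumptions, not taken from the gist.
class Invoice {
    final String customer;
    final double value;

    Invoice(String customer, double value) {
        this.customer = customer;
        this.value = value;
    }
}

// The dependency that would normally talk to the database.
interface InvoiceDao {
    List<Invoice> all();
}

// Class under test: it receives its dependency through the constructor.
class InvoiceFilter {
    private final InvoiceDao dao;

    InvoiceFilter(InvoiceDao dao) {
        this.dao = dao;
    }

    List<Invoice> lowValueInvoices() {
        List<Invoice> result = new ArrayList<>();
        for (Invoice invoice : dao.all()) {
            if (invoice.value < 100.0) {
                result.add(invoice);
            }
        }
        return result;
    }
}

class InvoiceFilterDemo {
    public static void main(String[] args) {
        // Stub: a lightweight simulation of the DAO that returns canned data.
        InvoiceDao stub = () -> Arrays.asList(
                new Invoice("Alice", 20.0),
                new Invoice("Bob", 300.0));

        List<Invoice> result = new InvoiceFilter(stub).lowValueInvoices();

        System.out.println(result.size() + " low-value invoice(s)");
    }
}
```

Because the filter only sees the `InvoiceDao` interface, the test controls exactly which invoices "come from the database" without starting one.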

To mock or not to mock?
• Developers have mixed feelings about the usage of mock objects.
• Can you see the advantages and the disadvantages of using mocks?
  • Adv: Easy to control dependencies, easy to automate test cases.
  • Disadv: Not very real, integration problems might pass.
• At the end of the day, it’s about using the right tool at the right moment.

[Figure: percentage of mocked vs. non-mocked dependencies per category: database, web service, external dependencies, domain object, Java libraries, test support. Recoverable values: the three infrastructure categories are mocked in 72% (167), 69% (182), and 68% (140) of the cases; domain objects in 36% (319); Java libraries in 7% (12); test support in 6%.]
Spadini, D, Aniche, M, Bruntink, M & Bacchelli, A 2018, 'Mock objects for testing java systems: Why and how developers use them, and how they evolve' Empirical Software Engineering, pp. 1-38.

When to mock?
• We empirically see that infrastructure is often mocked.
• There was no clear trend on domain objects.
  • Their practice: Complicated/complex classes are mocked.
• No mocks for libraries (e.g., lists or small util methods).

Mocks are introduced from the very beginning of the test class!
Spadini, D, Aniche, M, Bruntink, M & Bacchelli, A 2018, 'Mock objects for testing java systems: Why and how developers use them, and how they evolve' Empirical Software Engineering, pp. 1-38.

50% of changes in a mock occur because the production code changed! Coupling is strong!
Spadini, D, Aniche, M, Bruntink, M & Bacchelli, A 2018, 'Mock objects for testing java systems: Why and how developers use them, and how they evolve' Empirical Software Engineering, pp. 1-38.

Developers are aware of the trade-offs!
• They mock databases, but then later write integration tests.
• Added “complexity” in exchange for testability.
• They understand how coupled they are when they use mocks.
Davide Spadini, M. Finavaro Aniche, Magiel Bruntink, Alberto Bacchelli. To Mock or Not To Mock? An Empirical Study on Mocking Practices. MSR 2017.
Mock Objects For Testing Java Systems: Why and How Developers Use Them, and How They Evolve. EMSE, 2018.

Mock as a way to explore the boundaries
Freeman, Steve, and Nat Pryce. Growing object-oriented software, guided by tests. Pearson Education, 2009.

Design for Testability

Testability and Good Design
Feathers, Michael. The Deep Synergy between Testability and Good Design. Talk: https://www.youtube.com/watch?v=4cVZvoFGJTU
Well-designed classes are naturally more testable. Good (class) design is key!

Controllability and Observability
• Controllability determines the work it takes to set up and run test cases and the extent to which individual functions and features of the system under test (SUT) can be made to respond to test cases.
• Observability determines the work it takes to set up and run test cases and the extent to which the response of the system under test (SUT) to test cases can be verified.
• See Robert Binder’s post on Testability: https://robertvbinder.com/software-testability-part-2-controllability-and-observability/

Hexagonal architecture (aka ports and adapters) Extracted from http://alistair.cockburn.us/Hexagonal+architecture

Harder than it seems!

Clear and well-defined interfaces Encapsulation plays a big role here!

Dependency Inversion • Make dependencies injectable.

Tell! Don’t Ask

Encapsulation
• Behavior should be in the right place.
  • Otherwise, the test gets too complicated.
  • Indirect testing smell.
• Law of Demeter
  • Avoid a.getB().getC().getD().doSomething();
  • Such chains make the test much harder to arrange, i.e., you have to arrange states in A, B, C, and D.
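
A small sketch of what "Tell, don't ask" can look like in code. The `Wallet` and `Customer` classes are hypothetical and only serve the illustration: callers tell the customer to pay instead of navigating to the wallet and manipulating its balance themselves.

```java
// Hypothetical classes illustrating "Tell, don't ask".
class Wallet {
    private double balance;

    Wallet(double balance) {
        this.balance = balance;
    }

    // The wallet owns the rule about insufficient funds.
    void debit(double amount) {
        if (amount > balance) {
            throw new IllegalStateException("insufficient funds");
        }
        balance -= amount;
    }

    double balance() {
        return balance;
    }
}

class Customer {
    private final Wallet wallet;

    Customer(Wallet wallet) {
        this.wallet = wallet;
    }

    // Callers tell the customer to pay; they never reach into the wallet,
    // so there is no customer.getWallet().getBalance() chain to arrange.
    void pay(double amount) {
        wallet.debit(amount);
    }
}
```

A test for `Customer` then only needs a `Wallet` in a known state; it does not have to set up a chain of intermediate objects.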

Should I test a private method?
• Probably not.
• You should test the class from its public interface.
• If you want to test the private method in an isolated way, that’s the test telling you that your code is not cohesive.

Static methods
• Now that you know mocks, what can we do with static methods?
• Static methods can’t be mocked with plain mock objects (it takes special tooling such as PowerMock).
• Therefore, AVOID THEM.

Don’t be afraid of creating layers
• How can I test date/time related things?
• How can I test environment-dependent things?
• Create a layer on top of APIs
  • These layers are easily mockable
• The Restfulie.NET case:
  • https://github.com/mauricioaniche/restfulie.net/blob/master/Restfulie.Server/Marshalling/UrlGenerators/AspNetMvcUrlGenerator.cs

Let’s code
• The Clock abstraction example.
• https://gist.github.com/mauricioaniche/312249471a71dddeeb74991774529b7b

Solution
• We added a layer on top of Calendar.
• Again, do not be afraid of adding layers to facilitate testability.
• Solution: https://gist.github.com/mauricioaniche/2200d8c1dcab41e7c4dddcc46211f4c0
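
A sketch of what such a layer on top of `Calendar` might look like. This is not the content of the linked gist: the `Clock` interface, `SystemClock`, and the `ChristmasDiscount` rule (15% off on December 25) are assumptions chosen to keep the example small.

```java
import java.util.Calendar;

// Hypothetical layer on top of Calendar: the only place that asks the system for "now".
interface Clock {
    Calendar now();
}

// Production implementation: delegates to the real system time.
class SystemClock implements Clock {
    @Override
    public Calendar now() {
        return Calendar.getInstance();
    }
}

// Hypothetical business rule that depends on the current date.
class ChristmasDiscount {
    private final Clock clock;

    ChristmasDiscount(Clock clock) {
        this.clock = clock;
    }

    double applyDiscount(double amount) {
        Calendar today = clock.now();
        boolean isChristmas = today.get(Calendar.MONTH) == Calendar.DECEMBER
                && today.get(Calendar.DAY_OF_MONTH) == 25;
        // Assumed rule: 15% off on Christmas day, full price otherwise.
        return isChristmas ? amount * 0.85 : amount;
    }
}
```

In a test, the clock can be replaced by a fixed one, for example `Clock fixed = () -> new java.util.GregorianCalendar(2020, Calendar.DECEMBER, 25);`, which makes the date-dependent branch reproducible on any day of the year.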

http://misko.hevery.com/attachments/Guide-Writing%20Testable%20Code.pdf

Some of the flaws listed in the Testability Guide, by Misko Hevery
• Flaw: Constructor does Real Work
• Flaw: Digging into Collaborators
• Flaw: Brittle Global State & Singletons
• Flaw: Class Does Too Much
• His tool, Testability Explorer (https://github.com/mhevery/testability-explorer) never became a hit.
  • But the idea is awesome!

Test code smells

Many things can hamper comprehensibility! the test! the production code!

Arrange-Act-Assert (AAA)
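
A minimal sketch of the Arrange-Act-Assert structure. The `ShoppingCart` class is hypothetical, and the test is a plain static method rather than a JUnit test so that the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class under test.
class ShoppingCart {
    private final List<Double> prices = new ArrayList<>();

    void add(double price) {
        prices.add(price);
    }

    double total() {
        double sum = 0;
        for (double price : prices) {
            sum += price;
        }
        return sum;
    }
}

class ShoppingCartTest {
    static void totalIsTheSumOfAllItems() {
        // Arrange: put the object under test in a known state.
        ShoppingCart cart = new ShoppingCart();
        cart.add(10.0);
        cart.add(2.5);

        // Act: invoke the single behavior being tested.
        double total = cart.total();

        // Assert: verify the outcome.
        if (total != 12.5) {
            throw new AssertionError("expected 12.5 but was " + total);
        }
    }

    public static void main(String[] args) {
        totalIsTheSumOfAllItems();
        System.out.println("ok");
    }
}
```

Keeping the three blocks visually separate makes it easier to see, at a glance, what the test sets up, what it exercises, and what it expects.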

What are the ugly things you have seen in test code?

The FIRST principles

[F]IRST: Fast
• Tests should be fast (in practice, hard to draw a line between fast and slow)
• Minimize the amount of code that ultimately depends on slow things.

F[I]RST: Isolated
• Good unit tests focus on a small chunk of code to verify.
• Good unit tests also don’t depend on other unit tests.
• Be careful with shared resources.

FI[R]ST: Repeatable
• A repeatable test is one that produces the same results each time you run it.
• To achieve that, tests must be well isolated.

FIR[S]T: Self-validating
• Tests aren’t tests unless they assert that things went as expected.
• Manually verifying the results of tests is time-consuming.
• They must also be self-arranging.

FIRS[T]: Timely
• Unit testing should become a habit. TEST INFECTED!
• In practice, chances are low that you’ll find the time to come back and write tests.

Kent Beck’s test desiderata
• Isolated: tests should return the same results regardless of the order in which they are run.
• Composable: if tests are isolated, then I can run 1 or 10 or 100 or 1,000,000 and get the same results.
• Fast: tests should run quickly.
• Inspiring: passing the tests should inspire confidence.
• Writable: tests should be cheap to write relative to the cost of the code being tested.
• Readable: tests should be comprehensible for reader, invoking the motivation for writing this particular test.
• Behavioural: tests should be sensitive to changes in the behavior of the code under test. If the behavior changes, the test result should change.
• Structure-insensitive: tests should not change their result if the structure of the code changes.
• Automated: tests should run without human intervention.
• Specific: if a test fails, the cause of the failure should be obvious.
• Deterministic: if nothing changes, the test result shouldn't change.
• Predictive: if the tests all pass, then the code under test should be suitable for production.
https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3

Pattern format • Description. • Possible negative effects. • Solution.

Test code duplication
• The same test code is repeated many times.
  • Can happen in a single test too, as it may contain repeated groups of similar statements.
• All the issues that “copy and paste in production” has.
• Extract common behavior to a method or a class.

Test logic in production
• The code that is put into production contains logic that should be exercised only during tests.
• The test code may actually execute in production, leading to bad behavior.
• Remove test logic, inject class in the dependencies, use a mock during the test.

Erratic / Flaky tests
• One or more tests are behaving erratically; sometimes they pass and sometimes they fail. Aka: Flaky tests.
• Less confidence, hard to know whether the test is failing because of a real bug.
• Fresh fixture, check infrastructure-specific dependencies (e.g. clock).

Obscure tests
• It is difficult to understand the test at a glance.
  • May be caused by general fixtures, assertion roulettes, irrelevant information, etc.
• Difficult maintenance and increased costs.
• Refactor according to the root cause.

Possible solution: Test Data Builder
• If you have a complex entity that should be part of a test, instantiating it can obscure the test.
• Extract the creation of the object to a dedicated class.
  • Builder pattern [GoF]
  • In test world: Test Data Builder.
• Recommended reading: http://www.natpryce.com/articles/000714.html.
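
A minimal sketch of a Test Data Builder. The `Order` entity and its fields are hypothetical; the point is that the builder supplies sensible defaults so each test states only the values it actually cares about.

```java
// Hypothetical entity with several constructor arguments.
class Order {
    final String customer;
    final String country;
    final double amount;
    final boolean paid;

    Order(String customer, String country, double amount, boolean paid) {
        this.customer = customer;
        this.country = country;
        this.amount = amount;
        this.paid = paid;
    }
}

// Test Data Builder: defaults for everything, fluent overrides for what matters.
class OrderBuilder {
    private String customer = "any customer";
    private String country = "NL";
    private double amount = 10.0;
    private boolean paid = false;

    OrderBuilder withCustomer(String customer) {
        this.customer = customer;
        return this;
    }

    OrderBuilder fromCountry(String country) {
        this.country = country;
        return this;
    }

    OrderBuilder withAmount(double amount) {
        this.amount = amount;
        return this;
    }

    OrderBuilder asPaid() {
        this.paid = true;
        return this;
    }

    Order build() {
        return new Order(customer, country, amount, paid);
    }
}
```

A test about paid, high-value orders can then read `new OrderBuilder().withAmount(500.0).asPaid().build()`, leaving the irrelevant fields out of sight.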

Possible Solution: Page Objects
• In system testing, navigating through complex web pages might make your test complex.
• Solution: abstractions that represent the different “pages” of your system.
• Read:
  • https://www.martinfowler.com/bliki/PageObject.html
  • https://github.com/SeleniumHQ/selenium/wiki/PageObjects
• The Screenplay pattern (@wakaleo suggested that I talk about it)
  • https://www.infoq.com/articles/Beyond-Page-Objects-Test-Automation-Serenity-Screenplay/

Assertion Roulette
• It is hard to tell which of several assertions within the same test method caused a test failure.
• Harder to understand which assertion failed and why.
• Break the test in two, make assertions clearer.

Clear assertions
• Assertions are key in test code
• Clear assertions are fundamental
• The AssertJ project: https://joel-costigliola.github.io/assertj/

Conditional logic in test
• A test contains code that may or may not be executed.
  • Any loop or if statement in a test deserves attention.
  • To assert a collection?
  • To initialize an infrastructure?
  • To make the test execution different per environment?
• Harder to understand what a test actually does.
• Break the test in two, make assertions clearer.

Slow tests
• The tests take too long to run.
• Reduce the productivity of the developer.
• Slow component -> mock
• Expensive fixture -> shared fixture
• Too many tests -> Isolate expensive from faster tests.

Mystery guest
• The test reader is not able to see the cause and effect between fixture and verification logic because part of it is done outside the Test Method.
• Hard to see the cause and effect relationship between the test fixture (the pre-conditions of the test) and the expected outcome of the test.
• Fresh fixture, inline fixture.

Resource Optimism
• A test that depends on external resources has non-deterministic results depending on when/where it is run.
• Makes the test flaky.
• Fresh fixture
• Inline fixture

Test Run War
• Test failures occur at random when several people are running tests simultaneously.
• Makes the test flaky.
• Hard to understand why it fails.
• Fresh fixture
• Inline fixture
• Isolated/Independent test

General Fixture
• Fixture is too general and different tests only access part of the fixture.
• Harder to read and understand.
• May make tests run more slowly (because they do unnecessary work).
• Break the fixture into many.

Lazy Test
• Several test methods check the same method using the same fixture.
• Too many tests, harder to understand the full tested behavior.
• Join the tests.

Indirect Testing
• Test class contains methods that actually perform tests on other objects.
• Understanding and debugging is harder.
• Isolate your units.

Sensitive Equality
• The assertion relies on many irrelevant details.
• Test may fail for reasons other than a bug.
• Implement an equals() method.
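
A sketch of the suggested fix, using a hypothetical `Money` value object. Sensitive equality often shows up when a test compares `toString()` output; with `equals()` in place, the assertion can compare the values that matter and survive a change in formatting.

```java
import java.util.Objects;

// Hypothetical value object; equality is defined on amount and currency only.
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Money)) {
            return false;
        }
        Money that = (Money) other;
        return cents == that.cents && currency.equals(that.currency);
    }

    // equals() and hashCode() must stay consistent with each other.
    @Override
    public int hashCode() {
        return Objects.hash(cents, currency);
    }

    // Presentation detail: tests should not depend on this format.
    @Override
    public String toString() {
        return String.format("%s %d.%02d", currency, cents / 100, cents % 100);
    }
}
```

A test can then assert `expected.equals(actual)` (or use an equality assertion from a test framework) instead of comparing formatted strings.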

• Gets worse over time?
  • Yes: probably Resource Leakage
  • No: probably Non-Deterministic Test
• Happens when the test is run alone?
  • Yes: probably Lonely Test
  • No: probably Interacting Tests

You want more smells?

Test-Driven Development in Practice Maurício Aniche

Let’s try!
• Roman Numerals
• Receives a string, converts to integer
  • “I” -> 1
  • “III” -> 3
  • “VI” -> 6
  • “IV” -> 4
  • “XVI” -> 16
  • “XIV” -> 14
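
One possible end state for this exercise, shown only as a sketch: the class name `RomanNumeral` and the right-to-left strategy are assumptions, not the code written during the workshop, and input validation is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical solution to the kata; invalid input is not handled.
class RomanNumeral {
    private static final Map<Character, Integer> VALUES = new HashMap<>();

    static {
        VALUES.put('I', 1);
        VALUES.put('V', 5);
        VALUES.put('X', 10);
        VALUES.put('L', 50);
        VALUES.put('C', 100);
        VALUES.put('D', 500);
        VALUES.put('M', 1000);
    }

    static int convert(String roman) {
        int total = 0;
        int previous = 0;
        // Walk right to left: a symbol smaller than its right neighbour is subtracted.
        for (int i = roman.length() - 1; i >= 0; i--) {
            int current = VALUES.get(roman.charAt(i));
            if (current < previous) {
                total -= current;
            } else {
                total += current;
            }
            previous = current;
        }
        return total;
    }
}
```

In a TDD session the examples listed above would be added one test at a time ("I", then "III", then "VI", and so on), with the subtraction rule appearing only when "IV" forces it.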

Are you happy with this code?

Baby steps
• Simplicity: We should do the simplest implementation that solves the problem, start with the simplest possible test, …
• Do not confuse being simple with being naive.
• Kent Beck states in his book: “Do these steps seem too small to you? Remember, TDD is not about taking teensy tiny steps, it’s about being able to take teensy tiny steps. Would I code day-to-day with steps this small? No. But when things get the least bit weird, I’m glad I can.”

Refactor!
• Often, we are so busy making the test pass that we forget about writing good code.
• After the test is green, you can refactor.
• The good thing is that, after the refactoring, tests should still be green.
• Refactoring can be low-level or high-level.
  • Low-level: rename variables, extract methods.
  • High-level: change the class design, class contracts.

Let’s do some refactoring and continue!

The TDD cycle
[Diagram: write a failing test -> failing test -> make it pass -> tests passing -> refactor -> write the next failing test]
What are the advantages?

Focus on the requirements
• Starting with the test means starting with the requirements.
• It makes us think more about:
  • what we expect from the class.
  • how the class should behave in specific cases.
• We do not write “useless code”
  • Go to your codebase right now. How much code have you written that is never used in the real world?

Controlling your pace
• Having a failing test gives us a clear focus: make the test pass.
• I can write whatever test I want:
  • If I feel insecure, I can write a simpler test.
  • If I feel safe, I can write a more complicated test.
  • If there’s something I do not understand, I can take a tiny baby step.
  • If I understand completely, I can take a larger step.

Test from the requirements
• Starting from the test means starting from the requirements. Meaning your tests derive from the requirements, and not from existing code.
• If you follow the idea of always having tests, you do not need to test afterwards.
• Your code is tested already!

It’s our first client!
• The test code is the first client of the class you are constructing.
• Use it to your advantage.
• What can you get from the client?
  • Is it hard to make use of your class?
  • Is it hard to build the class?
  • Is it hard to set up the class for use (pre-conditions)?
  • Does the class return what I want?

Testable code
• TDD makes you think about tests from the beginning.
• This means you will be forced to write testable classes.
• We discussed it before: a testable class is also an easy-to-use class.
• Some people call TDD Test-Driven Design.

Tests as a draft
• Changing your class design is cheaper when done at the beginning.
• Use your tests as a draft: play with the class; if you don’t like the class design, change it.
• Remember: the test is your first client.

Controllability
• Tests make you think about managing dependencies from the beginning.
• If your class depends on too many classes, testing gets harder.
• You should refactor.

Listen to your test
• The test may reveal design problems.
• You should “listen to it”.
• Too many tests?
  • Maybe your class does too much.
• Too many mocks?
  • Maybe your class is too coupled.
• Complex set up before calling the desired behavior?
  • Maybe rethink the pre-conditions.

TDD as a design tool
• Let’s do it together!
• Remember the InvoiceFilter example?

Is it really effective?
• 50% more tests, less time debugging [5].
• 40-50% fewer defects, no impact on productivity [6].
• 40-50% fewer defects in Microsoft and IBM products [12].
• Better use of OOP concepts [13].
• More cohesive, less coupled [15].
Janzen, D., Software Architecture Improvement through Test-Driven Development. Conference on Object Oriented Programming Systems Languages and Applications, ACM, 2005.
Maximilien, E. M. and L. Williams. Assessing test-driven development at IBM. IEEE 25th International Conference on Software Engineering, Portland, Oregon, USA, IEEE Computer Society, 2003.
Nagappan, N., Bhat, T. Evaluating the efficacy of test-driven development: industrial case studies. Proceedings of the 2006 ACM/IEEE international symposium on Empirical software engineering.
Janzen, D., Saiedian, H. On the Influence of Test-Driven Development on Software Design. Proceedings of the 19th Conference on Software Engineering Education & Training (CSEET’06).
Steinberg, D. H. The Effect of Unit Tests on Entry Points, Coupling and Cohesion in an Introductory Java Programming Course. XP Universe, Raleigh, North Carolina, USA, 2001.

Is it?
• No difference in code quality [Erdogmus et al., Müller et al.]
• Siniaalto and Abrahamsson: The differences in the program code, between TDD and the iterative test-last development, were not as clear as expected.
Erdogmus, H., Morisio, M., et al. On the effectiveness of the test-first approach to programming. IEEE Transactions on Software Engineering 31(3): 226 – 237, 2005.
Müller, M. M., Hagner, O. Experiment about test-first programming. IEE Proceedings 149(5): 131 – 136, 2002.
Siniaalto, Maria, and Pekka Abrahamsson. "Does test-driven development improve the program code? Alarming results from a comparative case study." Balancing Agility and Formalism in Software Engineering. Springer Berlin Heidelberg, 2008. 143-156.

Is it?
• “The practice of test-driven development does not drive directly the design, but gives them a safe space to think, the opportunity to refactor constantly, and subtle feedback given by unit tests, are responsible to improve the class design”.
• “The claimed benefits of TDD may not be due to its distinctive test-first dynamic, but rather due to the fact that TDD-like processes encourage fine-grained, steady steps that improve focus and flow.”
Aniche, M., & Gerosa, M. A. (2015). Does test-driven development improve class design? A qualitative study on developers’ perceptions. Journal of the Brazilian Computer Society, 21(1), 15.
Fucci, D., Erdogmus, H., Turhan, B., Oivo, M., & Juristo, N. (2016). A Dissection of Test-Driven Development: Does It Really Matter to Test-First or to Test-Last?. IEEE Transactions on Software Engineering.

Practical advice on TDD
• Keep a “test list”.
• Refactor both production and test code.
• Always see the test failing.
• Stop and think.

TDD 100% of the time?
• No silver bullet! :)
• Maurício: I do not use TDD 100% of the time. I let my experience tell me when I need it.
• However, I always write tests and I never spend too much time only with production code.
