Slide 1

Slide 1 text

Test code engineering Maurício Aniche M.FinavaroAniche@tudelft.nl @mauricioaniche

Slide 2

Slide 2 text

TEST ANALYSIS & TEST DESIGN

Slide 3

Slide 3 text

TEST ANALYSIS & TEST DESIGN Our focus today!

Slide 4

Slide 4 text

Topics of today
• The testing pyramid
• Design for testability
  • Dependency injection
  • Ports and Adapters
• Mocking
  • To facilitate testing (more control, more observability)
  • To explore boundaries (in a TDD fashion)
• TDD
• Test code best (and bad) practices
  • FIRST principles
  • Kent Beck’s test desiderata
  • Test smells
• Test design patterns
  • Abstractions
  • Builders
  • Page Objects

Slide 5

Slide 5 text

Experience matters! Yu, C. S., Treude, C., & Aniche, M. (2019, July). Comprehending Test Code: An Empirical Study. In 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME) (pp. 501-512). IEEE.

Slide 6

Slide 6 text

Study material
• Martin Fowler’s wiki: https://martinfowler.com/tags/testing.html (simple to read and full of pragmatic advice!)
• Hevery, Misko. The Testability Guide. http://misko.hevery.com/attachments/Guide-Writing%20Testable%20Code.pdf
• Meszaros, Gerard. xUnit Test Patterns: Refactoring Test Code. Pearson Education, 2007.
• Freeman, Steve, and Nat Pryce. Growing Object-Oriented Software, Guided by Tests. Pearson Education, 2009.
• Beck, Kent. Test-Driven Development: By Example. Addison-Wesley Professional, 2003.
• Hunt, Andy, and Dave Thomas. Pragmatic Unit Testing in Java with JUnit. The Pragmatic Bookshelf, 2003.
• My lecture notes: https://sttp.site

Slide 7

Slide 7 text

Getting to know you • Are you familiar with automated tests? • How often do you write automated tests? • Now, tell me about your pains… :)

Slide 8

Slide 8 text

Testing pyramid (from bottom to top): unit tests, integration tests, system tests, manual tests. The higher up the pyramid, the more reality and the more complexity.

Slide 9

Slide 9 text

Some definitions of unit testing
• ISTQB: “Searches for defects in, and verifies the functioning of, software items (e.g., modules, programs, objects, classes, etc.) that are separately testable”.
• Osherove: “A unit test is an automated piece of code that invokes a unit of work in the system and then checks a single assumption about the behavior of that unit of work. [...] A unit of work is a single logical functional use case in the system that can be invoked by some public interface (in most cases). A unit of work can span a single method, a whole class or multiple classes working together to achieve one single logical purpose that can be verified.”

Slide 10

Slide 10 text

Advantages: very fast; easy to control; easy to write.
Disadvantages: less real; some bugs cannot be reproduced at this level.

Slide 11

Slide 11 text

BigTest.java We can do System Testing!

Slide 12

Slide 12 text

Advantages: very realistic; captures the user perspective.
Disadvantages: slow; hard to write; flaky.

Slide 13

Slide 13 text

How I (Maurício) do the trade-off
• Unit tests: all business rules should be tested here.
• Integration tests: complex integrations with external services.
• System tests: the main/risky flows of the app.
• Manual: exploratory tests.

Slide 14

Slide 14 text

The ice-cream cone anti-pattern: the testing pyramid turned upside down, with manual GUI tests forming the bulk at the top, followed by system tests and integration tests, and only a few unit tests at the bottom.

Slide 15

Slide 15 text

The practical test pyramid: https://martinfowler.com/articles/practical-test-pyramid.html

Slide 16

Slide 16 text

Mock Objects

Slide 17

Slide 17 text

That’s how it is in OO systems… A does too much!

Slide 18

Slide 18 text

That’s how it is in OO systems… now A also depends on C, and A does too much again!

Slide 19

Slide 19 text

That’s how it is in OO systems… A ends up depending on B, C, the database, and so on.

Slide 20

Slide 20 text

What should we do to test a class without its dependencies? A depends on B, C, the database, and so on. How do we write unit tests for A?

Slide 21

Slide 21 text

We simulate the dependencies! A now talks to B’ and C’, which are (lightweight) simulations of B and C, respectively. • Fast • Full control

Slide 22

Slide 22 text

Mock Objects: objects that mimic the behavior of real objects/dependencies, easing their controllability and observability.

Slide 23

Slide 23 text

Why do I want to control my dependencies?
• To easily simulate exceptions
• To easily simulate database access
• To easily simulate the interaction with any other infrastructure
• To avoid building complex objects
• To control third-party libraries that I do not own

Slide 24

Slide 24 text

Let’s code! I want to filter all the invoices whose value is smaller than 100.0. Invoices come from the database. Code: https://gist.github.com/mauricioaniche/03ee12e64d734e7ea370eceb68fe6676 (a sketch of the exercise follows below).
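Since the actual code lives in the gist above, here is only a minimal sketch of what the exercise can look like, with hypothetical Invoice, IssuedInvoices, and InvoiceFilter types and JUnit 5 + Mockito. The database is replaced by a mock, which gives the test speed and full control over the data the filter receives:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.List;
import org.junit.jupiter.api.Test;

// Hypothetical production code, roughly the shape the exercise needs.
record Invoice(String customer, double value) {}

interface IssuedInvoices {          // the point where the database is accessed
    List<Invoice> all();
}

class InvoiceFilter {
    private final IssuedInvoices invoices;

    InvoiceFilter(IssuedInvoices invoices) {   // dependency is injected, so it can be simulated
        this.invoices = invoices;
    }

    List<Invoice> lowValueInvoices() {
        return invoices.all().stream()
                .filter(invoice -> invoice.value() < 100.0)
                .toList();
    }
}

class InvoiceFilterTest {

    @Test
    void filtersInvoicesSmallerThan100() {
        IssuedInvoices dao = mock(IssuedInvoices.class);      // simulate the database
        when(dao.all()).thenReturn(List.of(
                new Invoice("cheap", 50.0),
                new Invoice("expensive", 250.0)));

        List<Invoice> result = new InvoiceFilter(dao).lowValueInvoices();

        assertEquals(List.of(new Invoice("cheap", 50.0)), result);
    }
}
```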

Slide 25

Slide 25 text

To mock or not to mock?
• Developers have mixed feelings about the usage of mock objects.
• Can you see the advantages and the disadvantages of using mocks?
• Advantages: easy to control dependencies, easy to automate test cases.
• Disadvantages: less real; integration problems may slip through unnoticed.
• At the end of the day, it’s about using the right tool at the right moment.

Slide 26

Slide 26 text

[Bar chart: percentage of mocked vs. non-mocked dependencies per category — external dependencies, database, web service, domain object, Java libraries, and test support.] Spadini, D., Aniche, M., Bruntink, M., & Bacchelli, A. (2018). Mock objects for testing Java systems: Why and how developers use them, and how they evolve. Empirical Software Engineering, pp. 1-38.

Slide 27

Slide 27 text

When to mock? • We empirically see that infrastructure is often mocked. • There was no clear trend on domain objects. • Their practice: Complicated/complex classes are mocked. • No mocks for libraries (e.g., lists or small util methods).

Slide 28

Slide 28 text

Mocks are introduced from the very beginning of the test class! Spadini, D, Aniche, M, Bruntink, M & Bacchelli, A 2018, 'Mock objects for testing java systems: Why and how developers use them, and how they evolve' Empirical Software Engineering, pp. 1-38.

Slide 29

Slide 29 text

50% of changes in a mock occur because the production code changed! Coupling is strong! Spadini, D, Aniche, M, Bruntink, M & Bacchelli, A 2018, 'Mock objects for testing java systems: Why and how developers use them, and how they evolve' Empirical Software Engineering, pp. 1-38.

Slide 30

Slide 30 text

Developers are aware of the trade-offs! • They mock databases, but then later write integration tests. • Added “complexity” in exchange for testability. • They understand how coupled they are when they use mocks. Davide Spadini, M. Finavaro Aniche, Magiel Bruntink, Alberto Bacchelli. To Mock or Not To Mock? An Empirical Study on Mocking Practices. MSR 2017. Mock Objects for Testing Java Systems: Why and How Developers Use Them, and How They Evolve. EMSE, 2018.

Slide 31

Slide 31 text

Mock as a way to explore the boundaries Freeman, Steve, and Nat Pryce. Growing object-oriented software, guided by tests. Pearson Education, 2009.

Slide 32

Slide 32 text

Design for Testability

Slide 33

Slide 33 text

Testability and Good Design Feathers, Michael. The Deep Synergy between Testability and Good Design. Talk: https://www.youtube.com/watch?v=4cVZvoFGJTU Well-designed classes are naturally more testable. Good (class) design is key!

Slide 34

Slide 34 text

Controllability and Observability
• Controllability determines the work it takes to set up and run test cases and the extent to which individual functions and features of the system under test (SUT) can be made to respond to test cases.
• Observability determines the work it takes to set up and run test cases and the extent to which the response of the system under test (SUT) to test cases can be verified.
• See Robert Binder’s post on testability: https://robertvbinder.com/software-testability-part-2-controllability-and-observability/

Slide 35

Slide 35 text

Hexagonal architecture (aka ports and adapters) Extracted from http://alistair.cockburn.us/Hexagonal+architecture

Slide 36

Slide 36 text

Harder than it seems!

Slide 37

Slide 37 text

Clear and well-defined interfaces Encapsulation plays a big role here!

Slide 38

Slide 38 text

Dependency Inversion • Make dependencies injectable.
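A minimal before/after sketch of the idea, using made-up names: the first class builds its own dependency, so a unit test cannot keep the real database out; the second receives the dependency through its constructor, so a test can hand it a mock instead.

```java
import java.util.List;

// Hypothetical DAO; in production it would talk to a real database.
class InvoiceDao {
    List<Double> allValues() {
        return List.of();   // imagine a real query here
    }
}

// Hard to unit test: the dependency is hard-wired inside the class.
class HardToTestTotalCalculator {
    private final InvoiceDao dao = new InvoiceDao();

    double total() {
        return dao.allValues().stream().mapToDouble(Double::doubleValue).sum();
    }
}

// Testable: the dependency is injected, so a test can pass any implementation,
// including a mock that simulates the database.
class TotalCalculator {
    private final InvoiceDao dao;

    TotalCalculator(InvoiceDao dao) {
        this.dao = dao;
    }

    double total() {
        return dao.allValues().stream().mapToDouble(Double::doubleValue).sum();
    }
}
```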

Slide 39

Slide 39 text

Tell! Don’t Ask

Slide 40

Slide 40 text

Encapsulation
• Behavior should be in the right place; otherwise, the test gets too complicated (the Indirect Testing smell).
• Law of Demeter: avoid a.getB().getC().getD().doSomething(); such chains make the test much harder to arrange, i.e., you have to arrange state in A, B, C, and D.
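A small illustration with made-up Customer and Address classes: the getter chain couples the caller (and its tests) to the whole object graph, while pushing the behavior into Customer leaves one public method to read, call, and test.

```java
// Hypothetical domain classes.
class Address {
    private final String country;

    Address(String country) { this.country = country; }

    String getCountry() { return country; }
}

class Customer {
    private final Address address;

    Customer(Address address) { this.address = address; }

    Address getAddress() { return address; }

    // "Tell, don't ask": the behavior sits next to the data it needs.
    boolean livesIn(String country) {
        return address.getCountry().equals(country);
    }
}

class ShippingRules {
    // Train wreck: the caller digs through the object graph, and any change
    // to Customer's internal structure breaks this code and its tests.
    boolean shipsAbroadTrainWreck(Customer customer) {
        return !customer.getAddress().getCountry().equals("NL");
    }

    // Encapsulated version: only Customer's public behavior matters here.
    boolean shipsAbroad(Customer customer) {
        return !customer.livesIn("NL");
    }
}
```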

Slide 41

Slide 41 text

Should I test a private method? • Probably not. • You should test the class from its public interface. • If you want to test the private method in an isolated way, that’s the test telling you that your code is not cohesive.

Slide 42

Slide 42 text

Static methods
• Now that you know mocks, what can we do with static methods?
• Static methods cannot (easily) be mocked by standard mocking frameworks.
• Therefore, AVOID THEM.

Slide 43

Slide 43 text

Don’t be afraid of creating layers
• How can I test date/time related things?
• How can I test environment-dependent things?
• Create a layer on top of such APIs; these layers are easily mockable.
• The Restfulie.NET case: https://github.com/mauricioaniche/restfulie.net/blob/master/Restfulie.Server/Marshalling/UrlGenerators/AspNetMvcUrlGenerator.cs

Slide 44

Slide 44 text

Let’s code
• The Clock abstraction example.
• https://gist.github.com/mauricioaniche/312249471a71dddeeb74991774529b7b

Slide 45

Slide 45 text

Solution • We added a layer on top of Calendar. • Again, do not be afraid of adding layers to facilitate testability. • Solution: https://gist.github.com/mauricioaniche/2200d8c1dcab41e7c4dddcc46211f4c0
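A sketch of the resulting design (the actual solution is in the gist above). The exercise wraps java.util.Calendar; this sketch uses java.time for brevity, and the ChristmasDiscount class with its 15% discount is a made-up consumer of the new layer:

```java
import java.time.LocalDate;

// The thin layer: our own Clock abstraction on top of the date/time API.
interface Clock {
    LocalDate today();
}

// Production implementation: the only place that touches the real system time.
class SystemClock implements Clock {
    @Override
    public LocalDate today() {
        return LocalDate.now();
    }
}

// A business rule that depends on "today" only through the injected Clock.
class ChristmasDiscount {
    private final Clock clock;

    ChristmasDiscount(Clock clock) {
        this.clock = clock;
    }

    double apply(double amount) {
        LocalDate today = clock.today();
        boolean christmas = today.getMonthValue() == 12 && today.getDayOfMonth() == 25;
        return christmas ? amount * 0.85 : amount;   // illustrative 15% discount
    }
}

// In a test, the clock becomes trivially controllable, e.g.:
//   ChristmasDiscount discount = new ChristmasDiscount(() -> LocalDate.of(2019, 12, 25));
//   assertEquals(85.0, discount.apply(100.0), 0.001);
```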

Slide 46

Slide 46 text

http://misko.hevery.com/attachments/Guide-Writing%20Testable%20Code.pdf

Slide 47

Slide 47 text

Some of the flaws listed in the Testability Guide, by Misko Hevery
• Flaw: Constructor does Real Work
• Flaw: Digging into Collaborators
• Flaw: Brittle Global State & Singletons
• Flaw: Class Does Too Much
• His tool, Testability Explorer (https://github.com/mhevery/testability-explorer), never became a hit. But the idea is awesome!

Slide 48

Slide 48 text

Test code smells

Slide 49

Slide 49 text

Many things can hamper comprehensibility: the test itself, and the production code under test!

Slide 50

Slide 50 text

Arrange-Act-Assert (AAA)
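A short illustration of the structure, with a made-up ShoppingCart class: each test reads as three blocks — arrange, act, assert.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

// Hypothetical class under test.
class ShoppingCart {
    private final List<Double> prices = new ArrayList<>();

    void add(String product, double price) {
        prices.add(price);
    }

    double total() {
        return prices.stream().mapToDouble(Double::doubleValue).sum();
    }
}

class ShoppingCartTest {

    @Test
    void totalSumsTheItemPrices() {
        // Arrange: set up the object under test and its inputs.
        ShoppingCart cart = new ShoppingCart();
        cart.add("book", 25.0);
        cart.add("pen", 5.0);

        // Act: invoke the behavior being tested.
        double total = cart.total();

        // Assert: verify the outcome.
        assertEquals(30.0, total, 0.001);
    }
}
```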

Slide 51

Slide 51 text

What are the ugly things you have seen in a test code?

Slide 52

Slide 52 text

The FIRST principles

Slide 53

Slide 53 text

[F]IRST: Fast
• Tests should be fast (in practice, hard to draw a line between fast and slow).
• Minimize the amount of code that ultimately depends on slow things.

Slide 54

Slide 54 text

F[I]RST: Isolated
• Good unit tests focus on a small chunk of code to verify.
• Good unit tests also don’t depend on other unit tests.
• Be careful with shared resources.

Slide 55

Slide 55 text

FI[R]ST: Repeatable
• A repeatable test is one that produces the same results each time you run it.
• To achieve that, tests must be well isolated.

Slide 56

Slide 56 text

FIR[S]T: Self-validating
• Tests aren’t tests unless they assert that things went as expected.
• Manually verifying the results of tests is a time-consuming process.
• They must also be self-arranging.

Slide 57

Slide 57 text

FIRS[T]: Timely
• Unit testing should become a habit. TEST INFECTED!
• In practice, chances are low that you’ll find the time to come back and write tests.

Slide 58

Slide 58 text

Kent Beck’s test desiderata
• Isolated — tests should return the same results regardless of the order in which they are run.
• Composable — if tests are isolated, then I can run 1 or 10 or 100 or 1,000,000 and get the same results.
• Fast — tests should run quickly.
• Inspiring — passing the tests should inspire confidence.
• Writable — tests should be cheap to write relative to the cost of the code being tested.
• Readable — tests should be comprehensible for readers, invoking the motivation for writing this particular test.
• Behavioural — tests should be sensitive to changes in the behavior of the code under test. If the behavior changes, the test result should change.
• Structure-insensitive — tests should not change their result if the structure of the code changes.
• Automated — tests should run without human intervention.
• Specific — if a test fails, the cause of the failure should be obvious.
• Deterministic — if nothing changes, the test result shouldn’t change.
• Predictive — if the tests all pass, then the code under test should be suitable for production.
https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

Pattern format • Description. • Possible negative effects. • Solution.

Slide 61

Slide 61 text

Test code duplication
• The same test code is repeated many times. It can also happen within a single test, which may contain repeated groups of similar statements.
• Brings all the issues that copy-and-paste brings to production code.
• Solution: extract common behavior to a method or a class (see the sketch below).
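A small sketch of the solution, with a made-up Invoice class: the creation code that was repeated in every test is extracted into one helper method (a class-level fixture or a shared test utility class works the same way).

```java
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

// Hypothetical class under test.
class Invoice {
    private final double value;

    Invoice(double value) { this.value = value; }

    boolean isLowValue() { return value < 100.0; }
}

class InvoiceTest {

    @Test
    void lowValueInvoice() {
        assertTrue(invoiceWorth(50.0).isLowValue());
    }

    @Test
    void highValueInvoice() {
        assertFalse(invoiceWorth(150.0).isLowValue());
    }

    // The creation code used to be duplicated in every test; extracting it
    // removes the duplication and keeps each test focused on its scenario.
    private Invoice invoiceWorth(double value) {
        return new Invoice(value);
    }
}
```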

Slide 62

Slide 62 text

Test logic in production
• The code that is put into production contains logic that should be exercised only during tests.
• The test code may actually execute in production, leading to bad behavior.
• Solution: remove the test logic; make the dependency injectable and use a mock during the test.

Slide 63

Slide 63 text

Erratic / Flaky tests • One or more tests are behaving erratically; sometimes they pass and sometimes they fail. Aka: Flaky tests. • Less confidence, hard to know whether the test is failing because of a real bug. • Fresh fixture, check infrastructure-specific dependencies (e.g. clock).

Slide 64

Slide 64 text

Obscure tests
• It is difficult to understand the test at a glance.
• May be caused by general fixtures, assertion roulettes, irrelevant information, etc.
• Makes maintenance difficult and increases costs.
• Solution: refactor according to the root cause.

Slide 65

Slide 65 text

Possible solution: Test Data Builder • If you have a complex entity that should be part of a test, instantiating it can obscure the test. • Extract the creation of the object to a dedicated class. • Builder pattern [GoF] • In test world: Test Data Builder. • Recommended reading: http://www.natpryce.com/articles/000714.html.
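A sketch of the pattern, with a made-up Invoice class and arbitrary defaults: the builder hides the noisy constructor and lets each test state only the detail it cares about.

```java
import java.time.LocalDate;

// Hypothetical domain class whose constructor obscures tests.
class Invoice {
    final String customer;
    final String country;
    final double value;
    final LocalDate dueDate;

    Invoice(String customer, String country, double value, LocalDate dueDate) {
        this.customer = customer;
        this.country = country;
        this.value = value;
        this.dueDate = dueDate;
    }
}

// Test Data Builder: sensible defaults plus fluent overrides.
class InvoiceBuilder {
    private String customer = "Mauricio";
    private String country = "NL";
    private double value = 500.0;
    private LocalDate dueDate = LocalDate.of(2030, 1, 1);

    InvoiceBuilder withValue(double value) {
        this.value = value;
        return this;
    }

    InvoiceBuilder overdue() {
        this.dueDate = LocalDate.of(2000, 1, 1);
        return this;
    }

    Invoice build() {
        return new Invoice(customer, country, value, dueDate);
    }
}

// In a test, only the relevant detail stands out:
//   Invoice invoice = new InvoiceBuilder().withValue(50.0).build();
```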

Slide 66

Slide 66 text

Possible Solution: Page Objects
• In system testing, navigating through complex web pages might make your test complex.
• Solution: abstractions that represent the different “pages” of your system.
• Read:
  • https://www.martinfowler.com/bliki/PageObject.html
  • https://github.com/SeleniumHQ/selenium/wiki/PageObjects
• The Screenplay pattern (@wakaleo suggested that I talk about it):
  • https://www.infoq.com/articles/Beyond-Page-Objects-Test-Automation-Serenity-Screenplay/
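A sketch of a Page Object with Selenium WebDriver; the page names and element ids are made up:

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

// One class per page, exposing user-level operations instead of raw element lookups.
class LoginPage {
    private final WebDriver driver;

    LoginPage(WebDriver driver) {
        this.driver = driver;
    }

    HomePage loginAs(String user, String password) {
        driver.findElement(By.id("username")).sendKeys(user);
        driver.findElement(By.id("password")).sendKeys(password);
        driver.findElement(By.id("login-button")).click();
        return new HomePage(driver);   // navigation returns the next page object
    }
}

class HomePage {
    private final WebDriver driver;

    HomePage(WebDriver driver) {
        this.driver = driver;
    }

    String welcomeMessage() {
        return driver.findElement(By.id("welcome")).getText();
    }
}

// A system test then reads at the level of the user journey:
//   String message = new LoginPage(driver).loginAs("mauricio", "secret").welcomeMessage();
```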

Slide 67

Slide 67 text

Assertion Roulette • It is hard to tell which of several assertions within the same test method caused a test failure. • Harder to understand which assertion failed and why. • Break the test in two, make assertions clearer.

Slide 68

Slide 68 text

Clear assertions • Assertions are key in test code • Clear assertions are fundamental • The AssertJ project: https://joel-costigliola.github.io/assertj/
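For instance, with AssertJ (a sketch with made-up data):

```java
import static org.assertj.core.api.Assertions.assertThat;

import java.util.List;
import org.junit.jupiter.api.Test;

class ClearAssertionsTest {

    @Test
    void assertionsReadLikeASentence() {
        List<String> customers = List.of("Alice", "Bob");

        // Fluent, intention-revealing assertions; when one fails,
        // the message says exactly which expectation was not met.
        assertThat(customers)
                .hasSize(2)
                .contains("Alice")
                .doesNotContain("Eve");
    }
}
```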

Slide 69

Slide 69 text

Condition logic in test • A test contains code that may or may not be executed. • Any loop or if statement in a test deserves attention. • To assert a collection? • To initialize an infrastructure? • To make the test execution different per environment? • Harder to understand what a test actually does. • Break the test in two, make assertions clearer.

Slide 70

Slide 70 text

Slow tests
• The tests take too long to run, which reduces the developer’s productivity.
• Slow component -> mock it.
• Expensive fixture -> shared fixture.
• Too many tests -> isolate expensive tests from faster ones.

Slide 71

Slide 71 text

Mystery guest • The test reader is not able to see the cause and effect between fixture and verification logic because part of it is done outside the Test Method. • Hard to see the cause and effect relationship between the test fixture (the pre-conditions of the test) and the expected outcome of the test. • Fresh fixture, inline fixture.

Slide 72

Slide 72 text

Resource Optimism
• A test that depends on external resources has non-deterministic results depending on when/where it is run.
• Makes the test flaky.
• Fresh fixture
• Inline fixture

Slide 73

Slide 73 text

Test Run War • Test failures occur at random when several people are running tests simultaneously. • Makes the test flaky. • Hard to understand why it fails. • Fresh fixture • Inline fixture • Isolated/Independent test

Slide 74

Slide 74 text

General Fixture • Fixture is too general and different tests only access part of the fixture. • Harder to read and understand. • May make tests run more slowly (because they do unnecessary work). • Break the fixture into many.

Slide 75

Slide 75 text

Lazy Test • Several test methods check the same method using the same fixture. • Too many tests, harder to understand the full tested behavior. • Join the tests.

Slide 76

Slide 76 text

Indirect Testing • Test class contains methods that actually perform tests on other objects. • Understanding and debugging is harder. • Isolate your units.

Slide 77

Slide 77 text

Sensitive Equality • The assertion relies on many irrelevant details. • Test may fail for reasons other than a bug. • Implement an equals() method.
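A small illustration with a made-up Money class: comparing toString() output couples the test to irrelevant formatting details, while a proper equals() compares only the state that matters.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.Objects;
import org.junit.jupiter.api.Test;

// Hypothetical value object with equals() defined on the relevant fields.
class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Money m
                && m.cents == cents
                && m.currency.equals(currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(currency, cents);
    }

    @Override
    public String toString() {
        return currency + " " + (cents / 100.0);   // formatting may change at any time
    }
}

class MoneyTest {

    @Test
    void comparesOnlyTheRelevantState() {
        Money price = new Money("EUR", 1000);

        // Sensitive equality would be: assertEquals("EUR 10.0", price.toString());
        // it breaks whenever the formatting changes, not only when there is a bug.

        assertEquals(new Money("EUR", 1000), price);   // robust: relies on equals()
    }
}
```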

Slide 78

Slide 78 text

Gets worse over time? Yes: probably Resource Leakage. No: probably a Non-Deterministic Test.
Happens when the test is run alone? Yes: probably a Lonely Test. No: probably Interacting Tests.

Slide 79

Slide 79 text

You want more smells?

Slide 80

Slide 80 text

Test-Driven Development in Practice Maurício Aniche

Slide 81

Slide 81 text

Let’s try!
• Roman Numerals
• Receives a string, converts it to an integer:
  • "I" -> 1
  • "III" -> 3
  • "VI" -> 6
  • "IV" -> 4
  • "XVI" -> 16
  • "XIV" -> 14
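A sketch of where the exercise typically ends up; during the live session the code grows one failing test at a time (baby steps), but the final shape is roughly:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class RomanNumeralTest {

    @Test
    void singleSymbol() {
        assertEquals(1, new RomanNumeral().convert("I"));
    }

    @Test
    void repeatedSymbols() {
        assertEquals(3, new RomanNumeral().convert("III"));
    }

    @Test
    void subtractiveNotation() {
        assertEquals(4, new RomanNumeral().convert("IV"));
        assertEquals(14, new RomanNumeral().convert("XIV"));
    }
}

// One possible implementation the cycle converges to: walk the string and
// subtract a symbol's value whenever a larger symbol follows it.
class RomanNumeral {
    private static final java.util.Map<Character, Integer> VALUES =
            java.util.Map.of('I', 1, 'V', 5, 'X', 10, 'L', 50, 'C', 100, 'D', 500, 'M', 1000);

    int convert(String roman) {
        int total = 0;
        for (int i = 0; i < roman.length(); i++) {
            int value = VALUES.get(roman.charAt(i));
            boolean smallerThanNext = i + 1 < roman.length()
                    && value < VALUES.get(roman.charAt(i + 1));
            total += smallerThanNext ? -value : value;
        }
        return total;
    }
}
```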

Slide 82

Slide 82 text

Are you happy with this code?

Slide 83

Slide 83 text

Baby steps
• Simplicity: we should write the simplest implementation that solves the problem, start with the simplest possible test, and so on.
• Do not confuse being simple with being naive.
• Kent Beck states in his book: “Do these steps seem too small to you? Remember, TDD is not about taking teensy tiny steps, it’s about being able to take teensy tiny steps. Would I code day-to-day with steps this small? No. But when things get the least bit weird, I’m glad I can.”

Slide 84

Slide 84 text

Refactor!
• Often, we are so busy making the test pass that we forget about writing good code.
• After the test is green, you can refactor.
• The good thing is that, after the refactoring, the tests should still be green.
• Refactoring can be low-level or high-level.
  • Low-level: rename variables, extract methods.
  • High-level: change the class design, class contracts.

Slide 85

Slide 85 text

Let’s do some refactoring and continue!

Slide 86

Slide 86 text

The TDD cycle: write a failing test, make it pass, refactor, and repeat.

Slide 87

Slide 87 text

The TDD cycle: write a failing test, make it pass, refactor, and repeat. What are the advantages?

Slide 88

Slide 88 text

Focus on the requirements
• Starting by the test means starting by the requirements.
• It makes us think more about what we expect from the class and how the class should behave in specific cases.
• We do not write “useless code”. Go to your codebase right now: how much code have you written that is never used in the real world?

Slide 89

Slide 89 text

Controlling your pace
• Having a failing test gives us a clear focus: make the test pass.
• I can write whatever test I want:
  • If I feel insecure, I can write a simpler test.
  • If I feel safe, I can write a more complicated test.
  • If there’s something I do not understand, I can take a tiny baby step.
  • If I understand completely, I can take a larger step.

Slide 90

Slide 90 text

Test from the requirements
• Starting from the test means starting from the requirements: your tests derive from the requirements, not from existing code.
• If you follow the idea of always having tests, you do not need to test afterwards. Your code is already tested!

Slide 91

Slide 91 text

It’s our first client!
• The test code is the first client of the class you are constructing. Use it to your advantage.
• What can you get from the client?
  • Is it hard to make use of your class?
  • Is it hard to build the class?
  • Is it hard to set up the class for use (preconditions)?
  • Does the class return what I want?

Slide 92

Slide 92 text

Testable code
• TDD makes you think about tests from the beginning.
• This means you will be forced to write testable classes.
• We discussed it before: a testable class is also an easy-to-use class.
• Some people refer to TDD as Test-Driven Design.

Slide 93

Slide 93 text

Tests as a draft • Changing your class design is cheaper when done at the beginning. • Use your tests as a draft: play with the class; if you don’t like the class design, change it. • Remember: the test is your first client.

Slide 94

Slide 94 text

Controllability • Tests make you think about managing dependencies from the beginning. • If your class depends on too many classes, testing gets harder. • You should refactor.

Slide 95

Slide 95 text

Listen to your test • The test may reveal design problems. • You should ”listen to it”. • Too many tests? • Maybe your class does too much. • Too many mocks? • Maybe your class is too coupled. • Complex set up before calling the desired behavior? • Maybe rethink the pre-conditions.

Slide 96

Slide 96 text

TDD as a design tool • Let’s do it together! • Remember the InvoiceFilter example?

Slide 97

Slide 97 text

Is it really effective?
• 50% more tests, less time debugging [5].
• 40-50% fewer defects, no impact on productivity [6].
• 40-50% fewer defects in Microsoft and IBM products [12].
• Better use of OOP concepts [13].
• More cohesive, less coupled [15].
References:
• Janzen, D. Software Architecture Improvement through Test-Driven Development. Conference on Object Oriented Programming Systems Languages and Applications, ACM, 2005.
• Maximilien, E. M., and L. Williams. Assessing test-driven development at IBM. IEEE 25th International Conference on Software Engineering, Portland, Oregon, USA, IEEE Computer Society, 2003.
• Nagappan, N., Bhat, T. Evaluating the efficacy of test-driven development: industrial case studies. Proceedings of the 2006 ACM/IEEE International Symposium on Empirical Software Engineering.
• Janzen, D., Saiedian, H. On the Influence of Test-Driven Development on Software Design. Proceedings of the 19th Conference on Software Engineering Education & Training (CSEET’06).
• Steinberg, D. H. The Effect of Unit Tests on Entry Points, Coupling and Cohesion in an Introductory Java Programming Course. XP Universe, Raleigh, North Carolina, USA, 2001.

Slide 98

Slide 98 text

Is it?
• No difference in code quality [Erdogmus et al., Müller et al.].
• Siniaalto and Abrahamsson: the differences in the program code between TDD and iterative test-last development were not as clear as expected.
References:
• Erdogmus, H., Morisio, M., et al. On the effectiveness of the test-first approach to programming. IEEE Transactions on Software Engineering 31(3): 226-237, 2005.
• Müller, M. M., Hagner, O. Experiment about test-first programming. IEE Proceedings 149(5): 131-136, 2002.
• Siniaalto, Maria, and Pekka Abrahamsson. “Does test-driven development improve the program code? Alarming results from a comparative case study.” Balancing Agility and Formalism in Software Engineering. Springer Berlin Heidelberg, 2008. 143-156.

Slide 99

Slide 99 text

Is it?
• “The practice of test-driven development does not directly drive the design; rather, the safe space to think, the opportunity to refactor constantly, and the subtle feedback given by unit tests are responsible for improving the class design.”
• “The claimed benefits of TDD may not be due to its distinctive test-first dynamic, but rather due to the fact that TDD-like processes encourage fine-grained, steady steps that improve focus and flow.”
References:
• Aniche, M., & Gerosa, M. A. (2015). Does test-driven development improve class design? A qualitative study on developers’ perceptions. Journal of the Brazilian Computer Society, 21(1), 15.
• Fucci, D., Erdogmus, H., Turhan, B., Oivo, M., & Juristo, N. (2016). A Dissection of Test-Driven Development: Does It Really Matter to Test-First or to Test-Last? IEEE Transactions on Software Engineering.

Slide 100

Slide 100 text

Practical advice on TDD • Keep a ”test list”. • Refactor both production and test code. • Always see the test failing. • Stop and think.

Slide 101

Slide 101 text

TDD 100% of the time?
• No silver bullet! :)
• Maurício: I do not use TDD 100% of the time. I let my experience tell me when I need it.
• However, I always write tests, and I never spend too much time working only on production code.

Slide 102

Slide 102 text

Test code engineering Maurício Aniche M.FinavaroAniche@tudelft.nl @mauricioaniche