Produktive Tests (German)

Martin Fowler's test pyramid advises us to write as few integration tests as possible, since they are slow and expensive. J.B. Rainsberger even calls them (self-)deception. So are integration tests "evil" and unit tests "good"? More recent voices claim the opposite: unit tests are a pure waste of time and money and hinder us more than they help us. Should we therefore write no unit tests at all, and all the more integration tests instead? The answer, as so often, is: it depends. Using criteria (test desiderata) that Kent Beck recently listed on his blog, we will look at tests from the open-source projects JUnit and Gradle and ask: Which criteria do they meet, and which not? When do unit tests help us? Are integration tests really always slow and unreliable? How do you write good integration tests? And how do you keep them stable over time?

Marc Philipp

January 15, 2020

Transcript

  1. PRODUCTIVE TESTS
     Marc Philipp (photo: chuttersnap)

  2. ABOUT ME

  3. THE CASE AGAINST INTEGRATION TESTS
     (photo: Lanju Fotografie)

  4. INTEGRATED TESTS ARE A SCAM
     Famous article/talk by J.B. Rainsberger:
     "Integrated tests are a scam—a self-replicating virus that threatens to
     infect your code base, your project, and your team with endless pain
     and suffering."
     https://blog.thecodewhisperer.com/permalink/integrated-tests-are-a-scam

  5. INTEGRATED VS. INTEGRATION
     "I use the term integrated test to mean any test whose result (pass or
     fail) depends on the correctness of the implementation of more than one
     'piece of non-trivial behavior'."

  6. WHY INTEGRATED TESTS ARE BAD
     Slow, brittle, flaky
     You can never cover all branches
     Waste of time and money

  7. TEST PYRAMID (2012)
     https://martinfowler.com/bliki/TestPyramid.html

  8. UNIT TEST = GOOD?
     From DefaultLauncherTests.java in JUnit 5:

     @Test
     void launcherCanExecuteTestPlan() {
         TestEngine engine = mock(TestEngine.class);
         when(engine.getId()).thenReturn("some-engine");
         when(engine.discover(any(), any())).thenAnswer(invocation -> {
             UniqueId uniqueId = invocation.getArgument(1);
             return new EngineDescriptor(uniqueId, uniqueId.toString());
         });

         var launcher = createLauncher(engine);
         TestPlan testPlan = launcher.discover(request().build());
         verify(engine, times(1)).discover(any(), any());

         launcher.execute(testPlan);
         verify(engine, times(1)).execute(any());
     }
  9. INTEGRATION TEST = BAD?

     static class TestCase {
         @ParameterizedTest
         @CsvSource({ "foo", "bar" })
         void testWithCsvSource(String argument) {
             fail(argument);
         }
     }

     @Test
     void executesWithCsvSource() {
         var results = execute("testWithCsvSource", String.class);

         results.testEvents().assertThatEvents()
             .haveExactly(1, event(displayName("[1] argument=foo"),
                 finishedWithFailure(message("foo"))))
             .haveExactly(1, event(displayName("[2] argument=bar"),
                 finishedWithFailure(message("bar"))));
     }

  10. THE CASE AGAINST UNIT TESTS
     (photo: Greg Rakozy)

  11. "MOST UNIT TESTING IS WASTE"
     "low (even potentially negative) payoff"
     "increase maintenance liabilities because they are less resilient
     against code changes"
     https://rbcs-us.com/documents/Why-Most-Unit-Testing-is-Waste.pdf
     https://blog.usejournal.com/lean-testing-or-why-unit-tests-are-worse-than-you-think-b6500139a009
     http://250bpm.com/blog:40

  12. "UNIT TESTS PASS – NO INTEGRATION TESTS" MEMES

  13. CONTROVERSIAL OPINIONS?
     "Write tests. Not too many. Mostly integration."
     — Guillermo Rauch (@rauchg), December 10, 2016

  14. TEST PYRAMID, THE FINE PRINT
     Footnote 2: "The pyramid is based on the assumption that broad-stack
     tests are expensive, slow, and brittle compared to more focused tests,
     such as unit tests. While this is usually true, there are exceptions.
     If my high level tests are fast, reliable, and cheap to modify - then
     lower-level tests aren't needed."
     https://martinfowler.com/bliki/TestPyramid.html

  15. THE TESTING TROPHY
     https://twitter.com/kentcdodds/status/960723172591992832

  16. THE TESTING TROPHY
     "as you move up the pyramid, the confidence quotient of each form of
     testing increases"
     "our tools have moved beyond the assumption in Martin's original
     Testing Pyramid concept"
     https://kentcdodds.com/blog/write-tests

  17. MICROSERVICES TESTING HONEYCOMB
     https://labs.spotify.com/2018/01/11/testing-of-microservices/

  18. MICROSERVICES TESTING HONEYCOMB
     "Integrated Test" = a test that will pass or fail based on the
     correctness of another system.
     "Integration Test" = verify the correctness of our service in a more
     isolated fashion while focusing on the interaction points and making
     them very explicit.
     "Implementation Detail Test" = "unit test"
     https://labs.spotify.com/2018/01/11/testing-of-microservices/

  19. INTEGRATION TESTS ARE BAD! UNIT TESTS ARE BAD!
     ⇒ ALL TESTS ARE A SCAM!?


  20. TEST DESIDERATA
     (photo: Dawid Zawiła)

  21. TEST DESIDERATA
     Recent blog post by Kent Beck,
     video series (with Kent Beck and Kelly Sutton),
     12 desirable properties of tests.
     https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3
     https://www.youtube.com/playlist?list=PLlmVY7qtgT_lkbrk9iZNizp978mVzpBKl

  22. 12(!) PROPERTIES?
     Not all tests need to exhibit all properties. However, no property
     should be given up without receiving a property of greater value in
     return.

  23. ISOLATED
     Tests should return the same results regardless of the order in which
     they are run.

  24. COMPOSABLE
     If tests are isolated, then I can run 1 or 10 or 100 or 1,000,000 and
     get the same results.
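The two properties above can be illustrated with a tiny hypothetical example: two checks that share mutable static state give different results depending on the order in which they run, so they are neither isolated nor composable. All names here are invented for illustration.

```java
class IsolationExample {
    // Shared mutable state is the classic way tests lose isolation.
    static int counter = 0;

    // Hypothetical "tests": each returns whether its expectation held.
    static boolean testA() { counter++; return counter == 1; }
    static boolean testB() { return counter == 0; }

    public static void main(String[] args) {
        // Running A then B: B observes the state A left behind.
        System.out.println(testA()); // passes
        System.out.println(testB()); // fails only because A ran first
    }
}
```

Run B before A and both pass; run A before B and B fails. A result that depends on order also changes when you compose the suite differently.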

  25. FAST
     Tests should run quickly.

  26. INSPIRING
     Passing the tests should inspire confidence.

  27. WRITABLE
     Tests should be cheap to write relative to the cost of the code being
     tested.

  28. READABLE
     Tests should be comprehensible for the reader, invoking the motivation
     for writing this particular test.

  29. BEHAVIORAL
     Tests should be sensitive to changes in the behavior of the code under
     test. If the behavior changes, the test result should change.

  30. STRUCTURE-INSENSITIVE
     Tests should not change their result if the structure of the code
     changes.

  31. AUTOMATED
     Tests should run without human intervention.

  32. SPECIFIC
     If a test fails, the cause of the failure should be obvious.

  33. DETERMINISTIC
     If nothing changes, the test result shouldn't change.

  34. PREDICTIVE
     If the tests all pass, then the code under test should be suitable for
     production.

  35. POSSIBLE RATING (CONTEXT DEPENDENT!)


  36. SO WHAT?
     Look at the last test you wrote.
     Which properties does it have? Which does it lack?
     Is that the tradeoff you want to make?

  37. CASE STUDY: JUNIT
     (photo: Thomas Lambert)

  38. TEST MIX IN JUNIT
     Lots of unit tests
     Lots of integration tests
     A few end-to-end ("integrated") tests

  39. A SIMPLE UNIT TEST
     Structure-insensitive ↑, Inspiring ↑, Writable ↑, Fast ↑, …

     @Test
     void assertSameWithSameObject() {
         Object foo = new Object();
         assertSame(foo, foo);
         assertSame(foo, foo, "message");
         assertSame(foo, foo, () -> "message");
     }

  40. A UNIT TEST WITH MOCKS
     Structure-insensitive ↑, Inspiring →, Writable →, Fast ↑, …

     @Test
     void launcherCanExecuteTestPlan() {
         TestEngine engine = mock(TestEngine.class);
         when(engine.getId()).thenReturn("some-engine");
         when(engine.discover(any(), any())).thenAnswer(invocation -> {
             UniqueId uniqueId = invocation.getArgument(1);
             return new EngineDescriptor(uniqueId, uniqueId.toString());
         });

         var launcher = createLauncher(engine);
         TestPlan testPlan = launcher.discover(request().build());
         verify(engine, times(1)).discover(any(), any());

         launcher.execute(testPlan);
         verify(engine, times(1)).execute(any());
     }

  41. ANOTHER UNIT TEST
     Structure-insensitive ↓, Inspiring →, Writable ↑, Fast ↑, …

     @Test
     void providesMultipleArguments() {
         CsvSource annotation = csvSource("foo", "bar");

         Stream<Object[]> arguments = provideArguments(annotation);

         assertThat(arguments)
             .containsExactly(array("foo"), array("bar"));
     }

  42. A TYPICAL INTEGRATION TEST
     Structure-insensitive ↑, Inspiring ↑, Writable →, Fast ↑, …

     static class TestCase {
         @ParameterizedTest
         @CsvSource({ "foo", "bar" })
         void testWithCsvSource(String argument) {
             fail(argument);
         }
     }

     @Test
     void executesWithCsvSource() {
         var results = execute("testWithCsvSource", String.class);

         results.testEvents().assertThatEvents()
             .haveExactly(1, event(displayName("[1] argument=foo"),
                 finishedWithFailure(message("foo"))))
             .haveExactly(1, event(displayName("[2] argument=bar"),
                 finishedWithFailure(message("bar"))));
     }

  43. END-TO-END TESTS
     Structure-insensitive ↑, Predictive ↑, Writable ↓, Fast ↓, …

     @Test
     void gradle_wrapper() {
         var result = Request.builder()
             .setTool(new GradleWrapper(Paths.get("..")))
             .setProject("gradle-starter")
             .addArguments("build", "--no-daemon", "--debug", "--s
             .setTimeout(Duration.ofMinutes(2))
             .setJavaHome(Helper.getJavaHome("8").orElseThrow(TestA
             .build()
             .run();

         assumeFalse(result.isTimedOut(), () -> "tool timed out: " + r
         assertEquals(0, result.getExitCode());
         assertTrue(result.getOutputLines("out").stream()
             .anyMatch(line -> line.contains("BUILD SUCCESSFUL")));
         assertThat(result.getOutput("out")).contains("Using Java vers
     }

  44. CASE STUDY: GRADLE
     (photo: Jennifer Latuperisa-Andresen)

  45. TEST MIX IN GRADLE
     Mostly integration/integrated tests (~50%)
     Some unit tests (~35%)
     Cross-version tests
     Performance tests

  46. A BAD UNIT TEST
     Readable ↓, Structure-insensitive ↓, Inspiring ↓, Fast ↑

     def "init task creates project with all defaults"() {
         given:
         def projectLayoutRegistry = Mock(ProjectLayoutSetupRegistry.class)
         def buildConverter = Mock(BuildConverter.class)
         projectLayoutRegistry.buildConverter >> buildConverter
         buildConverter.canApplyToCurrentDirectory() >> false
         projectLayoutRegistry.default >> projectSetupDescriptor
         projectLayoutRegistry.getLanguagesFor(ComponentType.BASIC) >> [Language.NONE
         projectLayoutRegistry.get(ComponentType.BASIC, Language.NONE) >> projectSetu
         def projectSetupDescriptor = Mock(BuildInitializer.class)
         projectSetupDescriptor.componentType >> ComponentType.BASIC
         projectSetupDescriptor.dsls >> [GROOVY]
         projectSetupDescriptor.defaultDsl >> GROOVY
         projectSetupDescriptor.testFrameworks >> [NONE]
         projectSetupDescriptor.defaultTestFramework >> NONE
         projectSetupDescriptor.furtherReading >> Optional.empty()

         when:
         def init = TestUtil.create(testDir).task(InitBuild)
         init.setupProjectLayout()

  47. A GOOD INTEGRATION TEST
     Readable ↑, Structure-insensitive ↑, Inspiring ↑, Fast →

     def targetDir = testDirectory.createDir("some-thing")

     def "creates valid sample sources if no sources are present"() {
         when:
         executer.inDirectory(targetDir)
             .succeeds('init', '--type', 'groovy-library')

         then:
         targetDir.file("src/main/groovy")
             .assertHasDescendants("some/thing/Library.groovy")
         targetDir.file("src/test/groovy")
             .assertHasDescendants("some/thing/LibraryTest.groovy")

         when:
         run("build")

         then:
         assertTestPassed("some.thing.LibraryTest", "someLibraryMethod returns true")
     }

  48. TEST FIXTURES
     In order to write such readable integration tests, a lot of supporting
     code has to be in place.

  49. CHALLENGES
     Speed: critically important for feedback loop and CI build times
     Flakiness: the more parts involved, the higher the chance of "random"
     failures

  50. SPEED
     Two modes:
     embedded – runs Gradle in-process for fast local development and easy
     debugging
     forking – calls Gradle like it would be called by users
     CI only runs tests for projects affected by changes; other results are
     loaded from the build cache

  51. FLAKINESS
     Failing tests are automatically rerun on CI.
     If the second run passes, the build is marked as successful, and the
     GitHub project is checked whether the test is already known to be
     flaky; otherwise, an issue to fix the test is created.
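The rerun policy above can be sketched as a simple retry helper. This is only an illustration of the idea (all names are hypothetical; Gradle's actual CI tooling is considerably more involved):

```java
import java.util.concurrent.Callable;

class RetryExample {
    // Run a check up to maxAttempts times; a single pass counts as success,
    // mirroring the "rerun once, mark green if it passes" policy.
    static boolean passesWithRetry(Callable<Boolean> test, int maxAttempts) throws Exception {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (test.call()) {
                if (attempt > 1) {
                    // A pass after a failure is a strong hint the test is
                    // flaky: this is the point where an issue would be
                    // filed or a known-flaky annotation checked.
                    System.out.println("flaky: passed on attempt " + attempt);
                }
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated flaky test: fails on the first call, passes afterwards.
        Callable<Boolean> flaky = () -> ++calls[0] > 1;
        System.out.println(passesWithRetry(flaky, 2));
    }
}
```

Note the trade-off: blanket retries make a build deterministic at the cost of hiding genuinely nondeterministic tests, which is why the policy pairs the retry with issue tracking.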

  52. LESSONS LEARNED

  53. "INTEGRATION TESTS ARE TOO SLOW!"
     Profile them!
     Is it just the tests? Or is it actually your application?

  54. "WRITING INTEGRATION TESTS IS TOO HARD!"
     Use the right tools!
     WireMock / MockServer for HTTP calls
     Testcontainers for databases etc.
     …
     See Sandra Parsick's talk for details on all of the above and more.
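The stubbing idea behind tools like WireMock and MockServer can be shown with only the JDK's built-in HttpServer as a minimal stand-in. The /users/42 endpoint, the response body, and the class name are made up for this sketch; real WireMock additionally offers request matching, verification, and fault injection:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class HttpStubExample {
    public static void main(String[] args) throws Exception {
        // In-process HTTP server returning a canned response for a known
        // path, so the integration test never touches the real service.
        HttpServer stub = HttpServer.create(new InetSocketAddress(0), 0);
        stub.createContext("/users/42", exchange -> {
            byte[] body = "{\"id\":42,\"name\":\"Alice\"}".getBytes();
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        stub.start();
        try {
            int port = stub.getAddress().getPort();
            // The code under test would be configured to call this URL.
            HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(
                    URI.create("http://localhost:" + port + "/users/42")).build(),
                HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } finally {
            stub.stop(0);
        }
    }
}
```

Binding to port 0 lets the OS pick a free port, which keeps such tests isolated and composable when run in parallel.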

  55. "TESTING ASYNCHRONOUS CODE IS TOO HARD!"
     Don't ever use sleep!
     Use Spock's PollingConditions, CompletableFuture, or Awaitility.

  56. "DON'T MOCK THINGS YOU DON'T OWN!"
     Unless there's no other way.

  57. "DON'T MOCK THINGS!"
     Unless that's the best way to test it.

  58. "THE TEST PYRAMID IS ALWAYS RIGHT!"
     The test pyramid is not the right strategy for everyone.

  59. WHICH TESTS SHOULD I WRITE?
     Use criteria like Kent Beck's test desiderata to make a conscious
     decision about which tests help you in being productive.
     Be aware of the trade-offs!

  60. THANKS!
     Marc Philipp
     @marcphilipp on GitHub / Twitter
     mail@marcphilipp.de