The Folly of TDD - Speaker Deck

Slide 1

Slide 1 text

The Folly of TDD Tomer Gabel, WeWork Prague, 18-19 October 2018

Slide 2

Slide 2 text

The Folly of TDD Tomer Gabel, WeWork @ GeeCON Prague October 2018 Image: Wikimedia Commons (public domain) @tomerg

Slide 3

Slide 3 text

@tomerg Who am I? Tech malcontent GeeCON alumnus Engineer @ 5x Organizer

Slide 4

Slide 4 text

The Promise of TDD ~~ Act I ~~

Slide 5

Slide 5 text

@tomerg TDD, Revisited Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved to pass the new tests, only. This is opposed to software development that allows software to be added that is not proven to meet requirements. -- Wikipedia

Slide 6

Slide 6 text

@tomerg In a nutshell… Red Green Refactor

Slide 7

Slide 7 text

@tomerg In a nutshell… Write test Make test pass Clean up code
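
A minimal sketch of one pass through the cycle above, in Python. The function `slugify` and its behaviour are hypothetical, chosen only to illustrate the three steps; they do not come from the talk.

```python
# Step 1 (red): write the test first. Run before `slugify` exists,
# this test fails with a NameError.
def test_slugify_joins_words_with_hyphens():
    assert slugify("Red Green Refactor") == "red-green-refactor"


# Step 2 (green): the simplest code that makes the test pass.
def slugify_first_pass(title):
    return title.lower().replace(" ", "-")


# Step 3 (refactor): clean up the code while the test stays green.
def slugify(title: str) -> str:
    """Lower-case the title and join its words with hyphens."""
    words = title.lower().split()
    return "-".join(words)


# The test passes against the refactored implementation.
test_slugify_joins_words_with_hyphens()
```

Note that the refactored version also collapses repeated spaces, a behaviour no test asked for; strictly by the rules, that change would need its own failing test first.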

Slide 8

Slide 8 text

@tomerg The Benefits of TDD • Fewer bugs • Fewer regressions • Higher quality

Slide 9

Slide 9 text

@tomerg The Benefits of TDD • Fewer bugs • Fewer regressions • Higher quality • Higher velocity • Loose coupling

Slide 10

Slide 10 text

@tomerg The Benefits of TDD • Fewer bugs • Fewer regressions • Higher quality • Higher velocity • Loose coupling • Fast feedback • “Built-in” documentation • Less debugging • Emergent design • Promotes SRP • …

Slide 11

Slide 11 text

@tomerg The Benefits of TDD • Fewer bugs • Fewer regressions • Higher quality • Higher velocity • Loose coupling • Fast feedback • “Built-in” documentation • Less debugging • Emergent design • Promotes SRP • …

Slide 12

Slide 12 text

Slide 13

Slide 13 text

@tomerg O RLY?

Slide 14

Slide 14 text

@tomerg What do the experts say? There’s no consistent theme.

Slide 15

Slide 15 text

@tomerg WE CAN’T AGREE ON WHY WE PRACTICE TDD. Takeaway #1

Slide 16

Slide 16 text

@tomerg Axioms • But let’s move on • It’s somewhat accepted that TDD promises: – Reduced bug density – Fewer regressions – Better code quality • But how?

Slide 17

Slide 17 text

The Practice of TDD ~~ Act II ~~

Slide 18

Slide 18 text

@tomerg Uncle Bob’s Three Laws✝ 1. No production code without a failing unit test 2. Write only as much test code as is sufficient to fail 3. Write only as much production code as is sufficient to pass Image: Michael Kappel via Flickr (CC BY-NC 2.0) ✝ “Jim Coplien and Bob Martin Debate TDD” on YouTube

Slide 19

Slide 19 text

@tomerg Uncle Bob’s Three Laws✝ 1. No production code without a failing unit test 2. Write only as much test code as is sufficient to fail 3. Write only as much production code as is sufficient to pass ✝ “Jim Coplien and Bob Martin Debate TDD” on YouTube Test-first Rapid Iteration

Slide 20

Slide 20 text

@tomerg Uncle Bob’s Three Laws✝ 1. No production code without a failing unit test 2. Write only as much test code as is sufficient to fail 3. Write only as much production code as is sufficient to pass ✝ “Jim Coplien and Bob Martin Debate TDD” on YouTube Rapid Iteration Test-first

Slide 21

Slide 21 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious.

Slide 22

Slide 22 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. We know what “good” is

Slide 23

Slide 23 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. • Not always the case! • Qualitative/subjective: • Search results • Social feed • Recommendation We know what “good” is

Slide 24

Slide 24 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. We have a solution in mind

Slide 25

Slide 25 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. • TDD focuses on units • Test-first within known boundaries • Changing boundaries is expensive! We have a solution in mind

Slide 26

Slide 26 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. The solution isn’t affected by external actors or conditions

Slide 27

Slide 27 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. • Major headache with DBs, external systems • Woefully inadequate for e.g. web scrapers The solution isn’t affected by external actors or conditions

Slide 28

Slide 28 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. Can be adequately covered by a small set of tests

Slide 29

Slide 29 text

@tomerg Applied TDD • Test-first assumes the solution space is: – Known – Understood – Stable – Finite • This is often fallacious. Image: “Not Hotdog” application by Brown Hill Productions, LLC.

Slide 30

Slide 30 text

@tomerg TDD ISN’T UNIVERSALLY APPLICABLE. Takeaway #2

Slide 31

Slide 31 text

The Efficacy of TDD ~~ Act III ~~

Slide 32

Slide 32 text

@tomerg Axioms, Reprise Image: Michael Keen via Flickr (CC BY-NC-ND 2.0) • Back to our axioms • TDD promises: – Reduced bug density – Fewer regressions – Better code quality • Does it deliver?

Slide 33

Slide 33 text

@tomerg Fewer Bugs Image: Michael Bulcik via Wikimedia Commons (CC BY 2.5) • If all production code satisfies some test… – We write only necessary code – And it’s all tested – So it’s bug free? • Sounds great!

Slide 34

Slide 34 text

@tomerg Fewer Bugs … except: • Test code is still code “Dad, what’s a good test-to-production code size ratio?” “Anywhere from 1:1 to 5:1, son” Image: Wikimedia Commons (public domain)

Slide 35

Slide 35 text

@tomerg Fewer Bugs … except: • Test code is still code • Tests can be missing • Tests can be buggy • Tests can be wrong
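
To make the last point concrete, here is a small Python sketch of a test that passes while the code under test is wrong. The function `apply_discount` and its inputs are hypothetical, made up for illustration.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    # Bug: subtracts the percentage as if it were an absolute amount.
    return price_cents - percent


def correct_discount(price_cents: int, percent: int) -> int:
    # What the code was meant to do.
    return price_cents * (100 - percent) // 100


def test_ten_percent_off():
    # Wrong test: for a price of 100 the buggy and the correct formula
    # happen to agree, so the test is green and the bug stays hidden.
    assert apply_discount(100, 10) == 90


test_ten_percent_off()  # passes

# Any other price exposes the difference between the two formulas.
buggy_result = apply_discount(200, 10)      # 190
intended_result = correct_discount(200, 10)  # 180
```

The test suite is green, the production code is covered, and the result is still wrong for almost every input.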

Slide 36

Slide 36 text

@tomerg Fewer Regressions • Test coverage is great – Gives us a safety net • But it doesn’t cover: – Performance – Security – Integrations Image: Keating G. via Wikimedia Commons (IWM Non Commercial Licence)

Slide 37

Slide 37 text

@tomerg WHAT ABOUT QUALITY?

Slide 38

Slide 38 text

@tomerg We’ll Use Science • Research is a bit of a mixed bag – Positive anecdotal evidence “[TDD] can significantly reduce the defect density […] without significant productivity reduction” -- Nagappan, Bhat et al (“Microsoft study”) “[…] realized about 50% reduction in FVT defect density […] with minimal impact to developer productivity.” -- Maximilien, Williams (“IBM study”)

Slide 39

Slide 39 text

@tomerg We’ll Use Science • Research is a bit of a mixed bag – Positive anecdotal evidence – Inconclusive quantitative results “[…] TDD does not always produce highly cohesive code […] at least, when the TDD users are inexperienced developers” -- Siniaalto et al “Existing evidence is not sufficient and conclusions and results can be quite contradictory” -- Bulajic et al

Slide 40

Slide 40 text

@tomerg We’ll Use Science • Research is a bit of a mixed bag – Positive anecdotal evidence – Inconclusive quantitative results – Often citing reservations “Although this […] and other studies show favorable results when using TDD, developers must have the correct mindset when using TDD, which requires great discipline” -- Bulajic et al

Slide 41

Slide 41 text

@tomerg We’ll Use Science • Research is a bit of a mixed bag – Positive anecdotal evidence – Inconclusive quantitative results – Often citing reservations • Of course it is. – We can’t measure quality – We don’t have enough data anyway

Slide 42

Slide 42 text

@tomerg Higher Quality • A totally spurious argument • Not supported by evidence • We don’t agree on: – What “quality” is – How to measure it

Abbv.  Metric
WMC    Weighted Methods per Class
DIT    Depth of Inheritance Tree
NOC    Number of Children
CBO    Coupling Between Objects
RFC    Response for a Class
LCOM   Lack of Cohesion of Methods
-- Chidamber, Kemerer et al

Slide 43

Slide 43 text

@tomerg Code Coverage • While we’re on the subject… • Can we agree that 100% coverage is: 1. A good thing™ 2. But ROI is a problem? Source: Effects of Test-Driven Development: A Comparative Analysis of Empirical Studies, Mäkinen et al

Slide 44

Slide 44 text

@tomerg Code Coverage • So… 80% then • What does it signify? – 80% quality – 20% bugs – You’re a fan of Pareto • It’s meaningless! – So, aim for 100% then?

Slide 45

Slide 45 text

@tomerg Code Coverage • Ha! Fooled you • It was a trick question

Slide 46

Slide 46 text

@tomerg Code Coverage • Ha! Fooled you • It was a trick question

Slide 47

Slide 47 text

@tomerg Code Coverage • Ha! Fooled you • It was a trick question • 100% coverage… … exercises all paths … but not all states
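
A minimal Python sketch of the gap between paths and states. The function `average` is a hypothetical example, not taken from the talk: a single test executes every line and branch, yet one input state still crashes it.

```python
def average(values):
    # One straight-line path: any single call covers 100% of this code.
    return sum(values) / len(values)


def test_average():
    assert average([2, 4, 6]) == 4


test_average()  # green, with full line and branch coverage

# The empty list is a state the test never visited.
try:
    average([])
    crashed = False
except ZeroDivisionError:
    crashed = True
```

A coverage report cannot distinguish this suite from one that also checks the empty input, because both execute exactly the same lines.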

Slide 48

Slide 48 text

@tomerg WE CAN’T AGREE ON WHETHER TDD EVEN WORKS. Takeaway #3

Slide 49

Slide 49 text

@tomerg Hold It • So TDD is no silver bullet • But it is… – A methodology – With very smart proponents – Anecdotally successful • Maybe we shouldn’t reject it outright? Image: Tevaprapas via Wikimedia Commons (CC BY 3.0)

Slide 50

Slide 50 text

The Legacy of TDD ~~ Act IV ~~

Slide 51

Slide 51 text

@tomerg The Road to Hell… Dogma noun A point of view or tenet put forth as authoritative without adequate grounds. -- Merriam-Webster

Slide 52

Slide 52 text

@tomerg TDD as Dogma “My thesis is that it has become infeasible, in light of what's happened over the last six years, for a software developer to consider himself professional if he does not practice test-driven development” -- Robert “Uncle Bob” Martin Image: Michael Kappel via Flickr (CC BY-NC 2.0)

Slide 53

Slide 53 text

@tomerg Dogma What Now? • TDD provides: – Executable specs – High test coverage • You might ask… – Are there alternatives? Of course there are. • BDD • Design-by-contract • Formal verification
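
As a rough illustration of one of these alternatives, design-by-contract, here is a Python sketch in which a function states its own preconditions and postconditions. The names `ContractViolation`, `require`, `ensure` and `withdraw` are hypothetical helpers written for this example; they are not part of any library or of the talk.

```python
class ContractViolation(Exception):
    """Raised when a precondition or postcondition does not hold."""


def require(condition: bool, message: str) -> None:
    # Precondition check: the caller broke the contract.
    if not condition:
        raise ContractViolation("precondition failed: " + message)


def ensure(condition: bool, message: str) -> None:
    # Postcondition check: the implementation broke the contract.
    if not condition:
        raise ContractViolation("postcondition failed: " + message)


def withdraw(balance: int, amount: int) -> int:
    require(amount > 0, "amount must be positive")
    require(amount <= balance, "amount must not exceed balance")

    new_balance = balance - amount

    ensure(new_balance >= 0, "balance must never go negative")
    ensure(new_balance + amount == balance, "no money created or lost")
    return new_balance


remaining = withdraw(100, 30)

try:
    withdraw(10, 20)
    rejected = False
except ContractViolation:
    rejected = True
```

Unlike a unit test, the contract is checked on every call with whatever inputs actually occur, rather than only on the examples a test author thought of.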

Slide 54

Slide 54 text

@tomerg Dogma What Now? • TDD provides: – Executable specs – High test coverage • You might ask… – Are there alternatives? Formal Verification “[…] we have used TLA+ on 10 large complex real-world systems. In every case TLA+ has added significant value […] preventing subtle serious bugs from reaching production” -- Chris Newcombe, “Why Amazon Chose TLA+”

Slide 55

Slide 55 text

Slide 56

Slide 56 text

@tomerg Dogma What Now? • TDD provides: – Executable specs – High test coverage • You might ask… – Are there alternatives? Test-Last?!?! “The achievable minimum external quality of delivered software applications increased with the percentage of time spent on testing regardless of the testing strategy (TF or TL) applied” -- Huang, Holcombe et al

Slide 57

Slide 57 text

@tomerg Dogma What Now? • TDD provides: – Executable specs – High test coverage • You might ask… – Are there alternatives? – Is it sufficient? Of course it isn’t. • Performance • Interaction • Security • Load, capacity

Slide 58

Slide 58 text

@tomerg EVEN AT ITS BEST, TDD ISN’T ENOUGH. Takeaway #4

Slide 59

Slide 59 text

@tomerg Recap 1. We can’t agree on why we practice TDD 2. TDD isn’t universally applicable 3. We can’t agree on whether TDD even works 4. Even at its best, TDD isn’t enough

Slide 60

Slide 60 text

@tomerg Conclusion • TDD is no silver bullet • TDD is fine… if you: – Acknowledge its limitations – Apply it responsibly – Augment it with complementary techniques – Aren’t being dogmatic Image: Fred Brooks by Capgemini sd&m AG via Wikimedia Commons (CC BY-SA 3.0)

Slide 61

Slide 61 text

We’re done here Thank you for your time! This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. @tomerg (yes, we are hiring ;-)