The Folly of TDD

The Folly of TDD Tomer Gabel, WeWork Prague, 18-19 October
2018

The Folly of TDD Tomer Gabel, WeWork @ GeeCON Prague, October 2018 @tomerg
Image: Wikimedia Commons (public domain)

@tomerg Who am I? • Tech malcontent • GeeCON alumnus • Engineer @
• 5x Organizer

The Promise of TDD ~~ Act I ~~

@tomerg TDD, Revisited Test-driven development (TDD) is a software development
process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved to pass the new tests, only. This is opposed to software development that allows software to be added that is not proven to meet requirements. -- Wikipedia

@tomerg In a nutshell… Red → Green → Refactor: write test → make test pass → clean
up code
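
As a minimal sketch of one pass through the cycle, assuming a hypothetical `slugify` helper that is not part of the talk: the test comes first and fails (red), a quick implementation makes it pass (green), and the code is then tidied while the test stays green (refactor).

```python
import re
import unittest


# Red: the test is written first. Until slugify() exists, running it fails.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("The Folly of TDD"), "the-folly-of-tdd")


# Green: the quickest code that makes the test pass.
def slugify_first_pass(title):
    return title.lower().replace(" ", "-")


# Refactor: clean up (here, tolerate stray whitespace) with the test still green.
_WHITESPACE = re.compile(r"\s+")


def slugify(title):
    """Lowercase the title and join its words with single hyphens."""
    return _WHITESPACE.sub("-", title.strip().lower())
```

Saved as a module, the test can be run with `python -m unittest`; the names and the behaviour chosen for the refactor step are illustrative only.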

@tomerg The Benefits of TDD • Fewer bugs • Fewer regressions • Higher quality
• Higher velocity • Loose coupling • Fast feedback • “Built-in” documentation • Less debugging • Emergent design • Promotes SRP • …

@tomerg O RLY?

@tomerg What do the experts say? There’s no consistent theme.

@tomerg Takeaway #1: WE CAN’T AGREE ON WHY WE PRACTICE TDD.

@tomerg Axioms • But let’s move on • It’s somewhat
accepted that TDD promises: – Reduced bug density – Fewer regressions – Better code quality • But how?

The Practice of TDD ~~ Act II ~~

@tomerg Uncle Bob’s Three Laws✝ 1. No production code without
a failing unit test 2. Write only as much test code as is sufficient to fail 3. Write only as much production code as is sufficient to pass • In short: test-first, rapid iteration Image: Michael Kappel via Flickr (CC BY-NC 2.0) ✝ “Jim Coplien and Bob Martin Debate TDD” on YouTube
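
Taken literally, the three laws produce very small steps. The sketch below assumes a hypothetical `total_price` function (not from the talk) to show how the third law permits a hard-coded answer until a second failing test forces a general one.

```python
# Laws 1 and 2: the smallest failing test. It fails while total_price() is undefined.
def test_single_item():
    assert total_price([5]) == 5


# Law 3: only as much production code as the test needs. A constant is enough.
def total_price_v1(prices):
    return 5


# Law 2 again: a second test that the constant cannot satisfy.
def test_two_items():
    assert total_price([5, 7]) == 12


# Law 3 again: generalise only because a failing test now demands it.
def total_price(prices):
    return sum(prices)
```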

@tomerg Applied TDD • Test-first assumes the solution space is:
– Known: we know what “good” is • Not always the case! • Qualitative/subjective: search results, social feed, recommendation
– Understood: we have a solution in mind • TDD focuses on units • Test-first within known boundaries • Changing boundaries is expensive!
– Stable: the solution isn’t affected by external actors or conditions • Major headache with DBs, external systems • Woefully inadequate for e.g. web scrapers
– Finite: can be adequately covered by a small set of tests
• This is often fallacious. Image: “Not Hotdog” application by Brown Hill Productions, LLC.
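
The web scraper case can be sketched as follows. Everything here is an assumption made for illustration (the `extract_price` function, the markup, the fixture): a test pinned to a captured page stays green while the live page drifts away from it.

```python
import re


def extract_price(html):
    """Hypothetical scraper step: pull a price out of a product page."""
    match = re.search(r'<span class="price">([\d.]+)</span>', html)
    return float(match.group(1)) if match else None


# Fixture captured on the day the test was written.
FIXTURE = '<div><span class="price">19.99</span></div>'


def test_extract_price():
    assert extract_price(FIXTURE) == 19.99


# Later the site renames the CSS class. The test above still passes,
# but the scraper now finds nothing on the live page.
LIVE_PAGE = '<div><span class="amount">19.99</span></div>'
```

The unit test verifies the code against a snapshot of the external actor, not against the actor itself.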

@tomerg Takeaway #2: TDD ISN’T UNIVERSALLY APPLICABLE.

The Efficacy of TDD ~~ Act III ~~

@tomerg Axioms, Reprise • Back to our axioms • TDD promises:
– Reduced bug density – Fewer regressions – Better code quality • Does it deliver? Image: Michael Keen via Flickr (CC BY-NC-ND 2.0)

@tomerg Fewer Bugs • If all production code satisfies some test…
– We write only necessary code – And it’s all tested – So it’s bug free? • Sounds great! Image: Michael Bulcik via Wikimedia Commons (CC BY 2.5)

@tomerg Fewer Bugs … except: • Test code is still
code – “Dad, what’s a good test-to-production code size ratio?” – “Anywhere from 1:1 to 5:1, son” • Tests can be missing • Tests can be buggy • Tests can be wrong Image: Wikimedia Commons (public domain)

@tomerg Fewer Regressions • Test coverage is great – Gives
us a safety net • But it doesn’t cover: – Performance – Security – Integrations Image: Keating G. via Wikimedia Commons (IWM Non Commercial Licence)
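
To make the performance gap concrete, here is a sketch with two hypothetical implementations (neither is from the talk). The same functional check passes for both, so swapping the fast one for the quadratic one would not turn any test red.

```python
def has_duplicates_fast(items):
    # Roughly linear: build a set and compare sizes.
    return len(set(items)) != len(items)


def has_duplicates_slow(items):
    # Quadratic: compares every pair of elements.
    return any(
        items[i] == items[j]
        for i in range(len(items))
        for j in range(i + 1, len(items))
    )


def check(has_duplicates):
    # A typical functional test: it says nothing about running time.
    assert has_duplicates([1, 2, 3]) is False
    assert has_duplicates([1, 2, 1]) is True
```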

@tomerg WHAT ABOUT QUALITY?

@tomerg We’ll Use Science • Research is a bit of a mixed bag
– Positive anecdotal evidence: “[TDD] can significantly reduce the defect density […] without significant productivity reduction” -- Nagappan, Bhat et al (“Microsoft study”); “[…] realized about 50% reduction in FVT defect density […] with minimal impact to developer productivity.” -- Maximilien, Williams (“IBM study”)
– Inconclusive quantitative results: “[...] TDD does not always produce highly cohesive code [...] at least, when the TDD users are inexperienced developers” -- Siniaalto et al; “Existing evidence is not sufficient and conclusions and results can be quite contradictory” -- Bulajic et al
– Often citing reservations: “Although this […] and other studies show favorable results when using TDD, developers must have the correct mindset when using TDD, which requires great discipline” -- Bulajic et al
• Of course it is. – We can’t measure quality – We don’t have enough data anyway

@tomerg Higher Quality • A totally spurious argument • Not
supported by evidence • We don’t agree on: – What “quality” is – How to measure it
Abbv.  Metric
WMC    Weighted Methods per Class
DIT    Depth of Inheritance Tree
NOC    Number of Children
CBO    Coupling Between Objects
RFC    Response for a Class
LCOM   Lack of Cohesion of Methods
-- Chidamber, Kemerer et al

@tomerg Code Coverage • While we’re on the subject… •
Can we agree that 100% coverage is: 1. A good thing™ 2. But ROI is a problem? Source: Effects of Test-Driven Development: A Comparative Analysis of Empirical Studies, Mäkinen et al

@tomerg Code Coverage • So… 80% then • What does
it signify? – 80% quality – 20% bugs – You’re a fan of Pareto • It’s meaningless! – So, aim for 100% then?

@tomerg Code Coverage • Ha! Fooled you • It was
a trick question • 100% coverage… – … exercises all paths – … but not all states
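
The paths-versus-states distinction can be sketched with a hypothetical `percent_used` function (an assumption for illustration, not from the talk). The two tests below execute every line and both branches, yet one input state still crashes.

```python
def percent_used(used, capacity):
    if used > capacity:
        return 100.0
    return 100.0 * used / capacity


def test_over_capacity():
    assert percent_used(5, 4) == 100.0  # covers the "if" branch


def test_within_capacity():
    assert percent_used(1, 4) == 25.0  # covers the fall-through branch


# Coverage is now complete, but the state capacity == 0 was never exercised:
# percent_used(0, 0) raises ZeroDivisionError.
```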

@tomerg Takeaway #3: WE CAN’T AGREE ON WHETHER TDD EVEN WORKS.

@tomerg Hold It • So TDD is no silver bullet
• But it is… – A methodology – With very smart proponents – Anecdotally successful • Maybe we shouldn’t reject it outright? Image: Tevaprapas via Wikimedia Commons (CC BY 3.0)

The Legacy of TDD ~~ Act IV ~~

@tomerg The Road to Hell… Dogma (noun): “A point of
view or tenet put forth as authoritative without adequate grounds.” -- Merriam-Webster

@tomerg TDD as Dogma “My thesis is that it has
become infeasible, in light of what's happened over the last six years, for a software developer to consider himself professional if he does not practice test-driven development” -- Robert “Uncle Bob” Martin Image: Michael Kappel via Flickr (CC BY-NC 2.0)

@tomerg Dogma What Now? • TDD provides: – Executable specs
– High test coverage • You might ask… – Are there alternatives? Of course there are. • BDD • Design-by-contract • Formal verification • Test-last – Is it sufficient? Of course it isn’t. • Performance • Interaction • Security • Load, capacity
Formal Verification: “[…] we have used TLA+ on 10 large complex real-world systems. In every case TLA+ has added significant value […] preventing subtle serious bugs from reaching production” -- Chris Newcombe, “Why Amazon Chose TLA+”
Test-Last?!?! “The achievable minimum external quality of delivered software applications increased with the percentage of time spent on testing regardless of the testing strategy (TF or TL) applied” -- Huang, Holcombe et al
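
Of the alternatives listed, design-by-contract is the easiest to show in a few lines. This is only a sketch using plain `assert` statements on a hypothetical `withdraw` function; dedicated contract libraries and languages with built-in contracts go much further.

```python
def withdraw(balance, amount):
    """Return the new balance after withdrawing amount (integers assumed)."""
    # Preconditions: what the caller must guarantee.
    assert amount > 0, "amount must be positive"
    assert amount <= balance, "insufficient funds"

    new_balance = balance - amount

    # Postcondition: what the function guarantees in return.
    assert new_balance >= 0 and new_balance + amount == balance
    return new_balance
```

The contract travels with the code and is checked on every call, not only for the inputs a test happens to choose. Note that Python skips `assert` statements when run with `-O`.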

@tomerg Takeaway #4: EVEN AT ITS BEST, TDD ISN’T ENOUGH.

@tomerg Recap 1. We can’t agree on why we practice
TDD 2. TDD isn’t universally applicable 3. We can’t agree on whether TDD even works 4. Even at its best, TDD isn’t enough

@tomerg Conclusion • TDD is no silver bullet • TDD
is fine… if you: – Acknowledge its limitations – Apply it responsibly – Augment it with complementary techniques – Aren’t being dogmatic Image: Fred Brooks by Capgemini sd&m AG via Wikimedia Commons (CC BY-SA 3.0)

We’re done here Thank you for your time! [email protected] This
work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. @tomerg (yes, we are hiring ;-)
