The Folly of TDD

The closing keynote at GeeCON 2018 in Prague, Czech Republic:

Spun off of the eXtreme Programming movement, Test Driven Development has taken the software industry by storm. From a highly controversial approach, TDD has become a ubiquitous industry practice and the poster child of software engineering maturity. But is it all it's cracked up to be?

Severely misunderstood, argued about vehemently and lacking in critical research, the popularity of TDD is such that it's barely challenged anymore. In this presentation we'll take a journey through the hype and into the essence of TDD, observing its shortcomings, hidden costs and unfortunate dogma.

Tomer Gabel

October 19, 2018

Transcript

  1. The Folly of TDD Tomer Gabel, WeWork Prague, 18-19 October

    2018
  2. The Folly of TDD Tomer Gabel, WeWork @ GeeCON Prague

    October 2018 Image: Wikimedia Commons (public domain) @tomerg
  3. @tomerg Tech malcontent Who am I? GeeCON alumnus Engineer @

    5x Organizer
  4. The Promise of TDD ~~ Act I ~~

  5. @tomerg TDD, Revisited Test-driven development (TDD) is a software development

    process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved to pass the new tests, only. This is opposed to software development that allows software to be added that is not proven to meet requirements. -- Wikipedia
  6. @tomerg In a nutshell… Red Green Refactor

  7. @tomerg In a nutshell… Write test Make test pass Clean

    up code
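The red/green/refactor cycle can be sketched in a few lines. This is a minimal illustration, not taken from the talk: `slugify` and both tests are hypothetical names invented for the example.

```python
import re


def slugify(title):
    """Hypothetical function under test: lower-case, hyphen-separated slug."""
    # Green: the simplest code that makes the tests below pass.
    # Refactor: any later clean-up (for instance, swapping a chain of
    # str.replace calls for this regex) must keep the same tests green.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Red: in the TDD cycle these tests are written first and fail
# until slugify exists and behaves as specified.
def test_spaces_become_hyphens():
    assert slugify("The Folly of TDD") == "the-folly-of-tdd"


def test_punctuation_is_dropped():
    assert slugify("O RLY?") == "o-rly"
```

The test functions follow the naming convention that pytest collects, but they are plain functions and can also be called directly.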
  8. @tomerg The Benefits of TDD • Fewer bugs • Fewer

    regressions • Higher quality
  9. @tomerg The Benefits of TDD • Fewer bugs • Fewer

    regressions • Higher quality • Higher velocity • Loose coupling
  10. @tomerg The Benefits of TDD • Fewer bugs • Fewer

    regressions • Higher quality • Higher velocity • Loose coupling • Fast feedback • “Built-in” documentation • Less debugging • Emergent design • Promotes SRP • …
  11. @tomerg • Fewer bugs • Fewer regressions • Higher quality

    • Higher velocity • Loose coupling • Fast feedback • “Built-in” documentation • Less debugging • Emergent design • Promotes SRP • … The Benefits of TDD
  13. @tomerg O RLY?

  14. @tomerg What do the experts say? There’s no consistent theme.

  15. @tomerg WE CAN’T AGREE ON WHY WE PRACTICE TDD. Takeaway

    #1
  16. @tomerg Axioms • But let’s move on • It’s somewhat

    accepted that TDD promises: – Reduced bug density – Fewer regressions – Better code quality • But how?
  17. The Practice of TDD ~~ Act II ~~

  18. @tomerg Uncle Bob’s Three Laws✝ 1. No production code without

    a failing unit test 2. Write only as much test code as is sufficient to fail 3. Write only as much production code as is sufficient to pass Image: Michael Kappel via Flickr (CC BY-NC 2.0) ✝ “Jim Coplien and Bob Martin Debate TDD” on YouTube
  19. @tomerg Uncle Bob’s Three Laws✝ 1. No production code without

    a failing unit test 2. Write only as much test code as is sufficient to fail 3. Write only as much production code as is sufficient to pass ✝ “Jim Coplien and Bob Martin Debate TDD” on YouTube Test-first Rapid Iteration
  20. @tomerg Uncle Bob’s Three Laws✝ 1. No production code without

    a failing unit test 2. Write only as much test code as is sufficient to fail 3. Write only as much production code as is sufficient to pass ✝ “Jim Coplien and Bob Martin Debate TDD” on YouTube Rapid Iteration Test-first
  21. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious.
  22. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. We know what “good” is
  23. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. • Not always the case! • Qualitative/subjective: • Search results • Social feed • Recommendation We know what “good” is
  24. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. We have a solution in mind
  25. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. • TDD focuses on units • Test-first within known boundaries • Changing boundaries is expensive! We have a solution in mind
  26. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. The solution isn’t affected by external actors or conditions
  27. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. • Major headache with DBs, external systems • Woefully inadequate for e.g. web scrapers The solution isn’t affected by external actors or conditions
  28. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. Can be adequately covered by a small set of tests
  29. @tomerg Applied TDD • Test-first assumes the solution space is:

    – Known – Understood – Stable – Finite • This is often fallacious. Image: “Not Hotdog” application by Brown Hill Productions, LLC.
  30. @tomerg TDD ISN’T UNIVERSALLY APPLICABLE. Takeaway #2

  31. The Efficacy of TDD ~~ Act III ~~

  32. @tomerg Axioms, Reprise Image: Michael Keen via Flickr (CC BY-NC-ND

    2.0) • Back to our axioms • TDD promises: – Reduced bug density – Fewer regressions – Better code quality • Does it deliver?
  33. @tomerg Fewer Bugs Image: Michael Bulcik via Wikimedia Commons (CC

    BY 2.5) • If all production code satisfies some test… – We write only necessary code – And it’s all tested – So it’s bug free? • Sounds great!
  34. @tomerg Fewer Bugs … except: • Test code is still

    code “Dad, what’s a good test-to-production code size ratio?” “Anywhere from 1:1 to 5:1, son.” Image: Wikimedia Commons (public domain)
  35. @tomerg Fewer Bugs … except: • Test code is still

    code • Tests can be missing • Tests can be buggy • Tests can be wrong
  36. @tomerg Fewer Regressions • Test coverage is great – Gives

    us a safety net • But it doesn’t cover: – Performance – Security – Integrations Image: Keating G. via Wikimedia Commons (IWM Non Commercial Licence)
  37. @tomerg WHAT ABOUT QUALITY?

  38. @tomerg We’ll Use Science • Research is a bit of

    a mixed bag – Positive anecdotal evidence “[TDD] can significantly reduce the defect density […] without significant productivity reduction” -- Nagappan, Bhat et al (“Microsoft study”) “[…] realized about 50% reduction in FVT defect density […] with minimal impact to developer productivity.” -- Maximilien, Williams (“IBM study”)
  39. @tomerg We’ll Use Science • Research is a bit of

    a mixed bag – Positive anecdotal evidence – Inconclusive quantitative results “[...] TDD does not always produce highly cohesive code [...] at least, when the TDD users are inexperienced developers” -- Siniaalto et al “Existing evidence is not sufficient and conclusions and results can be quite contradictory” -- Bulajic et al
  40. @tomerg We’ll Use Science • Research is a bit of

    a mixed bag – Positive anecdotal evidence – Inconclusive quantitative results – Often citing reservations “Although this […] and other studies show favorable results when using TDD, developers must have the correct mindset when using TDD, which requires great discipline” -- Bulajic et al
  41. @tomerg We’ll Use Science • Research is a bit of

    a mixed bag – Positive anecdotal evidence – Inconclusive quantitative results – Often citing reservations • Of course it is. – We can’t measure quality – We don’t have enough data anyway
  42. @tomerg Higher Quality • A totally spurious argument • Not

    supported by evidence • We don’t agree on: – What “quality” is – How to measure it

    Abbr.  Metric
    WMC    Weighted Methods per Class
    DIT    Depth of Inheritance Tree
    NOC    Number of Children
    CBO    Coupling Between Objects
    RFC    Response for a Class
    LCOM   Lack of Cohesion of Methods
    -- Chidamber, Kemerer et al
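To make two of the listed metrics concrete, here is a rough sketch that computes DIT and NOC for a toy class hierarchy. The classes and helper functions are hypothetical, and counting `object` as depth 0 is an assumption of this sketch; tools differ on such conventions, which is part of the measurement problem.

```python
class Shape:
    pass


class Polygon(Shape):
    pass


class Triangle(Polygon):
    pass


class Square(Polygon):
    pass


def depth_of_inheritance(cls):
    """DIT: longest path from cls up to the root (object counts as 0 here)."""
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)


def number_of_children(cls):
    """NOC: number of immediate subclasses of cls."""
    return len(cls.__subclasses__())


print(depth_of_inheritance(Triangle))  # 3: Triangle -> Polygon -> Shape -> object
print(number_of_children(Polygon))     # 2: Triangle and Square
```

Both numbers are easy to compute; whether a DIT of 3 says anything about quality is exactly what remains contested.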
  43. @tomerg Code Coverage • While we’re on the subject… •

    Can we agree that 100% coverage is: 1. A good thing™ 2. But ROI is a problem? Source: Effects of Test-Driven Development: A Comparative Analysis of Empirical Studies, Mäkinen et al
  44. @tomerg Code Coverage • So… 80% then • What does

    it signify? – 80% quality – 20% bugs – You’re a fan of Pareto • It’s meaningless! – So, aim for 100% then?
  45. @tomerg Code Coverage • Ha! Fooled you • It was

    a trick question
  47. @tomerg Code Coverage • Ha! Fooled you • It was

    a trick question • 100% coverage… … exercises all paths … but not all states
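A minimal sketch of full coverage missing a state. The `average` function and its test are hypothetical, written only to illustrate the point.

```python
def average(values):
    """Hypothetical function: arithmetic mean of a list of numbers."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def test_average():
    # This single test executes every line of average(), and both the
    # loop-entry and loop-exit branches, so coverage reports 100%.
    assert average([2.0, 4.0]) == 3.0
```

Despite the full coverage report, `average([])` raises `ZeroDivisionError`: the empty-input state was never exercised.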
  48. @tomerg WE CAN’T AGREE ON WHETHER TDD EVEN WORKS. Takeaway

    #3
  49. @tomerg Hold It • So TDD is no silver bullet

    • But it is… – A methodology – With very smart proponents – Anecdotally successful • Maybe we shouldn’t reject it outright? Image: Tevaprapas via Wikimedia Commons (CC BY 3.0)
  50. The Legacy of TDD ~~ Act IV ~~

  51. @tomerg The Road to Hell… Dogma noun A point of

    view or tenet put forth as authoritative without adequate grounds. -- Merriam-Webster
  52. @tomerg TDD as Dogma “My thesis is that it has

    become infeasible, in light of what's happened over the last six years, for a software developer to consider himself professional if he does not practice test-driven development” -- Robert “Uncle Bob” Martin Image: Michael Kappel via Flickr (CC BY-NC 2.0)
  53. @tomerg Dogma What Now? • TDD provides: – Executable specs

    – High test coverage • You might ask… – Are there alternatives? Of course there are. • BDD • Design-by-contract • Formal verification
  54. @tomerg Dogma What Now? • TDD provides: – Executable specs

    – High test coverage • You might ask… – Are there alternatives? Formal Verification “[…] we have used TLA+ on 10 large complex real-world systems. In every case TLA+ has added significant value […] preventing subtle serious bugs from reaching production” -- Chris Newcombe, “Why Amazon Chose TLA+”
  55. @tomerg Dogma What Now? • TDD provides: – Executable specs

    – High test coverage • You might ask… – Are there alternatives? Of course there are. • BDD • Design-by-contract • Formal verification • Test-last
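Of the alternatives listed, design-by-contract is the easiest to sketch briefly. The example below uses plain `assert` statements as a stand-in for contract support; `withdraw` and its conditions are hypothetical and not taken from the talk.

```python
def withdraw(balance, amount):
    """Hypothetical example: the contract is stated as pre- and postconditions."""
    # Preconditions: what the caller must guarantee.
    assert amount > 0, "amount must be positive"
    assert amount <= balance, "insufficient funds"

    new_balance = balance - amount

    # Postcondition: what the function guarantees in return.
    assert new_balance >= 0 and new_balance + amount == balance
    return new_balance
```

Here the specification lives next to the code and is checked on every call rather than only in a test run. Note that Python drops `assert` statements when run with the `-O` flag, so this is an illustration of the idea rather than a production-grade contract mechanism.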
  56. @tomerg Dogma What Now? • TDD provides: – Executable specs

    – High test coverage • You might ask… – Are there alternatives? Test-Last?!?! “The achievable minimum external quality of delivered software applications increased with the percentage of time spent on testing regardless of the testing strategy (TF or TL) applied” -- Huang, Holcombe et al
  57. @tomerg Dogma What Now? • TDD provides: – Executable specs

    – High test coverage • You might ask… – Are there alternatives? – Is it sufficient? Of course it isn’t. • Performance • Interaction • Security • Load, capacity
  58. @tomerg EVEN AT ITS BEST, TDD ISN’T ENOUGH. Takeaway #4

  59. @tomerg Recap 1. We can’t agree on why we practice

    TDD 2. TDD isn’t universally applicable 3. We can’t agree on whether TDD even works 4. Even at its best, TDD isn’t enough
  60. @tomerg Conclusion • TDD is no silver bullet • TDD

    is fine… if you: – Acknowledge its limitations – Apply it responsibly – Augment it with complementary techniques – Aren’t being dogmatic Image: Fred Brooks by Capgemini sd&m AG via Wikimedia Commons (CC BY-SA 3.0)
  61. We’re done here Thank you for your time! tomer@tomergabel.com This

    work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. @tomerg (yes, we are hiring ;-)