Predictive Models of Development Teams and the Systems they Build

It's awkward to perform science experiments on developers, so let's simulate them instead!

In 1968 Melvin Conway pointed out a seemingly inevitable symmetry between organisations and the software systems they construct. Organisations today are more fluid than they were 40 years ago, with short developer tenure and frequent migration of individuals between projects and employers. In this slot we’ll examine (and perhaps collect) data on the tenure and productivity of programmers and use it to gain insight into codebases by simulating their growth with simple stochastic models. From such models we can make important predictions about the maintainability and long-term viability of software systems, with implications for how we approach software design and documentation, and for how we assemble teams.

Robert Smallshire

September 11, 2014

Transcript

  1. Predictive Models of Development Teams and the Systems they Build. Robert Smallshire (@robsmallshire), @sixty_north
  3. Experimental Science: randomised controlled trials ‣ Developers don’t like to be watched ‣ Eliminating extraneous factors ‣ Toy problems aren’t realistic ‣ No two projects are the same ‣ Can’t do double-blind ‣ Students have little experience ‣ Time and money
  5. How can we know? [Four-stage cycle diagram, stages numbered 1 to 4: Prediction, Comparison, Modelling, Observation.] Formulate a hypothesis. Design a conceptual model. Run simulations. Observe and record reality. Validate or refute the model.
  6. Lifetimes in the software industry: systems and their architectures are long lived. [Bar chart, "Half-lives of software related entities": the number of years over which half the entities are replaced, on an axis from 0 to 60. Categories: Developers, Windows XP, Applications, CEOs, Lines of code, FTSE100, Classes, Modules. Value labels as extracted: 58, 37, 22, 13, 6.8, 6.2, 4.7, 3.1.] Sources: Software Lifetime and its Evolution Process over Generations; CEO Succession Practices: 2012 Edition; Investors Chronicle.
  7-15. Simulating Developer Productivity: draw teams at random from a productivity distribution. [Plot, built up over nine slides: productivity on a 10000 SLOC codebase. x-axis: Productivity (SLOC/year), 0 to 30000. Left axis: Probability Density, showing a triangular distribution marked min, mode and max. Right axis: Cumulative Probability, 0% to 100%, showing the cumulative distribution function.]
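The sampling step on these slides can be sketched with the standard library's `random.triangular`. The bounds and mode below are illustrative assumptions chosen to fit within the 0 to 30000 SLOC/year axis, not values stated in the talk, and `draw_team` and `triangular_cdf` are hypothetical helper names.

```python
import random

# Assumed, illustrative parameters (SLOC/year); the slide only shows a 0-30000 axis.
MIN_PRODUCTIVITY = 2_000
MODE_PRODUCTIVITY = 8_000
MAX_PRODUCTIVITY = 30_000


def draw_team(size, rng):
    """Draw one productivity value per developer from a triangular distribution."""
    return [
        rng.triangular(MIN_PRODUCTIVITY, MAX_PRODUCTIVITY, MODE_PRODUCTIVITY)
        for _ in range(size)
    ]


def triangular_cdf(x, low, high, mode):
    """Cumulative distribution function of the triangular distribution."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    if x <= mode:
        return (x - low) ** 2 / ((high - low) * (mode - low))
    return 1.0 - (high - x) ** 2 / ((high - low) * (high - mode))


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run is repeatable
    team = draw_team(7, rng)
    print([round(p) for p in team])
    print(triangular_cdf(10_000, MIN_PRODUCTIVITY, MAX_PRODUCTIVITY, MODE_PRODUCTIVITY))
```

Note that `random.triangular` takes its arguments in the order low, high, mode. The closed-form CDF is the curve a histogram of many draws should approach.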
  17-18. Modelling team and code evolution: use published productivity data to forward model code size. At any given system size we can predict a distribution for developer productivity. Developers are dramatically less productive on larger code bases. [Log-log plot. x-axis: Total Lines of Code, 1000 to 10000000. y-axis: Productivity (Lines of Code / Year), 100 to 100000. Annotated values: 29000 and 5500.] Source: COCOMO II.
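One way to see why productivity falls as a code base grows is the diseconomy of scale in a COCOMO II style effort equation, where effort grows faster than size. The sketch below assumes the commonly quoted COCOMO II.2000 nominal constants (A = 2.94, exponent about 1.10) and 12 person-months per person-year. It is not the curve from the slide and makes no attempt to reproduce the 29000 and 5500 annotations; the function names are hypothetical.

```python
# Assumed constants: commonly quoted COCOMO II.2000 nominal calibration.
A = 2.94              # effort constant, person-months
EXPONENT = 1.10       # scale exponent; a value above 1 is a diseconomy of scale
MONTHS_PER_YEAR = 12  # assumed conversion from person-months to person-years


def effort_person_months(sloc):
    """Effort to build a system of the given size, with size in source lines."""
    ksloc = sloc / 1000.0
    return A * ksloc ** EXPONENT


def productivity_sloc_per_year(sloc):
    """Average lines per person-year implied by the effort equation."""
    person_years = effort_person_months(sloc) / MONTHS_PER_YEAR
    return sloc / person_years


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000, 1_000_000, 10_000_000):
        print(size, round(productivity_sloc_per_year(size)))
```

Under these assumptions productivity scales as size to the power (1 - EXPONENT), so each tenfold increase in size reduces it by a constant factor.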
  19-26. Simulating a team of seven over five years: start with nothing; some developers contribute more, others less; when a developer leaves they are replaced. After 5 years we have 235 k lines of code written by a total of 19 people. Only 37% of the code is by the current team.
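The simulation described on these slides can be approximated by a small stochastic loop. This is a minimal sketch under stated assumptions, not the model used in the talk: the tenure half-life, the productivity range and the weekly time step are illustrative, each developer's productivity is fixed and independent of code size, and no code is ever deleted. `simulate` is a hypothetical function name.

```python
import random

HALF_LIFE_YEARS = 3.0                   # assumed developer tenure half-life
STEP_YEARS = 1.0 / 52                   # assumed weekly time step
LOW, MODE, HIGH = 2_000, 8_000, 30_000  # assumed productivity range, SLOC/year


def simulate(team_size, years, rng):
    """Return (total lines, cumulative head count, fraction by current team)."""
    # Probability that a given developer leaves during one step (exponential decay).
    p_leave = 1.0 - 0.5 ** (STEP_YEARS / HALF_LIFE_YEARS)
    team = {dev: rng.triangular(LOW, HIGH, MODE) for dev in range(team_size)}
    lines_by_author = {dev: 0.0 for dev in team}
    next_id = team_size
    for _ in range(round(years / STEP_YEARS)):
        for dev in list(team):
            lines_by_author[dev] += team[dev] * STEP_YEARS
            if rng.random() < p_leave:
                # The leaver is replaced immediately by a newly drawn developer.
                del team[dev]
                team[next_id] = rng.triangular(LOW, HIGH, MODE)
                lines_by_author[next_id] = 0.0
                next_id += 1
    total = sum(lines_by_author.values())
    by_current = sum(lines_by_author[dev] for dev in team)
    return total, next_id, by_current / total


if __name__ == "__main__":
    total, people, fraction = simulate(7, 5, random.Random(2014))
    print(f"{total / 1000:.0f} kLoC by {people} people, "
          f"{fraction:.0%} by the current team")
```

Because the parameters are assumed, the output will differ from the 235 k lines, 19 people and 37% quoted on the slide; the point is the shape of the loop, in which turnover steadily shrinks the share of code whose author is still present.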
  28-29. Team size: 7, over 3 years. LoC: 157 k ± 23 k; cumulative team size: 11 ± 2; author present: 70% ± 14% (all ranges at 1σ).
  30-31. Team size: 21, over 20 years. LoC: 1.8 M ± 0.08 M; cumulative team size: 114 ± 9; author present: 19% ± 4% (all ranges at 1σ).
  32-33. How long for seven to produce 100 000 lines of code? Probability density from 1000 simulations: the probability of delivery on a particular day. [Plot. x-axis: Days, 0 to 800. y-axis: Probability, 0 to 0.006.]
  34-37. How long for 7 to produce 100 000 lines of code? Cumulative probability from 1000 simulations: the probability of delivery before a particular day. [Plot. x-axis: Days, 0 to 800. y-axis: Cumulative Probability, 0% to 100%. 20% is reached at 330 days and 80% at 470 days.]
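Reading delivery dates off a cumulative curve amounts to taking empirical quantiles over repeated simulations. The sketch below is a deliberately simplified stand-in for the talk's model: it assumes a triangular productivity range, no staff turnover and constant productivity, so it will not reproduce the 330 and 470 day figures. `days_to_target` and `quantile` are hypothetical helper names.

```python
import math
import random

LOW, MODE, HIGH = 2_000, 8_000, 30_000  # assumed SLOC/year per developer
TARGET_SLOC = 100_000
TEAM_SIZE = 7
RUNS = 1_000


def days_to_target(rng):
    """One simulation: days for a freshly drawn team to reach the target size."""
    team_rate = sum(rng.triangular(LOW, HIGH, MODE) for _ in range(TEAM_SIZE))
    return 365.0 * TARGET_SLOC / team_rate


def quantile(sorted_samples, q):
    """Smallest sample such that at least a fraction q of samples are at or below it."""
    index = max(0, math.ceil(q * len(sorted_samples)) - 1)
    return sorted_samples[index]


if __name__ == "__main__":
    rng = random.Random(2014)
    samples = sorted(days_to_target(rng) for _ in range(RUNS))
    print("20% chance of delivery within", round(quantile(samples, 0.2)), "days")
    print("80% chance of delivery within", round(quantile(samples, 0.8)), "days")
```

Quoting a range such as the 20% and 80% points, rather than a single date, is what makes the uncertainty of the estimate visible.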
  38. Most authors of your product quit way back when. Who can you still talk to? [Plot: the proportion of code written by the current team, against days; 20% after 20 years.]
  39. Conway’s Law, from Melvin Conway’s 1968 paper “How do committees invent?”: “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure”, integrated over time.
  40. Thank you! Robert Smallshire (@robsmallshire), @sixty_north. http://sixty-north.com/blog/predictive-models-of-development-teams-and-the-systems-they-build