Predictive Models of Development Teams and the Systems they Build

It's awkward to perform science experiments on developers, so let's simulate them instead!

In 1968 Melvin Conway pointed out a seemingly inevitable symmetry between organisations and the software systems they construct. Organisations today are more fluid than 40 years ago, with short developer tenure and frequent migration of individuals between projects and employers. In this slot we’ll examine, and perhaps collect, data on the tenure and productivity of programmers and use this to gain insight into codebases by simulating their growth with simple stochastic models. From such models, we can make important predictions about the maintainability and long-term viability of software systems, with implications for how we approach software design and documentation, and for how we assemble teams.

Robert Smallshire

September 11, 2014

Transcript

  1. @sixty_north
    Predictive Models of Development Teams
    and the Systems they Build
    Robert Smallshire
    @robsmallshire

  3. Randomised controlled trials
    Experimental Science
    ‣ Developers don’t like to be watched
    ‣ Eliminating extraneous factors
    ‣ Toy problems aren’t realistic
    ‣ No two projects are the same
    ‣ Can’t do double-blind
    ‣ Students have little experience
    ‣ Time and money

  5. How can we know?
    1. Modelling: Formulate a hypothesis. Design a conceptual model.
    2. Prediction: Run simulations.
    3. Observation: Observe and record reality.
    4. Comparison: Validate or refute the model.

  6. Systems and their architectures are long lived
    Lifetimes in the software industry
    Half-lives of software related entities
    The number of years over which half the entities are replaced
    [Bar chart, axis 0 to 60 years. Categories: Developers, Windows XP, Applications, CEOs, Lines of code, FTSE100, Classes, Modules. Values: 58, 37, 22, 13, 6.8, 6.2, 4.7, 3.1 (category-to-value pairing not preserved in this transcript).]
    Sources: Software Lifetime and its Evolution Process over Generations, CEO Succession Practices: 2012 Edition, Investors Chronicle,
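
A half-life can be turned into a "fraction still in place after t years" figure if one assumes that replacement follows simple exponential decay. That assumption is made here for illustration and is not stated on the slide; `fraction_remaining` is a hypothetical helper name.

```python
def fraction_remaining(years, half_life_years):
    """Fraction of the original entities still in place after `years`.

    Assumes exponential decay: half are replaced every `half_life_years`.
    """
    return 0.5 ** (years / half_life_years)


# After one half-life, half remain; after two, a quarter.
print(fraction_remaining(10, 10))  # 0.5
print(fraction_remaining(20, 10))  # 0.25
```

The same function applies to any category in the chart once its half-life is known.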

  7. Draw teams at random from a productivity distribution
    Simulating Developer Productivity
    [Figure: Productivity on 10000 SLOC codebase. x-axis: Productivity SLOC/year (0 to 30000). Left y-axis: Probability Density. Right y-axis: Cumulative Probability (0% to 100%). Curves: a triangular distribution with min, mode and max marked, and its cumulative distribution function.]
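
A minimal sketch of the sampling step, using `random.triangular` from Python's standard library. The minimum, mode and maximum below are illustrative placeholders rather than values read from the chart, and `draw_team` and `triangular_cdf` are hypothetical helper names.

```python
import random

# Illustrative parameters in SLOC/year (assumptions, not taken from the slide).
MIN_RATE = 2_000
MODE_RATE = 8_000
MAX_RATE = 30_000


def draw_team(size, rng):
    """Draw one productivity (SLOC/year) per developer from a triangular distribution."""
    return [rng.triangular(MIN_RATE, MAX_RATE, MODE_RATE) for _ in range(size)]


def triangular_cdf(x, low, high, mode):
    """Cumulative probability that a triangular draw is <= x (requires low < mode < high)."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    if x <= mode:
        return (x - low) ** 2 / ((high - low) * (mode - low))
    return 1.0 - (high - x) ** 2 / ((high - low) * (high - mode))


rng = random.Random(42)
team = draw_team(7, rng)
print([round(rate) for rate in team])
print(triangular_cdf(MODE_RATE, MIN_RATE, MAX_RATE, MODE_RATE))
```

A triangular distribution needs only three numbers, which makes it convenient when the available data gives little more than a plausible range and a most likely value.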

  17. Modelling team and code evolution
    Use published productivity data to forward model code size.
    At any given system size we can predict a distribution for developer productivity.
    Dramatically less productive on larger code bases.
    [Figure: log-log plot. x-axis: Total Lines of Code (1000 to 10000000). y-axis: Productivity (Lines of Code / Year) (100 to 100000). Annotated values: 29000 and 5500.]
    Sources: COCOMO II
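
One simple way to express "less productive on larger code bases" is a power-law fall-off of productivity with code size. The sketch below is an assumption made for illustration: the reference productivity, reference size and exponent are hypothetical and are neither the COCOMO II calibration nor the values behind the chart.

```python
def productivity(total_sloc, ref_productivity=20_000.0, ref_size=10_000.0, exponent=0.25):
    """Typical productivity (SLOC/year per developer) at a given code-base size.

    All default constants are hypothetical placeholders.
    """
    # Treat code bases smaller than the reference size like the reference size.
    size = max(total_sloc, ref_size)
    return ref_productivity * (size / ref_size) ** -exponent


for size in (10_000, 100_000, 1_000_000):
    print(f"{size:>9} SLOC -> {productivity(size):8.0f} SLOC/year")
```

In a forward model, a value like this could serve as the mode of the productivity distribution at each time step, so that the distribution shifts downward as the simulated code base grows.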

  19. Simulating a team of seven over five years
    [Figure: simulated contributions of each developer over 5 years.]
    Start with nothing.
    Some developers contribute more, others less.
    When a developer leaves they are replaced.
    After 5 years we have 235 k lines of code written by a total of 19 people.
    Only 37% of the code is by current team.
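
The steps on this slide can be sketched as a small stochastic simulation. This is a minimal sketch under stated assumptions, not the model used in the talk: the productivity range, the three-year developer half-life, the monthly time step, and the simplifications that code is never deleted and that productivity does not depend on code size are all placeholders chosen for illustration.

```python
import random


def simulate_team(team_size=7, years=5, half_life_years=3.0, seed=None):
    """Simulate a constantly staffed team and report who wrote the code."""
    rng = random.Random(seed)
    # Monthly probability of a developer leaving, derived from the half-life.
    p_leave = 1.0 - 0.5 ** (1.0 / (half_life_years * 12))

    def hire():
        # Productivity in SLOC/year; (low, high, mode) are hypothetical values.
        return rng.triangular(2_000, 30_000, 8_000)

    next_id = 0
    current = {}          # developer id -> productivity
    lines_by_author = {}  # developer id -> lines written so far
    for _ in range(team_size):  # start with nothing
        current[next_id] = hire()
        lines_by_author[next_id] = 0.0
        next_id += 1

    for _ in range(years * 12):
        for dev, rate in current.items():
            lines_by_author[dev] += rate / 12
        for dev in list(current):
            if rng.random() < p_leave:
                del current[dev]           # a developer leaves...
                current[next_id] = hire()  # ...and is replaced
                lines_by_author[next_id] = 0.0
                next_id += 1

    total = sum(lines_by_author.values())
    by_current = sum(lines_by_author[dev] for dev in current)
    return {
        "total_lines": total,
        "people": next_id,
        "fraction_by_current_team": by_current / total,
    }


result = simulate_team(seed=1)
print(result)
```

Each run gives one possible history, and different seeds give different totals, head counts and ownership fractions.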

  28. Team Size : 7, 3 years
    LoC : 157 k ± 23 k @ 1σ
    Cumulative team size : 11 ± 2 @ 1σ
    Author present : 70% ± 14% @ 1σ

  30. Team Size : 21, 20 years
    LoC : 1.8 M ± 0.08 M @ 1σ
    Cumulative team size : 114 ± 9 @ 1σ
    Author present : 19% ± 4% @ 1σ
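
Figures quoted as "mean ± spread @ 1σ" come from repeating a stochastic simulation many times and summarising the results. The sketch below shows that pattern with a deliberately simple model. The productivity range and the three-year half-life are hypothetical, so its output is not expected to reproduce the numbers on the slides.

```python
import random
import statistics


def run_once(rng, team_size, years, half_life_years=3.0):
    """One simulated project: (total lines, cumulative team size, fraction by current team)."""
    p_leave = 1.0 - 0.5 ** (1.0 / (half_life_years * 12))
    # Each entry is [productivity in SLOC/year, lines written, still on team].
    devs = [[rng.triangular(2_000, 30_000, 8_000), 0.0, True] for _ in range(team_size)]
    for _ in range(years * 12):
        for dev in [d for d in devs if d[2]]:
            dev[1] += dev[0] / 12
            if rng.random() < p_leave:
                dev[2] = False  # the leaver is replaced by a new hire
                devs.append([rng.triangular(2_000, 30_000, 8_000), 0.0, True])
    total = sum(d[1] for d in devs)
    present = sum(d[1] for d in devs if d[2])
    return total, len(devs), present / total


def summarise(runs=1000, team_size=7, years=3, seed=0):
    """Mean and sample standard deviation of each output over many runs."""
    rng = random.Random(seed)
    results = [run_once(rng, team_size, years) for _ in range(runs)]
    return [(statistics.mean(col), statistics.stdev(col)) for col in zip(*results)]


labels = ("LoC", "Cumulative team size", "Author present")
summary = summarise(runs=200)
for label, (mean, sd) in zip(labels, summary):
    print(f"{label}: {mean:.3g} +/- {sd:.2g} @ 1 sigma")
```

Calling `summarise(team_size=21, years=20)` runs the larger, longer-lived scenario with the same placeholder parameters.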

  32. Probability density from 1000 simulations
    How long for seven to produce 100 000 lines of code?
    [Figure: x-axis: Days (0 to 800). y-axis: Probability (0 to 0.006). The curve gives the probability of delivery on a particular day.]

  34. Cumulative probability from 1000 simulations
    How long for 7 to produce 100 000 lines of code?
    [Figure: x-axis: Days (0 to 800). y-axis: Cumulative Probability (0% to 100%). The curve gives the probability of delivery before a particular day. Marked points: 20% at 330 days, 80% at 470 days.]
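
Reading delivery dates off a cumulative curve amounts to taking percentiles of the simulated completion times. The sketch below illustrates the idea with a daily time step. The productivity range and the three-year developer half-life are hypothetical placeholders, so the printed days are not expected to match the 330 and 470 marked on the slide.

```python
import random


def days_to_target(rng, team_size=7, target_sloc=100_000, half_life_years=3.0):
    """Days until a constantly staffed team has written target_sloc lines."""
    p_leave = 1.0 - 0.5 ** (1.0 / (half_life_years * 365))
    rates = [rng.triangular(2_000, 30_000, 8_000) for _ in range(team_size)]
    written, day = 0.0, 0
    while written < target_sloc:
        day += 1
        written += sum(rates) / 365
        # Replace each leaver with a freshly drawn developer.
        rates = [
            rng.triangular(2_000, 30_000, 8_000) if rng.random() < p_leave else rate
            for rate in rates
        ]
    return day


def percentile(sorted_values, fraction):
    """Value below which roughly `fraction` of the sorted samples fall (nearest rank)."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


rng = random.Random(0)
samples = sorted(days_to_target(rng) for _ in range(1000))
p20, p80 = percentile(samples, 0.20), percentile(samples, 0.80)
print(f"20% chance of delivery within {p20} days, 80% within {p80} days")
```

Quoting a pair of percentiles like this gives a delivery window with stated confidence instead of a single date.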

  38. Most authors of your product quit way back when.
    Who can you still talk to?
    [Figure: the proportion of code written by current team, plotted against time in days. Annotation: 20% after 20 years.]

  39. from the 1968 paper How do committees invent?
    Conway’s Law
    Melvin Conway
    “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure”
    integrated over time

  40. Thank you!
    @sixty_north
    Robert Smallshire
    @robsmallshire
    http://sixty-north.com/blog/predictive-models-of-development-teams-and-the-systems-they-build