Slide 1

Slide 1 text

Can We Measure Developer Productivity? Eberhard Wolff Head of Architecture https://swaglab.rocks/ https://ewolff.com/

Slide 2

Slide 2 text

Is Lines of Code (LoC) a good measure for developer productivity?

Slide 3

Slide 3 text

Today I wrote just 10 lines of code. But I finally fixed that bug!

Slide 4

Slide 4 text

Today I wrote just 10 lines of code. …because I spent so much time deploying software. Need to fix the deployment!

Slide 5

Slide 5 text

Today I wrote 1,000 lines of code. …but that was really yak shaving. No business value!

Slide 6

Slide 6 text

Can We Measure Productivity? •Yes •But the questions really are: Who measures? Why? How?

Slide 7

Slide 7 text

Good Scenarios •Review your own productivity (team or individual) •Measure productivity to help the team improve

Slide 8

Slide 8 text

Problems

Slide 9

Slide 9 text

Goodhart’s Law Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.

Slide 10

Slide 10 text

Goodhart’s Law When a measure becomes a target, it ceases to be a good measure

Slide 11

Slide 11 text

Goodhart’s Law: Test Coverage •Test coverage: a good measure for the quality of the tests. •Higher coverage: more parts of the code are executed, so the tests can catch more errors.

Slide 12

Slide 12 text

Goodhart’s Law: Test Coverage •Set a test coverage goal •Metric will be manipulated. •E.g. focus on trivial parts of the code •E.g. don’t check any results •E.g. just make sure no exception is thrown •….

Slide 13

Slide 13 text

Goodhart’s Law: Test Coverage •Test coverage increases. •But the tests won’t really catch more problems. •Test coverage is not a good metric for the quality of the tests any more!
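To make the gaming concrete, here is a minimal Python sketch (the function and its bug are invented for illustration): a test that executes every line of the code, yielding full line coverage, while checking nothing.

```python
# Hypothetical example: a buggy function plus a "coverage-gaming" test.

def discounted_price(price: float, discount: float) -> float:
    price = price * (1 - discount)
    return price * (1 - discount)  # bug: the discount is applied twice

def test_discounted_price_gamed():
    # Executes every line of the function -> 100% line coverage,
    # but asserts nothing, so the bug is never caught.
    discounted_price(100, 0.1)

test_discounted_price_gamed()        # "passes" despite the bug
print(discounted_price(100, 0.1))    # prints 81.0 instead of the correct 90.0
```

A coverage tool would report this function as fully covered, which is exactly why coverage alone stops being a good quality metric once it becomes a target.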

Slide 14

Slide 14 text

Goodhart’s Law: Test Coverage Solution? •Be smarter about what you measure. •E.g. mutation testing •E.g. review test code •… •IMHO: This is a pointless arms race. •So don’t manage for metrics?

Slide 15

Slide 15 text

Goodhart’s Law: Solution •It depends on the purpose. •Team tries to optimize itself: probably not a case for Goodhart’s Law •Management tries to measure quality into software / people: probably a case for Goodhart’s Law

Slide 16

Slide 16 text

Dealing with Goodhart’s Law •Let the team decide whether / how they want to improve! •Help and support •i.e. pave the road. •Provide support: techniques and technologies

Slide 17

Slide 17 text

So: Questions •Who measures? •Why? •How?

Slide 18

Slide 18 text

Sensible Metrics

Slide 19

Slide 19 text

DORA: 4 Key Metrics •Change lead time •Deployment frequency •Change fail percentage •Failed deployment recovery time •https://dora.dev/
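As a minimal sketch, three of the four key metrics can be computed from deployment records. The data shape below is an assumption: DORA defines the metrics, not a storage format.

```python
from datetime import datetime, timedelta

# Assumed record shape: deployment time, commit time of the oldest
# change shipped, and whether the deployment failed in production.
deployments = [
    {"deployed": datetime(2024, 1, 2), "committed": datetime(2024, 1, 1), "failed": False},
    {"deployed": datetime(2024, 1, 3), "committed": datetime(2024, 1, 2), "failed": True},
    {"deployed": datetime(2024, 1, 5), "committed": datetime(2024, 1, 3), "failed": False},
]

# Change lead time: from commit to running in production
lead_times = [d["deployed"] - d["committed"] for d in deployments]
avg_lead_time = sum(lead_times, timedelta()) / len(lead_times)

# Deployment frequency: deployments per day over the observed span
span_days = (deployments[-1]["deployed"] - deployments[0]["deployed"]).days
frequency = len(deployments) / span_days

# Change fail percentage
fail_pct = 100 * sum(d["failed"] for d in deployments) / len(deployments)

print(avg_lead_time, frequency, round(fail_pct, 1))
```

Failed deployment recovery time would additionally need the timestamp when each failure was resolved, which this sketch omits.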

Slide 20

Slide 20 text

DORA: Lean Optimization (diagram: flow from idea to production) •Increase speed! •Bottleneck becomes obvious: testing is too slow!

Slide 21

Slide 21 text

DORA: Lean Optimization (diagram: flow from idea to production) •Eliminate the bottleneck •Increase speed!

Slide 22

Slide 22 text

DORA: Lean Optimization (diagram: flow from idea to production) •Increase speed! •Next bottleneck becomes obvious: deployment is too slow!

Slide 23

Slide 23 text

DORA: Typical Bottlenecks •Configuration management •Testing •Deployment •Change approval •Architecture (lack of decoupling) •…

Slide 24

Slide 24 text

DORA: Typical Solutions •Automation •Also helps with quality, maintainability, …and therefore productivity

Slide 25

Slide 25 text

DORA •Good empirical evidence •Metrics have many positive consequences •More time for new features •Less burnout •More economic success

Slide 26

Slide 26 text

Another Good Metric: Business Value •I.e. outcome •Ideally a $ / € value •How do you measure business value? •Some organizations require a business case for a software project, i.e. they can predict business value •Business case: a starting point to find business value?

Slide 27

Slide 27 text

Other Metrics •Code quality •Complexity •Dependencies •… •Not productivity but related to it.

Slide 28

Slide 28 text

https://software-architektur.tv/2023/06/07/folge168.html

Slide 29

Slide 29 text

Why do we focus on these metrics? What is the science?

Slide 30

Slide 30 text

Empirical? •Empirical research in our field is generally hard. •Empirical conclusions about specific metrics? •But we must improve somehow. •Gut feeling?

Slide 31

Slide 31 text

https://software-architektur.tv/2021/10/25/episode86.html

Slide 32

Slide 32 text

SPACE

Slide 33

Slide 33 text

SPACE: Areas •SPACE is a framework of metrics •Choose a specific set of metrics to understand a specific problem

Slide 34

Slide 34 text

SPACE: Matrix of Metrics by Levels & Areas

        | Area 1 | Area 2 | …
Level 1 | Metric | …      |
Level 2 | …      |        |
…

Slide 35

Slide 35 text

SPACE: Three Levels •Individual / one person •Team or group / people that work together •System

Slide 36

Slide 36 text

SPACE: Areas •Satisfaction & Well-Being (e.g. developer satisfaction / retention) •Performance / outcome (e.g. code review velocity) •Activity / count of actions (e.g. code review scores)

Slide 37

Slide 37 text

SPACE: Areas •Communication & Collaboration (e.g. code review thoughtfulness) •Efficiency & Flow (e.g. productivity perception)

Slide 38

Slide 38 text

SPACE: Recommendations •Multiple metrics across various dimensions •At least three, but not too many •At least one perceptual metric (survey) •What gets measured shows what is relevant •Harder to game

Slide 39

Slide 39 text

SPACE: Customizing •Understand your goal! •Select metrics! •No one size fits all!

Slide 40

Slide 40 text

SPACE: Conclusion •A comprehensive and sensible framework •Many metrics •Must be tailored to the environment •Broad (e.g. communication, collaboration, satisfaction) •Goes beyond pure performance

Slide 41

Slide 41 text

McKinsey

Slide 42

Slide 42 text

McKinsey •(Business) Consulting Company •Has some controversies: https://en.wikipedia.org/wiki/McKinsey_%26_Company#Controversies

Slide 43

Slide 43 text

https://youtu.be/AiOUojVd6xQ

Slide 44

Slide 44 text

McKinsey Paper

Slide 45

Slide 45 text

McKinsey Matrix (rows: system / team / individual level; columns: outcome / optimization / opportunity focus) •Outcome focus: selected DORA / SPACE metrics •Opportunity focus: opportunity-focused metrics (McKinsey) + some SPACE metrics •…

Slide 46

Slide 46 text

McKinsey Matrix (rows: system / team / individual level; columns: outcome / optimization / opportunity focus) •Outcome focus: selected DORA / SPACE metrics •Opportunity focus: opportunity-focused metrics (McKinsey) + some SPACE metrics •…

Slide 47

Slide 47 text

McKinsey & DORA / SPACE •Predefined set of SPACE metrics for projects •I.e. no customization per organization •Eliminates tailoring …and therefore the discussion about “why?”

Slide 48

Slide 48 text

McKinsey Matrix (rows: system / team / individual level; columns: outcome / optimization / opportunity focus) •Outcome focus: selected DORA / SPACE metrics •Opportunity focus: opportunity-focused metrics (McKinsey) + some SPACE metrics •…

Slide 49

Slide 49 text

Contribution Analysis •Measuring individual contributions to the backlog using JIRA and custom tools. •Managers can manage expectations and improve performance this way. •IMHO problematic – is this a sensible metric? •A task might be important but time-consuming. •What if you don’t do tickets but support other people? •Shouldn’t contribution be about created business value?

Slide 50

Slide 50 text

Inner / Outer Loop: Time Spent •Inner loop: code, test, build •Outer loop: integrate, deploy at scale, security and compliance, meetings

Slide 51

Slide 51 text

Inner / Outer Loop: Time Spent •Optimize for time in the inner loop! •Hacking away instead of a meeting to understand the problem? Really?

Slide 52

Slide 52 text

Developer Velocity Index •46 drivers in 13 capability areas •Technology (Architecture, Public Cloud, Test Automation) •Working Practices (Engineering Practices, e.g. Tech Debt) •Organizational Enablement (e.g. Culture, Talent Management)

Slide 53

Slide 53 text

Developer Velocity Index •Good foundation for an elaborate consulting project •Does it help? •Benchmarking? By industry? •Every project is different •Can you arrive at results more pragmatically and quickly? •E.g. interviews

Slide 54

Slide 54 text

Talent Capability Score •Individual skills •Diamond: majority in the middle of the skill scale •Example: too many inexperienced individuals → training •Why not aim for the best?

Slide 55

Slide 55 text

McKinsey Example I • Developers spend too much time on design and managing dependencies • Clarify roles • Result: more code produced • Pro: Managing dependencies is annoying • Con: Design can be useful • Might be a good idea!

Slide 56

Slide 56 text

McKinsey Example II •New employees don’t achieve as much •So: better onboarding and mentoring •IMHO a good idea •High potential for poor metrics •Mentors perform poorly with regard to the Developer Velocity Index

Slide 57

Slide 57 text

McKinsey: Recommended Approach •Learn the basics for communication with C-level •Assess your systems (e.g. to measure test coverage) •Build a plan - concrete goal •Remember that measuring productivity is contextual - it’s about getting better.

Slide 58

Slide 58 text

McKinsey: Conclusion •SPACE should be customized •The new metrics are questionable •In my experience, you can find the main challenges more quickly, e.g. with interviews. •However, the examples and general recommendations make sense. •Doesn’t seem to aim at identifying people to fire.

Slide 59

Slide 59 text

Criticism •The paper has sparked quite some criticism. •Next slides show some highlights. •Not a comprehensive discussion!

Slide 60

Slide 60 text

Dan North’s Criticism

Slide 61

Slide 61 text

Dan North’s Criticism – Highlights •Contribution Analysis measures the wrong thing •Does the outer loop really have low value? •Talent capability: depends on the organization

Slide 62

Slide 62 text

Dan North’s Recommendation •Theory of Constraints: identify the bottleneck, utilize it fully •Lead time or flow •I.e. Lean / DORA •If you hire the best, productivity is a problem of the organization, not the individual. •Coaching & peer feedback

Slide 63

Slide 63 text

Kent Beck’s and Gergely Orosz’s Criticism

Slide 64

Slide 64 text

Effort → Output → Outcome → Impact

Slide 65

Slide 65 text

Spot customer pain point → ship a solution •Effort: design docs, code •Output: feature in prod •Outcome: customers behave differently •Impact: value generated

Slide 66

Slide 66 text

•Inner / outer loop •Developer Velocity Index •Talent Capability Score •Contribution analysis •Retention

Slide 67

Slide 67 text

Kent Beck & Gergely Orosz: Highlights •“Absurdly naïve” •Ignores software development teams •It's about individual performance •CEO/CFO will override the CTO to implement the McKinsey framework •Unethical CEOs and CTOs are the target audience •Then it destroys the organization

Slide 68

Slide 68 text

Kent Beck & Gergely Orosz: Highlights •The criticism doesn’t match what the paper says. •The paper has completely different examples and recommendations. •The criticism might be caused by the scandals around McKinsey. •Prejudice?

Slide 69

Slide 69 text

Kent Beck & Gergely Orosz: Advice • Understand why you're measuring and recognize power relationships. • Promote Self-Measurement: Teams should analyze their own data. • Trust Your Judgement: Rely on explanations that resonate and take responsibility for decisions. • Productivity metrics are misleading • Focus on Real Accountability: Prioritize consistent delivery of customer-valued outcomes. • IMHO great idea!

Slide 70

Slide 70 text

Conclusion

Slide 71

Slide 71 text

Conclusion •Beware of Goodhart’s Law! •Use metrics to support teams! •Therefore: Create your own custom metrics for the problem at hand. •SPACE is a great starting point.

Slide 72

Slide 72 text

https://software-architektur.tv/2023/12/22/folge194.html

Slide 73

Slide 73 text

WE ARE LOOKING FOR CO-CREATORS swaglab.rocks/karriere

Slide 74

Slide 74 text

Drink a virtual coffee with me! https://calendly.com/eberhard-wolff-swaglab/

Slide 75

Slide 75 text

Send an email to [email protected] to receive: Slides + Service Mesh Primer EN + Microservices Primer DE / EN + Microservices Recipes DE / EN + Sample Microservices Book DE / EN + Sample Practical Microservices DE / EN + Sample of Continuous Delivery Book DE Powered by AWS Lambda & microservices Email addresses are logged for 14 days; wrongly addressed emails are handled manually