Any sufficiently complex software system has experts, who have a deeper understanding of parts of the system than others.
However, it is not always clear who these experts are and which particular parts of the system they can provide help with.
We propose a framework to elicit the expertise of developers and recommend experts by analyzing the development of code complexity measures over time, by author as well as on the component level.
Teams can use this approach to detect those parts of the software for which no, or only a few, experts currently exist, and can take preventive action to keep collective code knowledge and ownership high.
We employed the developed approach at a medium-sized company.
The results were evaluated with a survey, comparing the perceived and the computed expertise of developers.
We show that aggregated code metrics can be used to identify experts for different software components.
The identified experts were rated as acceptable candidates by developers in over 90% of all cases.
Hasso Plattner Institute
University of Potsdam, Germany
Should I Bug You?
Identifying Domain Experts in Software Projects
Using Code Complexity Metrics
Ralf Teusner, Christoph Matthies, Philipp Giese
QRS’17, Prague, July 2017
The number of people on your team who have to
be hit with a truck before the project is in serious trouble. 
 Michael Bowler. “Truck Factor”. May 15, 2005.
■ Any system develops domain experts over time
■ Low Truck Number → domain “gurus”
■ Low collective code ownership
■ Can lead to Conway’s Law
■ Who should I ask when I’m in need of assistance?
■ Who is most qualified to write the documentation?
■ Who is most qualified to review this piece of code?
■ In which areas can knowledge sharing be improved?
The knowledge we seek
Who is the domain expert
for which part of the software?
Challenges & Goals
■ Developers are busy
■ Project documentation is likely out of date
■ Avoid overhead of documenting domain expertise
■ Idea: Use already existing artifacts, i.e. code
■ Analyze code to attribute expertise to developers
How can we find the gurus, without “bugging” them
■ Apply proven complexity metrics to code
■ Case-by-case basis, no set of metrics can fit all contexts
■ Consider knowledge of metrics within a software team
■ In this case study
■ Lines of code
■ Efferent coupling (Fan-Out) & afferent coupling (Fan-In)
■ Cyclomatic Complexity
■ Halstead difficulty & volume
From Code to Domain Expertise
Measurement of the number of
linearly independent paths through
a program's source code
CC = #Edges − #Nodes + 2*#Components
9 edges, 8 nodes, 1 connected component.
Cyclomatic complexity: 9 - 8 + 2*1 = 3
aka McCabe Complexity 
 T. J. McCabe. “A complexity measure,” IEEE Transactions on
Software Engineering, no. 4. pp. 308–320. 1976.
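The formula above can be checked with a short Python sketch. The edge list below is a hypothetical control-flow graph (one branch and one loop) chosen only so that it has 9 edges, 8 nodes and 1 connected component, as in the slide's example. The function name and graph encoding are illustrative assumptions, not part of the presented tooling.

```python
def cyclomatic_complexity(edges):
    """Compute CC = #Edges - #Nodes + 2 * #Components for a graph
    given as a list of (source, target) node pairs."""
    nodes = {n for edge in edges for n in edge}
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Merge the endpoints of every edge to count connected components.
    for u, v in edges:
        parent[find(u)] = find(v)

    components = len({find(n) for n in nodes})
    return len(edges) - len(nodes) + 2 * components


# Hypothetical control-flow graph: 9 edges, 8 nodes, 1 component.
EXAMPLE_EDGES = [
    (1, 2), (2, 3), (2, 4), (3, 5), (4, 5),  # if/else branch
    (5, 6), (6, 7), (7, 5),                  # loop back to node 5
    (6, 8),                                  # exit
]

print(cyclomatic_complexity(EXAMPLE_EDGES))  # 9 - 8 + 2*1 = 3
```

A straight-line program with no branches yields a complexity of 1; each additional decision point adds one.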
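■ $\eta_1$, $N_1$: number of distinct and total operators
■ $\eta_2$, $N_2$: number of distinct and total operands
■ Volume: $V = N \log_2 \eta$ ($\eta := \eta_1 + \eta_2$ vocabulary size, $N := N_1 + N_2$ program length)
■ Difficulty: $D = \frac{\eta_1}{2} \cdot \frac{N_2}{\eta_2}$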
Halstead Difficulty & Volume
Idea: complexity based on numbers of operators
(e.g. reserved words) and operands (e.g. variables)
A subset of Halstead complexity measures 
 Halstead, Maurice H. “Elements of Software Science”. Amsterdam:
Elsevier North-Holland, Inc. 1977. ISBN 0-444-00205-7.
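As a minimal sketch of the two Halstead measures, the Python function below uses the standard definitions: volume is the program length (total operators plus total operands) times log2 of the vocabulary size (distinct operators plus distinct operands), and difficulty is half the number of distinct operators times the ratio of total to distinct operands. It assumes the code has already been tokenized into operators and operands. The token lists for the statement `z = x + y * x` are written out by hand for illustration.

```python
import math


def halstead(operators, operands):
    """Return (volume, difficulty) from lists of operator and operand tokens."""
    n1, n2 = len(set(operators)), len(set(operands))  # distinct counts
    N1, N2 = len(operators), len(operands)            # total counts
    if n1 + n2 == 0 or n2 == 0:
        raise ValueError("need at least one operand")
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    return volume, difficulty


# Hand-tokenized example statement: z = x + y * x
volume, difficulty = halstead(["=", "+", "*"], ["z", "x", "y", "x"])
print(round(volume, 2), difficulty)
```

Here there are 3 distinct operators, 3 distinct operands, 3 total operators and 4 total operands, so the difficulty is (3 / 2) * (4 / 3) = 2.0 and the volume is 7 * log2(6).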
Fan-In & Fan-Out Metrics
aka Afferent & Efferent Coupling
■ Fan-Out (efferent coupling): number of elements that a code element depends upon
■ Fan-In (afferent coupling): number of elements that depend on a code element
 S. Henry and D. Kafura. “Software structure metrics based on information flow”.
IEEE Transactions on Software Engineering, no. 5. pp. 510–518. 1981.
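Both coupling counts can be derived from a dependency map. The sketch below assumes such a map is already available as a dictionary from each code element to the elements it depends on; the module names are hypothetical. Fan-out counts the elements a code element depends upon, and fan-in counts the elements that depend on it.

```python
from collections import Counter


def fan_out(deps):
    """Number of distinct elements each element depends upon."""
    return {elem: len(set(targets)) for elem, targets in deps.items()}


def fan_in(deps):
    """Number of distinct elements that depend on each element."""
    counts = Counter()
    for targets in deps.values():
        for target in set(targets):
            counts[target] += 1
    elements = set(deps) | set(counts)
    return {elem: counts[elem] for elem in elements}


# Hypothetical module dependencies.
DEPS = {
    "views": ["models", "utils"],
    "models": ["utils"],
    "utils": [],
}

print(fan_out(DEPS))
print(fan_in(DEPS))
```

In this example `utils` has the highest fan-in (2) and `views` the highest fan-out (2).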
Metrics in a Real Project
Changes in complexity measures in a real-world project
[Figure: analysis of every commit of a project]
[Figure: details for a single commit, showing Fan-In, Fan-Out, and source]
Under the Hood
JHawk (Java) &
Python & Django
■ Squale: bounded, continuous scale
for comparison of metric values 
■ Combines low-level marks (raw metric values)
into individual marks (IM)
■ IM mapped to unified scale (from 0 to 3),
determined by experts 
■ IM then aggregated (weighted) to form global mark
The Software Quality Enhancement (Squale) Model
 Mordal-Manet et al. "The squale model—A practice-based industrial quality model."
IEEE International Conference on Software Maintenance. 2009.
 Balmas et al. “Software metric for Java and C++ practices”. Research Report. pp. 44. 2010.
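The two Squale steps (mapping raw values to individual marks on the 0 to 3 scale, then weighted aggregation) can be sketched as follows. Two things here are assumptions, since the slides do not spell them out: the threshold points are hypothetical stand-ins for the expert-determined mapping, and the aggregation is assumed to take the form -log_w(mean(w^(-IM))), in which a larger weight w lets low marks dominate the global mark.

```python
import math


def individual_mark(raw, points):
    """Map a raw metric value to a mark by clamped linear interpolation.
    `points` is a list of (raw_value, mark) pairs sorted by raw_value."""
    if raw <= points[0][0]:
        return points[0][1]
    if raw >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= raw <= x1:
            return y0 + (y1 - y0) * (raw - x0) / (x1 - x0)


def global_mark(marks, weight=9.0):
    """Aggregate individual marks; bad marks weigh more as `weight` grows.
    Assumed form: -log_weight(mean(weight ** -mark))."""
    mean = sum(weight ** -m for m in marks) / len(marks)
    return -math.log(mean, weight)


# Hypothetical thresholds for cyclomatic complexity: low values earn mark 3.
CC_POINTS = [(2, 3.0), (10, 2.0), (20, 0.0)]

marks = [individual_mark(cc, CC_POINTS) for cc in (1, 6, 15, 30)]
print(marks)
print(global_mark(marks))
```

With this aggregation a set of identical marks keeps its value, while a single very low mark pulls the global mark well below the arithmetic mean.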
■ Determine changes in code metrics, i.e. deltas,
for each developer over time
■ Identify influence of commit on component’s global mark
■ Expertise: ratio of commits that increase / decrease marks
(quality impact, qi) smoothed by total author commits ( )
From metrics to knowledge about developers
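The slides do not give the exact expertise formula, so the following is only an illustrative sketch of the idea: count, per author and component, the commits that raise or lower the component's global mark, and damp the result for authors with few commits. The scoring rule `(improving - degrading) / (total + k)`, the constant `k`, and all names and data are assumptions made for this example.

```python
from collections import defaultdict


def expertise_scores(commits, k=5.0):
    """commits: iterable of (author, component, mark_delta) tuples.
    Returns {(author, component): score}. The smoothing constant k is
    hypothetical and shrinks scores of authors with few commits."""
    stats = defaultdict(lambda: [0, 0, 0])  # improving, degrading, total
    for author, component, delta in commits:
        entry = stats[(author, component)]
        entry[2] += 1
        if delta > 0:
            entry[0] += 1
        elif delta < 0:
            entry[1] += 1
    return {
        key: (up - down) / (total + k)
        for key, (up, down, total) in stats.items()
    }


def recommend(scores, component, top=3):
    """Return up to `top` authors for a component, best score first."""
    ranked = sorted(
        ((score, author) for (author, comp), score in scores.items()
         if comp == component),
        reverse=True,
    )
    return [author for _, author in ranked[:top]]


# Made-up commit history: deltas of a component's global mark per commit.
COMMITS = [
    ("alice", "backend", 0.2), ("alice", "backend", 0.1),
    ("alice", "backend", 0.3), ("alice", "backend", -0.05),
    ("alice", "backend", 0.1), ("bob", "backend", 0.3),
]

scores = expertise_scores(COMMITS)
print(recommend(scores, "backend"))
```

The damping matters here: a single improving commit by `bob` does not outrank the longer, mostly improving history of `alice`.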
■ Two surveys performed for evaluation
■ Expert identification — who is currently being asked
■ without knowledge of Analyzr results
■ Proposal evaluation — who should be asked
■ with knowledge of results
■ Bounded time frame of observation
■ Distinguish temporary and permanent leave
■ In this study: 62 days
Assessing the quality of results with surveys
Expert Identification Survey
■ Task: Identify top 3 domain experts for front and back end
components, prior to tool introduction
■ Total agreement between participants on top expert
■ In front end components: 55%
■ In back end components: 33.3%
■ Majority agreed on top expert in 88% of total choices
→ Developers have a specific component expert in mind
Who is currently being identified as domain expert
Expert Identification Survey
■ Accuracy of Analyzr predictions for first choice
of domain expert vs intuitive developer picks
■ Front and back end combined: 47.37% match
■ Back end: 71.43% match
■ Front end: 50% miss
Comparing Analyzr predictions to intuitive survey data
Analyzr Proposal Evaluation
■ Developers asked to rate first, second, third choice
of component expert suggested
■ Scale: strongly disagree (0), disagree (1), agree (2), strongly agree (3)
■ Back end: 100% agreement (87.5% strongly agree)
■ Front end: 90% agreement (48.5% strongly agree)
Survey on what developers think of suggestions
Summary & Conclusion
■ Feasibility of identifying experts using code complexity
■ Algorithmically identified experts differed from intuitive selections
■ Algorithmically identified experts rated as acceptable candidates in over 90% of cases
→ Evidence for non-obvious component experts,
i.e. “hidden experts”
→ Asking for the “guru” might not be ideal,
might simply get you the default person
@chrisma0