A Comparison of Three Algorithms for Computing Truck Factors (ICPC 2017)

A Comparison of Three Algorithms for Computing Truck Factors
Mívian Ferreira¹, Marco Tulio Valente¹ and Kecia Ferreira²
¹UFMG, Belo Horizonte - Brazil
²CEFET-MG, Belo Horizonte - Brazil
ICPC 2017

“No man is an island ...” John Donne, 1623

How to measure knowledge distribution in software projects?

Truck Factor
The minimal number of developers that have to be hit by a truck (or leave) to put a project in trouble

Goal
To compare three well-known algorithms for measuring Truck Factors:
• AVL: Avelino et al. (ICPC 2016)
• RIG: Rigby et al. (ICSE 2016)
• CST: Cosentino et al. (SANER 2015)

Research Questions
RQ1. How accurate are the results provided by each algorithm?
RQ2. How accurate is the identification of TF developers?
RQ3. What is the impact of different thresholds/configurations?

AVL
• Greedy heuristic
• Commit-based
• Uses DOA (Degree-of-Authorship) (Fritz et al., ICSE 2010, TOSEM 2014)
• Simulates the removal of the top authors until 50% of the files become abandoned
“A Novel Approach for Estimating Truck Factors”, Avelino et al., ICPC 2016
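The greedy removal step described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the file-to-authors map has already been computed (for example from DOA values), and the function name `greedy_truck_factor`, the strict "more than 50%" comparison, and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import Counter


def greedy_truck_factor(file_authors, threshold=0.5):
    """Return the developers removed before the project counts as abandoned.

    file_authors: dict mapping a file path to the set of its authors
    (hypothetical input, assumed to be precomputed, e.g. from DOA).
    threshold: share of abandoned files that stops the simulation.
    """
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    removed = []
    while total:
        # A file is abandoned once none of its authors is left.
        abandoned = sum(1 for authors in remaining.values() if not authors)
        if abandoned / total > threshold:  # strict comparison is an assumption
            break
        counts = Counter(dev for authors in remaining.values() for dev in authors)
        if not counts:
            break
        # Greedy choice: the developer who authors the most files
        # (ties broken alphabetically, an assumption for determinism).
        top = max(sorted(counts), key=lambda dev: counts[dev])
        removed.append(top)
        for authors in remaining.values():
            authors.discard(top)
    return removed


files = {
    "a.py": {"alice"},
    "b.py": {"alice"},
    "c.py": {"alice", "bob"},
    "d.py": {"carol"},
}
tf_set = greedy_truck_factor(files)
print(len(tf_set), tf_set)
```

In this toy input, removing the top author leaves exactly half of the files abandoned, so the simulation removes one more developer before it stops.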

RIG
• Blame approach: considers the author who last changed a line
• Abandoned line: a line last changed by a developer who left
• Abandoned file: at least 90% of its lines are abandoned
“Quantifying and mitigating turnover-induced knowledge loss: case studies of Chrome and a project at Avaya”, Rigby et al., ICSE 2016
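The two definitions above translate directly into a small check. The sketch below assumes the blame output has already been reduced to one author name per line; the function name `is_abandoned_file` and the treatment of empty files as not abandoned are assumptions for illustration.

```python
def is_abandoned_file(line_authors, departed, threshold=0.9):
    """Check whether a file is abandoned under the blame-based definition.

    line_authors: list with the last author of each line (hypothetical
    input, e.g. extracted from blame output).
    departed: set of developers who left the project.
    threshold: minimum share of abandoned lines (90% on the slide).
    """
    if not line_authors:
        return False  # assumption: an empty file is never abandoned
    abandoned_lines = sum(1 for author in line_authors if author in departed)
    return abandoned_lines / len(line_authors) >= threshold


blame = ["alice"] * 9 + ["bob"]
print(is_abandoned_file(blame, departed={"alice"}))  # 9 of 10 lines abandoned
print(is_abandoned_file(blame, departed={"bob"}))    # 1 of 10 lines abandoned
```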

RIG
• Non-deterministic algorithm
• Randomly simulates several Truck Factor scenarios
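To illustrate why random simulation makes the result non-deterministic, here is a rough sketch. It is not the procedure of Rigby et al.: the slides do not describe how scenarios are sampled or combined, so the decision rule (smallest group size for which some sampled group abandons more than half of the files), the function name `sampled_truck_factor`, and all parameters are assumptions.

```python
import random


def sampled_truck_factor(file_lines, samples=1000, file_share=0.5,
                         line_share=0.9, seed=None):
    """Estimate a Truck Factor by sampling random groups of leavers.

    file_lines: dict mapping a file path to the list of last authors of
    its lines (hypothetical input). Without a fixed seed, two runs may
    sample different groups and return different values.
    """
    rng = random.Random(seed)
    devs = sorted({a for lines in file_lines.values() for a in lines})
    for size in range(1, len(devs) + 1):
        for _ in range(samples):
            group = set(rng.sample(devs, size))
            abandoned = sum(
                1 for lines in file_lines.values()
                if lines
                and sum(a in group for a in lines) / len(lines) >= line_share
            )
            # Assumed rule: stop at the first group that abandons
            # more than file_share of all files.
            if abandoned / len(file_lines) > file_share:
                return size
    return len(devs)


balanced = {
    "a.py": ["alice"] * 10,
    "b.py": ["alice"] * 10,
    "c.py": ["bob"] * 10,
    "d.py": ["bob"] * 10,
}
print(sampled_truck_factor(balanced, samples=50, seed=1))
```

In the `balanced` input no single developer abandons more than half of the files, so the estimate is 2 whatever the seed. With larger projects and few samples, a small critical group can be missed, which is where the run-to-run variation comes from.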


CST
• CST’s authors do not provide a detailed algorithm, but a tool
• CST first computes the Truck Factor at the file level
• The final result combines the Truck Factor of each file
“Assessing the bus factor of Git repositories”, Cosentino et al., SANER 2015

CST
• To compute knowledge on a file, CST considers 2 metrics:
• Last change takes it all (LCTA): authorship on a file is assigned to the last developer who modified it
• Multiple changes equally considered (MCEC): authorship on a file is assigned to the author with the highest number of commits
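The two metrics as defined on this slide can be sketched in a few lines. This is an illustration of the slide's definitions, not the CST tool's code: the input format (a chronological list of commit authors for one file), the function names `lcta` and `mcec`, and the tie-break in `mcec` are assumptions.

```python
from collections import Counter


def lcta(commit_authors):
    """Last change takes it all: the last developer to modify the file."""
    return commit_authors[-1]


def mcec(commit_authors):
    """Multiple changes equally considered: the author with most commits.

    Ties go to the author who appears first in the history (assumption).
    """
    counts = Counter(commit_authors)
    return max(counts, key=counts.get)


# Hypothetical history of one file, oldest commit first.
history = ["ana", "bob", "ana", "carol"]
print(lcta(history), mcec(history))
```

The example shows how the two metrics can disagree on the same file: the last committer is not the most frequent one.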

Dataset
• 35 GitHub systems:
• 6 most popular languages
• 27 systems from AVL dataset
• 8 new systems

Oracle of Truck Factors
• Survey with the main developers of the systems
• We only accepted responses from top 10 developers
• Truck Factor number and Truck Factor sets

RQ1. How accurate are the results provided by each algorithm?

How...
Error of the TF estimated by each algorithm, compared to oracle values:
|Error| = |TF_algorithm - TF_oracle|
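As a small worked example of this error measure, the snippet below computes the absolute difference for a few estimates. The algorithm labels and all numbers are made up for illustration and are not results from the study.

```python
def tf_error(tf_algorithm, tf_oracle):
    """Absolute difference between an estimated and an oracle Truck Factor."""
    return abs(tf_algorithm - tf_oracle)


# Hypothetical estimates for one system whose oracle Truck Factor is 3.
estimates = {"algo_a": 3, "algo_b": 6, "algo_c": 2}
oracle_tf = 3
errors = {name: tf_error(tf, oracle_tf) for name, tf in estimates.items()}
print(errors)
```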

Results
AVL and CST are the most accurate algorithms

RQ2. How accurate is the identification of TF developers by each algorithm?

How...
Analysis of True Positives (TP), False Positives (FP) and False Negatives (FN), taken as sets of developers:
Precision = |TP| / |TP ∪ FP|
Recall = |TP| / |TP ∪ FN|
F-measure = (2 * P * R) / (P + R)
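These measures can be computed directly from two sets of developers: the set reported by an algorithm and the set named in the oracle. The sketch below is illustrative only; the function name `precision_recall_f`, the zero returned for empty inputs, and the example names are assumptions.

```python
def precision_recall_f(predicted, oracle):
    """Compare a predicted Truck Factor set against the oracle set."""
    tp = predicted & oracle   # developers correctly identified
    fp = predicted - oracle   # developers reported but not in the oracle
    fn = oracle - predicted   # oracle developers that were missed
    precision = len(tp) / len(tp | fp) if predicted else 0.0
    recall = len(tp) / len(tp | fn) if oracle else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical developer names.
p, r, f = precision_recall_f({"ana", "bob", "carol"}, {"bob", "carol", "dave"})
print(round(p, 3), round(r, 3), round(f, 3))
```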

Results
AVL has the highest precision

Results
RIG and AVL have the best recall, followed by CST

Results
AVL has the highest F-measure, closely followed by CST

RQ3. What is the impact of different thresholds and configurations on the results of each algorithm?

How...
• AVL: variation of the threshold on abandoned files (0.1 to 1.0)
• RIG: variation of the number of random samples of developers (1,000 to 10,000)
• CST: change of the knowledge metric (LCTA and MCEC)

Results
50% is the best threshold on abandoned files for AVL

Results
Increasing the number of tested samples does not have a major positive impact on RIG results.

Results
MCEC (multiple changes equally considered) is the knowledge metric that leads to the best results on CST

Conclusion
1. AVL and CST are the most accurate algorithms
2. AVL is the most accurate algorithm to predict the Truck Factor sets, closely followed by CST
3. The best threshold for AVL is 50%

Conclusion
4. RIG has a non-deterministic behavior, and changing the number of samples has a minor impact on its results
5. The Multiple Changes Equally Considered (MCEC) metric used by CST to infer code knowledge leads to the best results

Thanks for your attention!
[email protected]

