@jgbarah Bitergia / LibreSoft (URJC) http://speakerdeck.com/jgbarah/ Metrics Day at Chalmers University of Technology Gothenburg (Sweden), November 10th 2016 Jesus Gonzalez-Barahona (Bitergia) Metrics for Large Software Development Teams Metrics Day 2016 1 / 69
Dealing with dynamic complexity 3 Sources of information 4 Activity / size 5 Remaining code 6 Performance 7 Demographics 8 Diversity in FOSS development 9 GrimoireLab: tools for software development analytics 10 Final remarks
research team Understanding free, open source software Data analytics approach Bitergia: From research to the real world Understanding software development Data analytics approach http://gsyc.es/~jgb
open book Fork and play! https://jgbarah.gitbooks.io/evaluating-foss-projects/
Visit Cauldron.io and produce your own dashboard Play with the dashboards Understand the interpretations behind the numbers http://cauldron.io Code: OWL2016
to... ...track what’s happening ...understand why it’s happening ...react quickly ...evaluate results of reaction If data is available, analytics may come to the rescue
data Define key parameters Monitor, understand, detect deviations Act to correct, improve Track results Measure → Monitor → Act
Mercurial, Bazaar, etc. Today: most of them accessible through git... but the information is not always what it appears to be (e.g. branches in Subversion and git) Can be integrated with other tools: Gerrit, GitHub, etc.
RedMine Trac ... Each with a different model, data, operations...
review pre-merge change review Different methods: Mailing lists (e.g. Linux) Gerrit (e.g. OpenStack) GitHub pull requests (e.g. ElasticSearch) or even Jira, Bugzilla... Usually, references to tickets and commits Much of the control over the software lies here
Mailing list archivers (Gmane) Forums: too many to mention Question/Answer sites: StackOverflow, Askbot Information is always archived
Not always text-based (e.g. videoconferences) Notes: In many cases, lack of archives Privacy concerns: considered informal communication Difficult to track identities
is explicit in FOSS & inner sourcing) Developers: all repositories Contributors: issue tracking, async communication Users: async communication, ... Ecosystem: difficult to track Software may include beacons: tracking usage Needed: tracking identities in different data sources
source code management system reporting, commenting or fixing bugs: issue tracking system submitting patches or reviewing them: code review system sending messages: async or sync communication systems
a certain period. People active for a certain period. Evolution of any of them. Trends for any of them. Difficult to compare between projects Interesting to compare inside a project (different subprojects, different time frames)
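A minimal sketch of the two basic measures on this slide (activity in a period, people active in a period), assuming the events have already been extracted as (date, author) pairs. The function name `activity_by_month`, the monthly granularity, and the input shape are illustrative assumptions, not something the talk prescribes.

```python
from collections import defaultdict
from datetime import date


def activity_by_month(events):
    """Count events and distinct active people per (year, month).

    `events` is an iterable of (date, author) pairs; this input shape
    is a hypothetical one chosen for the example.
    """
    counts = defaultdict(int)
    people = defaultdict(set)
    for day, author in events:
        period = (day.year, day.month)
        counts[period] += 1
        people[period].add(author)
    # One entry per period, in chronological order:
    # period -> (number of events, number of distinct authors)
    return {p: (counts[p], len(people[p])) for p in sorted(counts)}


if __name__ == "__main__":
    sample = [
        (date(2016, 1, 5), "ann"),
        (date(2016, 1, 20), "bob"),
        (date(2016, 1, 25), "ann"),
        (date(2016, 2, 2), "ann"),
    ]
    for period, (n_events, n_people) in activity_by_month(sample).items():
        print(period, n_events, n_people)
```

The resulting series can be compared across time frames or subprojects of the same project, which is the comparison the slide suggests is most meaningful.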
files by first remaining commit] http://linux.biterg.io
(data of authorship, “.c” files) From top left, clockwise: Wireless, USB, IRDA, Ethernet
Ticket(s) Code review Automated testing Commit in code base The OpenStack case Blueprint (if feature), Launchpad Ticket (bug, feature), Launchpad Code review, Gerrit Automated testing, Jenkins Commit in code base, Gerrit, Git Similar cases: GitHub, GitLab, Atlassian Requires discipline in the development team Requires enough traces in the repositories
of repository level. The project level. The global level.
bugs) Small contributions: answers, bug fixes, change proposals Core: design, feature implementation, bug fixes Inner source Questions, reports, etc. in public (no more coffee machine meetings) Moving to develop: answers, bug fixes, change proposals Core: design, feature implementation, bug fixes, mentorship Finding traces, visualizing career evolution Assessments & forecasts of available expertise Identification of success stories
linked to bug fixing and code review Who is helping others to improve their skills? Who is benefiting most from the help of others? Who are the newcomers, and which of them are not receiving mentorship? When might a newcomer become a mentor?
[the ASF] created a term we have coined “Pony Factor” (because ASF is full of ponies, or people who think they are ponies). Pony Factor (PF) shows the diversity of a project in terms of the division of labor among committers in a project. Pony Factor is determined as: “The lowest number of committers whose total contribution constitutes the majority of the codebase” https://ke4qqq.wordpress.com/2015/02/08/pony-factor-math/
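The definition above can be sketched in a few lines, under two assumptions that the slide does not fix: "contribution" is measured as number of commits, and "majority" means strictly more than half. The function name `pony_factor` and the input (one committer name per commit) are hypothetical choices for this example.

```python
from collections import Counter


def pony_factor(commit_authors):
    """Smallest number of committers who together made most commits.

    `commit_authors` holds one committer name per commit. Measuring
    contribution in commits, and reading "majority" as strictly more
    than half, are assumptions of this sketch.
    """
    per_author = Counter(commit_authors)
    total = sum(per_author.values())
    covered = 0
    # Walk committers from most to least active, accumulating commits.
    for rank, (_, commits) in enumerate(per_author.most_common(), start=1):
        covered += commits
        if covered * 2 > total:
            return rank
    return 0  # no commits at all


if __name__ == "__main__":
    history = ["ann"] * 5 + ["bob"] * 3 + ["eve"] * 2
    print(pony_factor(history))
```

A low value means the project depends on very few people for most of its changes.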
from companies (elephants). The elephant factor shows the diversity of a project in terms of the division of labor among companies (by means of developers affiliated with them). Elephant factor is determined as: “The lowest number of companies whose total contribution (in commits by their employees) constitutes the majority of the commits”
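A hedged sketch of the elephant factor as defined above: commits are attributed to companies through an author-to-company mapping, then companies are accumulated from largest to smallest until they cover more than half of the commits. The function name, the `affiliations` dictionary, and the choice to pool unaffiliated authors into a single "Unknown" bucket are all assumptions of this example. In practice affiliations change over time and need proper identity management.

```python
from collections import Counter


def elephant_factor(commit_authors, affiliations, unknown="Unknown"):
    """Smallest number of companies whose employees made most commits.

    `commit_authors` holds one author name per commit and
    `affiliations` maps author -> company (both hypothetical inputs).
    Authors missing from the mapping are pooled under `unknown`,
    a simplification that can distort results.
    """
    per_company = Counter(
        affiliations.get(author, unknown) for author in commit_authors
    )
    total = sum(per_company.values())
    covered = 0
    for rank, (_, commits) in enumerate(per_company.most_common(), start=1):
        covered += commits
        if covered * 2 > total:  # strictly more than half
            return rank
    return 0  # no commits at all


if __name__ == "__main__":
    who_works_where = {"ann": "Acme", "bob": "Acme", "eve": "Initech"}
    history = ["ann"] * 4 + ["bob"] * 3 + ["eve"] * 3 + ["dan"] * 2
    print(elephant_factor(history, who_works_where))
```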
version is “owned” by the people who produced it. The code “belongs” to those who wrote it. Zapata factor (work in progress): “The lowest number of developers for whom the total number of lines of code they “own” (were last touched by them) constitutes the majority of the lines of code”
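One possible way to compute the Zapata factor, assuming line ownership is taken from the text output of `git blame --line-porcelain` (which repeats an `author` header for every line of the file). This is only a sketch under that assumption, not the method used for the figures in the talk. The helper names are hypothetical, and a real analysis would run blame over every file in the tree and merge developer identities.

```python
from collections import Counter


def owners_from_blame(porcelain_text):
    """Return the author of each line from `git blame --line-porcelain`.

    Header lines look like "author Some Name". File content lines
    start with a TAB, so they can never be mistaken for headers.
    """
    prefix = "author "
    return [
        line[len(prefix):]
        for line in porcelain_text.splitlines()
        if line.startswith(prefix)
    ]


def zapata_factor(line_owners):
    """Smallest number of developers who "own" most lines of code.

    `line_owners` holds one name per line of code: whoever touched
    that line last. "Most" is read as strictly more than half.
    """
    per_owner = Counter(line_owners)
    total = sum(per_owner.values())
    covered = 0
    for rank, (_, lines) in enumerate(per_owner.most_common(), start=1):
        covered += lines
        if covered * 2 > total:
            return rank
    return 0  # no lines at all


if __name__ == "__main__":
    blame = (
        "aaaa 1 1 2\nauthor Ann\nauthor-mail <ann@example.org>\n\tint x;\n"
        "aaaa 2 2\nauthor Ann\nauthor-mail <ann@example.org>\n\tx = 1;\n"
        "bbbb 3 3 1\nauthor Bob\nauthor-mail <bob@example.org>\n\treturn x;\n"
    )
    print(zapata_factor(owners_from_blame(blame)))
```

Replacing each owner with their employer before counting gives the company-level variant of the same measure.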
developers changing it. United Fruit factor (work in progress): “The lowest number of companies for whom the total number of lines of code they “own” (were last touched by their employees) constitutes the majority of the lines of code”
9.9% (330 developers) Linux kernel, Nov 2015 – Oct 2016
identity management ElasticSearch (*): database for storing everything Kibiter: dashboard (light fork of Kibana) Panels: visualizations for Kibiter http://grimoirelab.github.io (*) Not a part of GrimoireLab
what is important Explore new ways of making data useful Tell interesting stories based on data Visualization is very important Higher-order metrics Simplify results, make them meaningful Can we characterize many aspects with a small set of metrics?
can measure a lot of things... http://bitergia.com http://grimoirelab.github.io http://speakerdeck.com/jgbarah
http://2017.msrconf.org 14th International Conference on Mining Software Repositories Co-located with ICSE Buenos Aires, Argentina Save the dates: May 20-21 2017 Start the conversation!!! #msr17
under the “Attribution-ShareAlike 3.0” license, by Creative Commons, available at http://creativecommons.org/licenses/by-sa/3.0/
located in Ottawa, Canada Picture by Lezumbalaberenjena in Wikimedia Commons License: Public domain https://commons.wikimedia.org/wiki/File:Man_With_Two_Hats_Ottawa_Statue_by_lezumbalaberenjena.jpg “Crowd at FOSDEM 2008” by Jesús Corrius License: CC Attribution 2.0 http://www.flickr.com/photos/jcorrius/2302302707/ “Emiliano Zapata” License: Public Domain