[email protected] @jgbarah Bitergia / LibreSoft (URJC) http://speakerdeck.com/jgbarah/ 12th International Conference on Open Source Systems (OSS) Gothenburg (Sweden), May 30th 2016 Jesus Gonzalez-Barahona (Bitergia) Metrics for a Software Development Community OSS 2016 1 / 54
Dealing with dynamic complexity 3 Sources of information 4 Activity / size 5 Performance 6 Demographics 7 Diversity 8 Final remarks
research team Understanding free, open source software Data analytics approach Bitergia: From research to the real world Understanding software development Data analytics approach http://gsyc.es/~jgb
open book Fork and play! https://jgbarah.gitbooks.io/evaluating-foss-projects/
Visit Cauldron.io and produce your own dashboard. Play with the dashboards. Understand the interpretations behind the numbers. http://cauldron.io Code: OSS16
to... ...track what’s happening; ...understand why it’s happening; ...react quickly; ...evaluate results of reaction. If data is available, analytics may come to the rescue.
data. Define key parameters. Monitor, understand, detect deviations. Act to correct, improve. Track results. Measure → Monitor → Act
Mercurial, Bazaar, etc. Today: most of them are accessible through git... but the information is not always what it appears to be (e.g., branches in Subversion and git). Can be integrated with other tools: Gerrit, GitHub, etc.
RedMine Trac ... Each with a different model, data, operations...
review pre-merge change review. Different methods: mailing lists (e.g., Linux), Gerrit (e.g., OpenStack), GitHub pull requests (e.g., ElasticSearch), or even Jira, Bugzilla... Usually, references to tickets and commits. Much of the control over the software lies here.
Mailing list archivers (Gmane). Forums: too many to mention. Question/Answer sites: StackOverflow, Askbot. Information is always archived.
Not always text-based (e.g., videoconferences). Notes: in many cases, lack of archives. Privacy concerns: considered informal communication. Difficult to track identities.
tracking, async communication. User community: async communication, ... Ecosystem community: difficult to track. Software may include beacons: tracking usage.
source code management system; reporting, commenting or fixing bugs: issue tracking system; submitting patches or reviewing them: code review system; sending messages: async or sync communication systems.
a certain period. People active for a certain period. Evolution of any of them. Trends for any of them. Difficult to compare between projects. Interesting to compare inside a project (different subprojects, different time frames).
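As an illustration of these activity metrics, here is a minimal sketch that counts commits and active people per month. It assumes commits are already available as (author, date) pairs; the function name `monthly_activity` and the sample data are hypothetical, not taken from the talk.

```python
from collections import defaultdict
from datetime import date


def monthly_activity(commits):
    """Map (year, month) to (number of commits, number of active authors).

    `commits` is an iterable of (author, date) pairs (assumed input format).
    """
    n_commits = defaultdict(int)
    authors = defaultdict(set)
    for author, day in commits:
        key = (day.year, day.month)
        n_commits[key] += 1
        authors[key].add(author)
    # Sort keys so the evolution over time can be read in order.
    return {k: (n_commits[k], len(authors[k])) for k in sorted(n_commits)}


# Illustrative data only.
sample = [
    ("ana", date(2016, 4, 3)),
    ("bo", date(2016, 4, 20)),
    ("ana", date(2016, 4, 25)),
    ("ana", date(2016, 5, 2)),
]
print(monthly_activity(sample))
```

The same grouping could be applied per subproject or per time frame, which is the kind of within-project comparison the slide recommends.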
of repository level. The project level. The global level.
[the ASF] created a term we have coined “Pony Factor” (because ASF is full of ponies, or people who think they are ponies). Pony Factor (PF) shows the diversity of a project in terms of the division of labor among committers in a project. Pony Factor is determined as: “The lowest number of committers whose total contribution constitutes the majority of the codebase” https://ke4qqq.wordpress.com/2015/02/08/pony-factor-math/
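The definition quoted above can be sketched in a few lines. This is only one possible reading: it assumes contribution is measured as commits per committer, and that "majority" means strictly more than half. The name `pony_factor` and the sample data are illustrative.

```python
from collections import Counter


def pony_factor(commit_authors):
    """Smallest number of committers who together made more than half of the commits.

    `commit_authors` has one entry per commit (assumed input format).
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    # Walk committers from most to least active, accumulating their commits.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if 2 * covered > total:
            return rank
    return 0  # no commits at all


# Illustrative data: ana has 5 commits, bo 3, cy 2.
authors = ["ana"] * 5 + ["bo"] * 3 + ["cy"] * 2
print(pony_factor(authors))
```

With this data ana alone has exactly half of the commits, which is not a majority under the assumption above, so two committers are needed.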
from companies (elephants). The elephant factor shows the diversity of a project in terms of the division of labor among companies (by means of developers affiliated with them). Elephant factor is determined as: “The lowest number of companies whose total contribution (in commits by their employees) constitutes the majority of the commits”
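A minimal sketch of the elephant factor as defined above, assuming a mapping from developer to company is available. The function name `elephant_factor`, the `affiliation` dictionary, and the choice to count each unaffiliated developer as a separate "company" are assumptions made for illustration.

```python
from collections import Counter


def elephant_factor(commit_authors, affiliation):
    """Smallest number of companies whose employees made more than half of the commits.

    `commit_authors` has one entry per commit; `affiliation` maps author -> company.
    Authors missing from `affiliation` are counted on their own (an assumption).
    """
    counts = Counter(affiliation.get(author, author) for author in commit_authors)
    total = sum(counts.values())
    covered = 0
    # Walk companies from most to least active, accumulating their commits.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if 2 * covered > total:
            return rank
    return 0  # no commits at all


# Illustrative data only.
authors = ["ana"] * 5 + ["bo"] * 3 + ["cy"] * 2
companies = {"ana": "Acme", "bo": "Acme", "cy": "Initech"}
print(elephant_factor(authors, companies))
```

Here the two Acme developers account for 8 of 10 commits, so a single company already holds the majority.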
version is “owned” by the people who produced it. The code “belongs” to those who wrote it. Zapata factor (work in progress): “The lowest number of developers for whom the total number of lines of code they “own” (were last touched by them) constitutes the majority of the lines of code”
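One way to approximate "last touched by" is the per-line author reported by `git blame --line-porcelain`. The sketch below parses that output and applies the definition above, assuming "majority" means strictly more than half of the lines. The function names and the sample text are hypothetical, and this is not necessarily how the author computes the metric.

```python
from collections import Counter


def owners_from_blame(porcelain_text):
    """Return one author name per blamed line of `git blame --line-porcelain` output."""
    prefix = "author "  # "author-mail", "author-time", etc. do not match this prefix
    return [line[len(prefix):] for line in porcelain_text.splitlines()
            if line.startswith(prefix)]


def zapata_factor(owners):
    """Smallest number of developers who together own more than half of the lines."""
    counts = Counter(owners)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if 2 * covered > total:
            return rank
    return 0  # no lines at all


# Hand-written sample in the shape of porcelain output (abbreviated, illustrative).
SAMPLE = "\n".join([
    "1111 1 1 1", "author ana", "author-mail <ana@example.org>", "\tfirst line",
    "2222 2 2 1", "author bo", "author-mail <bo@example.org>", "\tsecond line",
    "1111 3 3 1", "author ana", "author-mail <ana@example.org>", "\tthird line",
])
print(zapata_factor(owners_from_blame(SAMPLE)))
```

For a whole project, the owner lists of all files would be concatenated before calling `zapata_factor`.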
developers changing it. United Fruit factor (work in progress): “The lowest number of companies for whom the total number of lines of code they “own” (were last touched by their employees) constitutes the majority of the lines of code”
what is important. Explore new ways of making data useful. Tell interesting stories based on data. Visualization is very important. Higher-order metrics. Simplify results, make them meaningful. Can we characterize many aspects with a small set of metrics?
http://icse2017.gatech.edu http://2017.msrconf.org (Coming soon!) 14th International Conference on Mining Software Repositories Co-located with ICSE Buenos Aires, Argentina Save the dates: May 20-21 2017 Start the conversation!!! #msr17
under the “Attribution-ShareAlike 3.0” license, by Creative Commons, available at http://creativecommons.org/licenses/by-sa/3.0/
located in Ottawa, Canada. Picture by Lezumbalaberenjena in Wikimedia Commons. License: Public domain. https://commons.wikimedia.org/wiki/File:Man_With_Two_Hats_Ottawa_Statue_by_lezumbalaberenjena.jpg “Crowd at FOSDEM 2008” by Jesús Corrius. License: CC Attribution 2.0. http://www.flickr.com/photos/jcorrius/2302302707/ “Emiliano Zapata”. License: Public Domain.