Open Development Analytics: a Step Towards More Project Transparency
Understanding the inner life of free / open source software projects is important to developers, users and decision makers.
Presentation for Linux Foundation Collaboration Summit 2016
M. Gonzalez-Barahona [email protected] @jgbarah Bitergia / LibreSoft (URJC) Linux Foundation Collaboration Summit Lake Tahoe (CA, USA), March 29th 2016 Jesus Gonzalez-Barahona (Bitergia) Open Development Analytics March 2016 1 / 61
Transparency and governance
3 A personal journey
4 Open development analytics
5 Who is contributing?
6 How are changes being reviewed?
7 Dependency
8 Dealing with issues?
9 Diversity
10 Bonus track
research team
Understanding free, open source software
Data analytics approach
Bitergia: From research to the real world
Understanding software development
Data analytics approach
http://gsyc.es/~jgb
the source for awareness... when it can be used for “sensing”
The same applies to any open organization
implementation, by the members of the governing body of an organization. It includes the mechanisms required to balance the powers of the members (with the associated accountability), and their primary duty of enhancing the prosperity and viability of the organization.”
http://businessdictionary.com
(fairness)
Transparency to third parties (trust)
Which for open organizations are kind of the same
to promote sharing and collaboration within the OpenStack community”
https://www.openstack.org/legal/transparency-policy/
a project. I’m hiring developers, supporting the foundation, sponsoring activities...
Are my developers treated according to the policies?
Are we getting integrated in the community?
How do we compare with other companies of similar characteristics?
Are we having reasonable metrics, according to the current stated policies and agreements?
large fraction of my time to this project.
Are my initiatives being considered on fair terms?
Are employees of other companies dealing with me the same way they do with their company colleagues?
Am I considered based on my merits?
Am I having reasonable metrics, according to the current stated policies?
being neutral, inclusive, in making life easy for all contributors
Are newcomers being treated as they should?
Are we balancing the interests of companies and independent developers?
Do we have subprojects which are outliers in terms of performance, inclusiveness, etc.?
Are the policies we put in place having some impact?
Do metrics show our project is as we intended it to be?
answered with data
There is a lot of value in doing it in the open!
But:
Data retrieval is not that easy
We need FOSS tools
We need comparable results
We need visualizations
open we produce a great deal of data about how we develop
“Show me the development data” as a step beyond “show me the code”
community, development for open development projects can be retrieved, organized, analyzed
Let’s publish analytics results & data
Open Development Analytics: A new standard for transparency
managers
Evaluators
...
Anyone interested in the health of the project
drivers
They join forces to push the project...
...but they watch each other, look for balances
They contribute money, resources...
...and direct development effort
Having an accurate, transparent picture is very important!
[Figure: iterations per accepted review (median) plotted against number of accepted reviews]
ASF] created a term we have coined “Pony Factor” (because ASF is full of ponies, or people who think they are ponies). Pony Factor (PF) shows the diversity of a project in terms of the division of labor among committers in a project. Pony Factor is determined as: “The lowest number of committers whose total contribution constitutes the majority of the codebase”
https://ke4qqq.wordpress.com/2015/02/08/pony-factor-math/
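The definition above can be sketched in a few lines. This is only an illustration under stated assumptions: "contribution" is measured as number of commits, "majority" is read as strictly more than half, and the function name `pony_factor` and its input (one author name per commit) are hypothetical, not taken from the slides or from any tool.

```python
from collections import Counter


def pony_factor(commit_authors):
    """Smallest number of committers who together authored a majority of commits.

    commit_authors: iterable with one author name per commit (assumed input).
    Returns 0 when there are no commits.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    # Walk committers from most to least active, accumulating their commits.
    for rank, (_, commits) in enumerate(counts.most_common(), start=1):
        covered += commits
        # Assumption: "majority" means strictly more than half of all commits.
        if 2 * covered > total:
            return rank
    return 0
```

A low value means a handful of people carry most of the work; the exact threshold and the unit of contribution are choices that should be stated next to any published number.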
companies (elephants). The elephant factor shows the diversity of a project in terms of the division of labor among companies (by means of developers affiliated with them). Elephant factor is determined as: “The lowest number of companies whose total contribution (in commits by their employees) constitutes the majority of the commits”
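As a rough sketch of the elephant factor, commits can first be attributed to companies and then ranked. The function name, the `affiliation` mapping (author to company) and the single `"(unknown)"` bucket for authors without a known affiliation are all assumptions made for this example; lumping unaffiliated developers together is a simplification that a real analysis would probably want to revisit.

```python
from collections import Counter


def elephant_factor(commit_authors, affiliation, default="(unknown)"):
    """Smallest number of companies whose employees authored a majority of commits.

    commit_authors: iterable with one author name per commit (assumed input).
    affiliation: dict mapping author name to company name (assumed input).
    default: bucket used for authors missing from the mapping (assumption).
    """
    per_company = Counter(affiliation.get(author, default) for author in commit_authors)
    total = sum(per_company.values())
    covered = 0
    # Walk companies from most to least active, accumulating their commits.
    for rank, (_, commits) in enumerate(per_company.most_common(), start=1):
        covered += commits
        # Assumption: "majority" means strictly more than half of all commits.
        if 2 * covered > total:
            return rank
    return 0
```

With three developers contributing 4, 3 and 3 commits, where the last two share an employer, no single developer holds a majority, yet one company does, so this function returns 1.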
is “owned” by the people who produced it. The code “belongs” to those who wrote it. Zapata factor (work in progress): “The lowest number of developers for whom the total number of lines of code they “own” (were last touched by them) constitutes the majority of the lines of code”
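One possible way to approximate "last touched by" is the output of `git blame --line-porcelain`, which repeats an `author <name>` header for every line of a file. The sketch below only parses such text (it does not run git), and both function names are hypothetical. It is an assumption-laden illustration of the idea, not the method used for the slides.

```python
from collections import Counter


def lines_owned(blame_porcelain):
    """Count lines per last-touching author from `git blame --line-porcelain` text.

    Each blamed line carries a header line starting with "author ". File content
    lines are prefixed with a TAB in this format, so they are never counted.
    """
    owners = Counter()
    for line in blame_porcelain.split("\n"):
        if line.startswith("author "):
            owners[line[len("author "):]] += 1
    return owners


def zapata_factor(owners):
    """Smallest number of developers who together own a majority of the lines."""
    total = sum(owners.values())
    covered = 0
    for rank, (_, lines) in enumerate(owners.most_common(), start=1):
        covered += lines
        # Assumption: "majority" means strictly more than half of all lines.
        if 2 * covered > total:
            return rank
    return 0
```

For a whole project, the per-file counters can be added together (`Counter` supports `+`) before calling `zapata_factor`.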
changing it. United Fruit factor (work in progress): “The lowest number of companies for whom the total number of lines of code they “own” (were last touched by their employees) constitutes the majority of the lines of code”
may mandate transitions but are they real?
Time to close when same company reporting / fixing?
Time to close for external bug reports?
Time to close depending on who reports?
Who opens tickets that nobody cares about?
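The first of these questions could be explored along the following lines. The ticket fields (`opened`, `closed`, `reporter_org`, `fixer_org`) and the function name are hypothetical, chosen for this sketch; real issue trackers expose this information in different shapes, and affiliation usually has to be resolved separately.

```python
from datetime import datetime
from statistics import median


def median_days_to_close(tickets):
    """Median time to close, split by whether reporter and fixer share a company.

    tickets: iterable of dicts with hypothetical keys "opened" and "closed"
    (datetime, or None for "closed" when still open), "reporter_org", "fixer_org".
    Returns a dict with the median in days per group, or None for an empty group.
    """
    groups = {"same_company": [], "different_company": []}
    for ticket in tickets:
        if ticket["closed"] is None:
            continue  # still open: there is no time to close to measure yet
        days = (ticket["closed"] - ticket["opened"]).total_seconds() / 86400
        same = ticket["reporter_org"] == ticket["fixer_org"]
        groups["same_company" if same else "different_company"].append(days)
    return {name: (median(values) if values else None) for name, values in groups.items()}
```

The median is used here rather than the mean because a few very old tickets can dominate an average; that is a design choice of the sketch, not something stated in the slides.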
detailed records, but open communities are different
Fortunately, some tools leave traces...
This allows for better knowledge
...and better tracking of initiatives
Example: policies to enlarge the number of developers in XXX region
OpenStack

Gender   Developers   Commits   Commits/devel
Female          750    14,647            19.5
Male          4,632   207,112            44.7

Only names with more than 80% of certainty.
[Work in progress, preliminary results]
distributed under the “Attribution-ShareAlike 3.0” license, by Creative Commons, available at http://creativecommons.org/licenses/by-sa/3.0/
located in Ottawa, Canada
Picture by Lezumbalaberenjena in Wikimedia Commons
License: Public domain
https://commons.wikimedia.org/wiki/File:Man_With_Two_Hats_Ottawa_Statue_by_lezumbalaberenjena.jpg

“Napoleon’s Russian campaign of 1812”
Original by Charles Minard
License: Public domain
https://en.wikipedia.org/wiki/Charles_Joseph_Minard#/media/File:Minard.png

“Aged Come In We’re Open”
Picture by Czarina Alegre in Flickr
License: Creative Commons Attribution 2.0
https://flic.kr/p/fjGamh
License: Creative Commons Attribution-NonCommercial 2.5
http://xkcd.com/844/

“Crowd at FOSDEM 2008”
Picture by Jesús Corrius in Flickr
License: Creative Commons Attribution 2.0
http://www.flickr.com/photos/jcorrius/2302302707/

“Elephant”
Picture by ajoheyho
License: Creative Commons Public Domain
https://pixabay.com/en/elephant-african-bush-elephant-114543/

“Emiliano Zapata”
License: Public Domain