Open Development Analytics (reduced version)

Talk at the Metrics Session of the Open Source Summit Paris 2016. November 16th 2016, Paris (France).

Jesus M. Gonzalez-Barahona

November 15, 2016

Transcript

  1. Open Development Analytics: A Step Towards More Project Transparency (Reduced version)

    Jesus M. Gonzalez-Barahona, @jgbarah, http://speakerdeck.com/jgbarah. Bitergia / LibreSoft (URJC). Open Source Summit Paris (France), November 16th 2016. Jesus Gonzalez-Barahona (Bitergia) Open Development Analytics Paris, Nov 2016 1 / 54
  2. Structure of the presentation

    1. A bit of context
    2. Transparency and governance
    3. Open development analytics
    4. How are changes being reviewed?
    5. Dependency
    6. Dealing with issues?
    7. Diversity
    8. The end
  3. Me and my two hats

    Uni Rey Juan Carlos, LibreSoft research team: understanding free, open source software; data analytics approach. Bitergia, from research to the real world: understanding software development; data analytics approach. http://gsyc.es/~jgb
  4. The company

    The software development analytics company: dashboards, reports, consultancy, ... http://bitergia.com
  5. Who drives open software development

    A community: persons (and organizations) with common goals, different interests.
  6. Self-awareness

    Open development communities need to be self-aware. Data is the source for awareness... when it can be used for “sensing”. The same applies to any open organization.
  7. Governance

    “Establishment of policies, and continuous monitoring of their proper implementation, by the members of the governing body of an organization. It includes the mechanisms required to balance the powers of the members (with the associated accountability), and their primary duty of enhancing the prosperity and viability of the organization.” http://businessdictionary.com
  9. Transparency

    It comes in two flavors: transparency to the community (fairness) and transparency to third parties (trust), which for open organizations are kind of the same.
  10. Transparency

    Example of rationale (OpenStack): “OpenStack favors disclosure and transparency to promote sharing and collaboration within the OpenStack community” https://www.openstack.org/legal/transparency-policy/
  11. Transparency: showing the data is not enough
  12. A new dimension of openness

    When we develop in the open we produce a great deal of data about how we develop. “Show me the development data” as a step beyond “show me the code”.
  13. From open development to open development analytics

    Information about code, community, and development for open development projects can be retrieved, organized, and analyzed. Let’s publish analytics results & data. Open Development Analytics: a new standard for transparency.
  14. Open development analytics

    Who may benefit? Developers, project managers, community managers, evaluators, ... Anyone interested in the health of the project.
  15. Who may benefit?

    Slide used by Jim Zemlin at LF Collab 2016.
  16. Some areas of interest

    Performance (understanding activity); company participation (beyond copyright notices); transparency (available information); auditing (certify participation, experience, etc.); profiling (key people, companies); neutrality (fair treatment).
  17. Neutrality?

    [Plot; axes: number of accepted reviews, iterations per accepted review (median)]
  18. Apache Pony Factor

    In the words of Daniel Gruno: We [the ASF] created a term we have coined “Pony Factor” (because ASF is full of ponies, or people who think they are ponies). Pony Factor (PF) shows the diversity of a project in terms of the division of labor among committers in a project. Pony Factor is determined as: “The lowest number of committers whose total contribution constitutes the majority of the codebase” https://ke4qqq.wordpress.com/2015/02/08/pony-factor-math/
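The Pony Factor definition above can be sketched in a few lines. This is only an illustration under stated assumptions, not the ASF's own tooling: contribution is measured in commit counts, "majority" is read as strictly more than half, and `pony_factor` and `sample_log` are hypothetical names with made-up data.

```python
from collections import Counter


def pony_factor(commit_authors):
    """Smallest number of committers whose commits exceed half of the total.

    Assumption: contribution is measured as number of commits.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    covered = 0
    # Take committers from most to least active until a majority is covered;
    # picking the largest contributors first yields the smallest group.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if 2 * covered > total:
            return rank
    return 0  # empty log: no committers at all


# Hypothetical commit log: one entry per commit, naming its committer.
sample_log = ["ana"] * 5 + ["bo"] * 3 + ["cy"] * 2 + ["di"] * 1
print(pony_factor(sample_log))  # prints 2 (ana + bo hold 8 of 11 commits)
```

A low value means that a handful of people carry most of the work; a high value suggests labor is spread across many committers.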
  19. Bitergia Elephant Factor

    Projects can benefit from powerful collaborations from companies (elephants). The elephant factor shows the diversity of a project in terms of the division of labor among companies (by means of developers affiliated with them). Elephant factor is determined as: “The lowest number of companies whose total contribution (in commits by their employees) constitutes the majority of the commits”
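The step that distinguishes the elephant factor is attributing each commit to a company through its author's affiliation. The sketch below assumes a simple author-to-company mapping and groups authors with no known affiliation under a single placeholder, which is a simplification; `elephant_factor`, the mapping, and the sample data are hypothetical.

```python
from collections import Counter


def elephant_factor(commit_authors, affiliations, unknown="(unaffiliated)"):
    """Smallest number of companies whose employees authored over half the commits.

    `affiliations` maps author -> company. Authors missing from the mapping
    are grouped under `unknown` (an assumption of this sketch).
    """
    per_company = Counter(affiliations.get(a, unknown) for a in commit_authors)
    total = sum(per_company.values())
    covered = 0
    for rank, (_, n) in enumerate(per_company.most_common(), start=1):
        covered += n
        if 2 * covered > total:
            return rank
    return 0


# Hypothetical data: 12 commits by five authors, three known companies.
sample_commits = ["ana"] * 4 + ["bo"] * 3 + ["cy"] * 2 + ["di"] * 2 + ["eve"]
sample_affiliations = {"ana": "Acme", "bo": "Acme", "cy": "Globex", "di": "Initech"}
print(elephant_factor(sample_commits, sample_affiliations))  # prints 1
```

In this made-up sample no single author has a majority, yet one company does (7 of 12 commits), which is the kind of concentration the elephant factor is meant to expose.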
  20. Code “owned”

    “The land belongs to its workers” (Emiliano Zapata)
  21. Code “owned”

    The code changes over time. The current version is “owned” by the people who produced it. The code “belongs” to those who wrote it. Zapata factor (work in progress): “The lowest number of developers for whom the total number of lines of code they “own” (were last touched by them) constitutes the majority of the lines of code”
  22. Diversity: Code “owned”

    [Linux kernel, July 2016, Zapata factor: 200]
  23. Code “owned”

    The code “belongs” to companies who employ developers changing it. United Fruit factor (work in progress): “The lowest number of companies for whom the total number of lines of code they “own” (were last touched by their employees) constitutes the majority of the lines of code”
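Both line-ownership factors (Zapata and United Fruit) count lines of the current codebase by whoever last touched them. A minimal sketch follows, assuming the last author of every line is already known (for instance from blame-style output) and that "majority" means strictly more than half. The function name, the affiliation mapping, and the data are hypothetical.

```python
from collections import Counter


def majority_owners(owner_per_line):
    """Smallest group of owners who together 'own' more than half of the lines."""
    counts = Counter(owner_per_line)
    total = sum(counts.values())
    owners, covered = [], 0
    for owner, n in counts.most_common():
        if 2 * covered > total:
            break  # the owners collected so far already hold a majority
        owners.append(owner)
        covered += n
    return owners


# Hypothetical input: the last author of each of 100 lines of code.
last_author_per_line = ["ana"] * 40 + ["bo"] * 35 + ["cy"] * 25
# Hypothetical affiliations, used to roll developers up into companies.
company_of = {"ana": "Acme", "bo": "Globex", "cy": "Acme"}
company_per_line = [company_of[a] for a in last_author_per_line]

print(len(majority_owners(last_author_per_line)))  # Zapata factor: prints 2
print(len(majority_owners(company_per_line)))      # United Fruit factor: prints 1
```

Returning the owners rather than only their count makes it easy to see who the majority holders are, not just how many.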
  24. Pony / elephant factors for some projects

    Project        Pony Factor   Elephant Factor   Commits (excl. bots)
    OpenNebula     4             1                 12K
    Eucalyptus     5             1                 25K
    CloudStack     14            1                 42K
    OpenStack      >100          6                 126K
    CloudFoundry   41            1                 60K
    OpenShift      10            1                 15K
    Docker         15            1                 18K
    Kubernetes     12            1                 7K

    [July 2015]
  25. Issues may not be processed as intended

    Policy (or recommendations) may mandate transitions, but are they real? Time to close when the same company is reporting and fixing? Time to close for external bug reports? Time to close depending on who reports? Who opens tickets that nobody cares about?
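One way to approach the time-to-close questions above is to group closed issues by the relationship between reporter and fixer and compare medians. The sketch below is illustrative only: the issue fields (`opened`, `closed`, `reporter_org`, `fixer_org`) are hypothetical and do not correspond to any particular tracker's schema, and the sample data is invented.

```python
from datetime import datetime
from statistics import median


def median_days_to_close(issues, group_key):
    """Median days from open to close, per group; still-open issues are skipped."""
    groups = {}
    for issue in issues:
        if issue["closed"] is None:
            continue
        days = (issue["closed"] - issue["opened"]).total_seconds() / 86400
        groups.setdefault(group_key(issue), []).append(days)
    return {group: median(values) for group, values in groups.items()}


def same_company(issue):
    """Group label: did reporter and fixer belong to the same organization?"""
    if issue["reporter_org"] == issue["fixer_org"]:
        return "same company"
    return "different company"


# Hypothetical issues; a reporter_org of None stands for an external reporter.
sample_issues = [
    {"opened": datetime(2016, 1, 1), "closed": datetime(2016, 1, 3),
     "reporter_org": "Acme", "fixer_org": "Acme"},
    {"opened": datetime(2016, 1, 5), "closed": datetime(2016, 1, 9),
     "reporter_org": "Acme", "fixer_org": "Acme"},
    {"opened": datetime(2016, 1, 2), "closed": datetime(2016, 1, 12),
     "reporter_org": "Globex", "fixer_org": "Acme"},
    {"opened": datetime(2016, 1, 10), "closed": datetime(2016, 1, 30),
     "reporter_org": None, "fixer_org": "Acme"},
    {"opened": datetime(2016, 2, 1), "closed": None,
     "reporter_org": None, "fixer_org": None},
]
print(median_days_to_close(sample_issues, same_company))
```

Swapping `same_company` for another grouping function (for example, by reporter) answers the "depending on who reports" variant with the same code.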
  26. Geography

    Geographical diversity is difficult to assess. Companies can keep detailed records, but open communities are different. Fortunately, some tools leave traces... This allows for better knowledge ...and better tracking of initiatives. Example: policies to enlarge the number of developers in XXX region.
  27. Gender: Analyzing by name

    Current situation of gender imbalance in OpenStack

    Gender   Developers   Commits   Commits/devel
    Female   750          14,647    19.5
    Male     4,632        207,112   44.7

    Only names with more than 80% of certainty. [Work in progress, preliminary results]
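The commits-per-developer column can be re-derived from the other two columns. The short check below uses only the figures in the table; the percentage shares it also computes are derived here for illustration and do not appear on the slide.

```python
# Figures copied from the OpenStack table: (developers, commits) per gender.
table = {"Female": (750, 14_647), "Male": (4_632, 207_112)}


def summarize(rows):
    """Commits per developer plus each group's share of developers and commits."""
    total_devs = sum(devs for devs, _ in rows.values())
    total_commits = sum(commits for _, commits in rows.values())
    return {
        gender: {
            "commits_per_dev": commits / devs,
            "dev_share": devs / total_devs,
            "commit_share": commits / total_commits,
        }
        for gender, (devs, commits) in rows.items()
    }


for gender, stats in summarize(table).items():
    print(f"{gender}: {stats['commits_per_dev']:.1f} commits/devel, "
          f"{stats['dev_share']:.1%} of developers, "
          f"{stats['commit_share']:.1%} of commits")
```

The first figure on each printed line reproduces the table's 19.5 and 44.7; the shares cover only developers whose names were classified with more than 80% certainty.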
  28. Gender: Analyzing by name

    Commits by women: 6.8% (4 Kcommits). Women: 9.9% (330 developers). Linux kernel, Nov 2015 – Oct 2016.
  29. Summary

    Open Development Analytics: a step forward in project transparency. http://grimoirelab.github.io http://speakerdeck.com/jgbarah
  30. A moment for a commercial: Join us at MSR 2017!!

    http://2017.msrconf.org 14th International Conference on Mining Software Repositories. Co-located with ICSE, Buenos Aires, Argentina. Save the dates: May 20-21 2017. Start the conversation!!! #msr17
  31. License

    © 2016 Bitergia. Some rights reserved. This presentation is distributed under the “Attribution-ShareAlike 3.0” license, by Creative Commons, available at http://creativecommons.org/licenses/by-sa/3.0/
  32. Credits (1)

    “Man With Two Hats”: statue by Henk Visch, located in Ottawa, Canada. Picture by Lezumbalaberenjena in Wikimedia Commons. License: Public domain. https://commons.wikimedia.org/wiki/File:Man_With_Two_Hats_Ottawa_Statue_by_lezumbalaberenjena.jpg
    “Napoleon’s Russian campaign of 1812”: original by Charles Minard. License: Public domain. https://en.wikipedia.org/wiki/Charles_Joseph_Minard#/media/File:Minard.png
    “Aged Come In We’re Open”: picture by Czarina Alegre in Flickr. License: Creative Commons Attribution 2.0. https://flic.kr/p/fjGamh
  33. Credits (2)

    “Good code”: comic by Randall Munroe, XKCD 844. License: Creative Commons Attribution-NonCommercial 2.5. http://xkcd.com/844/
    “Crowd at FOSDEM 2008”: picture by Jesús Corrius in Flickr. License: Creative Commons Attribution 2.0. http://www.flickr.com/photos/jcorrius/2302302707/
    “Elephant”: picture by ajoheyho. License: Creative Commons Public Domain. https://pixabay.com/en/elephant-african-bush-elephant-114543/
    “Emiliano Zapata”. License: Public Domain.