GrimoireLab: A tool for Open Development Analytics
Talk at Linux Foundation Collab Summit (Tahoe, CA, USA, March 30th 2016). Current state of GrimoireLab, and its relationship with Open Development Analytics.
@jgbarah
Bitergia / LibreSoft (URJC)
Linux Foundation Collaboration Summit
Lake Tahoe (CA, USA), March 30th 2016
Jesus Gonzalez-Barahona (Bitergia), GrimoireLab, March 2016

A personal story
3 Open Development Analytics
4 MetricsGrimoire
5 GrimoireLab
6 Preview

research team
  Understanding free, open source software
  Data analytics approach
Bitergia: From research to the real world
  Understanding software development
  Data analytics approach
http://gsyc.es/~jgb

analytics (data, methodology)
The software was FLOSS (sloccount)
We could reproduce the study easily
We could work on top of it easily
We could apply new ideas
We could improve the software
We could automate everything
Applying free, open source software principles to analytics

(diverse, but not that large):
  git / svn / hg
  Bugzilla / Jira / GitHub
  Gerrit
  Mailman / Gmane / StackOverflow
  IRC / Slack
  ...
Similar processes:
  bug fixing coordination using tickets
  pre-merge code review
  general discussion in mailing lists
  support / meetings in IRC / Slack / StackOverflow
  ...
Collection and analysis of data is possible

community, development for open development projects can be retrieved, organized, analyzed
Let’s publish analytics results & data
Open Development Analytics: A new standard for transparency

development systems
It has to be retrieved...
...from many different systems
It has to be analyzed
It has to be visualized
Open source software platforms for software development analytics

orchestration
ElasticSearch: storing data
Python / Pandas scripts: enrich, analyze, customize the data
Kibiter: Kibana fork to interact with the data
http://grimoirelab.github.io
http://blog.bitergia.com

per software repository
Written in Python3
Each backend produces “data items”
Data items are commits, tickets, code review processes...
Currently implemented backends: Git, Gerrit, Bugzilla, GitHub Issues, mbox, StackOverflow
https://github.com/GrimoireLab/perceval

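The idea of one backend per kind of repository, each producing uniform "data items", can be sketched roughly as follows. This is an illustration only and not Perceval's actual API: the class names `Backend` and `InMemoryGitBackend`, the `fetch` method, and the fields of the item envelope are all assumptions made for this example.

```python
import hashlib
from typing import Dict, Iterator, List


class Backend:
    """Hypothetical base class: one subclass per kind of repository."""

    backend_name = "generic"

    def __init__(self, origin: str) -> None:
        self.origin = origin  # e.g. the URL of the repository

    def fetch_raw(self) -> Iterator[Dict]:
        """Yield raw records; each subclass knows its own data source."""
        raise NotImplementedError

    def fetch(self) -> Iterator[Dict]:
        """Wrap every raw record in the same "data item" envelope."""
        for raw in self.fetch_raw():
            key = f"{self.origin}:{raw['id']}".encode("utf-8")
            yield {
                "uuid": hashlib.sha1(key).hexdigest(),
                "backend": self.backend_name,
                "origin": self.origin,
                "data": raw,
            }


class InMemoryGitBackend(Backend):
    """Toy backend: reads commits from a list instead of a real git log."""

    backend_name = "git"

    def __init__(self, origin: str, commits: List[Dict]) -> None:
        super().__init__(origin)
        self._commits = commits

    def fetch_raw(self) -> Iterator[Dict]:
        for commit in self._commits:
            yield {
                "id": commit["hash"],
                "author": commit["author"],
                "message": commit["message"],
            }


# Made-up sample data, only for the example.
sample_commits = [
    {"hash": "a1b2", "author": "alice", "message": "Fix parser"},
    {"hash": "c3d4", "author": "bob", "message": "Add tests"},
]
items = list(InMemoryGitBackend("https://example.org/repo.git", sample_commits).fetch())
```

Because every backend emits the same envelope, the later stages (storage, enrichment, dashboards) would not need to know which kind of repository an item came from.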
Can use a pool of nodes
Written in Python3
Synchronization using MQ
Short-term data persistence: Redis
Incremental retrieval
https://github.com/GrimoireLab/arthur

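Incremental retrieval can be pictured as remembering the timestamp of the newest item already collected, and asking only for newer items on the next run. The sketch below is an assumption-laden illustration, not Arthur's code: `IncrementalJob`, `fetch_since` and the `updated_on` field are hypothetical names, and the state is kept in memory rather than in Redis.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional


def fetch_since(source: Iterable[Dict], since: Optional[datetime]) -> List[Dict]:
    """Return the items updated after `since` (all items if `since` is None)."""
    return [item for item in source if since is None or item["updated_on"] > since]


class IncrementalJob:
    """Hypothetical retrieval job that remembers where the last run stopped."""

    def __init__(self) -> None:
        self.last_update: Optional[datetime] = None

    def run(self, source: Iterable[Dict]) -> List[Dict]:
        new_items = fetch_since(source, self.last_update)
        if new_items:
            # Move the mark forward so the next run skips these items.
            self.last_update = max(item["updated_on"] for item in new_items)
        return new_items


# Made-up ticket data, only for the example.
tickets = [
    {"id": 1, "updated_on": datetime(2016, 3, 1)},
    {"id": 2, "updated_on": datetime(2016, 3, 15)},
]
job = IncrementalJob()
first_run = job.run(tickets)

tickets.append({"id": 3, "updated_on": datetime(2016, 3, 30)})
second_run = job.run(tickets)
third_run = job.run(tickets)
```

The first run returns both tickets, the second only the ticket added afterwards, and the third nothing, since no item is newer than the stored mark.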
of higher order metrics
Data enrichment chains...
...by combining “processing units”
Agnostic about data source / destination
Specially tailored to Python/Pandas needs

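One way to read "enrichment chains built by combining processing units" is as function composition over a stream of items. The sketch below assumes that reading; the unit names, the `chain` helper and the item fields are hypothetical, and plain Python dictionaries stand in for Pandas structures to keep the example self-contained.

```python
from collections import Counter
from functools import reduce
from typing import Callable, Dict, Iterable, Iterator

# A processing unit takes a stream of items and yields enriched copies.
Unit = Callable[[Iterable[Dict]], Iterator[Dict]]


def add_email_domain(items: Iterable[Dict]) -> Iterator[Dict]:
    """Unit: derive the author's email domain."""
    for item in items:
        enriched = dict(item)  # copy, so the input items stay untouched
        enriched["domain"] = item["author_email"].split("@")[-1]
        yield enriched


def add_message_length(items: Iterable[Dict]) -> Iterator[Dict]:
    """Unit: add the length of the commit message."""
    for item in items:
        enriched = dict(item)
        enriched["message_length"] = len(item["message"])
        yield enriched


def chain(*units: Unit) -> Unit:
    """Combine units, in order, into a single enrichment chain."""
    def run(items: Iterable[Dict]) -> Iterator[Dict]:
        return reduce(lambda stream, unit: unit(stream), units, iter(items))
    return run


# Made-up commits, only for the example.
commits = [
    {"author_email": "alice@example.org", "message": "Fix bug"},
    {"author_email": "bob@example.com", "message": "Add feature"},
    {"author_email": "carol@example.org", "message": "Docs"},
]
enrich = chain(add_email_domain, add_message_length)
enriched_commits = list(enrich(commits))

# A simple metric computed on top of the enriched data.
commits_per_domain = Counter(item["domain"] for item in enriched_commits)
```

Since the chain only consumes and produces iterables of dictionaries, it does not depend on where the items come from or where they are written afterwards.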
the data for dashboards
Allows for flexible querying
Allows for scale
Data fed via REST API

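Assuming the store described here is ElasticSearch, as listed among the platform components, feeding data through the REST API usually means posting newline-delimited JSON to the `_bulk` endpoint. The sketch below only builds the request body and sends nothing; the helper `bulk_body`, the index name `git_commits` and the sample items are made up for illustration, and older ElasticSearch versions also expect a `_type` entry in each action line.

```python
import json
from typing import Dict, Iterable


def bulk_body(index: str, items: Iterable[Dict], id_field: str = "uuid") -> str:
    """Build a body for the bulk endpoint: one action line, then one document line."""
    lines = []
    for item in items:
        action = {"index": {"_index": index, "_id": item[id_field]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(item))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up enriched items, only for the example.
docs = [
    {"uuid": "a1", "author": "alice", "lines_added": 10},
    {"uuid": "b2", "author": "bob", "lines_added": 3},
]
body = bulk_body("git_commits", docs)
body_lines = body.splitlines()
```

The resulting string would be sent with an HTTP POST to `/_bulk`, using the content type `application/x-ndjson` in recent versions. Using the item identifier as `_id` means that uploading the same item again overwrites it instead of creating a duplicate.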
distributed under the “Attribution-ShareAlike 3.0” license, by Creative Commons, available at http://creativecommons.org/licenses/by-sa/3.0/

in Ottawa, Canada
Picture by Lezumbalaberenjena in Wikimedia Commons
License: Public domain
https://commons.wikimedia.org/wiki/File:Man_With_Two_Hats_Ottawa_Statue_by_lezumbalaberenjena.jpg