Slide 1

Slide 1 text

Development metrics to know about Free Software projects
Alvaro del Castillo San Félix
[email protected]
http://acsblog.es
http://twitter.com/acstw
Bitergia
Santiago de Compostela (Galicia, Spain), October 17th 2012
Alvaro del Castillo (Bitergia) Development metrics to know about Free Software projects LSWC 2012 1 / 27

Slide 2

Slide 2 text

© 2012 Bitergia. Some rights reserved.
This presentation is distributed under the “Attribution-ShareAlike 3.0” license, by Creative Commons, available at http://creativecommons.org/licenses/by-sa/3.0/
Blog post about this presentation (including link to slides): http://blog.bitergia.com/2012/10/17/tomorrow-at-the-lswc12/

Slide 3

Slide 3 text

Free software is (in many cases) special
Source code available
Open development model (usually)
  Many details about the internals of the development process
  Intense use of tools for coordination
  Lots of information is tracked, and available
Developers & users communities are important
  sustainability
  pooling of resources
  innovation

Slide 4

Slide 4 text

Measuring, measuring, measuring
Information about code, community, development can be retrieved, organized, analyzed

Slide 5

Slide 5 text

Who benefits
Quantitative, objective data: facts, not opinions
Specific questions can be answered
Even simple analysis may help stakeholders:
  Developers: Understanding, improving development processes
  Users, integrators: Long-term sustainability, evolution, reaction to issues
  Investors: Attraction of external resources, growth rate

Slide 6

Slide 6 text

But data has to be extracted, mined
Data lives in repositories not always designed to release all their data easily:
  tools are needed to retrieve and extract it
Data includes many complexities and details:
  tools are needed to assist in its mining, analysis

Slide 7

Slide 7 text

The Metrics Grimoire approach
Set of tools specialized in retrieving information from different kinds of repositories. Among them:
  CVSAnalY: source code management (CVS, Subversion, git, etc.)
  Bicho: issue tracking systems (Bugzilla, Jira, SourceForge, Allura, Launchpad, Google Code, etc.)
  MLStats: mailing lists (mbox files, Mailman archives, etc.)
Store all the information in SQL databases with similar structure
http://metricsgrimoire.github.com
https://github.com/MetricsGrimoire

Slide 8

Slide 8 text

MetricsGrimoire: CVSAnalY
Browses an SCM repository, producing a database with:
  All metainformation (commit records, etc.)
  Metrics for each release of each file
Also produces some tables suitable for specific analysis
Multiple SCMs: CVS, svn, git (Bazaar partially)
Whole history in the database: the file tree can be rebuilt for any revision
Tags and branches support
Option to save the log to a file while parsing
Extensions system, incremental capabilities
Multiple database system support (MySQL and SQLite)

Slide 9

Slide 9 text

MetricsGrimoire: CVSAnalY extensions
Extension: a “plugin” for CVSAnalY
Adds information to the database, based on the information already in the database and possibly the repository
Usually: new tables for specific studies
Simple example: commits per month per committer
Extensions add one or more tables to the database, but they never modify the existing ones
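The "commits per month per committer" example can be sketched in a few lines. This is a minimal illustration, not CVSAnalY's own extension code: the `scmlog` and `people` tables below are simplified, hypothetical stand-ins for the tables a CVSAnalY run would produce, and the sample rows are invented. The summary goes into a new table, and the existing tables are only read.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with hypothetical, simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE scmlog (
            id INTEGER PRIMARY KEY,
            committer_id INTEGER,
            date TEXT
        );
        INSERT INTO people VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO scmlog VALUES
            (1, 1, '2012-08-03 10:00:00'),
            (2, 1, '2012-08-20 12:30:00'),
            (3, 2, '2012-08-21 09:15:00'),
            (4, 1, '2012-09-02 17:45:00');
    """)
    return conn


def add_commits_per_month(conn):
    """Add a summary table; the existing tables are read, never modified."""
    conn.execute("""
        CREATE TABLE commits_per_month AS
        SELECT strftime('%Y-%m', s.date) AS month,
               p.name AS committer,
               COUNT(*) AS commits
        FROM scmlog s
        JOIN people p ON p.id = s.committer_id
        GROUP BY month, committer
    """)
    return conn.execute(
        "SELECT month, committer, commits FROM commits_per_month "
        "ORDER BY month, committer"
    ).fetchall()


conn = build_sample_db()
rows = add_commits_per_month(conn)
for row in rows:
    print(row)
```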

Slide 10

Slide 10 text

MetricsGrimoire: CVSAnalY extensions
Some examples:
  FileTypes: adds a table containing information about the type of every file in the database (code, documentation, i18n, etc.)
  Metrics: analyzes every revision of every file, calculating metrics like SLOC and complexity metrics (McCabe, Halstead). It currently supports metrics for C/C++, Python, Java and Ada.
  CommitsLOC: adds a new table with information about the total lines added/removed for every commit
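To make the CommitsLOC idea concrete, here is a rough sketch of totalling lines added and removed per commit. It is not how the extension itself is implemented: it assumes text shaped like the output of `git log --numstat --pretty=format:%H` (a commit hash line followed by tab-separated `added removed path` lines), and the sample text and the function name `lines_per_commit` are invented for illustration.

```python
# Invented sample, shaped like `git log --numstat --pretty=format:%H` output.
SAMPLE_NUMSTAT = (
    "a1b2c3\n"
    "10\t2\tsrc/main.c\n"
    "3\t0\tREADME\n"
    "\n"
    "d4e5f6\n"
    "-\t-\tlogo.png\n"
    "1\t4\tsrc/main.c\n"
)


def lines_per_commit(numstat_text):
    """Return {commit: (lines_added, lines_removed)} from numstat-style text."""
    totals = {}
    current = None
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3 and current is not None:
            added, removed, _path = parts
            if added == "-" or removed == "-":
                # Binary files are reported as "-": no line counts available.
                continue
            totals[current][0] += int(added)
            totals[current][1] += int(removed)
        elif line.strip():
            # Any other non-empty line is taken as the start of a new commit.
            current = line.strip()
            totals[current] = [0, 0]
    return {commit: tuple(counts) for commit, counts in totals.items()}


commit_loc = lines_per_commit(SAMPLE_NUMSTAT)
for commit, (added, removed) in commit_loc.items():
    print(commit, added, removed)
```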

Slide 11

Slide 11 text

MetricsGrimoire: Bicho
Parsing issue tracking systems
Results stored in a MySQL database
Information about each issue (ticket), and its modifications
Currently it supports:
  SourceForge (HTML parsing)
  Bugzilla: GNOME, KDE, others
  Jira, Google Code, Allura, Launchpad (API)
Incremental
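Once issues and their modifications are in a database, counting tickets opened and closed per month is a pair of queries. The sketch below uses SQLite and two hypothetical, simplified tables (`issues` and `changes`); the real Bicho schema and the status values of a given tracker may differ, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables: one row per issue, one per modification.
    CREATE TABLE issues (id INTEGER PRIMARY KEY, submitted_on TEXT);
    CREATE TABLE changes (
        issue_id INTEGER,
        field TEXT,
        new_value TEXT,
        changed_on TEXT
    );
    INSERT INTO issues VALUES
        (1, '2012-08-01 09:00:00'),
        (2, '2012-08-15 14:00:00'),
        (3, '2012-09-05 11:00:00');
    INSERT INTO changes VALUES
        (1, 'status', 'closed', '2012-08-10 16:00:00'),
        (2, 'priority', 'high', '2012-08-16 10:00:00'),
        (2, 'status', 'closed', '2012-09-01 08:30:00');
""")


def monthly_counts(connection, sql):
    """Run a query returning (month, count) rows and collect them in a dict."""
    return dict(connection.execute(sql).fetchall())


opened = monthly_counts(conn, """
    SELECT strftime('%Y-%m', submitted_on) AS month, COUNT(*)
    FROM issues
    GROUP BY month
""")

# An issue closed more than once in the same month is counted once.
closed = monthly_counts(conn, """
    SELECT strftime('%Y-%m', changed_on) AS month, COUNT(DISTINCT issue_id)
    FROM changes
    WHERE field = 'status' AND new_value = 'closed'
    GROUP BY month
""")

for month in sorted(set(opened) | set(closed)):
    print(month, opened.get(month, 0), closed.get(month, 0))
```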

Slide 12

Slide 12 text

MetricsGrimoire: MailingListStats
Parses mbox information (RFC 822)
Deals with Mailman archives
Stores results (headers, body) in a MySQL database:
  Sender, CCs, etc.
  Time / Date
  Subject
  ...
Incremental
Can store multiple projects in a single database
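The general shape of this step (read an mbox file, keep a few headers, store them in a database) can be sketched with Python's standard `mailbox` module. This is not MLStats code: the `messages` table, the `load_mbox` helper and the two sample messages are assumptions made for illustration, and SQLite stands in for MySQL. Using the Message-ID as primary key gives a crude form of incremental loading, since a second run over the same archive adds nothing.

```python
import mailbox
import os
import sqlite3
import tempfile
from email.utils import parseaddr, parsedate_to_datetime

# Invented two-message archive in mbox format.
SAMPLE_MBOX = (
    "From alice@example.org Mon Oct  1 10:00:00 2012\n"
    "From: Alice <alice@example.org>\n"
    "Date: Mon, 01 Oct 2012 10:00:00 +0000\n"
    "Subject: Release plan\n"
    "Message-ID: <1@example.org>\n"
    "\n"
    "First message body.\n"
    "\n"
    "From bob@example.org Tue Oct  2 11:30:00 2012\n"
    "From: Bob <bob@example.org>\n"
    "Date: Tue, 02 Oct 2012 11:30:00 +0000\n"
    "Subject: Re: Release plan\n"
    "Message-ID: <2@example.org>\n"
    "\n"
    "Second message body.\n"
)


def load_mbox(path, conn):
    """Store selected headers of every message; known Message-IDs are skipped."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "message_id TEXT PRIMARY KEY, sender TEXT, sent_at TEXT, subject TEXT)"
    )
    box = mailbox.mbox(path)
    try:
        for msg in box:
            sender = parseaddr(msg.get("From", ""))[1]
            sent_at = parsedate_to_datetime(msg["Date"]).isoformat()
            conn.execute(
                "INSERT OR IGNORE INTO messages VALUES (?, ?, ?, ?)",
                (msg["Message-ID"], sender, sent_at, msg["Subject"]),
            )
    finally:
        box.close()
    conn.commit()


with tempfile.NamedTemporaryFile("w", suffix=".mbox", delete=False) as fh:
    fh.write(SAMPLE_MBOX)
    mbox_path = fh.name

conn = sqlite3.connect(":memory:")
load_mbox(mbox_path, conn)
load_mbox(mbox_path, conn)  # second run: no duplicates are added
os.remove(mbox_path)

stored = conn.execute(
    "SELECT sender, subject FROM messages ORDER BY sent_at"
).fetchall()
print(stored)
```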

Slide 13

Slide 13 text

Milking the databases
Once information is retrieved, and in suitable format for querying:
  it can be queried directly in the database
  it can be analyzed from R
  it can be filtered, manually inspected, improved
  it can be combined, cross-analyzed
  it can be visualized
We’re building tools to simplify all of this: vizGrimoire
https://github.com/VizGrimoire
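As one illustration of combining sources, the sketch below sets commit counts next to mailing list activity for each committer. The `commits` and `messages` tables and their rows are hypothetical, and matching people by e-mail address is a simplifying assumption: real data usually needs identities to be merged first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables from two different data sources.
    CREATE TABLE commits (email TEXT, date TEXT);
    CREATE TABLE messages (email TEXT, date TEXT);
    INSERT INTO commits VALUES
        ('alice@example.org', '2012-08-03'),
        ('alice@example.org', '2012-08-20'),
        ('bob@example.org', '2012-08-21');
    INSERT INTO messages VALUES
        ('alice@example.org', '2012-08-04'),
        ('alice@example.org', '2012-08-05'),
        ('alice@example.org', '2012-09-01'),
        ('carol@example.org', '2012-08-30');
""")

# For every committer: number of commits and number of messages sent.
QUERY = """
    SELECT c.email, c.n_commits, COALESCE(m.n_messages, 0)
    FROM (SELECT email, COUNT(*) AS n_commits
          FROM commits GROUP BY email) AS c
    LEFT JOIN (SELECT email, COUNT(*) AS n_messages
               FROM messages GROUP BY email) AS m
      ON m.email = c.email
    ORDER BY c.email
"""

activity = conn.execute(QUERY).fetchall()
for email, n_commits, n_messages in activity:
    print(email, n_commits, n_messages)
```

The left join keeps committers who never post to the lists, which is often the interesting case when looking at how a community communicates.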

Slide 14

Slide 14 text

Now, some examples
Some examples from real projects

Slide 15

Slide 15 text

OpenStack: Opening / closing tickets
Folsom release cycle, 2012
http://blog.bitergia.com/2012/09/27/how-the-new-release-of-openstack-was-built/

Slide 16

Slide 16 text

OpenStack: Who is developing it?
Core projects / all projects (Folsom release cycle, 2012)

Slide 17

Slide 17 text

Zentyal (basic analysis)
Source code management repositories:
  git: git://git.zentyal.org/zentyal
  From: 2005-06-27 To: 2012-09-10
Mailing lists: Development, Users, Announcements
  http://lists.zentyal.com/cgi-bin/mailman/listinfo/
  From: 2010-09-01 To: 2012-09-30
http://blog.bitergia.com/2012/10/03/basic-analysis-of-zentyal/

Slide 18

Slide 18 text

Zentyal: Git repository (parameters per month)

Slide 19

Slide 19 text

Zentyal: Mailing lists (Developers, Users)

Slide 20

Slide 20 text

LibrePlan (basic analysis)
Source code management repositories:
  git: git://git.zentyal.org/zentyal
  From: 2009-04-23 To: 2012-12-31
Issue Tracking System:
  bugzilla: http://bugs.libreplan.org/
  From: 2009-10-04 To: 2012-10-01
http://blog.bitergia.com/2012/10/03/basic-analysis-of-zentyal/

Slide 21

Slide 21 text

LibrePlan: Git repository (parameters per month)

Slide 22

Slide 22 text

LibrePlan: Bugzilla repository (parameters per month)

Slide 23

Slide 23 text

All of this can be integrated...
  Dashboards
  Forges
  IDEs
  Support systems
  ...
A new generation of tracking systems for software development?
Integrated with software forges?

Slide 24

Slide 24 text

Example: towards a dashboard
http://blog.bitergia.com/2012/09/27/how-the-new-release-of-openstack-was-built/

Slide 25

Slide 25 text

In summary
FLOSS development repositories have a wealth of information
Their analysis is potentially interesting to any stakeholder
Getting the data out of the repository is not that difficult...
...but analysis may be
We’re interested in deep analysis
We’re interested in working with developers, managers, users
What would you like to know about your pet project?

Slide 26

Slide 26 text

Bitergia: a spin-off
Started operations in July 2012
Builds on the experience of LibreSoft R&D group
Offering professional products and services
Focused on:
  Metrics about software development (including community metrics)
  Specialized support for development forges (including metrics for projects)
http://bitergia.com

Slide 27

Slide 27 text

This is the end
Have you learned something useful?
[I would love to know what interested you the most]
[...and the least]
[email protected]