How open development data benefits your project and your community

Slides for the presentation given at LinuxTag'14 about measuring free / open source software development, the need for development analytics, its possible areas of interest, and the available tools.

Bitergia

May 08, 2014
Transcript

  1. How open development data benefits your project and
    your community
    Jesus M. Gonzalez-Barahona
    [email protected] @jgbarah
    Bitergia / LibreSoft (URJC)
    http://bit.ly/open-sw-analytics
    Linux Tag 2014
    Berlin (Germany), May 8th 2014
    Jesus Gonzalez-Barahona (Bitergia) Open development data Linux Tag 2014 1 / 33

  2. © 2012-2014 Bitergia
    Some rights reserved. This presentation is distributed under the
    “Attribution-ShareAlike 3.0” license, by Creative Commons, available at
    http://creativecommons.org/licenses/by-sa/3.0/

  3. Structure of the presentation
    1 Measuring free / open source software development
    2 Why open development analytics?
    3 Areas of interest
    4 Tools

  4. Measuring free / open source
    software development

  5. A successful development model?
    Free (open source) software
    has proven to be a great success
    ...but there are many details to be understood
    ...and (a lot of) interest in understanding
    ...but there is room for improvement
    ...and (a lot of) interest in improving

  6. A successful development model? (2)
    There are *lots* of development models
    Common characteristics for many of them:
    Community-based development
    Intensive use of tools, processes for coordination
    Open development models
    (as opposed to in-house, hidden models)

  7. The importance of the community
    [Crowd at FOSDEM 2008, by Jesús Corrius, CC Attribution 2.0]
    http://www.flickr.com/photos/jcorrius/2302302707/

  8. The importance of the community (2)
    Persons (and organizations) with
    different interests
    common goals
    Need for coordination, common decision making
    Availability of data as a tool:
    Transparency to the community (fairness)
    Transparency to third parties (trust)

  9. Diversity of tools, processes

  10. Diversity of tools, processes (2)
    Despite diversity, a large fraction of projects:
    Use tools & services from a small set
    git / svn / hg
    Bugzilla / Jira / GitHub tickets
    Gerrit
    Mailman / Gmane
    ...
    use similar processes:
    bug fixing coordination using tickets
    pre-merge code review
    general discussion in mailing lists
    ...
    Collection and analysis of data is possible
    Publication of data makes sense

  11. Why open development analytics?

  12. From open development to open development analytics
    Information about code, community, development
    for open development projects
    can be retrieved, organized, analyzed
    Let’s publish analytics results & data
    Open Development Analytics:
    A new standard for transparency

  13. Open development analytics
    Who is interested?
    Large & small free software communities
    ...and thousands of large & small companies,
    public administrations, foundations
    participating in them,
    depending on their software
    [Who can afford not to be interested?
    It is a key strategic need for many actors]

  14. Open development analytics
    Why?
    Free software produced with open development
    models is more and more important
    for IT users, producers, integrators
    It is different & complex,
    yet transparent,
    many details are public,
    and it can be improved

  15. Areas of interest

  16. Some areas of interest
    Performance (understanding activity)
    Company participation (beyond copyright
    notices)
    Transparency (available information)
    Auditing (certify participation, experience, etc.)
    Profiling (key people, companies)
    Neutrality (fair treatment)

  17. Areas of interest: community management
    Issues          Parameters
    Activity        Raw volume, participants, ...
    Reliability     Reaction times, pending issues, ...
    Sustainability  Growth rate, structure, ...
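
As an illustration of the "Activity" row, raw volume and participants per month could be derived from commit records roughly as follows. This is a minimal sketch: the commit data and the function name `monthly_activity` are hypothetical and are not taken from the Grimoire tools.

```python
from collections import Counter, defaultdict

# Hypothetical commit records: (author, ISO date). Real data would come
# from a source code repository log.
COMMITS = [
    ("alice", "2014-01-03"), ("bob", "2014-01-15"), ("alice", "2014-01-20"),
    ("alice", "2014-02-02"), ("carol", "2014-02-11"),
    ("bob", "2014-03-05"), ("carol", "2014-03-09"), ("dave", "2014-03-30"),
]


def monthly_activity(commits):
    """Return {month: (commit count, distinct authors)} sorted by month."""
    volume = Counter()
    authors = defaultdict(set)
    for author, date in commits:
        month = date[:7]  # "YYYY-MM" prefix of the ISO date
        volume[month] += 1
        authors[month].add(author)
    return {month: (volume[month], len(authors[month])) for month in sorted(volume)}


if __name__ == "__main__":
    for month, (n_commits, n_authors) in monthly_activity(COMMITS).items():
        print(f"{month}: {n_commits} commits by {n_authors} authors")
```

Growth rate (the "Sustainability" row) could then be approximated by comparing consecutive months of such a series.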

  18. Areas of interest: community management
    [Puppet committers community: Attraction / retention]

  19. Areas of interest: community management
    [Linux kernel: age of developers per cohort]
    http://blog.bitergia.com/2013/02/01/demographics-of-linux-kernel-developers-how-old-are-they/

  20. Areas of interest: engineering
    [Liferay: time-to-close tickets (quantiles)]
    http://blog.bitergia.com/2012/10/25/preview-of-the-analysis-of-liferay/
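
A time-to-close quantile analysis of this kind can be sketched as below. The ticket dates are invented for illustration (they are not Liferay data), and the helper names are hypothetical; the linked analysis may have computed its quantiles differently.

```python
from datetime import datetime
from statistics import quantiles

# Hypothetical (opened, closed) dates for a handful of tickets.
TICKETS = [
    ("2012-09-01", "2012-09-02"),
    ("2012-09-01", "2012-09-05"),
    ("2012-09-03", "2012-09-10"),
    ("2012-09-04", "2012-09-24"),
    ("2012-09-10", "2012-10-25"),
]


def days_to_close(opened: str, closed: str) -> int:
    """Whole days between the opening and closing date of a ticket."""
    fmt = "%Y-%m-%d"
    return (datetime.strptime(closed, fmt) - datetime.strptime(opened, fmt)).days


def time_to_close_quartiles(tickets):
    """Return the 25%, 50% and 75% quantiles of time-to-close, in days."""
    durations = sorted(days_to_close(opened, closed) for opened, closed in tickets)
    # n=4 yields three cut points; "inclusive" treats the data as the
    # whole population rather than a sample.
    return quantiles(durations, n=4, method="inclusive")


if __name__ == "__main__":
    q1, median, q3 = time_to_close_quartiles(TICKETS)
    print(f"25%: {q1} days, 50%: {median} days, 75%: {q3} days")
```

Quantiles are less sensitive than the mean to a few very old tickets, which is one reason to prefer them for this kind of data.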

  21. Areas of interest: engineering
    [MediaWiki community: tickets-related parameters]

  22. Areas of interest: company participation
    [Main companies in OpenStack Havana (partial view)]
    http://activity.openstack.org/dash/releases/

  23. Areas of interest: company participation
    [IBM participation in OpenStack Havana (partial view)]
    http://activity.openstack.org/dash/releases/company.html?company=IBM

  24. Areas of interest: transparency
    Development communities: companies and developers
    working together
    Policies, procedures, tools, source code...
    and development data
    Do they really provide enough data to enable
    assessment?
    Analysis of all repositories (data sources)...
    ...and associated information (e.g., affiliation)

  25. Areas of interest: auditing
    [OpenStack top contributors (December 2013)]

  26. Areas of interest: profiling
    [oVirt developer profile]

  27. Areas of interest: neutrality
    [Plot: number of accepted reviews (x axis, 250 to 4000) vs. iterations per accepted review, median (y axis, 0 to 3)]
    [WebKit code review data per company (2012)]
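
The two quantities in this figure could be computed per company along these lines. This is only a sketch: the review records and company names are made up (they are not WebKit data), and `review_stats` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import median

# Hypothetical accepted reviews: (company, number of patch iterations
# needed before the change was accepted).
REVIEWS = [
    ("CompanyA", 1), ("CompanyA", 2), ("CompanyA", 1),
    ("CompanyB", 3), ("CompanyB", 2), ("CompanyB", 4), ("CompanyB", 2),
]


def review_stats(reviews):
    """Return {company: (accepted reviews, median iterations per review)}."""
    by_company = defaultdict(list)
    for company, iterations in reviews:
        by_company[company].append(iterations)
    return {
        company: (len(iterations), median(iterations))
        for company, iterations in by_company.items()
    }


if __name__ == "__main__":
    for company, (accepted, med) in review_stats(REVIEWS).items():
        print(f"{company}: {accepted} accepted reviews, median {med} iterations")
```

If the median number of iterations stays similar across companies with very different review counts, that would be one indication of fair treatment in code review.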

  28. Tools

  29. Tools: Grimoire system
    MetricsGrimoire:
    Free software for retrieving data from repositories
    vizGrimoire:
    Free software for analyzing, visualizing data
    Grimoire Dashboard:
    Many panels, different views of the project
    (charts, summaries, statistical analysis)
    Commercially supported by Bitergia

  30. Tools: Grimoire Dashboard
    [Dashboard for the GlusterFS project
    http://projects.bitergia.com/redhat-glusterfs-dashboard/]

  31. Summarizing
    Let’s go one step further:
    Open Development Analytics

  32. Relationship with EU-funded R&D projects
    Markos:
    License analyzer
    New tools for software development analysis
    Production of linked open data
    PROSE:
    Software development analytics to track results of
    R&D projects
    Open Source Projects Europe forge: development
    analytics facilities
    http://www.markosproject.eu/
    http://www.ict-prose.eu/
    https://opensourceprojects.eu/

  33. This is the end
    [Questions, comments...]