Data, data and data about your favourite community: The Grimoire Library

Bitergia
February 01, 2015

Lightning talk at FOSDEM 2015. How to use the Grimoire Library. More info at https://fosdem.org/2015/schedule/event/community_data/

Transcript

  1. Data, data and data about your favourite community
    The Grimoire Library
    Daniel Izquierdo-Cortazar
    http://twitter.com/dizquierdo
    Bitergia
    FOSDEM, Brussels, February 1st, 2015
    http://bitergia.com
    Daniel Izquierdo-Cortazar (Bitergia) Data, data and data about your favourite community FOSDEM 2015 1 / 16

  2. © 2012-2015 Bitergia
    Some rights reserved. This presentation is distributed under the
    “Attribution-ShareAlike 3.0” license, by Creative Commons, available at
    http://creativecommons.org/licenses/by-sa/3.0/

  3. Why this talk?
    Open source projects produce
    massive amounts of data
    How can we take advantage of it?
    Is it useful to analyze my community?

  4. GrimoireLib: goals
    Transparent database layer for Metrics Grimoire
    Reuse code: no need to write the same queries
    again and again
    Scalable and modular: a new metric is a new
    class
    https://github.com/VizGrimoire/GrimoireLib
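
The "a new metric is a new class" goal can be sketched as follows. This is a minimal illustration and not GrimoireLib's actual code: the `Metric` base class, the `commits` table layout, and the in-memory SQLite database are all assumptions made so that the example runs on its own.

```python
import sqlite3


class Metric:
    """Hypothetical base class: holds the connection and runs one query."""

    sql = None  # each subclass supplies its own query

    def __init__(self, connection):
        self.connection = connection

    def get_agg(self):
        # Every metric in this sketch is a single aggregated number.
        return self.connection.execute(self.sql).fetchone()[0]


class Commits(Metric):
    sql = "SELECT COUNT(*) FROM commits"


class Authors(Metric):
    # Adding a metric means adding a class; the base class is untouched.
    sql = "SELECT COUNT(DISTINCT author) FROM commits"


def demo_connection():
    """Build a toy in-memory database standing in for the real one."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE commits (id INTEGER, author TEXT)")
    connection.executemany(
        "INSERT INTO commits VALUES (?, ?)",
        [(1, "alice"), (2, "bob"), (3, "alice")],
    )
    return connection


if __name__ == "__main__":
    con = demo_connection()
    print(Commits(con).get_agg(), Authors(con).get_agg())
```

The query lives in the subclass and the plumbing lives in the base class, so the same SQL is never written twice.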

  5. GrimoireLib: other facts
    Based on Metrics Grimoire output (SQL databases)
    Have a look at http://bit.ly/fosdem-grimoire
    A bit of history: developed in R, migrated to
    Python
    Started in 2012
    Team of 4 to 5 active developers each month
    License: GPL v3 or later

  6. Metrics and studies overview:
    source code and code review
    Source code (git, svn, hg, etc):
    usual ones: commits, authors, files, added/removed
    lines, branches, companies, etc
    not so usual: demographics, timezone, developers
    characterization
    Code Review (gerrit, github):
    merges, abandoned, submitted patchsets or
    changesets, people, companies, etc
    time to close, time waiting for the submitter or
    reviewer

  7. Metrics and studies overview:
    communication channels: mailing lists, IRC, Q&A
    Mailing lists
    emails, people, companies, hot topics, time to first
    reply,
    emails initiating threads, those replying, unanswered
    posts, timezone analysis
    Questions and answers (Stack Overflow, Askbot, Discourse)
    top visited questions, labels, people, answers,
    comments, ...
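
To make "time to first reply" and "unanswered posts" concrete, here is a small sketch on invented data. The message tuples and the `first_reply_hours` function are hypothetical and only illustrate one way to define these metrics; the slides do not say how GrimoireLib computes them.

```python
from datetime import datetime

# Toy messages: (message id, id of the message it replies to, timestamp).
MESSAGES = [
    ("m1", None, "2014-06-01 09:00"),
    ("m2", "m1", "2014-06-01 11:30"),
    ("m3", "m1", "2014-06-02 08:00"),
    ("m4", None, "2014-06-03 10:00"),
]


def first_reply_hours(messages):
    """Map each thread starter to the hours until its first direct reply.

    Thread starters without any reply map to None.
    """
    fmt = "%Y-%m-%d %H:%M"
    sent = {mid: datetime.strptime(ts, fmt) for mid, _, ts in messages}
    result = {mid: None for mid, parent, _ in messages if parent is None}
    for mid, parent, _ in messages:
        if parent in result:
            hours = (sent[mid] - sent[parent]).total_seconds() / 3600
            if result[parent] is None or hours < result[parent]:
                result[parent] = hours
    return result


def unanswered(messages):
    """Return the ids of thread starters that never got a reply."""
    return [mid for mid, hours in first_reply_hours(messages).items() if hours is None]
```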

  8. Metrics and studies overview:
    ticketing systems: Bugzilla, Jira, Launchpad, ...
    Tickets:
    opened and closed tickets, efficiency, time to close
    tickets
    time to attend
    Other data sources:
    Wikis, Downloads, Releases, Apache logs
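
The ticket metrics above can be illustrated with a short sketch on invented data. The ticket tuples are made up, and reading "efficiency" as closed tickets divided by opened tickets is an assumption; the slides do not define it.

```python
from datetime import datetime
from statistics import median

# Toy tickets: (date opened, date closed or None if still open).
TICKETS = [
    ("2014-01-01", "2014-01-03"),
    ("2014-01-05", "2014-01-15"),
    ("2014-02-01", None),
    ("2014-02-10", "2014-02-14"),
]


def days_to_close(tickets):
    """Return the days each closed ticket stayed open; open tickets are skipped."""
    fmt = "%Y-%m-%d"
    durations = []
    for opened, closed in tickets:
        if closed is None:
            continue
        delta = datetime.strptime(closed, fmt) - datetime.strptime(opened, fmt)
        durations.append(delta.days)
    return durations


def efficiency(tickets):
    """Closed tickets divided by opened tickets (assumed definition)."""
    if not tickets:
        return 0.0
    closed = sum(1 for _, closed_on in tickets if closed_on is not None)
    return closed / len(tickets)


def median_days_to_close(tickets):
    """Median is less sensitive than the mean to a few very old tickets."""
    return median(days_to_close(tickets))
```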

  9. Available filters
    Filters:
    general: repository, company, domain, project, people
    source code: branch, module, file type, log message
    tickets: ticket type
    Examples:
    Commits (by company) (and by repo) (and by
    filetype)
    Time to close issues (by company) (and by tracker)
    Top companies (by project)
    Top developers (by project) (and by company)

  10. Metrics main methods
    API:
    get_agg: aggregated numbers
    get_ts: evolutionary numbers (time series)
    get_list: list of elements (e.g. authors)
    get_trends: trends for a specific date during the last X
    days
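
The difference between the four methods is easier to see on toy data. The sketch below is not GrimoireLib code: `ToyCommits`, its data, and the shapes of the return values are assumptions, and `get_trends` follows one plausible reading of the slide (the last X days before a date compared with the X days before that).

```python
from collections import Counter
from datetime import date, timedelta

# Toy data: (commit date, author). Stands in for the SQL database.
COMMITS = [
    (date(2014, 11, 20), "alice"),
    (date(2014, 12, 20), "bob"),
    (date(2014, 12, 22), "alice"),
    (date(2014, 12, 27), "alice"),
    (date(2014, 12, 30), "carol"),
    (date(2014, 12, 31), "bob"),
]


class ToyCommits:
    """Hypothetical metric mirroring the four method names on the slide."""

    def __init__(self, commits):
        self.commits = commits

    def get_agg(self):
        # One number for the whole period.
        return len(self.commits)

    def get_ts(self):
        # Evolution over time: commits per (year, month).
        counts = Counter((day.year, day.month) for day, _ in self.commits)
        return dict(sorted(counts.items()))

    def get_list(self):
        # The elements behind the number, here the distinct authors.
        return sorted({author for _, author in self.commits})

    def get_trends(self, end, days):
        # Activity in the `days` days before `end`, compared with the
        # `days` days preceding that window.
        recent_start = end - timedelta(days=days)
        previous_start = end - timedelta(days=2 * days)
        recent = sum(1 for day, _ in self.commits if recent_start <= day < end)
        previous = sum(
            1 for day, _ in self.commits if previous_start <= day < recent_start
        )
        return {"recent": recent, "previous": previous, "diff": recent - previous}
```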

  11. How to
    Import the needed libraries:
    # Database access
    from vizgrimoire.metrics.query_builder import SCMQuery
    # Filters to apply
    from vizgrimoire.metrics.metrics_filter import MetricFilters
    # Let’s start playing with git activity metrics
    import vizgrimoire.metrics.scm_metrics as scm

  12. How to
    Database access:
    # Instantiate database access
    # Playing with OpenStack source code database (MySQL) at
    # http://activity.openstack.org/dash/.../source_code.mysql.7z
    # Database named as openstack_source_code_fosdem2015
    user = "root"
    password = ""
    source_code_db = "openstack_source_code_fosdem2015"
    identities_db = "openstack_source_code_fosdem2015"
    dbcon = SCMQuery(user, password,
    source_code_db, identities_db)

  13. How to
    Instantiate filters:
    # Instantiate some filters to play with
    period = MetricFilters.PERIOD_MONTH
    startdate = "'2014-01-01'"
    enddate = "'2015-01-01'"
    # basic filter
    filters = MetricFilters(period, startdate, enddate)
    # company and repo filter
    filters_r = MetricFilters(period, startdate, enddate)
    filters_r.add_filter(MetricFilters.COMPANY, "Red Hat")
    filters_r.add_filter(MetricFilters.REPOSITORY, "nova.git")
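
What stacking filters does can be sketched without a database. `ToyFilters` below is a hypothetical stand-in, not the real `MetricFilters` class, and the rows are invented; it only shows that a basic date-range filter and a filter narrowed by company and repository select different subsets of the same data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFilters:
    """Hypothetical filter object: a date range plus optional key/value filters."""

    startdate: str
    enddate: str
    extra: dict = field(default_factory=dict)

    def add_filter(self, key, value):
        self.extra[key] = value

    def matches(self, row):
        # ISO-formatted dates compare correctly as plain strings.
        if not (self.startdate <= row["date"] < self.enddate):
            return False
        return all(row.get(key) == value for key, value in self.extra.items())


# Invented rows, one per commit.
ROWS = [
    {"date": "2014-03-01", "company": "Red Hat", "repository": "nova.git"},
    {"date": "2014-05-10", "company": "Red Hat", "repository": "other.git"},
    {"date": "2014-07-21", "company": "Example Corp", "repository": "nova.git"},
    {"date": "2013-12-30", "company": "Red Hat", "repository": "nova.git"},
]


def count(rows, filters):
    """Count the rows that pass every condition in the filter object."""
    return sum(1 for row in rows if filters.matches(row))
```

Each added filter can only shrink the selection, which is why the same metric can be asked "by company", "and by repo", and so on.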

  14. How to
    Instantiate the metric you need
    # Retrieving data for each filter.
    # Let’s start with authors
    authors = scm.Authors(dbcon, filters)
    authors.get_agg()
    authors.get_ts()
    authors.get_list()
    authors.get_trends(filters.enddate, 7)
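
Since every metric exposes the same four methods, a small helper can collect them in one go. `summarize` and `StubMetric` are hypothetical names introduced for this sketch; the stub returns canned values because the slides do not show what the real methods return.

```python
def summarize(metric, enddate, days=7):
    """Collect the four views of one metric into a single dictionary."""
    return {
        "agg": metric.get_agg(),
        "ts": metric.get_ts(),
        "list": metric.get_list(),
        "trends": metric.get_trends(enddate, days),
    }


class StubMetric:
    """Stand-in with canned answers so the sketch runs without a database."""

    def get_agg(self):
        return 42

    def get_ts(self):
        return [10, 12, 20]

    def get_list(self):
        return ["alice", "bob"]

    def get_trends(self, enddate, days):
        return {"enddate": enddate, "days": days}


if __name__ == "__main__":
    print(summarize(StubMetric(), "'2015-01-01'"))
```

With the objects built on the previous slides, `summarize(scm.Commits(dbcon, filters_r), filters_r.enddate)` would presumably give the same four views restricted to Red Hat and nova.git.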

  15. This is not the end
    You can use or contribute to *Grimoire
    Code and issues at:
    https://github.com/VizGrimoire/GrimoireLib
    IRC in Freenode at #metrics-grimoire
    Mailing list: https://lists.libresoft.es/listinfo/metrics-grimoire

  16. Data, data and data about your favourite community
    The Grimoire Library
    Daniel Izquierdo-Cortazar
    http://twitter.com/dizquierdo
    Bitergia
    FOSDEM, Brussels, February 1st, 2015
    http://speakerdeck.com/bitergia