Analyzing Gerrit Code Review Parameters with Bicho

Code review is becoming a common practice in large-scale software development projects. Many free, open source software projects are selecting Gerrit as the system to support their code review process. Analyzing the information produced by Gerrit therefore allows for detailed tracking of the code review process in those projects. In this paper, we present an approach to retrieve and analyze that information based on extending Bicho, a tool designed to retrieve information from issue tracking systems. We describe the retrieval process, the model used to map code review abstractions to issue tracking abstractions, and the structure of the retrieved information. In addition, we present some results of using this approach in a real-world scenario, the OpenStack Gerrit code review system.

Presentation at SQM 2014.

Jesus M. Gonzalez-Barahona

February 03, 2014

Transcript

  1. 1.

    Analyzing Gerrit Code Review Parameters with Bicho
    Jesus M. Gonzalez-Barahona, Daniel Izquierdo-Cortazar, Gregorio Robles, Alvaro del Castillo
    jgb@gsyc.es, http://gsyc.es/~jgb, http://twitter.com/jgbarah
    GSyC/LibreSoft (Universidad Rey Juan Carlos), Bitergia
    These slides: http://bit.ly/bicho-gerrit
    8th Intl. Workshop on Software Quality and Maintainability, Antwerp (Belgium), February 3rd, 2014
  2. 2.

    (cc) 2014 Jesus M. Gonzalez-Barahona, Daniel Izquierdo-Cortazar, Gregorio Robles, Alvaro del Castillo. Some rights reserved. This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of the full license, see http://creativecommons.org/licenses/by-sa/3.0, or write to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
  3. 3.

    Code review is becoming mainstream
    Code review:
    - authors of patches no longer land them in the released code base
    - fellow developers (reviewers) decide on patches, accepting or rejecting them
    Some advantages:
    - early detection of errors
    - checking of policies
    - extended awareness of the changes
    Some disadvantages:
    - longer time-to-release and time-to-deploy
    - yet another point for bottlenecks
    - extra work for experienced developers (who are a scarce resource)
  4. 4.

    Code review is becoming mainstream (2)
    - Many large (and small) FLOSS projects are implementing code review policies
    - Usually, a priori code review by a selected set of experienced developers
    - Code review can be massive and complex
    - Example (OpenStack Havana, 6 months):
      more than 21,000 review processes (115 per day), many of them with several iterations
    - Support by code review tools becomes a need for projects
  5. 5.

    Code review the Gerrit way
    Gerrit is the most popular tool in FLOSS projects
    Artifacts and actors in a Gerrit code review process:
    - Patchset: change to the code base which is going to be reviewed
    - Submitter: developer producing patchsets to the code base
    - Reviewer: developer reviewing and "voting" on a patchset
    - Core reviewers: allowed to land patches in the code base
    http://code.google.com/p/gerrit/
  6. 6.

    Code review the Gerrit way: the process
    General Gerrit process:
      Submitter submits a patchset for review
      Until the patchset is accepted:
        Review of the patchset
        If the patchset is not accepted:
          Submitter submits a new patchset
      Patchset is landed in the code base
  7. 7.

    Code review the Gerrit way: OpenStack
    https://review.openstack.org/
  8. 8.

    The OpenStack code review process (simplified)
      Submitter submits a patchset for review
      Until the patchset is accepted:
        Review of the patchset, which includes:
          Automatic verification (+1, -1)
          Vote by reviewers (+1, 0, -1)
          Vote by core reviewers (+2, -2)
        When there is at least one -1 or -2:
          Patchset not accepted
        When there is one +2:
          Automatic verification (+1, -1)
          If result is -1:
            Patchset not accepted
          Else:
            Patchset is accepted
        If the patchset is not accepted:
          Submitter submits a new patchset
      Patchset is landed in the code base
    https://wiki.openstack.org/wiki/Gerrit_Workflow
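
The acceptance rule in the simplified OpenStack process can be sketched as a small function. This is only an illustration of the rule as stated on the slide, not Gerrit's or OpenStack's actual implementation; the `Vote` class, the `patchset_accepted` function, and the `gate_check` callback are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Vote:
    voter: str  # hypothetical identifier of a bot, reviewer, or core reviewer
    value: int  # -2, -1, 0, +1 or +2


def patchset_accepted(votes: List[Vote], gate_check: Callable[[], int]) -> bool:
    """Apply the simplified rule from the slide to the votes on one patchset."""
    values = [vote.value for vote in votes]
    # Any negative vote (-1 or -2) means the patchset is not accepted.
    if any(value < 0 for value in values):
        return False
    # Without a +2 from a core reviewer the patchset is not accepted yet.
    if 2 not in values:
        return False
    # A +2 triggers a final automatic verification: +1 passes, -1 fails.
    return gate_check() == 1


# Made-up votes: automatic verification, one reviewer, one core reviewer.
votes = [Vote("ci-bot", 1), Vote("alice", 1), Vote("core-bob", 2)]
print(patchset_accepted(votes, gate_check=lambda: 1))
```

When the function returns False, the process on the slide continues with the submitter uploading a new patchset, which is then voted on from scratch.
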
  9. 9.

    Code review the Gerrit way: Example from OpenStack
  10. 10.

    Code review the Gerrit way: Example from OpenStack (2)
  11. 11.

    Bicho: analyzing issue tracking systems
    - Retrieves information from issue tracking systems
    - Stores results in a MySQL database
    - Information about each issue (ticket), and its modifications
    - Currently it supports:
      Bugzilla (XML API, HTML)
      Jira, Allura, Launchpad, GitHub, RedMine (API)
    Could it be used for analyzing code review systems?
  12. 12.

    Bicho: database schema (simplified)
  13. 13.

    Mapping Gerrit data to the Bicho database
    - Each code review process is modeled as a ticket
    - The process is modeled as changes to fields in the ticket:
      Submitted (SUBM): submission of a new patchset (new id)
      Verify (VRIF): result of a verification step: +1, -1
      Review (CRVW): vote during code review: -2 .. +2
      Approve (APRV): +1 (approved for merging)
      Abandoned (ABDN): +1 (marked as abandoned)
    - For each change to a field:
      current patchset id
      time of the change to the field (TIME)
      who issued the change (SUBMITTED BY)
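
To make the mapping concrete, the following sketch records one made-up review process as a sequence of field changes. The `FieldChange` class, its attribute names, and the `make_change` helper are assumptions for illustration; they are not Bicho's actual schema or code, and only the five field codes come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Field codes listed on the slide.
FIELDS = ("SUBM", "VRIF", "CRVW", "APRV", "ABDN")


@dataclass
class FieldChange:
    """One change to a ticket field (attribute names are illustrative)."""
    issue_id: int         # ticket standing for the whole review process
    field: str            # one of FIELDS
    value: str            # new patchset id for SUBM, the vote otherwise
    patchset: int         # patchset that was current when the change happened
    changed_on: datetime  # time of the change
    changed_by: str       # who issued the change


def make_change(issue_id, field, value, patchset, changed_on, changed_by):
    """Build a FieldChange, rejecting field codes that are not on the slide."""
    if field not in FIELDS:
        raise ValueError(f"unknown field code: {field}")
    return FieldChange(issue_id, field, value, patchset, changed_on, changed_by)


# A made-up review: voted down on its first patchset, approved on the second.
history: List[FieldChange] = [
    make_change(1, "SUBM", "1", 1, datetime(2013, 10, 1, 10, 0), "alice"),
    make_change(1, "VRIF", "+1", 1, datetime(2013, 10, 1, 10, 20), "ci-bot"),
    make_change(1, "CRVW", "-1", 1, datetime(2013, 10, 1, 15, 0), "bob"),
    make_change(1, "SUBM", "2", 2, datetime(2013, 10, 2, 9, 0), "alice"),
    make_change(1, "CRVW", "+2", 2, datetime(2013, 10, 2, 11, 0), "carol"),
    make_change(1, "APRV", "+1", 2, datetime(2013, 10, 2, 11, 5), "carol"),
]

# Every iteration of the review shows up as a distinct patchset id.
patchsets = sorted({change.patchset for change in history})
print(patchsets)
```

Keeping the whole process as an append-only list of changes is what lets an issue tracker schema represent a review: the ticket stays the same while its fields change over time.
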
  14. 14.

    Mapping Gerrit data to the Bicho database
    Example of queries:
    - Time to attention (time until first review):
      from first SUBM to first CRVW
    - Number of code reviewers:
      unique people changing CRVW
    - Backlog of open review processes at a certain date:
      tickets opened before date, but with no APRV or ABDN before it
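
The three example queries could be sketched as follows. This is a minimal illustration, assuming a heavily simplified `changes` table with hypothetical columns `issue_id`, `field`, `changed_by`, and `changed_on`, and using an in-memory SQLite database with made-up rows instead of the MySQL database that Bicho fills.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changes "
    "(issue_id INTEGER, field TEXT, changed_by TEXT, changed_on TEXT)"
)
# Made-up data: review 1 is approved, 2 stays open, 3 is abandoned late.
rows = [
    (1, "SUBM", "alice", "2013-10-01 10:00:00"),
    (1, "CRVW", "bob", "2013-10-01 16:00:00"),
    (1, "APRV", "carol", "2013-10-02 09:00:00"),
    (2, "SUBM", "dave", "2013-10-03 08:00:00"),
    (2, "CRVW", "bob", "2013-10-04 08:00:00"),
    (2, "CRVW", "erin", "2013-10-04 09:00:00"),
    (3, "SUBM", "alice", "2013-10-05 12:00:00"),
    (3, "ABDN", "alice", "2013-10-20 12:00:00"),
]
conn.executemany("INSERT INTO changes VALUES (?, ?, ?, ?)", rows)

# Time to attention: hours from the first SUBM to the first CRVW, per review.
time_to_attention = conn.execute("""
    SELECT issue_id,
           (CAST(strftime('%s', MIN(CASE WHEN field = 'CRVW' THEN changed_on END)) AS INTEGER)
            - CAST(strftime('%s', MIN(CASE WHEN field = 'SUBM' THEN changed_on END)) AS INTEGER))
           / 3600.0 AS hours
    FROM changes
    GROUP BY issue_id
    HAVING MIN(CASE WHEN field = 'CRVW' THEN changed_on END) IS NOT NULL
    ORDER BY issue_id
""").fetchall()

# Number of code reviewers: unique people who changed CRVW.
reviewers = conn.execute(
    "SELECT COUNT(DISTINCT changed_by) FROM changes WHERE field = 'CRVW'"
).fetchone()[0]

# Backlog at a date: opened before it, with no APRV or ABDN before it.
date = "2013-10-10 00:00:00"
backlog = conn.execute("""
    SELECT COUNT(DISTINCT s.issue_id)
    FROM changes s
    WHERE s.field = 'SUBM' AND s.changed_on < ?
      AND NOT EXISTS (
          SELECT 1 FROM changes c
          WHERE c.issue_id = s.issue_id
            AND c.field IN ('APRV', 'ABDN')
            AND c.changed_on < ?)
""", (date, date)).fetchone()[0]

print(time_to_attention, reviewers, backlog)
```

Review 3 has no CRVW entry, so it is left out of the time-to-attention result, yet it still counts toward the backlog because it was abandoned only after the chosen date.
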
  15. 15.

    Mapping Gerrit data to the Bicho database
    - For each code review process, entry in Issues
    - For each patchset, new id in Changes
    - Process as entries in Changes (SUBM, VRIF, CRVW, APRV, ABDN)
    - People and other tables dealt with as usual

    Component                         Files   Lines of code
    Bicho (including all backends)       27           6,621
    Bicho backends (9 backends)          12           4,104
    Gerrit backend (gerrit.py)            1             231
  16. 16.

    Examples of analysis
    Number of people proposing code changes each month:
      SELECT YEAR(submitted_on), MONTH(submitted_on),
             COUNT(DISTINCT(submitted_by)) AS submitters
      FROM issues
      GROUP BY YEAR(submitted_on), MONTH(submitted_on);
  17. 17.

    Examples of analysis (2)
    Number of people reviewing patchsets each month:
      SELECT YEAR(changed_on), MONTH(changed_on),
             COUNT(DISTINCT(changed_by)) AS reviewers
      FROM changes
      WHERE field='CRVW'
      GROUP BY YEAR(changed_on), MONTH(changed_on);
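
The reviewers-per-month query can be tried without a MySQL server. The sketch below is an assumption-laden stand-in: it uses SQLite, which has no `YEAR()` or `MONTH()` functions, so `strftime` takes their place, and the `changes` table holds only the three columns the query needs, filled with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE changes (field TEXT, changed_by TEXT, changed_on TEXT)")
# Made-up votes and one verification entry that the query must ignore.
conn.executemany(
    "INSERT INTO changes VALUES (?, ?, ?)",
    [
        ("CRVW", "bob", "2013-10-01 16:00:00"),
        ("CRVW", "bob", "2013-10-15 11:00:00"),
        ("CRVW", "erin", "2013-10-20 09:30:00"),
        ("CRVW", "bob", "2013-11-02 14:00:00"),
        ("VRIF", "ci-bot", "2013-11-03 08:00:00"),
    ],
)

# Same grouping as the MySQL query, with strftime replacing YEAR() and MONTH().
reviewers_per_month = conn.execute("""
    SELECT CAST(strftime('%Y', changed_on) AS INTEGER) AS year,
           CAST(strftime('%m', changed_on) AS INTEGER) AS month,
           COUNT(DISTINCT changed_by) AS reviewers
    FROM changes
    WHERE field = 'CRVW'
    GROUP BY year, month
    ORDER BY year, month
""").fetchall()

print(reviewers_per_month)
```

Because the count uses DISTINCT, bob's two votes in October count as one reviewer for that month.
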
  18. 18.

    Examples of analysis (3)
    [Figure: change proposers (black), reviewers (red), core reviewers (blue). Per month, OpenStack, July 2011 to November 2013.]
  19. 19.

    Examples of analysis (4)
    [Figure: distribution of time to close for merged (black) and abandoned (red) changes (time in natural log hours).]
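
The unit used for time to close, natural log of hours, can be illustrated with a short sketch. The function name and the sample timestamps are hypothetical; only the unit comes from the slide.

```python
import math
from datetime import datetime


def log_hours_to_close(opened: datetime, closed: datetime) -> float:
    """Natural log of the hours between opening and closing a review."""
    hours = (closed - opened).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("closing time must be later than opening time")
    return math.log(hours)


# Made-up review that took exactly one day (24 hours) to close.
value = log_hours_to_close(datetime(2013, 10, 1, 10, 0), datetime(2013, 10, 2, 10, 0))
print(round(value, 3))  # 3.178
```

On this scale a value of 0 corresponds to one hour, and a value of 8 to about 2,981 hours, roughly 124 days.
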
  20. 20.

    Questions?
    Reproducibility package: http://gsyc.es/~jgb/repro/2014-sqm-bicho-gerrit
    Slides: http://bit.ly/bicho-gerrit