Why Did your PR Get Rejected? Defining Guidelines for Avoiding PR Rejection in Open Source Projects

Pull requests are a commonly used method of collaboration for software developers working on open source projects. In this paper, we analyze the most common reasons, sentiment polarity, and interaction length for pull request rejections, as well as the correlations between these factors in a large open-source project called Scapy. We manually analyzed 231 rejected pull requests and systematically mapped sentiment and categorized rejection reasons. We found that the most frequent reasons for pull request rejection refer to source code management issues, incomplete comprehension of project functionalities, poor understanding of what reviewers expect, and misunderstanding the project guidelines (often due to a lack of complete/updated instructions and communication gaps). This work is an ongoing effort toward establishing practical guidelines for globally distributed contributors in open-source projects to minimize pull request rejection and maximize productivity leading to more fruitful remote collaboration. Future work involves expanding the analysis to more projects and incorporating quantitative methods.

Bruno C. da Silva

July 01, 2020

Transcript

  1. Why Did your PR Get Rejected? Defining Guidelines for Avoiding PR Rejection in Open Source Projects. Nick Papadakis, Ayan Patel, Tanay Gottigundala, Alexandra Garro, Xavier Graham, Bruno da Silva [email protected] CHASE Workshop, June 2020
  2. ?

  3. They built a list of PR rejection reasons from surveying the PR authors. We want to do manual analysis directly on the PRs and spell out concrete guidelines for contributors.
  4. We focus on: all types of PR authors; rejected PRs only; qualitative investigation.
  5. We manually analyzed 231 rejected PRs from Scapy, a Python packet manipulation program and library (5.3k stars, 1.2k forks).
    RQ1: most frequent reasons why PRs are rejected
    RQ2: PR rejection vs. sentiment on comments
    RQ3: interaction length on PRs vs. rejection reason and sentiment
  6. 5 researchers. Manual classification of:
    • Sentiment (pos, neg, neutral)
    • Rejection reason
    Calibration session: 25 PRs, everyone present. Each of the remaining PRs was assigned to 2 different researchers. Individual classification sessions, followed by conflict resolution sessions.
  7. RQ1: most frequent reasons why PRs are rejected. PR conflicts (after excluding PRs closed by the author or accidentally, and PRs with no conversation).
  8. RQ1: most frequent reasons why PRs are rejected. PR conflicts (after excluding PRs closed by the author or accidentally, and PRs with no conversation). Code rebase: we found a lot of these. Note to researchers: these are 'false rejections', since code rebase is another way to integrate code (accepting the changes fully or partially).
  9. RQ1: most frequent reasons why PRs are rejected. PR conflicts (after excluding PRs closed by the author or accidentally, and PRs with no conversation). Code rebase: we found a lot of these. Note to researchers: these are 'false rejections', since code rebase is another way to integrate code (accepting the changes fully or partially). Unnecessary functionality.
  10. RQ1: most frequent reasons why PRs are rejected. PR conflicts (after excluding PRs closed by the author or accidentally, and PRs with no conversation). Code rebase: we found a lot of these. Note to researchers: these are 'false rejections', since code rebase is another way to integrate code (accepting the changes fully or partially). Unnecessary functionality. Needs testing.
  11. RQ1: most frequent reasons why PRs are rejected. PR conflicts (after excluding PRs closed by the author or accidentally, and PRs with no conversation). Code rebase: we found a lot of these. Note to researchers: these are 'false rejections', since code rebase is another way to integrate code (accepting the changes fully or partially). Unnecessary functionality. Needs testing. Author unable to fix issues.
  12. RQ2: PR rejection vs. sentiment on comments. The vast majority of the PR conversations were neutral in sentiment (~66%).
  13. RQ2: PR rejection vs. sentiment on comments. The vast majority of the PR conversations were neutral in sentiment (~66%); ~30% are positive.
  14. RQ2: PR rejection vs. sentiment on comments. The vast majority of the PR conversations were neutral in sentiment (~66%); ~30% are positive; ~3% are negative. This distribution does not vary significantly across specific rejection categories.
  15. RQ3: interaction length on PRs vs. rejection reason and sentiment. Avg comment count by rejection category: Author issues 10; Version control issues 8; Code issues 7; Side effect issues 4; Unnecessary changes 3.
  16. RQ3: interaction length on PRs vs. rejection reason and sentiment. Avg comment count: 10, 10, 5. Do people act more on things they are emotionally motivated by? More conversation => more sentiment expressed. In many PRs with positive sentiment, reviewers were trying to help the author get the PR accepted.
  17. Guidelines, v0.1:
    a) Know the scope and requirements of your change well, along with the context around it.
    b) Make sure that someone else has not already opened a pull request to address the same issue.
    c) Understand the project's functionalities well before creating new ones. Make sure new functionalities are really necessary.
    d) Always include tests to cover your changes. Make sure your code meets test coverage expectations.
    e) Make sure to follow up on your pull requests; reviewers may request changes before a pull request is accepted.
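The classification workflow and the RQ2/RQ3 summaries described in the slides can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' tooling: the `Label` schema, the function names, and any data passed to them are hypothetical, and the slides do not say how the numbers were actually computed.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Label:
    """One researcher's classification of one rejected PR (hypothetical schema)."""
    pr_id: int
    rater: str
    sentiment: str  # "pos", "neg", or "neutral"
    reason: str     # rejection reason category


def find_conflicts(labels):
    """Return the PR ids whose raters disagree on sentiment or reason.

    These are the PRs that would go to a conflict resolution session.
    """
    seen = defaultdict(set)
    for label in labels:
        seen[label.pr_id].add((label.sentiment, label.reason))
    return sorted(pr_id for pr_id, pairs in seen.items() if len(pairs) > 1)


def sentiment_share(final_sentiments):
    """Map each sentiment to its share of PRs (RQ2-style summary).

    final_sentiments: dict of pr_id -> agreed sentiment.
    """
    counts = Counter(final_sentiments.values())
    total = sum(counts.values())
    return {sentiment: n / total for sentiment, n in counts.items()}


def avg_comments_by_reason(final_reasons, comment_counts):
    """Average comment count per rejection reason (RQ3-style summary).

    final_reasons: dict of pr_id -> agreed rejection reason.
    comment_counts: dict of pr_id -> number of comments on that PR.
    """
    grouped = defaultdict(list)
    for pr_id, reason in final_reasons.items():
        grouped[reason].append(comment_counts[pr_id])
    return {reason: mean(counts) for reason, counts in grouped.items()}
```

Calling `find_conflicts` on the two raters' labels gives the list of PRs to discuss; once every conflict is resolved, the two summary functions operate on the agreed labels only.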