Slide 1

Slide 1 text

Work Practices and Challenges in Pull-Based Development: the Integrator’s Perspective Georgios Gousios, Andy Zaidman, Margaret-Anne Storey, Arie van Deursen

Slide 2

Slide 2 text

How does software grow? http://travelspirit333.com

Slide 3

Slide 3 text

How does OSS grow? http://travelspirit333.com

Slide 4

Slide 4 text

The onion model Core developers Co-developers Community Users

Slide 5

Slide 5 text

The pull-based development model. Contributor: “Here are my changes”. Integrator (changes examined): “Please fix those issues”. Contributor: “Here are my updates”. Integrator (changes re-examined): “Looks great, thanks!” Changes integrated.
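The exchange on this slide can be read as a small state machine. The sketch below is only an illustration of that reading: the state names, the action names, and the rule that an integrator may accept a first submission directly are assumptions, not something the slide specifies. Rejection is left out because the slide shows only the accepting path.

```python
from enum import Enum, auto


class State(Enum):
    SUBMITTED = auto()          # contributor: "Here are my changes"
    CHANGES_REQUESTED = auto()  # integrator: "Please fix those issues"
    UPDATED = auto()            # contributor: "Here are my updates"
    INTEGRATED = auto()         # integrator: "Looks great, thanks!"


# Allowed moves, keyed by (current state, actor, action).
# Action names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    (State.SUBMITTED, "integrator", "request_fixes"): State.CHANGES_REQUESTED,
    (State.SUBMITTED, "integrator", "accept"): State.INTEGRATED,  # assumption
    (State.CHANGES_REQUESTED, "contributor", "update"): State.UPDATED,
    (State.UPDATED, "integrator", "request_fixes"): State.CHANGES_REQUESTED,
    (State.UPDATED, "integrator", "accept"): State.INTEGRATED,
}


def step(state, actor, action):
    """Return the next state, or raise ValueError for a move the sketch does not allow."""
    try:
        return TRANSITIONS[(state, actor, action)]
    except KeyError:
        raise ValueError(f"{actor} cannot {action} in state {state.name}") from None


def run(events, state=State.SUBMITTED):
    """Apply a sequence of (actor, action) pairs, starting from a fresh submission."""
    for actor, action in events:
        state = step(state, actor, action)
    return state


# The conversation shown on the slide: examine, re-examine, integrate.
slide_flow = [
    ("integrator", "request_fixes"),
    ("contributor", "update"),
    ("integrator", "accept"),
]
final_state = run(slide_flow)
```

The examine and re-examine loop can repeat any number of times before the changes are integrated, which is why `UPDATED` can lead back to `CHANGES_REQUESTED`.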

Slide 6

Slide 6 text

GitHub: made pull requests popular

Slide 7

Slide 7 text

45% of collaborative projects (projects with > 1 committers). Gousios et al. ICSE14

Slide 8

Slide 8 text

45% of collaborative projects (projects with > 1 committers): 55% use shared repository, 45% use pull requests. Gousios et al. ICSE14

Slide 9

Slide 9 text

Widely popular and increasing

Slide 10

Slide 10 text

Widely popular and increasing: 90k repositories, 400k pull requests

Slide 11

Slide 11 text

Widely popular and increasing. Per month: 90k repositories, 400k pull requests

Slide 12

Slide 12 text

Large scale collaboration: 2710 committers, 309 code reviewers, 9120 commenters; 657 collaborators, 125 code reviewers, 3904 commenters; 875 committers, 92 code reviewers, 5200 commenters

Slide 13

Slide 13 text

Too successful?

Slide 14

Slide 14 text

Too successful? “Lack of knowledge of git from contributors; most don’t know how to resolve a merge conflict.” “Sifting through the GitHub information flood to find what, if any, I should address.” “Dealing with loud and trigger-happy developers.”

Slide 15

Slide 15 text

Integrators: guardians of quality

Slide 16

Slide 16 text

Research Questions. How do integrators: use pull requests? decide the fate of a contribution? evaluate the quality of contributions? prioritise the application of contributions? What key challenges do they face?

Slide 17

Slide 17 text

Pilot survey GHTorrent 250 integrators (25 responses) Literature Survey Final Survey Analysis

Slide 18

Slide 18 text

How we reached out to integrators. Lifelines for 10% slowest pull requests; community participation in commits; % comments (red) and commenters (blue) from the community. http://ghtorrent.org/pullreq-perf/

Slide 19

Slide 19 text

GHTorrent Final Survey Performance Reports

Slide 20

Slide 20 text

GHTorrent Final Survey Performance Reports 3,200 integrators (749 responses)

Slide 21

Slide 21 text

GHTorrent Final Survey Analysis / filtering (15 closed questions) Performance Reports 4 researchers Card sorting (7 open questions) 3,200 integrators (749 responses)

Slide 22

Slide 22 text

GHTorrent Final Survey Analysis / filtering (15 closed questions) Performance Reports 4 researchers Card sorting (7 open questions) 3,200 integrators (749 responses) Results

Slide 23

Slide 23 text

GHTorrent Final Survey Analysis / filtering (15 closed questions) Performance Reports 4 researchers Card sorting (7 open questions) 3,200 integrators (749 responses) Results 645 integrators
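The survey figures on these slides imply response rates that can be checked directly. The short sketch below does the arithmetic. Reading the 645 integrators as the responses retained after filtering is an assumption based on the slide layout, and the function name is illustrative.

```python
def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Pilot survey: 250 integrators contacted, 25 responses.
pilot_rate = percentage(25, 250)

# Final survey: 3,200 integrators contacted, 749 responses.
final_rate = percentage(749, 3200)

# Assumption: 645 is the number of responses kept after filtering.
retained_share = percentage(645, 749)

print(f"pilot response rate: {pilot_rate:.1f}%")
print(f"final response rate: {final_rate:.1f}%")
print(f"responses retained:  {retained_share:.1f}%")
```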

Slide 24

Slide 24 text

How do integrators use pull requests?

Slide 25

Slide 25 text

How do integrators use pull requests? [Bar chart, % of responses: Code reviews, Issue fixes, Soliciting contributions, Discussing new features, Work distribution, Other]

Slide 26

Slide 26 text

How do integrators use pull requests? [Bar chart, % of responses: Code reviews, Issue fixes, Soliciting contributions, Discussing new features, Work distribution, Other]

Slide 27

Slide 27 text

How do integrators use pull requests? [Bar chart, % of responses: Code reviews, Issue fixes, Soliciting contributions, Discussing new features, Work distribution, Other]

Slide 28

Slide 28 text

How do projects do code reviews?

Slide 29

Slide 29 text

How do projects do code reviews? 65%: Inline code comments

Slide 30

Slide 30 text

How do projects do code reviews? 65%: Inline code comments. 60%: Integration pipeline

Slide 31

Slide 31 text

How do projects do code reviews? 65%: Inline code comments. 60%: Integration pipeline. 55%: The community participates

Slide 32

Slide 32 text

How do integrators decide which contributions to accept?

Slide 33

Slide 33 text

Factors for determining acceptance [Bar chart, % of responses: Code quality, Code style, Project fit, Technical fit, Testing, Documentation, Feature importance, Code review result]

Slide 34

Slide 34 text

Factors for determining acceptance [Bar chart, % of responses: Code quality, Code style, Project fit, Technical fit, Testing, Documentation, Feature importance, Code review result. Highlighted groups: Quality, Project Fit]

Slide 35

Slide 35 text

PR characteristics leading to merge [Bar chart, % of responses, “Not or Mildly Important” vs “Quite or Very Important”: PR hotness, PR has tests, PR track record, PR churn, PR # comments]

Slide 36

Slide 36 text

How do integrators evaluate the quality of contributions?

Slide 37

Slide 37 text

Views on quality
- Following project's code style
- Comments in code
- Quality of the commit message
- Readability of the code
- Test coverage and their readability
- Included documentation updates
“Proper use of C++ language constructs. Knowledge and use of our dependent libraries […]”

Slide 38

Slide 38 text

Contribution quality perception [Bar chart, % of responses: Style conformance, Test coverage of PR, Code quality, Code review, Test/CI result, Documentation, Experience, Author reputation, Project conventions]

Slide 39

Slide 39 text

Contribution quality perception [Bar chart, % of responses: Style conformance, Test coverage of PR, Code quality, Code review, Test/CI result, Documentation, Experience, Author reputation, Project conventions. Highlighted group: Conformance]

Slide 40

Slide 40 text

Contribution quality perception [Bar chart, % of responses: Style conformance, Test coverage of PR, Code quality, Code review, Test/CI result, Documentation, Experience, Author reputation, Project conventions. Highlighted groups: Conformance, Technical Excellence]

Slide 41

Slide 41 text

Contribution quality perception [Bar chart, % of responses: Style conformance, Test coverage of PR, Code quality, Code review, Test/CI result, Documentation, Experience, Author reputation, Project conventions. Highlighted groups: Conformance, Technical Excellence]

Slide 42

Slide 42 text

How do integrators prioritise the application of contributions?

Slide 43

Slide 43 text

Prioritisation factors

Slide 44

Slide 44 text

Prioritisation factors: Criticality. “How important the issue is to the project. How serious the bug is. […]”

Slide 45

Slide 45 text

Prioritisation factors: Criticality, Urgency. “Whether the functionality is critical (bug fixes or very important feature blocking many users) […]”

Slide 46

Slide 46 text

Prioritisation factors: Criticality, Urgency, Size/Complexity. “If the merge is small, or affects a large issue, it'll be merged fast. If it's a feature request or large change it will take longer.”

Slide 47

Slide 47 text

Prioritisation factors: Criticality, Urgency, Size/Complexity, Age. “Mostly FIFO, but critical bug fixes or hot fixes get pushed to the top”
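The "mostly FIFO, but critical fixes go to the top" policy quoted on this slide can be sketched as a two-part sort key. This is a minimal illustration of that one policy, not a model of how any surveyed project actually orders its queue. The `PullRequest` class, its fields, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    number: int             # hypothetical identifier
    opened_order: int       # lower value = opened earlier
    critical: bool = False  # critical bug fix or hot fix


def review_order(queue):
    """Critical fixes first; within each group, oldest first (FIFO)."""
    # False sorts before True, so "not critical" puts critical PRs at the top.
    return sorted(queue, key=lambda pr: (not pr.critical, pr.opened_order))


queue = [
    PullRequest(number=101, opened_order=1),
    PullRequest(number=102, opened_order=2),
    PullRequest(number=103, opened_order=3, critical=True),
]
ordered = [pr.number for pr in review_order(queue)]
print(ordered)  # [103, 101, 102]
```

Size and complexity, the other factors on the slide, could be added as further key components, but the slides do not say how integrators weigh them against age.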

Slide 48

Slide 48 text

What are the key challenges faced by integrators?

Slide 49

Slide 49 text

Technical challenges

Slide 50

Slide 50 text

Technical challenges: code base quality. “Ensuring the codebase remains consistent”

Slide 51

Slide 51 text

Technical challenges: code base quality, impact assessment. “Huge, unwieldy, complected bundles of ‘hey I added a LOT of features and fixes ALL AT ONCE!’ that are hell to review and that I’d like to *partially* reject if only the parts were in any way separable…”

Slide 52

Slide 52 text

Technical challenges: code base quality, impact assessment, contributor experience. “Lack of knowledge of git from contributors; most don’t know how to resolve a merge conflict.”

Slide 53

Slide 53 text

Technical challenges: code base quality, impact assessment, contributor experience, volume. “Sifting through the GitHub information flood to find what, if any, I should address”

Slide 54

Slide 54 text

Social challenges

Slide 55

Slide 55 text

Social challenges: workload. “Time is the biggest challenge -- I would love to have a more hierarchical structure to spread out the work; however, the community's development capacity is somewhat limited.”

Slide 56

Slide 56 text

Social challenges: workload, responsiveness. “People become non-responsive, never write tests, and never update style. They make a change once when they need their problem solved and disappear basically.”

Slide 57

Slide 57 text

Social challenges: workload, responsiveness, explaining rejection. “Disappointing people when they put in a lot of work but the quality is low.”

Slide 58

Slide 58 text

Social challenges: workload, responsiveness, explaining rejection, motivating contributors. “On-boarding new contributors: helping them set up the development environment and learn enough Git/Github to submit a pull request”

Slide 59

Slide 59 text

How does OSS grow? The pull-based development perspective http://travelspirit333.com

Slide 60

Slide 60 text

Integrators use pull requests for code review https://www.digitalassurance.com/sites/default/files/iStock_000017496218_Medium.jpg

Slide 61

Slide 61 text

Integrators use testing as a safety net http://www.tradeandexportme.com/wp-content/uploads/2013/02/shutterstock_91331678.jpg

Slide 62

Slide 62 text

Integrators face technical challenges

Slide 63

Slide 63 text

Integrators face technical challenges: impact assessment, quality evaluation, work prioritisation

Slide 64

Slide 64 text

Integrators do not use track records (much)

Slide 65

Slide 65 text

Not a better online forum experience

Slide 66

Slide 66 text

Recommendations for integrators

Slide 67

Slide 67 text

Recommendations for integrators: automation, contribution guidelines, testing, be proactive/reactive

Slide 68

Slide 68 text

New research directions

Slide 69

Slide 69 text

New research directions: prioritisation, quality analysis, impact analysis, code review automation

Slide 70

Slide 70 text

New research directions: prioritisation, quality analysis, impact analysis, code review automation. @gousiosg