Slide 1

Slide 1 text

Work Practices, Challenges, and Research Opportunities in Continuous Integration Gustavo Pinto @gustavopinto gustavopinto.org

Slide 2

Slide 2 text

@gustavopinto

Slide 3

Slide 3 text

@gustavopinto commit

Slide 4

Slide 4 text

@gustavopinto commit

Slide 5

Slide 5 text

@gustavopinto commit commit

Slide 6

Slide 6 text

@gustavopinto commit commit pull

Slide 7

Slide 7 text

@gustavopinto pull commit pull commit

Slide 8

Slide 8 text

@gustavopinto pull commit pull commit 25M GitHub projects use Travis CI

Slide 9

Slide 9 text

@gustavopinto pull commit pull commit

Slide 10

Slide 10 text

@gustavopinto This process should be entirely automated! pull commit

Slide 11

Slide 11 text

@gustavopinto Continuous Integration doesn’t get rid of bugs, but it does make them dramatically easier to find and remove

Slide 12

Slide 12 text

@gustavopinto Continuous Integration doesn’t get rid of bugs, but it does make them dramatically easier to find and remove Continuous Integration is one of the key XP practices. I’ve been selling books talking about it since the 90s

Slide 13

Slide 13 text

Continuous Integration doesn’t get rid of bugs, but it does make them dramatically easier to find and remove Continuous Integration is one of the key XP practices. I’ve been selling books talking about it since the 90s Continuous Integration makes sure that our test case suite is in good shape! @gustavopinto

Slide 14

Slide 14 text

Continuous Integration Continuous Delivery Continuous Deployment @gustavopinto

Slide 15

Slide 15 text

Continuous Integration Continuous Delivery Continuous Deployment @gustavopinto DevOps Microservices SaaS

Slide 16

Slide 16 text

Continuous Integration Continuous Delivery Continuous Deployment DevOps Satisfaction Transparency Confidence Microservices SaaS @gustavopinto

Slide 17

Slide 17 text

C C++ C# Clojure CoffeeScript Erlang Go Haskell Java JavaScript Objective-C PHP Perl Python Ruby Scala TypeScript Most popular PLs on @gustavopinto

Slide 18

Slide 18 text

C C++ C# Clojure CoffeeScript Erlang Go Haskell Java JavaScript Objective-C PHP Perl Python Ruby Scala TypeScript Most popular PLs on @gustavopinto

Slide 19

Slide 19 text

C C++ C# Clojure CoffeeScript Erlang Go Haskell Java JavaScript Objective-C PHP Perl Python Ruby Scala TypeScript Most popular PLs on 50 most popular projects … 750 most popular projects @gustavopinto

Slide 20

Slide 20 text

@gustavopinto Not software projects Not active projects

Slide 21

Slide 21 text

666 @gustavopinto

Slide 22

Slide 22 text

666 38% 48% 32% @gustavopinto

Slide 23

Slide 23 text

666 38% 48% 32% 17% 13% 18% @gustavopinto

Slide 24

Slide 24 text

666 38% 48% 32% 13% 18% About 25% of all builds have failed! @gustavopinto 17%

Slide 25

Slide 25 text

666 1,100 random failing builds @gustavopinto

Slide 26

Slide 26 text

666 1,100 random failing builds @gustavopinto

Slide 27

Slide 27 text

Dear Andrei, we saw that your pull-request #33292 broke the build of the rails/rails project. Since we are studying the reasons behind broken builds, would you mind answering a few questions? Link here! Thank you, Gustavo Pinto Hi Gustavo! Sure, I’m happy to help. Andrei @gustavopinto

Slide 28

Slide 28 text

Background Experience with CI (Q1) Current position (Q2) Work for (Q3) How often contribute to open source? (Q4) How familiar with CI? (Q5) How many OSS projects contributed to with CI? (Q6) Have configured any project to use CI? (Q7) Why do you use CI? @gustavopinto

Slide 29

Slide 29 text

Reasons for build breakage? (Q12) What are the technical reasons? (Q13) What are the social reasons? @gustavopinto Benefits and Challenges (Q14) What are the benefits of using CI? (Q15) What are the challenges of using CI?

Slide 30

Slide 30 text

1,100 emails sent @gustavopinto

Slide 31

Slide 31 text

1,100 emails sent @gustavopinto

Slide 32

Slide 32 text

158 responses (~15% response rate!) 1,100 emails sent @gustavopinto
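The response rate quoted on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `response_rate` is an illustrative assumption and not part of the study's tooling.

```python
def response_rate(responses: int, invitations: int) -> float:
    """Return the share of invitations that produced a response, as a percentage."""
    if invitations <= 0:
        raise ValueError("invitations must be positive")
    return 100.0 * responses / invitations


# Figures taken from the slide: 158 responses out of 1,100 emails sent.
rate = response_rate(158, 1100)
print(f"{rate:.1f}%")  # prints "14.4%"
```

The exact value is about 14.4%, which the slide rounds to roughly 15%.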

Slide 33

Slide 33 text

@gustavopinto

Slide 34

Slide 34 text

@gustavopinto

Slide 35

Slide 35 text

codes @gustavopinto

Slide 36

Slide 36 text

@gustavopinto

Slide 37

Slide 37 text

Respondents Background @gustavopinto

Slide 38

Slide 38 text

Experience with CI @gustavopinto

Slide 39

Slide 39 text

Experience with CI @gustavopinto

Slide 40

Slide 40 text

Experience with CI: Reasons for adoption Quick feedback on produced development for adding business value running *all* tests locally is taking too long (10+ mins) Speed up software dev. Improve software quality Catch regressions/bugs Enforce automation Credibility @gustavopinto

Slide 41

Slide 41 text

Experience with CI: Reasons for adoption ensuring both my changes and other people's changes do not break anything Speed up software dev. Improve software quality Catch regressions/bugs Enforce automation Credibility @gustavopinto

Slide 42

Slide 42 text

Experience with CI: Reasons for adoption Much easier regression testing in complex projects Prevent releasing buggy software into the production environment Speed up software dev. Improve software quality Catch regressions/bugs Enforce automation Credibility @gustavopinto

Slide 43

Slide 43 text

Experience with CI: Reasons for adoption To make sure tests are always run, since humans can be inconsistent on that Speed up software dev. Improve software quality Catch regressions/bugs Enforce automation Credibility @gustavopinto

Slide 44

Slide 44 text

Experience with CI: Reasons for adoption it makes the project look more legit Speed up software dev. Improve software quality Catch regressions/bugs Enforce automation Credibility @gustavopinto

Slide 45

Slide 45 text

Technical reasons for build breakage @gustavopinto Inadequate Testing Version changes Dependency Management Complex code base Git usage Badly written tests that fail with minor bugfixes not enough tests

Slide 46

Slide 46 text

Technical reasons for build breakage @gustavopinto Inadequate Testing Version changes Dependency Management Complex code base Git usage the version of a language component is different, and the change made to the language causes breakage

Slide 47

Slide 47 text

Technical reasons for build breakage @gustavopinto Inadequate Testing Version changes Dependency management Complex code base Git usage people sometimes don’t update dependencies, so the Continuous Integration server detects errors that do not happen locally

Slide 48

Slide 48 text

Technical reasons for build breakage @gustavopinto Inadequate Testing Version changes Dependency Management Complex code base Git usage unfamiliarity with the architecture of the code and overall module interactions

Slide 49

Slide 49 text

Technical reasons for build breakage @gustavopinto Inadequate Testing Version changes Dependency Management Complex code base Git usage git rebase and left the work on hold for too long

Slide 50

Slide 50 text

Social reasons for build breakage just be happy I’m committing to the project, somebody else can test if what I did works Carelessness Lack of testing culture Overconfidence Lack of communication It’s just fine @gustavopinto

Slide 51

Slide 51 text

Social reasons for build breakage Carelessness Lack of testing culture Overconfidence Lack of communication It’s just fine @gustavopinto people not running the tests and build on their machines before pushing the changes

Slide 52

Slide 52 text

Social reasons for build breakage Carelessness Lack of testing culture Overconfidence Lack of communication It’s just fine @gustavopinto this is only a small fix, it should not break anything

Slide 53

Slide 53 text

Social reasons for build breakage Carelessness Lack of testing culture Overconfidence Lack of communication It’s just fine @gustavopinto not knowing who to ask for help

Slide 54

Slide 54 text

Social reasons for build breakage @gustavopinto Carelessness Lack of testing culture Overconfidence Lack of communication It’s just fine CI is there for you not to be afraid of broken builds in a branch

Slide 55

Slide 55 text

Social reasons for build breakage @gustavopinto Carelessness Lack of testing culture Overconfidence Lack of communication It’s just fine CI is there for you not to be afraid of broken builds in a branch Only if it is not in the master branch

Slide 56

Slide 56 text

Benefits of CI @gustavopinto Catch problems early Cross-platform testing Confidence Automation Software quality Being aware of when/where breakage occurs greatly accelerates solution

Slide 57

Slide 57 text

Benefits of CI @gustavopinto Catch problems early Cross-platform testing Confidence Automation Software quality compiling for multiple different targets (x86, arm, windows, linux etc)

Slide 58

Slide 58 text

Benefits of CI @gustavopinto Catch problems early Cross-platform testing Confidence Automation Software quality compiling for multiple different targets (x86, arm, windows, linux etc) this is not feasible or cost-effective to do manually

Slide 59

Slide 59 text

Benefits of CI @gustavopinto Catch problems early Cross-platform testing Confidence Automation Software quality Some sort of confidence that introduced changes don’t break the current behavior But depends on the coverage

Slide 60

Slide 60 text

Benefits of CI @gustavopinto Catch problems early Cross-platform testing Confidence Automation Software quality performing a wide range of manual steps that a human would not normally be bothered to check

Slide 61

Slide 61 text

Benefits of CI @gustavopinto Catch problems early Cross-platform testing Confidence Automation Software quality When you use CI, you have a good health check in your code base […] It is a warrant of quality of your code

Slide 62

Slide 62 text

Challenges of CI @gustavopinto Set up environment False sense of confidence Discipline Monetary costs Flaky tests I found it hard to setup some distributed/multicomponent tests. This partially can be resolved by the containers

Slide 63

Slide 63 text

Challenges of CI @gustavopinto Set up environment False sense of confidence Discipline Monetary costs Flaky tests Over-reliance on a passing build result can encourage a reviewer to merge code without a thorough review.

Slide 64

Slide 64 text

Challenges of CI @gustavopinto Set up environment False sense of confidence Discipline Monetary costs Flaky tests instead of only focusing on the code of your application, you also focus on the code of your build system

Slide 65

Slide 65 text

Challenges of CI @gustavopinto Set up environment False sense of confidence Discipline Monetary costs Flaky tests Hosted services like Travis CI are not free if you want to use them at a bigger scale or you need to maintain by yourself

Slide 66

Slide 66 text

Challenges of CI @gustavopinto Set up environment False sense of confidence Discipline Monetary costs Flaky tests Tests can sometimes be flaky, which is frustrating

Slide 67

Slide 67 text

@gustavopinto Speed up software dev. Catch bugs/ regressions Credibility Transparency Reason for adoption

Slide 68

Slide 68 text

@gustavopinto Inadequate testing Dependency management Carelessness Overconfidence Why build breaks Speed up software dev. Catch bugs/ regressions Credibility Transparency Reason for adoption

Slide 69

Slide 69 text

@gustavopinto Catch problems Cross-platform testing Confidence Automation Benefits Inadequate testing Dependency management Carelessness Overconfidence Why build breaks Speed up software dev. Catch bugs/ regressions Credibility Transparency Reason for adoption

Slide 70

Slide 70 text

@gustavopinto Catch problems Cross-platform testing Confidence Automation Benefits Set up environment False sense of confidence Discipline Monetary costs Challenges Inadequate testing Dependency management Carelessness Overconfidence Why build breaks Speed up software dev. Catch bugs/ regressions Credibility Transparency Reason for adoption

Slide 71

Slide 71 text

@gustavopinto

Slide 72

Slide 72 text

Confidence @gustavopinto

Slide 73

Slide 73 text

Confidence False sense of @gustavopinto

Slide 74

Slide 74 text

Confidence Over @gustavopinto

Slide 75

Slide 75 text

How does (over/lack of) confidence in CI impact software development practice? @gustavopinto

Slide 76

Slide 76 text

Some non-software projects use CI Why? @gustavopinto

Slide 77

Slide 77 text

Some non-software projects use CI Devs who broke the build are more experienced Why? Really? @gustavopinto

Slide 78

Slide 78 text

Some non-software projects use CI Devs who broke the build are more experienced Some PLs have a higher proportion of broken builds Why? Really? Why? @gustavopinto

Slide 79

Slide 79 text

Some non-software projects use CI Devs who broke the build are more experienced Some PLs have a higher proportion of broken builds Some commits that lead to broken builds are not broken Why? Really? Why? How? @gustavopinto

Slide 80

Slide 80 text

Are casual contributors more prone to create a failing build? Research Question MSR’17 @gustavopinto

Slide 81

Slide 81 text

Methodology TravisTorrent TravisCI CI Build Data Committer Data Dataset Dataset without duplicated users User Disambiguation Technique Data Cleaning Dataset with 1,074 curated projects Data Processing Data Statistical Tests @gustavopinto
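The slide names a user disambiguation step but does not say which technique was used. As a rough illustration only, the sketch below merges committer identities that share the same email address after case and whitespace normalization. The merge rule, the helper names, and the sample data are all assumptions, not the study's actual method.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def disambiguate(committers):
    """Group (name, email) pairs by normalized email address.

    Assumption: two identities with the same email are the same person.
    Returns a dict mapping each email to the set of names seen with it.
    """
    groups = defaultdict(set)
    for name, email in committers:
        groups[normalize(email)].add(normalize(name))
    return dict(groups)


# Hypothetical committer records, invented for this example.
committers = [
    ("Ana Silva", "ana@example.org"),
    ("ana silva", "Ana@Example.org"),
    ("Bob Lee", "bob@example.org"),
]
users = disambiguate(committers)
print(len(users))  # prints 2
```

A real pipeline would likely need more rules (for example, matching on names when emails differ), which is why this is only a starting point.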

Slide 82

Slide 82 text

1,074 projects 619,370 builds 35,360 contributors @gustavopinto

Slide 83

Slide 83 text

1,074 projects 619,370 builds 35,360 contributors [Charts: number of users by number of builds (1, 2, 3, 4, 5+); number of builds by casual and non-casual contributors] @gustavopinto

Slide 84

Slide 84 text

Being a casual contributor is not a strong indicator for creating failing builds [Chart: number of projects with no difference, higher casual success, and lower casual success] @gustavopinto
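A per-project comparison like the one summarized on this slide could be sketched as follows. The talk does not give the exact definition of a casual contributor or the record layout, so the threshold `casual_max_builds`, the `(project, user, passed)` tuples, and the function name are hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def success_rates(builds, casual_max_builds=1):
    """Return {project: (casual_rate, non_casual_rate)}.

    Assumption: a contributor is "casual" in a project when they have at
    most `casual_max_builds` builds there. A rate is None when the group
    has no builds in that project.
    """
    per_user = Counter((project, user) for project, user, _ in builds)
    tally = defaultdict(lambda: {"casual": [0, 0], "non_casual": [0, 0]})
    for project, user, passed in builds:
        is_casual = per_user[(project, user)] <= casual_max_builds
        counts = tally[project]["casual" if is_casual else "non_casual"]
        counts[0] += int(passed)  # passed builds
        counts[1] += 1            # total builds

    def rate(counts):
        return counts[0] / counts[1] if counts[1] else None

    return {p: (rate(g["casual"]), rate(g["non_casual"])) for p, g in tally.items()}


# Hypothetical build records, invented for this example.
builds = [
    ("proj", "alice", True),
    ("proj", "alice", False),
    ("proj", "alice", True),
    ("proj", "bob", True),
]
rates = success_rates(builds)
print(rates["proj"])
```

Each project can then be placed in one of the three groups from the chart (no difference, higher casual success, lower casual success) by comparing the two rates, ideally with a statistical test rather than a raw comparison.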

Slide 85

Slide 85 text

Casual contributions are smaller, both in modified source-code lines and modified files [Charts: median of modified LoC; median of modified files]

Slide 86

Slide 86 text

Projects in which casuals fail more than non-casuals run more jobs per build [Chart: median of jobs per build for projects with no difference, higher casual success, and lower casual success] @gustavopinto

Slide 87

Slide 87 text

Thanks! @gustavopinto

Slide 88

Slide 88 text

Work Practices, Challenges, and Research Opportunities in Continuous Integration Gustavo Pinto @gustavopinto gustavopinto.org