Work Practices, Challenges, and Research Opportunities in Continuous Integration


Gustavo Pinto

July 10, 2018

Transcript

  1. Work Practices, Challenges, and Research Opportunities in Continuous Integration Gustavo

    Pinto @gustavopinto gpinto@ufpa.br gustavopinto.org
  2. @gustavopinto

  3. @gustavopinto commit

  4. @gustavopinto commit

  5. @gustavopinto commit commit

  6. @gustavopinto commit commit pull

  7. @gustavopinto pull commit pull commit

  8. @gustavopinto pull commit pull commit 25M GitHub projects use Travis

    CI
  9. @gustavopinto pull commit pull commit

  10. @gustavopinto This process should be entirely automated! pull commit

  11. @gustavopinto "Continuous Integration doesn’t get rid of bugs, but it

    does make them dramatically easier to find and remove"
  12. @gustavopinto "Continuous Integration doesn’t get rid of bugs, but it

    does make them dramatically easier to find and remove" "Continuous Integration is one of the key XP practices. I’ve been selling books talking about it since the 90s"
  13. "Continuous Integration doesn’t get rid of bugs, but it does

    make them dramatically easier to find and remove" "Continuous Integration is one of the key XP practices. I’ve been selling books talking about it since the 90s" "Continuous Integration makes sure that our test case suite is in good shape!" @gustavopinto
  14. Continuous Integration Continuous Delivery Continuous Deployment @gustavopinto

  15. Continuous Integration Continuous Delivery Continuous Deployment @gustavopinto DevOps Microservices SaaS

  16. Continuous Integration Continuous Delivery Continuous Deployment DevOps Satisfaction Transparency Confidence

    Microservices SaaS @gustavopinto
  17. Most popular PLs on @gustavopinto

    C, C++, C#, Clojure, CoffeeScript, Erlang, Go, Haskell, Java, JavaScript, Objective-C, PHP, Perl, Python, Ruby, Scala, TypeScript
  18. Most popular PLs on @gustavopinto

    C, C++, C#, Clojure, CoffeeScript, Erlang, Go, Haskell, Java, JavaScript, Objective-C, PHP, Perl, Python, Ruby, Scala, TypeScript
  19. Most popular PLs on @gustavopinto

    C, C++, C#, Clojure, CoffeeScript, Erlang, Go, Haskell, Java, JavaScript, Objective-C, PHP, Perl, Python, Ruby, Scala, TypeScript. 50 most popular projects … 750 most popular projects
  20. @gustavopinto Not software projects Not active projects

  21. 666 @gustavopinto

  22. 666 38% 48% 32% @gustavopinto

  23. 666 38% 48% 32% 17% 13% 18% @gustavopinto

  24. 666 38% 48% 32% 17% 13% 18% About 25% of all

    builds have failed! @gustavopinto
  25. 666 1,100 random failing builds @gustavopinto

  26. 666 1,100 random failing builds @gustavopinto

  27. Dear Andrei, we saw that your pull request #33292 broke the

    build of the rails/rails project. Since we are studying the reasons behind broken builds, would you mind answering a few questions? Link here! Thank you, Gustavo Pinto
    Hi Gustavo! Sure, I’m happy to help. Andrei @gustavopinto
  28. Background: (Q1) Current position (Q2) Work for (Q3) How often contribute to open source?

    Experience with CI: (Q4) How familiar with CI? (Q5) How many OSS projects contributed to with CI? (Q6) Have configured any project to use CI? (Q7) Why do you use CI? @gustavopinto
  29. Reasons for build breakage? (Q12) What are the technical reasons? (Q13) What are the social reasons?

    Benefits and Challenges: (Q14) What are the benefits of using CI? (Q15) What are the challenges of using CI? @gustavopinto
  30. 1,100 emails sent @gustavopinto

  31. 1,100 emails sent @gustavopinto

  32. 158 responses (~14% response rate!) 1,100 emails sent @gustavopinto

  33. @gustavopinto

  34. @gustavopinto

  35. codes @gustavopinto

  36. @gustavopinto

  37. Respondents Background @gustavopinto

  38. Experience with CI @gustavopinto

  39. Experience with CI @gustavopinto

  40. Experience with CI: Reasons for adoption @gustavopinto

    Reasons: Speed up software dev.; Improve software quality; Catch regressions/bugs; Enforce automation; Credibility. Quotes: "Quick feedback on produced development for adding business value"; "running *all* tests locally is taking too long (10+ mins)"
  41. Experience with CI: Reasons for adoption @gustavopinto

    Reasons: Speed up software dev.; Improve software quality; Catch regressions/bugs; Enforce automation; Credibility. Quote: "ensuring both, mine and other people changes, do not break anything"
  42. Experience with CI: Reasons for adoption @gustavopinto

    Reasons: Speed up software dev.; Improve software quality; Catch regressions/bugs; Enforce automation; Credibility. Quotes: "Much easier regression testing in complex projects"; "Prevent to release a bug software into production environment"
  43. Experience with CI: Reasons for adoption @gustavopinto

    Reasons: Speed up software dev.; Improve software quality; Catch regressions/bugs; Enforce automation; Credibility. Quote: "To make sure tests are always run, since humans can be inconsistent on that"
  44. Experience with CI: Reasons for adoption @gustavopinto

    Reasons: Speed up software dev.; Improve software quality; Catch regressions/bugs; Enforce automation; Credibility. Quote: "it makes the project look more legit"
  45. Technical reasons for build breakage @gustavopinto

    Reasons: Inadequate testing; Version changes; Dependency management; Complex code base; Git usage. Quotes: "Badly written [tests] that fail with minor bugfixes"; "not enough tests"
  46. Technical reasons for build breakage @gustavopinto

    Reasons: Inadequate testing; Version changes; Dependency management; Complex code base; Git usage. Quote: "the version of a language component is different, and the change made to the language cause breakage"
  47. Technical reasons for build breakage @gustavopinto

    Reasons: Inadequate testing; Version changes; Dependency management; Complex code base; Git usage. Quote: "people sometimes don’t update dependencies, so the Continuous Integration server detects errors that do not happen locally"
  48. Technical reasons for build breakage @gustavopinto

    Reasons: Inadequate testing; Version changes; Dependency management; Complex code base; Git usage. Quote: "unfamiliarity with the architecture of the code and overall module interactions"
  49. Technical reasons for build breakage @gustavopinto

    Reasons: Inadequate testing; Version changes; Dependency management; Complex code base; Git usage. Quote: "git rebase and left the work on hold for too long"
  50. Social reasons for build breakage @gustavopinto

    Reasons: Carelessness; Lack of testing culture; Overconfidence; Lack of communication; It’s just fine. Quote: "just be happy I’m committing to the project, somebody else can test if what I did works"
  51. Social reasons for build breakage @gustavopinto

    Reasons: Carelessness; Lack of testing culture; Overconfidence; Lack of communication; It’s just fine. Quote: "people not running the tests and build on their machines before pushing the changes"
  52. Social reasons for build breakage @gustavopinto

    Reasons: Carelessness; Lack of testing culture; Overconfidence; Lack of communication; It’s just fine. Quote: "this is only a small fix, it should not break anything"
  53. Social reasons for build breakage @gustavopinto

    Reasons: Carelessness; Lack of testing culture; Overconfidence; Lack of communication; It’s just fine. Quote: "not knowing who to ask for help"
  54. Social reasons for build breakage @gustavopinto

    Reasons: Carelessness; Lack of testing culture; Overconfidence; Lack of communication; It’s just fine. Quote: "CI is there for you not to be afraid for broken builds in a branch"
  55. Social reasons for build breakage @gustavopinto

    Reasons: Carelessness; Lack of testing culture; Overconfidence; Lack of communication; It’s just fine. Quotes: "CI is there for you not to be afraid for broken builds in a branch"; "Only if it is not in the master branch"
  56. Benefits of CI @gustavopinto

    Benefits: Catch problems early; Cross-platform testing; Confidence; Automation; Software quality. Quote: "Being aware of when/where breakage occurs greatly accelerates solution"
  57. Benefits of CI @gustavopinto

    Benefits: Catch problems early; Cross-platform testing; Confidence; Automation; Software quality. Quote: "compiling for multiple different targets (x86, arm, windows, linux etc)"
  58. Benefits of CI @gustavopinto

    Benefits: Catch problems early; Cross-platform testing; Confidence; Automation; Software quality. Quotes: "compiling for multiple different targets (x86, arm, windows, linux etc)"; "this is not feasible or cost-effective to do manually"
  59. Benefits of CI @gustavopinto

    Benefits: Catch problems early; Cross-platform testing; Confidence; Automation; Software quality. Quotes: "Some sort of confidence that introduced changes don’t break the current behavior"; "But depends on the coverage"
  60. Benefits of CI @gustavopinto

    Benefits: Catch problems early; Cross-platform testing; Confidence; Automation; Software quality. Quote: "performing a wide range of manual steps that a human would not normally be bothered to check"
  61. Benefits of CI @gustavopinto

    Benefits: Catch problems early; Cross-platform testing; Confidence; Automation; Software quality. Quote: "When you use CI, you have a good health check in your code base […] It is a warrant of quality of your code"
  62. Challenges of CI @gustavopinto

    Challenges: Set up environment; False sense of confidence; Discipline; Monetary costs; Flaky tests. Quote: "I found it hard to setup some distributed/multicomponent tests. This partially can be resolved by the containers"
  63. Challenges of CI @gustavopinto

    Challenges: Set up environment; False sense of confidence; Discipline; Monetary costs; Flaky tests. Quote: "Over-reliance on a passing build result can encourage a reviewer to merge code without a thorough review."
  64. Challenges of CI @gustavopinto

    Challenges: Set up environment; False sense of confidence; Discipline; Monetary costs; Flaky tests. Quote: "instead of only focusing on the code of your application, you also focus on the code of your build system"
  65. Challenges of CI @gustavopinto

    Challenges: Set up environment; False sense of confidence; Discipline; Monetary costs; Flaky tests. Quote: "Hosted services like Travis CI are not free if you want to use them at a bigger scale or you need to maintain by yourself"
  66. Challenges of CI @gustavopinto

    Challenges: Set up environment; False sense of confidence; Discipline; Monetary costs; Flaky tests. Quote: "Tests can sometimes be flaky, which is frustrating"
  67. @gustavopinto Reason for adoption: Speed up software dev.; Catch bugs/regressions; Credibility; Transparency

  68. @gustavopinto Reason for adoption: Speed up software dev.; Catch bugs/regressions; Credibility; Transparency

    Why build breaks: Inadequate testing; Dependency management; Carelessness; Overconfidence
  69. @gustavopinto Reason for adoption: Speed up software dev.; Catch bugs/regressions; Credibility; Transparency

    Why build breaks: Inadequate testing; Dependency management; Carelessness; Overconfidence. Benefits: Catch problems; Cross-platform testing; Confidence; Automation
  70. @gustavopinto Reason for adoption: Speed up software dev.; Catch bugs/regressions; Credibility; Transparency

    Why build breaks: Inadequate testing; Dependency management; Carelessness; Overconfidence. Benefits: Catch problems; Cross-platform testing; Confidence; Automation. Challenges: Set up environment; False sense of confidence; Discipline; Monetary costs
  71. @gustavopinto

  72. Confidence @gustavopinto

  73. False sense of confidence @gustavopinto

  74. Overconfidence @gustavopinto

  75. How does (over/lack of) confidence in CI impact software development

    practice? @gustavopinto
  76. Some non-software projects use CI. Why? @gustavopinto

  77. Some non-software projects use CI. Why?

    Devs that broke the build are more experienced. Really? @gustavopinto
  78. Some non-software projects use CI. Why?

    Devs that broke the build are more experienced. Really? Some PLs have a higher proportion of broken builds. Why? @gustavopinto
  79. Some non-software projects use CI. Why?

    Devs that broke the build are more experienced. Really? Some PLs have a higher proportion of broken builds. Why? Some commits that lead to broken builds are not broken. How? @gustavopinto
  80. Are casual contributors more prone to create a failing build?

    Research Question MSR’17 @gustavopinto
  81. Methodology TravisTorrent TravisCI CI Build Data Committer Data Dataset Dataset

    without duplicated users User Disambiguation Technique Data Cleaning Dataset with 1,074 curated projects Data Processing Data Statistical Tests @gustavopinto
  82. 1,074 projects 619,370 builds 35,360 contributors @gustavopinto

  83. 1,074 projects, 619,370 builds, 35,360 contributors @gustavopinto

    [Charts: # Users by number of builds (1, 2, 3, 4, 5+); # Builds for casual vs. non-casual contributors]
  84. Being a casual contributor is not a strong indicator for

    creating failing builds [Chart: number of projects with no difference, higher casual success, and lower casual success] @gustavopinto
  85. Casual contributions are smaller, both in modified source-code lines and modified files

    [Charts: median of modified LoC; median of modified files]
  86. Projects in which casuals fail more than non-casuals run more jobs per build

    [Chart: median of jobs per build for no difference, higher casual success, and lower casual success] @gustavopinto
  87. Thanks! @gustavopinto

  88. Work Practices, Challenges, and Research Opportunities in Continuous Integration Gustavo

    Pinto @gustavopinto gpinto@ufpa.br gustavopinto.org
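
The comparison behind slides 80 to 84 (are casual contributors more prone to create a failing build?) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's actual pipeline: the record layout, the function name `failure_rates`, and the cutoff `casual_max_builds` are all hypothetical, and the slides do not say how "casual" was defined or which statistical tests were applied.

```python
from collections import Counter, defaultdict


def failure_rates(builds, casual_max_builds=1):
    """Return (casual_rate, non_casual_rate) of failing builds.

    builds: iterable of (contributor, passed) pairs, one per CI build.
    casual_max_builds: hypothetical cutoff; a contributor with at most this
    many builds is treated as casual. The slides do not state the real cutoff.
    A rate is None when its group has no builds.
    """
    builds = list(builds)
    # Count how many builds each contributor triggered.
    builds_per_user = Counter(user for user, _ in builds)

    totals = defaultdict(int)
    failures = defaultdict(int)
    for user, passed in builds:
        is_casual = builds_per_user[user] <= casual_max_builds
        group = "casual" if is_casual else "non_casual"
        totals[group] += 1
        if not passed:
            failures[group] += 1

    def rate(group):
        return failures[group] / totals[group] if totals[group] else None

    return rate("casual"), rate("non_casual")


# Made-up sample data, purely for illustration.
sample = [
    ("alice", True), ("alice", False), ("alice", True), ("alice", True),
    ("bob", False),
    ("carol", True),
]
casual_rate, non_casual_rate = failure_rates(sample)
print(casual_rate, non_casual_rate)  # 0.5 0.25
```

In a real analysis the two rates would be compared per project with a statistical test before drawing any conclusion; the sketch stops at the descriptive step.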