
Work Practices, Challenges, and Research Opportunities in Continuous Integration


Gustavo Pinto

July 10, 2018


Transcript

  1. Work Practices, Challenges, and Research Opportunities in
    Continuous Integration
    Gustavo Pinto
    @gustavopinto
    [email protected]
    gustavopinto.org


  2. @gustavopinto


  3. @gustavopinto
    commit


  4. @gustavopinto
    commit


  5. @gustavopinto
    commit
    commit


  6. @gustavopinto
    commit
    commit
    pull


  7. @gustavopinto
    pull
    commit
    pull
    commit


  8. @gustavopinto
    pull
    commit
    pull
    commit
    25M GitHub projects use
    Travis CI


  9. @gustavopinto
    pull
    commit
    pull
    commit


  10. @gustavopinto
    This process should be entirely automated!
    pull
    commit


  11. @gustavopinto
    Continuous Integration doesn’t get rid of bugs, but it
    does make them dramatically easier to find and remove


  12. @gustavopinto
    Continuous Integration doesn’t get rid of bugs, but it
    does make them dramatically easier to find and remove
    Continuous Integration is one of the key XP
    practices. I’ve been selling books talking about it
    since the 90s


  13. Continuous Integration doesn’t get rid of bugs, but it
    does make them dramatically easier to find and remove
    Continuous Integration is one of the key XP
    practices. I’ve been selling books talking about it
    since the 90s
    Continuous Integration makes sure that our test case
    suite is in good shape!
    @gustavopinto


  14. Continuous Integration
    Continuous Delivery
    Continuous Deployment
    @gustavopinto


  15. Continuous Integration
    Continuous Delivery
    Continuous Deployment
    @gustavopinto
    DevOps Microservices SaaS


  16. Continuous Integration
    Continuous Delivery
    Continuous Deployment
    DevOps
    Satisfaction Transparency Confidence
    Microservices SaaS
    @gustavopinto


  17. C
    C++
    C#
    Clojure
    CoffeeScript
    Erlang
    Go
    Haskell
    Java
    JavaScript
    Objective-C
    PHP
    Perl
    Python
    Ruby
    Scala
    TypeScript
    Most popular PLs on
    @gustavopinto


  18. C
    C++
    C#
    Clojure
    CoffeeScript
    Erlang
    Go
    Haskell
    Java
    JavaScript
    Objective-C
    PHP
    Perl
    Python
    Ruby
    Scala
    TypeScript
    Most popular PLs on
    @gustavopinto


  19. C
    C++
    C#
    Clojure
    CoffeeScript
    Erlang
    Go
    Haskell
    Java
    JavaScript
    Objective-C
    PHP
    Perl
    Python
    Ruby
    Scala
    TypeScript
    Most popular PLs on
    50 most popular projects
    750 most popular projects
    @gustavopinto


  20. @gustavopinto
    Not software projects Not active projects


  21. 666
    @gustavopinto


  22. 666
    38%
    48%
    32%
    @gustavopinto


  23. 666
    38%
    48%
    32%
    17%
    13%
    18%
    @gustavopinto


  24. 666
    38%
    48%
    32%
    17%
    13%
    18%
    About 25% of all builds have failed!
    @gustavopinto


  25. 666
    1,100 random failing builds
    @gustavopinto


  26. 666
    1,100 random failing builds
    @gustavopinto


  27. Dear Andrei,
    we saw that your pull-request #33292 broke the build of
    the rails/rails project.
    Since we are studying the reasons behind broken builds,
    would you mind answering a few questions? Link here!
    Thank you,
    Gustavo Pinto
    Hi Gustavo! Sure, I’m happy to help.
    Andrei
    @gustavopinto


  28. Background
    Experience with CI
    (Q1) Current position
    (Q2) Work for
    (Q3) How often contribute to open source?
    (Q4) How familiar with CI?
    (Q5) How many OSS projects contributed to with CI?
    (Q6) Have configured any project to use CI?
    (Q7) Why do you use CI?
    @gustavopinto


  29. Reasons for build breakage?
    (Q12) What are the technical reasons?
    (Q13) What are the social reasons?
    @gustavopinto
    Benefits and Challenges
    (Q14) What are the benefits of using CI?
    (Q15) What are the challenges of using CI?


  30. 1,100 emails sent
    @gustavopinto


  31. 1,100 emails sent
    @gustavopinto


  32. 158 responses (~14% response rate!)
    1,100 emails sent
    @gustavopinto
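
The response rate on this slide follows directly from the two counts shown (158 responses, 1,100 emails sent). A minimal sketch of that arithmetic; the function name `response_rate` is an illustrative assumption, not something from the talk:

```python
def response_rate(responses: int, invitations: int) -> float:
    """Return the response rate as a percentage of the invitations sent."""
    if invitations <= 0:
        raise ValueError("invitations must be positive")
    return 100.0 * responses / invitations


# Counts taken from the slide: 158 responses out of 1,100 emails.
rate = response_rate(158, 1100)
print(f"{rate:.1f}%")  # prints "14.4%"
```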


  33. @gustavopinto


  34. @gustavopinto


  35. codes
    @gustavopinto


  36. @gustavopinto


  37. Respondents Background
    @gustavopinto


  38. Experience with CI
    @gustavopinto


  39. Experience with CI
    @gustavopinto


  40. Experience with CI: Reasons for adoption
    Quick feedback on produced
    development for adding
    business value
    running *all* tests locally is
    taking too long (10+ mins)
    Speed up software dev.
    Improve software quality
    Catch regressions/bugs
    Enforce automation
    Credibility
    @gustavopinto


  41. Experience with CI: Reasons for adoption
    ensuring that both my own and other
    people's changes do not break
    anything
    Speed up software dev.
    Improve software quality
    Catch regressions/bugs
    Enforce automation
    Credibility
    @gustavopinto


  42. Experience with CI: Reasons for adoption
    Much easier regression
    testing in complex
    projects
    Prevent releasing buggy
    software into the production
    environment
    Speed up software dev.
    Improve software quality
    Catch regressions/bugs
    Enforce automation
    Credibility
    @gustavopinto


  43. Experience with CI: Reasons for adoption
    To make sure tests are always
    run, since humans can be
    inconsistent on that
    Speed up software dev.
    Improve software quality
    Catch regressions/bugs
    Enforce automation
    Credibility
    @gustavopinto


  44. Experience with CI: Reasons for adoption
    it makes the project
    look more legit
    Speed up software dev.
    Improve software quality
    Catch regressions/bugs
    Enforce automation
    Credibility
    @gustavopinto


  45. Technical reasons for build breakage
    @gustavopinto
    Inadequate Testing
    Version changes
    Dependency Management
    Complex code base
    Git usage
    Badly written tests that fail with
    minor bugfixes
    not enough tests


  46. Technical reasons for build breakage
    @gustavopinto
    Inadequate Testing
    Version changes
    Dependency Management
    Complex code base
    Git usage
    the version of a language
    component is different, and
    the change made to the
    language causes breakage


  47. Technical reasons for build breakage
    @gustavopinto
    Inadequate Testing
    Version changes
    Dependency Management
    Complex code base
    Git usage
    people sometimes don’t update
    dependencies, so the Continuous
    Integration server detects errors
    that do not happen locally
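
The respondent's point, that a stale local environment can hide errors the CI server sees, can be illustrated by comparing pinned versions against what is installed locally. This is only a sketch under stated assumptions: the function `find_dependency_drift`, the package names, and the version numbers are all hypothetical and do not come from the survey.

```python
from __future__ import annotations


def find_dependency_drift(
    pinned: dict[str, str], installed: dict[str, str]
) -> dict[str, tuple[str, str | None]]:
    """Map each drifting package to (pinned version, local version or None)."""
    drift = {}
    for name, wanted in pinned.items():
        have = installed.get(name)  # None means the package is missing locally
        if have != wanted:
            drift[name] = (wanted, have)
    return drift


# Hypothetical data: the project pins one version, the developer has another.
pinned = {"requests": "2.19.1", "six": "1.11.0"}
installed = {"requests": "2.18.4", "six": "1.11.0"}

print(find_dependency_drift(pinned, installed))
# prints {'requests': ('2.19.1', '2.18.4')}
```

In a real project the `installed` mapping could be gathered with `importlib.metadata.version`, and the `pinned` mapping parsed from whatever lock or requirements file the project uses.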


  48. Technical reasons for build breakage
    @gustavopinto
    Inadequate Testing
    Version changes
    Dependency Management
    Complex code base
    Git usage
    unfamiliarity with the
    architecture of the code and
    overall module interactions


  49. Technical reasons for build breakage
    @gustavopinto
    Inadequate Testing
    Version changes
    Dependency Management
    Complex code base
    Git usage
    git rebase and left the work on
    hold for too long


  50. Social reasons for build breakage
    just be happy I’m committing to the
    project, somebody else can test if
    what I did works
    Carelessness
    Lack of testing culture
    Overconfidence
    Lack of communication
    It’s just fine
    @gustavopinto


  51. Social reasons for build breakage
    Carelessness
    Lack of testing culture
    Overconfidence
    Lack of communication
    It’s just fine
    @gustavopinto
    people not running the tests and
    build on their machines before
    pushing the changes


  52. Social reasons for build breakage
    Carelessness
    Lack of testing culture
    Overconfidence
    Lack of communication
    It’s just fine
    @gustavopinto
    this is only a small fix, it should not
    break anything


  53. Social reasons for build breakage
    Carelessness
    Lack of testing culture
    Overconfidence
    Lack of communication
    It’s just fine
    @gustavopinto
    not knowing who to ask for help


  54. Social reasons for build breakage
    @gustavopinto
    Carelessness
    Lack of testing culture
    Overconfidence
    Lack of communication
    It’s just fine
    CI is there for you not to be afraid
    of broken builds in a branch


  55. Social reasons for build breakage
    @gustavopinto
    Carelessness
    Lack of testing culture
    Overconfidence
    Lack of communication
    It’s just fine
    CI is there for you not to be afraid
    of broken builds in a branch
    Only if it is not in the master branch


  56. Benefits of CI
    @gustavopinto
    Catch problems early
    Cross-platform testing
    Confidence
    Automation
    Software quality
    Being aware of when/where
    breakage occurs greatly accelerates
    solution


  57. Benefits of CI
    @gustavopinto
    Catch problems early
    Cross-platform testing
    Confidence
    Automation
    Software quality
    compiling for multiple different targets
    (x86, arm, windows, linux etc)


  58. Benefits of CI
    @gustavopinto
    Catch problems early
    Cross-platform testing
    Confidence
    Automation
    Software quality
    compiling for multiple different targets
    (x86, arm, windows, linux etc)
    this is not feasible or cost-effective to
    do manually


  59. Benefits of CI
    @gustavopinto
    Catch problems early
    Cross-platform testing
    Confidence
    Automation
    Software quality
    Some sort of confidence that
    introduced changes don’t break the
    current behavior
    But depends on the coverage


  60. Benefits of CI
    @gustavopinto
    Catch problems early
    Cross-platform testing
    Confidence
    Automation
    Software quality
    performing a wide range of manual
    steps that a human would not
    normally be bothered to check


  61. Benefits of CI
    @gustavopinto
    Catch problems early
    Cross-platform testing
    Confidence
    Automation
    Software quality
    When you use CI, you have a good health
    check in your code base […] It is a
    warrant of quality of your code


  62. Challenges of CI
    @gustavopinto
    Set up environment
    False sense of confidence
    Discipline
    Monetary costs
    Flaky tests
    I found it hard to set up some
    distributed/multicomponent tests.
    This can partially be resolved by
    containers


  63. Challenges of CI
    @gustavopinto
    Set up environment
    False sense of confidence
    Discipline
    Monetary costs
    Flaky tests
    Over-reliance on a passing build result can
    encourage a reviewer to merge code
    without a thorough review.


  64. Challenges of CI
    @gustavopinto
    Set up environment
    False sense of confidence
    Discipline
    Monetary costs
    Flaky tests
    instead of only focusing on the code of your
    application, you also focus on the code of
    your build system


  65. Challenges of CI
    @gustavopinto
    Set up environment
    False sense of confidence
    Discipline
    Monetary costs
    Flaky tests
    Hosted services like Travis CI are not free if
    you want to use them at a bigger scale, or
    you need to maintain it yourself


  66. Challenges of CI
    @gustavopinto
    Set up environment
    False sense of confidence
    Discipline
    Monetary costs
    Flaky tests
    Tests can sometimes be flaky, which is
    frustrating
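
One common way to spot a flaky test is to rerun it on unchanged code and check whether the outcome varies. The sketch below illustrates that idea only; `classify_test` and `sometimes_fails` are hypothetical names, and the intermittent failure is simulated with a call counter so the example stays deterministic.

```python
from typing import Callable


def classify_test(test: Callable[[], None], runs: int = 10) -> str:
    """Run `test` `runs` times and report 'pass', 'fail', or 'flaky'."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outcomes = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add(True)
        except AssertionError:
            outcomes.add(False)
    if outcomes == {True}:
        return "pass"
    if outcomes == {False}:
        return "fail"
    return "flaky"  # both passing and failing runs were observed


calls = {"n": 0}


def sometimes_fails() -> None:
    """Simulated flaky test: fails on every third call."""
    calls["n"] += 1
    assert calls["n"] % 3 != 0, "simulated intermittent failure"


print(classify_test(sometimes_fails, runs=6))  # prints "flaky"
```

Rerunning only detects flakiness that shows up within the chosen number of runs, so a "pass" result here is not proof that a test is stable.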


  67. @gustavopinto
    Speed up
    software dev.
    Catch bugs/
    regressions
    Credibility Transparency
    Reason for adoption


  68. @gustavopinto
    Inadequate
    testing
    Dependency
    management
    Carelessness Overconfidence
    Why builds break
    Speed up
    software dev.
    Catch bugs/
    regressions
    Credibility Transparency
    Reason for adoption


  69. @gustavopinto
    Catch problems
    Cross-platform
    testing
    Confidence Automation
    Benefits
    Inadequate
    testing
    Dependency
    management
    Carelessness Overconfidence
    Why builds break
    Speed up
    software dev.
    Catch bugs/
    regressions
    Credibility Transparency
    Reason for adoption


  70. @gustavopinto
    Catch problems
    Cross-platform
    testing
    Confidence Automation
    Benefits
    Set up
    environment
    False sense of
    confidence
    Discipline
    Monetary
    costs
    Challenges
    Inadequate
    testing
    Dependency
    management
    Carelessness Overconfidence
    Why builds break
    Speed up
    software dev.
    Catch bugs/
    regressions
    Credibility Transparency
    Reason for adoption


  71. @gustavopinto


  72. Confidence
    @gustavopinto


  73. Confidence
    False sense of
    @gustavopinto


  74. Confidence
    Over
    @gustavopinto


  75. How does (over/lack of)
    confidence in CI impact
    software development
    practice?
    @gustavopinto


  76. Some non-software projects use CI
    Why?
    @gustavopinto


  77. Some non-software projects use CI
    Devs that broke the build are more experienced
    Why?
    Really?
    @gustavopinto


  78. Some non-software projects use CI
    Devs that broke the build are more experienced
    Some PLs have a higher proportion of broken builds
    Why?
    Really?
    Why?
    @gustavopinto


  79. Some non-software projects use CI
    Devs that broke the build are more experienced
    Some PLs have a higher proportion of broken builds
    Some commits that lead to broken builds are not broken
    Why?
    Really?
    Why?
    How?
    @gustavopinto


  80. Are casual contributors
    more prone to create a
    failing build?
    Research Question
    MSR’17 @gustavopinto


  81. Methodology
    TravisTorrent
    TravisCI
    CI Build Data
    Committer Data
    Dataset
    Dataset without duplicated users
    User Disambiguation Technique
    Data Cleaning
    Dataset with 1,074 curated projects
    Data Processing
    Data
    Statistical Tests
    @gustavopinto


  82. 1,074 projects 619,370 builds
    35,360 contributors
    @gustavopinto


  83. 1,074 projects 619,370 builds
    35,360 contributors
    [Charts: number of users by number of builds (1, 2, 3, 4, 5+); number of builds by casual vs. non-casual contributors]
    @gustavopinto


  84. Being a casual contributor is not a strong indicator
    for creating failing builds
    [Chart: number of projects by outcome (No difference, Higher Casual Success, Lower Casual Success)]
    @gustavopinto
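
The comparison behind this slide can be sketched as a per-group build success rate. The slides do not state how "casual" is defined, so the threshold below (a contributor with exactly one build) is an assumption made purely for illustration, as are the function name `success_rates` and the sample data. The sketch computes rates only and does not reproduce the statistical tests named in the methodology.

```python
from collections import Counter, defaultdict
from typing import Iterable


def success_rates(
    builds: Iterable[tuple[str, bool]], casual_max_builds: int = 1
) -> dict[str, float]:
    """Return the build success rate for casual and non-casual contributors.

    `builds` holds (contributor, passed) pairs. A contributor counts as
    casual when they have at most `casual_max_builds` builds (assumed rule).
    """
    builds = list(builds)
    builds_per_user = Counter(user for user, _ in builds)
    totals = defaultdict(int)
    passed_counts = defaultdict(int)
    for user, passed in builds:
        casual = builds_per_user[user] <= casual_max_builds
        group = "casual" if casual else "non_casual"
        totals[group] += 1
        passed_counts[group] += int(passed)
    return {group: passed_counts[group] / totals[group] for group in totals}


# Hypothetical build log: three one-off contributors and one regular.
sample = [
    ("ana", True), ("bob", False), ("carol", True),
    ("dave", True), ("dave", False), ("dave", True), ("dave", True),
]
rates = success_rates(sample)
print(f"casual: {rates['casual']:.2f}, non-casual: {rates['non_casual']:.2f}")
# prints "casual: 0.67, non-casual: 0.75"
```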


  85. Casual contributions are smaller, both in modified
    source-code lines and modified files
    [Charts: median of modified LoC; median of modified files]


  86. Projects in which casuals fail more than non-casuals
    run more jobs per build
    [Chart: median of jobs per build by group (No difference, Higher Casual Success, Lower Casual Success)]
    @gustavopinto


  87. Thanks!
    @gustavopinto


  88. Work Practices, Challenges, and Research Opportunities in
    Continuous Integration
    Gustavo Pinto
    @gustavopinto
    [email protected]
    gustavopinto.org
