Continuous Delivery: 5 years later (Incontro DevOps 2018)

#idi2018

Giovanni Toraldo

March 09, 2018

Transcript

  1. 2.

    About me
    Giovanni Toraldo: open source enthusiast, Java coder, writer of the OpenNebula book, lead developer & co-founder at Cloudesire.com, shooting at a 2 euro coin from 36 meters with a medieval crossbow
  3. 5.

    What is Continuous Delivery?
    “Continuous delivery (CD) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time. It aims at building, testing, and releasing software faster and more frequently. The approach helps reduce the cost, time, and risk of delivering changes by allowing for more incremental updates to applications in production.”
    https://en.wikipedia.org/wiki/Continuous_delivery
  5. 7.

    Continuous Delivery top-level checklist
    • Code quality should be kept high
      ◦ Write tests ad nauseam to avoid regressions
      ◦ Don’t Repeat Yourself
      ◦ Keep It Simple, Stupid
      ◦ Don’t reinvent the wheel
    • Automate (almost) everything
      ◦ Time spent writing code is never lost
      ◦ Project building must be automated
      ◦ Test phase must be automated
      ◦ Deploy/Release should be automated
    • Team awareness (no working in silos)
      ◦ Everyone should know what is going on
      ◦ Monitoring
      ◦ Metrics
  6. 8.

    Our development pipeline
    1. Pick an open issue
    2. Write code and tests in a new branch
    3. Open a pull request
    4. Wait for the first green build
    5. Ask for a code review
    6. Handle received feedback
    7. Merge code into master branch
    8. Have green builds on dependent projects
    9. If needed, test on the staging environment (especially the UX)
    10. Release to end users
  7. 9.

    Issue management (GitHub)
    • One catch-all repository for high-level issues (epics, stories), easily understandable by non-developers
    • We use issue labels to mark:
      ◦ Bug/enhancement/task
      ◦ Affected components
      ◦ Effort estimate
        ▪ Light
        ▪ Medium
        ▪ Heavy
        ▪ Epic
      ◦ Project (customer) assignment
      ◦ Workflow steps
        ▪ Triaged
        ▪ Ready (to be released)
        ▪ Closed (released)
  9. 11.

    Issue management
    • We plan weekly sprints, where developers pick up new issues and treat them as priorities (GitHub milestones)
    • Technical details are discussed in the project repository
    • Workflow labels:
      ◦ Triaged: someone has looked at the issue and defined its impact
      ◦ Awaiting-feedback: more upstream information is required to proceed
      ◦ Backend-ready: backend code is merged into master
      ◦ Frontend-ready: frontend code is merged into master
      ◦ Ready: the issue will be in the next release
  12. 14.

    Pull request and code review
    Code review is a revolution for code quality and team awareness
    • Different eyes catch different bugs
    • Keeps lazy coding away
    • Knowledge sharing
    • Awareness of code changes
    • Asynchronous interaction between developers
  14. 16.

    Continuous Integration - Jenkins
    Our previous experience with Jenkins:
    • Build parallelization bound to the number of nodes
      ◦ More nodes needed to increase the parallelization of a single project
    • Fragile isolation across projects
      ◦ Hack to use different ports for each project
    • Failed builds may interfere with subsequent builds
      ◦ Hack to ensure everything gets cleaned up properly
    • Toolchains need to coexist and be maintained
      ◦ Beefy node configuration recipes
      ◦ rbenv, nvm, pyenv and similar are a must
    • GitHub PR plugin integration broke multiple times
      ◦ Leaving us in a panic
  15. 17.

    Continuous Integration - CircleCI
    The day after all the Jenkins GitHub builds broke, we decided to migrate CI to a paid service and keep the headaches only for the code we write
    Enhancements of CircleCI over Jenkins:
    • Build isolation guaranteed by the platform
    • Parallelization bound to the purchased plan
    • No infrastructure to manage
    • Project build configuration versioned in the repository
    • Finally a decent UI
  16. 18.

    CircleCI versions
    CircleCI 1.0 (EOL August 31st, 2018)
    • Simple builds
    • Node image based on Ubuntu 14.04 with all toolchains preinstalled
    CircleCI 2.0
    • Multi-step builds (workflows)
    • Docker-based building
      ◦ CircleCI-maintained images
      ◦ Docker Hub images
    • Cron-based builds (nightly)
  22. 24.

    Static code analysis - SonarQube
    Why static code analysis matters:
    • Catch bugs automatically
    • Learn new things
    • Enforce standards
    With SonarQube it is possible to:
    • Run the analysis on every build for every project
    • Add new warnings to the PR as comments
    • Generate reports about the current status of projects
  27. 30.

    Application packaging - Docker
    Experimenting with Docker since v1.0.0 (2014)
    Nowadays, every project:
    • Has a docker-compose.yml launched via autoenv
    • Builds a Docker image and runs integration tests against it
    • Pushes the image to the registry as the final build step
      ◦ The build number is the image tag
    • Gets released via Chef recipes
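
The last build steps above (build the image, tag it with the build number, push it) could be sketched roughly as follows. This is only an illustration, not the actual build script: the registry host, the project name, and both helper functions are hypothetical. `CIRCLE_BUILD_NUM` is the build-number variable that CircleCI exposes; using it as the tag source is an assumption about the setup.

```python
import os


def image_reference(registry: str, project: str, build_number: int) -> str:
    """Compose an image reference whose tag is the CI build number."""
    if build_number < 0:
        raise ValueError("build number must not be negative")
    return f"{registry}/{project}:{build_number}"


def build_and_push_commands(image: str, context: str = ".") -> list:
    """Return the docker CLI invocations for the final build step."""
    return [
        ["docker", "build", "--tag", image, context],
        ["docker", "push", image],
    ]


if __name__ == "__main__":
    # Hypothetical registry and project names, for illustration only.
    build = int(os.environ.get("CIRCLE_BUILD_NUM", "0"))
    image = image_reference("registry.example.com", "backend", build)
    for command in build_and_push_commands(image):
        print(" ".join(command))
```

Because the tag is just the build number, a deployed container can always be traced back to the CI build that produced it.
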
  28. 31.

    Configuration Management - Chef
    Opscode Chef configuration used since day 0.
    • Chef-zero (serverless)
    • Provision via fabric script
      ◦ Sourcing node SSH details from ssh-config
      ◦ Copy packaged cookbooks via rsync
      ◦ Run chef-zero
    • Platform modules deployed as Docker containers
      ◦ Consul + registrator to access containers running on different hosts
      ◦ Consul-template to auto-register upstreams in nginx
    • All cookbooks tested via test-kitchen + serverspec (migrating to inspec)
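
The three provisioning steps above (SSH details from ssh-config, rsync of the packaged cookbooks, a chef-zero run) might look like the following sketch. It is not the fabric script from the talk: it only builds the command lines with the standard library, and the function name, the remote directory layout, and the environment name are assumptions made for illustration.

```python
import shlex


def provision_commands(host: str, ssh_config: str, cookbooks_dir: str,
                       remote_dir: str, environment: str) -> list:
    """Build the copy and run commands of one provisioning run for `host`."""
    # Both steps read the node's connection details from the same ssh-config file.
    ssh = ["ssh", "-F", ssh_config]
    # Copy the packaged cookbooks to the node, removing stale files.
    copy = ["rsync", "--archive", "--compress", "--delete",
            "-e", shlex.join(ssh),
            cookbooks_dir.rstrip("/") + "/", f"{host}:{remote_dir}/"]
    # Run chef-client in local mode (chef-zero) from the copied directory.
    remote = (f"cd {shlex.quote(remote_dir)} && "
              f"chef-client --local-mode --environment {shlex.quote(environment)}")
    run = ssh + [host, remote]
    return [copy, run]
```

Each returned list can be handed to `subprocess.run(command, check=True)`, so that a failed copy stops the run before chef-client starts.
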
  29. 32.

    Infrastructure as code - Terraform
    Design, implement, and deploy infrastructure with known software best practices:
    • Code versioning
    • Code reuse (modularization/abstraction)
    • Code sharing
    In order to achieve:
    • Repeatability
    • Speed
    • Reliability
  31. 34.

    Automatic release of new versions
    Releasing a new version means restarting containers with updated images. Image versions are stored inside the Chef environment JSON, which is modified by a hand-made script that:
    • Iterates over CircleCI projects looking for the last successful build
    • Ensures that the latest master builds of dependencies are green (no regressions)
    • Retrieves the commit list between the current version and the next release via the GitHub API
      ◦ Prints which new commits are going into this release
      ◦ References support tickets for automatic closing
    • Commits the version change and runs the deploy job (parametrized for different envs)
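
The core of such a release script could be sketched as below. This is not the script from the talk. The build dictionaries are a simplified stand-in for CI API responses, the `images` key inside `default_attributes` is a hypothetical location for the image versions, and the check on dependent projects is left out. The compare endpoint is the GitHub REST API call that lists commits between two revisions.

```python
import copy


def latest_green_build(builds: list, branch: str = "master"):
    """Return the newest successful build on `branch`, or None."""
    green = [b for b in builds
             if b["branch"] == branch and b["status"] == "success"]
    return max(green, key=lambda b: b["build_num"], default=None)


def compare_url(owner: str, repo: str, base: str, head: str) -> str:
    """GitHub REST endpoint listing the commits between two revisions."""
    return f"https://api.github.com/repos/{owner}/{repo}/compare/{base}...{head}"


def bump_versions(environment: dict, builds_by_project: dict) -> dict:
    """Return a copy of the environment with image tags set to the latest green builds.

    Projects without a green build keep their current version.
    """
    updated = copy.deepcopy(environment)
    # Hypothetical layout: default_attributes.images maps project -> image tag.
    images = updated.setdefault("default_attributes", {}).setdefault("images", {})
    for project, builds in builds_by_project.items():
        build = latest_green_build(builds)
        if build is not None:
            images[project] = str(build["build_num"])
    return updated
```

Returning a modified copy instead of editing in place makes it easy to diff the old and new environments and print what is about to change before committing.
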
  33. 36.

    Error reporting - Sentry
    Error tracking platform:
    • Automatic event aggregation
    • Full-text search on error messages, function names, stack traces
    • Links errors to GitHub issues
    • Remembers if an error was already marked as fixed (regression)
  36. 39.

    API monitoring - updown.io
    Dead-simple HTTP health check service
    • Multiple locations
    • Slack/Email alerts
    • Beautiful report pages
    • Public status pages
  40. 43.

    Team communication - Slack
    Slack as a team communication tool and notification system
    • #dev: developer communication, notifications from GitHub and CircleCI
    • #exceptions: high-priority notifications from Sentry and updown.io
    • #ops: notifications from Chef runs, RSS feeds of external systems' status
    • #marketplace: notifications generated by the Cloudesire platform for a specific environment
    • #bugs: notifications for issue creation/closing and deployments to production environments
  46. 49.

    Metrics - Prometheus + Grafana
    Prometheus is a monitoring system / time-series database
    • Server scrapes and stores time series data
    • Metrics exporters for nodes and applications (nginx, mysql, etc.)
    • Client libraries to instrument code
    • Alert manager to send notifications
    • Node autodiscovery via Consul
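
To make the "instrument code, then let the server scrape" idea above concrete, here is a minimal standard-library sketch of a counter rendered in the Prometheus text exposition format. The `Counter` class and the metric name are hypothetical. In a real project the official client libraries do this (and much more), and label values would also need escaping, which this sketch skips.

```python
class Counter:
    """A monotonically increasing metric, keyed by label values."""

    def __init__(self, name: str, help_text: str, label_names: tuple = ()):
        self.name = name
        self.help_text = help_text
        self.label_names = label_names
        self._values = {}

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        """Increase the series identified by `labels`; counters never decrease."""
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(labels[name] for name in self.label_names)
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self) -> str:
        """Render all series as the text a scrape endpoint would return."""
        lines = [f"# HELP {self.name} {self.help_text}",
                 f"# TYPE {self.name} counter"]
        for key, value in sorted(self._values.items()):
            labels = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            suffix = f"{{{labels}}}" if labels else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"
```

An application would call `inc()` wherever the event happens and serve `render()` on an HTTP path (conventionally `/metrics`) that the Prometheus server scrapes.
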
  48. 51.

    Centralized Logging - Graylog
    cat | grep, anyone? Centralized logging is impossible to avoid when you have multiple backends
    Graylog is an open source log management platform
    • Elasticsearch backend for log storage/indexing
    • MongoDB backend for webapp persistence
    • Rsyslog TCP+SSL input
    • GELF input for application logs
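
As a rough illustration of what an application sends to a GELF input, the sketch below builds a GELF 1.1 message with the standard library. The function names and the choice of UDP transport are assumptions (the slides do not say which transport is used), and 12201 is only the conventional GELF port.

```python
import json
import socket
import time


def gelf_message(host: str, short_message: str, level: int = 6, **extra) -> bytes:
    """Encode a GELF 1.1 payload; extra fields get the required "_" prefix."""
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    for key, value in extra.items():
        if key == "id":
            raise ValueError('"_id" is a reserved GELF field name')
        payload[f"_{key}"] = value
    return json.dumps(payload).encode("utf-8")


def send_udp(message: bytes, server: str, port: int = 12201) -> None:
    """Send one uncompressed, unchunked GELF datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (server, port))
```

The extra fields (for example `gelf_message("backend-1", "order failed", level=3, order_id=42)`) become searchable fields on the Graylog side, which is the main advantage over grepping plain log lines.
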
  50. 53.

    Future?
    Things currently on our radar:
    • Kubernetes/Mesos to replace Chef orchestration
      ◦ Avoid complexity in Chef recipes
      ◦ Scalable/fault-tolerant infrastructure
    • Kotlin to replace Java for backend modules
      ◦ Concise code
      ◦ Backward compatibility
    • Fully remote developers
      ◦ Currently working 2 days a week from home