Continuous Delivery: 5 years later (Incontro DevOps 2018)

#idi2018

Giovanni Toraldo

March 09, 2018

Transcript

  1. Continuous Delivery
    5 years later

  2. About me
    Giovanni Toraldo
    Open Source enthusiast, Java
    coder, author of the OpenNebula
    book, lead developer and
    co-founder at Cloudesire.com,
    shooting at a 2 euro coin from 36
    meters with a medieval crossbow

  3. Monetization & Brokering
    Platform for immediate
    SaaSification and automated
    distribution of business
    applications and services

  5. What is Continuous Delivery?
    “Continuous delivery (CD) is a software engineering approach
    in which teams produce software in short cycles, ensuring
    that the software can be reliably released at any time. It
    aims at building, testing, and releasing software faster and
    more frequently. The approach helps reduce the cost, time,
    and risk of delivering changes by allowing for more
    incremental updates to applications in production.”
    https://en.wikipedia.org/wiki/Continuous_delivery

  7. Continuous Delivery top-level checklist
    ● Code quality should be kept high
    ○ Write tests until nausea to avoid regressions
    ○ Don’t Repeat Yourself
    ○ Keep It Simple Stupid
    ○ Don’t reinvent the wheel
    ● Automate (almost) everything
    ○ Time spent writing code is never lost
    ○ Project building must be automated
    ○ Test phase must be automated
    ○ Deploy/Release should be automated
    ● Team awareness (no working in silos)
    ○ Everyone should know what is going on
    ○ Monitoring
    ○ Metrics

  8. Our development pipeline
    1. Pick an open issue
    2. Write code and tests in a new branch
    3. Open a pull request
    4. Wait for the first green build
    5. Ask for a code review
    6. Handle received feedback
    7. Merge code into master branch
    8. Have green builds on dependent projects
    9. If needed, test on the staging environment (especially the UX)
    10. Release to end-users

  9. Issue management (GitHub)
    ● One catch-all repository for high-level issues (epic, stories),
    easily understandable by non-developers
    ● We use issue labels to mark:
    ○ Bug/enhancement/task
    ○ Affected components
    ○ Effort estimate
    ■ Light
    ■ Medium
    ■ Heavy
    ■ Epic
    ○ Project (customer) assignment
    ○ Workflow steps
    ■ Triaged
    ■ Ready (to be released)
    ■ Closed (released)

  11. Issue management
    ● We plan weekly sprints, where developers pick up new issues
    and treat them as priorities (GitHub milestones)
    ● Technical details are discussed in the project repository
    ● Workflow labels:
    ○ Triaged: someone has looked at the issue and defined its impact
    ○ Awaiting-feedback: more upstream information is required to proceed
    ○ Backend-ready: backend code is merged into master
    ○ Frontend-ready: frontend code is merged into master
    ○ Ready: the issue will be in the next release

  14. Pull request and code review
    Code review is a revolution for code quality and team awareness
    ● Different eyes catch different bugs
    ● Push code laziness away
    ● Knowledge sharing
    ● Code changes awareness
    ● Asynchronous interaction between developers

  16. Continuous Integration - Jenkins
    Our previous experience with Jenkins:
    ● Build parallelization bound to number of nodes
    ○ More nodes to increase parallelization of a single project
    ● Isolation across projects is fragile
    ○ Hack to use different ports for each project
    ● Failed builds may interfere with subsequent builds
    ○ Hack to ensure everything gets cleaned up properly
    ● Toolchains need to coexist and be maintained
    ○ Beefy node configuration recipes
    ○ rbenv, nvm, pyenv and similar are a must
    ● Github PR plugin integration broke multiple times
    ○ Leaving us in panic

  17. Continuous Integration - CircleCI
    The day after all the Jenkins GitHub builds broke, we decided to
    migrate CI to a paid service and keep headaches only for the code
    we write
    Enhancements of CircleCI over Jenkins:
    ● Build isolation guaranteed by the platform
    ● Parallelization bound to the purchased plan
    ● No infrastructure to manage
    ● Project build configuration versioned into repository
    ● Finally a decent UI

  18. CircleCI versions
    CircleCI 1.0 (EOL August 31st, 2018)
    ● Simple builds
    ● Node image based on Ubuntu 14.04 with all toolchains
    preinstalled
    CircleCI 2.0
    ● Multi-step builds (workflows)
    ● Docker based building
    ○ CircleCI maintained images
    ○ Docker hub images
    ● Cron-based builds (nightly)

  24. Static code analysis - SonarQube
    Why static code analysis matters:
    ● Catch bugs automatically
    ● Learn new things
    ● Enforce standards
    With SonarQube it is possible to:
    ● Run on every build for every project
    ● Add new warnings to PRs as comments
    ● Generate reports about the current status of projects

  29. FUN: CTO “suggesting” to write good code

  30. Application packaging - Docker
    We have been experimenting with Docker since v1.0.0 (2014)
    Nowadays, every project:
    ● Has a docker-compose.yml launched via autoenv
    ● Builds a Docker image and runs integration tests against it
    ● Pushes the image to the registry as the final build step
    ○ Build number is the image tag
    ● Gets released via Chef recipes
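
The "build number is the image tag" convention can be sketched in a few lines of Python. This is only an illustration: the registry host and project name are hypothetical, and the fallback value is an assumption for local runs. CIRCLE_BUILD_NUM is the environment variable CircleCI sets for each build.

```python
import os


def image_reference(registry: str, project: str, build_number: str) -> str:
    """Compose 'registry/project:tag', using the CI build number as the tag."""
    if not build_number.isdigit():
        raise ValueError(f"build number must be numeric, got {build_number!r}")
    return f"{registry}/{project}:{build_number}"


if __name__ == "__main__":
    # CIRCLE_BUILD_NUM is provided by CircleCI; "0" is an assumed local fallback.
    build = os.environ.get("CIRCLE_BUILD_NUM", "0")
    # registry.example.com and "backend" are hypothetical names.
    print(image_reference("registry.example.com", "backend", build))
```

Because the tag grows with every build, a deployed image can always be traced back to the CI build that produced it.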

  31. Configuration Management - Chef
    Opscode Chef has been used for configuration since day 0.
    ● Chef-zero (serverless)
    ● Provisioning via a Fabric script
    ○ Sourcing node ssh details from ssh-config
    ○ Copy packaged cookbooks via rsync
    ○ Run chef-zero
    ● Platform modules deployed as docker containers
    ○ Consul + registrator to access containers running on different hosts
    ○ Consul-template to autoregister upstream on nginx
    ● All cookbooks tested via test-kitchen + serverspec (migrating to
    inspec)
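
The provisioning steps above (read SSH details from ssh-config, rsync the cookbooks, run chef-zero) could look roughly like the sketch below. It is not the talk's Fabric script: it only builds the two commands with the standard library so they can be inspected. The host name, remote directory, cookbook path and environment name are hypothetical defaults.

```python
import shlex


def provision_commands(host, ssh_config="ssh-config", cookbooks="cookbooks/",
                       remote_dir="/var/chef", environment="staging"):
    """Return the rsync and ssh commands for one chef-zero run (nothing is executed)."""
    # Step 1: copy the packaged cookbooks, reusing the ssh-config for connection details.
    copy_cookbooks = [
        "rsync", "-az", "--delete",
        "-e", f"ssh -F {ssh_config}",
        cookbooks, f"{host}:{remote_dir}/cookbooks/",
    ]
    # Step 2: run chef-client in local (chef-zero) mode on the node.
    remote = (
        f"cd {shlex.quote(remote_dir)} && "
        f"sudo chef-client --local-mode --environment {shlex.quote(environment)}"
    )
    run_chef = ["ssh", "-F", ssh_config, host, remote]
    return [copy_cookbooks, run_chef]


if __name__ == "__main__":
    # "web-1" is a hypothetical host alias defined in ssh-config.
    for command in provision_commands("web-1"):
        print(shlex.join(command))
```

Returning the commands as lists keeps the sketch side-effect free; a real script would pass each list to a runner such as Fabric or subprocess.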

  32. Infrastructure as code - Terraform
    Design, implement, and deploy infrastructure with known software
    best practices:
    ● Code versioning
    ● Code reuse (modularization/abstraction)
    ● Code sharing
    In order to achieve:
    ● Repeatability
    ● Speed
    ● Reliability

  34. Automatic release of new versions
    Releasing a new version means restarting containers with updated
    images. Image versions are stored inside the Chef environment JSON,
    which is modified by a hand-made script that:
    ● Iterates over CircleCI projects looking for the last successful build
    ● Ensures that the latest master dependencies are green (no regressions)
    ● Retrieves the commit list via the GitHub API between the current version and
    the next release
    ○ Prints which new commits are going into this release
    ○ References support tickets for automatic closing
    ● Commits the version change and runs the deploy job (parametrized for
    different envs)
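
The core of such a release script, the version bump inside the Chef environment JSON, might be sketched as follows. This is an illustration under stated assumptions, not the actual script. The "image_versions" attribute name and the shape of latest_builds are hypothetical, and the CircleCI and GitHub API calls are left out, with their results passed in as plain data.

```python
import copy


def plan_release(environment, latest_builds):
    """Return (updated_environment, changes) without mutating the input.

    environment:   parsed Chef environment JSON (a dict)
    latest_builds: {project: (last_successful_build, master_is_green)}
    """
    updated = copy.deepcopy(environment)
    # "image_versions" is a hypothetical attribute name used for illustration.
    versions = updated.setdefault("default_attributes", {}).setdefault("image_versions", {})
    changes = []
    for project, (build_number, master_is_green) in sorted(latest_builds.items()):
        if not master_is_green:
            # Refuse to release while a dependency is red (possible regression).
            raise RuntimeError(f"{project}: latest master build is not green")
        current = versions.get(project)
        if current != build_number:
            changes.append((project, current, build_number))
            versions[project] = build_number
    return updated, changes


if __name__ == "__main__":
    env = {"name": "staging",
           "default_attributes": {"image_versions": {"backend": "100", "frontend": "50"}}}
    new_env, changes = plan_release(env, {"backend": ("101", True), "frontend": ("50", True)})
    for project, old, new in changes:
        print(f"{project}: {old} -> {new}")
```

Keeping the planning step pure makes it easy to print the pending changes for review before anything is committed or deployed.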

  36. Error reporting - Sentry
    Error tracking platform:
    ● Automatic event aggregation
    ● Full-text search on error messages, function names, stack traces
    ● Links errors to GitHub issues
    ● Remembers whether an error was already marked as fixed (regression)

  39. API monitoring - updown.io
    Dead-simple HTTP health check service
    ● Multiple locations
    ● Slack/Email alerts
    ● Beautiful report pages
    ● Public status pages
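
For readers unfamiliar with HTTP health checks, the idea can be sketched with the Python standard library. This is not how updown.io is implemented: the "up/slow/down" states and the latency threshold are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def classify(status_code, elapsed_seconds, slow_after=2.0):
    """Map one probe result to 'up', 'slow' or 'down' (hypothetical states)."""
    if status_code is None or not 200 <= status_code < 300:
        return "down"
    return "slow" if elapsed_seconds > slow_after else "up"


def probe(url, timeout=5.0):
    """Issue a single GET request and classify the outcome."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # the server answered, but with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        status = None  # no answer at all: DNS failure, refused connection, timeout
    return classify(status, time.monotonic() - started)
```

A hosted service adds what this sketch lacks: probes from multiple locations, alerting, and report and status pages.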

  43. Team communication - Slack
    Slack as a team communication tool and notification system
    ● #dev: developer communication, notifications from GitHub and
    CircleCI
    ● #exceptions: high-priority notifications from Sentry and updown.io
    ● #ops: notifications from Chef runs, RSS feeds of external system
    status
    ● #marketplace: notifications generated by the Cloudesire platform for
    a specific environment
    ● #bugs: notifications of issue creation/closing and deployments to
    production environments
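
The channel layout above amounts to a routing table from notification source to channel, sketched below. The source keys and the message dictionary shape are hypothetical. In practice each tool's own Slack integration is configured with its target channel.

```python
# Routing table derived from the channel list above; the keys are hypothetical source names.
CHANNELS = {
    "github": "#dev",
    "circleci": "#dev",
    "sentry": "#exceptions",
    "updown.io": "#exceptions",
    "chef": "#ops",
    "status-rss": "#ops",
    "cloudesire": "#marketplace",
    "issues": "#bugs",
    "deploy": "#bugs",
}


def route(source, text):
    """Return the channel and text for a notification coming from `source`."""
    try:
        channel = CHANNELS[source]
    except KeyError:
        raise ValueError(f"no channel configured for source {source!r}") from None
    return {"channel": channel, "text": text}
```

For example, route("sentry", "NullPointerException in billing") lands in #exceptions, separate from the routine build notifications in #dev.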

  49. Metrics - Prometheus + Grafana
    Prometheus is a monitoring system / time-series database
    ● Server scrapes and stores time series data
    ● Metrics exporters for nodes and applications (nginx, MySQL, etc.)
    ● Client libraries to instrument code
    ● Alertmanager to send notifications
    ● Node autodiscovery via Consul

  51. Centralized Logging - Graylog
    cat | grep, anyone?
    Impossible to avoid when you have multiple backends
    Graylog is an open source log management platform
    ● ElasticSearch backend for logs storage/indexing
    ● MongoDB backend for webapp persistence
    ● Rsyslog TCP+SSL input
    ● GELF input for application logs
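
As an illustration of what an application sends to a GELF input, the sketch below builds a GELF 1.1 payload with the standard library. The host and field names are hypothetical, and transport (UDP, TCP or HTTP) is left out. In a real project a logging handler library would usually do this.

```python
import json
import time


def gelf_message(short_message, host, level=6, **extra):
    """Build a GELF 1.1 JSON payload; extra fields get the required '_' prefix."""
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),  # seconds since the epoch
        "level": level,            # syslog severity, 6 = informational
    }
    for key, value in extra.items():
        if key == "id":
            # The GELF specification does not allow "_id" as an additional field.
            raise ValueError("'id' cannot be used as an additional field")
        message[f"_{key}"] = value
    return json.dumps(message)


if __name__ == "__main__":
    # "api-1", "billing" and "production" are hypothetical values.
    print(gelf_message("Payment failed", "api-1", level=3,
                       module="billing", environment="production"))
```

Structured additional fields such as _module are what make the logs searchable per component once they are indexed.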

  53. Future?
    Things currently on our radar:
    ● Kubernetes/Mesos to replace Chef orchestration
    ○ Avoid complexity in Chef recipes
    ○ Scalable/fault-tolerant infrastructure
    ● Kotlin to replace Java for backend modules
    ○ Concise code
    ○ Backward-compatibility
    ● Full-remote developers
    ○ Currently working 2 days a week from home

  54. Questions?