Easing into continuous deployment

How we moved our team from static deployments into continuous deployment.

Chris Keathley

July 28, 2017

Transcript

  1. Continuous Deployment Chris Keathley / @ChrisKeathley / c@keathley.io

  2. I work with a distributed team

  3. I work with a distributed team

  9. Warehouse

  10. Warehouse API

  11. Warehouse API Apps

  12. The problem

  13. Slow Iteration Cycle Deployment Deployment 2 weeks

  14. Slow Iteration Cycle Deployment Deployment Deployment 2 weeks 2 weeks

  15. Slow Iteration Cycle Deployment Deployment Deployment 3 weeks

  16. Slow Iteration Cycle Deployment Deployment Deployment Hopefully someday

  17. Large PRs

  18. Unsure about state of the application

  19. Unsure about state of the application

  20. Unsure about state of the application

  21. Unsure about state of the application

  22. Unsure about state of the application

  23. Rollbacks are a scam

  26. Data Migration

  27. Data Migration ?

  28. Don’t do this

  29. Always move forward

  30. Always move forward

  31. Always move forward

  32. The goal should never be to roll back a deployment

  33. The goal is to minimize the damage done by any given deployment

  34. There are bugs in your system

  35. Solutions

  36. We needed to deploy more often

  37. So we did

  38. Automated Deployment

  39. What do you deploy?

  40. Commit Sha

  42. Jars

  43. Artifacts

  44. Git Tags

  45. Containers

  46. Your App

  47. Your App Server

  49. Master Branch How we merge our code PR

  50. CI Github Registry Container Slack PR Notification

  51. CI Kubernetes Deploy Auto-deploy Green builds of master

  52. CI Kubernetes Deploy Auto-deploy Green builds of master Service A Service B

  53. CI Kubernetes Deploy Auto-deploy Green builds of master Service B

  54. CI Kubernetes Deploy Auto-deploy Green builds of master Service B Service A

  55. CI Kubernetes Deploy Auto-deploy Green builds of master Service A

  56. CI Kubernetes Deploy Auto-deploy Green builds of master Service A Service B

  57. Tests & Metrics

  58. Integration Tests + Property Tests

  59. Integration Tests Test App DB Service

  60. Modeling Users as FSMs logged_out logged_in login logout vote
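
A minimal sketch of the state machine named on the slide above. The slide lists only the states (`logged_out`, `logged_in`) and events (`login`, `logout`, `vote`), so the transitions here are assumptions: `login` and `logout` switch between the two states, and `vote` is only valid while logged in and leaves the state unchanged. The module name `UserFSM` and its functions are hypothetical, not the talk's code.

```elixir
defmodule UserFSM do
  # Assumed transitions; the slide names the states and events only.
  def next_state(:logged_out, :login), do: {:ok, :logged_in}
  def next_state(:logged_in, :logout), do: {:ok, :logged_out}
  def next_state(:logged_in, :vote), do: {:ok, :logged_in}
  def next_state(_state, _event), do: :error

  # Events that the model allows from the given state.
  def valid_events(state) do
    for event <- [:login, :logout, :vote],
        match?({:ok, _}, next_state(state, event)),
        do: event
  end

  # Generates a random sequence of `steps` valid events, starting logged out.
  def random_walk(steps) when is_integer(steps) and steps > 0 do
    {events, _final_state} =
      Enum.map_reduce(1..steps, :logged_out, fn _step, state ->
        event = Enum.random(valid_events(state))
        {:ok, next} = next_state(state, event)
        {event, next}
      end)

    events
  end
end
```

A generated walk can then be replayed against the real application, with the model deciding which actions are legal at each step.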

  61. Property Tests Add Todo Edit Todo Delete Todo

  62. Property Tests Add Todo Edit Todo Delete Todo

  63. Property Tests Add Todo Edit Todo Delete Todo

  64. Property Tests Add Todo Edit Todo Delete Todo

  65. Generate Commands

  66. Generated Commands [{:add_todo, "Test Todo", 1}, {:edit_todo, "Edited", 2}, {:delete_todo, "", 1}, {:add_todo, "New Todo", 3}, {:delete_todo, "", 2}, {:edit_todo, "Edited Todo", 2}]

  67. Generate Commands

  68. Generate Commands

  69. Generate Commands

  70. Generate Commands

  71. Generate Commands

  72. Generated Commands [{:add_todo, "Test Todo", 1}, {:edit_todo, "Edited", 2}, {:delete_todo, "", 1}, {:add_todo, "New Todo", 3}, {:delete_todo, "", 2}, {:edit_todo, "Edited Todo", 2}]

  73. Generated Commands [{:add_todo, "Test Todo", 1}, {:delete_todo, "", 2}] [{:add_todo, "Test Todo", 1}, {:edit_todo, "Edited", 2}, {:delete_todo, "", 1}, {:add_todo, "New Todo", 3}, {:delete_todo, "", 2}, {:edit_todo, "Edited Todo", 2}]

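
The step from a long generated command list to a short failing one is usually called shrinking. The sketch below shows one simple way it could work, not the property-testing library used in the talk. Everything in it is hypothetical: the module `TodoShrinkSketch`, the reference model (a map of id to title), and a deliberately buggy system under test whose delete ignores the id and drops the oldest todo.

```elixir
defmodule TodoShrinkSketch do
  # Reference model: a map of id => title with the intended behaviour.
  def model_step({:add_todo, title, id}, todos), do: Map.put(todos, id, title)

  def model_step({:edit_todo, title, id}, todos) do
    if Map.has_key?(todos, id), do: Map.put(todos, id, title), else: todos
  end

  def model_step({:delete_todo, _title, id}, todos), do: Map.delete(todos, id)

  # Hypothetical system under test, stored as a list of {id, title} tuples.
  def system_step({:add_todo, title, id}, todos), do: List.keystore(todos, id, 0, {id, title})
  def system_step({:edit_todo, title, id}, todos), do: List.keyreplace(todos, id, 0, {id, title})
  # Deliberate bug: ignores the id and drops the oldest todo.
  def system_step({:delete_todo, _title, _id}, todos), do: Enum.drop(todos, 1)

  # A command list fails when the system's final state differs from the model's.
  def fails?(commands) do
    model = Enum.reduce(commands, %{}, &model_step/2)
    system = Enum.reduce(commands, [], &system_step/2)
    Map.new(system) != model
  end

  # Greedy shrinking: drop one command at a time, keeping each removal
  # only if the shorter list still fails.
  def shrink(commands), do: shrink(commands, 0)

  defp shrink(commands, index) when index >= length(commands), do: commands

  defp shrink(commands, index) do
    candidate = List.delete_at(commands, index)
    if fails?(candidate), do: shrink(candidate, index), else: shrink(commands, index + 1)
  end
end
```

Under these assumptions, shrinking the six generated commands above leaves a two-command list: an add followed by a delete of a different id, the same shape as the shrunk example in the transcript.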
  74. Prometheus Service A Grafana Service B Service C

  75. Prometheus Service A Grafana Service B Service C Slack

  76. # Alert for any instance that has a 95th percentile latency > 200ms.
      ALERT APIHighRequestLatency
        IF api_http_request_latencies_second{quantile="0.95"} > 0.2
        FOR 5m
        ANNOTATIONS {
          summary = "High request latency on {{ $labels.instance }}",
          description = "{{ $labels.instance }} has a 95th percentile request latency above 200ms (current value: {{ $value }}s)",
        }
  77. Track “Business” Metrics

  79. Feature releases and flags

  81. Features aren’t all or nothing

  82. Features != Deployments

  83. Deployment

  84. Deployment Features

  85. Deployment Features

  86. User

  87. User staff?(user) == true

  88. User staff?(user) == false

  89. User staff?(user) == false

  90. defmodule MyApp.FeatureFlags do
        alias MyApp.User

        def foo_enabled?(%User{staff: is_staff}), do: is_staff
        def foo_enabled?(_), do: false

        def bar_enabled?(%User{staff: is_staff}), do: is_staff
        def bar_enabled?(_), do: false
      end
  91. Browser Feature Service

  92. Feature Service Feature Service Feature Service

  93. Feature Service Feature Service Feature Service

  94. Feature Service Feature Service Feature Service

  95. You have updates ready! Reset

  97. With larger traffic numbers you could use percentages
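
One way a percentage rollout could be sketched, assuming each user has a stable id. The module `PercentageFlag` and its `enabled?/3` function are hypothetical and not part of the talk's code. Hashing the flag name together with the user id puts each user in a fixed bucket from 0 to 99, so the same user gets the same answer on every request and different flags roll out to different subsets of users.

```elixir
defmodule PercentageFlag do
  # Returns true when the user falls inside the rollout percentage.
  # :erlang.phash2/2 maps any term to an integer in 0..99 here, and the
  # result is stable for the same {flag, user_id} pair.
  def enabled?(flag, user_id, percentage) when percentage in 0..100 do
    :erlang.phash2({flag, user_id}, 100) < percentage
  end
end
```

Raising the percentage only ever adds users to the enabled group, because a user's bucket does not change.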

  98. Alchemy

  99. “Transmute lead code into gold in production”

  100. Prior Art: https://github.com/github/scientist

  101. Users_Controller DB User.all

  102. DB User.all UserService.all

  103. User.all UserService.all ==

  104. def index(conn, _params) do
         users = old_query()
         render(conn, "index.json", users: users)
       end

  105. def index(conn, _params) do
         users =
           experiment("users-query")
           |> control(&old_query/0)
           |> candidate(&new_query/0)
           |> run()

         render(conn, "index.json", users: users)
       end

  106. def index(conn, _params) do
         users =
           experiment("users-query")
           |> control(&old_query/0)
           |> candidate(&new_query/0)
           |> candidate(&fancy_query/0)
           |> run()

         render(conn, "index.json", users: users)
       end

  107. Alchemy: 1) Shuffles test order 2) Runs each test in parallel 3) Exports the data

  108. DB User.all UserService.all Control Candidate Control UserController

  110. Measure: 1) Do the results match? 2) How long does each test take to return?

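
The steps above (shuffle, run in parallel, compare results, time each run) can be sketched in plain Elixir. This is an illustration under stated assumptions, not Alchemy's implementation: the module `ExperimentSketch`, its `run/3` function, and the shape of the report map are all hypothetical.

```elixir
defmodule ExperimentSketch do
  # Runs the control and every candidate, then returns
  # {control_value, report}. Callers only ever use the control's value.
  def run(name, control, candidates)
      when is_function(control, 0) and is_list(candidates) do
    observations =
      [{:control, control} | Enum.map(candidates, &{:candidate, &1})]
      # 1) Shuffle so no function always benefits from running first.
      |> Enum.shuffle()
      # 2) Run each function in its own task, in parallel.
      |> Enum.map(fn {role, fun} -> Task.async(fn -> observe(role, fun) end) end)
      |> Enum.map(&Task.await/1)

    control_obs = Enum.find(observations, &(&1.role == :control))

    # Do the results match? Count candidates that disagree with the control.
    mismatches =
      Enum.count(observations, &(&1.role == :candidate and &1.value != control_obs.value))

    # 3) The report is what would be exported to a metrics system.
    report = %{name: name, observations: observations, mismatches: mismatches}
    {control_obs.value, report}
  end

  # How long does each function take? :timer.tc/1 returns microseconds.
  defp observe(role, fun) do
    {duration_us, value} = :timer.tc(fun)
    %{role: role, value: value, duration_us: duration_us}
  end
end
```

This sketch does not isolate failures: a candidate that raises would crash the caller through the linked task, so a production version would need to rescue candidate errors and record them in the report instead.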
  111. No more cutovers

  112. DB User.all UserService.all

  113. DB User.all UserService.all User service

  114. Migrations

  115. http://blog.datomic.com/2017/01/the-ten-rules-of-schema-growth.html

  116. DB Schema App Application Coupling

  117. Your application knows about your schema

  118. Let's remove a column

  119. Let's remove a column 1) All application code needs to stop using that column

  120. Let's remove a column 1) All application code needs to stop using that column 2) Update all ETL processes

  121. Let's remove a column 1) All application code needs to stop using that column 2) Update all ETL processes 3) Update reporting

  122. Let's remove a column 1) All application code needs to stop using that column 2) Update all ETL processes 3) Update reporting 4) Remove the column

  123. Let's remove a column 1) All application code needs to stop using that column 2) Update all ETL processes 3) Update reporting 4) Remove the column. Split all of these up

  124. Let's add a column

  125. Let's add a column 1) Add the column

  126. Let's add a column 1) Add the column 2) Eventually start using it

  127. Prefer Additive Migrations

  128. CI Kubernetes Deploy Auto-deploy Green builds of master

  129. CI Kubernetes Deploy Auto-deploy Green builds of master Migration

  130. CI Kubernetes Deploy Auto-deploy Green builds of master Migration DB

  131. Chat-Ops

  133. Chat is…

  134. Chat is… Centralized

  135. Chat is… Centralized Transparent

  136. Chat is… Centralized Transparent Open

  137. Try to do operational tasks in chat

  139. defmodule Hedwig.Responders.Ping do
         use Hedwig.Responder

         @usage """
         hedwig: ping - Responds with 'pong'
         """
         respond ~r/ping$/i, msg do
           reply msg, "pong"
         end
       end

  141. Generate grafana graphs

  143. Deploy

  145. Team Building

  146. Conclusion

  147. These are tools at our disposal

  148. Deploy more often, safely

  149. Thanks Chris Keathley / @ChrisKeathley / c@keathley.io