Slide 1

Slide 1 text

Continuous Delivery @ Shopify
DevOps Ottawa Meetup
John Arthorne, Shopify Production Engineering
@jarthorne

Slide 2

Slide 2 text

Shopify Architecture
[Diagram: the Internet and a CDN in front of two data centers. Each data center contains edge routers, load balancers, web server hosts, job server hosts, and database hosts (DB writer, DB reader, DB standby).]

Slide 3

Slide 3 text

Dev Time Architecture

Slide 4

Slide 4 text

A tale of two environments

Slide 5

Slide 5 text

More environments
Environments: Dev, Test, Stage, Prod
Parity: App Code, +OS, +Container, +Hardware, +Database, +Middleware, +Traffic Volume, +Credentials

Slide 6

Slide 6 text

Enter continuous delivery
● You can’t be sure your code works until it is in production
● Minimize time to production for all changes
● Small batch sizes keep the risk low
● Dark launches, beta flags, ...
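The "dark launches, beta flags" bullet can be illustrated with a minimal sketch of a percentage-based flag check. This is not Shopify's implementation: the function names, the flag name `new_checkout`, and the hashing scheme are all assumptions chosen for illustration.

```python
import hashlib


def is_enabled(flag: str, shop_id: int, rollout_percent: int) -> bool:
    """Return True if the flag is on for this shop (hypothetical helper).

    Hashing flag + shop_id gives each shop a stable bucket in [0, 100),
    so a shop does not flip in and out of the beta between requests.
    """
    digest = hashlib.sha256(f"{flag}:{shop_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent


def checkout(shop_id: int) -> str:
    # The new code path is already deployed to production ("dark"),
    # but only runs for shops that fall inside the rollout percentage.
    if is_enabled("new_checkout", shop_id, rollout_percent=10):
        return "new checkout"
    return "old checkout"
```

Raising `rollout_percent` step by step exposes the change to more shops without another deploy, which is one way to keep each production change small.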

Slide 7

Slide 7 text

Shopify style continuous delivery
● Code handoffs slow us down and hurt problem determination
● Everyone in Shopify R&D can deploy
● Everyone in Shopify R&D must deploy
● Dedicated team to build the tools to enable everyone to ship their changes with confidence

Slide 8

Slide 8 text

Continuous delivery culture
● There is a higher level of chaos with CD
● Every dev takes ownership of ensuring their change lands safely
● Every dev needs access and permission to act
● ATC role is very helpful for herding the chaos

Slide 9

Slide 9 text

Mechanics of Shipping
● Develop in a localhost environment
● Push changes in a branch, make the test suite pass
● Code review
● Add to merge queue (or manual git merge)
● Deploy to production (usually automatic)
● Monitor/verify your changes

Slide 10

Slide 10 text

Local Development
● Big investment in tools to automate local dev setup
● Ensure it is easy to set up an env locally that is as close as possible to production

Slide 11

Slide 11 text

Getting ready to ship: Push to GitHub

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Getting ready to ship: Make a pull request

Slide 14

Slide 14 text

Value of code review
● Extra eyes catch mistakes missed during development
● Pushes code towards cultural/style norms
● Shared understanding of code - reduced bus factor

Slide 15

Slide 15 text

Getting things deployed: a pipeline built for speed
Pull Request -> Git Merge (5s) -> Image Build (5m) -> Automated Tests (5m) -> Deploy (5m)
Goal: Merged to deployed in 15 minutes

Slide 16

Slide 16 text

Deploy speed: webscale
Making this pipeline fast required considerable feats of engineering. Why is this important?
● Less wasted time for developers
● Faster time to a fix for merchants
● Fewer changes per deploy, so it’s safer

Slide 17

Slide 17 text

Batch Size vs Pipeline Speed
● 200 commits merged to shopify master on a busy day
● Commit every 2.4 minutes assuming 8 hour work day
● 3 minute deploy required for smallest batch size
● Builds have to keep getting faster to keep batch size down
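The arithmetic behind these numbers can be sketched as follows. The model is an assumption, not something stated on the slide: commits arrive evenly through the day and deploys run back to back, so every commit merged while one deploy is running lands in the next one. The function names are illustrative.

```python
def minutes_between_commits(commits_per_day: int, workday_hours: float) -> float:
    """Average gap between merges, assuming commits arrive evenly."""
    return workday_hours * 60 / commits_per_day


def commits_per_deploy(deploy_minutes: float, commit_interval: float) -> float:
    """Average batch size: commits that pile up while one deploy runs."""
    return deploy_minutes / commit_interval


# 200 commits over an 8 hour day is one commit every 2.4 minutes.
interval = minutes_between_commits(200, 8)

# With the 15 minute merged-to-deployed goal, each deploy carries
# several commits; at 3 minutes it carries close to one.
batch_at_goal = commits_per_deploy(15, interval)
batch_at_three = commits_per_deploy(3, interval)
```

Under this model the batch size grows linearly with both the commit rate and the pipeline duration, which is why the pipeline has to keep getting faster as more developers merge.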

Slide 18

Slide 18 text

Git Merge -> Container Build -> Automated Tests -> Deploy

Slide 19

Slide 19 text

As soon as you merge, Pipa will start building two Docker images: one for production and one for the automated tests.
Git Merge -> Container Build -> Automated Tests -> Deploy

Slide 20

Slide 20 text

Buildkite will run the 70,000+ automated tests. If the tests succeeded on your branch, they will likely succeed on master after merging as well. If not, the failure has to be investigated, and your merge may have to be reverted.
Git Merge -> Container Build -> Automated Tests -> Deploy

Slide 21

Slide 21 text

Buildkite
● Hosted build and test orchestration service
● Test agents run in parallel on our own GKE boxes
● Agents pull tests from Redis queue
● Ruby tests + Browser tests run with Selenium/Chrome
330 N1-standard-16 VMs | 7000 peak agents | 73k tests per build
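The "agents pull tests from a queue" pattern can be sketched with the standard library. This is only an illustration of the pull model: an in-process `queue.Queue` and threads stand in for the Redis queue and the Buildkite agents, and `run_test` is a hypothetical placeholder rather than a real test runner.

```python
import queue
import threading


def run_test(name: str) -> bool:
    # Hypothetical placeholder: a test passes unless its name
    # contains "broken".
    return "broken" not in name


def agent(tests: queue.Queue, failures: list, lock: threading.Lock) -> None:
    # Each agent keeps pulling the next test until the queue is empty,
    # so fast agents naturally take on more work than slow ones.
    while True:
        try:
            name = tests.get_nowait()
        except queue.Empty:
            return
        if not run_test(name):
            with lock:
                failures.append(name)


def run_build(test_names: list, agent_count: int) -> list:
    tests = queue.Queue()
    for name in test_names:
        tests.put(name)
    failures = []
    lock = threading.Lock()
    agents = [
        threading.Thread(target=agent, args=(tests, failures, lock))
        for _ in range(agent_count)
    ]
    for worker in agents:
        worker.start()
    for worker in agents:
        worker.join()
    return sorted(failures)
```

Pulling from a shared queue, rather than assigning a fixed slice of tests to each agent up front, avoids waiting on whichever agent happened to receive the slowest tests.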

Slide 22

Slide 22 text

Shipit automatically deploys code to production. Changes are deployed in parallel across 4 data centres, ~800 servers, and 500,000+ merchants.
Git Merge -> Container Build -> Automated Tests -> Deploy
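Deploying to several data centres in parallel can be sketched as a simple fan-out. This is not how Shipit is implemented: `deploy_to`, the data centre names, and the revision string are hypothetical, and a thread pool stands in for whatever mechanism actually drives the rollout.

```python
from concurrent.futures import ThreadPoolExecutor


def deploy_to(data_centre: str, revision: str) -> bool:
    # Hypothetical stand-in for the real per-data-centre deploy step.
    # Returns True on success.
    return True


def deploy_everywhere(data_centres: list, revision: str) -> dict:
    # One worker per data centre, so the total deploy time is roughly
    # that of the slowest data centre rather than the sum of all of them.
    workers = max(1, len(data_centres))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outcomes = list(pool.map(lambda dc: deploy_to(dc, revision), data_centres))
    return dict(zip(data_centres, outcomes))
```

For example, `deploy_everywhere(["dc1", "dc2", "dc3", "dc4"], "abc123")` returns a mapping from each data centre to its deploy result.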

Slide 23

Slide 23 text

Chat notifications

Slide 24

Slide 24 text

Deploy Dashboard

Slide 25

Slide 25 text

Deploy Dashboard

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

A successful deploy

Slide 29

Slide 29 text

A failed deploy

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

What if shit hits the fan?
● Lock automatic deploys
● Roll back to previously deployed version using Shipit
● Revert change in Git
● Always be communicating
● ATC and incident response team standing by to help
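The first two steps, locking deploys and rolling back, can be sketched as a small state model. The `DeployStack` class and its methods are hypothetical and are not Shipit's API; the sketch only shows why locking comes first: it stops new changes from shipping on top of a bad deploy while the rollback and the Git revert happen.

```python
class DeployStack:
    """Hypothetical model of a deploy target with a lock and a history."""

    def __init__(self) -> None:
        self.history = []        # deployed revisions, oldest first
        self.lock_reason = None  # None means deploys are allowed

    def lock(self, reason: str) -> None:
        # Block further deploys and record why, so others can see it.
        self.lock_reason = reason

    def unlock(self) -> None:
        self.lock_reason = None

    def deploy(self, revision: str) -> None:
        if self.lock_reason is not None:
            raise RuntimeError(f"deploys locked: {self.lock_reason}")
        self.history.append(revision)

    def rollback(self) -> str:
        # Rolling back is allowed even while locked.
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()
        return self.history[-1]

    def current(self) -> str:
        return self.history[-1]
```

In this model a typical incident sequence is `lock("investigating errors")`, then `rollback()`, then a revert in Git, and finally `unlock()` once the fix is merged.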

Slide 32

Slide 32 text

Summary
● It is impossible to simulate a production environment
● Strive to keep environment differences to a minimum
● Push smallest possible units of change to production continuously in order to validate code
● Invest in tools to keep it flowing smoothly

Slide 33

Slide 33 text

Continuous Delivery @ Shopify
DevOps Ottawa Meetup
John Arthorne, Shopify Production Engineering
@jarthorne