Shipping Code at Saltside: 2 Years on Docker

Slide 1

Slide 1 text

Shipping Code at Saltside: 2 Years on Docker Adam Hawkins - DevopsDays India - BLR - 2016-11-04

Slide 2

Slide 2 text

Hi, I’m Adam • Joined Saltside in 2013 • Lead SRE Team @ Saltside • @adman65 • http://blog.slashdeploy.com

Slide 3

Slide 3 text

This Talk • Saltside: A Curious Case • Our production infrastructure • Why we are the way we are (for better & worse) • Saltside’s future • How you can make a better future (for yourself and your team)

Slide 4

Slide 4 text

Saltside Org & Tech

Slide 5

Slide 5 text

One Product, One Code Base, Four Markets

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Quick Stats • 2 Product Development Offices • Hundreds of global employees, ~50 in PD • 1 Product, 4 Markets • 2 AWS Regions • SoA; 15+ production services • 5 production languages • ~300 production containers • Mix of cross-functional teams; 4 people in SRE Team (pssst. We need SRE!)

Slide 8

Slide 8 text

Q3 2014 • Rewrite of entire product and infrastructure from scratch starts (Never do this; no seriously, never do this!) • Docker 1.0 released • Introduce market-specific infrastructure • Shift to SoA • I am the quasi-architect for the new system

Slide 9

Slide 9 text

Why Docker? • Give engineers the freedom to choose the best stack for the problem • Infrastructure standardization • Dev & production parity. The development environment is make, docker, docker-compose, and a few others

Slide 10

Slide 10 text

2014 != 2016 • Mature container orchestration tools … lol no • official images … lol no • docker-machine … lol no • established best practices … lol no

Slide 11

Slide 11 text

Oh my, times have changed

Slide 12

Slide 12 text

Rolling Our Own: Apollo • Configuration file driven (similar to docker-compose) • Sets horizontal & vertical scales per market/container • It’s dynamic CloudFormation all the way down • One EC2 instance runs one Docker container • Every EC2 instance is behind an ELB • Zero-downtime deploys via HAProxy • Instances poll S3 for which image/tag to use; change containers where appropriate
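The "instances poll S3 for which image/tag to use" bullet can be sketched roughly as follows. This is a minimal illustration, not Apollo's actual code: the names `Desired`, `reconcile`, and `poll_once` are hypothetical, and the S3 read and the container swap are passed in as plain callables so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Desired:
    """Image and tag an instance is supposed to run (hypothetical shape)."""
    image: str
    tag: str

    @property
    def ref(self) -> str:
        return f"{self.image}:{self.tag}"


def reconcile(desired: Optional[Desired], running_ref: Optional[str]) -> str:
    """Decide what one instance should do on a single poll."""
    if desired is None:
        return "noop"      # nothing published yet; keep whatever is running
    if running_ref == desired.ref:
        return "noop"      # already on the requested image/tag
    return "replace"       # tag changed; swap the container


def poll_once(
    fetch_desired: Callable[[], Optional[Desired]],   # assumed: reads the S3 object
    get_running: Callable[[], Optional[str]],         # assumed: asks Docker what runs
    swap_container: Callable[[str], None],            # assumed: pulls and restarts
) -> str:
    """One iteration of the poll loop; a real agent would repeat this on a timer."""
    desired = fetch_desired()
    action = reconcile(desired, get_running())
    if action == "replace":
        swap_container(desired.ref)
    return action
```

Keeping the decision in a pure function like `reconcile` makes the "change containers where appropriate" rule easy to test without touching S3 or Docker.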

Slide 13

Slide 13 text

apollo • $ apollo deploy -e production -m bikroy -t 823714 • $ apollo update -e production -m bikroy
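For readers unfamiliar with this style of CLI, here is a rough sketch of how the two invocations above could be parsed. It assumes that `-e` means environment, `-m` means market, and `-t` means image tag; the slide does not spell this out, and the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the two commands shown on the slide (flag meanings assumed)."""
    parser = argparse.ArgumentParser(prog="apollo")
    sub = parser.add_subparsers(dest="command", required=True)

    deploy = sub.add_parser("deploy")
    update = sub.add_parser("update")

    # Both commands take an environment and a market (assumed meanings).
    for cmd in (deploy, update):
        cmd.add_argument("-e", dest="environment", required=True)
        cmd.add_argument("-m", dest="market", required=True)

    # Only `deploy` is shown with -t on the slide (assumed: the image tag).
    deploy.add_argument("-t", dest="tag", required=True)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```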

Slide 14

Slide 14 text

2 Years Later • 1 container per EC2 instance is damn expensive • Things are stable and work • No blue/green, canary, or alternate deployment strategies • No indication of deploy status (started, failed, rollout percentage, etc.) • Time to bootstrap new services = Days • Time to teach new employees apollo = Months • Sunset

Slide 15

Slide 15 text

2014 != 2016 • We have Kubernetes. Punt successful!

Slide 16

Slide 16 text

Kubernetes PoC • One cluster per region; multiple markets per cluster • Decrease costs • Increase reliability • Move to a maintained and active open source project instead of end-of-life’d private internal tools • Increase velocity

Slide 17

Slide 17 text

How can we have environments per topic branch?

Slide 18

Slide 18 text

Issues • Cannot do this with apollo; it only supports fixed environment names. Also too slow and expensive • How to give client developers a functioning API platform to develop against? • How to give QA access to N service configurations? • How to give engineers a place to experiment outside production? • How to save everyone from configuring 15+ services?

Slide 19

Slide 19 text

Sandbox: Infinite Environments with Docker Compose & Docker Machine

Slide 20

Slide 20 text

• sandbox create # create a new environment • sandbox sync # Pull image tags from production • sandbox reset # wipe data and boot everything • sandbox logs # Get logs from containers • sandbox tag foo-service bar # change tags • sandbox dev foo-service # Build images locally
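As an illustration of what a command like `sandbox tag foo-service bar` might do under the hood, the sketch below keeps a service-to-tag mapping and renders it into a docker-compose style document. This is an assumption about the design, not the actual sandbox tool: `set_tag`, `render_compose`, and the `registry.example/...` image names are all hypothetical.

```python
from typing import Dict


def set_tag(tags: Dict[str, str], service: str, tag: str) -> Dict[str, str]:
    """Return a copy of the tag mapping with one service pinned to a new tag."""
    if service not in tags:
        raise KeyError(f"unknown service: {service}")
    updated = dict(tags)   # copy so the caller's mapping is left untouched
    updated[service] = tag
    return updated


def render_compose(images: Dict[str, str], tags: Dict[str, str]) -> dict:
    """Build a minimal compose-style structure: one image reference per service."""
    return {
        "services": {
            name: {"image": f"{images[name]}:{tags[name]}"}
            for name in sorted(tags)
        }
    }


if __name__ == "__main__":
    # Hypothetical image names and tags, for illustration only.
    images = {"foo-service": "registry.example/foo", "api": "registry.example/api"}
    tags = {"foo-service": "823714", "api": "823714"}
    print(render_compose(images, set_tag(tags, "foo-service", "bar")))
```

Generating the compose document from one small mapping is one way to spare each engineer from configuring 15+ services by hand.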

Slide 21

Slide 21 text

Takeaways • Prefer Kubernetes/Mesos/DCOS instead of rolling your own • Prefer docker-compose over manual docker commands • Google Container Engine is the easiest way to get production-ready container infrastructure • Package up your distributed application instead of making engineers manage it themselves • Include log and metrics systems in your budgets from t-zero • Create deployment APIs instead of CLIs • Distribute internal tools as Docker images • Prefer containerized workflows over host-level dependencies • Prefer one Dockerfile per project • Prefer the official Docker images over internally maintained base images
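One reason to prefer a deployment API over a CLI is that it can report the deploy status the talk says apollo lacked (started, failed, rollout percentage). The sketch below shows one possible status record such an API could return as JSON. The class name `DeployStatus` and its fields are hypothetical, chosen only to illustrate the idea.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DeployStatus:
    """Status of one rollout, as a hypothetical deployment API might report it."""
    service: str
    tag: str
    desired: int          # containers that should end up on the new tag
    updated: int = 0      # containers already running the new tag
    failed: bool = False

    @property
    def state(self) -> str:
        if self.failed:
            return "failed"
        if self.updated >= self.desired:
            return "succeeded"
        return "started"

    @property
    def rollout_percentage(self) -> int:
        if self.desired == 0:
            return 100
        return min(100, self.updated * 100 // self.desired)

    def to_json(self) -> str:
        # asdict() covers the stored fields; derived values are added explicitly.
        body = asdict(self)
        body["state"] = self.state
        body["rollout_percentage"] = self.rollout_percentage
        return json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    print(DeployStatus("foo-service", "823714", desired=4, updated=1).to_json())
```

A CLI, a chat bot, or a dashboard could then all read the same record instead of each scraping tool output.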

Slide 22

Slide 22 text

Adjust Madi