Slide 1

Slide 1 text

Scaling Development at State with Docker [email protected] 2015-07-08

Slide 2

Slide 2 text

This all started about 15 months ago… When I joined State

Slide 3

Slide 3 text

https://state.com/
• Opinion Network
• You shouldn't need a following to get heard
• Exchange opinions with people around the world
• Social Network structured like News where people share opinions on topics they care about
• Founded by Alex & Mark Asseily (Jawbone & Skype)

Slide 4

Slide 4 text

Product Market Fit
• Efficiency and speed of iteration
• Ability to be nimble
• Ability to make changes quickly
• Ability to test out hypotheses

Slide 5

Slide 5 text

Challenges
• State is a startup
• we burn cash
• we have no revenue
• pushy product team
• super smart engineering team

Slide 6

Slide 6 text

Time is of the essence

Slide 7

Slide 7 text

From a monolith …
• Single Repository
• Multiple languages used: JVM (Storm), Node, Ruby, …
• Multiple versions of languages for each component: Node 6, Node 10, …
• SOA without an easy ability to ship individual WS
• In-house build and deployment tools
• Regression-prone massive deploys
• Massive Rails App battered through continuous product development

Slide 8

Slide 8 text

Motivations
• Remove friction, increase productivity, and speed up iteration cycles
• Enable the team to pick the best tools for the job at hand
• Move the engineering towards a true SOA where individual WS could be deployed with ease

Slide 9

Slide 9 text

Three Main Issues
• As an engineering team, we had to decide how to start breaking off parts of the Monolith into standalone WS without compromising our objective of finding Product Market Fit
• How could we design an infrastructure that would allow developers to pick the best tool for the job at hand?
• Cost Saving - more QA environments, more powerful infrastructure

Slide 10

Slide 10 text

I had read Jay Kreps’ blog post: “The Log: What every software engineer should know about real-time data's unifying abstraction”

Slide 11

Slide 11 text

and

Slide 12

Slide 12 text


Slide 13

Slide 13 text

Desires
• Decouple system architecture into individual standalone components
• Enable engineers to have the freedom/flexibility to use the right tools for each and every problem
• Framework / Pattern for building new components

Slide 14

Slide 14 text

Some details

Slide 15

Slide 15 text

State Docker Infrastructure

Slide 16

Slide 16 text

High-level
• 2 clusters of CoreOS instances
• Datastores cluster (MongoDB, Elasticsearch, Kafka, etc.)
  • Uses Docker Host Networking
  • Uses AWS EC2 Tags for Discovery
  • 1 container per EC2 instance
• Services cluster (15 or so Stateless Services)
  • Uses etcd + confd for Discovery
  • Uses HA Proxy to route to Docker Containers

Slide 17

Slide 17 text

[Diagram: an AWS ELB (SSL) sits in front of the services cluster, where each AWS instance runs a Container, a Presence Service, and HA Proxy + confd, with etcd shared across the cluster. The datastores cluster runs one Container per AWS instance using Docker Host Networking.]

Slide 18

Slide 18 text

Service Cluster Discovery Flow (Not Host Networking)
• Docker assigns each Container a Port
• Every Container has a partner “Presence Service”, which finds the Port of the service, adds it to etcd with a 5m TTL, and deletes it when the service is stopped
• confd generates the HA Proxy config (HA Proxy runs on every instance [in a Container]) and restarts HA Proxy accordingly
• Every HA Proxy (running in Host Networking) can route to every container
• ELBs load balance external requests to the HA Proxies running on every instance
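The discovery flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not State's implementation (the talk describes etcd, confd and bash scripts): `Registry` is a hypothetical in-memory stand-in for etcd keys with a TTL, `render_haproxy` plays the role of the confd template, and the service names, addresses and ports are made up. The 300-second default mirrors the 5m TTL from the slide.

```python
import time


class Registry:
    """Hypothetical in-memory stand-in for etcd keys with a TTL."""

    def __init__(self):
        self._entries = {}  # (service, host) -> (port, expiry timestamp)

    def announce(self, service, host, port, ttl=300, now=None):
        # A presence service would repeat this call before the TTL runs out.
        now = time.time() if now is None else now
        self._entries[(service, host)] = (port, now + ttl)

    def withdraw(self, service, host):
        # Called when the container stops.
        self._entries.pop((service, host), None)

    def live(self, now=None):
        # Return {service: [(host, port), ...]} for entries that have not expired.
        now = time.time() if now is None else now
        result = {}
        for (service, host), (port, expiry) in sorted(self._entries.items()):
            if expiry > now:
                result.setdefault(service, []).append((host, port))
        return result


def render_haproxy(backends):
    """Render HAProxy backend sections, as a confd template might."""
    lines = []
    for service, servers in sorted(backends.items()):
        lines.append(f"backend {service}")
        for index, (host, port) in enumerate(servers):
            lines.append(f"    server {service}-{index} {host}:{port} check")
    return "\n".join(lines)


registry = Registry()
registry.announce("api", "10.0.0.1", 49153, now=0)
registry.announce("api", "10.0.0.2", 49154, now=0)
registry.announce("feed", "10.0.0.1", 49155, now=0)
registry.withdraw("feed", "10.0.0.1")  # the "feed" container was stopped
config = render_haproxy(registry.live(now=60))
print(config)
```

The TTL is what keeps the routing table honest: if an instance dies without withdrawing its entry, the entry simply expires and the next rendered config no longer routes to it.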

Slide 19

Slide 19 text

HA Proxy

Slide 20

Slide 20 text

Service Deployment
• We use Fleet as our Scheduler to manage our containers
• We have YAML definitions for all of our services
• We generate Fleet Units from these, which are made up of explicit Docker Image Tags so we don’t ever use “latest”
• We use Fleet Constraints to ensure services are spread across multiple instances
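As a rough sketch of the generation step above: the function below renders a Fleet unit from a service definition. State's real templating is not shown in the talk, so everything here is assumed for illustration: the `service` dict stands in for a parsed YAML definition, and the field names, registry hostname and unit layout are hypothetical. `Conflicts=` in the `[X-Fleet]` section is the Fleet option that keeps matching units off the same machine.

```python
def fleet_unit(name, image, tag):
    """Render a Fleet unit (a systemd unit plus an [X-Fleet] section)."""
    if not tag or tag == "latest":
        raise ValueError("an explicit image tag is required, not 'latest'")
    return "\n".join([
        "[Unit]",
        f"Description={name}",
        "After=docker.service",
        "Requires=docker.service",
        "",
        "[Service]",
        # Remove any leftover container with the same name; "-" ignores failure.
        f"ExecStartPre=-/usr/bin/docker rm -f {name}-%i",
        # -P lets Docker pick the host port for each exposed container port.
        f"ExecStart=/usr/bin/docker run --name {name}-%i -P {image}:{tag}",
        f"ExecStop=/usr/bin/docker stop {name}-%i",
        "",
        "[X-Fleet]",
        # Keep two instances of the same service off the same machine.
        f"Conflicts={name}@*.service",
        "",
    ])


# Hypothetical service definition, standing in for one parsed YAML entry.
service = {"name": "api", "image": "registry.example.com/api", "tag": "git-abc123"}
unit = fleet_unit(**service)
print(unit)
```

Saved as a template unit such as `api@.service`, this could be started several times (`api@1`, `api@2`, …), with the conflict rule spreading the instances across machines.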

Slide 21

Slide 21 text

State Development Workflow

Slide 22

Slide 22 text

Developers Push Code
• Using git & GitHub
• To Branches

Slide 23

Slide 23 text

Using GitHub’s webhooks
• We trigger a build in our Drone Build Server

Slide 24

Slide 24 text

Drone runs our tests
• Using GitHub’s Statuses API, it reports the result back
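For readers unfamiliar with the Statuses API, the sketch below shows the shape of the request behind that step. It only builds the request and does not send it. The endpoint and the `state`, `target_url`, `description` and `context` fields are part of GitHub's commit status API; the owner, repository, commit hash, build URL, token and the `ci/drone` context label are placeholders chosen for this example.

```python
import json
import urllib.request


def status_request(owner, repo, sha, passed, build_url, token):
    """Build (but do not send) a 'create commit status' request."""
    payload = {
        "state": "success" if passed else "failure",
        "target_url": build_url,
        "description": "Tests passed" if passed else "Tests failed",
        "context": "ci/drone",  # placeholder label for this example
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = status_request(
    owner="example-org",
    repo="example-service",
    sha="3f2a9c1d8e7b6a5f4c3d2e1f0a9b8c7d6e5f4a3b",
    passed=False,
    build_url="https://ci.example.com/builds/42",
    token="PLACEHOLDER",
)
print(req.get_method(), req.full_url)
print(json.loads(req.data)["state"])
```

Passing `req` to `urllib.request.urlopen` would send it; GitHub then shows the result next to the commit and on any pull request that contains it.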

Slide 25

Slide 25 text

Docker Image Creation
• When all tests are passing, Drone
  • Creates a new Image
  • Names the image using the following convention: “service-name:git-XXXXXX”
  • Pushes to our Private Registry in AWS on S3
    • Docker Hub too flaky
    • Registry locality
    • Only depend on AWS
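The naming convention above is easy to capture in a helper. This is a sketch, assuming that `XXXXXX` stands for the first six characters of the commit hash; the slide does not say how the suffix is derived, and the service name and hash below are made up.

```python
import re


def image_name(service, commit_sha, length=6):
    """Return 'service-name:git-XXXXXX' for a given commit hash."""
    # Assumption: the suffix is an abbreviated, lowercase hex commit hash.
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_sha):
        raise ValueError(f"not a git commit hash: {commit_sha!r}")
    return f"{service}:git-{commit_sha[:length]}"


name = image_name("presence", "3f2a9c1d8e7b6a5f4c3d2e1f0a9b8c7d6e5f4a3b")
print(name)  # presence:git-3f2a9c
```

Tagging by commit means every image can be traced back to the exact code it was built from, which is what makes it practical to deploy explicit tags instead of “latest”.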

Slide 26

Slide 26 text

Final Slide :)
• It is stable, it works, it is a bunch of simple bash scripts, it has been in production for 9 months
• We open-sourced our CloudFormation templating scripts (used by some Gov Agency)
  • https://github.com/State/stacks
• We would consider open-sourcing our Fleet Unit YAML templating if people are interested

Slide 27

Slide 27 text

THANK YOU for listening. And thanks to JD Hancock for these amazing stormtrooper photos; all photos are shared under CC 2.0.

Slide 28

Slide 28 text

Questions? http://twitter.com/mischat