Slide 1

Slide 1 text

MITT DARKO BRUTAL ENGINEERING I Hate Computers

Slide 2

Slide 2 text

It's an App Container Framework Service Docker

Slide 3

Slide 3 text

Ansible • Automation/control framework • Written in Python, configured with YAML • SSH-based, push by default with a pull option • Simple, lots of integrations
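
A minimal playbook sketch of what that looks like; the host group and package names are hypothetical, not from the talk:

    ---
    # site.yml - minimal Ansible playbook sketch (host group and package are assumptions)
    - hosts: webservers
      become: yes
      tasks:
        - name: Install nginx
          apt:
            name: nginx
            state: present

        - name: Make sure nginx is running
          service:
            name: nginx
            state: started

Run over SSH with something like: ansible-playbook -i inventory site.yml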

Slide 4

Slide 4 text

So what is a container? • You put things in it • So things don't get in it • So things don't get out of it • So it's easy to carry • It's a multi-purpose utensil.

Slide 5

Slide 5 text

Containers: Recent History • “OpenSolaris is 10 years ahead of Linux” - Me, 2005 • Ended up being 9 years. Close enough. • BSD has chroot jails; the concept isn't new.

Slide 6

Slide 6 text

Change of Subject

Slide 7

Slide 7 text

So, Fooda • It's lunch. Lunch is actually really complicated. • It's a really complicated graph of vendors, menus, menu items, schedules, recurring events, and orders for multiple products. • It's logistics. • It's curated, it's distributed. • We send a lot of email • It's a stock market model and needs to be scalable and reliable • If you don't get lunch, you're going to be pissed off.

Slide 8

Slide 8 text

Fooda on OpsWorks • Moving from Heroku. • Capistrano isn't well suited to many administrative tasks and is largely topology-dependent. • Hey look, OpsWorks: I know Chef, it sounds like a good idea. • Fast forward a few months...

Slide 9

Slide 9 text

Opinion Warning: • I have lots of Chef experience. • I have lots of systems automation experience, it's my primary field. • I've done Chef consulting for world class organizations that specialize in Chef consulting.

Slide 10

Slide 10 text

It didn't work out • Seven pages of gripes pulled from this slide deck • To the AWS team's credit, they reached out, just a bit late • You do whatever, but there is a reason I bring up OpsWorks...

Slide 11

Slide 11 text

What needed to be fixed • Long deploy times (25 minutes) were killing our ability to auto scale. • QA gets nothing done other than wearing out mice and track pads. No real automation without writing Chef recipes for each case. • Six hours a day of my own time did not need to be spent on deployment support. • CI had no bearing on anything other than having to wear the fez of shame. • We needed a Godzilla plan

Slide 12

Slide 12 text

Docker: Case Studies • I looked because I didn’t get it. All these people talking... • "Everything sucks, but it looks like this sucks less" - Alex B. • Large organizations with complex problems • Almost all cases were logical progressions: people had built the majority of the components themselves, then realized that what they had was essentially Docker.

Slide 13

Slide 13 text

So Docker • Formalizes the assembly of your container with revision control. • Caches build steps for speed. • Use the same container with the same code for automated tests, QA tests, staging, production, and demos.

Slide 14

Slide 14 text

The Build: How does it work? • The Dockerfile • A docker build starts from a base image; you add your app and end up with a new image. • Tag the image and host it in a registry, public or private. • Pull and run it as required
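
A rough sketch of that build flow, assuming a Rails-style app; the image name, tag, and port are made up for illustration:

    # Dockerfile - start from a base image, add the app, end up with an image
    FROM ruby:2.1
    WORKDIR /app
    # copy the Gemfile first so 'bundle install' stays cached when only code changes
    COPY Gemfile Gemfile.lock ./
    RUN bundle install
    COPY . .
    CMD ["bundle", "exec", "puma"]

    # build, tag, push to a registry, then pull and run wherever it's needed
    docker build -t myorg/myapp:staging .
    docker push myorg/myapp:staging
    docker run -d -p 3000:3000 myorg/myapp:staging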

Slide 15

Slide 15 text

What goes into your container? • Isolate your services into separate containers: DB, Redis, Elasticsearch, Rails. • Despite what you may read, it's OK to run a few processes in a container. • USE INIT
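
A sketch of the "use init" point, assuming Phusion's baseimage (it comes up again later in the deck); the service script names are hypothetical:

    # Dockerfile - run a real init as PID 1 so multiple processes are supervised
    FROM phusion/baseimage
    # each supervised process gets a runit service directory with a run script
    COPY nginx.sh /etc/service/nginx/run
    COPY app.sh   /etc/service/app/run
    RUN chmod +x /etc/service/nginx/run /etc/service/app/run
    # my_init reaps zombies and starts the runit services
    CMD ["/sbin/my_init"]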

Slide 16

Slide 16 text

Example Flow • Git push to a "staging" branch. • GitHub hook hits your Docker.io account • Docker autobuilds your container. • Upon successful container build, a callback hits Ansible Tower. • Tower deploys your image and runs it with tests as an argument. • If the test exit status is 0, Ansible continues with its playbook.
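
A hedged sketch of the "run it with tests as an argument" step as an Ansible play; the image name, host group, and test command are assumptions:

    # deploy-staging.yml fragment - pull the fresh image and run the test suite in it
    - hosts: staging
      tasks:
        - name: Pull the newly built image
          shell: docker pull myorg/myapp:staging

        - name: Run the tests inside the container
          shell: docker run --rm myorg/myapp:staging bundle exec rake test
          # a non-zero exit status fails this task, which stops the play here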

Slide 17

Slide 17 text

Example Flow cont. • Ansible connects to your staging environment machines and pulls your new staging image. • Half your instances are removed from your load balancer. • Stop old staging build containers • Start new ones • If the new ones start, Ansible continues and runs migrations
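
A sketch of that rolling step, assuming Ansible's ec2_elb module and hypothetical variable and container names:

    # roll through the staging fleet in halves: out of the LB, swap containers
    - hosts: staging
      serial: "50%"              # half the instances at a time
      pre_tasks:
        - name: Take the instance out of the load balancer
          local_action: ec2_elb instance_id={{ ec2_id }} ec2_elbs={{ elb_name }} state=absent
      tasks:
        - name: Stop and remove the old staging container
          shell: docker stop app_staging && docker rm app_staging
          ignore_errors: yes     # nothing to stop on a first deploy

        - name: Start the new container
          shell: docker run -d --name app_staging -p 3000:3000 myorg/myapp:staging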

Slide 18

Slide 18 text

More Example Flow • If migrations run, add instances to the load balancer and remove the old ones. • Continue to upgrade the old instances, if successful, add them back to the load balancer • If any single instance fails, Ansible will not add it back to the load balancer.
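
Continuing that sketch: migrations run next, and an instance only rejoins the load balancer if everything before it succeeded (names are still hypothetical):

        - name: Run database migrations in the new container
          shell: docker exec app_staging bundle exec rake db:migrate
          run_once: true         # migrations only need to run from one host

      post_tasks:
        - name: Put the instance back behind the load balancer
          local_action: ec2_elb instance_id={{ ec2_id }} ec2_elbs={{ elb_name }} state=present
          # any earlier failure ends the play for that host, so a broken
          # instance never gets re-added to the load balancer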

Slide 19

Slide 19 text

Feature Branch Example • Git push triggers Jenkins to build your container and run tests for all branches • If tests are green, Jenkins executes an Ansible playbook • If that succeeds, the playbook runs the following steps... • Tags the container with the feature-branch name and pushes it to a Docker registry • Launches an EC2 instance from an AMI with Docker preinstalled • Launches your Docker containers on that host with the appropriate tags • Updates Route 53 DNS: feature-branch.dev.example.com • Ansible queues an email to QA with the new URL: hey, check this out!
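
A sketch of the Ansible side of that flow using the ec2, route53, and mail modules (module arguments vary by Ansible version); the AMI ID, domain, addresses, and variable names are all made up for illustration:

    # feature-branch.yml - launch a Docker host, point DNS at it, tell QA
    - hosts: localhost
      connection: local
      tasks:
        - name: Launch an instance from an AMI with Docker preinstalled
          ec2:
            image: ami-12345678        # hypothetical Docker-equipped AMI
            instance_type: t2.small
            wait: yes
          register: feature_host
          # (docker pull/run on the new host happens in a follow-up play)

        - name: Point feature-branch DNS at the new host
          route53:
            command: create
            zone: dev.example.com
            record: "{{ branch_name }}.dev.example.com"
            type: A
            value: "{{ feature_host.instances[0].public_ip }}"
            overwrite: yes

        - name: Email QA the new URL
          mail:
            to: qa@example.com
            subject: "{{ branch_name }} is up, check it out"
            body: "http://{{ branch_name }}.dev.example.com"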

Slide 20

Slide 20 text

Baller Enterprise Ambassadorship Edition • Now with no in-container port exposure; ambassadors handle the ports. • Git push with a release tag • Triggers a container build on Jenkins; if tests are green, the image is pushed to an S3-backed private registry • On build success, knife environment from file (using a file in the repo) updates the release tag • Trigger 1000s of chef-client runs, or wait for the cron run. • 1000s of nodes defer to S3 to pull the new image
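
The Chef half of that, roughly; the environment file path and search query are assumptions:

    # bump the release tag recorded in the Chef environment, then converge
    knife environment from file environments/production.json
    # either wait for the scheduled chef-client cron run, or push it out now:
    knife ssh 'chef_environment:production' 'sudo chef-client'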

Slide 21

Slide 21 text

Baller Enterprise Ambassadorship Edition... cont. • Migrations, etc. • While the ambassador and the old nginx/rails release are still running, start the new release container • Restart the ambassador to point to the new nginx/rails release. • If you’re happy, kill the old release. If you’re not, restart the ambassadors to point back to the old release and kill the new one. • Try with HAProxy for actual zero downtime.
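
A sketch of that ambassador switch as plain docker commands; the image and container names are hypothetical, and the ambassador is assumed to be a simple socat-style port forwarder:

    # old release (app_v1) is serving traffic through the ambassador on host port 80
    docker run -d --name app_v2 myorg/myapp:v2        # start the new release alongside it
    docker rm -f ambassador                           # drop the old ambassador
    docker run -d --name ambassador -p 80:80 \
        --link app_v2:app myorg/ambassador            # re-point traffic at the new release
    # happy? remove the old release. not happy? re-link the ambassador to app_v1
    docker rm -f app_v1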

Slide 22

Slide 22 text

It's not that easy, learn from me • The Dockerfile is a dark art. Read the source, read the reference. • Read all the docs, then read them again. • Come to know, initially hate, then love Phusion's Docker base image. • Use an OS that has AppArmor/SELinux

Slide 23

Slide 23 text

Fooda Build Results • Highly cached build time: 4 minutes (includes AMI boot) • Only asset changes: 11 minutes • Asset + Gemfile changes: 18 minutes • As many parallel builds as we want • All results include the push to the Docker registry

Slide 24

Slide 24 text

Fooda Deploy Results • Four minutes for QA feature branches (20+ minutes on OpsWorks), including instance creation. • 30-second staging deploys • Cross-cloud deploys just as fast. • Three deploys a day used to be 3 x 1.5 hours x multiple team members. Now it's 3 x 15 minutes.

Slide 25

Slide 25 text

Serendipitous results • Vagrant + Docker Registry = instant demos and dev environments that are clean and potentially work offline • Über fast CI: split your tests across X instances. • Process isolation and optimization = far cheaper EC2 instances per Rails app (Large -> medium)