Shipping Code at Saltside: 2 Years on Docker

Adam Hawkins
November 04, 2016

Presented for Devops Days India in BLR

Transcript

  1. Shipping Code at Saltside: 2 Years on Docker
    Adam Hawkins - DevopsDays India - BLR - 2016-11-04

  2. Hi, I’m Adam
    • Joined Saltside in 2013
    • Lead SRE Team @ Saltside
    • @adman65
    • http://blog.slashdeploy.com

  3. This Talk
    • Saltside: A Curious Case
    • Our production infrastructure
    • Why we are the way we are (for better & worse)
    • Saltside’s future
    • How you can make a better future (for yourself and your team)

  4. Saltside
    Org & Tech

  5. One Product, One Code Base, Four Markets

  7. Quick Stats
    • 2 Product Development Offices.
    • Hundreds of employees globally, ~50 in PD
    • 1 Product, 4 Markets
    • 2 AWS Regions
    • SoA; 15+ production services
    • 5 production Languages
    • ~300 production containers
    • Mix of cross functional teams; 4 People in SRE Team (pssst. We need SRE!)

  8. Q3 2014
    • Rewrite of the entire product and infrastructure from scratch starts (never do this; no, seriously, never do this!)
    • Docker 1.0 Released
    • Introduce market specific infrastructure
    • Shift to SoA
    • I am the quasi-architect for new system

  9. Why Docker?
    • Give engineers the freedom to choose the best stack for the problem
    • Infrastructure standardization
    • Dev & production parity: the development environment is make, docker, docker-compose, and a few others

  10. 2014 != 2016
    • Mature container orchestration tools … lol no
    • official images … lol no
    • docker-machine … lol no
    • established best practices … lol no

  11. Oh my
    Times have changed

  12. Rolling Our Own: Apollo
    • configuration file driven (similar to docker-compose)
    • Sets horizontal & vertical scales per market/container
    • It’s dynamic CloudFormation all the way down
    • One EC2 instance runs one Docker container
    • Every EC2 instance is behind an ELB
    • Zero-downtime deploys via HAProxy
    • Instances poll S3 for which image/tag to use; change containers where appropriate
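
The polling behaviour in the last bullet can be pictured as a small reconcile step. This is only a sketch under stated assumptions, not apollo's actual code: `fetch_desired` and `restart_container` are hypothetical callables standing in for "read the image/tag from S3" and "replace the running container", and the `api` image name is made up for illustration.

```python
from typing import Callable, Optional


def reconcile(
    running: Optional[str],
    fetch_desired: Callable[[], str],
    restart_container: Callable[[str], None],
) -> str:
    """Run one polling step and return the image reference now running.

    fetch_desired and restart_container are hypothetical stand-ins for
    reading the desired image/tag from S3 and swapping the container.
    """
    desired = fetch_desired()
    if desired != running:
        # Only touch the container when the desired image actually changed.
        restart_container(desired)
    return desired


# Usage: simulate three polls against a fake source of truth.
polls = iter(["api:823714", "api:823714", "api:823715"])
restarts = []
running = None
for _ in range(3):
    running = reconcile(running, lambda: next(polls), restarts.append)
```

In this simulation the container is restarted only on the first and third polls, since the second poll sees no change.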

  13. apollo
    • $ apollo deploy -e production -m bikroy -t 823714
    • $ apollo update -e production -m bikroy

  14. 2 Years Later
    • 1 container per 1 EC2 instance is damn expensive
    • Things are stable and work
    • No blue/green, canary, or alternate deployment strategies
    • No indication of deploy status (started, failed, rollout percentage, etc.)
    • Time to bootstrap new services = Days
    • Time to teach new employees apollo = Months
    • Sunset

  15. 2014 != 2016
    We have Kubernetes. Punt successful!

  16. Kubernetes PoC
    • One cluster per region; multiple markets per cluster
    • Decrease costs
    • Increase reliability
    • Move to a maintained and active open source project instead of end-of-life’d private internal tools
    • Increase velocity

  17. How can we have environments per topic branch?

  18. Issues
    • Cannot do this with apollo; it only supports fixed environment names. It is also too slow and expensive
    • How to give client developers a functioning API platform to develop against?
    • How to give QA access to N service configurations?
    • How to give engineers a place to experiment outside production?
    • How to save everyone from configuring 15+ services?
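
One way around fixed environment names is to derive the name from the branch itself. The helper below is a hypothetical sketch, not something from the talk: it assumes environment names may contain only lowercase letters, digits, and dashes, and the 40-character limit is an arbitrary choice.

```python
import re


def environment_name(branch: str, max_length: int = 40) -> str:
    """Turn a Git branch name into a lowercase, dash-separated name.

    Assumption: only lowercase letters, digits, and dashes are allowed.
    """
    # Collapse every run of disallowed characters into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # Truncate, drop a trailing dash left by the cut, and fall back
    # to a fixed name if nothing usable remains.
    return slug[:max_length].rstrip("-") or "default"


name = environment_name("feature/New-Search_API")
```

Because the name is computed rather than picked from a list, every topic branch can get its own environment without coordination.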

  19. Sandbox
    Infinite Environments with Docker Compose & Docker Machine

  20. • sandbox create # create a new environment
    • sandbox sync # Pull image tags from production
    • sandbox reset # wipe data and boot everything
    • sandbox logs # Get logs from containers
    • sandbox tag foo-service bar # change tags
    • sandbox dev foo-service # Build images locally

  21. Take Aways
    • Prefer Kubernetes/Mesos/DCOS instead of rolling your own
    • Prefer docker-compose over manual docker commands
    • Google Container Engine is the easiest way to get production ready container infrastructure
    • Package up your distributed application instead of making engineers manage it themselves
    • Include log and metrics systems in your budgets from t-zero
    • Create deployment APIs instead of CLIs
    • Distribute internal tools as Docker images
    • Prefer containerized workflows over host level dependencies
    • Prefer one Dockerfile per project
    • Prefer the official Docker images over internally maintained base images
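
To make "deployment APIs instead of CLIs" concrete, here is a minimal sketch of the kind of status record such an API could return, covering the states and rollout percentage that slide 14 lists as missing. The class and field names are assumptions for illustration, not an existing API.

```python
from dataclasses import dataclass


@dataclass
class DeployStatus:
    """Hypothetical status record for one deploy."""

    state: str    # e.g. "started" or "failed"
    updated: int  # containers already running the new image
    total: int    # containers that should run the new image

    @property
    def rollout_percentage(self) -> float:
        # Avoid dividing by zero when nothing is scheduled yet.
        if self.total == 0:
            return 0.0
        return 100.0 * self.updated / self.total

    def to_dict(self) -> dict:
        """Shape the record for a JSON response."""
        return {
            "state": self.state,
            "updated": self.updated,
            "total": self.total,
            "rollout_percentage": self.rollout_percentage,
        }


status = DeployStatus(state="started", updated=3, total=12)
payload = status.to_dict()
```

A structured record like this can back a CLI, a dashboard, and chat notifications alike, which is hard to do when the CLI is the only interface.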

  22. Adjust Madi
