Docker in Production

spiddy
July 22, 2015

A fast-forward through the process of modernizing an infrastructure into a microservice-oriented, container-based cluster.

Transcript

  1. Index - Requirements - Phase 1 - Continuous Delivery - Phase 2 - Operating System - Benchmarking Day - Disaster Recovery Day
  2. Requirements - Production - 23 million daily hits - Off-load DB usage - Fast response time - Process - Quick deploy cycles - Easy rollbacks - Easy scaling - Automatic failover - Dev, QA, Prod parity - No SPOF - Canary releases
  3. Issues with This Delivery Lifecycle - Too many open questions - How is the Production environment provisioned? - Which JDK and Tomcat versions? - How are JDK/Tomcat installed (apt-get, tarball)? - Values of variables? - JAVA_OPTS, JAVA_HOME - CATALINA_OPTS, CATALINA_HOME - Logging mechanism - Application configuration
  4. How? - Application is a WAR file - The migration is initially an Ops project - Gave the confidence to bootstrap the rest - Keep it incremental!
  5. Container Image Tuning - Tomcat APR installation - JAVA_OPTS memory tuning - Remove default Tomcat webapps - Expand WAR during docker build - Set locale, timezone
  6. Container Image Build Checklist - Use official parent images - Add variable configuration (with defaults) - Use ENV variables when possible - Use template config files to inject values - Add variable validation (with exit status) - Follow open-source examples
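
The checklist items on defaults, template injection, and validation with an exit status could be sketched in an entrypoint script roughly as follows. This is a minimal illustration, not the script from the talk: the variable names (`APP_DB_HOST`, `APP_DB_PORT`, `APP_LOG_LEVEL`), the `@NAME@` placeholder syntax, and the template path are all hypothetical.

```shell
#!/bin/sh
# Sketch of a docker-entrypoint.sh. All variable names are hypothetical.

# Apply defaults for optional settings (overridable with `docker run -e`).
set_defaults() {
    : "${APP_LOG_LEVEL:=INFO}"
    : "${APP_DB_PORT:=5432}"
}

# Validate settings; return non-zero so the caller can exit with a failure status.
validate_config() {
    if [ -z "${APP_DB_HOST:-}" ]; then
        echo "ERROR: APP_DB_HOST is required" >&2
        return 1
    fi
    case "${APP_DB_PORT:-}" in
        ''|*[!0-9]*)
            echo "ERROR: APP_DB_PORT must be numeric" >&2
            return 1
            ;;
    esac
    return 0
}

# Print the template file $1 with @NAME@ placeholders replaced by values.
# Note: values containing '|' or '&' would need escaping for sed.
render_template() {
    sed -e "s|@APP_DB_HOST@|${APP_DB_HOST}|g" \
        -e "s|@APP_DB_PORT@|${APP_DB_PORT}|g" \
        -e "s|@APP_LOG_LEVEL@|${APP_LOG_LEVEL}|g" \
        "$1"
}

# Intended flow inside the container (left commented so the functions can be
# sourced and tried out on their own; the paths are assumptions):
#   set_defaults
#   validate_config || exit 1
#   render_template /templates/app.properties.tpl > "$CATALINA_HOME/conf/app.properties"
#   exec catalina.sh run
```

Failing fast in `validate_config` means a misconfigured container stops at start-up with a clear message instead of serving with a broken configuration.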
  7. Adding entrypoint

    Directory Structure:

        > ls
        app.war  Dockerfile  docker-entrypoint.sh

    Dockerfile:

        FROM tomcat:7.0.59
        ADD app.war webapps/
        COPY docker-entrypoint.sh /entrypoint.sh
        ENTRYPOINT ["/entrypoint.sh"]
  8. Captain Configuration (captain.yml)

        app:
          build: Dockerfile
          image: registry.local/user/app
          pre:
            - docker run -it --rm -v ~/.m2:/root/.m2 -v "$PWD":/app -w /app maven:3.3.3 mvn clean install
  9. How to Achieve - Infrastructure as Code - Cluster-centric operating system - Designed with fail-safes - Avoid Single Point of Failure (SPOF) - Designed with redundancy - Partition tolerant - Lightweight
  10. Cloud Config YML - Initialization of cluster instances - Infrastructure as Code - Treat servers as cattle (not pets) - Scale out with one command - CoreOS update reboot-strategy: off - Discovery URL to auto-configure cluster - Configured Fleet metadata
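
As a rough illustration of the three settings named on this slide (discovery URL, Fleet metadata, update reboot strategy), a small shell helper could generate the cloud-config file for each new instance. This is a sketch under assumptions: the function name `render_cloud_config`, the use of the `etcd2` section, and the example metadata value are not taken from the talk, and a working cloud-config needs further settings (peer and client URLs, units to start) that are omitted here.

```shell
#!/bin/sh
# render_cloud_config DISCOVERY_URL FLEET_METADATA
# Prints a minimal CoreOS cloud-config on stdout. Hypothetical helper.
render_cloud_config() {
    discovery_url=$1
    metadata=$2
    cat <<EOF
#cloud-config
coreos:
  etcd2:
    # Discovery URL lets new instances find and join the cluster.
    discovery: ${discovery_url}
  fleet:
    # Metadata is used by Fleet to decide where units may be scheduled.
    metadata: ${metadata}
  update:
    # Quoted so YAML does not read the value as a boolean.
    reboot-strategy: "off"
EOF
}
```

A possible invocation, with a placeholder token, would be `render_cloud_config "https://discovery.etcd.io/<token>" "role=web" > cloud-config.yml`.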
  11. Fleet Units - One per service - Configuration management with etcd - Sidekick to activate load balancer
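
The sidekick idea can be sketched as a small script that keeps a key alive in etcd while the service runs, so that a load balancer watching those keys only routes to live instances. This is an assumption-laden illustration: the `/services/...` key layout, the function names, and the 60-second TTL are hypothetical, and it uses the etcd v2 `etcdctl set` and `etcdctl rm` commands.

```shell
#!/bin/sh
# ETCDCTL can be overridden (for example with `echo`) for a dry run.
: "${ETCDCTL:=etcdctl}"

# announce SERVICE INSTANCE ADDRESS [TTL]
# Writes the instance address under a hypothetical /services/ key with a TTL,
# so the entry disappears on its own if the sidekick stops refreshing it.
announce() {
    service=$1
    instance=$2
    address=$3
    ttl=${4:-60}
    "$ETCDCTL" set "/services/${service}/${instance}" "$address" --ttl "$ttl"
}

# withdraw SERVICE INSTANCE
# Removes the key immediately when the service is stopped cleanly.
withdraw() {
    "$ETCDCTL" rm "/services/$1/$2"
}

# Typical sidekick loop (commented out; refresh faster than the TTL expires):
#   while true; do announce app app-1 "10.0.0.5:8080" 60; sleep 45; done
```

In a Fleet setup the loop would typically live in the sidekick unit's `ExecStart`, with `withdraw` called from `ExecStop`.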
  12. Production Cluster - cloud-config.yml (injecting private keys for read-only access to the git repo & docker registry) - docker pull - git checkout - Fleet Unit Files
  13. Worst Case Scenario - Entire cluster failure - Running containers: +350 - Rebuilt cluster etcd from scratch - Downtime: 2 hours
  14. Worst Case Scenario - Timeline - Buggy app (100% CPU) deployed on cluster - Measures: Deployment on Test environment - Measures: Canary deploys (10% of cluster) - High CPU usage disrupted etcd cluster - Measures: Tune etcd time-outs - Measures: Set up isolated etcd-cluster servers - Reboot of machines provoked OS upgrade - Measures: Lock down CoreOS version by disabling service-upgrade; upgrades on demand
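
For the canary measure, the size of the first deployment wave can be derived from the cluster size. The helper below is a hypothetical sketch (the name `canary_count` and the round-up rule are assumptions, not from the talk); it rounds up so that a small cluster still gets at least one canary.

```shell
#!/bin/sh
# canary_count TOTAL [PERCENT]
# Prints how many instances receive the new version first.
# PERCENT defaults to 10, matching the "10% of cluster" measure.
canary_count() {
    total=$1
    percent=${2:-10}
    # Integer ceiling of total * percent / 100.
    echo $(( (total * percent + 99) / 100 ))
}
```

For example, `canary_count 25` yields 3 instances and `canary_count 5` yields 1; the remaining instances would be upgraded only once the canaries look healthy.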