Configuration Management is an Antipattern

Slides from SCaLE 15x, Pasadena, CA

Jonah Horowitz

March 05, 2017

Transcript

  1. @jonahhorowitz Configuration Management is an anti-pattern

  2. jonah@laptop$ cvs update website
     jonah@laptop$ tar zcvf website.tar.gz website
     jonah@laptop$ scp website.tar.gz root@server1:/var/something/
     jonah@laptop$ ssh root@server1
     server1# cd /var/something
     server1# mv website website-`date`
     server1# tar zxf website.tar.gz
     server1# /etc/init.d/website restart
     server1# ^D
     … Rinse, Repeat …
  3. #!/bin/bash
     BOX=$1
     NEWCODE=$2
     scp $NEWCODE root@$BOX:/var/something/
     ssh root@$BOX "(cd /var/something ; tar zxf $NEWCODE ; /etc/init.d/tomcat restart)"
  4. #!/bin/bash
     BOX=$1
     NEWCODE=$2
     scp $NEWCODE root@$BOX:/var/something/
     ssh root@$BOX "(cd /var/something ; tar zxf $NEWCODE ; /etc/init.d/tomcat restart)"

     jonah@laptop$ cvs update website
     jonah@laptop$ tar zcvf website.tar.gz website
     jonah@laptop$ for box in `cat serverlist/boxen.txt` ; do \
         tools/update-code.sh $box website.tar.gz ; done
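
The loop above keeps going when a host fails and gives no way to preview what it will do. A minimal sketch of a more defensive variant follows. The helper names `run` and `push_code`, the dry-run default, and the one-host-per-line list format are assumptions for illustration, not part of the original talk.

```shell
# Sketch only: push a tarball to every host listed in a file, one host per line.
# DRY_RUN defaults to 1 here, so commands are printed rather than executed.
# The helper names run and push_code are hypothetical.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

push_code() {
  local hostlist="$1" tarball="$2" box
  while IFS= read -r box; do
    # Skip blank lines and comment lines in the host list.
    case "$box" in ''|'#'*) continue ;; esac
    # Stop at the first host that fails instead of carrying on blindly.
    run scp "$tarball" "root@$box:/var/something/" || return 1
    # ssh -n keeps ssh from swallowing the rest of the host list on stdin.
    run ssh -n "root@$box" \
      "cd /var/something && tar zxf '${tarball##*/}' && /etc/init.d/tomcat restart" || return 1
  done < "$hostlist"
}
```

Calling `push_code boxen.txt website.tar.gz` prints the commands for each host; setting `DRY_RUN=0` would execute them. Even with these guards, every host is still mutated in place, which is the problem the rest of the talk addresses.
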
  5. Server Install Process (2001)
     • Install server in rack
     • Use Mandrake Linux CD to install OS
     • Run through long manual configuration checklist - some of which was eventually scripted
     • Push latest code (using the earlier script)
     • Add to load balancer
  6. Server Install Process (2012+)
     • Launch new Amazon AMI
     • Use the current version of Amazon Linux
     • Run through long manual configuration checklist - some of which was eventually scripted
     • Push latest code (using the earlier script)
     • Add to ELB
  7. So, who am I?
     Jonah Horowitz
     Site Reliability Engineer
     Soon to be at Stripe
     incoming@jonahhorowitz.com
  8. Automating Linux and Unix
     Nate Campi, Kirk Bauer

  9. CFEngine (2.x) was great... for its time
     Before CFEngine
     • Time to provision a new server: 1 Day
     • Chance a mistake was made: 50/50
     • Percentage of fleet we understood: 70
  10. CFEngine (2.x) was great... for its time
      Before CFEngine
      • Time to provision a new server: 1 Day
      • Chance a mistake was made: 50/50
      • Percentage of fleet we understood: 70
      After CFEngine 2
      • Time to provision a new server: 1 hour
      • Chance a mistake was made: 1%
      • Percentage of fleet we understood: 99
  11. @jonahhorowitz

  12. Puppet, Chef, Salt, Ansible, RedHat

  13. What sucks about Config Management?

  14. What sucks about Config Management?

  15. What sucks about Config Management?
      Bad Option #1: Ops owns all configuration management
  16. What sucks about Config Management?
      Bad Option #1: Ops owns all configuration management
      Bad Option #2: Ops doesn’t own all configuration management
  17. Broken/Buggy/Out-of-Sync Deployments

  18. Broken/Buggy/Out-of-Sync Deployments
      That one server…

  19. Release Engineering Still Sucks

  20. What’s the alternative?

  21. What’s the alternative?

  22. What’s the alternative?

  23. Let’s walk through that again, slowly

  24. • Base or Foundation AMI
      • Security patches
      • Infrastructure Packages (monitoring, logging, etc.)
  25. • Your application package and its dependencies
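
The two layers on slides 24 and 25 can be sketched as a pair of image definitions: a base image carrying patches and infrastructure packages, and an application image built on top of it. The sketch below renders them as Dockerfiles purely for illustration. The function names, image names, package names, and the use of `yum` are assumptions; the talk lists Docker as only one of several image build options.

```shell
# Sketch only: render the two image layers as Dockerfiles.
# Image names, package names and the yum commands are assumptions for illustration.

render_base_image() {
  # $1 = upstream OS image, remaining args = infrastructure packages
  local os_image="$1"
  shift
  printf 'FROM %s\n' "$os_image"
  printf 'RUN yum -y update\n'            # security patches
  printf 'RUN yum -y install %s\n' "$*"   # monitoring, logging, etc.
}

render_app_image() {
  # $1 = base image built from the output above
  # $2 = application package; the package manager pulls in its dependencies
  printf 'FROM %s\n' "$1"
  printf 'RUN yum -y install %s\n' "$2"
}
```

For example, `render_base_image amazonlinux collectd rsyslog` prints a three-line base definition, and `render_app_image example/base:1 website` prints the application layer. Nothing is changed on a running instance: a new patch or a new application version means a new image.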

  26. Tools Required
      • Package Build System (Gradle)
      • Image Build System (Aminator/Bakery/Docker/Packer)
      • Deployment System (Spinnaker/Terraform/CloudFormation)
      • Service Discovery (Eureka/Zookeeper/ELBs/DNS?/Swarm/Kubernetes)
      • Dynamic Configuration (Feature Flags/Fast Properties)
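
Of the tools listed above, dynamic configuration is the piece that lets behavior change without building a new image. A minimal sketch of a feature-flag lookup follows. The function name `flag_enabled` and the `name=on|off` file format are hypothetical; real systems typically read flags from a configuration service rather than a local file.

```shell
# Sketch only: look up a feature flag at run time instead of baking it into the image.
# The flags file format ("name=on" or "name=off", one per line) is an assumption.

flag_enabled() {
  # $1 = flags file, $2 = flag name
  # Succeeds only if the flag is present and set to "on"; the last entry wins.
  [ "$(awk -F= -v name="$2" '$1 == name { v = $2 } END { print v }' "$1")" = "on" ]
}
```

A caller would write something like `if flag_enabled /etc/example/flags new_checkout; then ...; fi`, where the path is a placeholder. A missing flag is treated as off, so a new image can ship code that stays dormant until the flag is flipped.
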
  27. Benefits

  28. Benefits
      • Simpler Operations

  29. Benefits
      • Continuous Deployments

  30. Benefits
      • Faster startup times
      • Horizontal/Auto-scaling
      • Instance Failure
      • Chaos Monkey
      • Cloud Reboots
  31. Benefits
      • Configuration in-sync / no “cruft” / always a known state
  32. Benefits
      • Same application code in Dev/Test/Prod

  33. Benefits
      • Easier to respond to security threats

  34. Benefits
      • Multi-region operations

  35. Benefits
      • That one server… sticks out like a sore thumb
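
When every instance in a group is launched from the same image, finding "that one server" reduces to a comparison. A minimal sketch follows, assuming an inventory on stdin with one `<instance-id> <image-id>` pair per line. That input format and the function name `find_outliers` are hypothetical.

```shell
# Sketch only: list instances whose image differs from the expected one.
# Input on stdin: "<instance-id> <image-id>" per line (a hypothetical inventory format).

find_outliers() {
  # $1 = image ID every instance in the group is expected to run
  local expected="$1"
  awk -v want="$expected" '$2 != want { print $1 }'
}
```

Feeding it three lines where only `i-ccc` runs a different image prints just `i-ccc`. With mutable servers there is no single expected value to compare against, which is why drift is so hard to spot there.
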
  36. Release Strategies
      • Rolling Release
      • Blue/Green Releases
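
A rolling release replaces instances a batch at a time, pausing between batches so that problems surface before the whole fleet is affected. The sketch below shows only the batching logic. `replace_instance` is a hypothetical hook that prints what it would do, and the instance and image names are placeholders.

```shell
# Sketch only: a rolling release replaces instances a batch at a time.
# replace_instance is a hypothetical hook; here it only prints what it would do.

replace_instance() {
  echo "replace $1 with new image $2"
}

rolling_release() {
  # $1 = new image, $2 = batch size, remaining args = instances to replace
  local image="$1" size="$2" count=0 inst
  shift 2
  for inst in "$@"; do
    replace_instance "$inst" "$image" || return 1
    count=$((count + 1))
    # Pause after each full batch, except after the very last instance.
    if [ $((count % size)) -eq 0 ] && [ "$count" -lt "$#" ]; then
      echo "batch done, wait for health checks"
    fi
  done
}
```

Running `rolling_release ami-222 2 a b c d e` replaces five instances in batches of two, with a pause after the first and second batch. A blue/green release would instead bring up a complete second group on the new image and switch traffic over in one step.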

  37. Caveats

  38. Oh, that database thing…

  39. Jonah Horowitz
      Site Reliability Engineer
      @jonahhorowitz
      incoming@jonahhorowitz.com
      https://jonahhorowitz.com/