Configuration Management is an Anti-Pattern

Slides from my talk at DevOpsDays Boise

Jonah Horowitz

October 07, 2016

Transcript

  1. @jonahhorowitz
    The Configuration
    Management Antipattern

  2. @jonahhorowitz
    jonah@laptop$ cvs update website
    jonah@laptop$ tar zcvf website.tar.gz website
    jonah@laptop$ scp website.tar.gz root@server1:/var/something/
    jonah@laptop$ ssh root@server1
    server1# cd /var/something
    server1# mv website website-$(date +%Y%m%d)
    server1# tar zxf website.tar.gz
    server1# /etc/init.d/website restart
    server1# ^D
    … Rinse, Repeat …

  3. @jonahhorowitz
    #!/bin/bash
    BOX=$1
    NEWCODE=$2
    scp $NEWCODE root@$BOX:/var/something/
    ssh root@$BOX "(cd /var/something ; tar zxf $NEWCODE ; /etc/init.d/tomcat restart)"

  4. @jonahhorowitz
    #!/bin/bash
    BOX=$1
    NEWCODE=$2
    scp $NEWCODE root@$BOX:/var/something/
    ssh root@$BOX "(cd /var/something ; tar zxf $NEWCODE ; /etc/init.d/tomcat restart)"
    jonah@laptop$ cvs update website
    jonah@laptop$ tar zcvf website.tar.gz website
    jonah@laptop$ for box in `cat serverlist/boxen.txt` ; do \
    tools/update-code.sh $box website.tar.gz
    done
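
The loop on this slide can be hardened a little. What follows is a minimal sketch, not taken from the talk: the function names `run`, `deploy_one`, and `deploy_all` and the `DRY_RUN` and `DEPLOY_DIR` variables are hypothetical, and `/var/something` is kept as the slides' placeholder path. It quotes its arguments, keeps ssh from swallowing the server list on stdin, and reports hosts where the copy or restart failed.

```shell
#!/bin/bash
# Hypothetical hardened variant of the update-code.sh loop from the slides.
set -euo pipefail

# Placeholder path from the slides; override via the environment if needed.
DEPLOY_DIR="${DEPLOY_DIR:-/var/something}"
# Set DRY_RUN=1 to print each command instead of executing it.
DRY_RUN="${DRY_RUN:-0}"

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Copy the archive to one box, unpack it, and restart the service.
deploy_one() {
  local box="$1" archive="$2"
  # -n stops ssh from reading stdin, which the caller uses for the server list.
  run scp "$archive" "root@${box}:${DEPLOY_DIR}/" &&
    run ssh -n "root@${box}" \
      "cd ${DEPLOY_DIR} && tar zxf $(basename "$archive") && /etc/init.d/tomcat restart"
}

# Deploy to every box named in the list file, one hostname per line.
deploy_all() {
  local list="$1" archive="$2" box failed=0
  while IFS= read -r box || [ -n "$box" ]; do
    if [ -z "$box" ]; then
      continue  # skip blank lines
    fi
    if ! deploy_one "$box" "$archive"; then
      echo "deploy failed on $box" >&2
      failed=1
    fi
  done < "$list"
  return "$failed"
}
```

After sourcing the file, `DRY_RUN=1 deploy_all serverlist/boxen.txt website.tar.gz` prints the scp and ssh commands for each box without touching any server. Even in this form the script is still a hand-rolled deploy loop; it only makes the failure modes more visible.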

  5. @jonahhorowitz

  6. @jonahhorowitz
    CFEngine (2) was great… for its time
    Before CFEngine
    Time to provision a new server: 1 day
    Chance a mistake was made: 50/50
    Percentage of fleet we understood: 70
    After CFEngine 2
    Time to provision a new server: 1 hour
    Chance a mistake was made: 1%
    Percentage of fleet we understood: 99

  7. @jonahhorowitz
    Puppet

  8. @jonahhorowitz
    So, who am I?
    Jonah Horowitz
    Senior Site Reliability Engineer
    Netflix CORE SRE (Cloud Operations Reliability Engineering)

  9. @jonahhorowitz
    C loud
    O perations
    R eliability
    E ngineering

  10. @jonahhorowitz
    What sucks about Config
    Management?

  11. @jonahhorowitz
    Release Engineering
    Still Sucks

  12. @jonahhorowitz
    What’s the alternative?

  13. @jonahhorowitz
    Chaos

  14. @jonahhorowitz
    Cloud or Not

  15. @jonahhorowitz
    Jonah Horowitz
    Senior Site Reliability Engineer
    @jonahhorowitz
    https://netflix.github.io/
    https://jobs.netflix.com/
