Mentoring Devs into DevOps - SaltConf 2014

How we're trying to increase DevOps among our Devs.

Justin Carmony

January 30, 2014

Transcript

  1. Mentoring Devs
    Into DevOps
    Justin Carmony
    Director of Development
    Deseret Digital Media

  3. Let’s Measure
    The Audience
    • Who here is a…
    • System Administrator?
    • Developer?
    • Manager / Management?
    • “DevOp?”

  4. Confession:
    I’m a Developer

  6. Self-Taught Ops
    Because There Was No One Else To Do It

  7. About Me
    • Director of Development for Deseret Digital Media
    • Utah PHP Usergroup President
    • I Make (and Break) Web Stuff (10 years)
    • Salt User in Production since 0.8
    (I <3 Salt)

  8. This Presentation
    • Lessons learned at DDM & previous jobs
    • Insight into our process of increasing “DevOps”
    • We’re still learning, but this is what we’ve found.
    • Slides will be posted online, so don’t worry about
    copying slide content.
    • Feel free to ask on-topic questions during, and
    we’ll have questions at the end.

  9. About DDM
    • Deseret Digital Media runs local websites like
    KSL.com, DeseretNews.com
    • Runs national and international websites like
    OK.com, familia.com.br, etc.
    • ~10 million pageviews a day across sites.
    • ~150 internal VMs, a few dozen physical
    machines, some AWS sprinkled around.

  10. Let’s Start With a Story!

  11. You Work for an
    Awesome Tech Company

  12. Team Is Working Hard to
    Build New Things!

  13. You launch your
    awesome product!

  14. A Few More Features…

  15. … and next thing you know…

  17. Awesome Job
    Team, We Rock!

  19. We Need
    Real-Time XYZ Feature!
    ASAP!

  21. We Need
    Real-Time XYZ Feature!
    ASAP!
    $%!

  24. “Huh, it works if you
    just turn off caching…”
    - Dev @ 80th Hour This Week

  26. “I’m sure this
    will work…”

  30. “Our servers are melting!”

  37. “We Need a Better Solution!”

  38. So…

  39. Where Do We Start?

  44. We Have This Problem

  45. Challenges We Faced
    • Giant mash-up of technologies
    • Tightly-coupled & fragile infrastructure
    • Debugging production-only bugs was difficult
    • Bugs that were part code, part environment were
    a nightmare to track down.

  48. So One Day…
    We Had A Genius Idea!

  49. Let’s Hire a DevOp!

  50. I’m Not Joking
    We Actually Said This

  51. Two Problems
    with this “Idea”

  52. Problem #1 - We Didn’t Understand
    What We Really Wanted

  53. Step 1: Hire a DevOp!
    Step 2: ????????????!
    Step 3: Profit!

  54. Step 1: Hire a DevOp!
    Step 2: ????????????!
    Step 3: Profit! Everything Works
    Perfectly!

  55. Problem #2 - People Who Are Great
    At Dev & Ops Are Hard To Find

  56. Expectation:

  57. Reality:

  58. Honest Team Discussion:
    What is it we’re really looking for?

  59. We Discovered a Few Things

  60. What does DevOps
    Mean To Us?
    • DevOps: Dev & Ops, a Culture of Collaboration
    • Our Goal: “10 deploys a day without issues”
    • Everyone shares the goal of quick development of
    features AND a stable system that stays up.

  61. Team Structure
    Devs: 30 Ops: 2

  62. Team Structure
    Devs: 30 Ops: 2
    DevOps: 1

  64. Team Structure
    Devs: 30 Ops: 2
    DevOps: 1
    Hiring one person won’t just solve all our problems!

  65. Team Realizations
    • Hardest problem already solved: awesome team
    • No foreseeable rapid expansion, must operate at
    our current scale
    • Each Project’s Director of Development was
    acting as the bridge between Dev and Ops, but
    would become a bottleneck.

  66. Teams Already Had Some
    Ad-Hoc DevOps Tools
    - Real-time Logging
    - Capistrano Deploys
    - Nagios Alerts
    - Server Metrics
    - Puppet for File Mgmt
    - App Stats w/ Graphite
    - Graphite Dashboards
    - Salt for Cfg Management
    - Homebrewed Metrics Sys.
    - Homebrewed Alert System

  68. Step 1: Hire a DevOp!
    Step 2: ????????????!
    Step 3: Profit! Everything Works
    Perfectly!

  70. We Formed A Strategy

  71. Step #1: Promote Dev
    to DevOp Role

  72. WAIT!
    Isn’t that the advice you just
    said was a bad idea?!

  73. DevOp Engineer
    • Well Defined Role:
    • Ownership over the TOOLS to
    improve DevOps efforts.
    • Resource for other teams to
    help use DevOps Tools.
    • Easy to work with, aptitude for
    systems & ops, likes to try
    new things.

  74. Promoting From Within
    • A seasoned dev for your team already knows:
    • Your Pain Points
    • Your System’s Quirks
    • How the “Chaos Works”
    • Knows the people & personalities on your team

  75. Step #2: Change
    Team Structure

  76. Team Structure
    Devs: 30 Ops: 2

  78. Team Structure
    Goal: Spread Out Expertise By Increasing
    Ops Experience & Skills Among Devs
    Dev Ops

  80. Team Structure
    Dev Ops

  82. Increasing Ops Among Devs
    • Identify Devs who liked “Ops” & wanted to Learn
    • Pair Dev with Op / Director
    • The learning Dev does the work, not the Op / Director.
    • Pair program if needed.

  83. Step #3: Increase
    Everyone’s Insight

  86. Metrics
    • Everyone has access to Network, Server, and
    Application Metrics.
    • Consolidate & reduce places to look. We try to
    pipe everything to StatsD / Graphite
    • Each developer trained to add & track metrics in
    production.
    • We’re okay with 98% uptime of stats to avoid
    complexity.
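A minimal sketch of how a developer might add a metric, assuming a StatsD daemon listening on UDP at the conventional port 8125. The metric names and the host are hypothetical and not taken from the talk. Sending over UDP is fire-and-forget, which fits being okay with occasionally losing stats in exchange for simplicity.

```python
import socket


def format_metric(name, value, kind="c"):
    """Build a StatsD line: <name>:<value>|<type> (c=counter, ms=timer, g=gauge)."""
    if kind not in ("c", "ms", "g"):
        raise ValueError("unsupported metric type: %r" % kind)
    return "%s:%s|%s" % (name, value, kind)


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line over UDP; a lost packet is tolerated, not retried."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


# Hypothetical usage inside application code:
#   send_metric(format_metric("myapp.signup.completed", 1))
#   send_metric(format_metric("myapp.checkout.render_time", 320, "ms"))
```

Keeping the formatting separate from the socket call makes the metric line easy to check without a running StatsD server.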

  88. Real-Time Logging

  89. Real-Time Logging
    • Harder & more complicated at scale
    • Still trying to solve this well; we have lots of logs.
    • Start with a small window of data (e.g. 48 hours), then
    expand the window.
    • We’re trying Logstash, ElasticSearch, and Kibana
    right now.
    • Generate Statistics off our Logs
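The idea of keeping a small window of log data and deriving statistics from it can be sketched roughly as follows. This is an illustration only: the entry layout (`time`, `level`) and the sample data are assumptions, and it does not show how Logstash, ElasticSearch, or Kibana do this.

```python
from collections import Counter
from datetime import datetime, timedelta


def within_window(entries, now, hours=48):
    """Keep only entries whose timestamp falls inside the last `hours` hours."""
    cutoff = now - timedelta(hours=hours)
    return [e for e in entries if e["time"] >= cutoff]


def count_by_level(entries):
    """Turn raw log entries into a simple statistic: number of entries per level."""
    return Counter(e["level"] for e in entries)


# Hypothetical sample data
now = datetime(2014, 1, 30, 12, 0)
entries = [
    {"time": now - timedelta(hours=1), "level": "error"},
    {"time": now - timedelta(hours=30), "level": "info"},
    {"time": now - timedelta(hours=72), "level": "error"},  # older than the window
]

recent = within_window(entries, now)
stats = count_by_level(recent)
```

Expanding the window later only means passing a larger `hours` value, once storage and query speed allow it.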

  90. Tracking Changes
    • Everything, everything, everything in git (we use GitHub)
    • Everyone has access to all repos
    • Everyone does work through Pull Requests
    • Everyone has their work code reviewed *
    * - You can merge w/o a review, but must be
    willing to defend your choice

  91. Deploys

  92. Everyone Can Deploy
    • Automated our deployment process to a single
    step.
    • Everyone can deploy, deployments are logged
    • Easy rollback is a requirement!
    • Implementing feature flags to turn off single parts
    of our application.
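A feature flag that turns off a single part of an application could look something like this sketch. The class, the flag name `realtime_feed`, and the page parts are all hypothetical; the talk does not describe the actual implementation.

```python
class FeatureFlags:
    """In-memory flag store; a real one would likely read from config or a datastore."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name, default=False):
        # Unknown flags fall back to the default, so a missing flag stays off.
        return self._flags.get(name, default)

    def disable(self, name):
        self._flags[name] = False


def render_homepage(flags):
    """Build the list of page parts, skipping any part whose flag is off."""
    parts = ["articles"]
    if flags.is_enabled("realtime_feed"):
        parts.append("realtime_feed")
    return parts


flags = FeatureFlags({"realtime_feed": True})
before = render_homepage(flags)

flags.disable("realtime_feed")  # e.g. while the servers are melting
after = render_homepage(flags)
```

The point of the design is that the misbehaving part can be switched off without a rollback or a new deploy.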

  93. Tests
    Unit
    Functional
    Integration
    Acceptance
    etc

  94. Automated Tests
    • If you want to trust your Devs, you need tests
    • For legacy apps, we wrote Integration Tests
    • New Apps & Refactored Legacy Parts have Unit
    Tests
    • Continuous Integration to make sure tests run

  96. So, where’s the salt?

  97. Step #4: Devs
    Use The Ops Tools

  98. Devs can grok salt

  99. Safe Environment
    For Devs to Learn

  100. Safe Environment
    For Devs to Learn
    salt \* cmd.run "rm -rf /tmp /*"

  101. Safe Environment
    For Devs to Learn
    salt \* cmd.run "rm -rf /tmp /*"
    Salt is awesome, but it can’t
    recover from that

  102. Dev Salt Master
    Devs Can Look Into Every Server

  103. Dev Salt Master
    • Every server has two minions:
    • Admin Salt (aka root)
    • Dev Salt (aka bob)
    • Each connects to a different master server:
    • All Devs have access to Dev Salt Master
    • Trusted Devs get access to Admin Salt Master
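The two-minion layout and its access rules can be modeled roughly as below. This is only a sketch of the policy described on the slide: the master names, the server name, and the function names are hypothetical, and no real Salt configuration is shown.

```python
DEV_MASTER = "dev-salt-master"      # hypothetical name
ADMIN_MASTER = "admin-salt-master"  # hypothetical name


def minions_for(server):
    """Each server runs two minions: one as root, one as an unprivileged user."""
    return [
        {"server": server, "run_as": "root", "master": ADMIN_MASTER},
        {"server": server, "run_as": "bob", "master": DEV_MASTER},
    ]


def masters_for(is_trusted):
    """All devs reach the Dev master; only trusted devs also reach the Admin master."""
    masters = [DEV_MASTER]
    if is_trusted:
        masters.append(ADMIN_MASTER)
    return masters


def can_run_as_root(is_trusted):
    """A dev can act as root on a server only through the Admin master."""
    return ADMIN_MASTER in masters_for(is_trusted)
```

With this split, a mistaken command sent through the Dev master runs as the unprivileged user, which limits the damage a learner can do.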

  104. Reminder:
    Everything Salty in Git

  105. Dev Environment
    • Developers own the Dev Environment
    • Dev Teams manage the Salt States for their Env
    • Vagrant + Salt for their Env
    • Who makes changes? Developers
    • DevOp helps advise & offer support

  106. Team Structure
    Dev Ops

  107. Stage Environment
    • Stage & Production use same salt repos, different
    branches
    • Developers make all the changes for Application
    Servers
    • All Changes through Pull Requests
    • We’ll worry about env changes before code
    • We release small changes quickly; large or long-running
    branches are scary & dangerous

  108. Production Environment
    • Merge change to Production Branch
    • salt \* state.highstate
    • Reminder: Small quick changes over time, never
    a large change at once.
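The highstate step could be wrapped in a small script like this sketch, assuming it runs on the Salt master with the `salt` CLI on the PATH. The function names are hypothetical. Passing `test=True` asks Salt to report what would change without applying it, which is a useful check before the real run.

```python
import subprocess


def highstate_command(target="*", dry_run=False):
    """Build the argv for `salt <target> state.highstate`.

    The target is passed as a list element, so no shell escaping of * is needed.
    """
    cmd = ["salt", target, "state.highstate"]
    if dry_run:
        cmd.append("test=True")  # report pending changes only
    return cmd


def apply_highstate(target="*", dry_run=True):
    """Run the command and return its exit code; defaults to a dry run for safety."""
    return subprocess.call(highstate_command(target, dry_run))


# Hypothetical usage after merging to the production branch:
#   apply_highstate(dry_run=True)   # review what would change
#   apply_highstate(dry_run=False)  # apply it
```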

  109. Environment Caveats
    • Ops & DevOps manage VM Hosts, Physical Load
    Balancers, Firewalls, etc
    • Ops & DevOps manage servers that deal with
    data:
    • MySQL
    • MongoDB
    • etc

  110. Mentoring Devs

  111. Mentoring Devs
    • Not every Dev will become an amazing DevOp
    • That’s okay!

  112. Level of “DevOps” Skills
    • Thinks about their impact on Ops: Everyone
    • Able to debug issues with production: Most
    • Able to make changes to environments: Many
    • “Awesome DevOp”: Some

  113. Team’s “DevOps” Skills
    [Bar chart: y-axis 0 to 100; categories Think, Debug, Change, DevOp; series Current and Goal]

  114. So Everything Is
    Awesome for us, right?

  115. Honesty Slide: We Have
    Skeletons In Our Closets

  116. Where We Are At
    • All Dev Environments using Vagrant + Salt
    • All New Stage & Prod Environments are Salty
    • Some Legacy Stage & Production Envs are Salty
    • Continuously working on getting our stuff salty.

  117. Making This Work For Your Team

  118. Honest Introspection
    • Determine for your
    team what are your…
    • Strengths
    • Weaknesses
    • Problems
    • Goals

  119. Increase Team’s Insight
    • Make sure devs can see
    & understand how their
    code performs
    • Increase responsibility of
    team for those metrics.
    • If they break it, they fix it. Do not always bail them
    out.
    • Everyone can see
    everything.

  121. Mentor Those With
    Desire / Aptitude
    • Give Developers Safe
    Environment to Learn
    • Let them submit code-reviewed
    changes for Stage & Production
    • When teaching / mentoring, let
    the learner drive, kindly offer
    advice and help.
    • It takes time, but worth the
    investment.

  122. A Few Final Thoughts

  123. Team Culture Matters

  124. Positive Influence

  125. Questions?

  126. Thank You
    Justin Carmony
    Email: [email protected]
    Twitter: @JustinCarmony
    IRC: carmony #salt #uphpu
    Website: [email protected]

  127. p.s. we’re hiring, email / pm / tweet me
