
Mentoring Devs into DevOps - SaltConf 2014

How we're trying to increase DevOps among our Devs.

Justin Carmony

January 30, 2014

Transcript

  1. Mentoring Devs Into DevOps: Justin Carmony, Director of Development, Deseret Digital Media

  2. None
  3. Let's Measure The Audience
    • Who here is a…
    • System Administrator?
    • Developer?
    • Manager / Management?
    • “DevOp?”
  4. Confession: I’m a Developer

  5. None
  6. Self-Taught Ops Because There Was No One Else To Do It

  7. About Me
    • Director of Development for Deseret Digital Media
    • Utah PHP Usergroup President
    • I Make (and Break) Web Stuff (10 years)
    • Salt User in Production since 0.8 (I <3 Salt)
  8. This Presentation
    • Lessons learned at DDM & previous jobs
    • Insight into our process of increasing “DevOps”
    • We’re still learning, but this is what we’ve found.
    • Slides will be posted online, so don’t worry about copying slide content.
    • Feel free to ask on-topic questions during the talk, and we’ll have questions at the end.
  9. About DDM
    • Deseret Digital Media runs local websites like KSL.com, DeseretNews.com
    • Running National and International Websites like OK.com, familia.com.br, etc.
    • ~10 million pageviews a day across sites.
    • ~150 internal VMs, a few dozen physical machines, some AWS sprinkled around.
  10. Let's Start With a Story!

  11. You Work for an Awesome Tech Company

  12. Team Is Working Hard to Build New Things!

  13. You launch your awesome product!

  14. A Few More Features…

  15. … and next thing you know…

  16. None
  17. Awesome Job Team, We Rock!

  18. None
  19. We Need Real-Time XYZ Feature! ASAP!

  20. We Need Real-Time XYZ Feature! ASAP!

  21. We Need Real-Time XYZ Feature! ASAP! &#$%!

  22. None
  23. None
  24. “Huh, it works if you just turn off caching…” - Dev @ 80th Hour This Week

  25. None
  26. “I’m sure this will work…”

  27. None
  28. None
  29. None
  30. “Our servers are melting!”

  31. None
  32. None
  33. None
  34. None
  35. None
  36. None
  37. “We Need a Better Solution!”

  38. So…

  39. Where Do We Start?

  40. None
  41. None
  42. None
  43. None
  44. We Have This Problem

  45. Challenges We Faced
    • Giant mash-up of technologies
    • Tightly-coupled & fragile infrastructure
    • Debugging production-only bugs was difficult
    • Bugs that were part code, part environment were a nightmare to track down.
  46. None
  47. None
  48. So One Day… We Had A Genius Idea!

  49. Let's Hire a DevOp!

  50. I’m Not Joking, We Actually Said This

  51. Two Problems with this “Idea”

  52. Problem #1 - We Didn’t Understand What We Really Wanted

  53. Step 1: Hire a DevOp! Step 2: ????????????! Step 3: Profit!

  54. Step 1: Hire a DevOp! Step 2: ????????????! Step 3: Profit! Everything Works Perfectly!

  55. Problem #2 - People Who Are Great At Dev & Ops Are Hard To Find

  56. Expectation:

  57. Reality:

  58. Honest Team Discussion: What is it we’re really looking for?

  59. We Discovered a Few Things

  60. What does DevOps Mean To Us?
    • DevOps: Dev & Ops, a Culture of Collaboration
    • Our Goal: “10 deploys a day without issues”
    • Everyone shares the goal of quick development of features AND a stable system that stays up.
  61. Team Structure Devs: 30 Ops: 2

  62. Team Structure Devs: 30 Ops: 2 DevOps: 1

  63. Team Structure Devs: 30 Ops: 2 DevOps: 1

  64. Team Structure Devs: 30 Ops: 2 DevOps: 1. Hiring one person won’t just solve all our problems!

  65. Team Realizations
    • Hardest problem already solved: awesome team
    • No foreseeable rapid expansion, must operate at our current scale
    • Each Project’s Director of Development was acting as the bridge between Dev and Ops, but would become a bottleneck.
  66. Teams Already Had Some Ad-Hoc DevOps Tools
    - Real-time Logging
    - Capistrano Deploys
    - Nagios Alerts
    - Server Metrics
    - Puppet for File Mgmt
    - App Stats w/ Graphite
    - Graphite Dashboards
    - Salt for Cfg Management
    - Homebrewed Metrics Sys.
    - Homebrewed Alert System
  67. Teams Already Had Some Ad-Hoc DevOps Tools
    - Real-time Logging
    - Capistrano Deploys
    - Nagios Alerts
    - Server Metrics
    - Puppet for File Mgmt
    - App Stats w/ Graphite
    - Graphite Dashboards
    - Salt for Cfg Management
    - Homebrewed Metrics Sys.
    - Homebrewed Alert System
  68. Step 1: Hire a DevOp! Step 2: ????????????! Step 3: Profit! Everything Works Perfectly!

  69. Step 1: Hire a DevOp! Step 2: ????????????! Step 3: Profit! Everything Works Perfectly!

  70. We Formed A Strategy

  71. Step #1: Promote Dev to DevOp Role

  72. WAIT! Isn’t that the advice you just said was a bad idea?!

  73. DevOp Engineer
    • Well-Defined Role:
    • Ownership over the TOOLS to improve DevOps efforts.
    • Resource for other teams to help use DevOps Tools.
    • Easy to work with, aptitude for systems & ops, likes to try new things.
  74. Promoting From Within
    • A seasoned dev from your team already knows:
    • Your Pain Points
    • Your System’s Quirks
    • How the “Chaos Works”
    • The people & personalities on your team
  75. Step #2: Change Team Structure

  76. Team Structure Devs: 30 Ops: 2

  77. Team Structure Devs: 30 Ops: 2

  78. Team Structure Goal: Spread Out Expertise By Increasing Ops Experience & Skills Among Devs (diagram labels: Dev, Ops)

  79. Team Structure Goal: Spread Out Expertise By Increasing Ops Experience & Skills Among Devs (diagram labels: Dev, Ops)

  80. Team Structure Dev Ops

  81. Team Structure Dev Ops

  82. Increasing Ops Among Devs
    • Identify Devs who liked “Ops” & wanted to Learn
    • Pair Dev with Op / Director
    • The learning Dev does the work, not the Op / Director.
    • Pair program if needed.
  83. Step #3: Increase Everyone’s Insight

  84. Step #3: Increase Everyone’s Insight

  85. None
  86. Metrics
    • Everyone has access to Network, Server, and Application Metrics.
    • Consolidate & reduce places to look. We try to pipe everything to StatsD / Graphite
    • Each developer trained to add & track metrics in production.
    • We’re okay with 98% uptime of stats to avoid complexity.
  87. None
  88. Real-Time Logging

  89. Real-Time Logging
    • Harder & more complicated at scale
    • We’re still trying to solve this well; we have lots of logs.
    • Start with small window of data (e.g. 48 hours) and start to expand window.
    • We’re trying Logstash, ElasticSearch, and Kibana right now.
    • Generate Statistics off our Logs
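As a rough illustration of "generate statistics off our logs" over a small window, here is a sketch that counts log levels for entries from the last 48 hours. The log line format (`2014-01-30T11:00:00 ERROR message`) and the function name `count_levels` are assumptions made for this example; the talk does not describe DDM's log format, and a Logstash / ElasticSearch pipeline would do this differently.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_levels(lines, now, window=timedelta(hours=48)):
    """Count log levels for entries no older than `now - window`.

    Assumes each line starts with an ISO-like timestamp and a level,
    e.g. '2014-01-30T11:00:00 ERROR db timeout' (hypothetical format).
    Lines that do not parse are skipped.
    """
    cutoff = now - window
    counts = Counter()
    for line in lines:
        parts = line.split(" ", 2)
        if len(parts) < 2:
            continue
        try:
            stamp = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            continue
        if stamp >= cutoff:
            counts[parts[1]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2014-01-30T11:00:00 ERROR db timeout",
        "2014-01-29T13:00:00 INFO request served",
        "2014-01-27T10:00:00 ERROR too old to count",
    ]
    print(count_levels(sample, now=datetime(2014, 1, 30, 12, 0, 0)))
```

Widening the window later is just a matter of passing a larger `window`, which mirrors the "start small, then expand" advice on the slide.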
  90. Tracking Changes
    • Everything, everything, everything in git (we use GitHub)
    • Everyone has access to all repos
    • Everyone does work through Pull Requests
    • Everyone has their work code reviewed*
    * You can merge w/o a review, but must be willing to defend your choice
  91. Deploys

  92. Everyone Can Deploy
    • Automated our deployment process to a single step.
    • Everyone can deploy, deployments are logged
    • Easy rollback is a requirement!
    • Implementing feature flags to turn off single parts of our application.
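The feature-flag idea can be sketched in a few lines. This is an illustrative, in-memory version only: the `FeatureFlags` class, the `realtime_comments` flag, and `render_homepage` are hypothetical names, and the talk does not say how DDM stores or toggles its flags.

```python
class FeatureFlags:
    """Tiny in-memory flag store; unknown flags default to off."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name):
        return self._flags.get(name, False)

    def enable(self, name):
        self._flags[name] = True

    def disable(self, name):
        self._flags[name] = False


def render_homepage(flags):
    """Return the page sections to render, skipping any flagged-off feature."""
    sections = ["articles"]
    if flags.is_enabled("realtime_comments"):
        sections.append("realtime_comments")
    return sections


if __name__ == "__main__":
    flags = FeatureFlags({"realtime_comments": True})
    print(render_homepage(flags))
    flags.disable("realtime_comments")  # turn off one part without a redeploy
    print(render_homepage(flags))
```

Defaulting unknown flags to off means a half-finished feature stays dark until someone deliberately enables it.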
  93. Tests Unit Functional Integration Acceptance etc

  94. Automated Tests
    • If you want to trust your Devs, you need tests
    • For legacy apps, we wrote Integration Tests
    • New Apps & Refactored Legacy Parts have Unit Tests
    • Continuous Integration to make sure tests run
  95. None
  96. So, where’s the salt?

  97. Step #4: Devs Use The Ops Tools

  98. Devs can grok salt

  99. Safe Environment For Devs to Learn

  100. Safe Environment For Devs to Learn: salt \* cmd.run "rm -rf /tmp /*"

  101. Safe Environment For Devs to Learn: salt \* cmd.run "rm -rf /tmp /*" Salt is awesome, but it can’t recover from that

  102. Dev Salt Master Devs Can Look Into Every Server

  103. Dev Salt Master
    • Every server has two minions:
    • Admin Salt (aka root)
    • Dev Salt (aka bob)
    • Each connects to a different master server:
    • All Devs have access to Dev Salt Master
    • Trusted Devs get access to Admin Salt Master
  104. Reminder: Everything Salty in Git

  105. Dev Environment
    • Developers own the Dev Environment
    • Dev Teams manage the Salt States for their Env
    • Vagrant + Salt for their Env
    • Who makes changes? Developers
    • DevOp helps advise & offer support
  106. Team Structure Dev Ops

  107. Stage Environment
    • Stage & Production use same salt repos, different branches
    • Developers make all the changes for Application Servers
    • All Changes through Pull Requests
    • We’ll worry about env changes before code
    • Small changes we quickly release; large or long-running branches are scary & dangerous
  108. Production Environment
    • Merge change to Production Branch
    • salt \* state.highstate
    • Reminder: Small quick changes over time, never a large change at once.
  109. Environment Caveats
    • Ops & DevOps Manage VM Hosts, Physical Load Balancers, Firewalls, etc
    • Ops & DevOps manage servers that deal with data:
    • MySQL
    • MongoDB
    • etc
  110. Mentoring Devs

  111. Mentoring Devs
    • Not every Dev will become an amazing DevOp
    • That’s okay!
  112. Level of “DevOps” Skills
    • Thinks about their impact on Ops: Everyone
    • Able to debug issues with production: Most
    • Able to make changes to environments: Many
    • “Awesome DevOp”: Some
  113. Team’s “DevOps” Skills [bar chart, scale 0 to 100: Current vs. Goal for Think, Debug, Change, DevOp]

  114. So Everything Is Awesome for us, right?

  115. Honesty Slide: We Have Skeletons In Our Closets

  116. Where We Are At
    • All Dev Environments using Vagrant + Salt
    • All New Stage & Prod Environments are Salty
    • Some Legacy Stage & Production Envs are Salty
    • Continuously working on getting our stuff salty.
  117. Making This Work For Your Team

  118. Honest Introspection
    • Determine for your team what are your…
    • Strengths
    • Weaknesses
    • Problems
    • Goals
  119. Increase Team’s Insight
    • Make sure devs can see & understand how their code performs
    • Increase responsibility of team for those metrics.
    • If they break it, they fix it. Do not always bail them out.
    • Everyone can see everything.
  120. Increase Team’s Insight
    • Make sure devs can see & understand how their code performs
    • Increase responsibility of team for those metrics.
    • If they break it, they fix it. Do not always bail them out.
    • Everyone can see everything.
  121. Mentor Those With Desire / Aptitude
    • Give Developers Safe Environment to Learn
    • Let them submit code-reviewed changes for Stage & Production
    • When teaching / mentoring, let the learner drive, kindly offer advice and help.
    • It takes time, but it’s worth the investment.
  122. A Few Final Thoughts

  123. Team Culture Matters

  124. Positive Influence

  125. Questions?

  126. Thank You
    Justin Carmony
    Email: justin@justincarmony.com
    Twitter: @JustinCarmony
    IRC: carmony #salt #uphpu
    Website: justin@justincarmony.com
  127. p.s. we’re hiring, email / pm / tweet me