DevOps for Developers: Building an Effective Ops Org

The dirty little secret about DevOps is that everybody talks about what it means for operations teams, and hardly anybody talks about what it means for software engineers, which is possibly even more important!

Operations is not really a dedicated role; it is more like a social contract. Your ops org consists of all the skills, habits, tribal knowledge, and cultural values you have built up around delivering software, and every single engineer (plus execs and customer support) participates in it. Engineering teams tend to struggle when software engineers lack the operational skills they need to truly own their services, don't know how to learn those skills, and may not even understand why they should care.

So: how do you help your software engineers develop their ops muscles? How do you interview and hire for engineers who will enthusiastically co-create a vibrant ops culture? How do you identify and reward the heroes doing the silent, unsung work of paying down technical debt and shipping stable software? We'll talk about how to create the kind of tight feedback loops that help engineers improve their craft, prevent burnout, and encourage a healthy, fun, collaborative culture of operational excellence.

Charity Majors

April 29, 2016

Transcript

  1. Charity Majors @mipsytipsy DevOps for Developers

  2. @mipsytipsy engineer, cofounder, CTO

  3. DevOps for Developers: Why your software engineers need to get better at operations, and how to do it.
  4. “Dear operations people, learn to be more like software engineers.” Love, DevOps (2009-2016)
  5. “Dear software engineers: your turn. Time to get better at ops.” Love, Everyone in Tech
  6. This is not optional, this is not “nice-to-have”. This is table stakes.
  7. What is operations? Operations is the constellation of your org’s technical skills, practices, and cultural values around designing, building, and maintaining systems, shipping software, and solving problems with technology.
  8. Operations is a social contract.

  9. Do you need an “ops team”? ¯\_(ツ)_/¯ Do you need quality operations engineering skills and culture? YES.
  10. So you have an Ops Org …

  11. Your Mission
    1. Support your people in developing new skill sets
    2. Express institutional value (and mean it)
  12. Software engineers need to get better at ops. (And they should WANT TO!! Ops is like a superpower!!!)
  13. Developing new skill sets

  14. Engineers should be on call for their own services.

  15. Common protests:
    * learned helplessness
    * fear of breaking things
    * strategic incompetence
    * “my time is too valuable!”
  16. Corollary: on-call must not be hell.
    • Guard your people’s time and sleep
    • No hero complexes. No martyrs.
    • Don’t over-page. Align engineering pain with customer pain
    • Roll up non-urgent alerts for daytime hours
    • Your most valuable paging alerts are end-to-end checks on critical code paths.
  17. Software engineers should deploy their own code.

  18. Build guard-rails, not walls. Feedback needs to be fast to be effective.
  19. The most powerful weapon in your arsenal is always cause and effect.
  20. Pair your SWEs with ops/DBAs for debugging and on-call: “Cool! Let’s sit down and figure this out together, and I’ll show you how to do it next time!”
  21. Your eng teams should share the same review processes, tasks, and tools.
  22. Emphasize ops feedback in the early design phase.
    • What are the reliability requirements?
    • How do we distribute load or degrade gracefully?
    • Are we reusing components that are already known & supported as much as possible?
    • Who supports this service, how is it going to fail, and what are the ripple effects when it does?
    • What instrumentation and metrics will we need?
  23. (no text on this slide)
  24. The cost and pain of developing software is approximately zero compared to the operational cost of maintaining it over time. (h/t @mcfunley, “Choose Boring Technology”)
  25. Dear fellow ops/DBAs: BE NICE

  26. Creating Institutional Value

  27. How?
    • Interviewing
    • Promoting
    • Performance Reviews
    • Compensation

  28. Probe every software engineering candidate for their ops experience & attitude. … yep, even FE/mobile devs!
  29. Good operational questions for SWEs
    • “Tell me about the last time you caused a production outage.”
    • “What are your favorite tools for visibility, instrumentation, and debugging?”
    • “How would you design a deploy process?”
    • “You developed service $x, and latency is 5x higher today than yesterday. How do you start debugging the problem?”
    • “What happens when you type ‘google.com’ into a browser?”
  30. Good engineers should be able to communicate in great detail everything that SUCKS about their favorite technologies.
  31. Signals …
    • Do they expect the network to be reliable, disks to be fast, databases to respond, retries to succeed …?
    • How do they react to the idea of being on call for their own services?
    • Are they overly clever? Ugh.
  32. You are signaling … “Operations is valued here.”

  33. Performance reviews
    • Solicit regular feedback from peers, ops, support teams
    • Ask questions about relevant operational skills:
      • “Who would you most like to be paired with on call? Least?”
      • “Who do you ask for help when you’re completely stumped?”
      • “Whose code would you be least willing to maintain?”
    • Include this feedback every cycle; it should not be a surprise.
  34. Operations engineering is about making systems maintainable, reliable, and comprehensible. Senior software engineers should be reasonably good at these things. So if they are not, don’t promote them.
  35. It is much, much harder to recognize and reward operational excellence than shipping shiny features. You need to actively solicit this feedback by asking different questions.
  36. Your operational priorities must be clearly communicated by management, with the details left up to the engineers/teams.
  37. The patterns you call out and celebrate in your culture will get repeated.
  38. In conclusion …

  39. Yes, you need an ops team, IF you have hard operational problems. You should try to not have hard operational problems.
  40. Needing a dedicated operations engineering team is a sign of success. Good job!
  41. Useful links:
    • Bootstrapping a world-class ops team: www.heavybit.com/library/video/2015-02-24-charity-majors
    • Allspaw on blameless post mortems: https://codeascraft.com/2012/05/22/blameless-postmortems/
    • Choose boring technology: http://mcfunley.com/choose-boring-technology
    • DevOps Weekly: devopsweekly.com
    • SRE Weekly: sreweekly.com
  42. With special thanks to: Caitie McCaffrey, Mark Ferlatte, Mihasya (Pancakes), Bridget Kromhout, Dan McKinley
  43. Charity Majors @mipsytipsy
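
The on-call slide (16) suggests paging only when customers are hurting, treating end-to-end checks as the most valuable pages, and rolling up non-urgent alerts for daytime hours. One way that policy might be sketched in code is shown below. This is an illustration under stated assumptions, not anything from the talk: the `Alert` fields, the `route` function, the three routing labels, and the 09:00 to 18:00 weekday window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Alert:
    name: str
    customer_facing: bool     # hypothetical flag: does this reflect customer pain?
    end_to_end: bool = False  # hypothetical flag: end-to-end check on a critical path


# Assumed "daytime hours" window; pick whatever suits your team.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)


def route(alert: Alert, now: datetime) -> str:
    """Return 'page', 'notify', or 'queue_for_daytime' for an alert."""
    # Page a human only when customers are (or are about to be) in pain.
    if alert.customer_facing or alert.end_to_end:
        return "page"
    # Non-urgent alerts are surfaced right away only on weekdays during daytime.
    if now.weekday() < 5 and BUSINESS_START <= now.time() < BUSINESS_END:
        return "notify"
    # Otherwise roll them up for the next working day; nobody loses sleep.
    return "queue_for_daytime"


if __name__ == "__main__":
    three_am = datetime(2016, 4, 29, 3, 0)
    print(route(Alert("checkout end-to-end check failing", True, True), three_am))
    print(route(Alert("disk 70% full on batch host", False), three_am))
```

The design choice worth noticing is that the decision to wake someone depends on customer impact rather than on which component is unhappy, which is one reading of "align engineering pain with customer pain."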