Upgrade to Pro — share decks privately, control downloads, hide ads and more …

DevOps for Developers: Building an Effective Ops Org

DevOps for Developers: Building an Effective Ops Org

The dirty little secret about DevOps is that everybody talks about what it means for operations teams, and hardly anybody talks about what it means for software engineers. Which is possibly even more important!

Operations is not really a dedicated role, it is more like a social contract. Your ops org consists of all the skills, habits, tribal knowledge, and cultural values you have built up around delivering software, and every single engineer (and execs, and customer support) participates in this org. Engineering teams tend to struggle when software engineers don't have the operational skills they need to truly own their own services, and don't know how to learn those skills, and may not even understand why they should care.

So: how do you help your software engineers develop their ops muscles? How do you interview and hire for engineers who will enthusiastically co-create a vibrant ops culture? How do you identify and reward the heroes doing the silent, unsung work of paying down technical debt and shipping stable software? We'll talk about how to create the kind of tight feedback loops that help engineers improve their craft, prevent burnout, and encourage a healthy, fun, collaborative culture of operational excellence.

Charity Majors

April 29, 2016

More Decks by Charity Majors

Other Decks in Technology


  1. Why your software engineers need to get better at operations,

    and how to do it. DevOps for Developers:
  2. What is operations? Operations is the constellation of your org’s

    technical skills, practices, and cultural values around designing, building and maintaining systems, shipping software, and solving problems with technology.
  3. Do you need an “ops team”? Do you need quality

    operations engineering skills and culture? ¯\_(ϑ)_/¯ YES.
  4. Your Mission 1. Support your people in developing new skill

    sets 2. Express institutional value (and mean it)
  5. Software engineers need to get better at ops. (And they

    should WANT TO!! Ops is like a superpower!!!)
  6. Common protests: * learned helplessness * fear of breaking things

    * strategic incompetence * “my time is too valuable!”
  7. • Guard your people’s time and sleep • No hero

    complexes. No martyrs. • Don’t over-page. Align engineering pain with customer pain • Roll up non-urgent alerts for daytime hours • Your most valuable paging alerts are end-to-end checks on critical code paths. Corollary: on-call must not be hell.
  8. Pair your SWEs with ops/DBA for debugging, oncall “cool! let’s

    sit down and figure this out together, and I’ll show you how to do it next time!”
  9. Emphasize ops feedback in early design phase. What are the

    reliability requirements? How do we distribute load or degrade gracefully? Are we reusing components that are already known & supported as much as possible? Who supports this service, how is it going to fail, what are the ripple effects when it does? What instrumentation and metrics will we need?
  10. The cost and pain of developing software is approximately zero

    compared to the operational cost of maintaining it over time. h/t @mcfunley, “choose boring technology”
  11. • “Tell me about the last time you caused a

    production outage.” • “What are your favorite tools for visibility, instrumentation, and debugging?” • “How would you design a deploy process?” • “You developed service $x, and latency is 5x higher today than yesterday. How do you start debugging the problem?” • “What happens when you type “google.com” into a browser? Good operational questions for SWEs
  12. Good engineers should be able to communicate in great detail

    everything that SUCKS about their favorite technologies.
  13. Do they expect the network to be reliable, disks to

    be fast, databases to respond, retries to succeed … Signals … How do they react to the idea of being on call for their own services? Are they overly clever? Ugh.
  14. • Solicit regular feedback from peers, ops, support teams •

    Ask questions about relevant operational skills: • “Who would you most like to be paired with on call? Least?” • “Who do you ask for help when you’re completely stumped?” • “Whose code would you be least willing to maintain?” • Include this feedback every cycle, it should not be a surprise. Performance reviews
  15. Senior software engineers should be reasonably good at these things.

    So if they are not, don’t promote them. Operations engineering is about making systems maintainable, reliable, and comprehensible.
  16. You need to actively solicit this feedback by asking different

    questions. It is much, much harder to recognize and reward operational excellence than shipping shiny features.
  17. Yes, you need an ops team, IF you have hard

    operational problems. You should try to not have hard operational problems.
  18. • Bootstrapping a world-class ops team: • www.heavybit.com/library/video/2015-02-24-charity-majors • Allspaw

    on blameless post mortems • https://codeascraft.com/2012/05/22/blameless-postmortems/ • Choose boring technology: • http://mcfunley.com/choose-boring-technology • DevOps Weekly: devopsweekly.com • SRE Weekly: sreweekly.com Useful links: