Building a Culture of Reliability - SRECon EMEA 2017

Getting customers to care about Reliability is hard. Getting stakeholders to care about Reliability is harder. Getting the entire company to care about Reliability is even harder.

In this talk, I will cover steps that every leader in any organization can take to get more people to care about Reliability. Because Reliability is one of those things that people only notice when it goes in the wrong direction, it can be hard to show its value and why it is so important.

We will walk through cultural and management changes, metrics to watch and obsess over, and some tooling that can help along the way.

Video Available Here: https://www.youtube.com/watch?v=xH4FqAHeq08

Arup Chakrabarti

August 31, 2017

Transcript

  1. @arupchak A way to get your colleagues to behave the way you want them to without staring at them all the time
  2. @arupchak “Here is a graph of open File Descriptors going through the roof” -Frustrated Engineer
  3-8. @arupchak Individual Transaction Business [chart repeated on slides 3 through 8: $$$ per Minute, axis from $0 to $90, Monday through Friday; slide 7 adds a "$" label and slide 8 adds "$ €"]
  9. @arupchak Subscription Businesses • Cannot solely measure when you make money • Poor Reliability erodes trust and will cause you to lose revenue • Need to find something between how money is made and what customers care about
  10. @arupchak Distributed Operations Org • Sets expectations around availability of people • More small incidents over a single major incident • Builds empathy for why Reliability is hard
  11. @arupchak “If we just install Nagios, everything will be fine and all of our problems will be solved” -Arup in 2002
  12. @arupchak “We humans co-evolve with our tools. We change the tools, and the tools change us, and that cycle repeats.” -Jeff Bezos