Ops and Operability

Once Ada Lovelace invented programming, Jane Austen knew it wasn't going to end well unless she invented Operations. She proposed DevOps in the early 19th century in a series of coded stories with titles like "Support and Supportability".

DevOps is a synthesis of agile development practices (small releases, high automation, close collaboration) with the "keeping the lights on" rigour and discipline of the Operations Centre. For it to succeed, we need to learn to treat Ops as equals to Dev rather than as voiceless downstream consumers.

Daniel Terhorst-North

February 25, 2016

Transcript

  1. OPS AND OPERABILITY

  2. @tastapod Dan North

  3. @tastapod Imagine a company that treated its customers like this…

    - No consistency in its products… or even within a product
    - No clues when something breaks… or even that something has broken!
    - No warning that things are about to break
    - Blaming the customer when things go wrong
    - “Provided without warranty”
  4. @tastapod How can I fix it? “Provided without warranty”

    - What happened?
    - Can I fix it?
    - How long will it be broken?
    - Who can I escalate this to?
    - Am I just stuck with this?
    - Does anyone care?
  5. @tastapod This is how Dev treats Ops

    - DevOps starts with Dev pushing on Ops
    - Ops resists… efforts to compromise their governance, assurance, audit, compliance, control processes and structures
  6. @tastapod Ops is still on the hook for…

    - Runtime operations
    - SLAs
    - Diagnosis
    - Recovery
    - Restoration
    - Business continuity
  7. AUTOMATION & AUTONOMY

  8. @tastapod The downstream view of “autonomy”

    Let’s just push this to production
  9. @tastapod The downstream view of “autonomy”

  10. @tastapod The downstream view of “autonomy”

  11. @tastapod The downstream view of “autonomy”

  12. @tastapod Autonomy needs accountability

    How to resolve local autonomy and global consistency?
    “The Spotify problem”
  13. @tastapod Contextual Consistency: a pattern

    “Given the same context, and the same constraints, we are likely to make similar decisions”

    or

    “What’s the smallest amount of advice you can give me so I’m unlikely to screw this up?”
  14. SUPPORT & SUPPORTABILITY

  15. @tastapod Meet the Ops Team

    - You build it, you run it!
    - They have no understanding of monitoring
    - Developers should be on the support rota*

    *This isn’t always possible
  16. @tastapod How supportable is your application?

    Three magic questions for incident management:
    1. What happened?
    2. Who is impacted?
    3. How do we fix it?

    The real question:
    - How could we reduce the impact of this?

    MTTR trumps MTBF
    Imagine being paged at 4am for your error message
  17. @tastapod Captain’s Log: a pattern

    “Don’t tell me, let me figure it out”

    A log message should contain:
    - a timestamp, for humans and machines
    - a unique correlation ID, “edge-to-edge”
    - the cause, the whole cause, and nothing but the cause
    - answers to the three questions, or at least pointers

    A log is an append-only, read-only, user interface!
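One way the Captain's Log checklist might look in code is sketched below. This is only an illustration, not something from the talk: the function names `new_correlation_id` and `captains_log`, the JSON field names, and the choice of JSON itself are all assumptions. The entry carries a timestamp in two forms (ISO 8601 for humans, epoch milliseconds for machines), a correlation ID created once at the edge, the cause, and the three incident-management answers.

```python
import json
import uuid
from datetime import datetime, timezone


def new_correlation_id() -> str:
    """Create an ID once at the edge; pass it along on every downstream call."""
    return uuid.uuid4().hex


def captains_log(correlation_id, cause, what_happened, who_is_impacted,
                 how_to_fix, now=None):
    """Build one self-contained log line as a JSON string.

    Field names are hypothetical; the point is that every entry answers
    the three incident questions without the reader having to ask.
    """
    now = now or datetime.now(timezone.utc)
    entry = {
        "timestamp": now.isoformat(),             # for humans
        "epoch_ms": int(now.timestamp() * 1000),  # for machines
        "correlation_id": correlation_id,         # same ID edge-to-edge
        "cause": cause,                           # the underlying cause
        "what_happened": what_happened,
        "who_is_impacted": who_is_impacted,
        "how_to_fix": how_to_fix,                 # or a pointer to a runbook
    }
    return json.dumps(entry, sort_keys=True)


if __name__ == "__main__":
    print(captains_log(
        correlation_id=new_correlation_id(),
        cause="connection refused by the inventory database",
        what_happened="product search returned no results",
        who_is_impacted="all shoppers using search",
        how_to_fix="see the (hypothetical) runbook entry for inventory-db",
    ))
```

Because each line is a complete JSON object, the log stays append-only and can be read by a person at 4am or filtered by correlation ID with ordinary tooling.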
  18. PACKAGING

  19. It is a truth universally acknowledged, that a developer in possession of a build must be in want of a server
  20. @tastapod

    Automating deployment is one thing.
    Understanding the release process is another.
    Having something worth deploying is something else again!
  21. @tastapod Phone Home: a pattern

    Every component should heartbeat

    There are lots of options for this:
    - Broadcasting a UDP packet
    - Writing to a service registry
    - Sending a message

    A single packet can carry 1500 bytes
    - That’s a lot of information!

    {
      "name" : "product_search",
      "app" : "online_shop",
      "requires": ["other", "components"],
      "address": {
        "host": "10.0.0.135",
        "port": "1337"
      },
      "heartbeat": {
        "interval" : 500,
        "mia_interval": 5000
      },
      "config": {
        "git_revision" : "3ef82c",
        "deployed_from": "Dan's laptop",
        "deployed_by" : "Dan North",
        "deployed_on" : "2016-01-15 13:22:00"
      },
      "status": {
        "memory" : 80,
        "cpu_load": [4.92, 2.94, 2.14],
        "io_load" : 45,
        "disk" : 72
      },
      "rel": {
        "config": "/config",
        "status": "/status"
      }
    }
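The UDP broadcast option could be sketched roughly as follows. This is a minimal illustration rather than the speaker's implementation: the function names, the broadcast address and port 9999, and the reading of `interval` and `mia_interval` as milliseconds are all assumptions. The size guard uses 1472 bytes, which is what is left of a 1500-byte Ethernet frame payload after a 20-byte IPv4 header and an 8-byte UDP header.

```python
import json
import socket

# Assumption: IPv4 over Ethernet, 1500-byte MTU minus 20 (IPv4) and 8 (UDP).
MAX_UDP_PAYLOAD = 1472


def build_heartbeat(name, app, host, port, interval_ms=500, mia_interval_ms=5000):
    """Encode a heartbeat as compact JSON bytes that fit in one datagram."""
    message = {
        "name": name,
        "app": app,
        "address": {"host": host, "port": port},
        "heartbeat": {"interval": interval_ms, "mia_interval": mia_interval_ms},
    }
    payload = json.dumps(message, separators=(",", ":")).encode("utf-8")
    if len(payload) > MAX_UDP_PAYLOAD:
        raise ValueError(
            f"heartbeat is {len(payload)} bytes, limit is {MAX_UDP_PAYLOAD}"
        )
    return payload


def send_heartbeat(payload, broadcast_address="255.255.255.255", port=9999):
    """Broadcast one heartbeat datagram (address and port are hypothetical)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(payload, (broadcast_address, port))


def is_missing(last_seen_ms, now_ms, mia_interval_ms=5000):
    """Listener side: a component is missing in action after a long silence."""
    return now_ms - last_seen_ms > mia_interval_ms
```

A component would call `send_heartbeat(build_heartbeat(...))` every `interval` milliseconds, and whatever listens on the network would use something like `is_missing` to flag components that have gone quiet for longer than `mia_interval`.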
  22. OPS & OPERABILITY

  23. USE & USABILITY

  24. —Bill Buxton “User experience is the experience a user has”

  25. @tastapod Developers:

    - What does it feel like to build your software?
    - What does it feel like to deploy your software?
    - What does it feel like to test your software?
    - What does it feel like to release your software?
    - What does it feel like to monitor your software?
    - What does it feel like to support your software?
  26. @tastapod Ops engineers:

    - How can you help the developers help you?
    - How can you help them help themselves?
    - How can you “get out of their way”?
    - Where should you start?
  27. @tastapod Happy Ops!

    - Developers study how we work!
    - They look beyond their own apps
    - They are Devs thinking like Ops
    - They learn about release engineering!
    - They learn about security!
    - There should be a word for that…
  28. THE END