SLO Creation and You: Or, How We Learned to Stop Worrying and Love the Queue Length

Similar material to DevOps Days, but updated for ExploreTech 2018

Lisa Seelye

October 24, 2018

Transcript

  1. Lisa Seelye @thedoh SLO Creation and You Or: How We Learned to Stop Worrying and Love the Queue Length
  2. About Me
     Image: by @burst at https://www.pexels.com/photo/architecture-bridge-building-business-374754/ / CC0
  3. About Our Story
     Image: Empty Theater Seats by @jamie-fernandez-201894 at https://www.pexels.com/photo/empty-theater-seats-758976/ / CC0 / cropped
  4. Our Cast of Characters
     • Service Level Objectives (SLOs)
       ◦ Service Level Agreement (SLAs) as the evil twin
     • RabbitMQ
     • Service Ownership
     • New Service Owners (aka Developers and Product Owners)
  5. Act I: We Made an SLO
  6. Creating The SLO
     Image: Group Hand Fist Bump by @rawpixel.com at https://www.pexels.com/photo/group-hand-fist-bump-1068523/
  7. 99.95% Monthly Availability
     So… What do you think?
  8. Conspicuously Absent Objectives
     • Queue length
     • Queue consumer count
  9. The Reaction
     “We think you should have queue length as an SLO.”
  10. Act II: All About Queue Length
  11. Causes of long queues - Capacity
      Image: Rush Hour by msvg at https://www.flickr.com/photos/msvg/4476789745/ / CC-BY 2.0 / cropped
  12. Causes of long queues - Usage Spike
      Image from NewRelic.com for FreshBooks (used with permission)
  13. Causes of long queues - Buggy Deploy
      Image: Bugs by searleb at https://www.flickr.com/photos/searleb/3122477836/ / CC-BY 2.0 / cropped
  14. Causes of long queues - Normal Growth
      Image: Landscape Photography of Pavement Road by @jc-estrada-341132 at https://www.pexels.com/photo/landscape-photography-of-pavement-road-1046606/ / CC0 / cropped
  15. Causes of long queues - RabbitMQ Did It!
      Images from FreshBooks
  16. The Value of Queue Length
      • We might not have enough capacity
      • We might have a problem with the queue services
      • Is RabbitMQ well-provisioned?
  17. Act III: Asking Questions
  18. Key Goals
      • Quicker on-call alert handling
      • Easier capacity planning
      • Sets service expectations
      • Hold ourselves accountable
  19. Ask Direct Questions
      • Specific questions have specific answers
      • Usually service specific
      • “Why is that important”
  20. What To Ask? - Is My Service Healthy?
      Image: No more use... by smithser at https://www.flickr.com/photos/smithser/3434266313 / CC-BY 2.0 / cropped
  21. What To Ask? - Service Misuse?
      Image: DSC_1607 by justinbaeder at https://www.flickr.com/photos/justinbaeder/5317820857 / CC-BY-2.0 / cropped
  22. What To Ask? - Customer Wait Time?
      Image: White Pocket Watch With Gold-colored Frame on Brown Wooden Board by @iseeghoststoo at https://www.pexels.com/photo/white-pocket-watch-with-gold-colored-frame-on-brown-wooden-board-1010513/ / CC0 / cropped
  23. What To Ask? - Consumer Throughput?
      Image: The big queue at an ATM in Masalli, Azerbaijan by Ds02006 at https://commons.wikimedia.org/wiki/File:ATM_Masalli.jpg / Public Domain / cropped
  24. What To Ask? - Violating Queue Limits?
      Image: King's Highway 12 - Ontario by dougtone at https://www.flickr.com/photos/dougtone/9190014238/ / CC-SA 2.0 / cropped
  25. Wait, Did You Just Suggest Queue Length??
  26. We Have Our Questions … Now What?
      Image: Stealthy Cosmo by pargon at https://www.flickr.com/photos/pargon/2381366401 / CC-BY 2.0 / cropped
  27. Epilogue
  28. Looking Back
      • Unfamiliar with RabbitMQ instrumentation
      • Correlated queue length with problem
      • Alert fatigue :(
  29. Key Lessons
      • Old ways weren’t working
      • Be open to a discussion with people
  30. One Last Thing...
      • SLOs aren’t just for software - Think Customer Support
  31. The End
      SLOs sound cool? Learn More in Google’s SRE Book (Ch. 4) https://landing.google.com/sre/book.html
      Special thank you to FreshBooks for granting permission to give this talk.
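
The slides name a 99.95% monthly availability objective and later ask about customer wait time and consumer throughput, but they do not show the arithmetic. The sketch below is one possible way to work through both numbers; it is not taken from the talk. The 30-day month, the function names, and the example queue figures (1200 messages, 4 consumers, 5 messages per second each) are all assumptions chosen for illustration. Under those assumptions, 99.95% over a 30-day month leaves roughly 21.6 minutes of allowed downtime.

```python
def allowed_downtime_minutes(availability_pct: float, days_in_month: int = 30) -> float:
    """Minutes of downtime an availability target permits in one month.

    Assumes a fixed-length month (30 days by default); a real SLO would
    define its measurement window explicitly.
    """
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


def estimated_wait_seconds(queue_length: int, consumers: int,
                           msgs_per_consumer_per_sec: float) -> float:
    """Rough time for the newest message to reach a consumer.

    Hypothetical model: every consumer drains the queue at the same
    steady rate and nothing new arrives in the meantime.
    """
    throughput = consumers * msgs_per_consumer_per_sec
    if throughput <= 0:
        # No consumers (or stalled consumers): the queue never drains.
        return float("inf")
    return queue_length / throughput


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.95)
    print(f"Allowed downtime at 99.95%: {budget:.1f} minutes per 30-day month")

    # Illustrative numbers only, not measurements from the talk.
    wait = estimated_wait_seconds(queue_length=1200, consumers=4,
                                  msgs_per_consumer_per_sec=5.0)
    print(f"Estimated wait for a new message: {wait:.0f} seconds")
```

Framing queue length as an estimated wait, rather than as a raw message count, ties it back to the "Customer Wait Time?" question: the same queue length means something very different when the consumer count or per-consumer throughput changes.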