SLOs and You, or: How We Learned to Stop Worrying and Love the Queue Length

RabbitMQ and the services that use it serve as the backdrop to sharing how FreshBooks came up with some of its first service level objectives. Why are you taking this thing we care about? How else can we know if there are connectivity issues with RabbitMQ? How else can we know if the consumer doesn’t have enough capacity? What does queue length tell us, and what doesn’t it tell us? How do we move from single-metric “something could be wrong” indicators to more direct indications that something IS wrong?

Join us to hear the tale, in four parts, of how we helped service owners let go of RabbitMQ queue length and take a customer-centric approach to curating services’ SLOs, or how we learned to stop worrying and love the queue length.

Lisa Seelye

May 17, 2018

Transcript

  1. SLO Creation and You, Or: How We Learned to Stop Worrying and Love the Queue Length
    Lisa Seelye @thedoh, FreshBooks
  2. About Me
    FreshBooks.com Mission: Build a world class online accounting application to help small businesses better manage their finances.
    Image by @burst at https://www.pexels.com/photo/architecture-bridge-building-business-374754/ / CC0
  3. About Our Story
    Image: Empty Theater Seats by @jamie-fernandez-201894 at https://www.pexels.com/photo/empty-theater-seats-758976/ / CC0 / cropped
  4. Our Cast of Characters
    • Service Level Objectives (SLOs)
      ◦ Service Level Agreements (SLAs) as the evil twin
    • RabbitMQ
    • Service Ownership
    • New Service Owners (aka Developers and Product Owners)
  5. Causes of long queues - Capacity
    Image: Rush Hour by msvg at https://www.flickr.com/photos/msvg/4476789745/ / CC-BY 2.0 / cropped
  6. Causes of long queues - Usage Spike
    Image from NewRelic.com for FreshBooks (used with permission)
  7. Causes of long queues - Buggy Deploy
    Image: Bugs by searleb at https://www.flickr.com/photos/searleb/3122477836/ / CC-BY 2.0 / cropped
  8. Causes of long queues - Normal Growth
    Image: Landscape Photography of Pavement Road by @jc-estrada-341132 at https://www.pexels.com/photo/landscape-photography-of-pavement-road-1046606/ / CC0 / cropped
  9. It’s So Broad, What Value Do You Get?
    • We might not have enough capacity
    • There might be a problem with the workers
    • Is RabbitMQ well-provisioned?
  10. Key Goals
    • Pager response is quicker
    • Easier capacity planning
    • Sets service expectations
    • Hold ourselves accountable
  11. Ask Direct Questions
    • Specific questions have specific answers
    • Usually service specific
    • “Why is that important”
  12. Ok, What To Ask? - Is My Service Healthy?
    Image: No more use... by smithser at https://www.flickr.com/photos/smithser/3434266313 / CC-BY 2.0 / cropped
  13. Ok, What To Ask? - Service Misuse?
    Image: DSC_1607 by justinbaeder at https://www.flickr.com/photos/justinbaeder/5317820857 / CC-BY 2.0 / cropped
  14. Ok, What To Ask? - Customer Wait Time?
    Image: White Pocket Watch With Gold-colored Frame on Brown Wooden Board by @iseeghoststoo at https://www.pexels.com/photo/white-pocket-watch-with-gold-colored-frame-on-brown-wooden-board-1010513/ / CC0 / cropped
  15. Ok, What To Ask? - Consumer Throughput?
    Image: The big queue at an ATM in Masalli, Azerbaijan by Ds02006 at https://commons.wikimedia.org/wiki/File:ATM_Masalli.jpg / Public Domain / cropped
  16. Ok, What To Ask? - Violating Queue Limits?
    Image: King's Highway 12 - Ontario by dougtone at https://www.flickr.com/photos/dougtone/9190014238/ / CC-SA 2.0 / cropped
  17. We Have Our Questions … Now What?
    Image: Stealthy Cosmo by pargon at https://www.flickr.com/photos/pargon/2381366401 / CC-BY 2.0 / cropped
  18. A Look Back
    • Unfamiliar with RabbitMQ instrumentation
    • Correlated queue length with problem
    • Pager fatigue :(
  19. The End
    SLOs sound cool? Learn More in Google’s SRE Book (Ch. 4) https://landing.google.com/sre/book.html
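
The "Customer Wait Time?" question in the transcript suggests measuring how long a message waits before it is handled, rather than how many messages are queued. The following is a minimal sketch of how such an indicator might be computed, not the method FreshBooks used. The `ProcessedMessage` record, the 30-second threshold, and the 90% target are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ProcessedMessage:
    """Hypothetical record of one handled message (timestamps in seconds)."""
    published_at: float  # when the producer published the message
    finished_at: float   # when the consumer finished handling it


def wait_time_sli(messages: Sequence[ProcessedMessage],
                  threshold_seconds: float) -> float:
    """Fraction of messages handled within threshold_seconds of publishing."""
    if not messages:
        # No traffic means nothing was slow; treat the window as fully good.
        return 1.0
    good = sum(
        1 for m in messages
        if (m.finished_at - m.published_at) <= threshold_seconds
    )
    return good / len(messages)


def slo_met(messages: Sequence[ProcessedMessage],
            threshold_seconds: float,
            target: float) -> bool:
    """True when the measured fraction reaches the (assumed) SLO target."""
    return wait_time_sli(messages, threshold_seconds) >= target


# Illustrative data: three messages handled quickly, one after 38 seconds.
sample = [
    ProcessedMessage(published_at=0.0, finished_at=2.0),
    ProcessedMessage(published_at=1.0, finished_at=4.0),
    ProcessedMessage(published_at=2.0, finished_at=40.0),
    ProcessedMessage(published_at=3.0, finished_at=5.0),
]

sli = wait_time_sli(sample, threshold_seconds=30.0)
print(f"SLI: {sli:.2f}")  # SLI: 0.75
print("SLO met:", slo_met(sample, threshold_seconds=30.0, target=0.9))  # SLO met: False
```

An indicator of this shape answers a direct question (did customers wait too long?), whereas a raw queue length only hints that they might have.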