Ethical and Sustainable On-Call

Jon Daniel
August 13, 2015

Transcript

  1. Working Environment
    • 150+ developers
    • more apps than developers
    • some apps are owned by multiple teams
    • on-call rules/procedures differ by team
  2. My Team: Core Services
    • builds libraries and APIs essential to business needs
    • mostly Ruby, some Clojure
    • 8 developers + a few split between teams
  3. • all* developers included
    • rotation is 1 week
    • defend against interruption
    • no shipping features
    • flexibility when systems are stable
    * even junior developers once they get up to speed
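The one-week rotation that includes every developer can be sketched as a simple round-robin schedule. This is only an illustration under stated assumptions: the function name `weekly_rotation`, the placeholder developer names, and the start date are hypothetical and do not come from the talk, which does not say how the team actually builds its schedule.

```python
from datetime import date, timedelta
from itertools import cycle, islice


def weekly_rotation(developers, start, weeks):
    """Return (shift_start, developer) pairs, one developer per 1-week shift.

    Developers are assigned round-robin in the order given, so everyone
    on the list takes a turn before anyone repeats.
    """
    if not developers:
        raise ValueError("rotation needs at least one developer")
    schedule = []
    for index, developer in enumerate(islice(cycle(developers), weeks)):
        schedule.append((start + timedelta(weeks=index), developer))
    return schedule


# Hypothetical three-person team and an arbitrary start date (assumptions).
team = ["dev_a", "dev_b", "dev_c"]
for shift_start, developer in weekly_rotation(team, date(2015, 8, 17), 4):
    print(shift_start.isoformat(), developer)
```

A real schedule would also need to handle swaps, vacations, and the ramp-up period for junior developers mentioned in the footnote; those are left out of this sketch.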
  4. • finding and understanding failure cases
    • adding failover/error recovery where necessary
    • writing playbooks for handling outages
    • updating libraries and addressing deprecations
    • testing alerting and diagnostic tools
    • applying duct-tape when necessary but documenting for the future
  5. • check acks in paging system
    • check email/chat for alerts
    • offer help/advice if you are knowledgeable
    • own up to outages if you caused them
  6. • keep asking “is it fixed yet”
    • start blaming others
    • stop communication if you have important information
  7. Image Credits
    • Long Island Solar Farm - https://flic.kr/p/drsrm8
    • Hurricane Sandy power outage in Lower Manhattan, New York - https://flic.kr/p/dp4D78
    • Blackouts are fun - https://flic.kr/p/3oomWN
    • Linesman at work - https://flic.kr/p/7Z4TKU
    • Power pole replacement - https://flic.kr/p/oLo3Ye
    • Kudankulam nuclear power plant - https://flic.kr/p/dM66tE
    • Blueprint - https://www.flickr.com/photos/wscullin/3770015991
    • Power Plant - https://www.flickr.com/photos/52336371@N07/19638501728