Tracking Service Infrastructure at Scale

Talk from SRECon North America 2017 on tracking and automating service infrastructure at Shopify

John Arthorne

March 13, 2017

Transcript

  1. None
  2. None
  3. A startup building hosted software for commerce. Facing rapid growth in customers, RPM, devs, deploys
  4. Still growing fast, things on fire all the time. Production Engineering to the rescue!
  5. Still in “double all the things” mode. SRE mindset helped us get ahead of the growth. Concern is more about growth rate than actual #’s
  6. None
  7. None
  8. None
  9. None
  10. None
  11. None
  12. None
  13. None
  14. None
  15. None
  16. Collective: ownership in common; ability to deliver with high speed; works well in small teams; no specialized roles

    Authoritarian: no change without permission; bureaucratic, slow, safe; the norm in massive orgs; highly specialized roles

    Labels on the diagram: Shopify 2015, Shopify 2017
  17. None
  18. None
  19. None
  20. None
  21. Tier / Impact / Needs

    Tier 1 / Critical / Playbooks, defined SLO, resiliency patterns, DC failover, scheduled load tests, security reviews

    Tier 2 / Important / On call, monitoring with alerts, metrics instrumentation, dedicated DB, load tested, rolling deploy (preboot)

    Tier 3 / Useful / >1 owner, deploy automation, CI, standard dev setup, uptime monitor, bugsnag, log retention, backups, SSL

    Tier 4 / Experiments / Owner, Security bugs, resolve outages
  22. None
  23. None
  24. None
  25. None
  26. None
  27. None
  28. None
  29. Office Hours. Keep In Touch
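
The service tiers on slide 21 (Critical, Important, Useful, Experiments) lend themselves to a small data model. The following is a minimal sketch, not the tooling described in the talk: the names `TIER_NEEDS`, `Service`, `required_needs`, and `missing_needs` are hypothetical, and the idea that a more critical tier also inherits the needs of the less critical tiers is an assumption the slide does not state. The needs themselves are copied from the slide.

```python
from dataclasses import dataclass, field

# Needs per tier, copied from slide 21. Tier 1 is the most critical.
TIER_NEEDS = {
    1: ["playbooks", "defined SLO", "resiliency patterns", "DC failover",
        "scheduled load tests", "security reviews"],
    2: ["on call", "monitoring with alerts", "metrics instrumentation",
        "dedicated DB", "load tested", "rolling deploy (preboot)"],
    3: [">1 owner", "deploy automation", "CI", "standard dev setup",
        "uptime monitor", "bugsnag", "log retention", "backups", "SSL"],
    4: ["owner", "security bugs", "resolve outages"],
}


def required_needs(tier):
    """Return the needs for a tier.

    Assumption (not stated on the slide): needs are cumulative, so a
    tier 2 service must also meet the tier 3 and tier 4 needs.
    """
    if tier not in TIER_NEEDS:
        raise ValueError(f"unknown tier: {tier}")
    needs = []
    # Walk from the least critical tier (4) up to the requested tier.
    for t in range(4, tier - 1, -1):
        needs.extend(TIER_NEEDS[t])
    return needs


@dataclass
class Service:
    """Hypothetical record for one tracked service."""
    name: str
    tier: int
    satisfied: set = field(default_factory=set)


def missing_needs(service):
    """List the needs this service has not yet satisfied, in tier order."""
    return [n for n in required_needs(service.tier)
            if n not in service.satisfied]


if __name__ == "__main__":
    svc = Service(name="example-service", tier=3, satisfied={"owner", "CI"})
    for need in missing_needs(svc):
        print(f"{svc.name} is missing: {need}")
```

Keeping the tiers as plain data means a report of gaps per service can be generated mechanically, which matches the talk's theme of tracking and automating service infrastructure.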