In this talk, we’ll dive into the wonderful world of observability. We’ll first get to know what those famous SLIs and SLOs stand for and how we can define and utilize them for our application. Once we understand those concepts, we’ll look at how we can tie our alerting and monitoring together in a declarative manner.
Links:
Service Level Objectives: https://sre.google/sre-book/service-level-objectives/
Implementing SLOs: https://sre.google/workbook/implementing-slos/
Practical alerting: https://sre.google/sre-book/practical-alerting/
Availability table: https://sre.google/sre-book/availability-table/
Motivation for Error Budgets: https://sre.google/sre-book/embracing-risk/#xref_risk-management_unreliability-budgets
Experimental refactoring: https://www.youtube.com/watch?v=9MW4H6kFb7M
Alerting on SLOs: https://sre.google/workbook/alerting-on-slos/