social groups construct the material objects of their civilizations. The things made are socially constructed just as much as technically constructed. The merging of these two things, construction and insight, is sociotechnology” (Wikipedia)
If you change the tools people use, you can change how they behave and even who they are.
and reliably track down any new problem with no prior knowledge. For software engineers, this means being able to reason about your code, identify and fix bugs, and understand user experiences and behaviors ... via your instrumentation.
Exploratory, open-ended investigation based on raw events
• Service Level Objectives. No preaggregation.
• Based on arbitrarily-wide structured events with span support
• No indexes, schemas, or predefined structure
• Bundling the full context of the request across network hops
• Metrics != observability. Unstructured logs != observability.
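To make "arbitrarily-wide structured events with span support" concrete, here is a minimal sketch in plain Python. It is not any vendor's API: `WideEvent`, `handle_request`, and every field name (`cart_size`, `build_id`, and so on) are hypothetical, chosen only for illustration. The idea it tries to show is one event per unit of work, as many fields as you like, no schema declared up front, and a shared `trace_id` plus `parent_id` so the context stays bundled as work fans out.

```python
import json
import time
import uuid


class WideEvent:
    """Collects every field for one unit of work, then emits a single record."""

    def __init__(self, name, trace_id=None, parent_id=None):
        self.fields = {
            "name": name,
            "trace_id": trace_id or uuid.uuid4().hex,  # shared by the whole request
            "span_id": uuid.uuid4().hex[:16],          # unique to this unit of work
            "parent_id": parent_id,                    # None for the root span
        }
        self._start = time.monotonic()

    def add(self, **fields):
        # No schema: any key can be attached at any point in the request.
        self.fields.update(fields)

    def child(self, name):
        # A child span inherits the trace id and points back at its parent.
        return WideEvent(
            name,
            trace_id=self.fields["trace_id"],
            parent_id=self.fields["span_id"],
        )

    def finish(self):
        elapsed = time.monotonic() - self._start
        self.fields["duration_ms"] = round(elapsed * 1000, 3)
        return json.dumps(self.fields, sort_keys=True)


def handle_request(user_id, cart_items):
    """Hypothetical handler: returns the raw event lines it would emit."""
    event = WideEvent("checkout")
    event.add(user_id=user_id, cart_size=len(cart_items), build_id="hypothetical-build-42")

    db = event.child("db.query")
    db.add(table="orders", rows=len(cart_items))
    lines = [db.finish()]

    event.add(status=200)
    lines.append(event.finish())
    return lines


if __name__ == "__main__":
    for line in handle_request("user-123", ["book", "pen"]):
        print(line)
```

Because nothing is preaggregated, questions that nobody anticipated (for example, "which `build_id` values show slow `db.query` spans for large carts?") can still be asked of the raw events later.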
and technical debt
4. Predictable releases
5. Understand user behavior
https://www.honeycomb.io/wp-content/uploads/2019/06/Framework-for-an-Observability-Maturity-Model.pdf
Observability Maturity Model … find your weakest category, and tackle that first
it instrumenting two steps in front of you as you build
never accept a PR unless you can explain it if it breaks
watch your code go out as it deploys
is it working as intended? does anything look weird?
look through the lens of your instrumentation
mostly predictable failures
• Many monitoring checks/paging alerts
• "Flip a switch" to deploy, changes are big bang and binary (all on/all off)
• Failures to be prevented
• Production is to be feared
• Debug by intuition and scar tissue of past outages
• Canned dashboards, runbooks, playbooks
• Deploys are scary
• Masochistic on-call culture
sociotechnical causes & effects
Unknown-unknowns dominate
• Every alert is a novel question
• Rich, flexible instrumentation
• Few paging alerts, tied to SLOs and keying off user pain
• A deploy is just the beginning of gaining confidence in your code
• Failures are your friend
• Production is where your users live, you should be in there too, watching them every day
• Debug methodically by examining the evidence and following the clues
• Inspect the full context of the event
• Deploys are opportunities
• On-call must be sustainable, humane
sociotechnical causes & effects
Microservices
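One way to read "few paging alerts, tied to SLOs and keying off user pain" is as a single error-budget check computed from raw events, rather than many per-host threshold checks. The sketch below is an assumption-laden illustration, not a method taken from the talk: the definition of a "good" event (no server error, answered within 1000 ms), the 99.9% target, and the paging threshold of 10 are all hypothetical values, and `is_good`, `burn_rate`, and `should_page` are made-up names.

```python
def is_good(event, max_duration_ms=1000):
    """User pain, hypothetically: a server error or a slow response."""
    return event["status"] < 500 and event["duration_ms"] <= max_duration_ms


def burn_rate(events, slo_target):
    """Observed failure rate divided by the failure rate the SLO allows.

    1.0 means the error budget is being spent exactly on schedule;
    higher values mean it will run out early.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if not events:
        return 0.0
    bad = sum(1 for e in events if not is_good(e))
    allowed_failure_rate = 1.0 - slo_target
    return (bad / len(events)) / allowed_failure_rate


def should_page(events, slo_target=0.999, threshold=10.0):
    """Page a human only when the budget is burning far faster than planned."""
    return burn_rate(events, slo_target) >= threshold


if __name__ == "__main__":
    healthy = [{"status": 200, "duration_ms": 120}] * 980
    failing = [{"status": 503, "duration_ms": 40}] * 20
    window = healthy + failing
    print(burn_rate(window, 0.999), should_page(window))
```

The design choice here is that the alert asks one question ("are users hurting faster than we agreed to tolerate?") and leaves the "why" to open-ended investigation of the events themselves.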
and reason about them -- if we try, we'll be outcompeted by teams who use proper tools. Our systems are emergent and unpredictable. We need more than just your logical brain; we need your full creative self.