* **automating failure testing research at internet scale** https://people.ucsc.edu/~palvaro/socc16.pdf * data on the outside vs data on the inside http://cidrdb.org/cidr2005/papers/P12.pdf * pivot tracing http://sigops.org/sosp/sosp15/current/2015-Monterey/printable/122-mace.pdf ### articles * ok log https://peter.bourgon.org/ok-log/ * logs - 12 factor application https://12factor.net/logs * the problem with logging https://blog.codinghorror.com/the-problem-with-logging/ * logging v. instrumentation https://peter.bourgon.org/blog/2016/02/07/logging-v-instrumentation.html * logs and metrics https://medium.com/@copyconstruct/logs-and-metrics-6d34d3026e38 * measure anything, measure everything https://codeascraft.com/2011/02/15/measure-anything-measure-everything/ * metrics, tracing and logging https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html * monitoring and observability https://medium.com/@copyconstruct/monitoring-and-observability-8417d1952e1c * monitoring in the time of cloud native https://medium.com/@copyconstruct/monitoring-in-the-time-of-cloud-native-c87c7a5bfa3e * sre book https://landing.google.com/sre/book/index.html * distributed tracing at uber https://eng.uber.com/distributed-tracing/ * spigo and simianviz https://github.com/adrianco/spigo * observability: what’s in a name? https://honeycomb.io/blog/2017/08/observability-whats-in-a-name/ * wtf is operations? #serverless https://charity.wtf/2016/05/31/wtf-is-operations-serverless/ * event foo: what should i add to an event https://honeycomb.io/blog/2017/08/event-foo-what-should-i-add-to-an-event/