Systems Tracing Infrastructure” https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf
➔ Raja R. Sambasivan et al. “So, you want to trace your distributed system? Key design insights from years of practical experience” http://www.pdl.cmu.edu/PDL-FTP/SelfStar/CMU-PDL-14-102.pdf
➔ Monitoring in the time of cloud native https://medium.com/@copyconstruct/monitoring-in-the-time-of-cloud-native-c87c7a5bfa3e
➔ OK Log https://peter.bourgon.org/ok-log/
➔ Metrics, Tracing and Logging https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html
➔ Distributed Tracing at Uber https://eng.uber.com/distributed-tracing/
➔ Monitoring and Observability https://medium.com/@copyconstruct/monitoring-and-observability-8417d1952e1c
➔ Measure Anything, Measure Everything https://codeascraft.com/2011/02/15/measure-anything-measure-everything/
➔ The death of ops is greatly exaggerated https://medium.com/@copyconstruct/the-death-of-ops-is-greatly-exaggerated-ff3bd4a67f24
➔ Logs and Metrics https://medium.com/@copyconstruct/logs-and-metrics-6d34d3026e38
➔ Logs - 12 Factor Application https://12factor.net/logs
➔ Take OpenTracing for a HotRod Ride https://medium.com/opentracing/take-opentracing-for-a-hotrod-ride-f6e3141f7941
➔ The Problem with Logging https://blog.codinghorror.com/the-problem-with-logging/
➔ Logging v. Instrumentation https://peter.bourgon.org/blog/2016/02/07/logging-v-instrumentation.html
➔ SRE Book https://landing.google.com/sre/book/index.html
➔ Canopy: An End-to-End Performance Tracing And Analysis System http://cs.brown.edu/~jcmace/papers/kaldor2017canopy.pdf
➔ Peter Alvaro et al. “Lineage-Driven Fault Injection” https://people.eecs.berkeley.edu/~palvaro/molly.pdf
➔ Vizceral Open Source - Netflix Techblog https://medium.com/netflix-techblog/vizceral-open-source-acc0c32113fe

References