Observing Distributed Systems - PeruJUG

Jorge Quilcate

December 16, 2017

  1. Jorge Quilcate

    Peruvian in Norway
    Software Engineer at Sysco AS, part of the Middleware team
    Starting my journey in Distributed Systems
    Open-source contributor, Apache Kafka project
    Oracle ACE Associate
    jeqo.github.io | github.com/jeqo | @jeqo89
  2. Logging

    ➔ Discrete events: `+load => +logs` (more load means more logs)
    ➔ Log actionable events: peter.bourgon.org/blog/2016/02/07/logging-v-instrumentation.html
    ➔ Easy to add, hard to manage: blog.codinghorror.com/the-problem-with-logging/
    ➔ Do not try to manage logs as part of your application: 12factor.net/logs
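A minimal sketch of the 12factor advice above, assuming Python's standard `logging` module: the application writes one JSON event per line to stdout and leaves routing and storage to the environment. `JsonFormatter`, `build_logger`, and the `orders` logger name are hypothetical names chosen for illustration.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON event."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name, stream=None):
    """Return a logger that only writes events to a stream (stdout by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]   # no files, no rotation: that is the platform's job
    logger.setLevel(logging.INFO)
    logger.propagate = False      # avoid duplicate output through the root logger
    return logger


log = build_logger("orders")
log.info("order accepted")
```

A forwarder such as the ones on the next slides can then pick the events up from stdout without the application knowing where they end up.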
  3. OK Log

                                      --> ingestor . . . store
                                      |
    service --(stdout)--> forwarder --|--> ingestor . . . store
                                      |
                                      --> ingestor . . . store

    OK Log: Distributed and Coördination-Free Logging - Peter Bourgon https://www.youtube.com/watch?v=gWWK2eyZ-sc
  4. FluentD/Fluent-bit

                                       --> fluentd . . . store
              (files)                  |
    service --(stdout)--> fluent-bit --|--> fluentd . . . store
              (docker)                 |
                                       --> fluentd . . . store
  5. Describing Load

    Twitter use case - Nov, 2016
    ➔ Post tweets:
      ◆ 4.6k requests/second on average
      ◆ 12k requests/second at peak
    ➔ Home timeline:
      ◆ 300k requests/second
    Handling a load of 12,000 writes/second would be fairly simple. The problem was not the volume of tweets but the fan-out: each user follows many people and is followed by many people.
    `#reads/sec = 25 * #writes/sec`
    Designing Data-Intensive Applications (Chapter 1) - Martin Kleppmann https://dataintensive.net
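The read/write ratio on this slide can be checked with a few lines of arithmetic. The figures are the ones quoted on the slide; `read_write_ratio` is a hypothetical helper, and dividing by the peak write rate is an assumption about how the factor of 25 was derived.

```python
def read_write_ratio(reads_per_sec, writes_per_sec):
    """Return how many reads the system serves for every write."""
    return reads_per_sec / writes_per_sec


HOME_TIMELINE_READS_PER_SEC = 300_000   # home timeline requests
PEAK_WRITES_PER_SEC = 12_000            # posted tweets at peak
AVERAGE_WRITES_PER_SEC = 4_600          # posted tweets on average

peak_ratio = read_write_ratio(HOME_TIMELINE_READS_PER_SEC, PEAK_WRITES_PER_SEC)
average_ratio = read_write_ratio(HOME_TIMELINE_READS_PER_SEC, AVERAGE_WRITES_PER_SEC)

print(f"reads per write at peak: {peak_ratio:.0f}")
print(f"reads per write on average: {average_ratio:.1f}")
```

Against the peak write rate the ratio is 25; against the average write rate it is higher still, which is why the read path dominates the design.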
  6. OpenTracing

    ➔ Based on the “Dapper” paper
      ◆ Used in most systems at Google
      ◆ Follows an annotation-based approach, as opposed to a black-box approach
    ➔ “Just an API”
    ➔ `Trace = DAG[Span]`
    DAG: Directed Acyclic Graph, a.k.a. Tree
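One way to picture `Trace = DAG[Span]` is a set of spans that each point at their parent. The sketch below uses hypothetical names (`Span`, `children_by_parent`, `render`, and the operation names); it is not the OpenTracing API, and it only models the tree-shaped case the slide mentions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    operation: str
    parent_id: Optional[str] = None  # None marks the root span of the trace


def children_by_parent(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    """Index spans by the id of their parent."""
    index: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        index.setdefault(span.parent_id, []).append(span)
    return index


def render(spans: List[Span]) -> List[str]:
    """Walk the trace from the root and indent each span under its parent."""
    index = children_by_parent(spans)
    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in index.get(parent_id, []):
            lines.append("  " * depth + span.operation)
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


trace = [
    Span("1", "GET /checkout"),
    Span("2", "auth.verify", parent_id="1"),
    Span("3", "orders.create", parent_id="1"),
    Span("4", "db.insert", parent_id="3"),
]
print("\n".join(render(trace)))
```

Each span only needs to know its parent, so services can emit spans independently and the tracing backend can assemble the trace afterwards.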
  7. OpenTracing

    [Figure: service process diagram. Labels: OpenTracing API, application logic, µ-service frameworks, control-flow packages, RPC frameworks, existing instrumentation, tracing infrastructure, main(), TRACER, Jaeger]
    OpenTracing Isn't just Tracing: Measure Twice, Instrument Once - Ben Sigelman https://www.youtube.com/watch?v=NyySNe6Rr_g
  8. What happens when we take something stinky and increase its surface area?

    [Figure label: MONOLITH]
    Engineering you - Martin Thompson https://www.youtube.com/watch?v=S4LzzuMTqjs&t=1177s
  9. Challenges and Opportunities with OpenTracing

    ➔ Adoption, compatibility, and level of conformance with the API (Gitter)
    ➔ Access, scope, and level of granularity for different scenarios (Canopy)
  10. Lineage-Driven Fault Injection

    Orchestrating Chaos: Applying Database Research in the Wild - Peter Alvaro https://www.youtube.com/watch?v=YplkQu6a80Q
  14. References

    ➔ Benjamin Sigelman et al. - “Dapper, a Large-Scale Distributed Systems Tracing Infrastructure” https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf
    ➔ Raja R. Sambasivan et al. - “So, you want to trace your distributed system? Key design insights from years of practical experience” http://www.pdl.cmu.edu/PDL-FTP/SelfStar/CMU-PDL-14-102.pdf
    ➔ Monitoring in the time of cloud native https://medium.com/@copyconstruct/monitoring-in-the-time-of-cloud-native-c87c7a5bfa3e
    ➔ OK Log https://peter.bourgon.org/ok-log/
    ➔ Metrics, Tracing and Logging https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html
    ➔ Distributed Tracing at Uber https://eng.uber.com/distributed-tracing/
    ➔ Monitoring and Observability https://medium.com/@copyconstruct/monitoring-and-observability-8417d1952e1c
    ➔ Measure Anything, Measure Everything https://codeascraft.com/2011/02/15/measure-anything-measure-everything/
    ➔ The death of ops is greatly exaggerated https://medium.com/@copyconstruct/the-death-of-ops-is-greatly-exaggerated-ff3bd4a67f24
    ➔ Logs and Metrics https://medium.com/@copyconstruct/logs-and-metrics-6d34d3026e38
    ➔ Logs - 12 Factor Application https://12factor.net/logs
    ➔ Take OpenTracing for a HotRod Ride https://medium.com/opentracing/take-opentracing-for-a-hotrod-ride-f6e3141f7941
    ➔ The Problem with Logging https://blog.codinghorror.com/the-problem-with-logging/
    ➔ Logging v. Instrumentation https://peter.bourgon.org/blog/2016/02/07/logging-v-instrumentation.html
    ➔ SRE Book https://landing.google.com/sre/book/index.html
    ➔ Canopy: An End-to-End Performance Tracing And Analysis System http://cs.brown.edu/~jcmace/papers/kaldor2017canopy.pdf
    ➔ Peter Alvaro et al. - Lineage-Driven Fault Injection https://people.eecs.berkeley.edu/~palvaro/molly.pdf
    ➔ Vizceral Open Source - Netflix Techblog https://medium.com/netflix-techblog/vizceral-open-source-acc0c32113fe