Release with Confidence - Observability for Microservices

In modern microservice environments, it's no longer enough to only collect telemetry; teams need actionable insights derived from that data in real time. In this session, Kevin Crawley of Instana demonstrates how modern organizations can understand the complexity of large microservice environments using a combination of distributed tracing, metrics, and logging. Kevin demonstrates application deployment in Kubernetes and how Instana combines and analyzes the data collected via all three pillars of observability to help developers and SREs understand the performance impact of those deployments.

Kevin Crawley

December 04, 2019

Transcript

  1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
    Release with Confidence - Observability for Microservices
    Kevin Crawley
    Session ID
    Developer Relations, Instana
  2. @notsureifkevin $> whoami
    • Developer Relations @ Instana
      ◦ Education / Awareness
      ◦ Product Focus on SRE topics
      ◦ Blogs / Talks / Webinars / etc.
    • Principal SRE @ Single Music
      ◦ Co-Owner and Consultant
      ◦ Built Delivery Systems and Manage Infrastructure
      ◦ Maintain Production Excellence
    • 20 years of software development experience
      ◦ Early Adoption of Docker (2014)
      ◦ Docker Captain
      ◦ GitLab Hero
  3. Discussion Points
    • What is Observability
    • What is Distributed Tracing
    • Monitoring Landscape
    • Observability In Action: Live Demo
  4. Observability Theory
    Kalman, 1961 paper "On the General Theory of Control Systems"
    • A system is observable if the behavior of the entire system can be determined by only looking at its inputs and outputs.
    • Lesson: control theory is a well-documented approach which people can understand and adopt
    https://en.wikipedia.org/wiki/Control_theory
  5. Observability should enable us to:
    • Identify Patterns
    • Assign Significance
    • Aid Reasoning
    • Guide Action
  6. Distributed Tracing Abstract
    In 2010, Google published a technical report on its distributed tracing project, named Dapper. The abstract summarizes why Google built Dapper in the first place:
    “Modern Internet services are often implemented as complex, large-scale distributed systems. These applications are constructed from collections of software modules that may be developed by different teams, perhaps in different programming languages, and could span many thousands of machines across multiple physical facilities. Tools that aid in understanding system behavior and reasoning about performance issues are invaluable in such an environment.”
    Google Technical Report dapper-2010-1, April 2010, p. 1 - https://ai.google/research/pubs/pub36356
  7. How to Visualize Distributed Interactions
    • Every transaction (HTTP, Messaging, RPC, etc.) has a custom header injected into it, which is intercepted and processed by a system of record
    • This is visualized with a Gantt chart that shows the hierarchical structure and timing of every transaction that occurred after the initial trace was generated
    Google Technical Report dapper-2010-1, April 2010, p. 3, fig. 2 - https://ai.google/research/pubs/pub36356
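
The header injection and Gantt-style view described on slide 7 can be sketched roughly as follows. This is a minimal illustration, not how Dapper or Instana implement it: the header names, the in-memory `collector` list (standing in for the system of record), and the timings are all assumptions made up for the example.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

# Hypothetical header names, chosen for illustration only.
TRACE_HEADER = "X-Demo-Trace-Id"
PARENT_HEADER = "X-Demo-Parent-Span-Id"


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int


def record_span(name, incoming_headers, start_ms, end_ms, collector):
    """Continue the trace found in the incoming headers, or start a new one."""
    trace_id = incoming_headers.get(TRACE_HEADER) or uuid.uuid4().hex
    parent_id = incoming_headers.get(PARENT_HEADER)  # None for the root span
    span = Span(trace_id, uuid.uuid4().hex[:16], parent_id, name, start_ms, end_ms)
    collector.append(span)  # stand-in for reporting to the system of record
    return span


def outgoing_headers(span):
    """Headers to inject into any downstream call made while handling `span`."""
    return {TRACE_HEADER: span.trace_id, PARENT_HEADER: span.span_id}


def render_gantt(spans, ms_per_char=10):
    """Draw one trace as text: indentation shows hierarchy, bars show timing."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            label = "  " * depth + span.name
            offset = " " * (span.start_ms // ms_per_char)
            bar = "#" * max(1, (span.end_ms - span.start_ms) // ms_per_char)
            lines.append(f"{label:<24}|{offset}{bar}")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return "\n".join(lines)


# Simulated request with made-up timings: gateway -> service -> database query.
collector = []
root = record_span("GET /owners", {}, 0, 120, collector)
service = record_span("customers-service", outgoing_headers(root), 10, 100, collector)
query = record_span("SELECT owners", outgoing_headers(service), 20, 90, collector)
chart = render_gantt(collector)
print(chart)
```

Each downstream call carries the trace ID unchanged and the caller's span ID as its parent, which is what lets the collector rebuild the hierarchy afterwards.
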
  8. We Need More Than Just Distributed Tracing
    • No longer treating services like Schrödinger's cat
    • Much more context around events and transactions
    • Actionable insights generated by aggregated request-scoped events
    https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html
  9. What Do We Get From All This Data?
    • A tremendous amount of telemetry which is perfect for:
      ◦ Aggregation
      ◦ SLIs/SLOs
      ◦ Machine Learning
      ◦ Performance Analysis
      ◦ Debugging
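
As one illustration of the aggregation and SLI/SLO uses listed on slide 9, the sketch below derives an availability SLI, a latency percentile, and the remaining error budget from request-scoped events. The `RequestEvent` shape, the sample data, and the 99% SLO target are assumptions for the example, not values from the talk.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestEvent:
    service: str
    duration_ms: float
    error: bool


def availability_sli(events):
    """Fraction of requests that completed without an error."""
    if not events:
        raise ValueError("no events to aggregate")
    good = sum(1 for e in events if not e.error)
    return good / len(events)


def latency_percentile(events, pct):
    """Nearest-rank percentile of request duration (pct is an integer 1..100)."""
    if not events:
        raise ValueError("no events to aggregate")
    durations = sorted(e.duration_ms for e in events)
    rank = max(1, math.ceil(pct * len(durations) / 100))
    return durations[rank - 1]


def error_budget_remaining(sli, slo_target):
    """Share of the error budget still unspent; negative means the SLO is breached."""
    budget = 1.0 - slo_target
    spent = 1.0 - sli
    return (budget - spent) / budget


# Made-up sample: 100 requests, 2 of them failed, durations from 10 ms to 1000 ms.
events = [
    RequestEvent("customers-service", 10.0 * i, error=(i % 50 == 0))
    for i in range(1, 101)
]
sli = availability_sli(events)
p95 = latency_percentile(events, 95)
remaining = error_budget_remaining(sli, slo_target=0.99)
```

With this sample the service misses a 99% availability target, so `remaining` comes out negative; an alert keyed on that value is one way aggregated events become actionable.
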
  10. We need machines to reconstruct this data so we can easily make decisions on how to react!
  11. Spring Pet Clinic - Architecture
    • Original was a monolith, refactored to microservices by the community
      ◦ Removed dependencies on Zuul, Hystrix, etc. to ease compatibility with K8S
      ◦ Added Notifications / Kafka Service
      ◦ Built a Load Testing Script
      ◦ Built Deploy pipelines for K8S / GitLab
    https://gitlab.com/opentracing-workshop/spring-petclinic-kubernetes
  12. Spring Pet Clinic - Problem?
    • The load script generates new customers while calling the endpoint that loads all customers
      ◦ https://gitlab.com/notsureifkevin/spring-petclinic-kubernetes/blob/master/scripts/spc-load/spc-load.py#L25-33
      ◦ https://gitlab.com/notsureifkevin/spring-petclinic-kubernetes/blob/master/scripts/spc-load/spc-load.py#L18
    • Pagination is non-existent (but we’ll deploy it)
      ◦ https://gitlab.com/kc_wrhse/spring-petclinic-kubernetes/commit/7b52d3fcdfe20945cbc53e2269b12be4191e2777
    • Live demonstration of how this is visualized and remediated using modern observability tools
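
To make the problem on slide 12 concrete: an endpoint that returns every customer gets slower as the load script keeps adding customers, while a paginated endpoint returns a bounded slice. The sketch below shows the general idea only; it is not the code from the linked commit, and the `paginate` helper, its parameters, and the sample data are hypothetical.

```python
def paginate(items, page, size=20):
    """Return one bounded page of `items` plus the metadata a client needs to page on."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be >= 1")
    start = (page - 1) * size
    total_pages = -(-len(items) // size)  # ceiling division
    return {
        "items": items[start:start + size],
        "page": page,
        "size": size,
        "total": len(items),
        "total_pages": total_pages,
    }


# Made-up data: 95 owners, as if the load script had been running for a while.
owners = [f"owner-{i}" for i in range(1, 96)]

first_page = paginate(owners, page=1)
last_page = paginate(owners, page=5)
past_the_end = paginate(owners, page=6)
```

The response size now depends on `size` rather than on how many customers exist, so latency for the listing call stays roughly flat as the data set grows.
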
  13. In Summary
    • Microservices are HARD (ask Segment), instrument your services so you can make these systems easier to understand and manage
    • Observability Tools should help you understand how your systems are performing without creating additional work for your team
    • Share your successes and lessons learned with the community!
  14. Thank you!
    Kevin Crawley
    Twitter: @notsureifkevin
    Visit our booth #511 or schedule some time with us http://bit.ly/instana-reinvent