Release with Confidence - Observability for Microservices

Kevin Crawley
Developer Relations, Instana
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

@notsureifkevin
$> whoami
• Developer Relations @ Instana
  ◦ Education / Awareness
  ◦ Product Focus on SRE topics
  ◦ Blogs / Talks / Webinars / etc.
• Principal SRE @ Single Music
  ◦ Co-Owner and Consultant
  ◦ Built Delivery Systems and Manage Infrastructure
  ◦ Maintain Production Excellence
• 20 years of software development experience
  ◦ Early Adoption of Docker (2014)
  ◦ Docker Captain
  ◦ GitLab Hero

Discussion Points
• What is Observability
• What is Distributed Tracing
• Monitoring Landscape
• Observability In Action: Live Demo

Observability Theory and Reasoning

Observability Theory
Kalman, 1961 paper "On the general theory of control systems"
• A system is observable if the behavior of the entire system can be determined by only looking at its inputs and outputs.
• Lesson: control theory is a well-documented approach which people can understand and adopt
https://en.wikipedia.org/wiki/Control_theory
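
For linear time-invariant systems, the definition above is commonly checked with the Kalman rank condition: a system with n states, state matrix A, and output matrix C is observable when the stacked matrix [C; CA; ...; CA^(n-1)] has rank n. The sketch below is only an illustration of that test and is not from the talk; the function names and the double-integrator example are assumptions chosen for clarity.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(m):
    """Rank via Gaussian elimination, using exact fractions to avoid rounding."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_observable(a, c):
    """Kalman rank test: stack C, CA, ..., CA^(n-1) and compare the rank to n."""
    n = len(a)
    stacked = []
    block = c
    for _ in range(n):
        stacked.extend(block)
        block = matmul(block, a)
    return rank(stacked) == n


# Illustrative system (an assumption): state = [position, velocity].
A = [[0, 1], [0, 0]]
print(is_observable(A, [[1, 0]]))  # output is the position
print(is_observable(A, [[0, 1]]))  # output is the velocity only
```

Measuring only the position is enough to recover the velocity over time, while measuring only the velocity never reveals the starting position, which is the intuition behind "determined by only looking at its inputs and outputs".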

Observability should enable us to:
• Identify Patterns
• Assign Significance
• Aid Reasoning
• Guide Action

Why Does My Organization Need Observability?

Distributed Tracing Abstract
In 2010, Google published a technical report on their distributed tracing project named Dapper. In the abstract, they summarized why they built Dapper in the first place:
“Modern Internet services are often implemented as complex, large-scale distributed systems. These applications are constructed from collections of software modules that may be developed by different teams, perhaps in different programming languages, and could span many thousands of machines across multiple physical facilities. Tools that aid in understanding system behavior and reasoning about performance issues are invaluable in such an environment.”
Google Technical Report dapper-2010-1, April 2010, p. 1 - https://ai.google/research/pubs/pub36356

How to Visualize Distributed Interactions
• Every transaction (HTTP, messaging, RPC, etc.) has a custom header injected into it, which is intercepted and processed by a system of record
• Once the initial trace is generated, it is visualized as a Gantt chart showing the hierarchical structure and timing of every transaction that occurred
Google Technical Report dapper-2010-1, April 2010, p. 3, fig. 2 - https://ai.google/research/pubs/pub36356
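
The two steps on this slide can be sketched in a few lines: each outgoing call carries the trace context in a header, and the collected spans are drawn as an indented, Gantt-style timeline. This is a minimal sketch, not the mechanism of any particular tracer; the header names, the `Span` fields, and the timings are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical header names, chosen for illustration only.
TRACE_HEADER = "X-Example-Trace-Id"
PARENT_HEADER = "X-Example-Parent-Span-Id"


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int


def inject(headers: Dict[str, str], span: Span) -> Dict[str, str]:
    """Return a copy of the outgoing headers carrying the trace context."""
    out = dict(headers)
    out[TRACE_HEADER] = span.trace_id
    out[PARENT_HEADER] = span.span_id
    return out


def extract(headers: Dict[str, str], name: str, start_ms: int, end_ms: int) -> Span:
    """Create the callee's span from incoming headers (new trace if none present)."""
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    return Span(trace_id, uuid.uuid4().hex[:8], headers.get(PARENT_HEADER),
                name, start_ms, end_ms)


def render_gantt(spans: List[Span], ms_per_col: int = 10) -> List[str]:
    """Render spans as text rows: indentation shows hierarchy, bars show timing."""
    children: Dict[Optional[str], List[Span]] = {}
    for s in spans:
        children.setdefault(s.parent_id, []).append(s)
    origin = min(s.start_ms for s in spans)
    rows: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for s in sorted(children.get(parent_id, []), key=lambda x: x.start_ms):
            offset = (s.start_ms - origin) // ms_per_col
            width = max(1, (s.end_ms - s.start_ms) // ms_per_col)
            rows.append(("  " * depth + s.name).ljust(24) + " " * offset + "#" * width)
            walk(s.span_id, depth + 1)

    walk(None, 0)  # root spans are the ones without a parent
    return rows


# Made-up request flow: gateway -> service -> database.
root = extract({}, "GET /owners", 0, 100)
service = extract(inject({}, root), "customers-service", 10, 80)
query = extract(inject({}, service), "SELECT owners", 20, 70)
for row in render_gantt([root, service, query]):
    print(row)
```

Because every hop copies the same trace ID and records its caller's span ID, the system of record can rebuild the call tree without the services knowing about each other.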

We Need More Than Just Distributed Tracing
• No longer treating services like Schrödinger's cat
• Much more context around events and transactions
• Actionable insights generated by aggregated request-scoped events
https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html

What Do We Get From All This Data?
• A tremendous amount of telemetry which is perfect for:
  ◦ Aggregation
  ◦ SLI/SLOs
  ◦ Machine Learning
  ◦ Performance Analysis
  ◦ Debugging
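
As one example of the aggregation and SLI/SLO items above, request-scoped events can be rolled up into an availability ratio and a latency percentile and then compared against a target. This is a rough sketch under assumed names: `RequestEvent`, its fields, and the thresholds are hypothetical and not taken from any specific tool.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class RequestEvent:
    service: str
    duration_ms: float
    ok: bool


def availability(events: List[RequestEvent]) -> float:
    """SLI: fraction of requests that succeeded."""
    return sum(1 for e in events if e.ok) / len(events)


def latency_percentile(events: List[RequestEvent], pct: int) -> float:
    """SLI: nearest-rank percentile of request duration."""
    durations = sorted(e.duration_ms for e in events)
    rank = max(1, math.ceil(pct * len(durations) / 100))
    return durations[rank - 1]


def meets_slo(events: List[RequestEvent], min_availability: float,
              p95_budget_ms: float) -> bool:
    """SLO check: both indicators must be within their targets."""
    return (availability(events) >= min_availability
            and latency_percentile(events, 95) <= p95_budget_ms)


# Made-up data: 20 requests, 10 ms to 200 ms, with one failure.
events = [RequestEvent("customers-service", 10.0 * i, ok=(i != 7))
          for i in range(1, 21)]
print(availability(events), latency_percentile(events, 95))
print(meets_slo(events, min_availability=0.95, p95_budget_ms=200))
```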

… or rather, a big ol' Data Lake
aHhghghhhhghhh nnnng...

We need machines to reconstruct this data so we can easily make decisions on how to react!

Pet Clinic Microservice Demo
Kubernetes, REST, Kafka

Spring Pet Clinic - Architecture
• Original was a monolith, refactored to microservices by the community
  ◦ Removed dependencies on Zuul, Hystrix, etc. to ease compatibility with K8S
  ◦ Added Notifications / Kafka Service
  ◦ Built a Load Testing Script
  ◦ Built Deploy pipelines for K8S / GitLab
https://gitlab.com/opentracing-workshop/spring-petclinic-kubernetes

Spring Pet Clinic - Problem?
• Load script generates new customers, while accessing the endpoint which loads all customers
  ◦ https://gitlab.com/notsureifkevin/spring-petclinic-kubernetes/blob/master/scripts/spc-load/spc-load.py#L25-33
  ◦ https://gitlab.com/notsureifkevin/spring-petclinic-kubernetes/blob/master/scripts/spc-load/spc-load.py#L18
• Pagination is non-existent (but we’ll deploy it)
  ◦ https://gitlab.com/kc_wrhse/spring-petclinic-kubernetes/commit/7b52d3fcdfe20945cbc53e2269b12be4191e2777
• Live demonstration of how this is visualized and remediated using modern observability tools
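
To see why this combination degrades over time, compare an endpoint that returns every customer with one that returns a bounded page. The sketch below does not reproduce the linked load script or the pagination commit; `list_all`, `list_page`, and the response fields are hypothetical stand-ins for the idea.

```python
from typing import Dict, List


def list_all(owners: List[Dict]) -> List[Dict]:
    """Unpaginated: the response grows with every customer the load script creates."""
    return list(owners)


def list_page(owners: List[Dict], page: int = 0, size: int = 20) -> Dict:
    """Paginated: at most `size` records per response, however many customers exist."""
    if page < 0 or size < 1:
        raise ValueError("page must be >= 0 and size must be >= 1")
    start = page * size
    return {
        "page": page,
        "size": size,
        "total": len(owners),
        "items": owners[start:start + size],
    }


# Simulate the load script: keep adding customers while reading the list.
owners: List[Dict] = []
for i in range(45):
    owners.append({"id": i, "name": f"customer-{i}"})

print(len(list_all(owners)))              # grows without bound
print(len(list_page(owners)["items"]))    # capped at the page size
```

With pagination in place, the cost of a single list request stays roughly constant as the data set grows, which is the kind of change the demo's traces should make visible.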

In Summary
• Microservices are HARD (ask Segment); instrument your services so you can make these systems easier to understand and manage
• Observability Tools should help you understand how your systems are performing without creating additional work for your team
• Share your successes and lessons learned with the community!

Thank you!
Kevin Crawley
Twitter: @notsureifkevin
Visit our booth #511 or schedule some time with us: http://bit.ly/instana-reinvent
