Slide 1

Slide 1 text

Let's talk Micrometer, Sleuth, and Tanzu Observability
Jonatan Ivanov, Marcin Grzejszczak, Tommy Ludwig
2022-03-31
Copyright © 2022 VMware, Inc. or its affiliates.

Slide 2

Slide 2 text

About Us
- Tommy Ludwig (@TommyLudwig): Spring Team @ VMware; Micrometer, “Spring Observability”
- Marcin Grzejszczak (@mgrzejszczak): Spring Team @ VMware; Micrometer, Spring Cloud Sleuth, “Spring Observability”, Spring Cloud Contract
- Jonatan Ivanov (@jonatan_ivanov): Spring Team @ VMware; Micrometer, Spring Cloud Sleuth, “Spring Observability”

Slide 3

Slide 3 text

Disclaimer This presentation may contain product features or functionality that are currently under development. This overview of new technology represents no commitment from VMware to deliver these features in any generally available product. Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new features/functionality/technology discussed or presented, have not been determined. The information in this presentation is for informational purposes only and may not be incorporated into any contract. There is no commitment or obligation to deliver any items presented herein.

Slide 4

Slide 4 text

Agenda
- Introduction to
  - Observability
  - Metrics (Micrometer)
  - Distributed Tracing (SC Sleuth)
- Tanzu Observability
- What’s new in Micrometer?
- What’s new in Spring Cloud Sleuth?
- Observation API
- Q&A

Slide 5

Slide 5 text

Intros Observability, Metrics, Distributed Tracing

Slide 6

Slide 6 text

What is Observability? How well we can understand the internals of a system based on its outputs

Slide 7

Slide 7 text

What is Observability?
- Being able to ask arbitrary questions without knowing ahead of time what you want to ask
- Turning data points and context into insights
- Being able to quickly troubleshoot problems with no prior knowledge (unknown unknowns)

Slide 8

Slide 8 text

Why do we need Observability?
- Today's systems are insanely complex (cloud) (Death Star Architecture, Big Ball of Mud)
- We need to face unknown unknowns
- We might not know where our apps are
- We might not know how many instances we have
- We can’t modify/debug/etc. them
- Something is always broken (Fallacies of Distributed Computing)
- Like sending rovers to Mars: you can’t touch/modify them after launch

Slide 9

Slide 9 text

Logging - Metrics - Distributed Tracing
Logging: What happened?
- Emitting events
- Easy to read (grep)
- INFO/WARN/ERROR/…
- Stacktraces
Metrics: What is the context?
- Measure-and-Combine data
- Aggregatable
- Can identify trends
- Not traffic-sensitive (usually)
Distributed Tracing: Why did it happen?
- Recording events with causal ordering
- Can identify cause across services
- Context Propagation (traceId, spanId)

Slide 10

Slide 10 text

Example: Latency
Logging
- “Processing a request took 140ms.”
- Is it bad? Is it good? What is the context?
Metrics
- “99.999% of the requests were faster than 140ms.”
- “The max was 150ms.”
- So it’s quite bad. But why was this slow?
Distributed Tracing
- “Service A called Service B.”
- “Service B called the DB.”
- “The services were ok.”
- “The network was ok.”
- “The DB was slow.”
- “Because somebody requested a lot of data.”
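The metrics statements in the latency example come from combining many individual recordings into a few aggregate numbers. A minimal sketch in plain Java, assuming a hypothetical `LatencySummary` helper and the nearest-rank percentile method; neither is taken from Micrometer, whose timers aggregate differently:

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Micrometer.
class LatencySummary {

    // Returns the value at percentile p (0 < p <= 100) using the nearest-rank method.
    static long percentile(long[] latenciesMs, double p) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Returns the largest recorded latency.
    static long max(long[] latenciesMs) {
        return Arrays.stream(latenciesMs).max().getAsLong();
    }

    public static void main(String[] args) {
        // Made-up sample data: mostly fast requests plus two slow outliers.
        long[] latenciesMs = {12, 15, 11, 140, 13, 14, 150, 12, 16, 13};
        System.out.println("p50 = " + percentile(latenciesMs, 50) + " ms");
        System.out.println("p90 = " + percentile(latenciesMs, 90) + " ms");
        System.out.println("max = " + max(latenciesMs) + " ms");
    }
}
```

The aggregate answers "is 140ms bad?" by giving the single log line a context, but it cannot say why the slow request was slow; that is where tracing takes over.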

Slide 11

Slide 11 text

Example: Error
Logging
- “Request processing failed.”
- “Here’s the stacktrace.”
- Is it bad? (Well, it failed.) How bad? How many of them failed? What is the context?
Metrics
- “The error rate is 0.001/sec.”
- “We had 2 errors recently.”
- So it’s not that bad. But why did this happen?
Distributed Tracing
- “Service A called Service B.”
- “Service B called the DB.”
- “The services were ok.”
- “The network was ok.”
- “The DB call failed.”
- “Because of invalid input.”

Slide 12

Slide 12 text

Logging with Spring: SLF4J + Logback
SLF4J with Logback comes pre-configured, but you can replace Logback
SLF4J
- Simple Logging Façade for Java
- Simple API for various logging frameworks
- Allows you to plug in the desired logging framework
Logback
- Modern logging framework
- Natively implements the SLF4J API
If you want Log4j2 instead of Logback:
- Exclude spring-boot-starter-logging and add spring-boot-starter-log4j2

Slide 13

Slide 13 text

Metrics with Spring: Micrometer
- Popular metrics library on the JVM
- Like SLF4J, but for metrics
- Simple API
- Supports the most popular metric backends
- Comes with spring-boot-actuator
- Spring projects are instrumented using Micrometer
- A lot of third-party libraries use Micrometer

Slide 14

Slide 14 text

Micrometer - Like SLF4J, but for metrics
Supported backends: AppOptics, Atlas, Azure Monitor, CloudWatch (AWS), Datadog, Dynatrace, Elastic, Ganglia, Graphite, Humio, InfluxDB, JMX, KairosDB, New Relic, OpenTSDB, Prometheus, SignalFx, Stackdriver (GCP), StatsD, Wavefront* (VMware), plus the Actuator endpoint (/actuator/metrics)
*VMware Tanzu Observability by Wavefront

Slide 15

Slide 15 text

Distributed Tracing with Spring: Spring Cloud Sleuth
- Distributed Tracing Support for Spring
- Provides an abstraction layer on top of tracing libraries (3.x)
  - Brave (OpenZipkin), default
  - OpenTelemetry (CNCF), experimental
- Log Correlation + Context Propagation
- Instrumentation for Spring Projects (and your application)
- Instrumentation for third-party libraries (through Brave and OTel)
- Supports various backends (through Brave and OTel)
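Log correlation and context propagation both rest on two identifiers: a trace id shared by every span in a request flow and a span id unique to each unit of work. A minimal sketch in plain Java of how those ids could relate across a call from one service to another. The `TraceContext` class, its methods, and the log prefix layout are hypothetical and only for illustration; in Sleuth this work is delegated to Brave or OpenTelemetry:

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical type for illustration; not Sleuth's, Brave's, or OpenTelemetry's API.
final class TraceContext {
    final String traceId;   // shared by every span in one request flow
    final String spanId;    // unique per unit of work
    final String parentId;  // null for the root span

    private TraceContext(String traceId, String spanId, String parentId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
    }

    // 64-bit identifier rendered as 16 lowercase hex characters.
    private static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // Starts a new trace; this sketch assumes the root span reuses its span id as the trace id.
    static TraceContext newRoot() {
        String id = newId();
        return new TraceContext(id, id, null);
    }

    // Continues the trace in a downstream call: same trace id, fresh span id.
    TraceContext child() {
        return new TraceContext(traceId, newId(), spanId);
    }

    // Prefix a log line could carry so that logs from different services can be joined by trace id.
    String logPrefix(String serviceName) {
        return "[" + serviceName + "," + traceId + "," + spanId + "]";
    }

    public static void main(String[] args) {
        TraceContext serviceA = TraceContext.newRoot();
        TraceContext serviceB = serviceA.child();
        System.out.println(serviceA.logPrefix("service-a") + " calling service-b");
        System.out.println(serviceB.logPrefix("service-b") + " handling request");
    }
}
```

Both printed lines share the same trace id, so searching the logs for that id returns the whole request flow across services.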

Slide 16

Slide 16 text

Tanzu Observability

Slide 17

Slide 17 text

start.spring.io

Slide 18

Slide 18 text

Generate a project (Spring Initializr link)
Dependencies:
● Web
● Actuator
● Wavefront (Tanzu Observability)
● Sleuth
Add a controller endpoint using RestTemplate to call another service.
Application properties:
spring.application.name=webinar-demo
Start the application, send some requests to the endpoint:
http :8080/hi

Slide 19

Slide 19 text

Demo
In the startup logs, there will be a link to Wavefront: “Connect to your Wavefront dashboard using this one-time use link:”
The log format is updated by Sleuth to include context (service name, trace/span id)
In Tanzu Observability, you can see:
● Spring Boot dashboard based on Micrometer metrics
● Tracing from Sleuth

Slide 20

Slide 20 text

What’s new in Micrometer?

Slide 21

Slide 21 text

Micrometer 1.9 - exemplars
- Exemplars provide high-cardinality examples for aggregate metric data (only supported with Prometheus and the OpenMetrics scrape format)
- Most common use case is correlating metrics and tracing
- For a metric summarizing many recordings, provide an exemplar trace
- Speeds up the process of finding a representative trace for an event of interest
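The idea of attaching one example trace to an aggregate can be sketched in plain Java: a histogram that, next to each bucket count, remembers the trace id of the last recording that landed in that bucket. The `ExemplarHistogram` class and its methods are hypothetical and are not how Micrometer or the Prometheus client implement exemplars:

```java
import java.util.Arrays;

// Hypothetical class for illustration; not Micrometer's or the Prometheus client's implementation.
class ExemplarHistogram {
    private final double[] upperBounds;       // sorted bucket boundaries ("le" values), in seconds
    private final long[] counts;              // per-bucket counts; the last slot is the +Inf bucket
    private final String[] exemplarTraceIds;  // last trace id seen per bucket, or null

    ExemplarHistogram(double... upperBounds) {
        this.upperBounds = upperBounds.clone();
        Arrays.sort(this.upperBounds);
        this.counts = new long[upperBounds.length + 1];
        this.exemplarTraceIds = new String[upperBounds.length + 1];
    }

    // Records one measurement and remembers the trace that produced it.
    void record(double seconds, String traceId) {
        int bucket = bucketIndex(seconds);
        counts[bucket]++;
        exemplarTraceIds[bucket] = traceId;
    }

    // Cumulative count of recordings at or below the bucket's upper bound.
    long cumulativeCount(int bucket) {
        long total = 0;
        for (int i = 0; i <= bucket; i++) {
            total += counts[i];
        }
        return total;
    }

    // Trace id of the most recent recording in this bucket, or null if there was none.
    String exemplarTraceId(int bucket) {
        return exemplarTraceIds[bucket];
    }

    private int bucketIndex(double seconds) {
        for (int i = 0; i < upperBounds.length; i++) {
            if (seconds <= upperBounds[i]) {
                return i;
            }
        }
        return upperBounds.length; // falls into the +Inf bucket
    }

    public static void main(String[] args) {
        ExemplarHistogram histogram = new ExemplarHistogram(0.1, 0.25, 0.5);
        histogram.record(0.05, "trace-1");
        histogram.record(0.21, "trace-2");
        histogram.record(2.0, "trace-3");
        for (int bucket = 0; bucket <= 3; bucket++) {
            System.out.println("bucket " + bucket + ": count=" + histogram.cumulativeCount(bucket)
                    + " exemplar=" + histogram.exemplarTraceId(bucket));
        }
    }
}
```

The counts stay cheap to aggregate, while the stored trace id gives a direct jump from an interesting bucket (for example the slowest one) to a concrete trace.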

Slide 22

Slide 22 text

Exemplars example
Timer with histogram, sample scrape snippet:
http :8080/actuator/prometheus 'Accept: application/openmetrics-text; version=1.0.0'
http_server_requests_seconds_bucket{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/people",le="0.20132659"} 0.0
http_server_requests_seconds_bucket{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/people",le="0.223696211"} 3.0 # {span_id="c1862c07a28f4920",trace_id="c1862c07a28f4920"} 0.208048291 1648532263.194
http_server_requests_seconds_bucket{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/people",le="0.246065832"} 3.0
http_server_requests_seconds_bucket{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/people",le="0.268435456"} 3.0
http_server_requests_seconds_bucket{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/people",le="0.357913941"} 4.0 # {span_id="fbdacc0ec219805b",trace_id="fbdacc0ec219805b"} 0.284813042 1648532262.316
http_server_requests_seconds_bucket{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/people",le="0.447392426"} 4.0
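Each bucket sample in the scrape output has the shape `name{labels} value`, optionally followed by `# {exemplar labels} exemplar-value timestamp`. A minimal sketch in plain Java that pulls the bucket count and the exemplar's trace id out of one such line. The `ExemplarLine` class and its regular expressions are hypothetical and cover only the simple line shape from the sample, not the full OpenMetrics grammar:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for illustration; handles only the simple sample shape from the slide.
class ExemplarLine {
    // name{labels} value [# {exemplarLabels} exemplarValue [timestamp]]
    private static final Pattern LINE = Pattern.compile(
            "(\\w+)\\{(.*?)\\} (\\S+)(?: # \\{(.*?)\\} (\\S+)(?: (\\S+))?)?");
    private static final Pattern TRACE_ID = Pattern.compile("trace_id=\"([^\"]+)\"");

    final String metric;
    final double value;
    final String traceId;        // null when the sample carries no exemplar
    final Double exemplarValue;  // null when the sample carries no exemplar

    private ExemplarLine(String metric, double value, String traceId, Double exemplarValue) {
        this.metric = metric;
        this.value = value;
        this.traceId = traceId;
        this.exemplarValue = exemplarValue;
    }

    static ExemplarLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized sample line: " + line);
        }
        String traceId = null;
        Double exemplarValue = null;
        if (m.group(4) != null) { // the optional exemplar part is present
            Matcher t = TRACE_ID.matcher(m.group(4));
            if (t.find()) {
                traceId = t.group(1);
            }
            exemplarValue = Double.valueOf(m.group(5));
        }
        return new ExemplarLine(m.group(1), Double.parseDouble(m.group(3)), traceId, exemplarValue);
    }

    public static void main(String[] args) {
        String line = "http_server_requests_seconds_bucket{method=\"GET\",uri=\"/api/people\",le=\"0.223696211\"} 3.0"
                + " # {span_id=\"c1862c07a28f4920\",trace_id=\"c1862c07a28f4920\"} 0.208048291 1648532263.194";
        ExemplarLine sample = parse(line);
        System.out.println(sample.metric + " count=" + sample.value
                + " exemplar trace=" + sample.traceId + " took " + sample.exemplarValue + "s");
    }
}
```

The extracted trace id is the link a dashboard can follow from a histogram bucket to one concrete request in the tracing backend.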

Slide 23

Slide 23 text

Exemplars visualization Sample visualization of the 95th percentile based on histogram metric data with exemplar tracing data (dots) https://grafana.com/docs/grafana/latest/basics/exemplars/

Slide 24

Slide 24 text

What’s new in Sleuth?

Slide 25

Slide 25 text

What’s new in Sleuth 3.1.0
- JDBC #1930
- Tomcat Valve #1329
- Spring Vault #1952
- Automatic tag table generation for documentation #1950
- Spring Cloud Deployer #1947
- R2DBC #1524
- Kafka #2013 and Reactor Kafka #1708
- Spring TX #1941

Slide 26

Slide 26 text

What’s new in Sleuth 3.1.0
- Spring Batch #1904
- RSocket #1677
- Spring Cloud Task #1903
- Spring Cloud Config #1915
- Spring Cloud CircuitBreaker Reactive #1910
- Cassandra #1974
- Spring Session #1961
- Spring Security #2011

Slide 27

Slide 27 text

What’s new in Sleuth 3.1.0
- Prometheus Exemplars #2039
- Spring Cloud Stream Reactive #2038
- Reactive Mongo #2044
- Abstracted Redis instrumentation #2046
- Custom Actuator for storing traces #1879

Slide 28

Slide 28 text

JDBC (with p6spy or datasource-proxy)

Slide 29

Slide 29 text

Tomcat Valve

Slide 30

Slide 30 text

Spring Vault

Slide 31

Slide 31 text

R2DBC

Slide 32

Slide 32 text

Kafka and Reactor Kafka

Slide 33

Slide 33 text

Spring TX

Slide 34

Slide 34 text

Spring Batch

Slide 35

Slide 35 text

RSocket

Slide 36

Slide 36 text

Spring Cloud Config

Slide 37

Slide 37 text

Spring Session

Slide 38

Slide 38 text

Spring Security

Slide 39

Slide 39 text

Observation API

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

Registry

Slide 42

Slide 42 text

Context

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

Observation

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

TagsProvider

Slide 49

Slide 49 text

No content

Slide 50

Slide 50 text

Observe concrete code

Slide 51

Slide 51 text

ObservationRegistry ObservationHandler Tracing Metrics ???

Slide 52

Slide 52 text

Instrument Once Benefit Multiple Times
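The "instrument once, benefit multiple times" idea can be sketched in plain Java: the observed code is wrapped a single time, and every registered handler (one standing in for metrics, one for tracing, and so on) is notified about its lifecycle. All names below (`DemoObservationRegistry`, `DemoHandler`, `observe`) are hypothetical and chosen for illustration; this is not the Micrometer Observation API, which the talk describes as still under development:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical names throughout; this is NOT the Micrometer Observation API.
class DemoObservationRegistry {

    // A handler reacts to the lifecycle of an observed operation.
    interface DemoHandler {
        void onStart(String name);
        void onStop(String name, long durationNanos);
    }

    private final List<DemoHandler> handlers = new ArrayList<>();

    DemoObservationRegistry register(DemoHandler handler) {
        handlers.add(handler);
        return this;
    }

    // Single instrumentation point: wrap the code once, notify every handler.
    <T> T observe(String name, Supplier<T> action) {
        handlers.forEach(h -> h.onStart(name));
        long start = System.nanoTime();
        try {
            return action.get();
        } finally {
            long duration = System.nanoTime() - start;
            handlers.forEach(h -> h.onStop(name, duration));
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        DemoObservationRegistry registry = new DemoObservationRegistry()
                .register(new DemoHandler() { // stands in for a metrics handler
                    public void onStart(String name) { }
                    public void onStop(String name, long nanos) { events.add("timer:" + name); }
                })
                .register(new DemoHandler() { // stands in for a tracing handler
                    public void onStart(String name) { events.add("span-start:" + name); }
                    public void onStop(String name, long nanos) { events.add("span-end:" + name); }
                });

        String result = registry.observe("greeting", () -> "hi");
        System.out.println(result + " " + events);
    }
}
```

Adding a third concern (the "???" on the earlier slide) would mean registering one more handler, without touching the instrumented code.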

Slide 53

Slide 53 text

Questions/Feedback?
Contact us on Twitter:
- @jonatan_ivanov - develotters.com
- @TommyLudwig - github.com/shakuzen
- @mgrzejszczak - toomuchcoding.com
© 2022 Spring. A VMware-backed project.