2024-10-01 dev2next - Observability for Modern JVM Applications

Jonatan Ivanov 2024-10-01 Observability for Modern JVM Applications

About Me
- Spring Team
- Micrometer
- Spring Cloud, Spring Boot
- Spring Observability Team
- Seattle Java User Group
- develotters.com
- @jonatan_ivanov

Gauge the audience
How many people are using:
• Observability in production?
• Spring Boot 3?
• Micrometer?
• Prometheus?
• OpenTelemetry?

What is Observability?

Various Opinions
• 3 pillars: Logging, Metrics, Distributed Tracing
• 4 pillars: + Events/Lineage(?)/Context/Metadata
• 6 pillars: + Profiles + Exceptions
• Arbitrary Wide Events, Signals
• But what about: /health, /info, etc.
• Service Registry/Discoverability, API Discoverability

What is Observability?
How well we can understand the internals of a system based on its outputs
(Providing meaningful information about what happens inside)
(Data about your app)

Why do we need Observability?

Why do we need Observability?
Today's systems are increasingly complex (cloud)
(Death Star Architecture, Big Ball of Mud)

Why do we need Observability?
• Environments can be chaotic
  - You turn a knob here a little and apps are going down there
• We need to deal with unknown unknowns
  - We can’t know everything
• Things can be perceived differently by observers
  - Everything is broken for the users but seems ok to you

Why do we need Observability? (business perspective)
• Reduce lost revenue from production incidents
• Lower mean time to recovery (MTTR)
• Require less specialized knowledge
• Shared method of investigating across system
• Quantify user experience
• Don't guess, measure!

Why do we need Observability? (Continuous Improvement)
Want to improve something?
• Measure it first!
• Resource utilization (number of instances, cpu, ram, io, etc.)?
• Throughput/latency (max.) patterns?
• Deployment frequency?
• Time to go live?
• Time to troubleshoot/recover?
• How often are you paged?

Why do we need Observability? (Advanced Capabilities)
• Chaos Engineering
• Anomaly Detection
• Feature flags
• A/B Testing
• Auto-tuning
• Adaptive Apps

Logging Metrics Distributed Tracing

Logging - Metrics - Distributed Tracing
• Logging: What happened (why)? Emitting events
• Metrics: What is the context? Aggregating data
• Distributed Tracing: Why did it happen? Recording causal ordering of events

Examples
Latency
• Logging (What?): Processing took 140ms
• Metrics (Context?): P99.999: 140ms, Max: 150ms
• Distributed Tracing (Why?): DB was slow (a lot of data was requested)
Error
• Logging (What?): Processing failed (stacktrace?)
• Metrics (Context?): The error rate is 0.001/sec, 2 errors in the last 30 minutes
• Distributed Tracing (Why?): DB call failed (invalid input)
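
To make the "Logging (What?)" vs "Metrics (Context?)" split concrete: a single "Processing took 140ms" log line is one sample, while Max and percentiles summarize many samples. Below is a minimal JDK-only sketch of that aggregation. The class name LatencySummary and the nearest-rank percentile method are assumptions for illustration; real metrics libraries typically use histograms rather than sorting raw samples.

```java
import java.util.Arrays;

/** Toy aggregation (hypothetical helper): turns individual latency samples into summary values. */
final class LatencySummary {

    /** Largest observed latency in milliseconds. */
    static long max(long[] samplesMs) {
        return Arrays.stream(samplesMs).max().orElseThrow();
    }

    /** Nearest-rank percentile; quantile must be in (0, 1], e.g. 0.99 for P99. */
    static long percentile(long[] samplesMs, double quantile) {
        if (samplesMs.length == 0 || quantile <= 0.0 || quantile > 1.0) {
            throw new IllegalArgumentException("need samples and 0 < quantile <= 1");
        }
        long[] sorted = samplesMs.clone(); // do not reorder the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(quantile * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }
}
```

Usage: feed it the durations you would otherwise only log one by one, e.g. `LatencySummary.percentile(samples, 0.99)` next to `LatencySummary.max(samples)`.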

DEMO 🍵 github.com/jonatan-ivanov/teahouse

Architecture (diagram): 💻, Tea Service, Tealeaf Service, Water Service, Tealeaf DB, Water DB

• spring-boot-starter-web
• spring-boot-starter-data-jpa
• spring-cloud-starter-openfeign
• spring-boot-starter-actuator (micrometer-observation)
• micrometer-registry-prometheus
• micrometer-tracing-bridge-brave + zipkin-reporter-brave
• net.ttddyy.observation:datasource-micrometer-spring-boot

Let’s make some tea! 🍵

by Kenneth Kousen

[Diagram: logs, metrics, traces; linked through TraceID, Exemplars, Tags]

Logging With JVM/Spring

Logging with JVM/Spring: SLF4J + Logback
• SLF4J with Logback comes pre-configured
• SLF4J (Simple Logging Façade for Java): Simple API for logging libraries
• Logback: Natively implements the SLF4J API
• If you want Log4j2 instead of Logback: remove spring-boot-starter-logging, add spring-boot-starter-log4j2
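
The Logback-to-Log4j2 swap could look like this in a Maven build. This is a sketch that assumes Maven and that dependency versions are managed by the Spring Boot parent or BOM; a Gradle build would express the same exclusion differently.

```xml
<!-- Exclude the default Logback-based logging starter -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
    <exclusions>
        <exclusion>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-logging</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<!-- Add the Log4j2 starter instead -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-log4j2</artifactId>
</dependency>
```

Application code that logs through the SLF4J API should not need to change.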

Payload, Access, GC logs
• Payload logs: Logbook + logbook-spring-boot-starter (auto-configured)
• Access logs:
  server.tomcat.accesslog.enabled=true
  server.jetty.accesslog.enabled=true
  server.undertow.accesslog.enabled=true
• GC logs: JVM args
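
For the "GC logs: JVM args" item, one possible flag is shown below. It assumes JDK 9 or later (unified logging); the file name gc.log and the jar name app.jar are placeholders, and the decorators after the last colon are just one reasonable choice.

```shell
# Write all gc-tagged log messages to gc.log, with timestamps, uptime, level, and tags
java "-Xlog:gc*:file=gc.log:time,uptime,level,tags" -jar app.jar
```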

Metrics With JVM/Spring

Metrics with JVM/Spring: Micrometer
• Dimensional Metrics library on the JVM
• Like SLF4J, but for metrics
• API is independent of the configured metrics backend
• Supports many backends
• Comes with spring-boot-actuator
• Spring projects are instrumented using Micrometer
• Many third-party libraries use Micrometer
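
What "dimensional" means in practice: one metric name, many time series distinguished by tags (key-value pairs), which can be sliced by any tag afterwards. The sketch below is a JDK-only toy model of that idea. DimensionalCounter and its methods are hypothetical names, and this is not Micrometer's API or implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Toy model of a dimensional counter (hypothetical; NOT Micrometer's API). */
final class DimensionalCounter {

    // metric name -> (tag set -> count); each distinct tag set is its own time series
    private final Map<String, Map<Map<String, String>, LongAdder>> series = new ConcurrentHashMap<>();

    /** Increments the series identified by the metric name plus its tags. */
    void increment(String name, Map<String, String> tags) {
        series.computeIfAbsent(name, n -> new ConcurrentHashMap<>())
              .computeIfAbsent(Map.copyOf(tags), t -> new LongAdder())
              .increment();
    }

    /** Count of one exact name + tags combination (0 if never incremented). */
    long count(String name, Map<String, String> tags) {
        LongAdder adder = series.getOrDefault(name, Map.of()).get(tags);
        return adder == null ? 0 : adder.sum();
    }

    /** Sum over all series of the metric that carry the given tag key and value. */
    long countWhere(String name, String tagKey, String tagValue) {
        return series.getOrDefault(name, Map.of()).entrySet().stream()
                .filter(e -> tagValue.equals(e.getKey().get(tagKey)))
                .mapToLong(e -> e.getValue().sum())
                .sum();
    }
}
```

With tags such as uri and status on a request counter, the same data answers both "how many requests hit this URI?" and "how many requests failed overall?". In a real setup that slicing is usually done by the metrics backend's query language.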

Supported metrics backends/formats/protocols
AppOptics, Atlas, Azure Monitor, CloudWatch (AWS), Datadog, Dynatrace, Elastic,
Ganglia, Graphite, Humio, InfluxDB, JMX, KairosDB, New Relic, (/actuator/metrics),
OpenTSDB, OTLP, Prometheus, SignalFx, Stackdriver (GCP), StatsD, Wavefront (VMware)

Tracing With JVM/Spring

Distributed Tracing with JVM/Spring
• Boot 2.x: Spring Cloud Sleuth
• Boot 3.x: Micrometer Tracing (Sleuth w/o Spring dependencies)
• Provide an abstraction layer on top of tracing libraries
  - Brave (OpenZipkin), “default”
  - OpenTelemetry (CNCF), “experimental”
• Instrumentation for Spring Projects, 3rd party libraries, your app
• Support for various backends

Observation API

You want to instrument your application…
• Add Logs (application logs)
• Add Metrics
• Add Distributed Tracing

Observation API basic usage example

    Observation observation = Observation.start("talk", registry);
    try { // TODO: scope
        doSomething(); // ← This is what we’re observing
    }
    catch (Exception exception) {
        observation.error(exception);
        throw exception;
    }
    finally { // TODO: attach tags (key-value)
        observation.stop();
    }
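
On the "TODO: scope" in the example: a scope generally marks something as "current" for the running thread, so that code called inside the try block can find it without it being passed around. The JDK-only sketch below shows that general pattern with a ThreadLocal and try-with-resources. CurrentName and its Scope are hypothetical names for illustration, not Micrometer's classes.

```java
/** Toy model of a "current" scope (hypothetical; NOT Micrometer's implementation). */
final class CurrentName {

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Name of the innermost open scope on this thread, or null if none is open. */
    static String current() {
        return CURRENT.get();
    }

    /** Makes the given name current until the returned scope is closed. */
    static Scope open(String name) {
        String previous = CURRENT.get();
        CURRENT.set(name);
        return new Scope(previous);
    }

    /** Closing restores whatever was current before this scope was opened. */
    static final class Scope implements AutoCloseable {

        private final String previous;

        private Scope(String previous) {
            this.previous = previous;
        }

        @Override
        public void close() {
            if (previous == null) {
                CURRENT.remove(); // nothing was current before: clear the thread-local
            } else {
                CURRENT.set(previous); // nested case: hand control back to the outer scope
            }
        }
    }
}
```

Opening the scope in a try-with-resources statement guarantees it is closed even when the observed code throws, which is why scopes are usually paired with that construct.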

Configuring an ObservationHandler (without Boot)

    ObservationRegistry registry = ObservationRegistry.create();
    registry.observationConfig()
        .observationHandler(new MetricsHandler(...))
        .observationHandler(new TracingHandler(...))
        .observationHandler(new LoggingHandler(...))
        .observationHandler(new AuditEventHandler(...));
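
The idea behind registering several handlers is that you instrument once and every handler turns the same start/stop events into its own output (a metric, a span, a log line). The JDK-only sketch below models that fan-out. ToyObservation, Handler, CountingHandler, and LineHandler are all hypothetical names, and the real Micrometer handler interface has more callbacks than this.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "instrument once, handle many times" (hypothetical; NOT Micrometer's classes). */
final class ToyObservation {

    /** Receives lifecycle callbacks; each implementation produces one kind of output. */
    interface Handler {
        void onStart(String name);
        void onStop(String name, long durationNanos);
    }

    /** Stands in for a metrics handler: counts completed observations. */
    static final class CountingHandler implements Handler {
        int stopped;
        @Override public void onStart(String name) { }
        @Override public void onStop(String name, long durationNanos) { stopped++; }
    }

    /** Stands in for a logging handler: collects one text line per event. */
    static final class LineHandler implements Handler {
        final List<String> lines = new ArrayList<>();
        @Override public void onStart(String name) { lines.add("start " + name); }
        @Override public void onStop(String name, long durationNanos) { lines.add("stop " + name); }
    }

    private final String name;
    private final List<Handler> handlers;
    private final long startNanos = System.nanoTime();

    private ToyObservation(String name, List<Handler> handlers) {
        this.name = name;
        this.handlers = List.copyOf(handlers);
    }

    /** Starts an observation and notifies every handler. */
    static ToyObservation start(String name, List<Handler> handlers) {
        ToyObservation observation = new ToyObservation(name, handlers);
        observation.handlers.forEach(h -> h.onStart(name));
        return observation;
    }

    /** Stops the observation and passes the measured duration to every handler. */
    void stop() {
        long durationNanos = System.nanoTime() - startNanos;
        handlers.forEach(h -> h.onStop(name, durationNanos));
    }
}
```

The instrumented code only calls start and stop. Adding a new kind of output means registering another handler, not touching the call sites.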

Observation API shortcuts

    Observation.createNotStarted("talk", registry)
        .lowCardinalityKeyValue("conf", "dev2next")
        .highCardinalityKeyValue("uid", userId)
        .observe(this::talk);

    @Observed

Spring Boot Actuator
Health Endpoint
• Is my app healthy (k8s probes)?
• Dependencies?
Info Endpoint
• Build Info (name, version, git commit, build time): Boot 2.x
• Java Info (JRE/JVM name, version, vendor): Boot 2.6
• OS Info (name, arch, version): Boot 2.7
• Process Info (pid, owner, cpus, memory): Boot 3.3, 3.4
• Dependencies (SBOM): Boot 3.3
• TLS Info (subject, issuer, validity): Boot 3.4
• Cloud Info (instanceId, region, account)
• GC Info, Timezone, Current Time, Language, Start Time, Uptime
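
A possible application.properties fragment for turning these on. This is a sketch that assumes a Boot version recent enough for each contributor (see the versions listed above); showing full health details to everyone is a choice made here for demo purposes, not a recommendation for production.

```properties
# Expose health and info over HTTP
management.endpoints.web.exposure.include=health,info
# Liveness/readiness health groups for Kubernetes probes
management.endpoint.health.probes.enabled=true
# Include dependency details (db, disk space, ...) in the health response
management.endpoint.health.show-details=always
# Opt-in info contributors
management.info.java.enabled=true
management.info.os.enabled=true
management.info.process.enabled=true
management.info.ssl.enabled=true
```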

Service Discoverability, API Discoverability
• How many service instances do we have? Where? (host/ip, port, instanceId, region, account)
• What versions are deployed? (by environment)
  - Eureka, Spring Boot Admin
• How to call/use them?
  - Spring REST Docs
  - Spring Cloud Contract + Pact Broker
  - Swagger / OpenAPI + ReDoc
  - Spring HATEOAS + HAL Explorer

Thank you!
• @jonatan_ivanov
• develotters.com
• github.com/jonatan-ivanov/teahouse (branch: 2024-dev2next)
• slack.micrometer.io
