
I know what you did 
last summer

Can you say the same about your Java applications? Today, applications run not only as monoliths on a server, but also distributed as (micro-)services in the cloud. Continuous monitoring becomes more important in such scenarios. For example, as soon as you want to be informed about running out of memory, too many user sessions or increasing memory consumption, logging alone is no longer sufficient. For these and many other scenarios, metrics are the tool of choice today.

In this session we will look at how to implement, visualize and evaluate metrics for Java applications. We will not only look at products and ready-to-use solutions like Grafana or Prometheus. Rather, we also want to find out how to implement metrics in Java in a meaningful way without vendor lock-in. With Micrometer and Java Flight Recorder, for example, there are several APIs that can be used to easily extend a Java application with metrics and the possibility of continuous monitoring. Together we will look at different approaches and discuss best practices or pitfalls when using metrics.

Hendrik Ebbers

March 24, 2022

Transcript

  1. This is a very very very long gag @hendrikEbbers

    Hendrik Ebbers
    • Karakun Co-Founder
    • Founder of JUG Dortmund
    • JSR EG member
    • JavaOne Rockstar, Java Champion
    • AdoptOpenJDK / Adoptium TSC member
  2. Stephan Classen

    • Karakun Co-Founder
    • SoCraTes CH community member
    • OpenWebStart committer
    • Conference speaker
  3. Poll

    • Who is using logging in their application?
  4. Poll

    • Who is using logging in their application? Almost all
  5. Poll

    • Who is using metrics in their application?
  6. Poll

    • Who is using metrics in their application? Not as many
  7. Two sides of the same coin

    Both logs and metrics are messages from within the application that inform an external observer about its runtime behavior.
  8. Two sides of the same coin

    Logs
    • Meta data (structured): time, level, logger
    • Data (unstructured text): message
    Metrics
    • Meta data (structured): time, name
    • Data (structured): value (numeric), unit
  9. Why two?

    If metrics are almost the same as logs, why do we need both?
  10. Why two?

    If metrics are almost the same as logs, why do we need both? Mathematics
  11. Interested parties: Operations

    • Resource usage (CPU, memory, network, disk)
    • Alerting in extreme cases
    • Prediction about scaling
  12. Interested parties: Developers

    • Performance
    • Bottlenecks
    • Limitations
  13. Interested parties: UX

    • User behavior
    • A/B testing
    • Drop-off points
  14. Interested parties: Management

    • User count
    • Conversion rate
    • Retention
  15. Two kinds of sources

    • Continuous value
    • Events with value
  16. Continuous - Gauge

    [Chart: value over time]
    • Sample the source at a regular interval
  17. Continuous

    [Chart: value over time]
    • Sample the source at a regular interval
    • Choose the sampling rate according to the expected features of the source
  18. Continuous: high sample rate

    [Chart: value over time]
    • High accuracy
    • Large amount of data
    • High overhead for sampling and handling data
  19. Continuous: low sample rate

    [Chart: value over time]
    • Low overhead
    • Low accuracy
    • Important information may be lost
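The sampling trade-off on the gauge slides can be sketched in plain Java. This is only an illustration under stated assumptions: `GaugeSampler` and `GaugeSample` are hypothetical names that do not come from the talk or from any library, and the 100 ms interval is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.DoubleSupplier;

/** One sampled point: when it was taken and what the source reported. */
final class GaugeSample {
    final long timestampMillis;
    final double value;

    GaugeSample(long timestampMillis, double value) {
        this.timestampMillis = timestampMillis;
        this.value = value;
    }
}

/** Hypothetical helper that samples a continuous source whenever it is asked to. */
final class GaugeSampler {
    private final DoubleSupplier source;
    private final List<GaugeSample> samples = new ArrayList<>();

    GaugeSampler(DoubleSupplier source) {
        this.source = source;
    }

    /** Reads the current value of the source and stores it with the given timestamp. */
    synchronized void sampleOnce(long nowMillis) {
        samples.add(new GaugeSample(nowMillis, source.getAsDouble()));
    }

    /** Returns a copy of everything sampled so far. */
    synchronized List<GaugeSample> samples() {
        return new ArrayList<>(samples);
    }
}

final class GaugeSamplerDemo {
    public static void main(String[] args) throws InterruptedException {
        Runtime runtime = Runtime.getRuntime();
        // The source is continuous: used heap memory always has a value.
        GaugeSampler sampler = new GaugeSampler(() -> runtime.totalMemory() - runtime.freeMemory());

        // Sample every 100 ms; a shorter interval gives more accuracy and more data.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> sampler.sampleOnce(System.currentTimeMillis()), 0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(550);
        scheduler.shutdownNow();
        scheduler.awaitTermination(1, TimeUnit.SECONDS);

        for (GaugeSample sample : sampler.samples()) {
            System.out.println(sample.timestampMillis + " " + sample.value);
        }
    }
}
```

Whatever happens between two calls to `sampleOnce` is invisible to the sampler, which is exactly the information loss that a low sample rate risks.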
  20. Events

    [Chart: value over time]
    • Most of the time the value is not defined
  21. Events

    [Chart: value over time]
    • Most of the time the value is not defined
    • Aggregate events between two samples
  22. Events - Counter

    [Chart: value over time]
    Count the total events
    • Very easy to do
    • Not very meaningful
    • Value is completely ignored
  23. Events - Rates

    [Chart: value over time]
    Count the delta
    • Very easy to do
    • More insight
    • Value is still ignored
  24. Events - Values

    [Chart: value over time]
    Incorporate values
    • Sum of all values
    • Sum values between two samples
    • Average and variance
  25. Events - Histograms

    [Chart: value over time]
    Divide values into buckets
    • Easy to do
    • Allows you to qualify values
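The aggregation options from the event slides (counter, rate, values, histogram) can be put side by side in one small class. This is a sketch, not code from the talk: `EventAggregator` and its method names are hypothetical, and the variance uses the simple sum-of-squares formula, which is fine for illustration but numerically naive for large values.

```java
import java.util.Arrays;

/** Hypothetical aggregator for events that carry a numeric value. */
final class EventAggregator {
    private final double[] upperBounds;  // sorted, inclusive upper bounds of the buckets
    private final long[] bucketCounts;   // one extra bucket for values above the last bound
    private long count;
    private long countAtLastSample;
    private double sum;
    private double sumOfSquares;

    EventAggregator(double... upperBounds) {
        this.upperBounds = upperBounds.clone();
        Arrays.sort(this.upperBounds);
        this.bucketCounts = new long[this.upperBounds.length + 1];
    }

    /** Records one event. */
    synchronized void record(double value) {
        count++;
        sum += value;
        sumOfSquares += value * value;
        int bucket = 0;
        while (bucket < upperBounds.length && value > upperBounds[bucket]) {
            bucket++;
        }
        bucketCounts[bucket]++;
    }

    /** Counter: total number of events, the value is ignored. */
    synchronized long count() {
        return count;
    }

    /** Rate: number of events since the previous call, the value is still ignored. */
    synchronized long countSinceLastSample() {
        long delta = count - countAtLastSample;
        countAtLastSample = count;
        return delta;
    }

    /** Values: sum, average and (population) variance of all recorded values. */
    synchronized double sum() {
        return sum;
    }

    synchronized double mean() {
        return count == 0 ? 0.0 : sum / count;
    }

    synchronized double variance() {
        if (count == 0) {
            return 0.0;
        }
        double mean = sum / count;
        return sumOfSquares / count - mean * mean;
    }

    /** Histogram: how many values fell into each bucket. */
    synchronized long[] bucketCounts() {
        return bucketCounts.clone();
    }
}

final class EventAggregatorDemo {
    public static void main(String[] args) {
        // Buckets: up to 10, up to 100, and everything above 100.
        EventAggregator responseTimes = new EventAggregator(10, 100);
        for (double millis : new double[] {5, 50, 500, 20}) {
            responseTimes.record(millis);
        }
        System.out.println("count   = " + responseTimes.count());
        System.out.println("mean    = " + responseTimes.mean());
        System.out.println("buckets = " + Arrays.toString(responseTimes.bucketCounts()));
    }
}
```

Only `record` touches the raw value. Everything a reader of the metric sees afterwards is an aggregate, so the choice of bucket bounds decides how much detail survives.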
  26. Acquire Metrics

    Emit raw events
    • Low overhead
    • All information is persisted
    • Large amount of data must be handled
  27. Acquire Metrics

    Emit aggregations
    • Data volume is reduced
    • Some information is lost
    • Overhead for creating the aggregations
  28. Store Metrics

    Store on local file system
    • Fast access and low latency
    • Hard to collect data for evaluation
  29. Store Metrics

    Store on centralized server
    • Simple to collect data for evaluation
    • Slow access and high latency
    • Limitation in bandwidth
  30. Micrometer

    • Metrics facade for Java
    • Only API - No Implementation
    • Like SLF4J but for metrics
    • Used in Spring Boot for metrics
    https://micrometer.io
  31. Micrometer Meter Types

    • timers
    • gauges
    • counters
    • distribution summaries
    • long task timers
    [Illustrations: Gauge, Counter, Timer]
  32. Micrometer Support

    • AppOptics
    • Azure Monitor
    • Netflix Atlas
    • CloudWatch
    • Datadog
    • Dynatrace
    • Elastic
    • Ganglia
    • Graphite
    • Humio
    • Influx/Telegraf
    • JMX
    • KairosDB
    • New Relic
    • Prometheus
    • SignalFx
    • Google Stackdriver
    • StatsD
    • Wavefront
    Java SPI is defined / used in Micrometer
  33. Micrometer Registry

    • A registry holds the collection of all measurements
    • Meters can be created by the registry

    final MeterRegistry registry = ...;
    final Counter c = registry.counter("test");
  34. Micrometer Registry

    • A registry can be created in several ways

    // Does not export anything. Use it to experiment.
    final MeterRegistry registry = new SimpleMeterRegistry();

    // Spring Boot provides a managed instance by default.
    @Autowired MeterRegistry registry;

    // Specific registry for the monitoring system in use.
    final MeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
  35. Micrometer Names & Tags

    • Each meter has a base metric name
    • The base metric name does not have to be unique!
    A meter is defined by the metric name and tags

    // Metric name only
    final Counter c1 = registry.counter("database.calls");
    // Same metric name plus tags: a different meter
    final Counter c2 = registry.counter("database.calls", tags);
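The idea that a meter is identified by its name together with its tags can be illustrated without any library. The sketch below is not how Micrometer is implemented; `TinyRegistry` and its methods are hypothetical and only show the lookup rule: same name and same tags give the same meter, same name with different tags gives a different one.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical mini registry that keys counters by metric name plus tags. */
final class TinyRegistry {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Returns the counter for this name and these tags, creating it on first use. */
    LongAdder counter(String name, String... keyValues) {
        return counters.computeIfAbsent(idOf(name, keyValues), id -> new LongAdder());
    }

    /** Sums all counters that share the base metric name, regardless of tags. */
    long sumByName(String name) {
        long total = 0;
        for (Map.Entry<String, LongAdder> entry : counters.entrySet()) {
            if (entry.getKey().startsWith(name + "{")) {
                total += entry.getValue().sum();
            }
        }
        return total;
    }

    /** Builds an id such as database.calls{database=production, operation=insert}. */
    static String idOf(String name, String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("tags must be given as key/value pairs");
        }
        // Sorting the tags makes the id independent of the order they were passed in.
        TreeMap<String, String> sortedTags = new TreeMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            sortedTags.put(keyValues[i], keyValues[i + 1]);
        }
        return name + sortedTags;
    }
}

final class TinyRegistryDemo {
    public static void main(String[] args) {
        TinyRegistry registry = new TinyRegistry();
        registry.counter("database.calls").increment();
        registry.counter("database.calls", "database", "production", "operation", "insert").increment();
        registry.counter("database.calls", "operation", "insert", "database", "production").increment();
        System.out.println("all database.calls: " + registry.sumByName("database.calls"));
    }
}
```

Keeping the base name shared is what makes it possible to query one metric across all of its tag combinations later.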
  36. Micrometer Names & Tags

    • Tags can easily be defined as string pairs:

    // Metric name, followed by the tags as key/value pairs
    Counter c = registry.counter("database.calls",
        "database", "production", "operation", "insert");

    Collection<Meter> meters = registry.find("database.calls")
        .tag("database", "production")
        .meters();
  37. Micrometer Counter Sample

    public boolean checkIfPrime(long number) {
        if (testPrimeNumber(number)) {
            registry.counter("example.prime.number", "type", "prime").increment();
            return true;
        }
        registry.counter("example.prime.number", "type", "not-prime").increment();
        return false;
    }
  38. Micrometer Counter Sample

    public boolean checkIfPrime(long number) {
        if (number < 1) {
            registry.counter("example.prime.number", "type", "not-natural").increment();
            return false;
        }
        if (number == 1) {
            registry.counter("example.prime.number", "type", "one").increment();
            return false;
        }
        // 2 is prime, so only even numbers greater than 2 are rejected here
        if (number > 2 && number % 2 == 0) {
            registry.counter("example.prime.number", "type", "even").increment();
            return false;
        }
        if (testPrimeNumber(number)) {
            registry.counter("example.prime.number", "type", "prime").increment();
            return true;
        }
        registry.counter("example.prime.number", "type", "not-prime").increment();
        return false;
    }
  39. Micrometer Timer Sample

    public Result executeStatement(String statement) {
        var result = registry.timer("myservice.db.requests").record(() -> {
            return database.execute(statement);
        });
        return result;
    }

    // Supported on managed beans by Spring Boot, Quarkus, ...
    @Timed("myservice.db.requests")
    public Result executeStatement(String statement) {
        return database.execute(statement);
    }
  40. Monitoring Tools in your JDK

    • Java VisualVM (jvisualvm.exe) *
    • JConsole (jconsole.exe)
    • Java Mission Control (jmc.exe)
    • Diagnostic Command Tool (jcmd.exe)
    * Not shipped anymore with Java 9+. Can be downloaded separately: https://visualvm.github.io
  41. Java Flight Recorder

    • Java Flight Recorder (JFR) is part of OpenJDK based Java builds since version 11
    • JFR is integrated directly in the JVM
    • JFR affects the performance of a running application as little as possible
  42. JFR in Oracle JDK 8

    • For Java 8 the situation with JFR is more complex:
    • Before Java 8 update 262 JFR was only available as part of the Oracle JDK
    • Only Oracle support customers were allowed to use it, by setting the -XX:+UnlockCommercialFeatures and -XX:+FlightRecorder flags
  43. JFR in Oracle JDK 8

    • Since Java 8 update 262 JFR is part of any OpenJDK build
    • In addition, Java Mission Control releases can be downloaded at Eclipse Adoptium: https://adoptium.net/jmc.html
  44. Create custom JFR events

    // JFR API to define events
    import jdk.jfr.Event;
    import jdk.jfr.Label;
    import jdk.jfr.Name;

    // Define custom (structured) metadata
    @Name("com.karakun.Hello")
    @Label("Hello World!")
    class MyEvent extends Event { // the Event class provides basic functionality
        @Label("Message")
        String message;
    }
  45. Create custom JFR events

    MyEvent event = new MyEvent();
    event.begin();   // begin() and commit() are methods defined in the Event class
    event.message = "Hello world!";
    event.commit();
  46. JFR Event Categories

    public static final String UPLOAD = "Upload";
    public static final String IMAGE_UPLOAD = "Image Upload";

    @Name("com.karakun.ImageRead")
    @Label("Image Read")
    @Category({UPLOAD, IMAGE_UPLOAD})
    class ImageReadEvent extends Event { ... }
  47. Measure Time with JFR Events

    MyEvent event = new MyEvent();
    event.begin();
    database.execute(statement); // This will take some time...
    event.commit();

    • JFR events contain timing metadata by default
    • Start time and duration will always be stored
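To see the stored duration, the event can be recorded and read back with the `jdk.jfr` API that ships with the JDK. The following is a sketch under assumptions: the event name `com.example.TimedWork`, the classes `TimedWorkEvent` and `JfrTimingSketch`, and the helper methods are made up for this example, and the simulated work is just a sleep.

```java
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical custom event, analogous to the one on the slides. */
@Name("com.example.TimedWork")
@Label("Timed Work")
class TimedWorkEvent extends Event {
    @Label("Message")
    String message;
}

final class JfrTimingSketch {

    /** Records one event around the given work and returns it as read back from the dump. */
    static List<RecordedEvent> recordOnce(String message, Runnable work) {
        Path file = null;
        try (Recording recording = new Recording()) {
            file = Files.createTempFile("jfr-sketch", ".jfr");
            // Record the custom event even if it is very short.
            recording.enable(TimedWorkEvent.class).withoutThreshold();
            recording.start();

            TimedWorkEvent event = new TimedWorkEvent();
            event.message = message;
            event.begin();   // start time is taken here
            work.run();
            event.commit();  // end time is taken here and the event is written

            recording.stop();
            recording.dump(file);
            return RecordingFile.readAllEvents(file).stream()
                    .filter(e -> e.getEventType().getName().equals("com.example.TimedWork"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                file.toFile().delete();
            }
        }
    }

    /** Stand-in for real work such as a database call. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (RecordedEvent event : recordOnce("Hello world!", () -> sleepQuietly(50))) {
            System.out.println(event.getString("message")
                    + " started at " + event.getStartTime()
                    + " and took " + event.getDuration().toMillis() + " ms");
        }
    }
}
```

In a real application the recording would normally be started from outside, for example with `jcmd` or a JVM flag, and the dump would be opened in Java Mission Control rather than parsed in code.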
  48. Centralized Metric Servers

    • For (micro-)services the approach of local metrics and analytics tools does not work well
    • Assume you want to monitor several instances of a service
    • Assume you want to monitor different services
  49. Centralized Metric Servers

    • In addition, an application can run for a long time
    • And you want to have metrics over the complete lifetime of the application
  50. Centralized Metric Servers

    • Storing metrics in a centralized service has a lot of benefits
    • Store all metrics over a long application lifetime
    • Store metrics of multiple app / service instances
    • Store metrics of multiple services
  51. Centralized Metric Servers

    • There are several tools on the market
    • Open source / commercial
    • Cloud services (SaaS) / on premises
    • In general you need to find the tools that work best for your infrastructure
  52. Centralized Metric Servers

    • We will have a look at a Prometheus & Grafana based centralized metric server solution
    • This combination can be found very often
    • Based on open source components
    • Good integration in tools & APIs
  53. Centralized Metric Servers

    [Diagram: Application, /metrics, Prometheus]
  54. Centralized Metric Servers

    [Diagram: Application, /metrics, Prometheus, query]
  55. Centralized Metric Servers

    [Diagram: three Applications, Prometheus, query]
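The `/metrics` endpoint in the diagrams can be approximated with the HTTP server that ships with the JDK. This is a hand-rolled sketch, not the Micrometer or Prometheus client implementation: `MetricsEndpointSketch`, the metric names and port 8080 are assumptions, and only the simplest part of the Prometheus text format (a `# TYPE` line followed by a `name value` line) is produced.

```java
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical application-side endpoint that a Prometheus server could scrape. */
final class MetricsEndpointSketch {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Increments a counter; names use underscores because dots are not valid in Prometheus names. */
    void increment(String name) {
        counters.computeIfAbsent(name, n -> new LongAdder()).increment();
    }

    /** Renders all counters in the Prometheus text format, sorted by name. */
    String render() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, LongAdder> entry : new TreeMap<>(counters).entrySet()) {
            out.append("# TYPE ").append(entry.getKey()).append(" counter\n");
            out.append(entry.getKey()).append(' ').append(entry.getValue().sum()).append('\n');
        }
        return out.toString();
    }

    /** Starts an HTTP server that answers GET /metrics with the rendered counters. */
    HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = render().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream response = exchange.getResponseBody()) {
                response.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        MetricsEndpointSketch metrics = new MetricsEndpointSketch();
        metrics.increment("database_calls_total");
        metrics.start(8080); // assumed port; the server keeps running until the process is stopped
        System.out.println("Metrics available at http://localhost:8080/metrics");
    }
}
```

Each application instance exposes its own endpoint, and the central server collects from all of them, which matches the picture with several applications on the last slide.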