
I know what you did 
last summer

Can you say the same about your Java applications? Today, applications run not only as monoliths on a server, but also distributed as (micro-)services in the cloud. Continuous monitoring becomes more important in such scenarios. For example, as soon as you want to be informed about running out of memory, too many user sessions, or increasing memory consumption, logging alone is no longer sufficient. For these and many other scenarios, metrics are the tool of choice today.

In this session we will look at how to implement, visualize, and evaluate metrics for Java applications. We will not only look at products and ready-to-use solutions like Grafana or Prometheus; we also want to find out how to implement metrics in Java in a meaningful way without vendor lock-in. With Micrometer and Java Flight Recorder, for example, there are several APIs that can be used to easily extend a Java application with metrics and continuous monitoring. Together we will look at different approaches and discuss best practices and pitfalls when using metrics.

Hendrik Ebbers

March 24, 2022

Transcript

  1. View Slide

  2. I know what you did

    last summer

    View Slide

  3. This is a very very very long gag
    @hendrikEbbers
    Hendrik Ebbers
    • Karakun Co-Founder
    • Founder of JUG Dortmund
    • JSR EG member
    • JavaOne Rockstar, Java Champion
    • AdoptOpenJDK / Adoptium TSC member

    View Slide

  4. Stephan Classen
    • Karakun Co-Founder
    • SoCraTes CH community member
    • OpenWebStart committer
    • Conference speaker

    View Slide

  5. What are Metrics

    View Slide

  6. Poll
    • Who is using logging in their application?

    View Slide

  7. Poll
    • Who is using logging in their application?
    Almost all

    View Slide

  8. Poll
    • Who is using metrics in their application?

    View Slide

  9. Poll
    • Who is using metrics in their application?
    Not as many

    View Slide

  10. Two sides of the same medal
    Both logs and metrics are messages from within the application that inform an
    external observer about its runtime behavior.

    View Slide

  11. Two sides of the same medal
                 Logs                  Metrics
    Meta data    structured:           structured:
                 • time                • time
                 • level               • name
                 • logger              • text
    Data         unstructured:         structured:
                 • message             • value (numeric)
                                       • unit

    View Slide

  12. Why two?
    If metrics are almost the same as logs, why do we need both?


    View Slide

  13. Why two?
    If metrics are almost the same as logs, why do we need both?
    Mathematics

    View Slide

  14. Who is interested in
    Metrics?

    View Slide

  15. Interested parties: Operations
    • Resource usage (CPU, memory, network, disk)
    • Alerting in extreme cases
    • Prediction about scaling

    View Slide

  16. Interested parties: Developers
    • Performance
    • Bottlenecks
    • Limitations

    View Slide

  17. Interested parties: UX
    • User behavior
    • A/B Testing
    • Drop-Off Points

    View Slide

  18. Interested parties: Management
    • User count
    • Conversion rate
    • Retention

    View Slide

  19. Metrics Basics

    View Slide

  20. Two kinds of sources
    • Continuous value
    • Events with value

    View Slide

  21. Continuous - Gauge
    [Chart: value over time, labeled "Gauge"]
    • Sample the source at a regular interval


    View Slide

  22. Continuous
    [Chart: value over time]
    • Sample the source at a regular interval
    • Choose the sampling rate according to the expected source features

    View Slide

  23. Continuous
    [Chart: value over time]
    High sample rate
    • High accuracy
    • Large amount of data
    • High overhead for sampling and handling data

    View Slide

  24. Continuous
    [Chart: value over time]
    Low sample rate
    • Low overhead
    • Low accuracy
    • Important information may be lost

    View Slide
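The trade-off between a high and a low sample rate can be illustrated with a small sketch. It is not taken from the talk: the class `SamplingSketch`, its methods, and the memory values are hypothetical, and the "signal" is simply an array that gets read at every step-th position, the way a gauge would be read at a fixed interval.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: sampling a continuous value at different rates (all names are hypothetical). */
class SamplingSketch {

    /** Takes every step-th value of the signal, mimicking a gauge that is read at a fixed interval. */
    static List<Double> sample(double[] signal, int step) {
        List<Double> samples = new ArrayList<>();
        for (int i = 0; i < signal.length; i += step) {
            samples.add(signal[i]);
        }
        return samples;
    }

    /** Largest value contained in the samples. */
    static double max(List<Double> samples) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : samples) {
            max = Math.max(max, s);
        }
        return max;
    }

    /** Made-up memory usage in MB with a short spike at index 5. */
    static double[] exampleSignal() {
        return new double[] {100, 102, 101, 103, 104, 900, 105, 104, 106, 107, 105, 108};
    }

    public static void main(String[] args) {
        double[] signal = exampleSignal();
        List<Double> high = sample(signal, 1); // high sample rate: every value
        List<Double> low = sample(signal, 4);  // low sample rate: every fourth value
        System.out.println("high rate: " + high.size() + " samples, max " + max(high));
        System.out.println("low rate:  " + low.size() + " samples, max " + max(low));
    }
}
```

With the made-up data, the high rate stores all 12 values and sees the spike, while the low rate stores only 3 values and misses it.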

  25. Events
    [Chart: value over time]
    • Most of the time the value is not defined


    View Slide

  26. Events
    [Chart: value over time]
    • Most of the time the value is not defined
    • Aggregate events between two samples

    View Slide

  27. Events - Counter
    [Chart: value over time]
    Count the total events
    • Very easy to do
    • Not very meaningful
    • Value is completely ignored

    View Slide

  28. Events - Rates
    [Chart: value over time]
    Count the delta
    • Very easy to do
    • More insight
    • Value is still ignored


    View Slide

  29. Events - Values
    [Chart: value over time]
    Incorporate values
    • Sum of all values
    • Sum values between two samples
    • Average and variance

    View Slide
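The aggregations named on the counter, rate, and value slides can be written down in a few lines. This is a minimal sketch under assumptions, not code from the talk: `EventAggregationSketch` and its method names are hypothetical, the variance is the population variance, and the event values are made-up request durations in milliseconds.

```java
import java.util.List;

/** Sketch: aggregating the events that arrived between two samples (hypothetical names). */
class EventAggregationSketch {

    /** Counter: number of events, the event values are ignored. */
    static long count(List<Double> values) {
        return values.size();
    }

    /** Rate: events since the previous sample, i.e. the delta of the total count. */
    static long delta(long previousTotal, long currentTotal) {
        return currentTotal - previousTotal;
    }

    /** Sum of all event values. */
    static double sum(List<Double> values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    /** Average of the event values, 0 if there are no events. */
    static double average(List<Double> values) {
        return values.isEmpty() ? 0 : sum(values) / values.size();
    }

    /** Population variance: mean squared distance to the average, 0 if there are no events. */
    static double variance(List<Double> values) {
        if (values.isEmpty()) {
            return 0;
        }
        double avg = average(values);
        double squared = 0;
        for (double v : values) {
            squared += (v - avg) * (v - avg);
        }
        return squared / values.size();
    }

    public static void main(String[] args) {
        List<Double> durationsMs = List.of(10.0, 20.0, 30.0, 40.0); // made-up request durations
        System.out.println("count:    " + count(durationsMs));
        System.out.println("delta:    " + delta(10, 10 + count(durationsMs)));
        System.out.println("sum:      " + sum(durationsMs));
        System.out.println("average:  " + average(durationsMs));
        System.out.println("variance: " + variance(durationsMs));
    }
}
```

Counter and rate only need the number of events; sum, average, and variance are the first aggregations that actually use the event values.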

  30. Events - Histograms
    [Chart: value over time]
    Divide values into buckets
    • Easy to do
    • Makes it possible to qualify values

    View Slide
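Dividing values into buckets could look like the following sketch. The class `HistogramSketch`, the bucket bounds, and the sample durations are assumptions made for illustration; the bounds must be sorted in ascending order, and values above the last bound land in an extra overflow bucket.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: dividing event values into buckets (hypothetical names and bounds). */
class HistogramSketch {

    /**
     * Counts values per bucket. Bucket i holds the values that are at most upperBounds[i]
     * and larger than the previous bound; the last slot holds everything above the last bound.
     * The bounds are expected to be sorted in ascending order.
     */
    static long[] bucketCounts(double[] upperBounds, List<Double> values) {
        long[] counts = new long[upperBounds.length + 1];
        for (double v : values) {
            int i = 0;
            while (i < upperBounds.length && v > upperBounds[i]) {
                i++;
            }
            counts[i]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] boundsMs = {10, 50, 100, 1000}; // made-up bucket bounds in ms
        List<Double> durationsMs = List.of(5.0, 12.0, 48.0, 51.0, 230.0, 990.0, 1500.0);
        System.out.println(Arrays.toString(bucketCounts(boundsMs, durationsMs)));
    }
}
```

Unlike a plain counter, the bucket counts show how many events were fast, acceptable, or slow, while still storing only a handful of numbers per sample.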

  31. Events - Rate Histograms

    View Slide

  32. Metrics Handling

    View Slide

  33. Metrics Handling

    View Slide

  34. Acquire Metrics
    Emit raw events
    • Low overhead
    • All information is persisted
    • Large amount of data must be handled

    View Slide

  35. Acquire Metrics
    Emit aggregations
    • Data volume is reduced
    • Some information is lost
    • Overhead for creating the aggregations

    View Slide

  36. Store Metrics
    Store on local file system
    • Fast access and low latency
    • Hard to collect data for evaluation

    View Slide

  37. Store Metrics
    Store on centralized server
    • Simple to collect data for evaluation
    • Slow access and high latency
    • Limitation in bandwidth

    View Slide

  38. Micrometer

    View Slide

  39. Micrometer
    • Metrics facade for Java
    • Only API - No Implementation
    • Like SLF4J but for metrics
    • Used in Spring Boot for metrics
    https://micrometer.io

    View Slide

  40. Micrometer Meter Types
    • timers
    • gauges
    • counters
    • distribution summaries
    • long task timers
    [Illustrations: Gauge, Counter, Timer]

    View Slide

  41. Micrometer Support
    • AppOptics
    • Azure Monitor
    • Netflix Atlas
    • CloudWatch
    • Datadog
    • Dynatrace
    • Elastic
    • Ganglia
    • Graphite
    • Humio
    • Influx/Telegraf
    • JMX
    • KairosDB
    • New Relic
    • Prometheus
    • SignalFx
    • Google Stackdriver
    • StatsD
    • Wavefront
    Java SPI is defined / used in Micrometer

    View Slide

  42. Micrometer Registry
    • A registry holds the collection of all measurements
    • Meters can be created by the registry

    final MeterRegistry registry = ...;
    final Counter c = registry.counter("test");

    View Slide

  43. Micrometer Registry
    • A registry can be created in several ways

    // Does not export anything; use it to play
    final MeterRegistry registry = new SimpleMeterRegistry();

    // Spring Boot provides a managed instance by default
    @Autowired MeterRegistry registry;

    // Specific registry for the monitoring system in use
    final MeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

    View Slide

  44. Micrometer names & Tags
    • Each meter has a base metric name
    • The base metric name does not have to be unique!

    A meter is defined by the metric name and tags

    final Counter c1 = registry.counter("database.calls");       // metric name only
    final Counter c2 = registry.counter("database.calls", tags); // metric name and tags

    View Slide

  45. Micrometer names & Tags
    • Tags can easily be defined as string pairs:

    Counter c = registry.counter("database.calls", // metric name
            "database", "production",              // tags
            "operation", "insert");

    Collection<Meter> meters = registry.find("database.calls")
            .tag("database", "production")
            .meters();

    View Slide

  46. Micrometer Counter Sample
    public boolean checkIfPrime(long number) {
        if (testPrimeNumber(number)) {
            registry.counter("example.prime.number", "type", "prime").increment();
            return true;
        }
        registry.counter("example.prime.number", "type", "not-prime").increment();
        return false;
    }

    View Slide

  47. Micrometer Counter Sample
    public boolean checkIfPrime(long number) {
        if (number < 1) {
            registry.counter("example.prime.number", "type", "not-natural").increment();
            return false;
        }
        if (number == 1) {
            registry.counter("example.prime.number", "type", "one").increment();
            return false;
        }
        // 2 is even but prime, so only even numbers greater than 2 are rejected here
        if (number > 2 && number % 2 == 0) {
            registry.counter("example.prime.number", "type", "even").increment();
            return false;
        }
        if (testPrimeNumber(number)) {
            registry.counter("example.prime.number", "type", "prime").increment();
            return true;
        }
        registry.counter("example.prime.number", "type", "not-prime").increment();
        return false;
    }


    View Slide

  48. Micrometer Timer Sample
    public Result executeStatement(String statement) {
        var result = registry.timer("myservice.db.requests").record(() -> {
            return database.execute(statement);
        });
        return result;
    }

    // Alternative: supported on managed beans by Spring Boot, Quarkus, ...
    @Timed("myservice.db.requests")
    public Result executeStatement(String statement) {
        return database.execute(statement);
    }

    View Slide

  49. Java Flight Recorder

    View Slide

  50. Monitoring Tools in your JDK
    • Java VisualVM (jvisualvm.exe)
    • JConsole (jconsole.exe)
    • Java Mission Control (jmc.exe)
    • Diagnostic Command Tool (jcmd.exe)
    Not shipped anymore with Java 9+
    Can be downloaded separately: https://visualvm.github.io

    View Slide

  51. Java Flight Recorder
    • Java Flight Recorder (JFR) is part of OpenJDK-based Java builds since version 11
    • JFR is integrated directly in the JVM
    • JFR affects the performance of a running application as little as possible

    View Slide

  52. JFR in Oracle JDK 8
    • For Java 8 the situation with JFR is more complex:
    • Before Java 8 update 262, JFR was only available as part of the Oracle JDK
    • Only Oracle support customers were allowed to use it, by setting the
      -XX:+UnlockCommercialFeatures and -XX:+FlightRecorder flags

    View Slide

  53. JFR in Oracle JDK 8
    • Since Java 8 update 262, JFR is part of any OpenJDK build
    • In addition, Java Mission Control releases can be downloaded from Eclipse Adoptium:
      https://adoptium.net/jmc.html

    View Slide

  54. View Slide

  55. View Slide

  56. View Slide

  57. Create custom JFR events
    // JFR API to define events
    import jdk.jfr.Event;
    import jdk.jfr.Label;
    import jdk.jfr.Name;

    // Define custom (structured) metadata
    @Name("com.karakun.Hello")
    @Label("Hello World!")
    class MyEvent extends Event { // the Event class provides basic functionality
        @Label("Message")
        String message;
    }

    View Slide

  58. Create custom JFR events
    MyEvent event = new MyEvent();
    event.begin();                  // method defined in the Event class
    event.message = "Hello world!";
    event.commit();                 // method defined in the Event class

    View Slide
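To see a custom event end to end, the following sketch records one event with the `jdk.jfr` API and reads it back from the dumped recording file. It is an illustration under assumptions, not code from the talk: the class `JfrHelloSketch`, the event name `com.example.Hello`, and the use of a temporary file are all hypothetical choices, and it needs a JDK that ships JFR (11 or later).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

/** Sketch: commit a custom JFR event and read it back (hypothetical names). */
class JfrHelloSketch {

    @Name("com.example.Hello") // hypothetical event name
    @Label("Hello World!")
    static class HelloEvent extends Event {
        @Label("Message")
        String message;
    }

    /** Records one HelloEvent with the given message and returns all recorded messages. */
    static List<String> recordAndRead(String message) {
        try {
            Path file = Files.createTempFile("hello", ".jfr");
            try (Recording recording = new Recording()) {
                recording.enable(HelloEvent.class);
                recording.start();

                HelloEvent event = new HelloEvent();
                event.begin();
                event.message = message;
                event.commit();

                recording.stop();
                recording.dump(file); // write the recorded data to the temporary file
            }
            List<String> messages = new ArrayList<>();
            for (RecordedEvent recorded : RecordingFile.readAllEvents(file)) {
                // The recording may contain other events, so filter by event name
                if (recorded.getEventType().getName().equals("com.example.Hello")) {
                    messages.add(recorded.getString("message"));
                }
            }
            Files.delete(file);
            return messages;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(recordAndRead("Hello world!"));
    }
}
```

In practice the recording is usually started from outside the application (for example with jcmd or Java Mission Control) and the application code only creates and commits events.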

  59. JFR Event Categories

    View Slide

  60. JFR Event Categories
    public static final String UPLOAD = "Upload";
    public static final String IMAGE_UPLOAD = "Image Upload";

    @Name("com.karakun.ImageRead")
    @Label("Image Read")
    @Category({UPLOAD, IMAGE_UPLOAD})
    class ImageReadEvent extends Event {
        ...
    }

    View Slide

  61. JFR Event Categories

    View Slide

  62. Measure Time with JFR Events
    MyEvent event = new MyEvent();
    event.begin();
    database.execute(statement); // this will take some time...
    event.commit();

    • JFR events contain timing metadata by default
    • Start time and duration will always be stored

    View Slide

  63. Centralized Metric
    Servers

    View Slide

  64. Centralized Metric Servers
    • For (micro-)services, the approach of local metrics and analytics tools does not work well
    • Assume you want to monitor several instances of a service
    • Assume you want to monitor different services

    View Slide

  65. Modern use cases

    View Slide

  66. Centralized Metric Servers
    • In addition, an application can run for a long time
    • And you want to have metrics over the complete lifetime of the application

    View Slide

  67. Centralized Metric Servers
    • Storing metrics in a centralized service has a lot of benefits
    • Store all metrics over a long application lifetime
    • Store metrics of multiple app / service instances
    • Store metrics of multiple services

    View Slide

  68. Centralized Metric Servers
    • There are several tools on the market
    • Open source / commercial
    • Cloud services (SaaS) / On Premises
    • In general you need to find the tools that work best for your infrastructure

    View Slide

  69. Centralized Metric Servers
    • We will have a look at a Prometheus & Grafana based centralized metric server solution
    • This combination can be found very often
    • Based on Open Source components
    • Good integration in tools & APIs

    View Slide

  70. Centralized Metric Servers
    [Diagram: Application, /metrics, Prometheus]

    View Slide
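What an application might serve under /metrics can be sketched with the HTTP server that ships with the JDK. This is only an illustration and not how the talk's demo is built: the class `MetricsEndpointSketch`, the counter name `database_calls_total`, and the port are assumptions, and a real application would normally let a library such as Micrometer's Prometheus registry produce the output. The `render` method writes counters in the Prometheus text exposition format (a `# TYPE` line followed by a name and value line).

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Sketch: exposing counters under /metrics for a scraper (hypothetical names). */
class MetricsEndpointSketch {

    /** Renders counters in the Prometheus text exposition format. */
    static String render(Map<String, Long> counters) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Long> entry : counters.entrySet()) {
            out.append("# TYPE ").append(entry.getKey()).append(" counter\n");
            out.append(entry.getKey()).append(' ').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }

    /** Starts an HTTP server that answers GET /metrics with the rendered counters. */
    static HttpServer startServer(int port, Map<String, Long> counters) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = render(counters).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) {
        // Prints what a scraper would receive; call startServer(8080, counters) to serve it over HTTP
        Map<String, Long> counters = Map.of("database_calls_total", 42L); // made-up value
        System.out.print(render(counters));
    }
}
```

The application only publishes its current values; the metric server decides how often to collect them and stores the history.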

  71. @hendrikEbbers

    View Slide

  72. Centralized Metric Servers
    [Diagram: Application, /metrics, Prometheus, query]

    View Slide

  73. @hendrikEbbers

    View Slide

  74. Centralized Metric Servers
    [Diagram: three Applications, Prometheus, query]

    View Slide

  75. Grafana Playground
    https://play.grafana.org

    View Slide

  76. Stay safe & healthy

    View Slide

  77. @hendrikEbbers
    dev.karakun.com

    View Slide