Taming performance issues into the wild: a practical guide to JVM profiling

The session will start with a quick introduction to the theory of profiling: it discusses the motivations, explains the different types of profiling and visualization formats, and lists the tools available for this purpose. This also includes some tooling for reliably generating load and validating the improvements.

Then we will guide the attendees through the profiling tools that we want to use throughout the session:

Java VisualVM: https://visualvm.github.io/download.html
Async-profiler: https://github.com/jvm-profiling-tools/async-profiler
JDK Mission Control: https://www.oracle.com/java/technologies/jdk-mission-control.html
JMH: https://github.com/openjdk/jmh
Hyperfoil: https://hyperfoil.io/ (for load generation)

Then we will put these tools to work in a practical real-world scenario. We will provide a sample Quarkus-based Java webapp, using a simple but realistic technology stack, with different performance issues involving excessive memory allocation, CPU consumption, slow blocking I/O, lock contention and cache misses. We will demonstrate how to use the aforementioned profiling tools to discover, investigate, fix and verify these issues.

Mario Fusco

May 05, 2023

Transcript

  1. Agenda
    ➢ (Java) Profiling Introduction
    ➢ Flame-graphs and table (tree) views with examples
      ◦ Mixed-Mode Flame Graphs
    ➢ Challenges of Java Sampling Profiling
      ◦ Safepoint bias
      ◦ Observer effect
      ◦ Native events (garbage collection, JNI, operating system calls, JIT)
      ◦ Skid of native perf events
      ◦ Method invocation counts
    ➢ Tools setup
    ➢ Introduction to the application to be profiled
      ◦ Quarkus and its threading model
    ➢ Load generation tooling
    ➢ Profiling and investigation sessions
  2. 3

  3. CPU Flamegraph By Example
    • The x-axis is an alphabetical sort of the stacks, i.e. NOT A TIME SERIES!
    • The top edge shows who is running on-CPU
    • Top-down shows ancestry, e.g. b() is called by a()
    • Width is proportional to sample presence, e.g. c() has ~twice the samples of d()
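As a rough illustration of how the widths and the alphabetical ordering in a flame graph can arise, here is a minimal sketch that folds stack samples into `a;b;c`-style keys with sample counts. The class name `FoldedStacks`, the helper names and the sample data are hypothetical and are not taken from the slides or from any profiler's code.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FoldedStacks {

    // Folds stack samples (outermost frame first) into "a;b;c" keys with a sample count.
    // A TreeMap keeps the keys sorted alphabetically, which mirrors the x-axis ordering:
    // the position of a stack says nothing about when it was sampled.
    static Map<String, Integer> fold(List<List<String>> samples) {
        Map<String, Integer> folded = new TreeMap<>();
        for (List<String> stack : samples) {
            folded.merge(String.join(";", stack), 1, Integer::sum);
        }
        return folded;
    }

    // Number of samples in which a frame is present, i.e. the width of that frame.
    static int width(List<List<String>> samples, String frame) {
        int count = 0;
        for (List<String> stack : samples) {
            if (stack.contains(frame)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical samples: a() calls b(), which calls c() or d().
        List<List<String>> samples = List.of(
                List.of("a", "b", "c"),
                List.of("a", "b", "d"),
                List.of("a", "b", "c"));
        fold(samples).forEach((stack, n) -> System.out.println(stack + " " + n));
        System.out.println("c=" + width(samples, "c") + " d=" + width(samples, "d"));
    }
}
```

With these samples, `c` is present in two samples and `d` in one, so `c` would be drawn about twice as wide as `d`.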
  4. Mixed-Mode Flame Graphs
    • Colors
      ◦ green - Java
      ◦ aqua - Java Inlined(!!!!)
      ◦ orange - Kernel code
      ◦ red - C libraries (e.g. JNI)
      ◦ yellow - C++ (e.g. JVM)
    • Color intensity is randomized to differentiate frames
    • Thread ID is added as a base frame
  5. Inlining is a way to optimize compiled source code at runtime by replacing invocations of the most frequently executed methods with their bodies. It is the responsibility of the Just-In-Time (JIT) compiler, which tries to inline the methods that are called most often so that the overhead of a method invocation can be avoided.
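A minimal sketch of a typical inlining candidate follows: a tiny accessor called from a hot loop. The class `InliningDemo` and its methods are hypothetical. Whether the JIT actually inlines `getValue()` depends on the JVM version and its heuristics; on HotSpot the decision can usually be inspected by running with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining`.

```java
public class InliningDemo {

    private final int value;

    InliningDemo(int value) {
        this.value = value;
    }

    // Tiny accessor: a likely inlining candidate once the calling loop becomes hot.
    int getValue() {
        return value;
    }

    // Hot loop that calls the accessor many times. If the call is inlined,
    // the loop body reads the field directly and the invocation overhead disappears.
    static long sum(InliningDemo[] items, int rounds) {
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            for (InliningDemo item : items) {
                total += item.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        InliningDemo[] items = new InliningDemo[1_000];
        for (int i = 0; i < items.length; i++) {
            items[i] = new InliningDemo(i);
        }
        // Enough iterations for the JIT to consider sum() and getValue() hot.
        System.out.println(sum(items, 20_000));
    }
}
```

In a mixed-mode flame graph, an inlined `getValue()` would show up as an aqua frame rather than a green one.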
  6. Native frames...on a JVM?!
    • JNI code
      ◦ I/O wrapper operations, user code calling native libs, etc.
    • Intrinsics
      ◦ System::arraycopy, Arrays::equals, ...vmSymbols.hpp
    • SIMD opportunity
      ◦ Arrays::fill
    • JVM C++ code
      ◦ JIT Compiler, GC, etc.
    • OS/Kernel C code
      ◦ I/O OS/Kernel calls, page faults, interrupt handlers, etc.
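The sketch below shows plain Java code that calls the methods named above. Such calls can surface as native or stub frames in a mixed-mode profile. The class `IntrinsicCandidates` and its helpers are hypothetical, and whether a given call is intrinsified or vectorized depends on the JVM version and the CPU.

```java
import java.util.Arrays;

public class IntrinsicCandidates {

    // System.arraycopy: listed on the slide as an intrinsic.
    static int[] copy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // Arrays.fill: listed on the slide as a SIMD opportunity.
    static int[] filled(int length, int value) {
        int[] array = new int[length];
        Arrays.fill(array, value);
        return array;
    }

    // Arrays.equals: listed on the slide as an intrinsic.
    static boolean same(int[] left, int[] right) {
        return Arrays.equals(left, right);
    }

    public static void main(String[] args) {
        int[] ones = filled(1_024, 1);
        int[] copied = copy(ones);
        System.out.println(same(ones, copied));
    }
}
```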
  7. Safewhat?
    safepoint: A point during program execution at which all GC roots are known and all heap object contents are consistent. From a global point of view, all threads must block at a safepoint before the GC can run. (As a special case, threads running JNI code can continue to run, because they use only handles. During a safepoint they must block instead of loading the contents of the handle.) From a local point of view, a safepoint is a distinguished point in a block of code where the executing thread may block for the GC. Most call sites qualify as safepoints. There are strong invariants which hold true at every safepoint, which may be disregarded at non-safepoints. Both compiled Java code and C/C++ code can be optimized between safepoints, but less so across safepoints. The JIT compiler emits a GC map at each safepoint. C/C++ code in the VM uses stylized macro-based conventions (e.g., TRAPS) to mark potential safepoints.
    - HotSpot Glossary of Terms -
  8. Safepoint operations (i.e. operations that require a safepoint)
    • GC
    • Deoptimization
    • PrintThreads
    • PrintJNI
    • FindDeadlocks
    • ThreadDump
    • EnableBiasedLocking
    • RevokeBias
    • HeapDumper
    • GetAllStackTraces
    • GetStackTrace
    • [-XX:GuaranteedSafepointInterval=1000]
    • ...
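A minimal sketch of ordinary Java calls that are expected to need a safepoint on HotSpot: collecting the stack traces of all threads and running deadlock detection. The class `SafepointOps` is hypothetical, and the exact mapping of each call to one of the operation names above is an assumption that can vary between JDK versions. Running the program with `-Xlog:safepoint` should make the resulting safepoints visible in the log.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

public class SafepointOps {

    // Collects the stack traces of all live threads. On HotSpot this is expected
    // to be served by a thread-dump style VM operation at a safepoint.
    static int countThreadsWithStacks() {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        return traces.size();
    }

    // Runs deadlock detection through the standard management API.
    // findDeadlockedThreads() returns null when no deadlock is found.
    static int countDeadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("threads: " + countThreadsWithStacks());
        System.out.println("deadlocked: " + countDeadlockedThreads());
    }
}
```

A profiler that samples through calls of this kind only sees threads at safepoints, which is the source of the safepoint bias listed in the agenda.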
  9. Observer effects
    -Xlog:safepoint makes it easier to spot how much safepoint-biased profilers can affect a profiled program
  10. Why choose it?
    • AsyncGetCallTrace (no safepoint bias, but can collect “corrupted” Java frames)
      ◦ Linux Timer + Signal Handler (ITIMER_PROF/SIGPROF)
    • Native using perf_events
    • Out-of-the-box Flame-Graphs support
    • Open-Source
    • Very low Observer effect
    • Java 6+
    • Can profile Java Monitor/ReentrantLock, allocations*, Cache Misses...
  11. 29

  12. 30

  13. Hardware Event Skid
    Event skid is the recording of an event not exactly on the code line that caused the event. It may even result in a caller function event being recorded in the callee function. Event skid is caused by a number of factors:
    • The delay in propagating the event out of the processor's microcode through the interrupt controller (APIC) and back into the processor.
    • The current instruction retirement cycle must be completed.
    • When the interrupt is received, the processor must serialize its instruction stream, which causes a flushing of the execution pipeline.