
Measure, don’t guess - Benchmarking stories from the trenches

Mario Fusco
April 16, 2024

How many times have you implemented a clever performance improvement, and maybe put it in production, because it seemed the Right Thing™ to do, without even measuring the actual consequences of your change? And even if you are measuring, are you using the right tools and interpreting the results correctly? In this deep-dive session we will use examples taken from real-world situations to demonstrate how to develop meaningful benchmarks, avoiding the most common, and often subtle, pitfalls, and how to correctly interpret their results and take action to improve them. In particular, we will illustrate how to use JMH for these purposes, explaining why it is the only reliable tool for benchmarking Java applications, and showing what can go horribly wrong if you decide to measure the actual performance of a Java program without it. At the end of this session you will be able to create your own JMH-based benchmarks and, more importantly, to use their results effectively to improve the overall performance of your software.


Transcript

  1. So You Want to Write a (Micro)Benchmark
     1. Read a reputable paper on JVMs and micro-benchmarking.
     2. Always include a warmup phase which runs your test kernel all the way through, enough to trigger all initializations and compilations before timing phase(s) (see the sketch after this list).
     3. Always run with -XX:+PrintCompilation, -verbose:gc, etc., so you can verify that the compiler and other parts of the JVM are not doing unexpected work during your timing phase.
        a. Print messages at the beginning and end of timing and warmup phases, so you can verify that there is no output during the timing phase.
     4. Be aware of the difference between -client and -server, and OSR and regular compilations. Also be aware of the effects of -XX:+TieredCompilation, which mixes client and server modes together.
     5. Be aware of initialization effects. Do not print for the first time during your timing phase, since printing loads and initializes classes. Do not load new classes outside of the warmup/reporting phase, unless you are testing class loading.
     6. Be aware of deoptimization and recompilation effects.
     7. Use appropriate tools to read the compiler's mind, and expect to be surprised by the code it produces. Inspect the code yourself before forming theories about what makes something faster or slower.
     8. Reduce noise in your measurements. Run your benchmark on a quiet machine, and run it several times, discarding outliers.
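
     To make points 2 and 3 concrete, here is a minimal hand-rolled sketch, assuming a trivial, made-up testKernel and arbitrary iteration counts (JMH automates all of this, and much more, for you):

       // Run with: java -XX:+PrintCompilation -verbose:gc NaiveBenchmark
       public class NaiveBenchmark {

           // Hypothetical test kernel, for illustration only
           static long testKernel() {
               long sum = 0;
               for (int i = 0; i < 1_000_000; i++) {
                   sum += i;
               }
               return sum;
           }

           public static void main(String[] args) {
               long sink = 0;

               System.out.println("Warmup phase start");
               for (int i = 0; i < 10_000; i++) {   // enough full runs to trigger all initializations and compilations
                   sink += testKernel();
               }
               System.out.println("Warmup phase end, timing phase start");

               // No compilation or GC output should appear between these two markers
               long start = System.nanoTime();
               for (int i = 0; i < 10_000; i++) {
                   sink += testKernel();
               }
               long elapsed = System.nanoTime() - start;

               System.out.println("Timing phase end: " + (elapsed / 10_000) + " ns/op (sink=" + sink + ")");
           }
       }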
  2. In Java-ish...
     The optimized version executes the load of the field just once for each test and (incredibly) gets the same results too!
     * The actual optimization depends on the JVM version
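
     A sketch of the kind of transformation meant here, with hypothetical code (whether and exactly how the JIT applies it depends on the JVM version): a non-volatile field read inside a loop may be hoisted, so the field is loaded only once.

       class FieldHoisting {
           boolean done;   // note: NOT volatile

           // What the source code says: re-read the field on every iteration
           long whatYouWrote() {
               long count = 0;
               while (!done) {
                   count++;
               }
               return count;
           }

           // What the JIT may effectively run: the field is loaded just once,
           // so a concurrent write to `done` is never observed by this loop
           long whatMayActuallyRun() {
               long count = 0;
               boolean localDone = done;   // single load, hoisted out of the loop
               while (!localDone) {
                   count++;
               }
               return count;
           }
       }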
  3. USE JMH USE JMH USE JMH
     “A badly written benchmark can lead you to wrong conclusions that will make you focus on useless optimizations, confusing yourself and wasting others’ time” - An anonymous performance engineer -
     * Effects of a poorly written benchmark
  4. A bad benchmark (and its meaningless results) also misleads others
     How many times has a badly written blog post pushed developers to adopt bad practices? 😢
  5. JMH TLDR
     “JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM.” - OpenJDK Code Tools -
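
     A minimal JMH benchmark sketch (class and method names are made up; the JMH Samples linked in the references are the canonical starting point):

       import org.openjdk.jmh.annotations.Benchmark;
       import org.openjdk.jmh.runner.Runner;
       import org.openjdk.jmh.runner.options.Options;
       import org.openjdk.jmh.runner.options.OptionsBuilder;

       public class MyFirstBenchmark {

           @Benchmark
           public double measureLog() {
               // JMH repeatedly invokes this method and reports the score;
               // returning the value prevents dead-code elimination
               return Math.log(42.0);
           }

           public static void main(String[] args) throws Exception {
               Options opt = new OptionsBuilder()
                       .include(MyFirstBenchmark.class.getSimpleName())
                       .forks(1)
                       .warmupIterations(5)
                       .measurementIterations(5)
                       .build();
               new Runner(opt).run();
           }
       }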
  6. Test Harness
     “Is a collection of software and test data configured to test a program unit by running it under varying conditions and monitoring its behavior and outputs. ... The typical objectives of a test harness are to:
     • Automate the testing process.
     • Execute test suites of test cases.
     • Generate associated test reports.” - Wikipedia: Test Harness -
  7. Under the hood
     • Method under benchmark
     • nanoTime() is a costly operation, called only once
     • isDone is a volatile variable set by a timer
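
     A plain-Java sketch of what that generated measurement loop looks like (simplified; the code JMH actually generates is more elaborate):

       class MeasurementLoop {
           static volatile boolean isDone;   // flipped by a timer thread when the measurement interval expires

           static void measure(Runnable methodUnderBenchmark) {
               long operations = 0;
               long start = System.nanoTime();   // costly call, made only once, before the loop
               do {
                   methodUnderBenchmark.run();   // the method under benchmark
                   operations++;
               } while (!isDone);                // cheap volatile read on every iteration
               long stop = System.nanoTime();    // and only once again, after the loop
               System.out.println(operations + " ops in " + (stop - start) + " ns");
           }
       }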
  8. Purpose is everything
     “Benchmark numbers don’t matter on their own. It’s important what models you derive from those numbers.”
  9. Making sense of data: Active vs. passive benchmarking
     • Passive Benchmarking
       ◦ Benchmarks are commonly executed and then ignored until they have completed. That is passive benchmarking, where the main objective is the collection of benchmark data. Data is not Information.
     • Active Benchmarking
       ◦ With active benchmarking, you analyze performance while the benchmark is still running (not just after it's done), using other tools. You can confirm that the benchmark tests what you intend it to, and that you understand what that is. Data becomes Information. This can also identify the true limiters of the system under test, or of the benchmark itself.
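
     JMH supports active benchmarking by letting you attach profilers to a run while it executes; a sketch using the built-in GC profiler (the included name refers to the hypothetical MyFirstBenchmark sketch above; on the command line the equivalent is -prof gc):

       import org.openjdk.jmh.profile.GCProfiler;
       import org.openjdk.jmh.runner.Runner;
       import org.openjdk.jmh.runner.options.Options;
       import org.openjdk.jmh.runner.options.OptionsBuilder;

       public class ActiveRun {
           public static void main(String[] args) throws Exception {
               Options opt = new OptionsBuilder()
                       .include("MyFirstBenchmark")      // hypothetical benchmark class from the earlier sketch
                       .addProfiler(GCProfiler.class)    // reports allocation rate and GC activity alongside the scores
                       .build();
               new Runner(opt).run();
           }
       }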
  10. To recap
     Benchmarks are experiments intended to reproduce, in a controlled environment, exactly the same behaviour that you would otherwise experience in the wild.
  11. To recap (yes, I should have told you before 😛)
     Software Engineer
     • Mostly doesn’t care about underlying hardware and data specifics
     • Works based on abstract principles, actual formal science
     • Cares about writing beautiful, readable, composable, reusable … code
     Software Performance Engineer
     • Explores complex interactions between hardware, software, and data
     • Works based on empirical evidence, more similar to natural science
     • Sacrifices all good software principles to squeeze the last microsecond
  12. References
     • Code examples - https://github.com/mariofusco/jmh-playground
     • So You Want to Write a Micro-Benchmark - https://wiki.openjdk.org/display/HotSpot/MicroBenchmarks
     • Active Benchmarking - https://www.brendangregg.com/activebenchmarking.html
     • JMH - https://github.com/openjdk/jmh
     • JMH Samples - https://github.com/openjdk/jmh/tree/master/jmh-samples/src/main/java/org/openjdk/jmh/samples
     • VM Options Explorer - https://chriswhocodes.com/
     • HotSpot disassembly plugin - https://chriswhocodes.com/hsdis/
     • Environment OS Tuning - https://github.com/ionutbalosin/jvm-performance-benchmarks?tab=readme-ov-file#os-tuning
     • JMH Visualizer - https://jmh.morethan.io/
     • Mastering the mechanics of Java method invocation - https://blogs.oracle.com/javamagazine/post/mastering-the-mechanics-of-java-method-invocation
     • What’s Wrong With My Benchmark Results? Studying Bad Practices in JMH Benchmarks