Tradeoffs, Bad Science, and Polar Bears – The World of Java Optimisation

Welcome to the Java optimisation jungle. Why can’t we “just make it go faster”? It turns out that, in most cases, we first need to work out “faster for whom?”, “why do we want to go faster?”, and “what even is faster?” This talk introduces the basic principles of optimisation, then bounces through its pitfalls: why the exact same techniques which make Quarkus rocket-fast were a terrible idea fifteen years ago, why fast benchmarks make for slow programs, why Project Loom may not be the speedup you’re hoping for, and why, even though it can be easy to get wrong, optimisation really, really matters. Along the way we’ll talk about measuring things, bad advice, garbage collection, and climate change.

Holly Cummins

October 04, 2023

Transcript

  1. tradeoffs, bad science, and polar bears: the world of java optimisation

    Holly Cummins, Red Hat. Devoxx Belgium, October 5, 2023
  5. #Quarkus @holly_cummins why optimise?

    0.5s extra search page time: 20% drop in traffic; 10 ms delay in trading platform: 10% drop in revenue; 100 ms latency on page load: 7% lower conversion rate
  11. performance can be:

    throughput, latency, capacity, utilisation, ramp-up time, transactions per second, response time, start-up time, CPU usage, footprint, bandwidth, …
  13. “Never underestimate the bandwidth [throughput] of a station wagon full of tapes hurtling down the highway.” – Andrew Tanenbaum, 1981

    but the latency is terrible …
  14. quarkus + graalvm: trading off flexibility + throughput against startup speed and footprint

    uhh … are you supposed to shut down applications after using them?
  19. aside: quarkus + jvm

    better startup time, better footprint, better throughput, better developer experience. there is no tradeoff (only elimination of waste). ok, there is a tradeoff: not optimising for dynamism nobody needs in the cloud
  21. recap: which is “faster?”

    diagram: Application on Quarkus on OpenJDK, and Application on Quarkus on GraalVM. labels: ephemeral or serverless; running your application for a long time
  25. lagging indicators: we care about them, easy to measure, hard to change

    leading indicators: predictive of a thing we care about, hard to identify, easy to change
  26. bad-ish advice: “reduce time spent in garbage collection”

    actually, garbage collection can make your application go faster
  31. -verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput (tool: GCMV)

    one run: total GC time 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s. other run: total GC time 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s
  38. one run: total GC time 21.6s, 4.1% of time in GC pause, 23.9 GB garbage collected, 493 transactions/s. other run: total GC time 12.0s, 3.6% of time in GC pause, 13.0 GB garbage collected, 260 transactions/s

    leading indicator? lagging indicator?
  39. so wait, what changed to make the app faster?

    running jmeter on the same machine as the app gives a big speedup!
  40. zgc (Brian Goetz, yesterday)

    tradeoff: throughput is ~2% lower. tradeoff: memory is higher. java 21 adds generational zgc, reduces zgc footprint by 75%
  41. the takeaways:

    gc can improve performance by rearranging the heap; find the bottleneck; validate advice independently; optimise the right thing … for you
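
A figure like the “% of time in GC pause” above can also be approximated from inside a running JVM. The sketch below is a minimal illustration, not the GCMV tool named on the slides: it sums the collection time reported by the standard `java.lang.management` beans and divides by JVM uptime. The class and method names (`GcShareSketch`, `totalGcMillis`, `gcSharePercent`) are made up for this example, and for concurrent collectors the reported time is not necessarily pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the talk: reads GC time from the JVM's own beans.
final class GcShareSketch {

    // Sum of the accumulated collection time (ms) reported by every collector.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Reported GC time as a percentage of JVM uptime.
    static double gcSharePercent() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        // Allocate short-lived objects so the collectors have some work to do.
        long allocated = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] junk = new byte[1024];
            allocated += junk.length;
        }
        System.out.println("allocated bytes: " + allocated);
        System.out.printf("GC time: %d ms (%.2f%% of uptime)%n", totalGcMillis(), gcSharePercent());
    }
}
```

In line with the takeaways, a number like this is only useful next to the metric that actually matters, such as transactions per second.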
  42. #IBM @holly_cummins noooooo! “to tune your JVM, use this command-line:”

    -server -Xms1g -Xmx1g -XX:PermSize=1g -XX:MaxPermSize=256m -Xmn256m -Xss64k -XX:SurvivorRatio=30 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=10 -XX:+ScavengeBeforeFullGC -XX:+CMSScavengeBeforeRemark -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -Dsun.net.inetaddr.ttl=5 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=`date`.hprof -Dcom.sun.management.jmxremote.port=5616 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -server -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewRatio=1 -XX:+UseConcMarkSweepGC
  43. static String beSlow() { String result = ""; for (int i = 0; i < 314159; i++) { result += getStringData(i); } return result; }
  47. @Override public String toString() { String ret = "\n\tMarket Summary at: " + getSummaryDate() + "\n\t\t TSIA:" + getTSIA() + "\n\t\t openTSIA:" + getOpenTSIA() + "\n\t\t gain:" + getGainPercent() + "\n\t\t volume:" + getVolume(); if ((getTopGainers() == null) || (getTopLosers() == null)) { return ret; } ret += "\n\t\t Current Top Gainers:"; Iterator<QuoteDataBean> it = getTopGainers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } ret += "\n\t\t Current Top Losers:"; it = getTopLosers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } return ret; }

    this never gets called
  48. static String beSlow() { String result = ""; for (int i = 0; i < 314159; i++) { result += getStringData(i); } return result; }
  50. static String beSlow() { String result = ""; result += getStringData(1); result += getStringData(2); result += getStringData(3); return result; }

    this is fine
  51. the JVM writers have far more time for optimising than you do

    clean, typical code runs best
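
The contrast drawn on the `beSlow()` slides can be sketched as follows. This is an illustration under assumptions rather than code from the talk: the body of `getStringData` is not shown in the deck, so the stub here is hypothetical, and the loop count is reduced from the slide's 314159 to keep the demo quick.

```java
// Illustrative sketch; ConcatSketch and its method names are invented for this example.
final class ConcatSketch {

    // Hypothetical stand-in for the getStringData(i) call on the slides.
    static String getStringData(int i) {
        return Integer.toString(i);
    }

    // Loop version from the slides: every += creates a new String that copies
    // everything accumulated so far, so the total copying grows quadratically.
    static String concatInLoop(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += getStringData(i);
        }
        return result;
    }

    // Same output, but a single growable buffer is reused across iterations.
    static String buildInLoop(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(getStringData(i));
        }
        return sb.toString();
    }

    // A small, fixed number of concatenations: the "this is fine" case.
    static String fewConcats() {
        String result = "";
        result += getStringData(1);
        result += getStringData(2);
        result += getStringData(3);
        return result;
    }

    public static void main(String[] args) {
        int n = 20_000;
        long t0 = System.nanoTime();
        String viaConcat = concatInLoop(n);
        long t1 = System.nanoTime();
        String viaBuilder = buildInLoop(n);
        long t2 = System.nanoTime();
        System.out.println("same output: " + viaConcat.equals(viaBuilder));
        System.out.printf("+= in loop: %d ms, StringBuilder: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        System.out.println("few concats: " + fewConcats());
    }
}
```

The timing in `main` is a naive one-shot measurement, not a benchmark; a harness such as JMH would be needed for numbers worth trusting, and the rewrite only pays off if the method is actually called on a hot path.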
  52. why did the quarkus 2.0.0 codebase use stringbuilder so much?

    remember, the outside world can change but …
  54. loom: very good at waiting; not so good at cpu-bound tasks; can interact badly with libraries

    if a virtual thread gets pinned or does a long cpu process, all its friends grind to a halt
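
The “very good at waiting” point can be illustrated with a small sketch, assuming Java 21 or later. It starts one virtual thread per task and has each task sleep as a stand-in for blocking I/O; the class and method names are invented for this example, and the task count and sleep time are arbitrary.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch (requires Java 21+); not code from the talk.
final class VirtualThreadWaitSketch {

    // Runs `tasks` blocking tasks, one virtual thread each, and returns the
    // elapsed wall-clock time in milliseconds.
    static long runSleepingTasks(int tasks, long sleepMillis) {
        long start = System.nanoTime();
        // Leaving the try block calls close(), which waits for submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(sleepMillis); // stand-in for waiting on I/O
                    return null;
                });
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int tasks = 1_000;
        long sleepMillis = 100;
        long elapsed = runSleepingTasks(tasks, sleepMillis);
        // Run one after another, the same waits would add up to tasks * sleepMillis.
        System.out.printf("%d waiting tasks finished in %d ms (sequential waits would total %d ms)%n",
                tasks, elapsed, tasks * sleepMillis);
    }
}
```

Replacing the sleep with a long CPU-bound computation would remove the benefit, since the number of carrier threads is limited; that is the “not so good at cpu-bound tasks” half of the slide.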
  71. method profiler: VisualVM, Mission Control, IntelliJ Profiler, IBM Health Center (for OpenJ9), flame graphs; GC analysis: GCMV; heap analysis: Eclipse MAT; APM: GlowRoot, New Relic*, AppDynamics*, Dynatrace*

    * not free. this is an incomplete list, because there are a lot of tools out there, and many cost money
  72. “When it comes to IT performance, amateurs look at averages. Professionals look at distributions.” – Avishai Ish-Shalom
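
A small worked example shows why the average can mislead. The latency samples below are invented for illustration (98 requests at 10 ms and 2 at 2000 ms), and the class and method names are hypothetical; the percentile uses the simple nearest-rank definition.

```java
import java.util.Arrays;

// Illustrative sketch with made-up data; not measurements from the talk.
final class LatencyDistributionSketch {

    // Invented sample: 98 fast requests (10 ms) and 2 slow ones (2000 ms).
    static long[] sampleLatencies() {
        long[] samples = new long[100];
        Arrays.fill(samples, 10L);
        samples[98] = 2000L;
        samples[99] = 2000L;
        return samples;
    }

    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    // Nearest-rank percentile: the smallest sample with at least p% of samples <= it.
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] samples = sampleLatencies();
        System.out.println("mean ms: " + mean(samples));
        System.out.println("p50 ms:  " + percentile(samples, 50));
        System.out.println("p99 ms:  " + percentile(samples, 99));
    }
}
```

For this sample the mean is 49.8 ms, a latency no request actually experienced, while the median is 10 ms and the 99th percentile is 2000 ms.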
  73. if you leave the TV on when you’re not using it, you’re a polar bear murderer
  74. @holly_cummins #RedHat cost impact of framework choice (source: Clement Escoffier)

    Setup: 800 requests/second, over 20 days; SLA > 99%; AWS instances. Assumptions: costs are for us-east-1 data centre
  76. cloud carbon impact of framework choice (source: Clement Escoffier x Teads)

    Setup: 800 requests/second, over 20 days; SLA > 99%. Assumptions: 50% load; us-east-1 data centre; Teads dataset. economic model in action: the cost and carbon metrics are (roughly) the same
  80. climate impact of framework choice: capacity (source: John O’Hara)

    Setup: REST + CRUD; large heap; RAPL energy measurement. Assumptions: US energy mix. shorter line means lower max throughput; higher line means worse carbon footprint. vrrrooom model in action: quarkus on JVM has the smallest footprint … because it has the highest throughput
  81. sooo … you can optimise, and it can be fun

    measure, don’t guess; only optimise what matters. @holly_cummins #Quarkus