
Tradeoffs, Bad Science, and Polar Bears – The World of Java Optimisation

Welcome to the Java optimisation jungle. Why can’t we “just make it go faster”? It turns out that, in most cases, we first need to work out “faster for whom?”, “why do we want to go faster?”, and “what even is faster?” This talk introduces the basic principles of optimisation, then bounces through its pitfalls: why the exact same techniques which make Quarkus rocket-fast were a terrible idea fifteen years ago, why fast benchmarks make for slow programs, why Project Loom may not be the speedup you’re hoping for, and why, even though it is easy to get wrong, optimisation really, really matters. Along the way we’ll talk about measuring things, bad advice, garbage collection, and climate change.

Holly Cummins

October 04, 2023
Transcript

  1. tradeoffs, bad science, and polar bears: the world of java

    optimisation Holly Cummins Red Hat Devoxx Belgium October 5, 2023
  2. #Quarkus @holly_cummins@hachyderm.io why optimise? 0.5s extra search page time 20%

    drop in traffic 100 ms latency on page load 7% lower conversion rate
  4. #Quarkus @holly_cummins@hachyderm.io why optimise? 0.5s extra search page time 20%

    drop in traffic 10 ms delay in trading platform 100 ms latency on page load 7% lower conversion rate
  5. #Quarkus @holly_cummins@hachyderm.io why optimise? 0.5s extra search page time 20%

    drop in traffic 10 ms delay in trading platform 10% drop in revenue 100 ms latency on page load 7% lower conversion rate
  6. #Quarkus @holly_cummins performance can be: throughput latency capacity ramp-up time

    transactions per second response time start-up time bandwidth
  7. #Quarkus @holly_cummins performance can be: throughput latency capacity ramp-up time

    transactions per second response time start-up time footprint bandwidth
  8. #Quarkus @holly_cummins performance can be: throughput latency capacity ramp-up time

    transactions per second response time start-up time CPU usage footprint bandwidth
  9. #Quarkus @holly_cummins performance can be: throughput latency capacity utilisation ramp-up

    time transactions per second response time start-up time CPU usage footprint bandwidth
  10. #Quarkus @holly_cummins performance can be: throughput latency capacity utilisation …

    ramp-up time transactions per second response time start-up time CPU usage footprint bandwidth
  12. #Quarkus @holly_cummins@hachyderm.io Never underestimate the bandwidth [throughput] of a station

    wagon full of tapes hurtling down the highway. –Andrew Tanenbaum, 1981
  13. #Quarkus @holly_cummins@hachyderm.io Never underestimate the bandwidth [throughput] of a station

    wagon full of tapes hurtling down the highway. –Andrew Tanenbaum, 1981 but the latency is terrible …
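
The station wagon point can be made concrete with a little arithmetic. The sketch below is only an illustration: the tape count, tape capacity and trip length are invented numbers, not figures from the talk or from Tanenbaum, and `StationWagon` is a hypothetical helper.

```java
// Back-of-the-envelope sketch; every figure passed in main() is a made-up assumption.
class StationWagon {
    /** Throughput in gigabits per second for a payload delivered in a single trip. */
    static double throughputGbps(long tapes, double terabytesPerTape, double tripHours) {
        double bits = tapes * terabytesPerTape * 1e12 * 8; // terabytes -> bits
        double seconds = tripHours * 3600;
        return bits / seconds / 1e9;
    }

    public static void main(String[] args) {
        // Assumed: 1,000 tapes of 1 TB each, and a 10 hour drive.
        double gbps = throughputGbps(1_000, 1.0, 10.0);
        System.out.printf("throughput: about %.0f Gbit/s%n", gbps);
        // Nothing arrives until the car does, so the first byte takes the whole trip.
        System.out.println("latency to first byte: 10 hours");
    }
}
```

With these assumed numbers the throughput is roughly 222 Gbit/s, yet the first byte takes ten hours to arrive, which is why "faster" depends on which metric you care about.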
  14. #Quarkus @holly_cummins@hachyderm.io quarkus + graalvm trading-off flexibility + throughput against

    startup speed and footprint uhh … are you supposed to shut down applications after using them?
  15. #Quarkus @holly_cummins@hachyderm.io aside: quarkus + jvm better startup time better

    footprint better throughput better developer experience
  17. #Quarkus @holly_cummins@hachyderm.io aside: quarkus + jvm better startup time better

    footprint better throughput better developer experience there is no tradeoff
  18. #Quarkus @holly_cummins@hachyderm.io aside: quarkus + jvm better startup time better

    footprint better throughput better developer experience there is no tradeoff (only elimination of waste)
  19. #Quarkus @holly_cummins@hachyderm.io aside: quarkus + jvm better startup time better

    footprint better throughput better developer experience there is no tradeoff (only elimination of waste) ok, there is a tradeoff: not optimising for dynamism nobody needs in the cloud
  20. #Quarkus @holly_cummins ephemeral or serverless OpenJDK GraalVM Quarkus Quarkus Application

    Application running your application for a long time recap: which is “faster?”
  22. #Quarkus @holly_cummins@hachyderm.io lagging indicators: we care about them, easy to

    measure, hard to change; leading indicators: easy to change
  23. #Quarkus @holly_cummins@hachyderm.io lagging indicators: we care about them, easy to

    measure, hard to change; leading indicators: predictive of a thing we care about, easy to change
  24. #Quarkus @holly_cummins@hachyderm.io lagging indicators: we care about them, easy to

    measure, hard to change; leading indicators: predictive of a thing we care about, hard to identify, easy to change
  25. #Quarkus @holly_cummins@hachyderm.io lagging indicators: we care about them, easy to

    measure, hard to change; leading indicators: predictive of a thing we care about, hard to identify, easy to change
  26. #Quarkus @holly_cummins@hachyderm.io bad-ish advice: “reduce time spent in garbage collection”

    actually, garbage collection can make your application go faster
  27. #Quarkus @holly_cummins@hachyderm.io -verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput total GC time: 21.6s 4.1%

    of time in GC pause 23.9 GB garbage collected total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected tool: GCMV
  28. #Quarkus @holly_cummins@hachyderm.io -verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput total GC time: 21.6s 4.1%

    of time in GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s tool: GCMV
  31. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s
  33. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator
  35. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator lagging indicator
  36. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator lagging indicator ?
  37. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator lagging indicator ? ?
  38. #Quarkus @holly_cummins@hachyderm.io so wait, what changed to make the app

    faster? running jmeter on the same machine as the app gives a big speedup!
  39. #Quarkus @holly_cummins@hachyderm.io zgc Brian Goetz, yesterday tradeoff: throughput is ~2%

    lower tradeoff: memory is higher java 21 adds generational zgc reduces zgc footprint by 75%
  40. #Quarkus @holly_cummins@hachyderm.io the takeaways: gc can improve performance by rearranging

    the heap find the bottleneck validate advice independently optimise the right thing … for you
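
One way to "validate advice independently" is to record GC time and throughput for the same run, instead of assuming that less GC time means a faster application. The sketch below is a minimal illustration, not the setup used in the talk (which used verbose GC logs and GCMV). It reads GC time through the standard `GarbageCollectorMXBean` API. The class name and the allocation-only "transaction" are hypothetical stand-ins, and a loop like this is a toy measurement rather than a proper benchmark.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcVersusThroughput {
    /** Total time the JVM reports having spent in GC so far, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the transaction n times and returns transactions per second. */
    static double transactionsPerSecond(Runnable transaction, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            transaction.run();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return n / seconds;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        // Hypothetical stand-in for real work: allocate some short-lived garbage.
        double tps = transactionsPerSecond(() -> {
            byte[] scratch = new byte[64 * 1024];
            scratch[0] = 1;
        }, 100_000);
        long gcSpent = totalGcMillis() - gcBefore;
        System.out.printf("GC time: %d ms, throughput: %.0f transactions/s%n", gcSpent, tps);
    }
}
```

Comparing both numbers across two configurations shows whether a drop in GC time actually came with more transactions per second.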
  41. #IBM @holly_cummins noooooo! “to tune your JVM, use this command-line:”

    -server -Xms1g -Xmx1g -XX:PermSize=1g -XX:MaxPermSize=256m -Xmn256m -Xss64k -XX:SurvivorRatio=30 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=10 -XX:+ScavengeBeforeFullGC -XX:+CMSScavengeBeforeRemark -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -Dsun.net.inetaddr.ttl=5 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=`date`.hprof -Dcom.sun.management.jmxremote.port=5616 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -server -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewRatio=1 -XX:+UseConcMarkSweepGC
  42. #Quarkus @holly_cummins@hachyderm.io static String beSlow() { String result = "";

    for (int i = 0; i < 314159; i++) { result += getStringData(i); } return result; }
  43. #Quarkus @holly_cummins@hachyderm.io @Override public String toString() { String ret =

    "\n\tMarket Summary at: " + getSummaryDate() + "\n\t\t TSIA:" + getTSIA() + "\n\t\t openTSIA:" + getOpenTSIA() + "\n\t\t gain:" + getGainPercent() + "\n\t\t volume:" + getVolume(); if ((getTopGainers() == null) || (getTopLosers() == null)) { return ret; } ret += "\n\t\t Current Top Gainers:"; Iterator<QuoteDataBean> it = getTopGainers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } ret += "\n\t\t Current Top Losers:"; it = getTopLosers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } return ret; }
  46. #Quarkus @holly_cummins@hachyderm.io @Override public String toString() { String ret =

    "\n\tMarket Summary at: " + getSummaryDate() + "\n\t\t TSIA:" + getTSIA() + "\n\t\t openTSIA:" + getOpenTSIA() + "\n\t\t gain:" + getGainPercent() + "\n\t\t volume:" + getVolume(); if ((getTopGainers() == null) || (getTopLosers() == null)) { return ret; } ret += "\n\t\t Current Top Gainers:"; Iterator<QuoteDataBean> it = getTopGainers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } ret += "\n\t\t Current Top Losers:"; it = getTopLosers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } return ret; } this never gets called
  47. #Quarkus @holly_cummins@hachyderm.io static String beSlow() { String result = "";

    for (int i = 0; i < 314159; i++) { result += getStringData(i); } return result; }
  48. #Quarkus @holly_cummins@hachyderm.io static String beSlow() { String result = "";

    result += getStringData(1); result += getStringData(2); result += getStringData(3); return result; }
  49. #Quarkus @holly_cummins@hachyderm.io static String beSlow() { String result = "";

    result += getStringData(1); result += getStringData(2); result += getStringData(3); return result; } this is fine
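
For the loop case, where concatenation really is on a hot path, a common alternative is a single `StringBuilder`. The sketch below is an illustration under assumptions: the slides never show `getStringData`, so the version here is a hypothetical stand-in, and the loop count is far smaller than the slide's 314159 so that the slow variant finishes quickly.

```java
class Concatenation {
    // Hypothetical stand-in: the slides do not show what getStringData returns.
    static String getStringData(int i) {
        return Integer.toString(i);
    }

    /** The shape from the slide: each += copies everything accumulated so far. */
    static String concatInLoop(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += getStringData(i);
        }
        return result;
    }

    /** Same output, but appended into one growable buffer. */
    static String builderInLoop(int n) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < n; i++) {
            result.append(getStringData(i));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        int n = 20_000; // assumed size, chosen only to keep the demo short
        System.out.println(concatInLoop(n).equals(builderInLoop(n))); // prints "true"
    }
}
```

Both methods return the same string. Whether the difference matters depends on whether the code is ever on a hot path, which is something to measure rather than guess.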
  50. #Quarkus @holly_cummins@hachyderm.io the JVM writers have far more time for

    optimising than you do clean, typical, code runs best
  51. #Quarkus @holly_cummins@hachyderm.io loom very good at waiting not so good

    at cpu-bound tasks can interact badly with libraries
  52. #Quarkus @holly_cummins@hachyderm.io loom very good at waiting not so good

    at cpu-bound tasks can interact badly with libraries if a virtual thread gets pinned or does a long cpu process all its friends grind to a halt
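
The "very good at waiting" case can be sketched with the virtual thread executor that became final in Java 21. This is a minimal illustration, assuming a Java 21 or later runtime. The class name, task count and sleep duration are invented for the example, and `Thread.sleep` stands in for blocking I/O.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class WaitingOnVirtualThreads {
    /** Starts one virtual thread per task; each task only waits, then counts itself. */
    static int runSleepingTasks(int tasks, Duration sleep) {
        AtomicInteger finished = new AtomicInteger();
        // try-with-resources: close() waits for all submitted tasks to complete
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(sleep); // stands in for blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    finished.incrementAndGet();
                });
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runSleepingTasks(10_000, Duration.ofMillis(100));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

Run back to back, the sleeps alone would add up to 1,000 seconds, so a much shorter elapsed time shows the waits overlapping. Replacing the sleep with a long CPU-bound computation removes that benefit, because the tasks then compete for the same small pool of carrier threads.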
  53. #Quarkus @holly_cummins@hachyderm.io * not free this is an incomplete list,

    because there are a lot of tools out there, and many cost money
  54. #Quarkus @holly_cummins@hachyderm.io method profiler * not free this is an

    incomplete list, because there are a lot of tools out there, and many cost money
  55. #Quarkus @holly_cummins@hachyderm.io method profiler VisualVM * not free this is

    an incomplete list, because there are a lot of tools out there, and many cost money
  56. #Quarkus @holly_cummins@hachyderm.io method profiler Mission Control VisualVM * not free

    this is an incomplete list, because there are a lot of tools out there, and many cost money
  57. #Quarkus @holly_cummins@hachyderm.io method profiler Mission Control VisualVM * not free

    this is an incomplete list, because there are a lot of tools out there, and many cost money IntelliJ Profiler
  58. #Quarkus @holly_cummins@hachyderm.io method profiler Mission Control VisualVM * not free

    this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9) IntelliJ Profiler
  59. #Quarkus @holly_cummins@hachyderm.io method profiler Mission Control flame graphs VisualVM *

    not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9) IntelliJ Profiler
  60. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis Mission Control flame graphs

    VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9) IntelliJ Profiler
  61. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis Mission Control flame graphs

    GCMV VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9) IntelliJ Profiler
  62. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis heap analysis Mission Control

    flame graphs GCMV VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9) IntelliJ Profiler
  63. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis heap analysis Mission Control

    flame graphs GCMV VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9) IntelliJ Profiler
  64. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis heap analysis APM Mission

    Control flame graphs GCMV VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9) IntelliJ Profiler
  65. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis heap analysis APM Mission

    Control flame graphs GCMV VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9) IntelliJ Profiler
  66. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis heap analysis APM Mission

    Control flame graphs GCMV New Relic* VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9) IntelliJ Profiler
  67. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis heap analysis APM Mission

    Control flame graphs GCMV New Relic* AppDynamics* VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9) IntelliJ Profiler
  68. #Quarkus @holly_cummins@hachyderm.io method profiler GC analysis heap analysis APM Mission

    Control flame graphs GCMV New Relic* AppDynamics* VisualVM Dynatrace* Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9) IntelliJ Profiler
  70. #Quarkus @holly_cummins@hachyderm.io “When it comes to IT performance, amateurs look

    at averages. Professionals look at distributions.” – Avishai Ish-Shalom
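
A small worked example shows why an average can hide the requests that hurt. The sketch below uses a made-up sample of 100 response times and a nearest-rank percentile. The class, the method names and the data are all assumptions for illustration.

```java
import java.util.Arrays;

class LatencyDistribution {
    /** Arithmetic mean of the samples, in milliseconds. */
    static double mean(long[] millis) {
        return Arrays.stream(millis).average().orElse(Double.NaN);
    }

    /** Nearest-rank percentile; percent is a whole number from 1 to 100. */
    static long percentile(long[] millis, int percent) {
        long[] sorted = millis.clone();
        Arrays.sort(sorted);
        int rank = (percent * sorted.length + 99) / 100; // integer ceil(percent / 100 * n)
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up sample: 98 requests at 10 ms and 2 requests at 1000 ms.
        long[] latencies = new long[100];
        Arrays.fill(latencies, 10);
        latencies[98] = 1000;
        latencies[99] = 1000;

        System.out.println("mean = " + mean(latencies) + " ms");
        System.out.println("p50  = " + percentile(latencies, 50) + " ms"); // 10
        System.out.println("p99  = " + percentile(latencies, 99) + " ms"); // 1000
    }
}
```

For this sample the mean is 29.8 ms, a value no request actually experienced, while the median is 10 ms and the 99th percentile is 1000 ms.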
  71. @holly_cummins #RedHat Source: Clement Escoffier cost impact of framework choice

    Setup: • 800 requests/second, over 20 days • SLA > 99% • AWS instances Assumptions: • Costs are for us-east-1 data centre
  72. @holly_cummins #RedHat Setup: • 800 requests/second, over 20 days •

    SLA > 99% Assumptions: • 50% load • us-east-1 data centre • Teads dataset Source: Clement Escoffier x Teads cloud carbon impact of framework choice carbon impact of framework choice
  73. @holly_cummins #RedHat Setup: • 800 requests/second, over 20 days •

    SLA > 99% Assumptions: • 50% load • us-east-1 data centre • Teads dataset Source: Clement Escoffier x Teads cloud carbon impact of framework choice carbon impact of framework choice economic model in action: the cost and carbon metrics are (roughly) the same
  74. @holly_cummins #RedHat capacity Source: John O’Hara Setup: • REST +

    CRUD • large heap • RAPL energy measurement Assumptions: • US energy mix climate impact of framework choice
  75. @holly_cummins #RedHat capacity Source: John O’Hara Setup: • REST +

    CRUD • large heap • RAPL energy measurement Assumptions: • US energy mix climate impact of framework choice shorter line means lower max throughput
  76. @holly_cummins #RedHat capacity Source: John O’Hara Setup: • REST +

    CRUD • large heap • RAPL energy measurement Assumptions: • US energy mix climate impact of framework choice shorter line means lower max throughput higher line means worse carbon footprint
  77. @holly_cummins #RedHat capacity Source: John O’Hara Setup: • REST +

    CRUD • large heap • RAPL energy measurement Assumptions: • US energy mix climate impact of framework choice vrrrooom model in action: quarkus on JVM has the smallest footprint … because it has the highest throughput shorter line means lower max throughput higher line means worse carbon footprint
  78. #Quarkus @holly_cummins@hachyderm.io sooo … you can optimise, and it can

    be fun measure, don’t guess only optimise what matters @holly_cummins #Quarkus Slides