Tradeoffs, bad science, and polar bears: the world of Java optimisation

Welcome to the Java optimisation jungle. Why can’t we “just make it go faster”? It turns out, in most cases, we need to first work out “faster for whom?” and “why do we want to go faster?” and “what even *is* faster?” This talk introduces the basic principles of optimisation, before bouncing through the pitfalls of optimisation; why the exact same techniques which make Quarkus rocket-fast used to be a terrible idea fifteen years ago, why fast benchmarks make for slow programs, and why even though it can be easy to get wrong, optimisation really really matters. Along the way we’ll talk about measuring things, bad advice, garbage collection, and climate change.

Holly Cummins

November 29, 2022

Transcript

  1. #Quarkus @holly_cummins why optimise? 0.5s extra search page time 20%

    drop in traffic 100 ms latency on page load
  2. #Quarkus @holly_cummins why optimise? 0.5s extra search page time 20%

    drop in traffic 100 ms latency on page load 7% lower conversion rate
  4. #Quarkus @holly_cummins why optimise? 0.5s extra search page time 20%

    drop in traffic 10 ms delay in trading platform 100 ms latency on page load 7% lower conversion rate
  5. #Quarkus @holly_cummins why optimise? 0.5s extra search page time 20%

    drop in traffic 10 ms delay in trading platform 10% drop in revenue 100 ms latency on page load 7% lower conversion rate
  6. #Quarkus @holly_cummins performance can be: throughput latency capacity ramp-up time

    transactions per second response time start-up time bandwidth
  7. #Quarkus @holly_cummins performance can be: throughput latency capacity ramp-up time

    transactions per second response time start-up time footprint bandwidth
  8. #Quarkus @holly_cummins performance can be: throughput latency capacity ramp-up time

    transactions per second response time start-up time CPU usage footprint bandwidth
  9. #Quarkus @holly_cummins performance can be: throughput latency capacity utilisation ramp-up

    time transactions per second response time start-up time CPU usage footprint bandwidth
  10. #Quarkus @holly_cummins performance can be: throughput latency capacity utilisation …

    ramp-up time transactions per second response time start-up time CPU usage footprint bandwidth
  12. #Quarkus @holly_cummins Never underestimate the bandwidth [throughput] of a station

    wagon full of tapes hurtling down the highway. –Andrew Tanenbaum, 1981
  13. #Quarkus @holly_cummins Never underestimate the bandwidth [throughput] of a station

    wagon full of tapes hurtling down the highway. –Andrew Tanenbaum, 1981 but the latency is terrible …
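The throughput-versus-latency tradeoff in the quote can be made concrete with a little arithmetic. The sketch below uses invented numbers (tape count, tape size, trip time, and link speed are all assumptions, not figures from the talk) to show how one channel can win on throughput while losing badly on latency.

```java
// Hypothetical numbers throughout: none of these figures come from the talk.
class BandwidthVsLatency {

    /** Throughput in gigabytes per second for a payload delivered over a duration. */
    static double throughputGBps(double payloadGB, double seconds) {
        return payloadGB / seconds;
    }

    public static void main(String[] args) {
        // Assumed station wagon: 1,000 tapes of 1,000 GB each, on a two-hour drive.
        double wagonPayloadGB = 1_000 * 1_000.0;
        double wagonTripSeconds = 2 * 60 * 60;
        double wagonThroughput = throughputGBps(wagonPayloadGB, wagonTripSeconds);

        // Assumed network link: 1 Gbit/s (0.125 GB/s), first byte after 50 ms.
        double linkThroughput = 0.125;
        double linkLatencySeconds = 0.050;

        // The wagon moves far more data per second, but nothing arrives until the trip ends.
        System.out.printf("wagon: %.1f GB/s, first byte after %.0f s%n",
                wagonThroughput, wagonTripSeconds);
        System.out.printf("link:  %.3f GB/s, first byte after %.3f s%n",
                linkThroughput, linkLatencySeconds);
    }
}
```

Which channel is "faster" depends on whether the workload cares about bulk transfer or about time to first response.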
  14. #Quarkus @holly_cummins quarkus trading-off flexibility against startup speed and footprint

    uhh … are you supposed to shut down applications after using them?
  15. #Quarkus @holly_cummins which is better? ephemeral or serverless OpenJDK GraalVM

    Quarkus Quarkus Application Application which is faster?
  16. #Quarkus @holly_cummins which is better? ephemeral or serverless OpenJDK GraalVM

    Quarkus Quarkus Application Application running your application for a long time which is faster?
  17. #Quarkus @holly_cummins leading indicators we care about them easy to

    measure hard to change lagging indicators easy to change
  18. #Quarkus @holly_cummins leading indicators we care about them easy to

    measure hard to change lagging indicators predictive of a thing we care about easy to change
  19. #Quarkus @holly_cummins leading indicators we care about them easy to

    measure hard to change lagging indicators predictive of a thing we care about hard to identify easy to change
  21. #Quarkus @holly_cummins bad-ish advice: “reduce time spent in garbage collection”

    actually, garbage collection can make your application go faster
  22. #Quarkus @holly_cummins -verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput total GC time: 21.6s 4.1%

    of time in GC pause total GC time: 12.0s 3.6% of time in GC pause tool: GCMV
  23. #Quarkus @holly_cummins -verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput total GC time: 21.6s 4.1%

    of time in GC pause 23.9 GB garbage collected total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected tool: GCMV
  24. #Quarkus @holly_cummins -verbose:gc -Xverbosegclog:gclog.xml -Xgcpolicy:optthruput total GC time: 21.6s 4.1%

    of time in GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s tool: GCMV
  27. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s
  29. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator
  31. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator lagging indicator
  32. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator lagging indicator ?
  33. #Quarkus @holly_cummins total GC time: 21.6s 4.1% of time in

    GC pause 23.9 GB garbage collected 493 transactions/s total GC time: 12.0s 3.6% of time in GC pause 13.0 GB garbage collected 260 transactions/s leading indicator lagging indicator ? ?
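For readers who want a rough equivalent of the "total GC time" and "% of time in GC" figures without a log-analysis tool, the standard `java.lang.management` beans expose accumulated collection time. This is a minimal sketch, not how GCMV derives its numbers: the class name `GcTimeReport` is made up, and `getCollectionTime()` is an approximate elapsed time that may include concurrent work, so it will not necessarily match pause percentages from a verbose GC log. As the slides suggest, treat the result as a leading indicator and check it against the throughput you actually care about.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class; only the java.lang.management calls are standard API.
class GcTimeReport {

    /** Sums the accumulated collection time, in milliseconds, across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Approximate percentage of JVM uptime spent in garbage collection so far. */
    static double gcPercentOfUptime() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.printf("total GC time: %d ms (%.1f%% of uptime)%n",
                totalGcMillis(), gcPercentOfUptime());
    }
}
```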
  34. #Quarkus @holly_cummins so wait, what changed to make the app

    faster? running jmeter on the same machine as the app gives a big speedup!
  35. #Quarkus @holly_cummins the takeaways: gc can improve performance by rearranging

    the heap find the bottleneck validate advice independently
  36. #IBM @holly_cummins noooooo! “to tune your JVM, use this command-line:”

    -server -Xms1g -Xmx1g -XX:PermSize=1g -XX:MaxPermSize=256m -Xmn256m -Xss64k -XX:SurvivorRatio=30 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=10 -XX:+ScavengeBeforeFullGC -XX:+CMSScavengeBeforeRemark -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -Dsun.net.inetaddr.ttl=5 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=`date`.hprof -Dcom.sun.management.jmxremote.port=5616 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -server -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewRatio=1 -XX:+UseConcMarkSweepGC
  37. #Quarkus @holly_cummins static String beSlow() { String result = "";

    for (int i = 0; i < 314159; i++) { result += getStringData(i); } return result; }
  38. #Quarkus @holly_cummins @Override public String toString() { String ret =

    "\n\tMarket Summary at: " + getSummaryDate() + "\n\t\t TSIA:" + getTSIA() + "\n\t\t openTSIA:" + getOpenTSIA() + "\n\t\t gain:" + getGainPercent() + "\n\t\t volume:" + getVolume(); if ((getTopGainers() == null) || (getTopLosers() == null)) { return ret; } ret += "\n\t\t Current Top Gainers:"; Iterator<QuoteDataBean> it = getTopGainers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } ret += "\n\t\t Current Top Losers:"; it = getTopLosers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } return ret; }
  41. #Quarkus @holly_cummins @Override public String toString() { String ret =

    "\n\tMarket Summary at: " + getSummaryDate() + "\n\t\t TSIA:" + getTSIA() + "\n\t\t openTSIA:" + getOpenTSIA() + "\n\t\t gain:" + getGainPercent() + "\n\t\t volume:" + getVolume(); if ((getTopGainers() == null) || (getTopLosers() == null)) { return ret; } ret += "\n\t\t Current Top Gainers:"; Iterator<QuoteDataBean> it = getTopGainers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } ret += "\n\t\t Current Top Losers:"; it = getTopLosers().iterator(); while (it.hasNext()) { QuoteDataBean quoteData = it.next(); ret += ("\n\t\t\t" + quoteData.toString()); } return ret; } this never gets called
  42. #Quarkus @holly_cummins static String beSlow() { String result = "";

    for (int i = 0; i < 314159; i++) { result += getStringData(i); } return result; }
  43. #Quarkus @holly_cummins static String beSlow() { String result = "";

    result += getStringData(1); result += getStringData(2); result += getStringData(3); return result; }
  44. #Quarkus @holly_cummins static String beSlow() { String result = "";

    result += getStringData(1); result += getStringData(2); result += getStringData(3); return result; } this is fine
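The contrast between the two `beSlow()` variants can be sketched as follows. A handful of `+=` statements is fine, as the slide says; inside a long loop, each `+=` copies everything accumulated so far, so the work grows roughly quadratically with the iteration count. The `getStringData` body here is a stand-in (the deck does not show it), and the class and method names are hypothetical. Whether the difference matters in a given application is something to measure, not assume.

```java
// Illustrative only: getStringData is a stand-in for the method the slides call.
class Concatenation {

    /** Placeholder data source; the real method is not shown in the deck. */
    static String getStringData(int i) {
        return Integer.toString(i);
    }

    /** The loop from the slide: each += builds a new String from the old one. */
    static String concatInLoop(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += getStringData(i);
        }
        return result;
    }

    /** Same output, appending into one growable buffer instead. */
    static String buildInLoop(int n) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < n; i++) {
            result.append(getStringData(i));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        int n = 1_000; // kept small here; the slide uses 314159
        System.out.println(concatInLoop(n).equals(buildInLoop(n)));
    }
}
```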
  45. #Quarkus @holly_cummins the JVM writers have far more time for

    optimising than you do. clean, typical code runs best
  46. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing * not free this is an incomplete list, because there are a lot of tools out there, and many cost money
  47. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money
  48. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money
  49. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9)
  50. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9)
  51. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV VisualVM * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9)
  52. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money IBM Health Center (for OpenJ9)
  53. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9)
  54. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV New Relic* VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9)
  55. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV New Relic* AppDynamics* VisualVM Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9)
  56. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV New Relic* AppDynamics* VisualVM Dynatrace* Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9)
  57. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV New Relic* AppDynamics* Zipkin VisualVM Dynatrace* Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9)
  58. #Quarkus @holly_cummins method profiler GC analysis heap analysis APM distributed

    tracing Mission Control flame graphs GCMV New Relic* AppDynamics* Jaeger Zipkin VisualVM Dynatrace* Eclipse MAT * not free this is an incomplete list, because there are a lot of tools out there, and many cost money GlowRoot IBM Health Center (for OpenJ9)
  59. #Quarkus @holly_cummins “When it comes to IT performance, amateurs look

    at averages. Professionals look at distributions.” – Avishai Ish-Shalom
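A small example shows why the average alone can mislead. The latency samples below are invented for illustration, and `LatencyDistribution` with its nearest-rank percentile is just one simple way to summarise a distribution, not a method taken from the talk.

```java
import java.util.Arrays;

// Hypothetical helper; the sample data is made up for illustration.
class LatencyDistribution {

    /** Arithmetic mean of the samples. */
    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    /** Nearest-rank percentile of the samples, for p in (0, 100]. */
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Assumed response times in ms: nine fast requests and one very slow one.
        long[] latenciesMs = {10, 10, 10, 10, 10, 10, 10, 10, 10, 910};
        System.out.println("mean = " + mean(latenciesMs));
        System.out.println("p50  = " + percentile(latenciesMs, 50));
        System.out.println("p99  = " + percentile(latenciesMs, 99));
    }
}
```

With these samples the mean is 100 ms, yet the median request takes 10 ms and the slowest takes 910 ms, so no user actually experiences the "average" response time.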
  60. #Quarkus @holly_cummins if you leave the TV on when you’re

    not using it, you’re a polar bear murderer
  61. @holly_cummins #RedHat density Source: Clement Escoffier cost impact of framework

    choice Setup: • 800 requests/second, over 20 days • SLA > 99% • AWS instances Assumptions: • Costs are for us-east-1 data centre
  62. @holly_cummins #RedHat Setup: • 800 requests/second, over 20 days •

    SLA > 99% Assumptions: • 50% load • us-east-1 data centre • Teads dataset Source: Clement Escoffier x Teads cloud carbon impact of framework choice
  63. @holly_cummins #RedHat capacity Source: John O’Hara Setup: • REST +

    CRUD • large heap • RAPL energy measurement Assumptions: • US energy mix climate impact of framework choice
  64. @holly_cummins #RedHat our experiments show strong correlations between • lower

    runtime cost • higher throughput • better carbon-efficiency
  66. #Quarkus @holly_cummins fewer devices longer lifetime higher efficiency lower footprint

    more multitenancy optimise for longevity the end of planned obsolescence?
  67. #Quarkus @holly_cummins sooo … you can optimise, and it can

    be fun. measure, don’t guess. only optimise what matters. now for questions! @holly_cummins #Quarkus