Beyond default settings: Optimizing Java on K8s with AI-driven performance tuning

Modern Java applications are increasingly deployed on Kubernetes, aiming for fast horizontal scaling, efficient resource usage and consistent peak performance.

However, reality often falls short of this ideal. The JVM was not designed for cloud-native environments, leading to common issues like excessive CPU usage, long warmup times and unpredictable latency spikes.

But this doesn’t mean the JVM can’t thrive on K8s! The solution lies in mastering the complex configuration interplay across the full stack, including JVM settings (e.g. GC, JIT) and K8s pod parameters (resource requests/limits, HPA scaling policies).

In this talk, I will demonstrate how to apply automated experimentation and AI-driven optimization to find the optimal configuration in this vast configuration space. We will analyze real-world scenarios where tuning specific flags unlocked 20%-50% JVM performance and efficiency gains, proving that data-driven tuning beats default settings and manual tuning.

https://devnexus.com/events/beyond-default-settings-optimizing-java-on-k8s-with-ai-driven-performance-tuning

Stefano Doni

March 24, 2026

Transcript

  1. © 2026 Akamas • All Rights Reserved • Confidential

    Beyond default settings: Optimizing Java on K8s with AI-driven performance tuning
    Stefano Doni, Akamas Co-founder & CTO
    DevNexus 2026, Atlanta
  2. The Java on K8s optimization stack

    [Diagram: stack layers Applications, Pods, Clusters, Cloud infrastructure; tuning areas Application QoS, JVM, Pod Horizontal Scaling, Pod Vertical Scaling, Node Scaling, Cloud Instance, Cloud Pricing]
  3. Top Java performance on K8s challenges

    https://akamas.io/resources/the-state-of-java-on-kubernetes-2026-why-defaults-are-killing-your-performance
  4. The Java + HPA autoscaling challenge

    Java apps run slower and use significantly more CPU during the initial startup or warm-up phase.
    This can cause:
    • Apps violating SLOs
    • HPA over-scaling and replica saturation
    • CPU spikes, causing noisy neighbours or even node stability issues
  5. HPA going crazy with Java Spring Petclinic
  6. JVM JIT compiler 101

    • The JIT compiler compiles bytecode to native code for frequently executed methods ("hotspots")
    • The JVM provides two compilers: C1 (client) and C2 (server)
    • JIT compilers use CPU/memory to do their work
    • Trade-off: code speed vs resource usage vs app runtime
    • Key JVM configs
      ◦ -XX:TieredStopAtLevel=N
      ◦ -XX:CompileThresholdScaling=N
      ◦ -XX:[+|-]TieredCompilation
      ◦ -XX:CICompilerCount=N

    Compilation levels: 0 = Interpreter, 1 = C1 (no profiling), 2 = C1 (limited profiling), 3 = C1 (full profiling), 4 = C2
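As a rough illustration of the JIT settings above, the following sketch assembles a flag string that could be passed to the JVM, for example through the `JAVA_TOOL_OPTIONS` environment variable of a container. The helper name `jit_flags` and its parameters are hypothetical; only the `-XX` options themselves come from the slide.

```python
def jit_flags(tiered=True, stop_at_level=None, threshold_scaling=None,
              compiler_count=None):
    """Build a HotSpot JIT flag string (hypothetical helper, not a real API)."""
    # Boolean -XX flags are switched on with '+' and off with '-'.
    flags = ["-XX:+TieredCompilation" if tiered else "-XX:-TieredCompilation"]
    if stop_at_level is not None:
        # Levels 0..4 match the compilation levels listed on the slide.
        if stop_at_level not in range(0, 5):
            raise ValueError("compilation level must be between 0 and 4")
        flags.append(f"-XX:TieredStopAtLevel={stop_at_level}")
    if threshold_scaling is not None:
        flags.append(f"-XX:CompileThresholdScaling={threshold_scaling}")
    if compiler_count is not None:
        flags.append(f"-XX:CICompilerCount={compiler_count}")
    return " ".join(flags)


# Example: stop at C1 without profiling, trading peak speed for less JIT CPU.
print(jit_flags(stop_at_level=1))
```

Whether stopping at a lower level actually helps depends on the workload, which is why the talk treats these flags as parameters to experiment with rather than fixed recommendations.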
  7. K8s HPA scaling 101

    • The Horizontal Pod Autoscaler (HPA) adjusts the number of pod replicas
    • The scaling decision is based on a metric and a threshold
      ◦ Example: CPU utilization > 50% of CPU requests
    • Traffic is forwarded to new replicas once they are ready (pod probes)
    • It's reactive: it can be slow to react to sudden peaks
    • Key HPA configs
      ◦ HPA scaling metric & threshold
      ◦ Pod requests & limits
      ◦ Pod readiness/liveness/startup probes

    [Diagram: Deployment + HPA managing pod replicas]
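To make the scaling decision concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes HPA: the desired replica count is the current count scaled by the ratio of observed to target metric, rounded up, with no change while the ratio stays within a tolerance (0.1 by default). The function name and the sample numbers are illustrative assumptions, and the sketch ignores stabilization windows, scaling policies and not-yet-ready pods.

```python
import math


def desired_replicas(current_replicas, current_util, target_util, tolerance=0.1):
    """Simplified HPA formula: ceil(current_replicas * current / target)."""
    ratio = current_util / target_util
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Steady state: utilization close to the 50% target, no scaling.
print(desired_replicas(4, current_util=48, target_util=50))

# JVM warm-up: pods burn 150% of their CPU request, so the HPA triples replicas.
print(desired_replicas(4, current_util=150, target_util=50))
```

The second call shows how a warm-up CPU spike alone, with no extra traffic, can push the HPA to over-scale.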
  8. A mental model for full-stack K8s efficiency

    The 3 K8s efficiency metrics:
    • Cluster scaling efficiency
    • Workload scaling efficiency
    • Application runtime efficiency

    [Chart: Resources (CPU, mem) over Time; series Allocatable, Requests, App demand, Used]
  9. Java heap sizing is a long-standing problem
  10. Why is heap size tuning important?

    JVM uses all of the available memory
    • The JVM tends to use all of the memory it has been configured with
    • Sizing based on K8s container memory usage is going to miss a lot of savings
    • Experiment with JVM max heap size to see how much you can save, while monitoring app performance!

    [Chart: JVM heap used, JVM max heap and app response time; labels 2 GiB, 1.2 GiB, 40% mem used]
  11. How do people really set heap size?

    https://akamas.io/resources/the-state-of-java-on-kubernetes-2026-why-defaults-are-killing-your-performance
  12. JVM ergonomics in K8s: heap memory sizing

    Source: Microsoft
    • The MaxRAMPercentage default is very conservative: increase it to use all the requested memory of your pod
    • Watch out for out-of-memory kills by K8s: the JVM allocates off-heap memory in addition to the heap
    • Do not trust JVM ergonomics: it's best to explicitly set JVM flags to avoid surprises
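The effect of `-XX:MaxRAMPercentage` can be sketched with simple arithmetic. The example below assumes the JVM sizes its maximum heap as a percentage of the container memory limit and uses the HotSpot default of 25%; the 2 GiB container and the 75% setting are illustrative values, and the sketch ignores other ergonomics that apply to very small containers.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def max_heap_bytes(container_memory_bytes, max_ram_percentage=25.0):
    """Approximate max heap derived from -XX:MaxRAMPercentage (default 25%)."""
    return int(container_memory_bytes * max_ram_percentage / 100.0)


container = 2 * GIB  # assumed container memory limit

default_heap = max_heap_bytes(container)
tuned_heap = max_heap_bytes(container, max_ram_percentage=75.0)

print(f"default heap: {default_heap // MIB} MiB")
print(f"heap with -XX:MaxRAMPercentage=75.0: {tuned_heap // MIB} MiB")
# Memory left for metaspace, thread stacks, code cache and other off-heap use.
print(f"off-heap headroom: {(container - tuned_heap) // MIB} MiB")
```

The headroom line reflects the second bullet above: setting the percentage too close to 100 leaves no room for off-heap memory and risks an out-of-memory kill.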
  13. OpenJDK garbage collectors

    Collector name    Best for
    Serial            Memory footprint
    Parallel          Throughput
    G1                Balanced throughput - performance
    Shenandoah        Low latency
    ZGC               Low latency
  14. Choose your GC carefully, trade-offs apply

    • Serial is 10% slower, but very efficient on memory and CPU
    • Parallel is 22% faster, while also very efficient on memory (-31%)
    • Z and Shenandoah are significantly slower and use more resources

    https://shipilev.net/jvm/anatomy-quarks/21-heap-uncommit
    https://akamas.io/resources/right-app-gc-maximum-performance
  15. JVM default ergonomics in K8s: GC

    [Chart: default GC by number of CPUs and memory (MB); regions Serial GC and G1 GC, with a 1791 MB memory threshold]
    • Default GC selection is based on hard-coded thresholds defined decades ago
    • You may end up paying the cost of a suboptimal GC, and you may not even know it!
    • Other good collectors like Parallel GC are not considered
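The default selection described above can be approximated in a few lines. This sketch assumes the commonly described HotSpot rule that G1 is picked only when the JVM sees at least 2 CPUs and more than 1791 MB of memory, and Serial otherwise; the memory threshold is the one shown on the slide, the CPU threshold is an assumption, and exact behaviour can vary by JDK version. The function names are hypothetical, while the `-XX:+Use...GC` options are the standard flags for choosing a collector explicitly.

```python
# Standard HotSpot flags for selecting a collector explicitly.
GC_FLAGS = {
    "Serial": "-XX:+UseSerialGC",
    "Parallel": "-XX:+UseParallelGC",
    "G1": "-XX:+UseG1GC",
    "Shenandoah": "-XX:+UseShenandoahGC",
    "ZGC": "-XX:+UseZGC",
}


def default_gc(cpus, memory_mb):
    """Approximate the ergonomic default: G1 on 'server-class' sizes, else Serial."""
    server_class = cpus >= 2 and memory_mb > 1791
    return "G1" if server_class else "Serial"


def explicit_gc_flag(collector):
    """Return the flag that pins a collector instead of relying on ergonomics."""
    return GC_FLAGS[collector]


# A 1-CPU pod silently gets Serial GC even with plenty of memory.
print(default_gc(cpus=1, memory_mb=4096))
# Pinning Parallel GC, which the default selection never considers.
print(explicit_gc_flag("Parallel"))
```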
  16. Java on K8s: lessons learned

    • JVM configuration drives your application performance and efficiency, not your code
    • Default values may be far from optimal
    • Don't trust JVM ergonomics on K8s: always choose your heap and GC
    • JVM resource management is deeply interdependent with K8s pod settings and HPA behaviour
    • Proper JVM and K8s configurations can fix the biggest performance & efficiency issues
  17. How performance tuning is done today

    Today's approach: manual, slow, requires full-stack skills, doesn't scale, reactive

    [Diagram: Developer, SRE / DevOps and K8s/JVM app; steps: Performance problem!, Analyzes and recommends new config, Validates new config vs requirements]
  18. What if we could automate that?

    New approach: automated, requires low skills & effort, scales to big environments, proactive

    [Diagram: Developers and SRE / DevOps, many K8s/JVM apps, JVM & K8s automated optimization platform]
  19. AI-powered optimization architecture

    [Diagram labels: Application Telemetry, AI Optimization Engine, Full Stack Performance Models, Reinforcement Learning, Tuning Profiles, Optimization Opportunities, DEV (Define the goals, Informed Decisions), GitOps Pipelines (Open a PR, Review and Merge), Human In the Loop, Tuned App]
  20. Key optimization capabilities

    • Goal-driven (cost & performance) + constraints
    • Full-stack, application-aware
    • Fast convergence
    • Automated optimization with human-in-the-loop controls
    • Explainable & deterministic
    • Integrates with observability
    • Safe, high confidence of changes (must be deployed in prod, can't learn from failures)
    • UX
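The "goal-driven + constraints" idea can be illustrated with a deliberately simple sketch: minimize a cost goal over a configuration space while rejecting any configuration that breaks a latency SLO. This is not the optimization engine described in the talk; an exhaustive grid search is swapped in for the AI-driven search, and `SEARCH_SPACE`, `toy_measure` and its numbers are purely synthetic placeholders for a real experiment against a running application.

```python
import itertools

# Hypothetical configuration space spanning JVM and pod settings.
SEARCH_SPACE = {
    "max_ram_percentage": [25, 50, 75],
    "gc": ["Serial", "Parallel", "G1"],
    "cpu_request_millicores": [500, 1000, 1500, 2000],
}


def toy_measure(config):
    """Synthetic stand-in for a real experiment; not measured data."""
    cpu = config["cpu_request_millicores"]
    return {"cost": cpu, "p90_ms": 200000 / cpu}


def optimize(measure, search_space, slo_p90_ms):
    """Return the cheapest (config, result) pair that satisfies the SLO, or None."""
    names = list(search_space)
    best = None
    for values in itertools.product(*(search_space[n] for n in names)):
        config = dict(zip(names, values))
        result = measure(config)
        if result["p90_ms"] > slo_p90_ms:
            continue  # constraint violated: discard this configuration
        if best is None or result["cost"] < best[1]["cost"]:
            best = (config, result)
    return best


best = optimize(toy_measure, SEARCH_SPACE, slo_p90_ms=150)
print(best)
```

A grid search like this only works for tiny spaces; the number of combinations grows multiplicatively with every parameter, which is the motivation for smarter search strategies and fast convergence.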
  21. Performance optimization: 76% p90 latency

    [Charts: before optimization vs after optimization]
  22. Performance optimization: 48% CPU used

    [Charts: before optimization vs after optimization]
  23. Efficiency optimization: 28% throughput & meeting SLOs

    • Baseline configuration: peak throughput matching SLO is 74 TPS
    • Best configuration: peak throughput matching SLO is 95 TPS (28%)
    • SLO breaking at 100ms
  24. Takeaways

    • Default JVM settings on Kubernetes are often suboptimal for performance and cost
    • Application performance & efficiency are primarily driven by JVM and K8s configuration
    • Manual approaches don't work anymore in the new cloud-native world
    • AI-driven performance tuning automates optimization for better cost-performance trade-offs