
Maximizing the Launch Reliability: Ensuring Stable Application Lift-off and Orbit on Kubernetes

@KubeCon + CloudNativeCon North America 2025

Launch reliability is crucial for applications on Kubernetes that frequently restart. However, applications often struggle to achieve optimal performance immediately after starting, failing to handle the load and experiencing startup failures. This lack of launch reliability can result in service downtime during rollouts and make the horizontal pod autoscaler ineffective. To prevent these issues, it's essential to tune the application and apply various practices within manifests.

This session will cover best practices for maximizing the launch reliability of applications on Kubernetes, including application tuning, appropriate resource allocation, health check settings, and techniques for automated warm-up. It will also explore the practical use of Kubernetes' recent in-place pod resize feature, particularly for CPU bursting at startup. By the end of the session, you will gain actionable insights to enhance application stability and achieve flexible scaling.

hhiroshell

November 12, 2025

Transcript

  1. KubeCon + CloudNativeCon North America 2025 2 About Me •

    Senior Platform Engineer @ LY Corporation • Contributing to CNCF Platform Engineering Community Group • Author of books on Kubernetes • DIY keyboard enthusiast 2 Hiroshi Hayakawa | @hhiroshell
  2. KubeCon + CloudNativeCon North America 2025 3 📝 Agenda 1.

    Introduction 2. Practices for Stable Application Launch a. Tune your application (container and application level) b. Make your application "truly" initialized before accepting user traffic 3. Recent Kubernetes Features for Launch Reliability 4. Conclusion 3
  3. KubeCon + CloudNativeCon North America 2025 4 📝 Agenda 1.

    Introduction 👈 2. Practices for Stable Application Launch a. Tune your application (container and application level) b. Make your application "truly" initialized before accepting user traffic 3. Recent Kubernetes Features for Launch Reliability 4. Conclusion 4
  4. KubeCon + CloudNativeCon North America 2025 🚀 Overview of Application

    Launch in Kubernetes 5 5 Pre-start process Containers start up 1. Scheduling • A new pod is registered with the kube-apiserver • The scheduler decides which node to place the pod 2. Preparation • The kubelet on the Node makes environments for containers in the pod 3. Lift off • The application process starts and performs a series of initialization procedures 4. In orbit • The application enters stable operation
  5. KubeCon + CloudNativeCon North America 2025 🚀 Overview of Application

    Launch in Kubernetes 6 6 Pre-start process Containers start up 1. Scheduling • A new pod is registered with the kube-apiserver • The scheduler decides which node to place the pod 2. Preparation • The kubelet on the Node makes environments for containers in the pod 3. Lift off • The application process starts and performs a series of initialization procedures ⚠ Launch failure Due to initialization issues or external factors, the application cannot reach orbit and crashes.
  6. KubeCon + CloudNativeCon North America 2025 7 Why Launch Reliability

    Matters • Hard to foresee ◦ Slight performance degradation or high traffic volume can cause launch failures • Can happen at any time ◦ In Kubernetes, applications are frequently restarted due to rolling updates, rescheduling, etc. 7 "It all looked so easy when you did it on paper -- where valves never froze, gyros never drifted, and rocket motors did not blow up in your face.” — Milton W. Rosen, rocket engineer, 1956.
  7. KubeCon + CloudNativeCon North America 2025 8 Why Is Application

    Launch Difficult? • There are various differences from applications in orbit ◦ Frequent GC caused by initialization process ◦ Cold cache ◦ Incomplete thread pool / connection pool initialization ◦ Incomplete class loading ◦ Insufficient JIT compilation ◦ ...etc 8
  8. KubeCon + CloudNativeCon North America 2025 9 Why Is Application

    Launch Difficult? • There are various differences from applications in orbit ◦ Frequent GC caused by initialization process ◦ Cold cache ◦ Incomplete thread pool / connection pool initialization ◦ Incomplete class loading ◦ Insufficient JIT compilation ◦ ...etc 👉 Applications immediately after launch generally have lower performance 👉 Let's get rid of these hindrances! 9
  9. KubeCon + CloudNativeCon North America 2025 10 📝 Agenda 1.

    Introduction 2. Practices for Stable Application Launch a. Tune your application (container and application level) 👈 b. Make your application "truly" initialized before accepting user traffic 3. Recent Kubernetes Features for Launch Reliability 4. Conclusion 10
  10. KubeCon + CloudNativeCon North America 2025 11 Application Tuning for

    Startup • Good tuning helps overcome some startup hindrances: ◦ Frequent GC caused by initialization process ◦ Incomplete class loading ◦ Insufficient JIT compilation 11
  11. KubeCon + CloudNativeCon North America 2025 12 Application Tuning for

    Startup • Good tuning helps overcome some startup hindrances: ◦ Frequent GC caused by initialization process ◦ Incomplete class loading ◦ Insufficient JIT compilation • Tune applications in two levels: ◦ Container level = resource requests / limits ◦ Application level = language runtime, framework, your code 12
  12. KubeCon + CloudNativeCon North America 2025 13 A Quick review

    of Container Level Resource Control
    • .spec.containers[].resources.requests.[cpu|memory] ◦ the quantity of resources guaranteed to be available to the container
    • .spec.containers[].resources.limits.[cpu|memory] ◦ the cap on resource consumption beyond requests
    [Diagram: resources.limits sits above resources.requests, and the container's actual resource consumption fluctuates between them]
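    For reference, a minimal container spec setting both fields might look like the sketch below; the image name and values are illustrative assumptions, not from the slides.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sample-app                                  # illustrative name
spec:
  containers:
    - name: app
      image: registry.example.com/sample-app:1.0    # hypothetical image
      resources:
        requests:            # guaranteed to the container; used for scheduling
          cpu: "500m"
          memory: "1Gi"
        limits:              # consumption beyond requests is capped here
          cpu: "2"
          memory: "1Gi"
```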
  13. KubeCon + CloudNativeCon North America 2025 14 Tuning for Stable

    Launch - Container Level ✅ Allocate higher CPU / memory limits to allow resource bursting at startup • Whether to raise requests as well depends on several trade-offs ◦ requests = limits -> reduces resource utilization efficiency ◦ requests < limits -> need to consider the QoS Class for stability [Chart: resource consumption spikes toward the resource limits right after startup, then settles]
  14. KubeCon + CloudNativeCon North America 2025 15 QoS (Quality of

    Service) Class • Pod eviction priority is influenced by the QoS Class • Guaranteed: ◦ All containers have CPU and memory limits and requests set, and limits equal requests ◦ Setting only limits automatically sets requests to the same value, so specifying only CPU and memory limits also results in the Guaranteed class • Burstable: ◦ Applies when neither the Guaranteed nor BestEffort conditions are met • BestEffort: ◦ No container has any limits or requests set [Scale: Guaranteed is less likely to be evicted, BestEffort is more likely to be evicted]
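    Putting the two previous slides together, a hedged sketch of a Burstable container that leaves CPU headroom for the startup burst (values are illustrative):

```yaml
# requests < limits for CPU gives headroom for the launch burst,
# but the pod drops from Guaranteed to Burstable QoS.
resources:
  requests:
    cpu: "1"          # what the scheduler reserves on the node
    memory: "2Gi"
  limits:
    cpu: "4"          # extra headroom, used mainly right after startup
    memory: "2Gi"     # memory requests == limits keeps usage within what was reserved
```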
  15. KubeCon + CloudNativeCon North America 2025 16 In our production

    application… • Increasing CPU limits may eliminate latency degradation during rolling updates [Charts: with CPU limits = 4, latency degrades at each rolling update; with CPU limits = 6, rolling updates show no degradation]
  16. KubeCon + CloudNativeCon North America 2025 17 Tuning for Stable

    Launch - Application Level ✅ Consider the effects of cgroups(*) on the behavior of language runtimes, libraries, and frameworks *) cgroups = resource limits (roughly speaking)
    • Java ◦ CPU: The return value of Runtime.availableProcessors(), as well as the sizing of ForkJoin pools and thread pools, changes according to the cgroups limits. Libraries and frameworks that depend on these values or behaviors are also affected. ◦ Memory: The heap size is automatically determined according to the memory limit (ergonomics). This also affects the selection of the GC algorithm.
    • Go ◦ CPU: <= 1.24: GOMAXPROCS is set based on the number of logical CPUs on the host machine and is not affected by cgroups. 1.25: the default value of GOMAXPROCS is adjusted according to the CPU quota defined by cgroups v2. ◦ Memory: Not affected by cgroups; the behavior of the GC can be manually tuned using the GOMEMLIMIT parameter.
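    For Go applications on older runtimes (or where you prefer to be explicit), one common workaround, shown here as an assumption rather than something taken from the slides, is to pass the limits to the runtime via the downward API:

```yaml
# Sketch: expose the container's CPU / memory limits to a Go application.
env:
  - name: GOMAXPROCS
    valueFrom:
      resourceFieldRef:
        resource: limits.cpu        # rounded up to a whole number of CPUs
  - name: GOMEMLIMIT
    valueFrom:
      resourceFieldRef:
        resource: limits.memory     # plain byte count, which GOMEMLIMIT accepts
```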
  17. KubeCon + CloudNativeCon North America 2025 18 Tuning for Stable

    Launch - Application Level ✅ Raise the maximum heap size in JVM applications • By default, the maximum heap is only around 20-30% of the memory limit, which often means much of the requested / limited memory is left unused • A flag like -XX:MaxRAMPercentage=50.0 sets the heap size as a percentage of the memory limit [Diagram: the heap plus other JVM memory areas sit well below resources.requests / resources.limits]
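    A minimal sketch of how the flag can be injected without rebuilding the image, assuming a JVM that honors JAVA_TOOL_OPTIONS:

```yaml
env:
  - name: JAVA_TOOL_OPTIONS
    value: "-XX:MaxRAMPercentage=50.0"   # heap may grow to ~50% of the memory limit
resources:
  limits:
    memory: "2Gi"                        # max heap becomes roughly 1Gi with the flag above
```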
  18. KubeCon + CloudNativeCon North America 2025 19 One More Tuning

    Tip for JVM Applications • The JVM's GC algorithm is automatically selected based on cgroups values • Be aware that increasing limits may unintentionally trigger a change of GC algorithm ◦ Memory < 1.8 GB -> SerialGC ◦ CPU: 2+ cores and Memory > 1.8 GB -> G1GC
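    If your limits sit near these thresholds, one mitigation (my assumption, not stated on the slide) is to pin the collector explicitly so a resource change cannot switch it silently:

```yaml
env:
  - name: JAVA_TOOL_OPTIONS
    value: "-XX:+UseG1GC"   # keep G1 regardless of the detected CPU / memory
```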
  19. KubeCon + CloudNativeCon North America 2025 20 📝 Agenda 1.

    Introduction 2. Practices for Stable Application Launch a. Tune your application (container and application level) b. Make your application "truly" initialized before accepting user traffic 3. Recent Kubernetes Features for Launch Reliability 4. Conclusion 20
  20. KubeCon + CloudNativeCon North America 2025 Rethinking "initialization" 21 21

    1. Runtime Bootstrapping • Initialize fundamental runtime components such as memory management and thread scheduling 2. Core System Setup • Load standard libraries and initialize core system services 3. Framework / Container Setup • Start the web server • Resolve dependencies • Register request handlers • Begin listening on ports 4. Application Initialization • Initialize the data access layer • Load caches • Start background jobs 5. Continuous Optimization • JIT Compilation
  21. KubeCon + CloudNativeCon North America 2025 Rethinking "initialization" 22 22

    1. Runtime Bootstrapping • Initialize fundamental runtime components such as memory management and thread scheduling 2. Core System Setup • Load standard libraries and initialize core system services 3. Framework / Container Setup • Start the web server • Resolve dependencies • Register request handlers • Begin listening on ports 4. Application Initialization • Initialize the data access layer • Load caches • Start background jobs 5. Continuous Optimization • JIT Compilation ⚠ Live traffic may come in from here
  22. KubeCon + CloudNativeCon North America 2025 Rethinking "initialization" 23 23

    1. Runtime Bootstrapping • Initialize fundamental runtime components such as memory management and thread scheduling 2. Core System Setup • Load standard libraries and initialize core system services 3. Framework / Container Setup • Start the web server • Resolve dependencies • Register request handlers • Begin listening on ports 4. Application Initialization • Initialize the data access layer • Load caches • Start background jobs 5. Continuous Optimization • JIT Compilation ✅ Shift to here ⚠ Live traffic may come in from here
  23. KubeCon + CloudNativeCon North America 2025 Pod Startup Flow 24

    24 [Timeline] InitContainers run sequentially → Containers start and run the ENTRYPOINT command → startup probe → readiness probe → liveness probe … (the postStart lifecycle hook starts as well) 🚀
  24. KubeCon + CloudNativeCon North America 2025 Pod Startup Flow 25

    25 [Timeline] InitContainers run sequentially → Containers start and run the ENTRYPOINT command → startup probe → readiness probe → liveness probe … (the postStart lifecycle hook starts as well) Service In: the Pod becomes "READY" and requests come into the pod. Make containers truly ready by this point 🚀
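    The elements of this flow map onto the pod spec roughly as follows; this is a sketch with hypothetical image names, endpoints, and timings:

```yaml
spec:
  initContainers:
    - name: init-db-schema                       # runs to completion before app containers start
      image: registry.example.com/init:1.0       # hypothetical
  containers:
    - name: app
      image: registry.example.com/sample-app:1.0
      lifecycle:
        postStart:                               # runs alongside the ENTRYPOINT, right after start
          exec:
            command: ["/bin/sh", "-c", "/opt/warmup.sh"]   # hypothetical helper script
      startupProbe:                              # readiness / liveness are held back until this succeeds
        httpGet: {path: /healthz/started, port: 8080}
        failureThreshold: 30
        periodSeconds: 5
      readinessProbe:                            # success marks the pod READY and lets Service traffic in
        httpGet: {path: /healthz/ready, port: 8080}
        periodSeconds: 5
      livenessProbe:                             # failure restarts the container once in orbit
        httpGet: {path: /healthz/live, port: 8080}
        periodSeconds: 10
```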
  25. KubeCon + CloudNativeCon North America 2025 26 Ensure applications are

    truly initialized ✅ Perform application-level initialization before the readiness probe succeeds • Delay the readiness probe or block with a startup probe to allow time for application-level initialization (see the sketch below) • Ensure the application performs enough initialization, for example: ◦ Fill the DB connection pool with idle connections ◦ Activate threads for request handling • Utilize the postStart hook or startup probe to support initialization ◦ e.g. send warmup traffic from inside the pod 26
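    As a sketch of the first bullet (paths and timings are assumptions), either delay the readiness probe or gate everything behind a startup probe that only succeeds once application-level initialization has finished:

```yaml
# Option A: simply give the application time before the first readiness check.
readinessProbe:
  httpGet: {path: /healthz/ready, port: 8080}
  initialDelaySeconds: 30
# Option B: block with a startup probe that reports success only after
# the connection pool is filled and worker threads are running.
startupProbe:
  httpGet: {path: /healthz/warmed-up, port: 8080}
  failureThreshold: 60
  periodSeconds: 2
```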
  26. KubeCon + CloudNativeCon North America 2025 Example: Fill the DB

    Connection Pool 27 # Maintain a minimum number of idle connections before the app goes live
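    The slide's configuration is not reproduced in the transcript; a minimal sketch, assuming a Spring Boot application with HikariCP, might look like this:

```yaml
# application.yaml -- keep idle connections open before the app goes live
spring:
  datasource:
    hikari:
      minimum-idle: 10        # idle connections maintained from startup
      maximum-pool-size: 20
```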
  27. KubeCon + CloudNativeCon North America 2025 Example: Activating threads for

    Request Handling 28 # Increase the minimum thread count to reduce thread-creation overhead
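    Again the slide's code is omitted in the transcript; a sketch assuming Spring Boot with embedded Tomcat:

```yaml
# application.yaml -- pre-create worker threads instead of growing the pool under live traffic
server:
  tomcat:
    threads:
      min-spare: 50     # worker threads kept ready from startup
      max: 200
```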
  28. KubeCon + CloudNativeCon North America 2025 29 For Further Optimization...

    • For applications requiring further performance optimization, sending traffic through automatic warm-up can be effective ◦ Example: JIT compilation 29
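    One way to generate that traffic from inside the pod is a postStart hook; the endpoint, the request count, and the availability of wget in the image are all assumptions:

```yaml
lifecycle:
  postStart:
    exec:
      command:
        - /bin/sh
        - -c
        # hit a representative endpoint repeatedly so hot code paths get JIT-compiled
        - for i in $(seq 1 200); do wget -q -O /dev/null http://localhost:8080/api/warmup || true; done
```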
  29. KubeCon + CloudNativeCon North America 2025 30 Considerations for automatic

    warmup • Startup procedures may become more complex, potentially increasing the risk of startup failures • Startup time may increase • Dependent components may experience unexpected load 30
  30. KubeCon + CloudNativeCon North America 2025 31 📝 Agenda 1.

    Introduction 2. Practices for Stable Application Launch a. Tune your application (container and application level) b. Make your application "truly" initialized before accepting user traffic 3. Recent Kubernetes Features related to Launch Reliability 👈 4. Conclusion 31
  31. KubeCon + CloudNativeCon North America 2025 32 In-place Pod Resize

    (Beta at >= Kubernetes v1.33) • A feature that allows changing resource limits / requests without restarting containers 🤔 Could it be used for resource bursting during application startup? [Chart: resource consumption over time, with the resource limits lowered once the startup burst is over]
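    As a sketch of what this could look like for startup bursting, lower the limits once the launch is done by patching the pod's resize subresource, e.g. `kubectl patch pod sample-app --subresource resize --patch-file resize.yaml` with a recent kubectl (pod name and values are illustrative):

```yaml
# resize.yaml -- lower the CPU limit after the startup burst is over
spec:
  containers:
    - name: app
      resources:
        limits:
          cpu: "2"       # was "6" while the application was lifting off
        requests:
          cpu: "1"
```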
  32. KubeCon + CloudNativeCon North America 2025 33 Concerns about Memory

    Limit Reduction • Reducing memory limits to lower than actual usage without restart can trigger OOMKill (cgroups v2) ◦ Best-effort checking by kubelet will be performed, but it's not guaranteed to be safe: https://github.com/kubernetes/kubernetes/pull/133012 • 👉 Strategies to mitigate risks: ◦ Avoid reducing memory limits when possible ◦ Wait for a certain period after startup before reducing limits 33
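    A related mitigation (my assumption, not from the slide) is to declare per-resource resize policies, so CPU can be resized in place while memory changes still go through a container restart:

```yaml
containers:
  - name: app
    resizePolicy:
      - resourceName: cpu
        restartPolicy: NotRequired        # CPU is resized in place
      - resourceName: memory
        restartPolicy: RestartContainer   # memory changes restart the container instead of risking OOMKill
```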
  33. KubeCon + CloudNativeCon North America 2025 34 Runtime and cgroups

    Relationship Matters • Runtimes, libraries, and frameworks need to dynamically adapt to cgroup changes • Dynamic adaptation capabilities in the Java and Go runtimes are still evolving
    • Java ◦ CPU: ❌ No dynamic update — detects cgroup CPU limits only at startup. ◦ Memory: ❌ No dynamic update — reads memory limits only at startup for heap sizing.
    • Go ◦ CPU: ✅ Auto-adjusts GOMAXPROCS when the cgroup CPU quota changes (since Go 1.25). ◦ Memory: ⚠ Manual only — supports GOMEMLIMIT, but no built-in auto-update. However, a custom implementation is possible (e.g., polling cgroup values).
  34. KubeCon + CloudNativeCon North America 2025 35 Concerns from Operational

    Perspective • A resize is triggered via the Pod's resize subresource, so you need to issue a resize for each replica individually 👉 Automation is needed for production use 👉 The integration of VPA and in-place resizing seems to be under development • https://kubernetes.io/docs/concepts/workloads/autoscaling/#in-place-pod-vertical-scaling • https://github.com/kubernetes/autoscaler/issues/4016 35
  35. KubeCon + CloudNativeCon North America 2025 36 Dedicated Autoscaling Feature

    for Startup? • In some cases, we might want to adjust resource allocation only for the startup phase, while handling regular autoscaling with the HPA • Resource consumption patterns during startup and steady state often have significantly different characteristics [Chart: resource consumption during startup vs. steady state]
  36. KubeCon + CloudNativeCon North America 2025 37 📝 Agenda 1.

    Introduction 2. Practices for Stable Application Launch a. Tune your application (container and application level) b. Make your application "truly" initialized before accepting user traffic 3. Recent Kubernetes Features related to Launch Reliability 4. Conclusion 👈 37
  37. KubeCon + CloudNativeCon North America 2025 38 Conclusion • 🚀

    Application launch reliability is critical for production Kubernetes workloads • ✅ Key practices for achieving stable launch: ◦ Tune application resources at both container and runtime levels ◦ Ensure complete initialization before accepting live traffic • 🔄 Kubernetes ecosystem is evolving to address launch challenges ◦ In-place Pod Resize introduces new optimization possibilities ◦ Consider operational implications and work around current limitations • 💡 Remember: Reliable launches lead to stable orbits! 38