
Kubernetes performance tuning dilemma: How to solve it with AI

There is a “dark side” to Kubernetes that makes it difficult to ensure the desired performance and resilience of cloud-native applications while also keeping their costs under control. The combined effect of Kubernetes resource management mechanisms and application runtime heuristics can create serious performance and resilience risks. See how Akamas' AI-powered optimization solves this!

Presented at MiaPlatform meetup on June 27th, 2023:
https://www.meetup.com/mia-platform-cultura-innovazione-team/events/293631739

Stefano Doni

July 06, 2023


Transcript

  1. Kubernetes performance tuning dilemma: How to solve it with AI. Stefano Doni, CTO
  2. Agenda
     • The dark side of K8s
     • Why is K8s so hard? A peek under the cover
     • Enter AI-powered optimization
     • Demo
  3. Who am I
     • Obsessed with performance optimization
     • 18+ years of capacity & performance work
     • Conference speaker since 2014
     • Co-founder and CTO @ Akamas, the software platform for autonomous optimization, powered by AI
  4. And so K8s was born...
     “So let me get this straight. You want to build an external version of the Borg task scheduler. One of our most important competitive advantages. The one we don’t even talk about externally. And, on top of that, you want to open source it?”
     Craig McLuckie, co-founder of Kubernetes and Senior Product Manager at Google, 2013
     https://cloud.google.com/blog/products/containers-kubernetes/from-google-to-the-world-the-kubernetes-origin-story
  5. The dark side of Kubernetes
     Cost efficiency, app reliability, app performance (Kubernetes FinOps Report, June 2021)
     Kubernetes failure stories: k8s.af
     Talks: youtube.com/watch?v=4CT0cI62YHk • youtu.be/QXApVwRBeys
  6. New tuning challenges for cloud-native apps: 100s-1000s of microservices, 10s-100s of inter-dependent configurations
     • Application runtime resource management: heap memory sizing • garbage collection • processor & thread settings
     • Kubernetes resource management: container resource requests & limits • number of replicas • horizontal auto-scaling settings
  7. Why is K8s so hard? K8s resource management
  8. Resource requests drive K8s cluster costs
     • Requests are resources the container is guaranteed to get
     • Cluster capacity is based on pod resource requests - there is no overcommitment!
     • Resource requests != resource utilization: a cluster can be full even if utilization is 10%
     Example: Pod A requests 2 CPU cores and 2 GB of memory on a node with 4 CPUs and 8 GB of memory. Resource requests from the pod manifest:

       apiVersion: v1
       kind: Pod
       metadata:
         name: pod-a          # "Pod A" on the slide; K8s names must be lowercase
       spec:
         containers:
         - name: app
           image: nginx:1.1
           resources:
             requests:
               memory: "2Gi"
               cpu: "2"
  9. Resource limits may strongly impact application performance and stability
     • A container can consume more resources than it has requested
     • Resource limits specify the maximum resources a container can use (e.g. CPU = 2)
     • When a container hits its resource limits, bad things can happen, as sketched below:
       - When hitting CPU limits: K8s throttles container CPU -> application performance slowdown
       - When hitting memory limits: K8s kills the container -> application stability issues
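     A minimal sketch of how limits are declared next to requests in a pod manifest (hypothetical pod name and image; the values are illustrative, not recommendations):

       apiVersion: v1
       kind: Pod
       metadata:
         name: limited-app            # hypothetical name
       spec:
         containers:
         - name: app
           image: myorg/app:1.0       # hypothetical image
           resources:
             requests:                # guaranteed resources, used for scheduling
               memory: "2Gi"
               cpu: "2"
             limits:                  # hard caps enforced at runtime
               memory: "4Gi"          # usage above this -> container is OOM-killed
               cpu: "2"               # usage above this -> container is CPU-throttled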
  10. So achieving cost-effective, performant & reliable apps on K8s is EASY, right? YES, sure... hell NO! (image generated with Midjourney)
  11. Get ready for some real-world K8s horror stories...
  12. CPU limits don't work the way you think! Surprising impacts on app performance
      A Dev/SRE sees significant CPU throttling... with CPU utilization below 40%: "Why do I have CPU throttling if I'm using less than 40% of my CPU limit? Must be a K8s infrastructure issue..."
      "The container's CPU use is being throttled, because the container is attempting to use more CPU resources than its limit" - https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource
  13. How CPU limits & throttling actually work
      Example: how CPU throttling works with CPU limit = 1 core. The limit is enforced as a CPU-time quota of 100 ms per 100 ms period, shared across all threads:
      • Single-threaded app: thread 1 can run for the whole 100 ms period without throttling.
      • Multi-threaded app: threads 1-4 together consume the 100 ms quota in ~25 ms; the app is then stalled (CPU throttling) until the next period.
      Key takeaways:
      • CPU limits act on CPU time - your container can access all of the CPUs of the node
      • There are no universally good thresholds for CPU throttling
      • Experiment with your different K8s and app runtime settings and monitor app performance!
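      To make the quota arithmetic concrete, here is the same scenario sketched as a manifest (hypothetical names; assumes the default 100 ms CFS period):

        apiVersion: v1
        kind: Pod
        metadata:
          name: throttled-app        # hypothetical name
        spec:
          containers:
          - name: app
            image: myorg/app:1.0     # hypothetical image
            resources:
              limits:
                cpu: "1"             # quota = 1 core x 100 ms period = 100 ms of
                                     # CPU time per period, shared by ALL threads;
                                     # 4 busy threads burn it in ~25 ms, then the
                                     # whole app stalls for the remaining ~75 ms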
  14. Memory limits don't work the way you think! Surprising impacts on app availability
      Container memory usage is below 70% of the container memory limit. A new configuration is recommended to save costs, adapting container memory limits to resource usage: memory limit reduced by 10%.
  15. Memory limits don't work the way you think! Surprising impacts on app availability (continued)
      The new configuration causes a misalignment between K8s resources and the app runtime (JVM heap size vs memory limits). As a result, the application pods get killed by K8s with out-of-memory restarts, causing service availability issues. A sketch of this failure mode follows.
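      Sketch with hypothetical values: the JVM heap was sized for the old limit, so once the limit is cut, heap plus off-heap overhead (metaspace, thread stacks, native buffers) no longer fits, and K8s OOM-kills the pod even though the JVM itself is healthy:

        apiVersion: v1
        kind: Pod
        metadata:
          name: java-app                  # hypothetical name
        spec:
          containers:
          - name: app
            image: myorg/java-app:1.0     # hypothetical image
            env:
            - name: JAVA_TOOL_OPTIONS
              value: "-Xmx1536m"          # fixed heap, unaware of the new limit
            resources:
              limits:
                memory: "1600Mi"          # reduced to save costs: 1.5 GB heap
                                          # + JVM overhead no longer fits ->
                                          # OOMKilled pod restarts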
  16. Rightsizing overprovisioned containers is easy, right?
      Context: Java microservice running in a K8s container with a 6 GB memory limit (default JVM settings). Container memory used is far from saturation (< 35% of the limit). SRE: "I can safely save money by reducing memory limits... right?"
  17. Rightsizing overprovisioned containers is easy, right? Performance impact
      Tuning experiment: the container memory limit is progressively cut from 6 GB to 2 GB under constant load. The app slows down significantly, while container memory utilization stays below 50%! SRE: "Ouch! App performance severely degraded... Why is that?"
  18. Let's dive deeper into the stack... Application runtime resource management
  19. App runtimes are complex engines
      JVM architecture: class loader; execution engine (interpreter, just-in-time compiler, garbage collector); runtime memory areas (heap, thread stacks, loaded classes, compiled code, ...)
  20. How does the JVM set the max heap? JVM ergonomics in K8s are tricky
      Example (source: Microsoft): with a container memory limit of 1 GB, the JVM defaults to a max heap of just 256 MB.
      Key takeaways:
      • The MaxRAMPercentage default is very conservative: increase it, but watch out for OOM kills by K8s
      • Do not trust JVM ergonomics: it's best to explicitly set JVM flags to avoid surprises (-Xmx <max-heap>); see the sketch below
      • Check your apps: docker run --memory 1G <image> java -XX:+PrintFlagsFinal -version 2>&1 | grep -w MaxHeapSize
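      A minimal sketch of setting the heap explicitly via JAVA_TOOL_OPTIONS, which the JVM picks up automatically (hypothetical name and image; 75% is a common starting point, not a universal recommendation):

        apiVersion: v1
        kind: Pod
        metadata:
          name: java-app                       # hypothetical name
        spec:
          containers:
          - name: app
            image: myorg/java-app:1.0          # hypothetical image
            env:
            - name: JAVA_TOOL_OPTIONS
              value: "-XX:MaxRAMPercentage=75" # max heap = 75% of the container
                                               # memory limit, leaving headroom
                                               # for off-heap memory
            resources:
              limits:
                memory: "1Gi"                  # max heap becomes ~768 MB
                                               # instead of the default 256 MB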
  21. Rightsizing overprovisioned containers is hard!
      JVM ergonomics configure heap memory based on container memory (max heap = 25% of the memory limit). As the container memory limit is cut, the JVM max heap shrinks with it, JVM memory pressure builds up, and app response time degrades - even though container memory utilization looks low.
      Key takeaways:
      • Rightsizing K8s containers by looking only at resource usage can lead to huge performance issues
      • Understand what's happening in your application runtime environment
      • Explicitly set app runtime options and check app performance!
  22. A well-tuned GC delivers huge cost benefits as well
      Example: the same microservice consumed 1500 millicores with G1 GC (-XX:+UseG1GC) and 600 millicores with Parallel GC (-XX:+UseParallelGC), while keeping app response time in check: -60% CPU used ($$$).
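      Along the same lines, the GC can be set explicitly instead of being left to ergonomics. A sketch of the relevant container env (whether Parallel GC beats G1 depends on the workload, so measure before and after):

        env:
        - name: JAVA_TOOL_OPTIONS
          value: "-XX:+UseParallelGC -XX:MaxRAMPercentage=75"  # explicit GC + heap sizing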
  23. JVM default ergonomics in K8s: garbage collector
      Chart: GC selected as a function of the number of CPUs and container memory; below 2 CPUs or below 1791 MB of memory the JVM picks Serial GC, otherwise G1 GC.
      Key takeaways:
      • Default GC selection is based on hard-coded thresholds defined decades ago
      • You may end up paying the cost of a suboptimal GC, and you may not even know it!
      • Do not trust JVM ergonomics - always set your JVM options!
  24. What about Golang, Node.js and .NET? Kind of the same :)
      Example: tuning GOGC on a Golang microservice cut CPU used from 400 to 180 millicores (-55%).
      GC tuning guides:
      • Golang: https://tip.golang.org/doc/gc-guide
      • Node.js/V8: https://flaviocopes.com/node-runtime-v8-options
      • .NET: https://learn.microsoft.com/en-us/dotnet/core/runtime-config/garbage-collector
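      For Go, the equivalent knobs are runtime environment variables; a sketch of the container env (values are illustrative; GOMEMLIMIT requires Go 1.19+):

        env:
        - name: GOGC                 # GC target percentage: higher values trade
          value: "200"               # memory for fewer GC cycles (less CPU)
        - name: GOMEMLIMIT           # soft memory limit for the Go runtime;
          value: "900MiB"            # align it with the container memory limit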
  25. How to solve this problem? Enter AI-driven optimization
  26. The Akamas Platform: Optimization Studies and Live Optimizations
  27. Demo: reducing the cost of a Kubernetes microservice, while preserving app performance & reliability
  28. Optimization in cloud-native platforms
      https://www.akamas.io/resources/innovation-self-service-dev-portal-kubernetes-optimization/
  29. Key takeaways
      1. K8s enables unprecedented scalability & efficiency, but it's not automatic
      2. Tuning is your responsibility - if you don't tune, you don't save!
      3. The biggest cost & reliability wins lie in the K8s workload and app runtime layers - don't rely on ergonomics!
      4. AI-powered optimization enables you to automate tuning and achieve savings at scale
  30. Contacts
      Email: [email protected] • Twitter: @AkamasLabs • LinkedIn: @akamaslabs
      Italy HQ: Via Schiaffino 11, Milan, 20158, +39-02-4951-7001
      USA East: 211 Congress Street, Boston, MA 02110, +1-617-936-0212
      USA West: 12130 Millennium Drive, Los Angeles, CA 90094, +1-323-524-0524
      Singapore: 5 Temasek Blvd, Singapore 038985