Performance Tuning for Kubernetes Controllers
Akihiro Ikezoe
March 16, 2023
Kubernetes Meetup Tokyo #56
2023/03/16
https://k8sjp.connpass.com/event/275280/
Transcript
[Diagram: Controller, made up of an Informer, multiple Workers, and a Reconciler]
https://github.com/kubernetes/enhancements/issues/1602
https://kubernetes.io/docs/reference/instrumentation/metrics/
https://kubernetes.io/docs/concepts/cluster-administration/system-traces/
https://cybozu-go.github.io/moco/metrics.html
https://github.com/cybozu-go/moco/pull/500
[Diagram: Argo CD components: Kubernetes Cluster, Application Controller, ArgoCD Server, Repo Server, Application Resource]
[Diagram: application-controller internals: Informers watch Application Resources and Events; Workers run as Status Processors and Operation Processors]
kube-controller-manager: PersistentVolume Controller

workqueue_depth{job="kube-controller-manager",name="volumes"}

histogram_quantile(0.99,
  sum(rate(
    rest_client_rate_limiter_duration_seconds_bucket{job="kube-controller-manager"}[1m]
  )) by (le)
)

--kube-api-qps
https://github.com/zoetrope/kubbernecker
# 99th percentile of reconcile time
histogram_quantile(0.99,
  sum(rate(controller_runtime_reconcile_time_seconds_bucket[1m])) by (job, controller, le)
)

# Reconcile rate, broken down by result
sum(rate(controller_runtime_reconcile_total[1m])) by (job, controller, result)

# 99th percentile of the time items wait in the work queue
histogram_quantile(0.99,
  sum(rate(workqueue_queue_duration_seconds_bucket[1m])) by (job, name, le)
)

# Work queue depth
sum(workqueue_depth) by (job, name)
import (
	"context"
	"net/url"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	clmetrics "k8s.io/client-go/tools/metrics"
	crmetrics "sigs.k8s.io/controller-runtime/pkg/metrics"
)

var (
	rateLimiterDelay = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "rest_client_rate_limiter_duration_seconds",
			Help:    "client-go rate limiter delay in seconds. Broken down by verb, and host.",
			Buckets: []float64{0.005, 0.025, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 15.0, 30.0, 60.0},
		},
		[]string{"verb", "host"},
	)

	// Compile-time check that latencyAdapter implements clmetrics.LatencyMetric.
	_ clmetrics.LatencyMetric = &latencyAdapter{}
)

func init() {
	// Expose the histogram through controller-runtime's metrics registry
	// and hand it to client-go as the rate limiter latency recorder.
	crmetrics.Registry.MustRegister(rateLimiterDelay)
	adapter := latencyAdapter{
		metric: rateLimiterDelay,
	}
	clmetrics.RateLimiterLatency = &adapter
}

type latencyAdapter struct {
	metric *prometheus.HistogramVec
}

func (c *latencyAdapter) Observe(_ context.Context, verb string, u url.URL, latency time.Duration) {
	c.metric.WithLabelValues(verb, u.Host).Observe(latency.Seconds())
}
# 99th percentile of the rate limiter delay
histogram_quantile(0.99,
  sum(rate(rest_client_rate_limiter_duration_seconds_bucket[1m])) by (job, verb, le)
)
# Application reconcile time (Status Processor)
{job=~"argocd/argocd-application-controller"} | logfmt | msg = "Reconciliation completed" | line_format "{{.application}}: {{.time_ms}}"

# Application reconcile time (Operation Processor)
{job=~"argocd/argocd-application-controller"} | logfmt | msg = "sync/terminate complete" | line_format "{{.application}}: {{.duration}}"
# Debug-level log lines emitted when an Application is refreshed
{job=~"argocd/argocd-application-controller"} | logfmt | level = "debug" msg =~ "Refreshing app .*"

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
data:
  # Set the Application Controller log level to debug (default "info")
  controller.log.level: "debug"
$ kubectl port-forward svc/argocd-application-controller-metrics -n argocd 8082:8082

# Collect a CPU profile (30 seconds)
$ curl localhost:8082/debug/pprof/profile > cpu.pprof

# Dump goroutines
$ curl 'localhost:8082/debug/pprof/goroutine?debug=1'
--otlp-address
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
data:
  # Number of application status processors (default 20)
  controller.status.processors: "20"
  # Number of application operation processors (default 10)
  controller.operation.processors: "10"
import ctrl "sigs.k8s.io/controller-runtime"

// ... (omitted) ...

cfg, err := ctrl.GetConfig()
if err != nil {
	return err
}
// Raise the client-side rate limits for requests to the API server.
cfg.QPS = 50
cfg.Burst = int(cfg.QPS * 1.5)

mgr, err := ctrl.NewManager(cfg, ctrl.Options{ ... })