Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Kubernetes Controllers - are they loops or events?
Search
Tim Hockin
February 20, 2021
Technology
11
4.1k
Kubernetes Controllers - are they loops or events?
Tim Hockin
February 20, 2021
Tweet
Share
More Decks by Tim Hockin
See All by Tim Hockin
Kubernetes in the 2nd Decade
thockin
0
490
Why Service is the worst API in Kubernetes, and what we can do about it
thockin
2
1.1k
Kubernetes Pod Probes
thockin
6
4.8k
Go Workspaces for Kubernetes
thockin
2
1.1k
Code Review in Kubernetes
thockin
2
1.9k
Multi-cluster: past, present, future
thockin
0
580
Kubernetes Network Models (why is this so dang hard?)
thockin
9
2k
KubeCon EU 2020: SIG-Network Intro and Deep-Dive
thockin
8
1.4k
A Non-Technical Kubernetes Talk (KubeCon EU 2020)
thockin
3
660
Other Decks in Technology
See All in Technology
Google系サービスで文字起こしから勝手にカレンダーを埋めるエージェントを作った話
risatube
0
200
「お金で解決」が全てではない!大規模WebアプリのCI高速化 #phperkaigi
stefafafan
4
1.8k
CyberAgentの生成AI戦略 〜変わるものと変わらないもの〜
katayan
0
280
【Λ(らむだ)】最近のアプデ情報 / RPALT20260318
lambda
0
120
Zeal of the Convert: Taming Shai-Hulud with AI
ramimac
0
160
スケールアップ企業でQA組織が機能し続けるための組織設計と仕組み〜ボトムアップとトップダウンを両輪としたアプローチ〜
tarappo
2
230
内製AIチャットボットで学んだDatadog LLM Observability活用術
mkdev10
0
140
SLI/SLO 導入で 避けるべきこと3選
yagikota
0
130
俺の/私の最強アーキテクチャ決定戦開催 ― チームで新しいアーキテクチャに適合していくために / 20260322 Naoki Takahashi
shift_evolve
PRO
0
300
形式手法特論:SMT ソルバで解く認可ポリシの静的解析 #kernelvm / Kernel VM Study Tsukuba No3
ytaka23
1
670
ReactのdangerouslySetInnerHTMLは“dangerously”だから危険 / Security.any #09 卒業したいセキュリティLT
flatt_security
0
350
OpenClaw を Amazon Lightsail で動かす理由
uechishingo
0
230
Featured
See All Featured
SEO for Brand Visibility & Recognition
aleyda
0
4.4k
Building Adaptive Systems
keathley
44
3k
The Organizational Zoo: Understanding Human Behavior Agility Through Metaphoric Constructive Conversations (based on the works of Arthur Shelley, Ph.D)
kimpetersen
PRO
0
270
Optimizing for Happiness
mojombo
378
71k
From π to Pie charts
rasagy
0
150
Build The Right Thing And Hit Your Dates
maggiecrowley
39
3.1k
Getting science done with accelerated Python computing platforms
jacobtomlinson
2
150
Context Engineering - Making Every Token Count
addyosmani
9
770
Groundhog Day: Seeking Process in Gaming for Health
codingconduct
0
130
Making the Leap to Tech Lead
cromwellryan
135
9.8k
End of SEO as We Know It (SMX Advanced Version)
ipullrank
3
4.1k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
54k
Transcript
Kubernetes Controllers Are they loops or events? Tim Hockin @thockin
v1
Background on “reconciliation”: https://speakerdeck.com/thockin/kubernetes-what-is-reconciliation
Background on “edge vs. level”: https://speakerdeck.com/thockin/edge-vs-level-triggered-logic
Usually when we talk about controllers we refer to them
as a “loop”
Imagine a controller for Pods (aka kubelet). It has 2
jobs: 1) Actuate the pod API 2) Report status on pods
What you’d expect looks something like:
Node Kubernetes API a kubelet b c Get all pods
Node Kubernetes API a kubelet b c { name: a,
... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c for each pod
p { if p is running { verify p config } else { start p } gather status }
Node Kubernetes API a kubelet b c Set status c
a b
...then repeat (aka “a poll loop”)
Here’s where it matters
Node Kubernetes API a kubelet b c c a b
kubectl delete pod b
Node Kubernetes API a kubelet c c a b kubectl
delete pod b
Node Kubernetes API a kubelet c Get all pods c
a b
Node Kubernetes API a kubelet c { name: a, ...
} { name: c, ... } c a b
Node Kubernetes API a kubelet c I have “b” but
API doesn’t - delete it! c a b
Node Kubernetes API a kubelet c Set status c a
This is correct level-triggered reconciliation Read desired state, make it
so
Some controllers are implemented this way, but it’s inefficient at
scale
Imagine thousands of controllers (kubelet, kube-proxy, dns, ingress, storage...) polling
continuously
We need to achieve the same behavior more efficiently
We could poll less often, but then it takes a
long (and variable) time to react - not a great UX
Enter the “list-watch” model
Node Kubernetes API a kubelet b c Get all pods
Node Kubernetes API a kubelet b c { name: a,
... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Cache: { name:
a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Watch all pods
Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Cache: { name:
a, ... } { name: b, ... } { name: c, ... } for each pod p { if p is running { verify p config } else { start p } gather status }
Node Kubernetes API a kubelet b c Set status c
a b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
We trade memory (the cache) for other resources (API server
CPU in particular)
There’s no point in polling my own cache, so what
happens next?
Remember that watch we did earlier? That’s an open stream
for events.
Node Kubernetes API a kubelet b c c a b
kubectl delete pod b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c c a b kubectl
delete pod b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c Delete: { name: b,
... } c a b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c Delete: { name: b,
... } c a b Cache: { name: a, ... } { name: c, ... }
Node Kubernetes API a kubelet c Cache: { name: a,
... } { name: c, ... } c a b API said to delete pod “b”.
Node Kubernetes API a kubelet c Cache: { name: a,
... } { name: c, ... } c a API said to delete pod “b”.
“But you said edge-triggered is bad!”
It is! But this isn’t edge-triggered.
The cache is updated by events (edges) but we are
still reconciling state
“???”
The controller can be restarted at any time and the
cache will be reconstructed - we can’t “miss an edge*” * modulo bugs, read on
Even if you miss an event, you can still recover
the state
Ultimately it’s all just software, and software has bugs. Controllers
should re-list periodically to get full state...
...but we’ve put a lot of energy into making sure
that our list-watch is reliable.