Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Kubernetes Controllers - are they loops or events?
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Tim Hockin
February 20, 2021
Technology
4.1k
11
Share
Kubernetes Controllers - are they loops or events?
Tim Hockin
February 20, 2021
More Decks by Tim Hockin
See All by Tim Hockin
Kubernetes in the 2nd Decade
thockin
0
510
Why Service is the worst API in Kubernetes, and what we can do about it
thockin
2
1.1k
Kubernetes Pod Probes
thockin
6
4.8k
Go Workspaces for Kubernetes
thockin
2
1.1k
Code Review in Kubernetes
thockin
2
1.9k
Multi-cluster: past, present, future
thockin
0
600
Kubernetes Network Models (why is this so dang hard?)
thockin
9
2.1k
KubeCon EU 2020: SIG-Network Intro and Deep-Dive
thockin
8
1.4k
A Non-Technical Kubernetes Talk (KubeCon EU 2020)
thockin
3
660
Other Decks in Technology
See All in Technology
AI時代に求められる思考のパラダイムシフト
nrinetcom
PRO
0
100
Databricks 月刊サービスアップデートまとめ 2026年04月号
tyosi1212
0
140
実例から学ぶ GuardDuty(SSH BruteForce)調査の全体フローと勘所【SecurityJAWS】
cscengineer
PRO
0
160
実践 TanStack Start ― 新規プロダクトを開発して確立した、サーバーとクライアント境界の設計パターン / Practical TanStack Start Server-Client Boundary Patterns
kaminashi
1
150
CARTA HOLDINGS エンジニア向け 採用ピッチ資料 / CARTA-GUIDE-for-Engineers
carta_engineering
0
47k
キャリア25年目にしてTypeScript に出会うまで - 「型」を通じて振り返るプログラミング言語遍歴 / Meeting TypeScript After 25 Years in Tech - Looking Back at My Programming Language Journey Through "Types"
bitkey
PRO
1
120
AI時代に、 データアナリストがデータエンジニアに異動して
jackojacko_
0
1.1k
コーディングエージェントはTypeScriptの 型エラーをどう自己修正しているのか
melonps
2
240
いつの間にかデータエンジニア以外の業務も増えていたけど、意外と経験が役に立ってる
zozotech
PRO
0
730
M&Aで増え続けるプロダクトに少数QAはどう立ち向かうか─GENDAが挑む、全員で取り組む品質標準化戦略 / GENDA Tech Talk #4
genda
0
260
AIAgentと取り組むKaggle
508shuto
2
460
Purview Endpoint DLP 動かしてみた
kozakigh
1
460
Featured
See All Featured
Exploring anti-patterns in Rails
aemeredith
3
360
Practical Orchestrator
shlominoach
191
11k
The State of eCommerce SEO: How to Win in Today's Products SERPs - #SEOweek
aleyda
2
10k
How To Speak Unicorn (iThemes Webinar)
marktimemedia
1
460
The B2B funnel & how to create a winning content strategy
katarinadahlin
PRO
1
360
VelocityConf: Rendering Performance Case Studies
addyosmani
333
25k
The Pragmatic Product Professional
lauravandoore
37
7.3k
Large-scale JavaScript Application Architecture
addyosmani
515
110k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
10k
Applied NLP in the Age of Generative AI
inesmontani
PRO
4
2.2k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
133
19k
AI: The stuff that nobody shows you
jnunemaker
PRO
7
640
Transcript
Kubernetes Controllers Are they loops or events? Tim Hockin @thockin
v1
Background on “reconciliation”: https://speakerdeck.com/thockin/kubernetes-what-is-reconciliation
Background on “edge vs. level”: https://speakerdeck.com/thockin/edge-vs-level-triggered-logic
Usually when we talk about controllers we refer to them
as a “loop”
Imagine a controller for Pods (aka kubelet). It has 2
jobs: 1) Actuate the pod API 2) Report status on pods
What you’d expect looks something like:
Node Kubernetes API a kubelet b c Get all pods
Node Kubernetes API a kubelet b c { name: a,
... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c for each pod
p { if p is running { verify p config } else { start p } gather status }
Node Kubernetes API a kubelet b c Set status c
a b
...then repeat (aka “a poll loop”)
Here’s where it matters
Node Kubernetes API a kubelet b c c a b
kubectl delete pod b
Node Kubernetes API a kubelet c c a b kubectl
delete pod b
Node Kubernetes API a kubelet c Get all pods c
a b
Node Kubernetes API a kubelet c { name: a, ...
} { name: c, ... } c a b
Node Kubernetes API a kubelet c I have “b” but
API doesn’t - delete it! c a b
Node Kubernetes API a kubelet c Set status c a
This is correct level-triggered reconciliation Read desired state, make it
so
Some controllers are implemented this way, but it’s inefficient at
scale
Imagine thousands of controllers (kubelet, kube-proxy, dns, ingress, storage...) polling
continuously
We need to achieve the same behavior more efficiently
We could poll less often, but then it takes a
long (and variable) time to react - not a great UX
Enter the “list-watch” model
Node Kubernetes API a kubelet b c Get all pods
Node Kubernetes API a kubelet b c { name: a,
... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Cache: { name:
a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Watch all pods
Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Cache: { name:
a, ... } { name: b, ... } { name: c, ... } for each pod p { if p is running { verify p config } else { start p } gather status }
Node Kubernetes API a kubelet b c Set status c
a b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
We trade memory (the cache) for other resources (API server
CPU in particular)
There’s no point in polling my own cache, so what
happens next?
Remember that watch we did earlier? That’s an open stream
for events.
Node Kubernetes API a kubelet b c c a b
kubectl delete pod b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c c a b kubectl
delete pod b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c Delete: { name: b,
... } c a b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c Delete: { name: b,
... } c a b Cache: { name: a, ... } { name: c, ... }
Node Kubernetes API a kubelet c Cache: { name: a,
... } { name: c, ... } c a b API said to delete pod “b”.
Node Kubernetes API a kubelet c Cache: { name: a,
... } { name: c, ... } c a API said to delete pod “b”.
“But you said edge-triggered is bad!”
It is! But this isn’t edge-triggered.
The cache is updated by events (edges) but we are
still reconciling state
“???”
The controller can be restarted at any time and the
cache will be reconstructed - we can’t “miss an edge*” * modulo bugs, read on
Even if you miss an event, you can still recover
the state
Ultimately it’s all just software, and software has bugs. Controllers
should re-list periodically to get full state...
...but we’ve put a lot of energy into making sure
that our list-watch is reliable.