Kubernetes Controllers - are they loops or events?
Tim Hockin
February 20, 2021
Transcript
Kubernetes Controllers Are they loops or events? Tim Hockin @thockin
v1
Background on “reconciliation”: https://speakerdeck.com/thockin/kubernetes-what-is-reconciliation
Background on “edge vs. level”: https://speakerdeck.com/thockin/edge-vs-level-triggered-logic
Usually when we talk about controllers we refer to them as a “loop”
Imagine a controller for Pods (aka kubelet). It has 2 jobs:
1) Actuate the pod API
2) Report status on pods
What you’d expect looks something like:
[Diagram: the Kubernetes API holds pods a, b, c; the Node runs only kubelet]
1. kubelet -> API: Get all pods
2. API -> kubelet: { name: a, ... } { name: b, ... } { name: c, ... }
3. kubelet:
   for each pod p {
     if p is running { verify p config } else { start p }
     gather status
   }
4. kubelet -> API: Set status (the Node now runs pods a, b, c)
...then repeat (aka “a poll loop”)
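A minimal sketch of that poll loop, in Python for brevity. `FakeAPI` and `FakeNode` are hypothetical in-memory stand-ins invented for this example, not real Kubernetes or kubelet interfaces, and the loop runs a fixed number of passes so it terminates.

```python
import time


class FakeAPI:
    """Hypothetical in-memory stand-in for the Kubernetes API."""

    def __init__(self, pods):
        self.pods = dict(pods)   # desired state: name -> spec
        self.status = {}         # reported state: name -> status

    def list_pods(self):
        return dict(self.pods)

    def set_status(self, name, status):
        self.status[name] = status


class FakeNode:
    """Hypothetical stand-in for whatever runs pods on one node."""

    def __init__(self):
        self.running = {}        # name -> spec actually running

    def start(self, name, spec):
        self.running[name] = spec

    def verify(self, name, spec):
        # Re-apply the spec if what is running has drifted from it.
        if self.running[name] != spec:
            self.running[name] = spec


def poll_once(api, node):
    """One pass: get all pods, actuate each one, report status."""
    for name, spec in api.list_pods().items():
        if name in node.running:
            node.verify(name, spec)
        else:
            node.start(name, spec)
        api.set_status(name, "Running")


def poll_loop(api, node, interval_seconds, passes):
    """Repeat poll_once; a real loop would run until shutdown."""
    for _ in range(passes):
        poll_once(api, node)
        time.sleep(interval_seconds)


api = FakeAPI({"a": {"image": "x"}, "b": {"image": "y"}, "c": {"image": "z"}})
node = FakeNode()
poll_loop(api, node, interval_seconds=0, passes=2)
print(sorted(node.running))  # ['a', 'b', 'c']
```

Note that every pass costs one full list request against the API, whether or not anything changed.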
Here’s where it matters
[Diagram: the Kubernetes API holds pods a, b, c; the Node runs pods a, b, c]
1. User: kubectl delete pod b (the API now holds a, c; the Node still runs a, b, c)
2. kubelet -> API: Get all pods
3. API -> kubelet: { name: a, ... } { name: c, ... }
4. kubelet: I have “b” but API doesn’t - delete it!
5. kubelet -> API: Set status (the Node now runs pods a, c)
This is correct level-triggered reconciliation: read desired state, make it so
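One way to picture "read desired state, make it so" is as a pure comparison of two maps. The sketch below is illustrative only; `reconcile` and its return shape are hypothetical names chosen for this example.

```python
def reconcile(desired, running):
    """Return the actions needed to make `running` match `desired`.

    Both arguments map pod name -> spec. Nothing here depends on how
    or when the two states diverged, only on what they are right now.
    """
    to_start = sorted(name for name in desired if name not in running)
    to_stop = sorted(name for name in running if name not in desired)
    to_update = sorted(
        name for name in desired
        if name in running and running[name] != desired[name]
    )
    return {"start": to_start, "stop": to_stop, "update": to_update}


# State of the API after `kubectl delete pod b`.
desired = {"a": {"image": "x"}, "c": {"image": "z"}}
# What the node is still running.
running = {"a": {"image": "x"}, "b": {"image": "y"}, "c": {"image": "z"}}

actions = reconcile(desired, running)
print(actions)  # {'start': [], 'stop': ['b'], 'update': []}
```

The controller never saw the delete happen. It only noticed that it has “b” and the API doesn't.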
Some controllers are implemented this way, but it’s inefficient at scale
Imagine thousands of controllers (kubelet, kube-proxy, dns, ingress, storage...) polling continuously
We need to achieve the same behavior more efficiently
We could poll less often, but then it takes a long (and variable) time to react - not a great UX
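A rough back-of-the-envelope for that tradeoff. The numbers are made up (5000 controllers is a hypothetical figure, not from the talk). The model assumes one full list per controller per interval and a change that arrives at a uniformly random moment within an interval, and it ignores processing time.

```python
def polling_cost(num_controllers, interval_seconds):
    """Return (list requests per second, worst-case delay, mean delay).

    Assumes each controller issues one full list per interval and that
    a change arrives at a uniformly random moment within an interval.
    """
    requests_per_second = num_controllers / interval_seconds
    worst_delay = interval_seconds
    mean_delay = interval_seconds / 2
    return requests_per_second, worst_delay, mean_delay


for interval in (1, 10, 60):
    rps, worst, mean = polling_cost(num_controllers=5000, interval_seconds=interval)
    print(f"interval={interval}s list_req_per_s={rps} delay_up_to={worst}s mean_delay={mean}s")
```

Under these assumptions, stretching the interval cuts API load proportionally but stretches the reaction time by the same factor, and the delay anywhere between zero and a full interval is the “variable” part.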
Enter the “list-watch” model
[Diagram: the Kubernetes API holds pods a, b, c; the Node runs only kubelet]
1. kubelet -> API: Get all pods
2. API -> kubelet: { name: a, ... } { name: b, ... } { name: c, ... }
3. kubelet stores the result. Cache: { name: a, ... } { name: b, ... } { name: c, ... }
4. kubelet -> API: Watch all pods
5. kubelet, working from the cache:
   for each pod p {
     if p is running { verify p config } else { start p }
     gather status
   }
6. kubelet -> API: Set status (the Node now runs pods a, b, c)
We trade memory (the cache) for other resources (API server CPU in particular)
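The trade can be made concrete by counting list requests. In this sketch `CountingAPI` is a hypothetical stand-in that only counts calls. The watch that keeps a real cache fresh is deliberately left out, so this shows the cost side only.

```python
class CountingAPI:
    """Hypothetical API server stand-in that counts list requests."""

    def __init__(self, pods):
        self.pods = dict(pods)
        self.list_calls = 0

    def list_pods(self):
        self.list_calls += 1
        return dict(self.pods)


def run_polling(api, passes):
    """Poll-loop style: every pass asks the API for all pods."""
    for _ in range(passes):
        pods = api.list_pods()
        _ = len(pods)            # stand-in for the per-pod work


def run_cached(api, passes):
    """List-watch style: list once into a cache, then read the cache."""
    cache = api.list_pods()
    for _ in range(passes):
        _ = len(cache)           # same work, served from local memory
    return cache


pods = {"a": {}, "b": {}, "c": {}}

polling_api = CountingAPI(pods)
run_polling(polling_api, passes=100)

cached_api = CountingAPI(pods)
cache = run_cached(cached_api, passes=100)

print(polling_api.list_calls, cached_api.list_calls)  # 100 1
```

The cached controller holds a full copy of the pod set in its own memory, which is the price paid for the 99 list requests it did not send.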
There’s no point in polling my own cache, so what happens next?
Remember that watch we did earlier? That’s an open stream for events.
[Diagram: the Kubernetes API holds pods a, b, c; the Node runs pods a, b, c; Cache: { name: a, ... } { name: b, ... } { name: c, ... }]
1. User: kubectl delete pod b (the API now holds a, c)
2. API -> kubelet, over the watch: Delete: { name: b, ... }
3. kubelet updates the cache. Cache: { name: a, ... } { name: c, ... }
4. kubelet: API said to delete pod “b”. (the Node now runs pods a, c)
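The walkthrough above could be sketched roughly as follows. The event tuples and the `apply_event` and `sync_node` helpers are hypothetical shapes chosen for this example. A real watch stream carries richer objects, though ADDED, MODIFIED and DELETED are the event types Kubernetes uses.

```python
import queue


def apply_event(cache, event):
    """Update the local cache from one watch event (kind, name, spec)."""
    kind, name, spec = event
    if kind == "DELETED":
        cache.pop(name, None)
    else:                        # "ADDED" or "MODIFIED"
        cache[name] = spec


def sync_node(cache, running):
    """Make `running` match `cache`; the event itself is not consulted."""
    for name in list(running):
        if name not in cache:
            del running[name]    # stop pods the cache no longer lists
    for name, spec in cache.items():
        running[name] = spec     # start or update everything else


cache = {"a": {"image": "x"}, "b": {"image": "y"}, "c": {"image": "z"}}
running = dict(cache)

# Stand-in for the open watch stream after `kubectl delete pod b`.
events = queue.Queue()
events.put(("DELETED", "b", {"image": "y"}))

while not events.empty():
    apply_event(cache, events.get())
    sync_node(cache, running)

print(sorted(cache), sorted(running))  # ['a', 'c'] ['a', 'c']
```

The event only wakes the controller up and refreshes the cache. The decision to stop “b” comes from comparing the cache with what is running.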
“But you said edge-triggered is bad!”
It is! But this isn’t edge-triggered.
The cache is updated by events (edges) but we are still reconciling state
“???”
The controller can be restarted at any time and the cache will be reconstructed - we can’t “miss an edge”*
* modulo bugs, read on
Even if you miss an event, you can still recover the state
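A small sketch of why a restart is harmless. `Controller` and `sync_node` are hypothetical names for this example. The point is that a fresh controller needs only the current list, not the history of events it was not around to see.

```python
def sync_node(cache, running):
    """Level-triggered step: make `running` match `cache` (name -> spec)."""
    for name in list(running):
        if name not in cache:
            del running[name]    # stop pods the API no longer lists
    for name, spec in cache.items():
        running[name] = spec     # start or update everything else


class Controller:
    """Hypothetical controller whose cache is rebuilt on every start."""

    def __init__(self, list_pods):
        self.cache = dict(list_pods())


api_pods = {"a": {"image": "x"}, "b": {"image": "y"}, "c": {"image": "z"}}
running = {}

first = Controller(lambda: api_pods)
sync_node(first.cache, running)

# The controller goes down. Pod b is deleted while nobody is watching,
# so the DELETED event is never seen by anyone.
del api_pods["b"]

second = Controller(lambda: api_pods)
sync_node(second.cache, running)

print(sorted(running))  # ['a', 'c']
```

A purely edge-triggered design would have nothing to act on here, because the edge happened while the controller was down.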
Ultimately it’s all just software, and software has bugs. Controllers should re-list periodically to get full state...
...but we’ve put a lot of energy into making sure that our list-watch is reliable.
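As a belt-and-braces illustration, a periodic re-list can be simulated with ticks instead of wall-clock time. Everything here is hypothetical: `resync`, `run_controller`, the tick model, and the choice to drop every watch event to mimic a buggy watch.

```python
def resync(cache, listed):
    """Replace the cache with a fresh full list; return names that drifted."""
    drifted = sorted(
        name for name in set(cache) | set(listed)
        if cache.get(name) != listed.get(name)
    )
    cache.clear()
    cache.update(listed)
    return drifted


def run_controller(api_states, resync_every):
    """Simulate ticks; api_states[i] is the API's pod set at tick i.

    All watch events are dropped on purpose, so the periodic re-list
    is the only thing that can repair the cache.
    """
    cache = dict(api_states[0])          # initial list
    repairs = []
    for tick, state in enumerate(api_states):
        if tick > 0 and tick % resync_every == 0:
            drifted = resync(cache, state)
            if drifted:
                repairs.append((tick, drifted))
    return cache, repairs


abc = {"a": {"image": "x"}, "b": {"image": "y"}, "c": {"image": "z"}}
ac = {"a": {"image": "x"}, "c": {"image": "z"}}

# Pod b is deleted at tick 2, and the event for it is lost.
cache, repairs = run_controller([abc, abc, ac, ac, ac, ac], resync_every=3)
print(sorted(cache), repairs)  # ['a', 'c'] [(3, ['b'])]
```

With a reliable list-watch the `repairs` list stays empty, and the re-list is just a safety net.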