We need to achieve the same behavior more efficiently.
Slide 23
We could poll less often, but then it takes a long (and variable) time to react: not a great UX.
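To make the "long (and variable)" part concrete, here is a minimal sketch of how long a change waits for the next poll. The function name and the 60-second interval are hypothetical, chosen only for illustration, and polls are assumed to land at exact multiples of the interval.

```python
def reaction_delay(change_at: float, interval: float) -> float:
    """Seconds from a change until the next poll.

    Assumes polls happen at 0, interval, 2*interval, ... (an idealization).
    """
    return (-change_at) % interval


# Hypothetical numbers: polling once a minute.
delays = [reaction_delay(t, 60.0) for t in (1.0, 30.0, 59.0)]
print(delays)  # [59.0, 30.0, 1.0]
```

The delay depends entirely on where in the polling cycle the change happens to land, so it can be anything from zero up to the full interval.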
Slide 24
Enter the “list-watch” model.
Slide 25
[Diagram: a Node running the kubelet, and the Kubernetes API holding pods a, b, c. The kubelet asks the API: “Get all pods”.]
Slide 26
[Diagram: the Kubernetes API answers the kubelet with { name: a, ... }, { name: b, ... }, { name: c, ... }.]
Slide 27
[Diagram: the kubelet stores the result in a local cache. Cache: { name: a, ... }, { name: b, ... }, { name: c, ... }.]
Slide 28
[Diagram: the kubelet, still holding its cache of a, b, c, tells the Kubernetes API: “Watch all pods”.]
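The list-then-watch sequence on the last few slides can be sketched roughly as follows. `FakeAPI` is a hypothetical in-memory stand-in written for this example, not a real Kubernetes client. Its `version` counter is an assumption of this sketch, loosely modeled on the resource version that Kubernetes returns with a list so that a watch can start from that point.

```python
class FakeAPI:
    """Hypothetical in-memory stand-in for the API server, not a real client."""

    def __init__(self):
        self.pods = {}     # name -> pod object
        self.version = 0   # bumped on every change
        self.log = []      # (version, event_type, pod) history

    def change(self, event_type, pod):
        self.version += 1
        if event_type == "DELETED":
            self.pods.pop(pod["name"], None)
        else:
            self.pods[pod["name"]] = pod
        self.log.append((self.version, event_type, pod))

    def list_pods(self):
        # A snapshot of all pods plus the version that snapshot reflects.
        return list(self.pods.values()), self.version

    def watch_pods(self, since):
        # Every event newer than `since`. A real watch is a long-lived
        # stream; this sketch just replays the recorded history.
        return [(t, p) for v, t, p in self.log if v > since]


api = FakeAPI()
for name in ("a", "b", "c"):
    api.change("ADDED", {"name": name})

# Step 1: list once and build the local cache.
pods, version = api.list_pods()
cache = {pod["name"]: pod for pod in pods}

# Something changes after the list but before the watch is opened.
api.change("DELETED", {"name": "b"})

# Step 2: watch from the listed version, so that change is still delivered.
events = api.watch_pods(since=version)
print(sorted(cache))  # ['a', 'b', 'c']
print(events)         # [('DELETED', {'name': 'b'})]
```

Starting the watch from the version of the list is what closes the gap between the two calls in this sketch.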
Slide 29
[Diagram: the kubelet, holding its cache of a, b, c, runs this loop:]
for each pod p {
    if p is running {
        verify p config
    } else {
        start p
    }
    gather status
}
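A runnable rendering of that loop might look like the sketch below. The dictionary shapes, the `image` field, and the "Running" status string are all hypothetical. A real kubelet does far more work at each step.

```python
def sync_pods(cache, running):
    """One pass over the cached pods, following the slide's loop.

    cache:   name -> desired pod config (hypothetical shape)
    running: name -> config of pods actually running on the node (mutated)
    Returns the status gathered for every pod.
    """
    statuses = {}
    for name, pod in cache.items():
        if name in running:
            # "verify p config": repair drift instead of assuming it is fine.
            if running[name] != pod:
                running[name] = dict(pod)
        else:
            # "start p"
            running[name] = dict(pod)
        # "gather status"
        statuses[name] = "Running"
    return statuses


cache = {n: {"name": n, "image": "img:1"} for n in ("a", "b", "c")}
running = {"a": {"name": "a", "image": "img:0"}}  # a runs with a stale config
statuses = sync_pods(cache, running)
```

After one pass, `running` matches `cache`: the stale pod was corrected and the missing ones were started.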
Slide 30
[Diagram: pods c, a, b now appear on the Node. The kubelet sends “Set status” to the Kubernetes API. Cache: a, b, c.]
Slide 31
We trade memory (the cache) for other resources (API server CPU in particular).
Slide 32
There’s no point in polling my own cache, so what happens next?
Slide 33
Remember that watch we did earlier? That’s an open stream for events.
Slide 34
[Diagram: kubectl sends “delete pod b” to the Kubernetes API. API: a, b, c. Node: c, a, b. Cache: a, b, c.]
Slide 35
[Diagram: after “delete pod b”, the Kubernetes API holds only a and c. Node: c, a, b. Cache: a, b, c.]
Slide 36
[Diagram: the watch delivers an event to the kubelet, “Delete: { name: b, ... }”. API: a, c. Node: c, a, b. Cache: a, b, c.]
Slide 37
[Diagram: the kubelet applies “Delete: { name: b, ... }” to its cache. API: a, c. Node: c, a, b. Cache: a, c.]
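Applying a watch event to the cache, as on this slide, can be sketched like this. The event type strings are an assumption of the sketch, borrowed from the ADDED, MODIFIED, and DELETED types that Kubernetes watches use. The slides themselves only show a "Delete".

```python
def apply_event(cache, event_type, pod):
    """Update the local cache from one watch event."""
    name = pod["name"]
    if event_type in ("ADDED", "MODIFIED"):
        cache[name] = pod
    elif event_type == "DELETED":
        cache.pop(name, None)  # tolerate a delete for something we never saw
    else:
        raise ValueError(f"unknown event type: {event_type}")


cache = {n: {"name": n} for n in ("a", "b", "c")}
apply_event(cache, "DELETED", {"name": "b"})
print(sorted(cache))  # ['a', 'c']
```

Note that the event only changes the cache. Nothing on the node has been touched yet.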
Slide 38
[Diagram: the kubelet notes: API said to delete pod “b”. API: a, c. Node: c, a, b. Cache: a, c.]
Slide 39
[Diagram: pod b is gone from the Node. API: a, c. Node: c, a. Cache: a, c. The kubelet’s note still reads: API said to delete pod “b”.]
Slide 40
“But you said edge-triggered is bad!”
Slide 41
It is! But this isn’t edge-triggered.
Slide 42
The cache is updated by events (edges), but we are still reconciling state.
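One way to see the distinction is a sketch where the decision is computed purely from state, never from the event that changed the cache. The `plan` function and the use of plain name sets are hypothetical simplifications.

```python
def plan(desired, running):
    """Compare desired state (the cache) with actual state (the node)."""
    to_start = sorted(set(desired) - set(running))
    to_stop = sorted(set(running) - set(desired))
    return to_start, to_stop


cache = {"a", "c"}             # after the delete event for b was applied
running = {"a", "b", "c"}      # the node has not caught up yet

first = plan(cache, running)   # ([], ['b'])
running -= set(first[1])       # act on the plan: stop b
second = plan(cache, running)  # ([], []), nothing left to do
```

`plan` never sees the delete event. It would reach the same answer if the cache had been rebuilt from scratch, and running it again once the node has converged is a no-op.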
Slide 43
“???”
Slide 44
The controller can be restarted at any time and the cache will be reconstructed, so we can’t “miss an edge”*.
* modulo bugs, read on
Slide 45
Even if you miss an event, you can still recover the state.
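Recovery after a missed event, or after a restart, amounts to rebuilding the cache from a fresh list. The sketch below assumes the delete event for pod b was lost. The `relist` helper and the data shapes are hypothetical.

```python
def relist(api_pods):
    """Rebuild the cache from a full list, discarding whatever was cached."""
    return {pod["name"]: pod for pod in api_pods}


# The API no longer has pod b, but the delete event was lost (or the
# controller was down), so the old cache still contains it.
api_pods = [{"name": "a"}, {"name": "c"}]
stale_cache = {n: {"name": n} for n in ("a", "b", "c")}

cache = relist(api_pods)
forgotten = sorted(set(stale_cache) - set(cache))
print(forgotten)  # ['b']
```

Because the next reconcile works from the rebuilt cache, pod b gets cleaned up even though its delete event never arrived.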
Slide 46
Ultimately it’s all just software, and software has bugs. Controllers should re-list periodically to get full state...
Slide 47
...but we’ve put a lot of energy into making sure that our list-watch is reliable.