Kubernetes Chaos Engineering: Lessons Learned in Networking

@danielepolencic

once upon a time…

[Diagram: one master node and three worker nodes]

CPU Memory

deployments in k8s

[Diagram: four Pods (Pod 1 to Pod 4), each running a copy of the app]

[Diagram: Pods 1 to 4 placed on Node 1, Node 2 and Node 3]

designed to be fault tolerant #1

[Diagram: on Node 2, Pod 2 is replaced by Pod 5]

designed to be fault tolerant #2

[Diagram: a load balancer sends incoming traffic to Pods 1 to 5 across Node 1, Node 2 and Node 3]

designed to scale

[Diagram: incoming traffic flows through the Ingress and the Service to Pod 1 and Pod 2]

[Diagram: incoming traffic reaching Node 1 and Node 3, marked with a question mark]

is the traffic lost? times out? 404?

it works™

1. the load balancer is app aware?

    1. cloud vendor specific
    2. single point of failure
    3. hard to scale

2. the master node routes the traffic?

    1. cloud vendor agnostic
    2. single point of failure
    3. hard to scale

3. node is app aware?

[Diagram: every node acts as a load balancer]

    1. cloud vendor agnostic
    2. redundancy built-in
    3. scales with nodes

kube-proxy

how does it know that the app is not on the node?

Create a 2 Pods deployment

[Diagram, built up in steps:]

1. API SERVER: stores Deployment #1
2. CONTROLLER MANAGER: creates Pod #1 and Pod #2, both PENDING
3. SCHEDULER: assigns both Pods to a node, both SCHEDULED

the node creates the container

[Diagram: the request "Deployment with 2 replicas please" is stored in the control plane. Each kubelet asks "Is there anything for me?" and uses the Docker daemon to create the containers. A later request reads "Deployment with 3 replicas please".]

the kubelet keeps the control plane informed

[Diagram: CONTROLLER MANAGER, SCHEDULER and API SERVER; Deployment #1 with Pod #1 RUNNING and Pod #2 RUNNING]

Pod name  Status   Node     Endpoint
Pod 1     RUNNING  worker1  10.0.1.1
Pod 2     RUNNING  worker1  10.0.1.2
Pod 3     RUNNING  worker2  10.0.2.1
Pod 4     RUNNING  worker2  10.0.2.2

the list is up to date

Pod name  Status   Node
Pod 1     RUNNING  worker1
Pod 2     RUNNING  worker2
Pod 3     RUNNING  worker3

more lists

[Diagram: incoming traffic goes through the Ingress and the Service to Pod 1 (10.0.1.1:3000) and Pod 2 (10.0.1.2:3000)]

Pod name  Status   Node     Endpoint
Pod 1     RUNNING  worker1  10.0.1.1
Pod 2     RUNNING  worker1  10.0.1.2
Pod 3     RUNNING  worker2  10.0.2.1
Pod 4     RUNNING  worker2  10.0.2.2

Service name  IP          Endpoints
Service 1     172.17.0.1  10.0.1.1:3000, 10.0.1.2:3000, 10.0.2.1:3000
Service 2     172.17.0.2  10.0.2.2:8080

kube-proxy reroutes the traffic

[Diagram: a request reaches a node that does not run the app; kube-proxy consults the lists below and forwards it]

Pod name  IP        Node
Pod 1     10.0.1.1  worker2
Pod 2     10.0.2.1  worker3

Service name  Endpoints
Service 1     Pod1, Pod2
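
The lookup that the lists make possible can be sketched in a few lines of shell. This is only a toy model: the function names `endpoints_for` and `pick_endpoint` are made up for illustration, the addresses are copied from the Service list shown earlier, and the round-robin choice is an assumption made to keep the output predictable (kube-proxy in iptables mode selects an endpoint at random).

```shell
# Toy model of "Service IP -> Pod endpoint" resolution.
# The data mirrors the Service list from the slides; the functions are hypothetical.

# Print the endpoints behind a Service virtual IP, or fail if it is unknown.
endpoints_for() {
  case "$1" in
    172.17.0.1) echo "10.0.1.1:3000 10.0.1.2:3000 10.0.2.1:3000" ;;
    172.17.0.2) echo "10.0.2.2:8080" ;;
    *) return 1 ;;
  esac
}

# Pick one endpoint for request number $2 (round-robin, an assumption).
# Runs in a subshell so the positional parameters of the caller are untouched.
pick_endpoint() (
  eps=$(endpoints_for "$1") || exit 1
  n=$2
  set -- $eps                 # split the endpoint list into $1, $2, ...
  shift $(( $n % $# ))        # skip to the chosen endpoint
  echo "$1"
)

pick_endpoint 172.17.0.1 0   # prints 10.0.1.1:3000
pick_endpoint 172.17.0.1 4   # prints 10.0.1.2:3000
```

If the list is stale, the same lookup keeps returning an endpoint that no longer exists, which is the failure mode explored in the following slides.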

that was quick!

what if… split brain

1. almost works
2. stale routing table
3. can recover

what if… kube-proxy crashes

[Diagram: the Pod list still shows Pod 1 RUNNING on worker1, while the node keeps its old routing table]

kube-proxy as daemonset

1. ☸ respawns it
2. almost works
3. stale routing table

what if… routing table is lost

"lost"

1. ssh into node
2. drop routing table
3. observe

monitor the app

~$ while sleep 1
   do
     date +%X
     curl -sS http://<balancer_ip>/
   done

14:39:41 Hello world!
14:39:42 Hello world!
14:39:43 Hello world!
14:39:44 Hello world!
14:39:45 Hello world!
14:39:46 Hello world!

drop routing table

~$ ssh <node ip>
~$ iptables -F

what would you expect?

14:39:45 Hello world!
14:39:46 Hello world!
14:39:47 Hello world!
# nothing...
# nothing...
# nothing...
14:40:14 Hello world!
14:40:15 Hello world!

~30 seconds of void
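
The size of the gap can be read straight off the timestamps: the last response before the flush is at 14:39:47 and the first one after it is at 14:40:14. A small sketch of that arithmetic follows; the helper names `to_seconds` and `gap_seconds` are hypothetical, and the code assumes two-digit HH:MM:SS fields on the same day.

```shell
# Convert a two-digit HH:MM:SS timestamp to seconds since midnight.
# Runs in a subshell so the helper variables do not leak.
to_seconds() (
  h=${1%%:*}; rest=${1#*:}
  m=${rest%%:*}; s=${rest#*:}
  # Strip one leading zero so "08" is not parsed as an octal number.
  h=${h#0}; m=${m#0}; s=${s#0}
  echo $(( $h * 3600 + $m * 60 + $s ))
)

# Seconds elapsed between two timestamps on the same day.
gap_seconds() {
  echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

gap_seconds 14:39:47 14:40:14   # prints 27
```

The measured gap is 27 seconds, which the slides round to roughly 30.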

what if you skip the lb?

monitor the app

~$ while sleep 1
   do
     date +%X
     curl -sS http://<node_ip>/
   done

14:39:45 Hello world!
14:39:46 Hello world!

drop the routing table!

# nothing...
curl: (28) Connection timed out after 10003 milliseconds
curl: (28) Connection timed out after 10004 milliseconds
14:40:15 Hello world!
14:40:16 Hello world!

1. curl times out at 10s
2. lb must be timing out > 10s
3. something fixed the routing table
4. why 30 seconds?

is it you, kube-proxy?

--iptables-sync-period: how often routing rules are refreshed (default 30s)

--iptables-min-sync-period: minimum interval for refresh (default 10s)
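
A periodic refresh would explain the length of the outage. The sketch below is a simplified model, not kube-proxy's actual logic: it assumes the rules are rewritten exactly once per sync period and that nothing else (such as a Service or endpoint change) triggers an earlier rewrite. The function name `seconds_until_resync` and the 3-second offset are hypothetical, chosen to match the 27-second gap observed earlier.

```shell
# Seconds until the next periodic rewrite of the rules.
#   $1: sync period in seconds (30 for the default quoted above)
#   $2: seconds elapsed since the last rewrite when the rules were flushed
seconds_until_resync() {
  period=$1
  elapsed=$2
  echo $(( $period - $elapsed % $period ))
}

seconds_until_resync 30 3    # prints 27
seconds_until_resync 30 29   # prints 1
```

Under this model the outage is never longer than one sync period, so lowering the period shortens the worst case at the cost of more frequent rewrites.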

[Diagram: the node's routing table is rebuilt from the Pod list (Pods 1 to 4 RUNNING on worker1 and worker2, endpoints 10.0.1.1, 10.0.1.2, 10.0.2.1 and 10.0.2.2)]

you can tweak the flags

lessons learned

1. understand how it works

[Diagram, built up in steps:]

1. a packet arrives FROM: Anywhere, TO: Service (172.17.0.1)
2. the Pod and Service lists are consulted:

Pod name  IP        Node
Pod 1     10.0.1.1  worker2
Pod 2     10.0.2.1  worker3

Service name  Endpoints
Service 1     Pod1, Pod2

3. the destination is rewritten TO: Pod1 (10.0.1.1)
4. the routing table on 192.168.0.1 gives the next hop:

Destination  Next hop
10.0.0.0/24  192.168.0.2
10.0.1.0/24  192.168.0.3
10.0.2.0/24  192.168.0.4
10.0.3.0/24  192.168.0.5

5. the packet is forwarded to Pod1 (10.0.1.1)
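
The next-hop step can be sketched as a prefix lookup. This is an illustration only: the function name `next_hop` is hypothetical, every subnet is assumed to be a /24, and the pairing of destinations with next hops is read from the order in which they appear on the routing-table slide.

```shell
# Print the next hop for a Pod IP, using the /24 subnets from the slide.
next_hop() {
  subnet="${1%.*}.0/24"      # e.g. 10.0.1.1 -> 10.0.1.0/24
  case "$subnet" in
    10.0.0.0/24) echo 192.168.0.2 ;;
    10.0.1.0/24) echo 192.168.0.3 ;;
    10.0.2.0/24) echo 192.168.0.4 ;;
    10.0.3.0/24) echo 192.168.0.5 ;;
    *) return 1 ;;           # no route for this destination
  esac
}

next_hop 10.0.1.1   # prints 192.168.0.3
```

A packet rewritten to target Pod1 (10.0.1.1) therefore leaves the node towards 192.168.0.3 under these assumptions.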

2. learn by doing

· https://github.com/DennyZhang/challenges-kubernetes
· https://github.com/arush-sal/cka-practice-environment

3. monitoring and alerting

thanks

QUESTIONS? @danielepolencic
