Slide 1

No content

Slide 2

How We Build a k8s Service with Rancher at LINE
LINE Corporation, Verda team
Feixiang Li

Slide 3

About me
• Name: Luke
• 2011 ~ 2015: DevOps engineer at Rakuten
• 2016/01 ~: Cloud engineer at LINE
  • Baremetal
  • k8s

Slide 4

LINE Private Cloud
• IaaS
  • VM
  • Baremetal
• Managed Service
  • Database as a service
  • Redis as a service
  • Etc.
• Multiple regions: Tokyo, Osaka, Singapore
• 20,000 VMs, 8,000 baremetal servers

Slide 5

Why Do We Need KaaS
• Many teams are running their own k8s clusters on the LINE cloud
• k8s isn't easy to use
  • Deploy: 1. create servers, 2. install Docker, 3. kubeadm (or xxx), 4. etc.
  • Management: logging, monitoring, upgrades, etc.
  • Recovery: etcd backup, restore, etc.
  • Etc.
• LINE cloud/infrastructure policy
  • DB ACL
  • Global IP
  • Use of other cloud services
  • Etc.

Slide 6

Goal
• Provide a stable k8s service
  • Free users from deployment, upgrades, management, etc.
• Make k8s easier to use
  • Reduce the learning cost
  • Provide service templates/guides (e.g. how to build a database service on k8s)
  • Consulting/support, etc.
• Integration with LINE cloud services
  • Take care of dependencies, configuration, etc.
  • Make it easy to use other cloud services like Redis, object storage, etc.
(In short: a "LINE GKE".)

Slide 7

How To Do
• From scratch: k8s upgrade, auto provisioning, etcd backup, high availability, recovery, ……
• Use OSS: an OSS controller / existing OSS

Slide 8

Ideal vs Reality
• Ideal: web application engineers: 3? Kubernetes engineers: 10? etcd engineers: 3?
• Reality: engineers: 2

Slide 9

Rancher
● OpenStack support
● No dependency on specific software
● Already used by some teams in production
● Active community
● Architecture (k8s operator pattern)

Slide 10

Rancher 2.X
● OSS to create and manage multiple k8s clusters
● Uses the k8s operator pattern for its implementation: controllers watch cluster objects in the Rancher API and reconcile each Kubernetes cluster, which is created and reached through its cluster agent and node agents
● Reconcile loop:
  ○ Get the latest information from the kube-apiserver
  ○ Check if there is any difference between the desired and actual state
  ○ Do something to make the actual state match the desired state
(A minimal sketch of this loop is shown below.)
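To make the reconcile loop concrete, here is a minimal, self-contained Go sketch of the pattern. ClusterState and the helper functions are hypothetical stand-ins, not Rancher's actual types or API calls.

package main

import (
	"log"
	"reflect"
	"time"
)

// ClusterState is a hypothetical snapshot of a cluster's configuration.
type ClusterState struct {
	NodeCount int
	Version   string
}

// getDesiredState would, in a real controller, come from the cluster object the user created.
func getDesiredState() ClusterState { return ClusterState{NodeCount: 3, Version: "v1.13.5"} }

// getActualState would, in a real controller, query the kube-apiserver.
func getActualState() ClusterState { return ClusterState{NodeCount: 2, Version: "v1.13.5"} }

// makeActualMatchDesired would create/delete nodes, upgrade components, etc.
func makeActualMatchDesired(desired ClusterState) error {
	log.Printf("reconciling towards %+v", desired)
	return nil
}

func main() {
	// Reconcile loop: observe, diff, act, repeat.
	for {
		desired := getDesiredState()
		actual := getActualState()
		if !reflect.DeepEqual(desired, actual) {
			if err := makeActualMatchDesired(desired); err != nil {
				log.Printf("reconcile failed, will retry: %v", err)
			}
		}
		time.Sleep(30 * time.Second) // real controllers use watches instead of polling
	}
}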

Slide 11

How We Do
● API Server
  ○ Works as a proxy
  ○ Integrates with the private cloud
  ○ Limits Rancher functionality
  ○ Supports multiple Ranchers
● K8s provider
  ○ Supports multiple providers (a hypothetical interface is sketched below)
● User k8s cluster
  ○ VMs are created by OpenStack
  ○ The k8s cluster is deployed by Rancher
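As an illustration of the "multiple providers" idea, here is a hypothetical Go interface the API server could program against. All names are invented for this sketch and are not from the actual codebase.

package kaas

import "context"

// ClusterSpec is a hypothetical description of a requested cluster.
type ClusterSpec struct {
	Name       string
	NodeCount  int
	K8sVersion string
}

// ClusterProvider abstracts "something that can build a k8s cluster for us",
// e.g. a Rancher instance; the API server can hold several implementations.
type ClusterProvider interface {
	CreateCluster(ctx context.Context, spec ClusterSpec) (clusterID string, err error)
	DeleteCluster(ctx context.Context, clusterID string) error
	Kubeconfig(ctx context.Context, clusterID string) ([]byte, error)
}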

Slide 12

LINE KaaS
● 2018/06 ~ 2018/10: development by 2 developers
● 2018/11: released on the development environment
● 47 clusters, 472 nodes

Slide 13

No content

Slide 14

No content

Slide 15

No content

Slide 16

Monitoring System Got an Alert

Slide 17

What's happening?
● The Rancher API server holds a WebSocket session to the cluster agent in each Kubernetes cluster
● The cluster agent in one of the clusters failed to establish its WebSocket session
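To check the same path by hand, the agent-style connection attempt can be reproduced with a short Go program. This is a sketch assuming the gorilla/websocket package; rancher.com and the empty header are placeholders (the real agent also sends an auth token).

package main

import (
	"log"
	"net/http"

	"github.com/gorilla/websocket"
)

func main() {
	// Try to open the same kind of WebSocket session the cluster agent opens.
	conn, resp, err := websocket.DefaultDialer.Dial("wss://rancher.com/v3/connect", http.Header{})
	if err != nil {
		log.Fatalf("failed to establish WebSocket session: %v (resp=%v)", err, resp)
	}
	defer conn.Close()
	log.Println("WebSocket session established")
}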

Slide 18

Check the log of the cluster-agent

$ kubectl logs -f cattle-cluster-agent-df7f69b68-s7mqg -n cattle-system
INFO: Environment: CATTLE_ADDRESS=172.18.6.6 CATTLE_CA_CHECKSUM=8b791af7a1dd5f28ca19f8dd689bb816d399ed02753f2472cf25d1eea5c20be1 CATTLE_CLUSTER=true CATTLE_INTERNAL_ADDRESS= CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-df7f69b68-s7mqg CATTLE_SERVER=https://rancher.com
INFO: Using resolv.conf: nameserver 172.19.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local
ERROR: https://rancher.com/ping is not accessible (Could not resolve host: rancher.com)

Somehow this container failed to resolve the domain name.

$ kubectl exec -it cattle-cluster-agent-df7f69b68-s7mqg -n cattle-system bash
root@cattle-cluster-agent-df7f69b68-s7mqg:/# cat /etc/resolv.conf
nameserver 172.19.0.10   #=> IP from the Kubernetes network
search cattle-system.svc.cluster.local svc.cluster.local cluster.local

$ kubectl get svc -n kube-system
NAME       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)
kube-dns   ClusterIP   172.19.0.10                 53/UDP,53/TCP

Slide 19

Kube DNS problem?

$ kubectl logs -f cattle-node-agent-c5ddm -n cattle-system
INFO: https://rancher.com/ping is accessible
INFO: Value from https://rancher.com/v3/settings/cacerts is an x509 certificate
time="2019-02-26T21:30:21Z" level=info msg="Rancher agent version a9164915-dirty is starting"
time="2019-02-26T21:30:21Z" level=info msg="Listening on /tmp/log.sock"
time="2019-02-26T21:30:21Z" level=info msg="Option customConfig=map[address:XXXX internalAddress: roles:[] label:map[]]"
time="2019-02-26T21:30:21Z" level=info msg="Option etcd=false"
time="2019-02-26T21:30:21Z" level=info msg="Option controlPlane=false"
time="2019-02-26T21:30:21Z" level=info msg="Option worker=false"
time="2019-02-26T21:30:21Z" level=info msg="Option requestedHostname=yuki-testw1"
time="2019-02-26T21:30:21Z" level=info msg="Connecting to wss://rancher.com/v3/connect with token"
time="2019-02-26T21:30:21Z" level=info msg="Connecting to proxy" url="wss://rancher.com/v3/connect"
time="2019-02-26T21:30:21Z" level=info msg="Starting plan monitor"

Rancher is reachable from a container that isn't using kube-dns:

$ kubectl exec -it cattle-node-agent-c5ddm -n cattle-system bash
root@yuki-testc3:/# cat /etc/resolv.conf
nameserver 8.8.8.8

So either:
● something is wrong with kube-dns, or
● something is wrong with that container

Slide 20

Check kube-dns

kube-dns seems to have no problem:
○ No errors in the kube-dns log
○ Another container that uses kube-dns can resolve DNS

$ kubectl logs -l k8s-app=kube-dns -c kubedns -n kube-system | grep '^E'
$ # no error detected

$ kubectl run -it busybox --image busybox -- sh
/ # nslookup google.com
Server:    172.19.0.10
Address:   172.19.0.10:53

Non-authoritative answer:
Name:    google.com
Address: 216.58.196.238
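The same check, querying the kube-dns ClusterIP directly, can also be done in a few lines of Go from any pod (or any host that can reach the ClusterIP). This is a sketch using only the standard library; the 172.19.0.10 address is the kube-dns ClusterIP from the slides.

package main

import (
	"context"
	"fmt"
	"log"
	"net"
	"time"
)

func main() {
	// Force all lookups through a specific DNS server (the kube-dns ClusterIP)
	// instead of whatever /etc/resolv.conf points at.
	r := &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, "172.19.0.10:53")
		},
	}
	addrs, err := r.LookupHost(context.Background(), "google.com")
	if err != nil {
		log.Fatalf("lookup via kube-dns failed: %v", err)
	}
	fmt.Println(addrs)
}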

Slide 21

Container with a problem: where to look
● The container itself
  ○ container image
  ○ container network policy, etc.
● The network
  ○ the nodes
  ○ the container network spanning Node1 and Node2 (via each node's eth0)

Slide 22

Deploy the container on another node

$ kubectl logs -f cattle-cluster-agent-5d859cbb48-xb77r -n cattle-system
INFO: Environment: CATTLE_ADDRESS=172.18.6.6 CATTLE_CA_CHECKSUM=8b791af7a1dd5f28ca19f8dd689bb816d399ed02753f2472cf25d1eea5c20be1 CATTLE_CLUSTER=true CATTLE_INTERNAL_ADDRESS= CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-5d859cbb48-xb77r CATTLE_SERVER=https://rancher.com
INFO: Using resolv.conf: nameserver 172.19.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local
INFO: https://rancher.com/ping is accessible

When the cluster agent is moved from node1 to node2, it works.
The container itself seems to have no problem.

Slide 23

Check the node

Run busybox pods on Node1 and Node2 and test connectivity between them.

$ kubectl exec -it busybox2 sh
/ # ping 172.18.9.3 -w1
2 packets transmitted, 0 packets received, 100% packet loss

$ tcpdump -i eth0 port 8472 and src host …
# nothing output

A plain ping between the nodes (not through the overlay) still succeeds:
$ ping
PING <> (): 56 data bytes
64 bytes from : icmp_seq=0 ttl=58 time=3.571 ms

⇒ The container network has a problem.

Slide 24

Who is responsible for the container network?
● Building the container network is just a pre-condition for Kubernetes
● It is supposed to be done outside of Kubernetes

Slide 25

Look into the container network

What we use
● Flannel, which is used to connect Linux containers
● Flannel supports multiple backends like vxlan, ipip, ….

Responsibility of flannel in our case
• Configure the Linux kernel to create the termination device of the overlay network
• Configure the Linux kernel's routing, bridging, and related sub-systems so that containers can reach other containers
• Flannel does not forward actual packets itself; it only writes configuration

Slide 26

How flannel works (ping Pod B from Pod A; pod subnet 172.xx, host subnet 10.xx)

Node1 (eth0: 10.0.0.1/24)
  Pod A eth0: 172.17.1.2/24
  cni0: 172.17.1.1/24
  flannel.1: 172.17.1.0/32

Node2 (eth0: 10.0.0.2/24)
  Pod B eth0: 172.17.2.2/24
  cni0: 172.17.2.1/24
  flannel.1: 172.17.2.0/32 (MAC: cc:cc:cc:cc:cc:cc)

On Node1, flannel programs 1 route, 1 ARP entry and 1 FDB entry per remote host:

$ ip r
172.17.2.0/24 via 172.17.2.0 dev flannel.1 onlink

$ ip n
172.17.2.0 dev flannel.1 lladdr cc:cc:cc:cc:cc:cc

$ bridge fdb show dev flannel.1
cc:cc:cc:cc:cc:cc dev flannel.1 dst 10.0.0.2

(The equivalent configuration commands are sketched below.)
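As an illustration, the three per-remote-host entries can be reproduced by hand with ip/bridge commands. The sketch below shells out to those commands with the example addresses from this slide; flannel itself talks netlink directly rather than executing ip/bridge.

package main

import (
	"log"
	"os/exec"
)

// run executes one configuration command and aborts on failure.
func run(args ...string) {
	if out, err := exec.Command(args[0], args[1:]...).CombinedOutput(); err != nil {
		log.Fatalf("%v failed: %v (%s)", args, err, out)
	}
}

func main() {
	// What flannel's vxlan backend effectively programs on Node1 for the
	// remote Node2 (example addresses; requires root and an existing flannel.1 device).
	run("ip", "route", "add", "172.17.2.0/24", "via", "172.17.2.0", "dev", "flannel.1", "onlink")
	run("ip", "neighbor", "add", "172.17.2.0", "lladdr", "cc:cc:cc:cc:cc:cc", "dev", "flannel.1", "nud", "permanent")
	run("bridge", "fdb", "append", "cc:cc:cc:cc:cc:cc", "dev", "flannel.1", "dst", "10.0.0.2")
}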

Slide 27

Check the network configuration

Node1 (172.17.1.0/24)
  Routing table:
    172.17.1.0/24 dev cni
    172.17.2.0/24 via 172.17.2.0 dev flannel.1
    172.17.3.0/24 via 172.17.3.0 dev flannel.1
  ARP cache:
    172.17.2.0 flannel.1 lladdr bb:bb:bb:bb:bb:bb
    172.17.3.0 flannel.1 lladdr cc:cc:cc:cc:cc:cc
  FDB:
    bb:bb:bb:bb:bb:bb dev flannel.1 dst 10.0.0.2
    cc:cc:cc:cc:cc:cc dev flannel.1 dst 10.0.0.3

Node2 (172.17.2.0/24)
  Routing table:
    172.17.1.0/24 via 172.17.1.0 dev flannel.1
    172.17.2.0/24 dev cni
    172.17.3.0/24 via 172.17.3.0 dev flannel.1
  ARP cache:
    172.17.1.0 flannel.1 lladdr aa:aa:aa:aa:aa:aa
    172.17.3.0 flannel.1 lladdr cc:cc:cc:cc:cc:cc
  FDB:
    aa:aa:aa:aa:aa:aa dev flannel.1 dst 10.0.0.1
    cc:cc:cc:cc:cc:cc dev flannel.1 dst 10.0.0.3

Node3 (172.17.3.0/24)
  Routing table:
    172.17.2.0/24 via 172.17.2.0 dev flannel.1
    172.17.3.0/24 dev cni
  ARP cache:
    172.17.2.0 flannel.1 lladdr bb:bb:bb:bb:bb:bb
  FDB:
    bb:bb:bb:bb:bb:bb dev flannel.1 dst 10.0.0.2

⇒ On Node3, the Node1-related information is missing.

Slide 28

Flannel problem?

Checked the flannel agent log on the target node; found nothing:

$ kubectl logs kube-flannel-knwd7 -n kube-system kube-flannel | grep -v '^E'

Slide 29

Look into flannel

$ kubectl get node yuki-testc1 -o yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"9a:4f:ef:9c:2e:2f"}'
    flannel.alpha.coreos.com/backend-type: vxlan
    flannel.alpha.coreos.com/kube-subnet-manager: "true"
    flannel.alpha.coreos.com/public-ip: 10.0.0.1

The flannel agent will
● save node-specific metadata into the k8s Node annotations when flannel starts
● set up the route, FDB and ARP cache entries for every node that has the flannel annotations

All nodes should have these annotations. (A small sketch for listing them is shown below.)
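A quick way to spot a node with missing annotations is to list them programmatically. This is a minimal sketch assuming client-go; the kubeconfig path is an assumption, and the annotation prefix is the flannel one shown above.

package main

import (
	"context"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from a local kubeconfig (path is an assumption).
	config, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	nodes, err := clientset.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	// Print only the flannel annotations; a node that prints none is the broken one.
	for _, node := range nodes.Items {
		fmt.Println(node.Name)
		for k, v := range node.Annotations {
			if strings.HasPrefix(k, "flannel.alpha.coreos.com/") {
				fmt.Printf("  %s: %s\n", k, v)
			}
		}
	}
}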

Slide 30

Check the k8s node annotations

Node1
apiVersion: v1
kind: Node
metadata:
  annotations:
    rke.cattle.io/external-ip: 10.0.0.1
⇒ The flannel-related annotations are missing.

Node2
apiVersion: v1
kind: Node
metadata:
  annotations:
    flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"bb:bb:bb:bb:bb:bb"}'
    flannel.alpha.coreos.com/backend-type: vxlan
    flannel.alpha.coreos.com/kube-subnet-manager: "true"
    flannel.alpha.coreos.com/public-ip: 10.0.0.2
    rke.cattle.io/external-ip: 10.0.0.2

Node3
apiVersion: v1
kind: Node
metadata:
  annotations:
    flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"cc:cc:cc:cc:cc:cc"}'
    flannel.alpha.coreos.com/backend-type: vxlan
    flannel.alpha.coreos.com/kube-subnet-manager: "true"
    flannel.alpha.coreos.com/public-ip: 10.0.0.3
    rke.cattle.io/external-ip: 10.0.0.3

flannel running on node3 could not configure anything for node1 because node1 has no flannel annotations.
⇒ Why doesn't node1 have the flannel annotations?
⇒ Why does node2 still have node1's network information?

Slide 31

The annotations were changed by someone else

(Same node annotations as the previous slide: node2 and node3 have the flannel annotations plus rke.cattle.io/external-ip, while node1 has only rke.cattle.io/external-ip.)

The rke.cattle.io/external-ip annotation on every node shows that Rancher also updates node annotations.

Slide 32

How Rancher works

When Rancher builds Kubernetes nodes it:
1. Gets the current node annotations
2. Builds the desired annotations
3. Gets the Node resource
4. Replaces the annotations with the desired ones
5. Updates the Node with the desired annotations

These steps are split across two functions, and the whole operation is NOT atomic. It also ignores optimistic locking (Kubernetes' resourceVersion check), so a concurrent annotation update can simply be overwritten. (A sketch of a conflict-safe update is shown below.)
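For contrast, here is a sketch of an update that respects optimistic locking and only merges the desired keys instead of replacing the whole map. It assumes client-go; the function and variable names are illustrative, not Rancher's actual code.

package nodesync

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

// mergeDesiredAnnotations updates only the desired keys on a Node and retries
// on resourceVersion conflicts instead of blindly overwriting concurrent writes.
func mergeDesiredAnnotations(clientset *kubernetes.Clientset, nodeName string, desired map[string]string) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		// Re-read the latest Node inside the retry loop so we never work on a stale copy.
		node, err := clientset.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if node.Annotations == nil {
			node.Annotations = map[string]string{}
		}
		// Merge, don't replace: annotations written by others (e.g. flannel) survive.
		for k, v := range desired {
			node.Annotations[k] = v
		}
		_, err = clientset.CoreV1().Nodes().Update(context.TODO(), node, metav1.UpdateOptions{})
		return err
	})
}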

Slide 33

Look back on what happened
1. Node1 is up
2. Flannel on node1 updates the node1 annotations
3. Flannel on node2 adds the network configuration for node1
4. Rancher gets the node1 annotations
5. Node3 is up
6. Rancher overrides the node1 annotations (losing the flannel-related annotations)

Slide 34

Wrote a patch and confirmed it works

diff --git a/pkg/controllers/user/nodesyncer/nodessyncer.go b/pkg/controllers/user/nodesyncer/nodessyncer.go
index 11cc9c4e..64526ccf 100644
--- a/pkg/controllers/user/nodesyncer/nodessyncer.go
+++ b/pkg/controllers/user/nodesyncer/nodessyncer.go
@@ -143,7 +143,19 @@ func (m *NodesSyncer) syncLabels(key string, obj *v3.Node) error {
 		toUpdate.Labels = obj.Spec.DesiredNodeLabels
 	}
 	if updateAnnotations {
-		toUpdate.Annotations = obj.Spec.DesiredNodeAnnotations
+		// NOTE: This is just a workaround.
+		// There are multiple ways to solve https://github.com/rancher/rancher/issues/13644,
+		// and the problem is essentially a design bug, so fixing the root cause needs design decisions.
+		// For now we chose the solution that does not require changes in many places, because we don't
+		// want to create and maintain a large change that is unlikely to be merged upstream
+		// (the Rancher community tends to hesitate to merge big changes from engineers outside Rancher Labs).
+
+		// The solution changes NodeSyncer so that it no longer replaces the annotations with desiredAnnotations,
+		// but only updates the keys specified in desiredAnnotations. The side effect is that users can no longer
+		// delete an existing annotation via desiredAnnotations, but we believe this case is rare, so we chose
+		// this solution.
+		for k, v := range obj.Spec.DesiredNodeAnnotations {
+			toUpdate.Annotations[k] = v
+		}

Slide 35

Reporting/Proposing to OSS Community

Slide 36

Summary of the troubleshooting

One of the agents failed to connect to the Rancher server. Suspects along the way: the Rancher agent itself? kube-dns? Kubernetes? the Rancher server? (=> need to read the code) Flannel? (=> need to read the code)

Don't stop investigating until you understand the root cause. There were several chances to stop digging and just apply a workaround:
● Kube DNS problem?: as some information on the internet suggests, deploying kube-dns on every node would have hidden this problem
● Flannel problem?: if we had assumed the flannel annotations disappeared for some trivial reason and fixed them manually, the problem would also have been hidden

Slide 37

Essence of Operating a Complicated System
● Be aware of the responsibility of each piece of software
● Where the problem appears is not always where the root cause is
● Don't stop investigating until you find the root cause

Slide 38

Community Contribution

Pull Requests
● Rancher (8)
  ○ https://github.com/rancher/norman/pull/201
  ○ https://github.com/rancher/norman/pull/202
  ○ https://github.com/rancher/norman/pull/203
  ○ https://github.com/rancher/machine/pull/12
  ○ https://github.com/rancher/types/pull/525
  ○ https://github.com/rancher/rancher/pull/16044
  ○ https://github.com/rancher/rancher/pull/15991
  ○ https://github.com/rancher/rancher/pull/15909
● Ingress-Nginx (1)
  ○ https://github.com/kubernetes/ingress-nginx/pull/3270

Rancher Source Code Deep Dive
https://www.slideshare.net/linecorp/lets-unbox-rancher-20-v200

Slide 39

We Are Hiring
● 2 offices, 4 members
  ○ Tokyo: 3
  ○ Kyoto: 1
  ○ Taiwan: waiting for you
● English

Slide 40

No content