
Managed Kubernetes Cluster Service In Private Cloud


LINE Developers

January 15, 2020

Transcript

  1. Managed Kubernetes Cluster Service In Private Cloud
     LINE Corporation, Yuki Nishiwaki
  2. High Level Architecture of LINE Private Cloud
     - Multiple regions supported (Region 1, Region 2, Region 3)
     - IaaS: Identity, Image, DNS, L4LB, L7LB, Block Storage, Object Storage,
       Baremetal, VM, Kubernetes
     - PaaS: Redis, MySQL, ElasticSearch
     - FaaS: Function as a Service
     - Scale keeps growing: 1,500 hypervisors, 8,000 bare-metal servers
     - Provides different levels of abstraction (IaaS, PaaS, FaaS)
  3. Today's Topic
     The same private cloud architecture as above, with the Kubernetes
     service highlighted: today's topic is the managed Kubernetes offering
     inside the private cloud.
  4. Mission of Our Managed Kubernetes Service
     Serving more than 2,200 developers (100+ clusters), with two roles:
     Kubernetes Operator and Kubernetes Solution Architect. The pillars:
     1. High Availability: make an effort to keep Kubernetes clusters stable
     2. Deployment / Update
     3. Kubernetes cluster performance and private cloud collaboration: keep
        thinking about how existing applications can be migrated to Kubernetes
  5. Mission of Our Managed Kubernetes Service
     The same pillars, with "where we focus now" marked: High Availability
     and Deployment / Update.
  6. How / What do we provide?

  7. (image-only slide)
  8. (image-only slide)
  9. Cluster Details / Node Details

  10. $ kubectl --kubeconfig k get node
      NAME              STATUS   ROLES          AGE    VERSION
      yuki-vfs-testc1   Ready    controlplane   227d   v1.11.2
      yuki-vfs-testc2   Ready    controlplane   227d   v1.11.2
      yuki-vfs-testc3   Ready    controlplane   227d   v1.11.2
      yuki-vfs-teste1   Ready    etcd           227d   v1.11.2
      yuki-vfs-teste2   Ready    etcd           227d   v1.11.2
      ...
  11. What's behind the Managed Kubernetes Service?

  12. 1. Deployment: HA Deployments (Default Template)
      - etcd: 5 nodes
      - controller: 3 nodes, each running kube-apiserver,
        kube-controller-manager, kube-scheduler, kubelet, kube-proxy
      - worker: N nodes, each running kubelet and kube-proxy
      - Toleration limit: 2 node failures
      - Nodes come from our deployment (OpenStack VMs) or are imported
  13. 1. Deployment: HA Deployments (Default Template)
      The same template as above; the point is that all server groups
      tolerate at least 2 node failures. For example, with 5 etcd nodes the
      quorum is 3, so 2 etcd failures can be tolerated.
  14. 1. Deployment: Fully Managed Etcd, Controller
      Etcd and controller nodes are fully managed. Depending on the load on
      the API, we scale out (increase the number of API servers) or scale up
      (e.g. replace the basic etcd flavor with one that has more disk and
      CPU).
  15. 1. Deployment: Support Import Worker
      From the GUI or CLI, users can increase the number of workers. For some
      specific users, we also allow importing their own servers as workers.
  16. 1. Deployment: Why We Support "Import"? (1/3)
      Several teams (Web Application Dev Team A/B, Machine Learning Team A)
      each get their own Kubernetes cluster: we create OpenStack VMs and
      install Kubernetes (etcd, control plane, workers) on them.
  17. 1. Deployment: Why We Support "Import"? (2/3)
      The Machine Learning team wanted to use their own GPU servers, which
      are not maintained by the private cloud.
  18. 1. Deployment: Why We Support "Import"? (3/3)
      We allow importing such servers, but only as workers: the user
      registers the server as a worker, we install Kubernetes on it and let
      it join the cluster. Etcd and the control plane stay on managed
      OpenStack VMs.
  19. 1. Deployment: Summary
      - Network: Flannel + VXLAN. Despite the encapsulation overhead, we
        chose a simple implementation for easy operation. NetworkPolicy:
        each project gets its own Kubernetes cluster, so we do not provide
        network security inside a cluster.
      - Etcd: OpenStack VMs (default: 5 nodes). VMs make it easy to scale up
        and replace nodes; fully managed by the private cloud operators.
      - Controller: OpenStack VMs (default: 3 nodes). VMs make it easy to
        scale up, scale out, and replace nodes; fully managed by the private
        cloud operators.
      - Workers: OpenStack VMs or imported servers (default: 3 nodes). For
        OpenStack VMs, scale-out is performed by the end user, but healing
        and replacement are done by the operator. Imported servers must be
        managed by the end user.
  20. 2. Easy Auto-Healing
      When etcd, controller, or worker nodes break, the broken nodes are
      deleted and new nodes are created to take their place.
  21. 2. Easy Auto-Healing: Key Concept (NodePool)
      Etcd, controller, and worker nodes each belong to a NodePool. A
      NodePool defines:
      - Role: etcd, controller, or worker
      - Number of nodes: an integer (e.g. 10)
      - Node spec: a spec ID (OpenStack image ID, flavor ID, network ID)
      - Node labels: populated into Kubernetes as node labels
      - Dockerd options: populated into the dockerd options on all nodes
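      To make the fields concrete, here is a hypothetical request creating a
      worker NodePool; the URL and JSON schema are invented for this sketch,
      only the field meanings come from the slide:

      # Hypothetical request; only the NodePool fields mirror the slide above.
      $ curl -X POST https://<managed-k8s-api>/v1/clusters/<cluster-id>/nodepools \
          -H 'Content-Type: application/json' \
          -d '{
                "role": "worker",
                "count": 10,
                "spec": {"imageId": "<image-id>", "flavorId": "<flavor-id>",
                         "networkId": "<network-id>"},
                "nodeLabels": {"disktype": "ssd"},
                "dockerdOptions": ["--log-driver=json-file"]
              }'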
  22. 2. Easy Auto-Healing: Key Concept (NodePool)
      Each NodePool is checked periodically:
      - Is the number of nodes sufficient?
      - Is any node in an error state?
      If needed, a VM is created from the NodePool's spec and added to the
      Kubernetes cluster (a sketch of this loop follows).
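      A minimal shell sketch of that reconciliation idea, assuming worker
      nodes carry a hypothetical nodepool=worker label; the real loop runs
      inside the management plane, not as a script:

      # Illustrative only: compare the desired count against Ready nodes.
      desired=10
      while true; do
        ready=$(kubectl get nodes -l nodepool=worker --no-headers \
                | awk '$2 == "Ready"' | wc -l)
        if [ "$ready" -lt "$desired" ]; then
          echo "need $((desired - ready)) node(s): create VMs from the NodePool spec"
          # e.g. openstack server create ..., then join each VM to the cluster
        fi
        sleep 60
      done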
  23. 2. Easy Auto-Healing: Key Concept (NodePool)
      The user is aware only of NodePools, not of the individual nodes
      behind them.
  24. 3. Addon Management
      What is a Kubernetes addon? To make a Kubernetes cluster production
      ready, much extra software has to run; we call these Kubernetes addons:
      - Cluster monitoring
      - Log aggregation
      - Persistent volume provider
      - Etcd backup software
      What is a Verda certified addon? For common use cases and middleware,
      the private cloud team actively maintains the addon instead of the
      private cloud user: both the addon software implementation itself and
      its configuration best practices. These are the Verda officially
      supported addons.
  25. 3. Addon Management: Addon Manager
      The user enables an addon (e.g. log aggregation) through an HTTP API;
      the Addon Manager then deploys the addon into the user's cluster and
      makes sure it stays updated and running. Its jobs:
      - Deploy addons
      - Update addons
      - Monitor addons
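      As an illustration, enabling an addon could look like the call below;
      the endpoint and payload are assumptions made for this sketch, not the
      documented service API:

      # Hypothetical request to the Addon Manager HTTP API.
      $ curl -X POST https://<managed-k8s-api>/v1/clusters/<cluster-id>/addons \
          -H 'Content-Type: application/json' \
          -d '{"name": "log-aggregation", "enabled": true}'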
  26. 3. Addon Management: User Cluster Monitoring (1/4)
      The monitoring addon watches the Kubernetes control plane (etcd,
      controllers) of the user cluster. Users can see a monitoring
      dashboard; triggered alerts notify the developers/operators of
      Managed Kubernetes.
  27. 3. Addon Management: Log Aggregation (2/4)
      The log aggregation addon collects the following for both operators
      and users:
      - /var/lib/rancher/rke/log/* (Kubernetes and etcd logs)
      - /var/log/containers/* (all containers' logs)
      Log rotation is handled with the Docker json-file logging driver.
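      For reference, rotation with the json-file driver is configured
      through dockerd options (real Docker flags; the size and count values
      below are example choices, not the deck's actual settings):

      # Rotate container logs at 100 MB, keeping at most 5 files per container.
      dockerd --log-driver=json-file \
              --log-opt max-size=100m \
              --log-opt max-file=5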
  28. 3. Addon Management: Persistent Volume (3/4)
      The persistent volume provider addon handles persistent volume claims
      (PVCs) from pods: it creates a volume, binds the volume to a worker,
      and exposes the volume to the pod.
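      From the user's side this is the standard Kubernetes PVC flow; a
      minimal example (the name and size are placeholders, and the storage
      class defaults to the cluster's provisioner):

      $ cat pvc.yaml
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
      $ kubectl apply -f pvc.yaml
      persistentvolumeclaim/data created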
  29. 3. Addon Management: Etcd Periodic Backup (4/4)
      The etcd backup addon periodically takes a snapshot of etcd and stores
      the snapshot in Object Storage.
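      A snapshot can be taken with etcdctl as below; this is a generic
      sketch, the deck does not say which tooling the addon uses internally:

      # Take a point-in-time snapshot of etcd (etcdctl v3 API), then upload
      # the resulting file to Object Storage.
      $ ETCDCTL_API=3 etcdctl --endpoints=https://<etcd-host>:2379 \
          --cacert=ca.pem --cert=client.pem --key=client-key.pem \
          snapshot save /backup/etcd-$(date +%Y%m%d-%H%M).db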
  30. Architecture

  31. Architecture: Composed of 3 Different Components
      1. API: cluster operations (cluster create, cluster update, add worker)
      2. "Automate operating multiple clusters" component: manages clusters
         (deploy, update, monitor)
      3. Addon Manager: the Addon Controller Starter starts an Addon
         Controller in each cluster; the Addon Controller manages addons
         (deploy, monitor, update)
  32. Why We Need a Simple API Server in Front of Rancher
      What: a simple, stateless API server.
      Responsibility in our Managed Kubernetes: provide the interface of the
      Managed Kubernetes Service for users; this is the only place users
      interact with.
      Why:
      - Don't expose the Rancher API/GUI directly to users
        - Avoid depending strongly on Rancher
        - Restrict certain Rancher functionality
      - Easy to add extra business logic
  33. Architecture (revisited)
      The same three-component diagram; next up is component 2, the part
      that automates operating multiple clusters.
  34. Rancher Is Our Core Management Functionality
      What is Rancher?
      - An OSS tool developed by Rancher Labs
      - Implemented on top of Kubernetes (uses CRDs and client-go heavily)
      - Provides multi-cluster management functionality
      Responsibility in our Managed Kubernetes:
      - Provisioning
      - Updates
      - Keeping Kubernetes clusters available/healthy (monitoring, healing)
  35. Architecture (revisited)
      The same three-component diagram; next up is component 3, the Addon
      Manager (Addon Controller Starter and Addon Controller).
  36. Addon Manager (Addon Controller Starter and Addon Controller)
      What: provides the addon management functionality. The Addon
      Controller Starter runs alongside Rancher; an Addon Controller runs on
      each Kubernetes cluster.
      Responsibility in our Managed Kubernetes:
      - Addon Controller Starter: runs an Addon Controller for each cluster
      - Addon Controller: deploys and updates addons based on the addon
        definition, and monitors them
  37. Scale and Operation

  38. Scale vs. Operators (as of Sep 2019)
      110 clusters (prod: 5, dev: 105) and 1,650 nodes (prod: 450,
      dev: 1,200), versus only 4 or 5 operators.
  39. Scale vs. Operators => repeat of the previous slide
  40. Rancher Automates a Lot of Operations
      Events and their handling:
      - A node got broken? 1 API call to replace the broken node
      - The entire cluster went down? 1 API call to restore etcd from a
        snapshot
      - The cluster needs an update? 1 API call to update Kubernetes
      - Certificates need rotating? 1 API call to rotate certificates
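      For example, certificate rotation is a single authenticated POST
      against Rancher's v3 API; the host, cluster ID, and token are
      placeholders, and the action name should be confirmed against the
      Rancher docs for your version:

      # One API call to rotate the certificates of cluster c-xxxxx.
      # $RANCHER_TOKEN is an API token of the form "token-xxxxx:<secret>".
      $ curl -s -u "$RANCHER_TOKEN" \
          -X POST 'https://<rancher-host>/v3/clusters/c-xxxxx?action=rotateCertificates'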
  41. 2 Types of Monitoring for User Clusters
      Simple health check:
      1. The Kubernetes API is working
      2. TCP connectivity with the agents in the user's cluster
         a. a cluster agent per cluster
         b. a node agent per node
      3. Sync the kubelet status for each node
      4. Sync the componentstatus API result
      Advanced health check:
      1. Node resource usage (with Node Exporter)
      2. Etcd /metrics API result
      3. kube-XXXXX /metrics API result
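      The componentstatus result mentioned above is what kubectl shows as
      follows (output shape from a healthy v1.11-era cluster; the
      componentstatuses API is deprecated in newer Kubernetes releases):

      $ kubectl get componentstatuses
      NAME                 STATUS    MESSAGE             ERROR
      scheduler            Healthy   ok
      controller-manager   Healthy   ok
      etcd-0               Healthy   {"health":"true"}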
  42. Database: Simple Health Check with Rancher
      The "automate operating multiple clusters" component periodically
      updates Cluster and Node objects in its database:
      - Kind: Cluster (e.g. Cluster1), Status: componentStatuses,
        conditions, ...
      - Kind: Node (e.g. Node1), Status: internalNodeStatus, conditions, ...
      NOTE: Rancher actually uses Kubernetes CRDs as its database, so
      strictly speaking the database is a Kubernetes cluster.
  43. Database: Simple Health Check with Rancher => Rancher State Metrics
      We export the Cluster/Node state stored by Rancher as Prometheus
      metrics:
      - rancher_cluster_not_true_condition
        e.g. {cluster="c-2mf2d",condition="NoMemoryPressure"}
      - rancher_cluster_component_not_true_status
        e.g. {cluster="c-25f4n",exported_component="controller-manager"}
      - rancher_node_not_true_internal_condition
        e.g. {cluster="c-k87fq",condition="PIDPressure",node="m-2tch7"}
      - rancher_node_not_true_codition
        e.g. {cluster="c-2f6gk",condition="Provisioned",node="m-kkrz6"}
  44. Database: Simple Health Check with Rancher => Configured Alerts
      The same Rancher state metrics feed the alerts we configure (a sketch
      of such a rule follows):
      - Long node provisioning
      - Long cluster provisioning
      - Unhealthy component status
      - Unhealthy node condition
      - Unhealthy cluster condition
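      A minimal sketch of one such rule, assuming the exported gauge is
      nonzero while a condition is not true; the threshold and duration are
      invented for illustration:

      $ cat rancher-alerts.yml   # referenced from Prometheus's rule_files
      groups:
      - name: rancher-state
        rules:
        - alert: UnhealthyClusterCondition
          expr: rancher_cluster_not_true_condition > 0
          for: 10m
          labels:
            severity: warning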
  45. We Can Easily and Quickly Notice When Something Is Wrong
      Even with more than 90 clusters, these alerts (long node/cluster
      provisioning, unhealthy component status, unhealthy node/cluster
      condition) let us notice every node and cluster status change.
  46. Advanced Health Check Overview => Repeat of P24
      As shown earlier (User Cluster Monitoring): the monitoring addon
      watches the Kubernetes control plane (etcd, controllers); users see a
      monitoring dashboard, and alerts notify the developers/operators of
      Managed Kubernetes.
  47. Metrics Coverage?
      Per node type, the components that expose metrics:
      - Etcd: etcd, kube-proxy, kubelet, Node Exporter
      - Controller: kube-apiserver, kube-controller-manager, kube-scheduler,
        kube-proxy, kubelet, Node Exporter
      - Worker: kube-proxy, kubelet, Node Exporter, plus kube-dns and
        metrics-server
  48. Is everything working fine?

  49. Yes, as long as the C-plane is working fine

  50. Don’t forget C-plane Monitoring

  51. C-plane Monitoring
      The same architecture diagram (API, "automate operating multiple
      clusters", Addon Controller Starter / Addon Controllers, addons): all
      of this is C-plane that needs monitoring.
  52. C-plane Monitoring
      The most important piece is the "automate operating multiple clusters"
      component (Rancher): it has many internal states for managing multiple
      Kubernetes clusters.
  53. Introduce a /metrics API to Expose Internal State
      We applied our patch to our Rancher deployment and proposed it
      upstream: https://github.com/rancher/rancher/pull/21351
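      With the patch applied, the internal state can be scraped like any
      other Prometheus endpoint (the host is a placeholder; authentication
      depends on the deployment):

      $ curl -s https://<rancher-host>/metrics | grep '^rancher_'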
  54. Found & Fixed Issues: Proposed 20 PRs
      1. https://github.com/rancher/types/pull/525
      2. https://github.com/rancher/machine/pull/12
      3. https://github.com/rancher/norman/pull/201
      4. https://github.com/rancher/norman/pull/202
      5. https://github.com/rancher/norman/pull/203
      6. https://github.com/rancher/rancher/pull/15909
      7. https://github.com/rancher/types/pull/588
      8. https://github.com/rancher/rke/pull/939
      9. https://github.com/rancher/rancher/pull/15985
      10. https://github.com/rancher/rancher/pull/15991
      11. https://github.com/rancher/rancher/pull/16044
      12. https://github.com/rancher/rancher/pull/18113
      13. https://github.com/rancher/rancher/pull/19750
      14. https://github.com/rancher/rancher/pull/20005
      15. https://github.com/rancher/norman/pull/285
      16. https://github.com/rancher/rke/pull/1374
      17. https://github.com/rancher/rancher/pull/21118
      18. https://github.com/rancher/norman/pull/291
      19. https://github.com/rancher/rke/pull/1438
      20. https://github.com/rancher/rancher/pull/21351
  55. What’s next?

  56. Gradually Moving to the Next Phase
      Next topics:
      - Performance
      - Private Cloud Collaboration
      - Solution Architect work => more features for specific use cases