
Managed Kubernetes Cluster Service In Private Cloud


LINE Developers

January 15, 2020

Transcript

  1. Managed Kubernetes Cluster Service In Private Cloud
     LINE Corporation, Yuki Nishiwaki
  2. High Level Architecture of LINE Private Cloud
     - Multiple regions supported (Region 1, Region 2, Region 3)
     - IaaS: Identity, Image, DNS, L4LB, L7LB, Block Storage, Object Storage,
       Baremetal, VM, Kubernetes
     - PaaS: Redis, MySQL, ElasticSearch
     - FaaS: Function as a Service
     - Scale keeps growing: 1,500 hypervisors, 8,000 bare-metal servers
     - Provides different levels of abstraction (IaaS, PaaS, FaaS)
  3. Today's Topic
     The same private cloud architecture as above, with the Kubernetes
     service highlighted: today's topic is the managed Kubernetes offering
     inside the private cloud.
  4. Mission of Our Managed Kubernetes Service
     Serving more than 2,200 developers (100+ clusters), with two roles:
     Kubernetes Operator and Kubernetes Solution Architect. The pillars:
     1. High Availability: make an effort to keep Kubernetes clusters stable
     2. Deployment / Update
     3. Kubernetes cluster performance and private cloud collaboration: keep
        thinking about how existing applications can be migrated to Kubernetes
  5. Mission of Our Managed Kubernetes Service
     The same pillars, with "where we focus now" marked: High Availability
     and Deployment / Update.
  6. How / What do we provide?

  7. (image-only slide)
  8. (image-only slide)
  9. Cluster Details / Node Details

  10. $ kubectl --kubeconfig k get node
      NAME              STATUS   ROLES          AGE    VERSION
      yuki-vfs-testc1   Ready    controlplane   227d   v1.11.2
      yuki-vfs-testc2   Ready    controlplane   227d   v1.11.2
      yuki-vfs-testc3   Ready    controlplane   227d   v1.11.2
      yuki-vfs-teste1   Ready    etcd           227d   v1.11.2
      yuki-vfs-teste2   Ready    etcd           227d   v1.11.2
      ...
  11. What's behind the Managed Kubernetes Service?

  12. 1. Deployment: HA Deployments (Default Template)
      - etcd: 5 nodes
      - controller: 3 nodes, each running kube-apiserver,
        kube-controller-manager, kube-scheduler, kubelet, kube-proxy
      - worker: N nodes, each running kubelet and kube-proxy
      - Toleration limit: 2 node failures
      - Nodes come from our deployment (OpenStack VMs) or are imported
  13. 1. Deployment: HA Deployments (Default Template)
      The same template as above; the point is that all server groups
      tolerate at least 2 node failures. For example, with 5 etcd nodes the
      quorum is 3, so 2 etcd failures can be tolerated.
  14. 1. Deployment: Fully Managed Etcd, Controller
      Etcd and controller nodes are fully managed. Depending on the load on
      the API, we scale out (increase the number of API servers) or scale up
      (e.g. replace the basic etcd flavor with one that has more disk and
      CPU).
  15. 1. Deployment: Support Import Worker
      From the GUI or CLI, users can increase the number of workers. For some
      specific users, we also allow importing their own servers as workers.
  16. 1. Deployment: Why We Support "Import"? (1/3)
      Several teams (Web Application Dev Team A/B, Machine Learning Team A)
      each get their own Kubernetes cluster: we create OpenStack VMs and
      install Kubernetes (etcd, control plane, workers) on them.
  17. 1. Deployment: Why We Support "Import"? (2/3)
      The Machine Learning team wanted to use their own GPU servers, which
      are not maintained by the private cloud.
  18. 1. Deployment: Why We Support "Import"? (3/3)
      We allow importing such servers, but only as workers: the user
      registers the server as a worker, we install Kubernetes on it and let
      it join the cluster. Etcd and the control plane stay on managed
      OpenStack VMs.
  19. 1. Deployment: Summary
      - Network: Flannel + VXLAN. Despite the encapsulation overhead, we
        chose a simple implementation for easy operation. NetworkPolicy:
        each project gets its own Kubernetes cluster, so we do not provide
        network security inside a cluster.
      - Etcd: OpenStack VMs (default: 5 nodes). VMs make it easy to scale up
        and replace nodes; fully managed by the private cloud operators.
      - Controller: OpenStack VMs (default: 3 nodes). VMs make it easy to
        scale up, scale out, and replace nodes; fully managed by the private
        cloud operators.
      - Workers: OpenStack VMs or imported servers (default: 3 nodes). For
        OpenStack VMs, scale-out is performed by the end user, but healing
        and replacement are done by the operator. Imported servers must be
        managed by the end user.
  20. 2. Easy Auto-Healing
      When etcd, controller, or worker nodes break, the broken nodes are
      deleted and new nodes are created to take their place.
  21. 2. Easy Auto-Healing: Key Concept (NodePool)
      Etcd, controller, and worker nodes each belong to a NodePool. A
      NodePool defines:
      - Role: etcd, controller, or worker
      - Number of nodes: an integer (e.g. 10)
      - Node spec: a spec ID (OpenStack image ID, flavor ID, network ID)
      - Node labels: populated into Kubernetes as node labels
      - Dockerd options: populated into the dockerd options on all nodes
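      To make the fields concrete, here is a hypothetical request creating a
      worker NodePool; the URL and JSON schema are invented for this sketch,
      only the field meanings come from the slide:

      # Hypothetical request; only the NodePool fields mirror the slide above.
      $ curl -X POST https://<managed-k8s-api>/v1/clusters/<cluster-id>/nodepools \
          -H 'Content-Type: application/json' \
          -d '{
                "role": "worker",
                "count": 10,
                "spec": {"imageId": "<image-id>", "flavorId": "<flavor-id>",
                         "networkId": "<network-id>"},
                "nodeLabels": {"disktype": "ssd"},
                "dockerdOptions": ["--log-driver=json-file"]
              }'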
  22. 2. Easy Auto-Healing: Key Concept (NodePool)
      Each NodePool is checked periodically:
      - Is the number of nodes sufficient?
      - Is any node in an error state?
      If needed, a VM is created from the NodePool's spec and added to the
      Kubernetes cluster (a sketch of this loop follows).
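      A minimal shell sketch of that reconciliation idea, assuming worker
      nodes carry a hypothetical nodepool=worker label; the real loop runs
      inside the management plane, not as a script:

      # Illustrative only: compare the desired count against Ready nodes.
      desired=10
      while true; do
        ready=$(kubectl get nodes -l nodepool=worker --no-headers \
                | awk '$2 == "Ready"' | wc -l)
        if [ "$ready" -lt "$desired" ]; then
          echo "need $((desired - ready)) node(s): create VMs from the NodePool spec"
          # e.g. openstack server create ..., then join each VM to the cluster
        fi
        sleep 60
      done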
  23. 2. Easy Auto-Healing: Key Concept (NodePool)
      The user is aware only of NodePools, not of the individual nodes
      behind them.
  24. 3. Addon Management
      What is a Kubernetes addon? To make a Kubernetes cluster production
      ready, much extra software has to run; we call these Kubernetes addons:
      - Cluster monitoring
      - Log aggregation
      - Persistent volume provider
      - Etcd backup software
      What is a Verda certified addon? For common use cases and middleware,
      the private cloud team actively maintains the addon instead of the
      private cloud user: both the addon software implementation itself and
      its configuration best practices. These are the Verda officially
      supported addons.
  25. 3. Addon Management: Addon Manager
      The user enables an addon (e.g. log aggregation) through an HTTP API;
      the Addon Manager then deploys the addon into the user's cluster and
      makes sure it stays updated and running. Its jobs:
      - Deploy addons
      - Update addons
      - Monitor addons
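      As an illustration, enabling an addon could look like the call below;
      the endpoint and payload are assumptions made for this sketch, not the
      documented service API:

      # Hypothetical request to the Addon Manager HTTP API.
      $ curl -X POST https://<managed-k8s-api>/v1/clusters/<cluster-id>/addons \
          -H 'Content-Type: application/json' \
          -d '{"name": "log-aggregation", "enabled": true}'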
  26. 3. Addon Management: User Cluster Monitoring (1/4)
      The monitoring addon watches the Kubernetes control plane (etcd,
      controllers) of the user cluster. Users can see a monitoring
      dashboard; triggered alerts notify the developers/operators of
      Managed Kubernetes.
  27. 3. Addon Management: Log Aggregation (2/4)
      The log aggregation addon collects the following for both operators
      and users:
      - /var/lib/rancher/rke/log/* (Kubernetes and etcd logs)
      - /var/log/containers/* (all containers' logs)
      Log rotation is handled with the Docker json-file logging driver.
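      For reference, rotation with the json-file driver is configured
      through dockerd options (real Docker flags; the size and count values
      below are example choices, not the deck's actual settings):

      # Rotate container logs at 100 MB, keeping at most 5 files per container.
      dockerd --log-driver=json-file \
              --log-opt max-size=100m \
              --log-opt max-file=5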
  28. 3. Addon Management: Persistent Volume (3/4)
      The persistent volume provider addon handles persistent volume claims
      (PVCs) from pods: it creates a volume, binds the volume to a worker,
      and exposes the volume to the pod.
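      From the user's side this is the standard Kubernetes PVC flow; a
      minimal example (the name and size are placeholders, and the storage
      class defaults to the cluster's provisioner):

      $ cat pvc.yaml
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
      $ kubectl apply -f pvc.yaml
      persistentvolumeclaim/data created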
  29. 3. Addon Management: Etcd Periodic Backup (4/4)
      The etcd backup addon periodically takes a snapshot of etcd and stores
      the snapshot in Object Storage.
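      A snapshot can be taken with etcdctl as below; this is a generic
      sketch, the deck does not say which tooling the addon uses internally:

      # Take a point-in-time snapshot of etcd (etcdctl v3 API), then upload
      # the resulting file to Object Storage.
      $ ETCDCTL_API=3 etcdctl --endpoints=https://<etcd-host>:2379 \
          --cacert=ca.pem --cert=client.pem --key=client-key.pem \
          snapshot save /backup/etcd-$(date +%Y%m%d-%H%M).db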
  30. Architecture

  31. Architecture: Composed of 3 Different Components
      1. API: cluster operations (cluster create, cluster update, add worker)
      2. "Automate operating multiple clusters" component: manages clusters
         (deploy, update, monitor)
      3. Addon Manager: the Addon Controller Starter starts an Addon
         Controller in each cluster; the Addon Controller manages addons
         (deploy, monitor, update)
  32. Why We Need a Simple API Server in Front of Rancher
      What: a simple, stateless API server.
      Responsibility in our Managed Kubernetes: provide the interface of the
      Managed Kubernetes Service for users; this is the only place users
      interact with.
      Why:
      - Don't expose the Rancher API/GUI directly to users
        - Avoid depending strongly on Rancher
        - Restrict certain Rancher functionality
      - Easy to add extra business logic
  33. Architecture (revisited)
      The same three-component diagram; next up is component 2, the part
      that automates operating multiple clusters.
  34. Rancher Is Our Core Management Functionality
      What is Rancher?
      - An OSS tool developed by Rancher Labs
      - Implemented on top of Kubernetes (uses CRDs and client-go heavily)
      - Provides multi-cluster management functionality
      Responsibility in our Managed Kubernetes:
      - Provisioning
      - Updates
      - Keeping Kubernetes clusters available/healthy (monitoring, healing)
  35. Architecture (revisited)
      The same three-component diagram; next up is component 3, the Addon
      Manager (Addon Controller Starter and Addon Controller).
  36. Addon Manager (Addon Controller Starter and Addon Controller)
      What: provides the addon management functionality. The Addon
      Controller Starter runs alongside Rancher; an Addon Controller runs on
      each Kubernetes cluster.
      Responsibility in our Managed Kubernetes:
      - Addon Controller Starter: runs an Addon Controller for each cluster
      - Addon Controller: deploys and updates addons based on the addon
        definition, and monitors them
  37. Scale and Operation

  38. Scale vs. Operators (as of Sep 2019)
      110 clusters (prod: 5, dev: 105) and 1,650 nodes (prod: 450,
      dev: 1,200), versus only 4 or 5 operators.
  39. Scale vs. Operators => repeat of the previous slide
  40. Rancher Automates a Lot of Operations
      Events and their handling:
      - A node got broken? 1 API call to replace the broken node
      - The entire cluster went down? 1 API call to restore etcd from a
        snapshot
      - The cluster needs an update? 1 API call to update Kubernetes
      - Certificates need rotating? 1 API call to rotate certificates
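      For example, certificate rotation is a single authenticated POST
      against Rancher's v3 API; the host, cluster ID, and token are
      placeholders, and the action name should be confirmed against the
      Rancher docs for your version:

      # One API call to rotate the certificates of cluster c-xxxxx.
      # $RANCHER_TOKEN is an API token of the form "token-xxxxx:<secret>".
      $ curl -s -u "$RANCHER_TOKEN" \
          -X POST 'https://<rancher-host>/v3/clusters/c-xxxxx?action=rotateCertificates'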
  41. 2 Types of Monitoring for User Clusters
      Simple health check:
      1. The Kubernetes API is working
      2. TCP connectivity with the agents in the user's cluster
         a. a cluster agent per cluster
         b. a node agent per node
      3. Sync the kubelet status for each node
      4. Sync the componentstatus API result
      Advanced health check:
      1. Node resource usage (with Node Exporter)
      2. Etcd /metrics API result
      3. kube-XXXXX /metrics API result
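      The componentstatus result mentioned above is what kubectl shows as
      follows (output shape from a healthy v1.11-era cluster; the
      componentstatuses API is deprecated in newer Kubernetes releases):

      $ kubectl get componentstatuses
      NAME                 STATUS    MESSAGE             ERROR
      scheduler            Healthy   ok
      controller-manager   Healthy   ok
      etcd-0               Healthy   {"health":"true"}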
  42. Database: Simple Health Check with Rancher
      The "automate operating multiple clusters" component periodically
      updates Cluster and Node objects in its database:
      - Kind: Cluster (e.g. Cluster1), Status: componentStatuses,
        conditions, ...
      - Kind: Node (e.g. Node1), Status: internalNodeStatus, conditions, ...
      NOTE: Rancher actually uses Kubernetes CRDs as its database, so
      strictly speaking the database is a Kubernetes cluster.
  43. Database: Simple Health Check with Rancher => Rancher State Metrics
      We export the Cluster/Node state stored by Rancher as Prometheus
      metrics:
      - rancher_cluster_not_true_condition
        e.g. {cluster="c-2mf2d",condition="NoMemoryPressure"}
      - rancher_cluster_component_not_true_status
        e.g. {cluster="c-25f4n",exported_component="controller-manager"}
      - rancher_node_not_true_internal_condition
        e.g. {cluster="c-k87fq",condition="PIDPressure",node="m-2tch7"}
      - rancher_node_not_true_codition
        e.g. {cluster="c-2f6gk",condition="Provisioned",node="m-kkrz6"}
  44. Database: Simple Health Check with Rancher => Configured Alerts
      The same Rancher state metrics feed the alerts we configure (a sketch
      of such a rule follows):
      - Long node provisioning
      - Long cluster provisioning
      - Unhealthy component status
      - Unhealthy node condition
      - Unhealthy cluster condition
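      A minimal sketch of one such rule, assuming the exported gauge is
      nonzero while a condition is not true; the threshold and duration are
      invented for illustration:

      $ cat rancher-alerts.yml   # referenced from Prometheus's rule_files
      groups:
      - name: rancher-state
        rules:
        - alert: UnhealthyClusterCondition
          expr: rancher_cluster_not_true_condition > 0
          for: 10m
          labels:
            severity: warning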
  45. We Can Easily and Quickly Notice When Something Is Wrong
      Even with more than 90 clusters, these alerts (long node/cluster
      provisioning, unhealthy component status, unhealthy node/cluster
      condition) let us notice every node and cluster status change.
  46. Advanced Health Check Overview => Repeat of P24
      As shown earlier (User Cluster Monitoring): the monitoring addon
      watches the Kubernetes control plane (etcd, controllers); users see a
      monitoring dashboard, and alerts notify the developers/operators of
      Managed Kubernetes.
  47. Metrics Coverage?
      Per node type, the components that expose metrics:
      - Etcd: etcd, kube-proxy, kubelet, Node Exporter
      - Controller: kube-apiserver, kube-controller-manager, kube-scheduler,
        kube-proxy, kubelet, Node Exporter
      - Worker: kube-proxy, kubelet, Node Exporter, plus kube-dns and
        metrics-server
  48. Is everything working fine?

  49. Yes, as long as the C-plane is working fine

  50. Don’t forget C-plane Monitoring

  51. C-plane Monitoring
      The same architecture diagram (API, "automate operating multiple
      clusters", Addon Controller Starter / Addon Controllers, addons): all
      of this is C-plane that needs monitoring.
  52. C-plane Monitoring
      The most important piece is the "automate operating multiple clusters"
      component (Rancher): it has many internal states for managing multiple
      Kubernetes clusters.
  53. Introduce a /metrics API to Expose Internal State
      We applied our patch to our Rancher deployment and proposed it
      upstream: https://github.com/rancher/rancher/pull/21351
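      With the patch applied, the internal state can be scraped like any
      other Prometheus endpoint (the host is a placeholder; authentication
      depends on the deployment):

      $ curl -s https://<rancher-host>/metrics | grep '^rancher_'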
  54. Found & Fixed Issues: Proposed 20 PRs
      1. https://github.com/rancher/types/pull/525
      2. https://github.com/rancher/machine/pull/12
      3. https://github.com/rancher/norman/pull/201
      4. https://github.com/rancher/norman/pull/202
      5. https://github.com/rancher/norman/pull/203
      6. https://github.com/rancher/rancher/pull/15909
      7. https://github.com/rancher/types/pull/588
      8. https://github.com/rancher/rke/pull/939
      9. https://github.com/rancher/rancher/pull/15985
      10. https://github.com/rancher/rancher/pull/15991
      11. https://github.com/rancher/rancher/pull/16044
      12. https://github.com/rancher/rancher/pull/18113
      13. https://github.com/rancher/rancher/pull/19750
      14. https://github.com/rancher/rancher/pull/20005
      15. https://github.com/rancher/norman/pull/285
      16. https://github.com/rancher/rke/pull/1374
      17. https://github.com/rancher/rancher/pull/21118
      18. https://github.com/rancher/norman/pull/291
      19. https://github.com/rancher/rke/pull/1438
      20. https://github.com/rancher/rancher/pull/21351
  55. What’s next?

  56. Gradually Moving to the Next Phase
      Next topics:
      - Performance
      - Private Cloud Collaboration
      - Solution Architect work => more features for specific use cases