Slide 1

Slide 1 text

Kyle Bai 白凱仁 Practical Kubernetes

Slide 2

Slide 2 text

@k2r2bai About Me
白凱仁 (Kyle Bai)
• Software Engineer @ inwinSTACK.
• OSS Contributor.
• Certified Kubernetes Administrator/Application Developer.
• Co-organizer of Cloud Native Taiwan User Group.
• Interested in emerging technologies.
@kairen https://k2r2bai.com

Slide 3

Slide 3 text

Agenda
Today I would like to talk about:
• Overview of Kubernetes
• Overview of Kubeflow
• Kubeflow Deployment with Minikube
• Set up Multi-node Cluster
• Kubeadm

Slide 4

Slide 4 text

Overview of Kubernetes

Slide 5

Slide 5 text

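Bare Metal
• No isolation
• No namespaces
• Applications share common libraries
• Applications are tightly coupled to the operating system
[Diagram: several apps sharing one set of libs on a single kernel]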

Slide 6

Slide 6 text

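Virtual Machines (OS Virtualization)
• Strong isolation
• Performance overhead
• Applications are still tightly coupled to the operating system
• Managing many virtual machines is inefficient
• Slow startup
• Large system images
• Coarse granularity
[Diagram: each app runs on its own libs and its own kernel]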

Slide 7

Slide 7 text

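Containers (OS-Level Virtualization, Application Virtualization)
• Good performance
• Namespaces isolate the network, UIDs, and so on
• Tightly coupled to the OS kernel
• Fast startup
• Small application images (as small as 10 MB), so they are highly portable
• Fine granularity, which raises utilization density
[Diagram: several apps, each with its own libs, sharing one kernel]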

Slide 8

Slide 8 text

Kubernetes
• Container orchestration
• Self-healing
• Horizontal scaling
• Service discovery and load balancing
• Automated rollouts and rollbacks
• Secrets and configuration management
• Storage orchestration
"Kubernetes is becoming the Linux of the cloud" (Jim Zemlin, Linux Foundation)

Slide 9

Slide 9 text

Kubernetes Architecture
[Diagram: users reach the API through the UI or the CLI. Master: apiserver, etcd, scheduler, controllers. Nodes: kubelet, kube-proxy, container runtime, add-ons.]

Slide 10

Slide 10 text

Kubernetes System Layers
• Nucleus: API and Execution
• Application Layer: Deployment and Routing
• Governance Layer: Automation and Policy Enforcement
• Interface Layer: Client Libraries and Tools
• Ecosystem
Also shown: Container Runtime, Network Plugin, Volume Plugin, Image Registry, Cloud Provider, Identity Provider, Device Plugin

Slide 11

Slide 11 text

Layers:
• Governance Layer: Automation and Policy Enforcement (APIs optional and pluggable)
• Application Layer: Deployment and Routing (APIs required and pluggable)
• Nucleus: API and Execution (APIs required and not pluggable)
Resources shown (Kind group/version): CronJob batch/v2alpha1, Job batch/v1, Deployment apps/v1, DaemonSet apps/v1, Pod core/v1, ReplicaSet apps/v1, StatefulSet apps/v1, ReplicationController core/v1, Endpoints core/v1, Ingress extensions/v1beta1, Service core/v1, ConfigMap core/v1, Secret core/v1, PersistentVolumeClaim core/v1, StorageClass storage/v1, ControllerRevision apps/v1, Event core/v1, LimitRange core/v1, ValidatingWebHookConfiguration admissionregistration/v1alpha1, HorizontalPodAutoscaler autoscaling/v1, APIService apiregistration/v1beta1, PodDisruptionBudget policy/v1beta1, PodPreset settings/v1alpha1, PodSecurityPolicy extensions/v1beta1, CertificateSigningRequest certificates/v1beta1, ClusterRole rbac/v1beta1, ClusterRoleBinding rbac/v1beta1, LocalSubjectAccessReview authorization/v1, Namespace core/v1, Node core/v1, PersistentVolume core/v1, ResourceQuota core/v1, Role rbac/v1beta1, RoleBinding rbac/v1beta1, SelfSubjectAccessReview authorization/v1, ServiceAccount core/v1, SubjectAccessReview authorization/v1, NetworkPolicy networking/v1, ComponentStatus core/v1, PriorityClass scheduling/v1alpha1, ClusterServiceBroker servicecatalog/v1beta1, ClusterServiceClass servicecatalog/v1beta1, ClusterServicePlan servicecatalog/v1beta1, ServiceInstance servicecatalog/v1beta1, ServiceBinding servicecatalog/v1beta1, MutatingWebHookConfiguration admissionregistration/v1alpha1, SelfSubjectRulesReview authorization/v1, TokenReview authentication/v1, CustomResourceDefinition apiextensions/v1beta1

Slide 12

Slide 12 text


Slide 13

Slide 13 text

Overview of Kubeflow

Slide 14

Slide 14 text

ML/DL Ecosystem (Ref: Sculley et al., "Hidden Technical Debt in Machine Learning Systems")

Slide 15

Slide 15 text

ML/DL Ecosystem (Ref: Sculley et al., "Hidden Technical Debt in Machine Learning Systems")

Slide 16

Slide 16 text

Run TensorFlow on K8s w/o Kubeflow

Slide 17

Slide 17 text

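Kubeflow
Kubeflow aims to simplify running machine learning (ML) on Kubernetes, making it simpler, more portable, and more scalable.
• The goal is not to recreate other services but to provide a best-in-class development system that can be deployed to any cluster, keeps ML workloads portable between clusters, and makes it easy to scale jobs out to any cluster.
• Because it is built on Kubernetes, Kubeflow can be deployed anywhere Kubernetes runs.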

Slide 18

Slide 18 text

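The Kubeflow mission
• Easy, repeatable, portable deployments across different infrastructure (Laptop <-> ML rig <-> Training cluster <-> Production cluster).
• Deploying and managing loosely coupled microservices.
• Scaling the cluster based on demand.
https://www.kubeflow.org/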

Slide 19

Slide 19 text

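What's in the Toolkit?
• JupyterHub: creates and manages interactive Jupyter Notebooks.
• TF Job Operator and Controller: scales and manages training jobs, can be configured to use CPUs or GPUs, and simplifies the configuration of distributed deployments.
• TensorFlow Serving: deploys trained TensorFlow models so that users can consume them and run predictions.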

Slide 20

Slide 20 text

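Simple, Portable, Scalable
• Simple: data scientists and ML engineers can focus on building models, and the learning cost of deployment is lower.
• Portable: workloads stay portable across Kubernetes clusters running on different platforms.
• Scalable: workload balancing is enabled on Kubernetes.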

Slide 21

Slide 21 text

Inference ML Environment

Slide 22

Slide 22 text

Kubernetes managing resources
Kubernetes and containers manage the operating system and hardware resources.

Slide 23

Slide 23 text

Kubernetes managing resources
Running Kubeflow on Kubernetes makes deployments of machine learning (ML) workflows easier to port and scale.
[Diagram label: Kubeflow]

Slide 24

Slide 24 text

Managed by Kubeflow
You only need to worry about tuning parameters and setting objectives through Kubernetes in order to train ML models.
Source: https://medium.com/@amina.alsherif/how-to-get-started-with-kubeflow

Slide 25

Slide 25 text

ML/DL Workflow

Slide 26

Slide 26 text

ML/DL Workflow

Slide 27

Slide 27 text

Use Kubeflow
[Diagram labels: Developer creates model, Katib, Distributed Training, Serving]
Source: https://speakerdeck.com/masayaaoyama/introduction-to-kubeflow-0-dot-1-and-future-at-cloud-native-meetup-tokyo-number-2

Slide 28

Slide 28 text

Use an Operator to manage training jobs
Kubeflow uses Operators together with custom resources (CRDs) to extend Kubernetes, and manages jobs for various ML/DL frameworks through custom kinds such as "TFJob".
Check the status of Kubeflow TFJobs:
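To make the TFJob custom kind more concrete, here is a minimal sketch that builds a TFJob manifest as a Python dictionary. The `apiVersion` value, the job name, and the image are assumptions for illustration (the API version in particular depends on the Kubeflow release installed), so treat this as a shape to compare against your cluster's CRD rather than a definitive spec.

```python
import json


def tfjob_manifest(name, image, workers=2, ps=1,
                   api_version="kubeflow.org/v1beta2"):
    """Build a TFJob manifest (assumed API version; check your cluster's CRD)."""

    def replica(count):
        # Each replica type gets a replica count and a regular Pod template.
        return {
            "replicas": count,
            "template": {
                "spec": {
                    "containers": [{"name": "tensorflow", "image": image}]
                }
            },
        }

    return {
        "apiVersion": api_version,
        "kind": "TFJob",
        "metadata": {"name": name, "namespace": "kubeflow"},
        "spec": {
            "tfReplicaSpecs": {
                "PS": replica(ps),          # parameter servers
                "Worker": replica(workers)  # training workers
            }
        },
    }


# Hypothetical job name and image, used only for illustration.
manifest = tfjob_manifest("mnist-demo", "example.com/mnist-train:latest")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.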

Slide 29

Slide 29 text

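Operator
Operator is a concept introduced by CoreOS that aims to simplify the management of complex stateful applications. An Operator watches for changes in application state and uses a controller plus custom resources (CustomResourceDefinition) to automatically create, manage, and configure application container instances through the extended Kubernetes API.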

Slide 30

Slide 30 text

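Parts of the Operator
• A domain-specific controller
• Software that encodes knowledge of a specific application
• Extends Kubernetes through a custom Controller/Resource (CRD)
• Simplifies creating, configuring, and managing stateful applications
• Maintains the application automatically through code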

Slide 31

Slide 31 text

The Controller Loop
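The controller loop can be sketched as "compare desired state with actual state, then act on the difference". The example below is a toy, in-memory illustration of that idea. The `Cluster` class and `reconcile` function are hypothetical names and do not talk to a real Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy in-memory stand-in for cluster state (hypothetical)."""
    desired: dict = field(default_factory=dict)  # name -> desired replicas
    actual: dict = field(default_factory=dict)   # name -> running replicas


def reconcile(cluster):
    """Run one pass of the loop and return the actions that were taken."""
    actions = []
    # Drive every desired object toward its desired replica count.
    for name, want in cluster.desired.items():
        have = cluster.actual.get(name, 0)
        if have < want:
            actions.append(f"create {want - have} replica(s) of {name}")
        elif have > want:
            actions.append(f"delete {have - want} replica(s) of {name}")
        cluster.actual[name] = want
    # Remove anything that is running but no longer desired.
    for name in list(cluster.actual):
        if name not in cluster.desired:
            actions.append(f"delete all replicas of {name}")
            del cluster.actual[name]
    return actions


cluster = Cluster(desired={"web": 3}, actual={"web": 1, "old": 2})
print(reconcile(cluster))  # first pass fixes the drift
print(reconcile(cluster))  # second pass finds nothing to do
```

A real controller repeats this pass continuously, typically triggered by watch events, which is why a second pass with no drift produces no actions.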

Slide 32

Slide 32 text

Kubeflow Deployment with Minikube

Slide 33

Slide 33 text

Create a cluster using Minikube
First, create a Kubernetes cluster with Minikube:
$ minikube start --memory=16384 --cpus=4 --kubernetes-version=v1.14.5
Download the Kubeflow CLI tool, which manages the components (deploy, update, and delete):
$ wget https://github.com/kubeflow/kubeflow/releases/download/v0.5.1/kfctl_v0.5.1_linux.tar.gz
$ sudo tar -C /usr/local/bin -xzf kfctl_v0.5.1_linux.tar.gz

Slide 34

Slide 34 text

Deploy Kubeflow via kfctl
Generate the components to deploy with kfctl:
$ kfctl init kubeflow -V
$ cd kubeflow && kfctl generate all -V
Deploy the Kubeflow components to the Kubernetes cluster with kfctl:
$ kfctl apply all -V

Slide 35

Slide 35 text

Check Kubeflow components
Check the deployment status of the Kubeflow components:
$ kubectl -n kubeflow get po
Use port-forward to access the Kubeflow portal:
$ kubectl -n kubeflow port-forward svc/ambassador 8080:80 --address 0.0.0.0
Then open HOST:8080.
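When many components are deploying at once, it can help to filter the status check down to the pods that still need attention. The sketch below assumes the JSON produced by `kubectl -n kubeflow get po -o json` and lists pods whose phase is neither Running nor Succeeded. The function name and the sample pod names are hypothetical.

```python
import json


def pods_not_running(pod_list_json):
    """Return names of pods whose phase is not Running or Succeeded."""
    pods = json.loads(pod_list_json)["items"]
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod["status"].get("phase") not in ("Running", "Succeeded")
    ]


# Hand-written sample mimicking the shape of `kubectl get po -o json`.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "ambassador-abc"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "tf-job-operator-xyz"},
         "status": {"phase": "Pending"}},
    ]
})
print(pods_not_running(sample))
```

Note that a Running phase does not guarantee every container is ready, so this is a coarse first check rather than a full health probe.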

Slide 36

Slide 36 text

Set up your notebooks

Slide 37

Slide 37 text

Create a notebook server

Slide 38

Slide 38 text

Create a notebook server

Slide 39

Slide 39 text

Create a notebook server

Slide 40

Slide 40 text

Create a notebook server with GPU

Slide 41

Slide 41 text

Kubernetes Device Plugins
Device Plugins, a feature added in Kubernetes v1.8, give third-party device vendors a common interface for writing plugins that connect device resources (such as GPUs) to Kubernetes and expose them to containers as Extended Resources.
Device plugins currently drawing the most attention:
• NVIDIA device plugin for Kubernetes
• AMD device plugin for Kubernetes
• Solarflare Device Plugin
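Once a device plugin advertises an extended resource, a container requests it under `resources.limits` like any other resource. The sketch below builds such a Pod manifest in Python. It assumes the `nvidia.com/gpu` resource name exposed by the NVIDIA device plugin, and the pod name and image are hypothetical.

```python
import json


def gpu_pod(name, image, gpus=1, resource="nvidia.com/gpu"):
    """Build a Pod manifest that requests an extended resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Extended resources are requested through limits.
                "resources": {"limits": {resource: gpus}},
            }]
        },
    }


# Hypothetical pod name and image, used only for illustration.
pod = gpu_pod("gpu-notebook", "example.com/tensorflow-gpu:latest")
print(json.dumps(pod, indent=2))
```

The scheduler only places such a pod on a node whose device plugin reports enough free units of the requested resource.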

Slide 42

Slide 42 text

Set up Multi-node Cluster

Slide 43

Slide 43 text

Kubernetes Deploy Tools
Kubespray, RKE, Kops, Kube-aws, Typhoon, Kubicorn, Docker for K8s, LinuxKit, Matchbox, KubeNow, Bootkube, kubeadm-dind-cluster, Minikube, PKS, Other
Sources:
https://github.com/ramitsurana/awesome-kubernetes#installers
https://caylent.com/50-useful-kubernetes-tools
https://docs.google.com/spreadsheets/d/1LxSqBzjOxfGx3cmtZ4EbB_BGCxT_wlxW_xgHVVa23es/edit#gid=0

Slide 44

Slide 44 text

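Kubeadm
Kubeadm is also a tool maintained by the Kubernetes community. Unlike Minikube, its source code lives inside the core Kubernetes project. Kubeadm mainly helps you build clusters that follow best practices.
• Suited to deploying multi-node clusters, and also supports HA and self-hosting deployments
• Supports describing the cluster to deploy in a config file
• Supports some key features of new Kubernetes versions by default
• Many deployment tools use kubeadm under the hood
• GA and Production-Ready
https://github.com/kubernetes/kubernetes/tree/master/cmd/kubeadm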

Slide 45

Slide 45 text

Built to be part of a higher-level solution
Source: https://kccnceu19.sched.com/event/MPhI/intro-cluster-lifecycle-sig-lucas-kaldstrom-independent-tim-st-clair-vmware

Slide 46

Slide 46 text

SIG Cluster Lifecycle roadmap
Source: https://kccnceu19.sched.com/event/MPhI/intro-cluster-lifecycle-sig-lucas-kaldstrom-independent-tim-st-clair-vmware

Slide 47

Slide 47 text

Create machines using Vagrant with VirtualBox
First, fetch the following repo with Git:
$ git clone https://github.com/inwinstack/k8s-course
Change into the corresponding directory and run the following commands:
$ cd k8s-course/multi-cluster/kubeadm
$ vagrant up
$ vagrant status
$ vagrant ssh

Slide 48

Slide 48 text

Create master node
Use vagrant to log in to the k8s-m1 node, then run the following commands:
$ vagrant ssh k8s-m1
$ sudo su -
$ sudo kubeadm init --apiserver-advertise-address=192.16.35.12 \
    --pod-network-cidr=10.244.0.0/16 \
    --token rlag12.6sd1dhhery5r6fk2 \
    --ignore-preflight-errors=NumCPU

Slide 49

Slide 49 text

Create CNI plugin (Calico)
On the k8s-m1 node, run the following commands to copy the admin config:
$ mkdir -p $HOME/.kube && \
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && \
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
On the k8s-m1 node, run the following command:
$ kubectl apply -f /vagrant/calico.yaml

Slide 50

Slide 50 text

Join worker nodes
Use vagrant to log in to the k8s-n1 and k8s-n2 nodes, then run the following commands on each:
$ vagrant ssh k8s-n1
$ sudo su -
# Copy this line from the output of your own master
$ kubeadm join 192.16.35.12:6443 --token rlag12.6sd1dhhery5r6fk2 \
    --discovery-token-ca-cert-hash sha256:ea1f9e8a715c5fcaf1379073e4f9ed5ea34339398b1fab3bcc2bfe74cc07c6be

Slide 51

Slide 51 text

Deploy an NGINX service
On the master node, run the following commands to create NGINX:
$ kubectl apply -f /vagrant/pdb/
$ kubectl get po,svc -o wide
$ curl 192.16.35.12

Slide 52

Slide 52 text

Kubeadm Workflow

Slide 53

Slide 53 text

Kubeadm init Workflow
1. Preflight Checks
2. Generate CA
3. Write kubeconfig files
4. Generate static Pod manifests
5. Generate bootstrap token
6. Install Addons
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init-phase/
https://kccnceu19.sched.com/event/MPhI/intro-cluster-lifecycle-sig-lucas-kaldstrom-independent-tim-st-clair-vmware

Slide 54

Slide 54 text

Preflight Checks
• A series of checks that confirm this machine can be used to deploy Kubernetes.
• Errors:
  • if the kernel is not 3.10+ or 4+ with the specific KernelSpec.
  • if the required cgroups subsystems aren't set up.
  • if the API server bindPort or ports 10250/10251/10252 are in use.
  • if the machine hostname is not a valid DNS subdomain.
• Whether the installed kubeadm and kubelet versions match.
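Two of the checks above are easy to sketch outside kubeadm: the hostname rule and the port check. The example below is a rough Python approximation, not kubeadm's own implementation. The function names are hypothetical, the hostname pattern follows the usual DNS-1123 subdomain rule, and 6443 is assumed as the API server bindPort.

```python
import re
import socket

# DNS-1123 subdomain: dot-separated lowercase labels of letters, digits, '-'.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
_SUBDOMAIN = re.compile(rf"{_LABEL}(\.{_LABEL})*")


def valid_dns_subdomain(hostname):
    """Return True if hostname looks like a valid DNS subdomain."""
    return len(hostname) <= 253 and _SUBDOMAIN.fullmatch(hostname) is not None


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def preflight(hostname, ports=(6443, 10250, 10251, 10252)):
    """Collect error messages for the hostname and port checks."""
    errors = []
    if not valid_dns_subdomain(hostname):
        errors.append(f"hostname {hostname!r} is not a valid DNS subdomain")
    errors += [f"port {p} is in use" for p in ports if port_in_use(p)]
    return errors


# The result depends on the machine this runs on.
print(preflight("k8s-m1"))
```

An empty list means both checks passed. kubeadm itself runs many more checks than these two, and `--ignore-preflight-errors` lets selected ones be downgraded to warnings.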

Slide 55

Slide 55 text

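Generate CA
• Generates the certificates and directories that Kubernetes needs in order to serve external requests.
• kube-apiserver can only be accessed over HTTPS when Kubernetes is serving requests.
• The certificates are stored in /etc/kubernetes/pki
• You can choose to have kubeadm use existing certificates instead of generating new ones.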

Slide 56

Slide 56 text

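Write kubeconfig files
• Generates the configuration files that other components need to access kube-apiserver
  • /etc/kubernetes/xxx.conf
  • e.g. admin.conf
• Each file records the current master node's server address, listening port, certificate directory, and similar information.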

Slide 57

Slide 57 text

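Generate static Pod manifests
• Generates Pod manifests for the master's core components
  • kube-apiserver, kube-controller-manager, kube-scheduler
  • etcd
• Static Pod
  • Place the YAML file of the Pod to deploy in a designated directory
  • When the kubelet starts, it automatically checks this directory, loads every Pod YAML file, and starts the Pods
• YAML files of the master components: /etc/kubernetes/manifests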

Slide 58

Slide 58 text

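Generate bootstrap token
• Any node that has kubelet and kubeadm installed and holds this token can join the cluster through kubeadm join.
• The token value and how to use it are printed at the end.
• After the token is generated, kubeadm stores important master information such as ca.crt in etcd as a ConfigMap (cluster-info), for use when worker nodes are deployed later.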
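Bootstrap tokens follow a fixed shape: a 6-character token ID, a dot, and a 16-character secret, all drawn from lowercase letters and digits. The sketch below generates and validates tokens of that shape in Python. The function names are hypothetical, and in practice `kubeadm token generate` is the tool to use.

```python
import re
import secrets
import string

# <6-char id>.<16-char secret>, lowercase letters and digits only.
TOKEN_RE = re.compile(r"[a-z0-9]{6}\.[a-z0-9]{16}")
ALPHABET = string.ascii_lowercase + string.digits


def _random_part(length):
    """Return a random string of the given length from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_token():
    """Generate a random token in bootstrap-token format."""
    return f"{_random_part(6)}.{_random_part(16)}"


def is_valid_token(token):
    """Return True if token matches the bootstrap-token format."""
    return TOKEN_RE.fullmatch(token) is not None


print(is_valid_token("rlag12.6sd1dhhery5r6fk2"))  # token used in this deck
print(is_valid_token(generate_token()))
```

A fixed token like the one in this deck is convenient for a lab, but anyone who learns it can join the cluster, so generated, short-lived tokens are the safer choice elsewhere.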

Slide 59

Slide 59 text

Deploy addons
• By default kubeadm installs two add-ons: kube-proxy and CoreDNS
• They provide Service proxying and DNS for the whole cluster, respectively.

Slide 60

Slide 60 text

Kubeadm join Workflow
Source: https://kccnceu19.sched.com/event/MPhI/intro-cluster-lifecycle-sig-lucas-kaldstrom-independent-tim-st-clair-vmware

Slide 61

Slide 61 text

Kubeadm join Workflow
• Adds a node to the current cluster.
$ kubeadm join --token XYZ --discovery-token-ca-cert-hash XYZ
• To become a node in a Kubernetes cluster, the machine must register itself with the cluster's kube-apiserver.
  • Registration requires the relevant certificate files.
• Kubeadm uses the bootstrap token to reach kube-apiserver in "insecure mode", fetches cluster-info, and then registers the node.

Slide 62

Slide 62 text

Summary
• Kubeadm has a clean design and is a tool officially supported by Kubernetes
• Can it be used in production?
  • 1.14: Setting up HA clusters with kubeadm is still experimental and will be further simplified in future versions.