Slide 1

Slide 1 text

Kubernetes: Self-Hosted On-Premises vs. GKE. Which One Suits You Better? Taipei Johnny Sung

Slide 2

Slide 2 text

Full stack developer Johnny Sung (宋岡諺) https://fb.com/j796160836 https://blog.jks.coffee/ https://www.slideshare.net/j796160836 https://github.com/j796160836

Slide 3

Slide 3 text

Outline
• Kubernetes (K8s) basic concepts
• Kubernetes (K8s) component concepts
• On-premises setup in practice
• Let's spin up a GKE cluster
• About GPUs

Slide 4

Slide 4 text

High Availability https://soco-st.com/18158

Slide 5

Slide 5 text

https://blog.whmcs.com/133514/demystifying-high-availability-for-whmcs ( Active / Standby )

Slide 6

Slide 6 text

CAP theorem
• Consistency
• Availability
• Partition tolerance
https://zh.wikipedia.org/zh-tw/CAP%E5%AE%9A%E7%90%86 https://medium.com/nerd-for-tech/understand-cap-theorem-751f0672890e

Slide 7

Slide 7 text

https://medium.com/how-gipi-learn/%E5%B0%8A%E9%87%8D-%E9%9C%80%E8%A6%81%E9%9D%A0%E5%B0%88%E6%A5%AD%E5%8E%BB%E8%B4%8F%E5%9B%9E%E4%BE%86-8fdecf676fe5

Slide 8

Slide 8 text

https://medium.com/%E5%BE%8C%E7%AB%AF%E6%96%B0%E6%89%8B%E6%9D%91/cap%E5%AE%9A%E7%90%86101-3fdd10e0b9a

Slide 9

Slide 9 text

https://medium.com/%E5%BE%8C%E7%AB%AF%E6%96%B0%E6%89%8B%E6%9D%91/cap%E5%AE%9A%E7%90%86101-3fdd10e0b9a

Slide 10

Slide 10 text

https://medium.com/%E5%BE%8C%E7%AB%AF%E6%96%B0%E6%89%8B%E6%9D%91/cap%E5%AE%9A%E7%90%86101-3fdd10e0b9a

Slide 11

Slide 11 text

Achieving high availability usually means changing application code. This talk looks at it from the infrastructure side. https://soco-st.com/18158

Slide 12

Slide 12 text

https://javascript.plainenglish.io/what-is-a-server-explanation-for-young-developers-2511d8b313b7

Slide 13

Slide 13 text

https://ithelp.ithome.com.tw/articles/10250841 Virtual Machine (VM) vs Docker

Slide 14

Slide 14 text

https://upload.wikimedia.org/wikipedia/commons/6/67/Kubernetes_logo.svg

Slide 15

Slide 15 text

https://www.cncf.io/blog/2024/06/06/unveiling-the-10-year-kubernetes-anniversary-logo/

Slide 16

Slide 16 text

https://www.linuxfoundation.org/kubernetes-10-year-logo-contest

Slide 17

Slide 17 text

Kubernetes from a developer's point of view. Could this be you? https://soco-st.com/20498

Slide 18

Slide 18 text

https://soco-st.com/20498 I know! It's just a [...] file!

Slide 19

Slide 19 text

What is it? Can you eat it? https://soco-st.com/20498

Slide 20

Slide 20 text

Think back to the Docker days

Slide 21

Slide 21 text

Created by hanis tusiyani from Noun Project https://thenounproject.com/icon/server-7086299/
 https://thenounproject.com/icon/data-center-7086329/
https://www.pngwing.com/en/free-png-ztqam
docker run command (starts a single service at a time):
docker run -v ./www:/usr/share/nginx/html:ro -p 80:80 -d nginx

Slide 22

Slide 22 text

Created by hanis tusiyani from Noun Project https://thenounproject.com/icon/server-7086299/
 https://thenounproject.com/icon/data-center-7086329/
https://www.pngwing.com/en/free-png-ztqam
docker run command (starts a single service at a time):
docker run -v ./www:/usr/share/nginx/html:ro -p 80:80 -d nginx
docker-compose.yml (starts a group of services at once):
version: "3"
services:
  nginx:
    image: nginx
    volumes:
      - ./www:/usr/share/nginx/html:ro
    ports:
      - "80:80"

Slide 23

Slide 23 text

Created by hanis tusiyani from Noun Project https://thenounproject.com/icon/server-7086299/
 https://thenounproject.com/icon/data-center-7086329/
https://www.pngwing.com/en/free-png-ztqam
docker run command (starts a single service at a time):
docker run -v ./www:/usr/share/nginx/html:ro -p 80:80 -d nginx
docker-compose.yml (starts a group of services at once):
version: "3"
services:
  nginx:
    image: nginx
    volumes:
      - ./www:/usr/share/nginx/html:ro
    ports:
      - "80:80"
Kubernetes (deploys groups of services across multiple hosts):
• deployment.yml
• services.yml
• rbac.yml
• config-map.yml
• …

Slide 24

Slide 24 text

docker-compose
version: "3"
services:
  nginx:
    image: nginx
    volumes:
      - ./www:/usr/share/nginx/html:ro
    ports:
      - "80:80"
• Service deployment
• Disk
• Network

Slide 25

Slide 25 text

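The corresponding Kubernetes components
• Service deployment → Deployment / Pod
• Disk → PersistentVolumeClaim (PVC) / ConfigMap / Secret
• Network → Service / Ingress
A PVC is a request for persistent disk storage; it is bound automatically, 1:1, to a PersistentVolume (PV).
On-premises K8s has no LoadBalancer available by default.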
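As a rough illustration of this mapping, the nginx service from the docker-compose example could be written as a Deployment plus a Service. This is a minimal sketch, not the speaker's manifest: the names `nginx` and `nginx-svc`, the replica count, and the node port 30080 are assumptions. The `./www` bind mount is left out because it has no direct equivalent; it would normally become a PVC or a ConfigMap. A NodePort Service is used because an on-premises cluster has no LoadBalancer by default.

```yaml
# Deployment: the "service deployment" part of the compose file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx                # hypothetical name
spec:
  replicas: 2                # assumption: two copies, which may land on different nodes
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
---
# Service: the "network" part (ports 80:80 in the compose file)
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc            # hypothetical name
spec:
  type: NodePort             # reachable on every node's IP, no LoadBalancer required
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080        # assumption; must fall in the default 30000-32767 range
```

Applied with `kubectl apply -f`, this would serve nginx on port 30080 of each node.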

Slide 26

Slide 26 text

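Kustomize
Kustomize is a configuration management tool for Kubernetes that simplifies deployments by customizing resource configuration. It modifies and manages Kubernetes manifest files declaratively, so no templates or dynamically generated configuration are needed. You define a shared "base" configuration and then apply customized overlays for each environment (such as development, testing, and production). Kustomize can merge or replace parts of YAML files, which makes configuration more modular and reusable. It is built into kubectl (since v1.14) and can be used directly with kubectl apply -k.
https://zlaval.medium.com/kustomize-template-free-kubernetes-application-management-3d70ca9d2e05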

Slide 27

Slide 27 text

Kustomize file structure
https://thenounproject.com/icon/file-6897025/ https://thenounproject.com/icon/puzzle-6850847/
deployment.yml
services.yml
config-map.yml
…
kustomization.yaml
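A minimal sketch of how these files could fit together, assuming a `base/` directory and one `overlays/production/` directory. Both paths, the `production` namespace, and the replica and tag values are assumptions for illustration, and the two documents below are two separate `kustomization.yaml` files shown together.

```yaml
# File 1: base/kustomization.yaml (hypothetical path)
# Lists the plain manifests that every environment shares.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yml
  - services.yml
  - config-map.yml
---
# File 2: overlays/production/kustomization.yaml (hypothetical path)
# Reuses the base and overrides only what differs in production.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production        # assumption: one namespace per environment
resources:
  - ../../base
replicas:
  - name: my-deployment      # assumed to match the Deployment name in the base
    count: 3
images:
  - name: my_image           # assumed to match the image name in the base
    newTag: "1.1"
```

With this layout, `kubectl apply -k overlays/production` would render the base with the production overrides and apply the result.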

Slide 28

Slide 28 text

Basic components of a web service

Slide 29

Slide 29 text

Pod Container https://thenounproject.com/icon/ram-7094983/ https://thenounproject.com/icon/hard-disk-7094988/ https://thenounproject.com/icon/network-5355161/ https://thenounproject.com/icon/history-5019532/ https://thenounproject.com/icon/central-processing-unit-7095000/ https://thenounproject.com/icon/form-6622708/
 https://thenounproject.com/icon/approval-6293848/ Basic components of a web service

Slide 30

Slide 30 text

Pod Container https://thenounproject.com/icon/ram-7094983/ https://thenounproject.com/icon/hard-disk-7094988/ https://thenounproject.com/icon/network-5355161/ https://thenounproject.com/icon/history-5019532/ https://thenounproject.com/icon/central-processing-unit-7095000/ https://thenounproject.com/icon/form-6622708/
 https://thenounproject.com/icon/approval-6293848/ Service Created by Mada Creative Basic components of a web service

Slide 31

Slide 31 text

Pod Container Deployment ReplicaSet https://thenounproject.com/icon/ram-7094983/ https://thenounproject.com/icon/hard-disk-7094988/ https://thenounproject.com/icon/network-5355161/ https://thenounproject.com/icon/history-5019532/ https://thenounproject.com/icon/central-processing-unit-7095000/ https://thenounproject.com/icon/form-6622708/
 https://thenounproject.com/icon/approval-6293848/ by Muhammad Naufal Subhiansyah from Noun Project Service Created by Mada Creative Basic components of a web service

Slide 32

Slide 32 text

Pod Container Deployment ReplicaSet https://thenounproject.com/icon/ram-7094983/ https://thenounproject.com/icon/hard-disk-7094988/ https://thenounproject.com/icon/network-5355161/ https://thenounproject.com/icon/history-5019532/ https://thenounproject.com/icon/central-processing-unit-7095000/ https://thenounproject.com/icon/form-6622708/
 https://thenounproject.com/icon/approval-6293848/ by Muhammad Naufal Subhiansyah from Noun Project Service Created by Mada Creative PVC (PersistentVolumeClaim) 1:1 PV (PersistentVolume) Basic components of a web service

Slide 33

Slide 33 text

Deployment
• Defines how a Pod is deployed
• Replicas: how many copies to run
• Configuration parameters
• ConfigMap, Secret
• Resource limits (CPU, memory)
• VolumeMounts (the PersistentVolumeClaim, PVC, to use)

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-deployment
  name: my-deployment
  namespace: my-namespace
spec:
  replicas: 1                     # number of Pod copies
  selector:
    matchLabels:
      app: my-deployment
  template:
    metadata:
      labels:
        app: my-deployment        # must match the selector above
    spec:
      containers:
        - image: my_image:1.0
          name: my-image          # container names must be DNS labels, so no underscore
          resources:
            requests:
              memory: 64Mi
              cpu: 250m
            limits:
              memory: 128Mi
              cpu: 500m
          ports:
            - containerPort: 5000
              name: my-image      # port names follow the same naming rule
          volumeMounts:           # the same PVC mounted at two paths
            - name: my-pvc
              mountPath: /mydata
            - name: my-pvc
              mountPath: /data/output
      volumes:
        - name: my-pvc
          persistentVolumeClaim:
            claimName: my-pvc
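The Deployment refers to a claim called `my-pvc` but the slide does not show it. A minimal sketch of what that PersistentVolumeClaim might look like follows; the access mode and the 1Gi size are assumptions, and the storage class is left to the cluster default.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc                 # must match claimName in the Deployment
  namespace: my-namespace      # a PVC is namespaced and must live next to the Pod
spec:
  accessModes:
    - ReadWriteOnce            # assumption: mounted read-write by a single node
  resources:
    requests:
      storage: 1Gi             # assumption: requested capacity
  # storageClassName is omitted, so the cluster's default StorageClass (if any) is used
```

Once the claim is bound to a PersistentVolume, Pods created by the Deployment can mount it.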

Slide 34

Slide 34 text

Pod Container Deployment ReplicaSet https://thenounproject.com/icon/ram-7094983/ https://thenounproject.com/icon/hard-disk-7094988/ https://thenounproject.com/icon/network-5355161/ https://thenounproject.com/icon/history-5019532/ https://thenounproject.com/icon/central-processing-unit-7095000/ https://thenounproject.com/icon/form-6622708/
 https://thenounproject.com/icon/approval-6293848/ by Muhammad Naufal Subhiansyah from Noun Project Service Created by Mada Creative PVC (PersistentVolumeClaim) 1:1 PV (PersistentVolume) Created by Andika Cahya Fitriani from the Noun Project Provisioner StorageClass Basic components of a web service And there's more...
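To make the StorageClass, Provisioner, and PV relationship concrete, here is a hedged sketch using a statically created local volume. It assumes no dynamic provisioner is installed, so the PV is written by hand; with a real provisioner, the `provisioner` field would name it and PVs would be created automatically. The names `local-storage`, `my-pv`, `worker-1`, and the path are hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage                        # hypothetical name
provisioner: kubernetes.io/no-provisioner    # no dynamic provisioning: PVs are created by hand
volumeBindingMode: WaitForFirstConsumer      # bind only once a Pod using the PVC is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv                                # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain      # keep the data when the claim is deleted
  storageClassName: local-storage            # PVCs naming this class can bind here
  local:
    path: /mnt/disks/data                    # assumption: a directory that exists on the node
  nodeAffinity:                              # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1                   # hypothetical node name
```

A PVC that sets `storageClassName: local-storage` and requests at most 1Gi would bind 1:1 to this PV.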

Slide 35

Slide 35 text

docker-compose
version: "3"
services:
  nginx:
    image: nginx
    volumes:
      - ./www:/usr/share/nginx/html:ro
    ports:
      - "80:80"
• Service deployment
• Disk
• Network

Slide 36

Slide 36 text

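The corresponding Kubernetes components
• Service deployment → Deployment / StatefulSet / Pod
• Disk → PersistentVolumeClaim (PVC) / ConfigMap / Secret
• Network → Service / Ingress
A PVC is a request for persistent disk storage; it is bound automatically, 1:1, to a PersistentVolume (PV).
On-premises K8s has no LoadBalancer available by default.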
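For the network row, an Ingress routes HTTP traffic by host and path to a Service. The sketch below assumes an ingress controller registered under the class name `nginx` is already installed, and that a Service named `nginx-svc` listens on port 80; the host name and both resource names are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress              # hypothetical name
spec:
  ingressClassName: nginx        # assumption: an ingress controller with this class exists
  rules:
    - host: www.example.com      # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix     # match every path under /
            backend:
              service:
                name: nginx-svc  # hypothetical Service name
                port:
                  number: 80
```

Without an ingress controller running in the cluster, this object is accepted but has no effect.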

Slide 37

Slide 37 text

Kubernetes from an operations engineer's point of view. This might still be you? https://soco-st.com/18158

Slide 38

Slide 38 text

The many choices in K8s
• Operating system (OS)
• K8s distro
• Container Runtime
• CNI (Container Network Interface)
• CRI (Container Runtime Interface)

Slide 39

Slide 39 text

The many choices in K8s
• Operating system (OS)
• Ubuntu? Red Hat?

Slide 40

Slide 40 text

The many choices in K8s
• Container Runtime
• Docker? containerd? CRI-O?

Slide 41

Slide 41 text

The many choices in K8s
• K8s distro
• Community editions
• kubeadm? Rancher?
• Commercial editions
• OpenShift? VMware Tanzu?

Slide 42

Slide 42 text

The many choices in K8s
• CNI (Container Network Interface)
• Flannel? Calico? Cilium?

Slide 43

Slide 43 text

Putting it all together... https://soco-st.com/18158

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

No need! Just use Google Cloud ☺

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

Let me give you a default set of choices!
• Operating system (OS): Ubuntu
• K8s distro: kubeadm
• Container Runtime: Docker
• CNI (Container Network Interface): Flannel
• CRI (Container Runtime Interface): cri-dockerd
https://soco-st.com/21673 https://en.m.wikipedia.org/wiki/File:UbuntuCoF.svg https://www.docker.com/company/newsroom/media-resources/

Slide 48

Slide 48 text

And that's not all https://soco-st.com/18158

Slide 49

Slide 49 text

Components commonly installed on K8s
• StorageClass (storage)
• Metrics Server (K8s extension)
• Argo CD (deployment)
• Prometheus + Grafana (monitoring)

Slide 50

Slide 50 text

Introduction to Kubernetes components

Slide 51

Slide 51 text

https://mrdevops.hashnode.dev/kubernetes-architecture

Slide 52

Slide 52 text

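Key K8s components
• kubelet: the main agent on each node; makes sure the node's Pods and their containers are running properly
• kube-apiserver: the core of the control plane; serves the Kubernetes HTTP API
• kube-scheduler: the scheduler; assigns each Pod to a suitable node
https://github.com/coredns https://github.com/etcd-io/etcd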

Slide 53

Slide 53 text

Key K8s components
• etcd: a key-value database offering consistency and high availability
• CoreDNS
• Network CNI
• Container Runtime (CRI)
https://github.com/coredns https://github.com/etcd-io/etcd

Slide 54

Slide 54 text

https://www.cncf.io/

Slide 55

Slide 55 text

And that's just the core K8s components

Slide 56

Slide 56 text

Add the surrounding ecosystem and there's even more. And more…

Slide 57

Slide 57 text

https://blog.jks.coffee/on-premise-self-host-kubernetes-k8s-setup-redhat https://blog.jks.coffee/on-premise-self-host-kubernetes-k8s-setup-ubuntu

Slide 58

Slide 58 text

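Rough steps
• <on every node> Disable swap
• <on every node> Install Docker
• <on every node> Install kubelet, kubeadm, and kubectl
• <on every node> Install cri-dockerd
• <on every node> Configure /etc/hosts
• Set up the control plane node
• Set up the worker nodes
• Install the Helm package manager
• Install the Flannel CNI
• Test and verify the cluster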
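For the "set up the control plane node" step, the choices above (Docker through cri-dockerd, Flannel as the CNI) can be captured in a kubeadm configuration file. This is a sketch under stated assumptions rather than the speaker's exact setup: the file name is hypothetical, `v1beta3` is assumed to be accepted by the installed kubeadm (newer releases also offer `v1beta4`), and the Pod subnet is Flannel's default.

```yaml
# kubeadm-config.yaml (hypothetical file name)
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock   # reach Docker Engine through cri-dockerd
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.244.0.0/16                      # Flannel's default Pod network CIDR
```

It would be passed to `kubeadm init --config kubeadm-config.yaml` on the control plane node; worker nodes then join with the `kubeadm join` command that `kubeadm init` prints, adding the same CRI socket.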

Slide 59

Slide 59 text

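GKE vs. on-premises
• One is in the cloud, the other on-premises (obviously)
• Pay-as-you-go (flexible billing) vs. N million per physical machine
• Data center power, air conditioning, access control, fire protection, compliance...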

Slide 60

Slide 60 text

https://www.ithome.com.tw/tech/87704

Slide 61

Slide 61 text

https://www.kubecost.com/kubernetes-autoscaling/kubernetes-hpa/
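The linked article covers the Horizontal Pod Autoscaler (HPA), which adds or removes Pod replicas based on load. A minimal sketch, assuming it targets the `my-deployment` Deployment from earlier in this deck, that Metrics Server is installed, and that the containers declare CPU requests; the replica bounds and the 70% target are assumptions.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-deployment-hpa        # hypothetical name
  namespace: my-namespace
spec:
  scaleTargetRef:                # the workload whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 1                 # assumption
  maxReplicas: 5                 # assumption
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # scale out when average CPU exceeds 70% of requests
```

The HPA only adds Pods; if the existing worker nodes are full, the new Pods stay pending until more nodes are available.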

Slide 62

Slide 62 text

When the service is fully loaded and needs to scale out...
Worker node, Worker node, ?

Slide 63

Slide 63 text

What if they are physical machines?
Physical machine, physical machine, ?
💰 💰 💰 💰 💰 💰💰💰💰💰💰 💰 💰 💰💰💰💰💰 💰 💰

Slide 64

Slide 64 text

No need! Just use Google Cloud ☺

Slide 65

Slide 65 text

Google Kubernetes Engine (GKE) architecture

Slide 66

Slide 66 text

https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-architecture

Slide 67

Slide 67 text

Let's spin up a GKE cluster
• Choose Autopilot or Standard mode
• Choose a region
• Choose a Kubernetes version
https://soco-st.com/21673

Slide 68

Slide 68 text

Two GKE modes
• Autopilot mode
• Standard mode (choose this one for GPUs)
https://console.cloud.google.com/kubernetes/add

Slide 69

Slide 69 text

https://k8s.ithome.com.tw/2024/workshop-page/3348

Slide 70

Slide 70 text

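Region
• Distance (response latency)
• Spanning AZs (availability zones) adds fault tolerance (partition tolerance) but incurs cross-zone traffic charges
• All of this shows up directly in the bill 💰
Taiwan region (asia-east1)
https://mapsvg.com/maps/world https://cloud.google.com/kubernetes-engine/pricing?hl=zh-tw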

Slide 71

Slide 71 text

K8s version
• Availability vs. version stability
• Consider the release just before the latest; it is usually the more stable one

Slide 72

Slide 72 text

Experienced users can use Terraform or the CLI https://soco-st.com/18158

Slide 73

Slide 73 text

Connecting to GKE
• Open Cloud Shell
• Select the project
• Load the credentials
gcloud config set project [PROJECT_ID]
gcloud container clusters get-credentials [CLUSTER_NAME] \
    --region=[COMPUTE_REGION]
https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl

Slide 74

Slide 74 text

Cloud: GKE

Slide 75

Slide 75 text

On-premises: kubeadm + Flannel

Slide 76

Slide 76 text

https://www.onlogic.com/blog/what-is-a-gpu-a-beginners-guide/ About GPUs

Slide 77

Slide 77 text

Want to play with LLMs on-premises?

Slide 78

Slide 78 text

First, you need an NVIDIA card (just kidding)

Slide 79

Slide 79 text

https://mises.org/mises-daily/understanding-price-money

Slide 80

Slide 80 text

GPU-related pieces
• NVIDIA driver
• NVIDIA CUDA
• GPU Operator

Slide 81

Slide 81 text

Key GPU Operator components
• Device Plugin
• GPU Feature Discovery (GFD)
• DCGM
• DCGM Exporter
• …

Slide 82

Slide 82 text

https://info.nvidia.com/how-to-use-gpus-on-kubernetes-webinar.html

Slide 83

Slide 83 text

https://info.nvidia.com/how-to-use-gpus-on-kubernetes-webinar.html

Slide 84

Slide 84 text

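Rough steps for K8s with GPUs
• Install the NVIDIA driver (the .run installer)
• Install NVIDIA CUDA
• Install the NVIDIA Container Toolkit
• Run the command that patches the config to hook into containerd
• Install Kubernetes
• Install the GPU Operator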
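Once these steps are done, a quick way to check that GPUs are schedulable is a one-off Pod that requests one GPU and runs `nvidia-smi`. This is a sketch, not part of the original talk: the Pod name is hypothetical and the image tag is an assumption, so substitute any CUDA base image you have access to.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test                             # hypothetical name
spec:
  restartPolicy: Never                             # run once and stop
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed tag; any CUDA base image should do
      command: ["nvidia-smi"]                      # prints the GPUs visible inside the container
      resources:
        limits:
          nvidia.com/gpu: 1                        # resource advertised by the NVIDIA device plugin
```

If the Pod completes and `kubectl logs gpu-smoke-test` shows the GPU table, the driver, container toolkit, and device plugin are working together.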

Slide 85

Slide 85 text

https://realfood.tesco.com/recipes/rainbow-cake.html How do you slice a GPU?

Slide 86

Slide 86 text

https://aws.amazon.com/tw/blogs/containers/gpu-sharing-on-amazon-eks-with-nvidia-time-slicing-and-accelerated-ec2-instances/ https://developer.nvidia.com/blog/improving-gpu-utilization-in-kubernetes/

Slide 87

Slide 87 text

GPU partitioning methods: there appear to be five, but really there are only two https://soco-st.com/18158

Slide 88

Slide 88 text

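Time slicing
• Works by time-division multiplexing
• vRAM is not limited per workload
MPS (Multi-Process Service)
• Shares the GPU among multiple processes running concurrently
• Each share gets a fixed amount of vRAM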
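With the GPU Operator, time slicing is typically enabled through a ConfigMap read by the device plugin. The sketch below follows the shape used in NVIDIA's documentation; the ConfigMap name, the `gpu-operator` namespace, the `any` key, and the replica count of 4 are assumptions for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config        # hypothetical name
  namespace: gpu-operator          # assumption: namespace where the operator is installed
data:
  any: |-                          # assumed key; the key is the name of this configuration
    version: v1
    flags:
      migStrategy: none
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 4            # each physical GPU is advertised as 4 schedulable GPUs
```

The operator's ClusterPolicy has to reference this ConfigMap before it takes effect. As the slide notes, the replicas share the GPU's memory without any per-Pod limit.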

Slide 89

Slide 89 text

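MIG (Multi-Instance GPU)
• Partitions the GPU at the hardware level
• Available only on specific models (for example, A100 and H100)
• Blackwell or Hopper™ series
vGPU (virtual GPU)
• NVIDIA's GPU virtualization support
• Requires a software license
https://www.nvidia.com/en-us/technologies/multi-instance-gpu/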
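From a workload's point of view, a MIG slice is requested like any other extended resource. The sketch below assumes the GPU Operator runs with the mixed MIG strategy on a card that offers the `1g.5gb` profile (for example, a 40 GB A100); the Pod name and image tag are also assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mig-smoke-test                             # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed tag
      command: ["nvidia-smi", "-L"]                # lists the GPU or MIG devices visible to the container
      resources:
        limits:
          nvidia.com/mig-1g.5gb: 1                 # one hardware-isolated slice, not a whole GPU
```

Unlike time slicing, each slice has its own dedicated memory and compute, so neighbours cannot crowd each other out.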

Slide 90

Slide 90 text

https://www.nvidia.com/zh-tw/data-center/resources/vgpu-evaluation/

Slide 91

Slide 91 text

https://www.nvidia.com/zh-tw/data-center/resources/vgpu-evaluation/

Slide 92

Slide 92 text

GPU mode comparison
• Time slicing: memory is not limited, so processes contend with each other
• MPS: split evenly in software
• MIG: partitioned at the hardware level
https://cloud.google.com/kubernetes-engine/docs/concepts/timesharing-gpus

Slide 93

Slide 93 text

https://www.youtube.com/watch?v=Q2GuTUO170w

Slide 94

Slide 94 text

Adding GPUs to the cluster
• Choose a card
• GPU sharing
• Mind the budget 💰

Slide 95

Slide 95 text

Adding GPUs to the cluster
• Choose a card
• GPU sharing
• Mind the budget 💰

Slide 96

Slide 96 text

Adding GPUs to the cluster https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#gcloud

Slide 97

Slide 97 text

Are they available in the Taiwan region? Yes!
Taiwan region (asia-east1)
List the GPUs/TPUs available on Google Cloud:
gcloud compute accelerator-types list

NAME                    ZONE           DESCRIPTION
nvidia-l4               asia-east1-a   NVIDIA L4
nvidia-l4-vws           asia-east1-a   NVIDIA L4 Virtual Workstation
nvidia-tesla-p100       asia-east1-a   NVIDIA Tesla P100
nvidia-tesla-p100-vws   asia-east1-a   NVIDIA Tesla P100 Virtual Workstation
nvidia-tesla-t4         asia-east1-a   NVIDIA T4
nvidia-tesla-t4-vws     asia-east1-a   NVIDIA Tesla T4 Virtual Workstation
nvidia-l4               asia-east1-b   NVIDIA L4
nvidia-l4-vws           asia-east1-b   NVIDIA L4 Virtual Workstation
nvidia-l4               asia-east1-c   NVIDIA L4
nvidia-l4-vws           asia-east1-c   NVIDIA L4 Virtual Workstation
nvidia-tesla-p100       asia-east1-c   NVIDIA Tesla P100
nvidia-tesla-p100-vws   asia-east1-c   NVIDIA Tesla P100 Virtual Workstation
nvidia-tesla-t4         asia-east1-c   NVIDIA T4
nvidia-tesla-t4-vws     asia-east1-c   NVIDIA Tesla T4 Virtual Workstation
nvidia-tesla-v100       asia-east1-c   NVIDIA V100
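On GKE Standard, a Pod can be steered to a node pool with a particular accelerator through a node selector. The sketch below targets the L4 from the list above and assumes a GPU node pool with that accelerator already exists; the Pod name and image tag are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: l4-test                                       # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-l4       # label GKE puts on nodes with this GPU type
  containers:
    - name: cuda
      image: nvidia/cuda:11.0.3-runtime-ubuntu20.04   # assumed tag
      command: ["sleep", "infinity"]                  # keep the Pod alive so you can exec into it
      resources:
        limits:
          nvidia.com/gpu: 1
```

The accelerator name in the selector is the same value shown in the NAME column of the listing.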

Slide 98

Slide 98 text

Playing with open LLMs
• Gemma: an open LLM built from the same research and technology used to create the Gemini models
• Ollama https://ollama.com/
• Open WebUI https://openwebui.com/
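One way these pieces might be wired together on a cluster is sketched below: Ollama serves the model on its default port 11434, and Open WebUI talks to it through a Service. This is an illustrative assumption, not the speaker's deployment; resource names, the single GPU, and the lack of persistent storage (downloaded models are lost when the Pod restarts) are simplifications.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama
          ports:
            - containerPort: 11434       # Ollama's default API port
          resources:
            limits:
              nvidia.com/gpu: 1          # assumption: one GPU for inference
---
apiVersion: v1
kind: Service
metadata:
  name: ollama                           # gives Open WebUI a stable DNS name
spec:
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: open-webui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: open-webui
  template:
    metadata:
      labels:
        app: open-webui
    spec:
      containers:
        - name: open-webui
          image: ghcr.io/open-webui/open-webui:main
          env:
            - name: OLLAMA_BASE_URL      # where the UI finds the Ollama API
              value: http://ollama:11434
          ports:
            - containerPort: 8080        # Open WebUI's default HTTP port
```

A Gemma model could then be pulled with something like `kubectl exec deploy/ollama -- ollama pull gemma2`, and the UI reached with `kubectl port-forward deploy/open-webui 8080:8080`.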

Slide 99

Slide 99 text

https://medium.com/@dilipkashyap15/googles-new-ai-model-gemini-now-available-in-bard-here-is-how-to-use-259386d6bd68

Slide 100

Slide 100 text

https://ai.google.dev/gemma?hl=zh-tw#gemma-2

Slide 101

Slide 101 text

No content

Slide 102

Slide 102 text

No content

Slide 103

Slide 103 text

No content

Slide 104

Slide 104 text

Takeaways
• The basic components of a web service
• The components inside Kubernetes
• How to put GPUs to use

Slide 105

Slide 105 text

Q & A https://www.sherpany.com/en/resources/digital-transformation/cloud-computing/cloud-computing-definition/

Slide 106

Slide 106 text

If you want to use AI
• Vertex AI: integrate through an API https://cloud.google.com/vertex-ai
• Cloud Run: run one-off services https://cloud.google.com/run
• GKE: suited to long-running deployment architectures https://cloud.google.com/kubernetes-engine
https://theaiagency.co.nz/