GCP+GKE Deep Dive Part 2: Advanced Cluster Management

Go to Part 1: https://speakerdeck.com/premist/gcp-plus-gke-deep-dive-part-1-initial-app-development/

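Deploying a service can be daunting when you first start using GKE. This talk walks in detail through creating a GKE cluster and deploying a first application using several GCP services. It also introduces cluster management techniques you can use to operate the service reliably after it has been deployed.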

Part 2: Advanced Cluster Management
Kubernetes offers many features that help you build and manage distributed systems, but because it caters to so many use cases, it is easy to overlook which features exist.

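The second part introduces several resources and features that help you manage clusters and applications more efficiently. It covers Service Broker, which integrates with Google Cloud services so that Service Account provisioning can be managed from within GKE; CronJob, which runs a task once or automatically on a schedule; and Affinity and Pod Disruption Budget, which help maintain an application's SLO (Service Level Objective).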

Prerequisites: Suited to those who have already deployed and managed an application on Kubernetes. It is also fine to attend directly after Part 1.

Minku Lee

June 29, 2018

Transcript

  1. 3.
  2. 4.
  3. 6.
  4. 7.

    # cronjob.yml
    apiVersion: batch/v1beta1  # batch/v1 on Kubernetes 1.21 and later
    kind: CronJob
    metadata:
      name: recurring-job
    spec:
      schedule: "*/1 * * * *"
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: recurringwork
                image: recurringwork:latest
                args:
                - ./do-recurring.sh
              restartPolicy: OnFailure
  5. 8.

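    # cronjob.yml
    apiVersion: batch/v1beta1  # batch/v1 on Kubernetes 1.21 and later
    kind: CronJob
    metadata:
      name: recurring-job
    spec:
      schedule: "*/1 * * * *"
      concurrencyPolicy: Replace
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: recurringwork
                image: recurringwork:latest
                args:
                - ./do-recurring.sh
              restartPolicy: OnFailure

    If the previous job has not fully finished when the next one is due to start:
    • Allow: permit concurrent runs
    • Forbid: do not permit concurrent runs (the new run is skipped)
    • Replace: terminate the running job and start the new one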
  6. 11.
  7. 12.
  8. 13.
  9. 14.
  10. 16.

    # deployment.yml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: gitlab
      labels:
        app: gitlab
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: gitlab
      template:
        metadata:
          labels:
            app: gitlab  # must match spec.selector.matchLabels
        spec:
          nodeSelector:
            cloud.google.com/gke-preemptible: "true"
          containers:
          - name: gitlab
            image: gitlab/gitlab-ce:latest
            resources:
              requests:
                cpu: "0.5"
                memory: 1Gi
            env:
            - name: GITLAB_OMNIBUS_CONFIG
              value: ...
  11. 19.
  12. 25.

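    # pod-gpu-and-nodepool.yml
    spec:
      affinity:
        nodeAffinity:
          # nodeSelectorTerms is only valid under the "required" rule,
          # so the GPU constraint is written as a hard requirement.
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-accelerator
                operator: In
                values:
                - nvidia-tesla-p100
                - nvidia-tesla-v100
          # Soft preference for a node pool.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1  # required field (1-100); the value here is illustrative
            preference:
              matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                - pool-a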
  13. 27.

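    # different-zone-preferred.yml
    spec:
      affinity:
        # Anti-affinity, to match the "different zone" intent of the filename.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1  # required field (1-100); the value here is illustrative
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - gitlab
              topologyKey: failure-domain.beta.kubernetes.io/zone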
  14. 30.
  15. 35.

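    Pod Disruption Budget (BETA)
    • Enforces that a set number of Pods matching a condition is always kept running
    • Configured with the minAvailable and maxUnavailable options
    • ⚠ A PDB is honored only for voluntary disruptions, so it may not hold when a node fails unexpectedly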
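The slide describes the feature without showing a manifest. As a minimal sketch, a PodDisruptionBudget covering the `app: gitlab` Pods from the earlier Deployment might look like the following. The name `gitlab-pdb` is hypothetical, and `policy/v1` assumes Kubernetes 1.21 or later; clusters from the time of this talk would have used `policy/v1beta1`.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gitlab-pdb  # hypothetical name, not from the slides
spec:
  # Set exactly one of minAvailable or maxUnavailable.
  minAvailable: 1
  selector:
    matchLabels:
      app: gitlab
```

Note that with a single replica, `minAvailable: 1` causes a voluntary eviction such as `kubectl drain` to be refused until another matching Pod is running.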
  16. 37.

  17. 38.

    Node Draining

    $ kubectl cordon NODE
    Marks the given node as unschedulable.

    $ kubectl drain NODE
    Marks the given node as unschedulable and removes each Pod running on it.
  18. 39.

    kubectl cordon

    $ kubectl cordon gke-my-cluster-my-pool-592cda94-2w25
    node "gke-my-cluster-my-pool-592cda94-2w25" cordoned
    $ kubectl describe node gke-my-cluster-my-pool-592cda94-2w25
    Events:
      Type    Reason              Age  From          Message
      ----    ------              ---- ----          -------
      Normal  NodeNotSchedulable  45s  kubelet, ...  Node status is now: NodeNotSchedulable
  19. 50.
  20. 51.


  21. 55.

    Installing Service Catalog
    https://cloud.google.com/kubernetes-engine/docs/how-to/add-on/service-catalog/install-service-catalog

    $ kubectl create clusterrolebinding cluster-admin-binding \
        --clusterrole=cluster-admin \
        --user=$(gcloud config get-value account)
    clusterrolebinding "cluster-admin-binding" created
  22. 56.
  23. 57.

    Installing Service Catalog
    https://cloud.google.com/kubernetes-engine/docs/how-to/add-on/service-catalog/install-service-catalog

    $ sc install
    account: account@example.com
    project: shakr-openinfra-demo
    zone:
    generated service catalog deployment config in dir: /tmp/service-catalog544428136
    Service Catalog installed successfully.
  24. 58.

    Installing Service Catalog
    https://cloud.google.com/kubernetes-engine/docs/how-to/add-on/service-catalog/install-service-catalog

    $ sc add-gcp-broker
    using project: shakr-openinfra-demo
    enabling a GCP API: servicebroker.googleapis.com
    enabling a GCP API: bigtableadmin.googleapis.com
    enabling a GCP API: ml.googleapis.com
    ...
    The Service Broker has been added successfully.
  25. 59.

    Installing Service Catalog
    https://cloud.google.com/kubernetes-engine/docs/how-to/add-on/service-catalog/install-service-catalog

    $ gcloud projects add-iam-policy-binding $PROJECT_ID \
        --member serviceAccount:$EMAIL \
        --role=roles/owner
  26. 60.

    Verifying Service Catalog

    $ kubectl -o "custom-columns=NAME:.spec.externalName,DESCRIPTION:.spec.description" \
        get clusterserviceclasses
    NAME                       DESCRIPTION
    cloud-spanner              The first horizontally scalable...
    cloud-iam-service-account  Specialized service which provisions...
    cloud-pubsub               Ingest event streams from anywhere...
    cloud-sql-mysql            A fully-managed MySQL database service
    bigquery                   A fast, highly scalable, cost-effective
    cloud-bigtable             A high performance NoSQL database
  27. 61.

    # serviceinstance.yml
    apiVersion: servicecatalog.k8s.io/v1beta1
    kind: ServiceInstance
    metadata:
      name: test-storage
      namespace: default
    spec:
      clusterServiceClassExternalName: cloud-storage
      clusterServicePlanExternalName: beta
      parameters:
        bucketId: shakr-openinfra-demo-test-storage
        location: US
        storageClass: STANDARD
  28. 63.
  29. 64.
  30. 65.

  31. 69.

    Creating a Service Instance with svcat

    $ svcat provision test-storage \
        --class cloud-storage \
        --plan beta \
        --namespace default \
        --param bucketId=shakr-openinfra-demo-test-storage \
        --param location=US \
        --param storageClass=STANDARD
  32. 71.
  33. 73.

    Creating a Service Binding

    $ svcat bind test-storage \
        --name test-storage-binding \
        --params-json \
        '{
          "serviceAccount": "test-storage-serviceaccount",
          "createServiceAccount": true,
          "roles": [
            "roles/storage.objectCreator",
            "roles/storage.objectViewer"
          ]
        }'
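The same binding can also be declared as a manifest instead of through `svcat`. The following is a minimal sketch, assuming the `servicecatalog.k8s.io/v1beta1` ServiceBinding resource and reusing the names and parameters from the command above. Setting `secretName` explicitly is a choice made here for clarity and is not shown in the slides.

```yaml
apiVersion: servicecatalog.k8s.io/v1beta1
kind: ServiceBinding
metadata:
  name: test-storage-binding
  namespace: default
spec:
  # The ServiceInstance this binding attaches to.
  instanceRef:
    name: test-storage
  # Secret that receives the credentials (assumed to mirror the binding name).
  secretName: test-storage-binding
  parameters:
    serviceAccount: test-storage-serviceaccount
    createServiceAccount: true
    roles:
    - roles/storage.objectCreator
    - roles/storage.objectViewer
```

Keeping the binding in a manifest lets it be versioned and applied alongside the ServiceInstance with `kubectl apply -f`.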
  34. 75.

    # deployment.yml (pod spec)
    spec:
      volumes:
      - name: test-storage-binding
        secret:
          secretName: test-storage-binding
      containers:
      - name: my-app
        image: shakr/my-app:latest
        volumeMounts:
        - name: test-storage-binding  # must match the volume name above
          mountPath: /mnt/binding
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /mnt/binding/privateKeyData
        - name: STORAGE_PROJECT
          valueFrom:
            secretKeyRef:
              name: test-storage-binding  # assumed to be the Secret created by the binding
              key: projectId
        - name: STORAGE_BUCKET
          valueFrom:
            secretKeyRef:
              name: test-storage-binding
              key: bucketId
  35. 76.

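    Service Catalog TL;DR
    • From a GKE (Kubernetes) cluster, GCP services can be created as Service Instances and used right away
    • There is no need to worry about exposing a Service Account JSON key
    • ⚠ Take care: deleting a Service Instance also deletes the underlying resource (GCS bucket, SQL instance, and so on)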
  36. 78.
  37. 84.

  38. 86.

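    Service Catalog
    • From a GKE (Kubernetes) cluster, GCP services can be created as Service Instances and used right away
    • There is no need to worry about exposing a Service Account JSON key
    • ⚠ Take care: deleting a Service Instance also deletes the underlying resource (GCS bucket, SQL instance, and so on)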