プロダクションレディ Pods+ / Production-Ready Pods Plus

Container X mas Party with flexy, 2018/12/18
Kazuki Suda <[email protected]> @superbrothers
Production-Ready Pods


Agenda
1. What are Pods?
2. Production-ready Pods

What are Pods?

Pods
▶ Multiple containers and multiple volumes
▶ The smallest unit of deployment
▶ IP-per-Pod
[Diagram: a Pod containing a Volume, a Web server container, and a File Puller container]

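[Diagram: Pod A (IP: 10.60.0.5, with containers and a volume), Pod B (IP: 10.60.1.7), and Pod C (IP: 10.60.1.8), placed on node 1 and node 2]
▶ The containers in a Pod always run on the same node.
▶ Each Pod has an IP address on a flat network and can communicate with other Pods across nodes.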

Pods

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.13.7
        ports:
        - containerPort: 80

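Deployment
▶ Replicas: 3
▶ Selector: app=nginx
▶ Pod template: the template for Pod objects. Pod replicas are created from this configuration.
▶ The created Pods are automatically scheduled onto suitable nodes and run there.
[Diagram: a Deployment whose Pod template is labeled app: nginx, with Pod A, Pod B, and Pod C (each labeled app: nginx) on node 1 and node 2]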
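The selector on the Deployment slide (app=nginx) picks Pods by label. As a rough illustration of that matching rule, here is a minimal Go sketch. The `matches` helper and the `tier` label are hypothetical and only illustrate equality-based selection; this is not the Kubernetes implementation.

```go
package main

import "fmt"

// matches reports whether every key/value pair in selector is present in labels.
// Hypothetical helper for illustration; it only covers equality-based selectors
// such as app=nginx.
func matches(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"app": "nginx"}

	// A Pod may carry extra labels ("tier" is an assumed example) and still match.
	fmt.Println(matches(selector, map[string]string{"app": "nginx", "tier": "web"})) // true
	fmt.Println(matches(selector, map[string]string{"app": "kuard"}))                // false
}
```

Extra labels on a Pod do not prevent a match; only the pairs named in the selector are compared.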

Pod lifecycle
Building Containers → Scheduling → Running → Terminating

Production-Ready Pods

Building Containers
▶ The Twelve-Factor App
  + Store configuration in environment variables
  + Write logs to stdout and stderr, etc.
▶ One process per container
▶ Use a small base image
▶ Do not install unnecessary packages
  + multi-stage builds

multi-stage builds
Source: Use multi-stage builds | Docker Documentation

    FROM golang:1.7.3
    WORKDIR /go/src/github.com/alexellis/href-counter/
    RUN go get -d -v golang.org/x/net/html
    COPY app.go .
    RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

    FROM alpine:latest
    RUN apk --no-cache add ca-certificates
    WORKDIR /root/
    COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
    CMD ["./app"]

Scheduling
▶ Problem: a Pod gets scheduled onto a node that does not have enough resources left to run it
  + Specify the minimum resources the container needs
  + resource requests

resource requests
▶ The minimum resources a container needs in order to run
▶ Guarantees that the sum of the resource requests of the containers scheduled on a node does not exceed the node's capacity

    apiVersion: v1
    kind: Pod
    metadata:
      name: kuard
    spec:
      containers:
      - image: gcr.io/kuar-demo/kuard-
        name: kuard
        resources:
          requests:
            cpu: "500m"
            memory: "128Mi"
            ephemeral-storage: "2Gi"
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
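The guarantee described on the resource requests slide amounts to a simple sum check per node. The Go sketch below illustrates that arithmetic only; the `Resources` type, the `fits` function, and the node size are hypothetical, and the real scheduler considers many more factors.

```go
package main

import "fmt"

// Resources holds requests in the units used by the manifest:
// CPU in millicores ("500m" is 500) and memory in Mi ("128Mi" is 128).
type Resources struct {
	MilliCPU int64
	MemoryMi int64
}

// fits reports whether pod can be placed on a node: the requests already
// scheduled there plus the new requests must stay within the capacity.
func fits(capacity Resources, scheduled []Resources, pod Resources) bool {
	total := pod
	for _, r := range scheduled {
		total.MilliCPU += r.MilliCPU
		total.MemoryMi += r.MemoryMi
	}
	return total.MilliCPU <= capacity.MilliCPU && total.MemoryMi <= capacity.MemoryMi
}

func main() {
	kuard := Resources{MilliCPU: 500, MemoryMi: 128}  // requests from the manifest
	node := Resources{MilliCPU: 1000, MemoryMi: 1024} // assumed node capacity

	fmt.Println(fits(node, []Resources{kuard}, kuard))        // true: 1000m of 1000m
	fmt.Println(fits(node, []Resources{kuard, kuard}, kuard)) // false: 1500m exceeds 1000m
}
```

The check uses requests, not actual usage, which is why a container without requests can land on a node that is already busy.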

ephemeral-storage (v1.13 beta)
▶ Temporary storage for containers
  + emptyDir volumes, container logs
  + container writable layers (not counted when a runtime partition is used)

The manifest is the same as on the resource requests slide; the relevant lines are:

    resources:
      requests:
        cpu: "500m"
        memory: "128Mi"
        ephemeral-storage: "2Gi"

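Running
▶ Problem: a container uses up all the resources of its node
  + Set an upper limit on the container's resource usage
  + resource limits
▶ Problem: the application hangs, or requests arrive before it is ready to serve them
  + Configure health checks for the container
  + Liveness probe (is it alive?)
  + Readiness probe (can it respond?)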

resource limits
▶ Set an upper limit on a container's resource usage
  + cpu: usage never goes above the limit (the container is throttled)
  + memory: the container is forcibly terminated when it exceeds the limit (OOM)
  + ephemeral-storage: forcibly terminated when usage exceeds the limit

    apiVersion: v1
    kind: Pod
    metadata:
      name: kuard
    spec:
      containers:
      - image: gcr.io/kuar-demo/kuard-am
        name: kuard
        resources:
          requests:
            cpu: "500m"
            memory: "128Mi"
            ephemeral-storage: "2Gi"
          limits:
            cpu: "1000m"
            memory: "256Mi"
            ephemeral-storage: "4Gi"
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP

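ephemeral-storage: forcibly terminated when usage exceeds the limit (v1.13 beta)
▶ Pod level
  + When the ephemeral storage used by all containers exceeds the sum of their limits
▶ Container level
  + When container logs plus the writable layer exceed the container's limit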
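The two ephemeral-storage rules above can be read as two comparisons. The Go sketch below is an interpretation of the slide, not the kubelet's code: the `Container` type and both functions are hypothetical, and counting emptyDir usage at the Pod level is an assumption based on the deck listing emptyDir volumes as ephemeral storage.

```go
package main

import "fmt"

// Container holds ephemeral-storage usage and the limit for one container, in Mi.
type Container struct {
	LogsMi          int64
	WritableLayerMi int64
	LimitMi         int64
}

// containerOverLimit applies the container-level rule: logs plus the
// writable layer exceed the container's own limit.
func containerOverLimit(c Container) bool {
	return c.LogsMi+c.WritableLayerMi > c.LimitMi
}

// podOverLimit applies the Pod-level rule: total usage across all containers
// (plus emptyDir usage, an assumption of this sketch) exceeds the sum of limits.
func podOverLimit(containers []Container, emptyDirMi int64) bool {
	used, limit := emptyDirMi, int64(0)
	for _, c := range containers {
		used += c.LogsMi + c.WritableLayerMi
		limit += c.LimitMi
	}
	return used > limit
}

func main() {
	// 4096Mi corresponds to the "4Gi" limit in the manifest.
	c := Container{LogsMi: 1024, WritableLayerMi: 2048, LimitMi: 4096}

	fmt.Println(containerOverLimit(c))              // false: 3072Mi is within 4096Mi
	fmt.Println(podOverLimit([]Container{c}, 2048)) // true: 5120Mi exceeds 4096Mi
}
```

The example shows why a Pod can be terminated even though no single container is over its own limit.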

Liveness probe
▶ Monitors whether the process inside the container is alive
▶ On failure, the container is forcibly restarted

    spec:
      containers:
      - image: gcr.io/kuar-demo/kuard-amd64:1
        name: kuard
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /healthy
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 1
          periodSeconds: 10
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 30
          timeoutSeconds: 1
          periodSeconds: 10
          failureThreshold: 3

Liveness probe
▶ exec: run a command
  + Healthy if the exit code is 0
▶ httpGet: send an HTTP GET request
  + Healthy if the status code is 200 or more and less than 400
▶ tcpSocket: open a TCP socket
  + Healthy if the connection is established

    // Liveness endpoint: always answers 200 OK while the process can serve HTTP.
    http.HandleFunc("/healthy", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("OK"))
    })
    http.ListenAndServe(":8080", nil)

Readiness Probe
▶ Checks whether the process inside the container can respond to requests
▶ On failure, the Pod stops receiving traffic through Services (Unready)

The manifest is the same as on the Liveness probe slide; the relevant lines are:

    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 30
      timeoutSeconds: 1
      periodSeconds: 10
      failureThreshold: 3

Services
▶ A virtual IP and port
▶ Groups Pods by label selector
▶ Service types:
  + ClusterIP
  + NodePort
  + LoadBalancer
[Diagram: a Service (VIP: 10.0.0.249, Selector: app=web) in front of two Pods labeled app: web that belong to a ReplicaSet]

Services

    apiVersion: v1
    kind: Service
    metadata:
      name: kuard
    spec:
      type: ClusterIP
      selector:
        app: kuard
      ports:
      - protocol: TCP
        port: 8080
        targetPort: 8080

    http.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
        message := ""
        if !isDataLoaded() {
            message += "Initial data is not loaded\n"
        }
        if len(message) > 0 {
            // Send 503
            http.Error(w, message, http.StatusServiceUnavailable)
        } else {
            w.Write([]byte("OK"))
        }
    })
    http.ListenAndServe(":8080", nil)

Terminating
▶ Problem: requests arrive while the Pod is shutting down and receive errors
  + Shut the container down gracefully
  + terminationGracePeriodSeconds
  + shareProcessNamespace
  + preStop hook / SIGTERM handling

terminationGracePeriodSeconds
▶ The number of seconds the Pod needs to terminate gracefully
▶ Once this many seconds have elapsed, SIGKILL is sent to the containers

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: kuard
    spec:
      selector:
        matchLabels:
          app: kuard
      template:
        metadata:
          labels:
            app: kuard
        spec:
          terminationGracePeriodSeconds: 60
          shareProcessNamespace: true
          containers:
          - image: gcr.io/kuar-demo/kuard-amd64:1
            name: kuard
            ports:
            - containerPort: 8080
              name: http
              protocol: TCP

shareProcessNamespace (v1.13 beta)
▶ Shares a PID namespace among the containers in a Pod
▶ Solves the PID 1 problem for containers
  + Reference: "PID1 は、カーネルから特別扱いされてるって本当ですか？" (Is it true that PID 1 gets special treatment from the kernel?)

The manifest is the same Deployment as on the terminationGracePeriodSeconds slide; the relevant lines in the Pod template spec are:

    terminationGracePeriodSeconds: 60
    shareProcessNamespace: true

    $ kubectl attach -it nginx -c shell
    If you don't see a command prompt, try pressing enter.
    / # ps aux
    PID   USER     TIME  COMMAND
        1 root      0:00 /pause
        5 root      0:00 nginx: master process nginx -g daemon off;
        9 101       0:00 nginx: worker process
       10 root      0:00 sh
       15 root      0:00 ps aux

preStop hook
▶ Runs first in the Pod's termination phase (optional)
  + Note that it must block (it runs synchronously)
▶ SIGTERM is sent after the preStop hook has run

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: nginx
      name: nginx
    spec:
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - image: nginx:1.13.7
            name: nginx
            lifecycle:
              preStop:
                exec:
                  command: ["/bin/sh", "-c", "sleep 2; nginx -s quit; sleep 3"]

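Pod termination sequence
Source: Kubernetes: 詳解 Pods の終了 - Qiita
[Diagram: timeline showing new connections and established connections against preStop processing and SIGTERM processing]
▶ Pod termination starts
  + (Optional) the preStop hook runs
  + The Pod is removed from the targets of Services: kube-proxy updates its iptables rules, and from then on no new connections arrive
▶ Established connections must be closed gracefully in the preStop hook or in the SIGTERM handler
▶ SIGTERM is followed by SIGKILL if the containers have not exited once GracePeriodSeconds has elapsed (default: 30 seconds)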

Summary

Summary
▶ Building Containers
  + Follow best practices
▶ Scheduling
  + Set the minimum resources needed (resource requests)
▶ Running
  + Configure health checks (Liveness / Readiness probe)
  + Set upper limits on resource usage (resource limits)
▶ Terminating
  + Shut down gracefully (preStop hook / shareProcessNamespace / SIGTERM handling)

Questions?
