Slide 1

Slide 1 text

Container Networking
Hirofumi Ichihara, LINE Corporation

Slide 2

Slide 2 text

ABOUT ME
• Hirofumi Ichihara (市原 裕史)
• LINE Corporation
• Infrastructure Platform Department, Verda2 Team
• Network Software Developer
• SDN/NFV
• OpenStack Neutron
• Docker
• Kubernetes

Slide 3

Slide 3 text

From virtual machines to containers
• 1991: Linux released
• 2010: OpenStack released
• 2014: Kubernetes released
[Diagram: three stacks. Server > OS > Processes; Server > Hypervisor > VMs, each with its own OS; Server > OS > Containers.]

Slide 4

Slide 4 text

From monolithic to microservices
[Diagram: a single Process containing functions A, B, C, and D is split into four Processes, one per function.]

Slide 5

Slide 5 text

The world of containers
[Figure: a slide filled edge to edge with "Container" boxes.]

Slide 6

Slide 6 text

Controlling billions of containers
https://cloud.google.com/containers/?hl=ja

Slide 7

Slide 7 text

Container Orchestration Tool

Slide 8

Slide 8 text

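From bare metal to virtual networks
Bare metal: servers, switches, routers, firewalls, and load balancers, with FW rule management and routing management.
Virtual network: a Compute node (vSwitch, vRouter, vFW, VMs) and a Network node (vLB, vSwitch), again with FW rule management and routing management.
• VM: virtual machine
• vSwitch: virtual switch
• vRouter: virtual router
• vFW: virtual firewall
• vLB: virtual load balancer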

Slide 9

Slide 9 text

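From virtual networks to container networks
[Diagram: a Worker hosting several Containers, with Router + FW + LB + Switch functions inside the Worker itself.]
The Worker handles FW rule management, NAT management, routing management, load-balancing management, MAC address management, and session management.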

Slide 10

Slide 10 text

Container networking technologies
Router + FW + LB + Switch
• Linux bridge, veth, routing
• iptables, ipvs, ipset, conntrack
• Open vSwitch, VPP, Cilium, Tungsten Fabric

Slide 11

Slide 11 text

IP address lifecycle of a virtual machine
Before: VM 1 (192.168.100.10) and VM 2 (192.168.100.11), each on its own Compute node.
After: VM 1 (192.168.100.10) and VM 2 (192.168.100.11).
A VM comes back with the same IP address.

Slide 12

Slide 12 text

IP address lifecycle of a container
Before: Container 1 (192.168.100.10) on one Worker, Container 2 (192.168.101.11) on another.
After: Container 2 (192.168.101.11) and Container 1 (192.168.101.12).
A container comes back with a different IP address.

Slide 13

Slide 13 text

A transformation of the lifecycle

Slide 14

Slide 14 text

Kubernetes Services
Features
Virtual IPs and service proxies
• user space proxy
• iptables proxy
• ipvs proxy
Service discovery
• DNS
Service types
• ClusterIP
• NodePort
• LoadBalancer
• ExternalName
https://kubernetes.io/docs/concepts/services-networking/service/
[Diagram: a Service in front of several Pods, reached by Containers.]

Slide 15

Slide 15 text

Turning resources into a Service
[Diagram: three nginx instances behind an LB form a single Web Service.]

Slide 16

Slide 16 text

ClusterIP: creating the deployment

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-nginx
    spec:
      selector:
        matchLabels:
          run: my-nginx
      replicas: 2
      template:
        metadata:
          labels:
            run: my-nginx
        spec:
          containers:
          - name: my-nginx
            image: nginx
            ports:
            - containerPort: 80

    $ kubectl create -f my-nginx.yaml

[Diagram: one nginx Pod on Worker1 and one on Worker2.]
https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/#exposing-pods-to-the-cluster

Slide 17

Slide 17 text

ClusterIP: creating the service

    apiVersion: v1
    kind: Service
    metadata:
      name: my-nginx
      labels:
        run: my-nginx
    spec:
      ports:
      - port: 80
        protocol: TCP
      selector:
        run: my-nginx

    $ kubectl create -f svc.yaml

[Diagram: a Load Balancer on port 80 forwards to port 80 of the nginx Pods on Worker1 and Worker2.]
The protocol can be TCP or UDP. SCTP has alpha support as of v1.12.
https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/#exposing-pods-to-the-cluster

Slide 18

Slide 18 text

ClusterIP: the created resources

    $ kubectl get pods -o wide
    NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE      NOMINATED NODE
    my-nginx-6458ff55f-d2qtj   1/1     Running   0          5h34m   10.244.2.2   worker2   <none>
    my-nginx-6458ff55f-pdmgn   1/1     Running   0          5h34m   10.244.1.4   worker1   <none>

    $ kubectl get services
    NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    my-nginx   ClusterIP   10.99.125.234   <none>        80/TCP    5h17m

[Diagram: a Load Balancer at 10.99.125.234:80 forwards to nginx on Worker1 (10.244.1.4:80) and nginx on Worker2 (10.244.2.2:80).]

Slide 19

Slide 19 text

ClusterIP: the created resources

    $ curl http://10.99.125.234
    Welcome to nginx!
    …
    Thank you for using nginx.

[Diagram: the request reaches the my-nginx Service (http) at 10.99.125.234:80.]

Slide 20

Slide 20 text

Changing a Pod's IP address
[Diagram: the Load Balancer at 10.99.125.234:80 fronts nginx on Worker1 (10.244.1.4:80) and nginx on Worker2 (10.244.2.2:80); an nginx Pod then appears on Worker3 at 10.244.3.2:80.]

Slide 21

Slide 21 text

Changing a Pod's IP address
[Diagram: the my-nginx Service (http) is still reached at 10.99.125.234:80.]

Slide 22

Slide 22 text

ClusterIP: how does the traffic get there?

    $ kubectl get services
    NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    my-nginx   ClusterIP   10.99.125.234   <none>        80/TCP    5h17m

    $ ip a | grep "inet "
        inet 127.0.0.1/8 scope host lo
        inet 10.0.2.15/24 brd 10.0.2.255 scope global enp0s3
        inet 192.168.33.10/24 brd 192.168.33.255 scope global enp0s8
        inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
        inet 10.244.0.1/24 scope global cni0
        inet 10.244.0.0/32 scope global flannel.1

    $ ip r
    default via 10.0.2.2 dev enp0s3
    10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15
    10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1 linkdown
    10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
    10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink
    172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
    192.168.33.0/24 dev enp0s8 proto kernel scope link src 192.168.33.10

No network interface looks like it could reach 10.99.125.234.
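The observation above can be checked mechanically. The following is a minimal sketch, not part of the original slides: it copies the routes from the `ip r` output and runs a longest-prefix match in plain Python. The function name `longest_prefix_match` is made up for illustration, and the lookup ignores route metrics and policy routing.

```python
import ipaddress

# Routes copied from the `ip r` output on this slide: (destination, device).
ROUTES = [
    ("0.0.0.0/0", "enp0s3"),         # default via 10.0.2.2
    ("10.0.2.0/24", "enp0s3"),
    ("10.244.0.0/24", "cni0"),
    ("10.244.1.0/24", "flannel.1"),
    ("10.244.2.0/24", "flannel.1"),
    ("172.17.0.0/16", "docker0"),
    ("192.168.33.0/24", "enp0s8"),
]


def longest_prefix_match(dst, routes=ROUTES):
    """Return (prefix, device) of the most specific route containing dst."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(prefix), dev)
        for prefix, dev in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    net, dev = max(candidates, key=lambda c: c[0].prefixlen)
    return str(net), dev


# The ClusterIP only matches the default route; a Pod IP has a specific route.
print(longest_prefix_match("10.99.125.234"))
print(longest_prefix_match("10.244.2.2"))
```

Only the default route matches the ClusterIP, so the routing table alone does not explain how the Service is reached.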

Slide 23

Slide 23 text

Test environment
• Ubuntu 16.04.5
• kubeadm v1.12.2
• Kubernetes v1.12.2
• kubectl v1.12.2
• flannel v0.10.0
Nodes
• Master 192.168.33.10 (Kubernetes Controller)
• Worker1 192.168.33.11 (nginx Pod 10.244.1.4:80)
• Worker2 192.168.33.12 (nginx Pod 10.244.2.2:80)

Slide 24

Slide 24 text

ClusterIP with iptables

    $ sudo iptables-save
    …
    -A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.99.125.234/32 -p tcp -m comment --comment "default/my-nginx: cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
    -A KUBE-SERVICES -d 10.99.125.234/32 -p tcp -m comment --comment "default/my-nginx: cluster IP" -m tcp --dport 80 -j KUBE-SVC-BEPXDJBUHFCSYIC3
    …
    -A KUBE-SVC-BEPXDJBUHFCSYIC3 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-ISOVEE5LNFCUUY5T
    -A KUBE-SVC-BEPXDJBUHFCSYIC3 -j KUBE-SEP-C6VVNBMP4A5PJSYL
    …
    -A KUBE-SEP-C6VVNBMP4A5PJSYL -s 10.244.2.2/32 -j KUBE-MARK-MASQ
    -A KUBE-SEP-C6VVNBMP4A5PJSYL -p tcp -m tcp -j DNAT --to-destination 10.244.2.2:80
    …
    -A KUBE-SEP-ISOVEE5LNFCUUY5T -s 10.244.1.4/32 -j KUBE-MARK-MASQ
    -A KUBE-SEP-ISOVEE5LNFCUUY5T -p tcp -m tcp -j DNAT --to-destination 10.244.1.4:80

Exactly the same rules are added on both worker1 and worker2.

Slide 25

Slide 25 text

ClusterIP with iptables
Same iptables-save output as on Slide 24. The rules this slide describes:

    -A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.99.125.234/32 -p tcp -m comment --comment "default/my-nginx: cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
    -A KUBE-SERVICES -d 10.99.125.234/32 -p tcp -m comment --comment "default/my-nginx: cluster IP" -m tcp --dport 80 -j KUBE-SVC-BEPXDJBUHFCSYIC3

Traffic to 10.99.125.234:80 (the my-nginx service) jumps to KUBE-SVC-BEPXDJBUHFCSYIC3.

Slide 26

Slide 26 text

ClusterIP with iptables
Same iptables-save output as on Slide 24. The rules this slide describes:

    -A KUBE-SVC-BEPXDJBUHFCSYIC3 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-ISOVEE5LNFCUUY5T
    -A KUBE-SVC-BEPXDJBUHFCSYIC3 -j KUBE-SEP-C6VVNBMP4A5PJSYL

With probability 0.5 the packet jumps to KUBE-SEP-ISOVEE5LNFCUUY5T; otherwise it falls through to KUBE-SEP-C6VVNBMP4A5PJSYL.
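A minimal sketch of how such a chain of random-match rules spreads traffic evenly. The slide only shows the two-endpoint case (0.5, then an unconditional rule). Extending it to n endpoints with per-rule probabilities 1/n, 1/(n-1), ..., 1 is an assumption made here for illustration, and `rule_probabilities` and `pick_endpoint` are hypothetical names, not kube-proxy code.

```python
import random
from collections import Counter


def rule_probabilities(n):
    """Match probability of each rule in order; the last rule always matches."""
    return [1.0 / (n - i) for i in range(n)]


def pick_endpoint(endpoints, rng=random):
    """Walk the rules top to bottom, like a packet traversing the chain."""
    for endpoint, p in zip(endpoints, rule_probabilities(len(endpoints))):
        if rng.random() < p:
            return endpoint
    return endpoints[-1]  # unreachable in practice: the last p is 1.0


if __name__ == "__main__":
    endpoints = ["10.244.1.4:80", "10.244.2.2:80"]
    rng = random.Random(0)  # seeded so repeated runs give the same counts
    counts = Counter(pick_endpoint(endpoints, rng) for _ in range(10000))
    print(rule_probabilities(2))  # the 0.5 / unconditional pair from the slide
    print(counts)
```

Because each rule is only reached when all earlier rules missed, the conditional probabilities 1/n, 1/(n-1), ... give every endpoint the same overall share.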

Slide 27

Slide 27 text

ClusterIP with iptables
Same iptables-save output as on Slide 24. The rules this slide describes:

    -A KUBE-SEP-C6VVNBMP4A5PJSYL -s 10.244.2.2/32 -j KUBE-MARK-MASQ
    -A KUBE-SEP-C6VVNBMP4A5PJSYL -p tcp -m tcp -j DNAT --to-destination 10.244.2.2:80
    …
    -A KUBE-SEP-ISOVEE5LNFCUUY5T -s 10.244.1.4/32 -j KUBE-MARK-MASQ
    -A KUBE-SEP-ISOVEE5LNFCUUY5T -p tcp -m tcp -j DNAT --to-destination 10.244.1.4:80

Traffic to 10.99.125.234:80 is DNATed to the address of one of the nginx Pods, 10.244.2.2 or 10.244.1.4.
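A minimal sketch of what destination NAT does to a packet, assuming a single backend and a toy connection-tracking table. It is a simplified model rather than the kernel's conntrack. `Packet` and `ServiceNat` are hypothetical names, and the client address 10.244.1.7:40000 is invented for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    sport: int
    dst: str
    dport: int


class ServiceNat:
    """Rewrite ClusterIP traffic to one backend and undo it for replies."""

    def __init__(self, vip, backend):
        self.vip = vip            # (ip, port) of the Service
        self.backend = backend    # (ip, port) of the chosen Pod
        self.conntrack = set()    # clients with a translated connection

    def to_backend(self, pkt):
        if (pkt.dst, pkt.dport) != self.vip:
            return pkt  # not Service traffic: leave untouched
        self.conntrack.add((pkt.src, pkt.sport))
        return replace(pkt, dst=self.backend[0], dport=self.backend[1])

    def to_client(self, reply):
        known = (reply.dst, reply.dport) in self.conntrack
        if known and (reply.src, reply.sport) == self.backend:
            # Restore the ClusterIP so the client sees the address it dialed.
            return replace(reply, src=self.vip[0], sport=self.vip[1])
        return reply


nat = ServiceNat(vip=("10.99.125.234", 80), backend=("10.244.1.4", 80))
request = Packet("10.244.1.7", 40000, "10.99.125.234", 80)  # invented client
forwarded = nat.to_backend(request)
reply = nat.to_client(Packet("10.244.1.4", 80, "10.244.1.7", 40000))
print(forwarded)
print(reply)
```

The reverse translation on the reply is why the client never needs a route to, or knowledge of, the Pod address.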

Slide 28

Slide 28 text

ClusterIP with ipvs

    $ sudo ipvsadm -Ln
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    TCP  10.99.125.234:80 rr
      -> 10.244.1.4:80                Masq    1      0          0
      -> 10.244.2.2:80                Masq    1      0          0

Exactly the same rules are added on both worker1 and worker2.
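The `rr` in the output above is the round-robin scheduler. A minimal sketch of the idea with equal weights, as in the listing; `RoundRobin` is a hypothetical class for illustration and does not model weights or connection counts.

```python
from itertools import cycle


class RoundRobin:
    """Hand out real servers in a fixed rotating order."""

    def __init__(self, real_servers):
        self._rotation = cycle(list(real_servers))

    def schedule(self):
        return next(self._rotation)


virtual_service = RoundRobin(["10.244.1.4:80", "10.244.2.2:80"])
for _ in range(4):
    print(virtual_service.schedule())
```

Unlike the random iptables rules, consecutive new connections alternate deterministically between the two Pods.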

Slide 29

Slide 29 text

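Packet rewriting by iptables
[Diagram: a Pod on one Worker sends to 10.99.125.234; iptables on that Worker rewrites the destination to 10.244.1.4, and the packet is delivered to the nginx Pod at 10.244.1.4:80 on another Worker. A second nginx Pod runs at 10.244.2.2:80.]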

Slide 30

Slide 30 text

A more complex example: Network Policy

Slide 31

Slide 31 text

Network Policy
Controls which traffic, from inside or outside the cluster, is allowed to reach a Pod.
[Diagram: Pods spread across two Workers.]
https://kubernetes.io/docs/tasks/administer-cluster/declare-network-policy/
https://kubernetes.io/docs/concepts/services-networking/network-policies/

Slide 32

Slide 32 text

Test environment
• kubernetes 1.11.2-gke.18
• kubectl v1.10.7
Nodes
• Worker1 10.150.0.2
• Worker2 10.150.0.3
Pods
• Client A 10.48.1.16 (label: app=foo)
• Client B 10.48.1.17 (label: app=bar)
• hello-app (label: app=hello)

Slide 33

Slide 33 text

Network Policy: restricting incoming traffic
https://cloud.google.com/kubernetes-engine/docs/tutorials/network-policy

    $ kubectl run hello-web --labels app=hello --image=gcr.io/google-samples/hello-app:1.0 --port 8080 --expose

    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: hello-allow-from-foo
    spec:
      policyTypes:
      - Ingress
      podSelector:
        matchLabels:
          app: hello
      ingress:
      - from:
        - podSelector:
            matchLabels:
              app: foo

[Diagram: on Worker1, a Firewall in front of hello-app (label: app=hello, port 8080) admits traffic from Pods labeled app=foo.]
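A minimal sketch of how the policy above decides whether a connection is admitted, assuming `matchLabels`-only selectors and a single namespace. The dictionary layout and the names `matches` and `ingress_allowed` are made up for illustration. This is not how any CNI plugin is implemented.

```python
def matches(selector, labels):
    """True if every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hand-written equivalent of hello-allow-from-foo (namespaces ignored).
POLICY = {
    "pod_selector": {"app": "hello"},
    "ingress_from": [{"app": "foo"}],
}


def ingress_allowed(policy, dst_labels, src_labels):
    if not matches(policy["pod_selector"], dst_labels):
        return True  # destination not selected: this policy does not restrict it
    return any(matches(sel, src_labels) for sel in policy["ingress_from"])


hello = {"app": "hello"}
print(ingress_allowed(POLICY, hello, {"app": "foo"}))  # Client A
print(ingress_allowed(POLICY, hello, {"app": "bar"}))  # Client B
```

Once a Pod is selected by an Ingress policy, anything not listed under `from` is denied, which is why a Pod labeled app=bar would be rejected.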

Slide 34

Slide 34 text

Network Policy: restricting incoming traffic, results
https://cloud.google.com/kubernetes-engine/docs/tutorials/network-policy

    $ kubectl run -l app=foo --image=alpine --restart=Never --rm -i test-1 -- wget -qO- --timeout=2 http://hello-web:8080
    Hello, world!
    Version: 1.0.0
    Hostname: hello-web-76d4fc9f5b-c98kz

    $ kubectl run -l app=bar --image=alpine --restart=Never --rm -i test-1 -- wget -qO- --timeout=2 http://hello-web:8080
    If you don't see a command prompt, try pressing enter.
    wget: download timed out
    pod default/test-1 terminated (Error)

[Diagram: on Worker1, the Firewall in front of hello-app (label: app=hello, port 8080) lets Client A (label: app=foo) through and blocks Client B (label: app=bar).]

Slide 35

Slide 35 text

Network Policy: restricting incoming traffic, iptables

    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:K7cgKlec-rOcRZmV" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:r0vz6Rgdk8ddL4bG" -m conntrack --ctstate INVALID -j DROP
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:ziYb2WzOoIMVstsP" -j MARK --set-xmark 0x0/0x1000000
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:8YswtCBJpM8NBzfl" -m comment --comment "Start of policies" -j MARK --set-xmark 0x0/0x2000000
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:zmd0DgaOCBkFFlYd" -m mark --mark 0x0/0x2000000 -j cali-pi-_6OJfcXg5T4SeuT6eE80
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:8dNeQcZoZeI9oNzW" -m comment --comment "Return if policy accepted" -m mark --mark 0x1000000/0x1000000 -j RETURN
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:WAdLtw9Uyo9c3Vfi" -m comment --comment "Drop if no policies passed packet" -m mark --mark 0x0/0x2000000 -j DROP
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:nvKHosQ-U2PISrW5" -j cali-pri-k8s_ns.default
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:ETIe11gXYxo1RL5F" -m comment --comment "Return if profile accepted" -m mark --mark 0x1000000/0x1000000 -j RETURN
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:tp5L1HaKwMICs94e" -m comment --comment "Drop if no profiles matched" -j DROP

These rules were added after the Network Policy was applied.

Slide 36

Slide 36 text

Network Policy: restricting incoming traffic, iptables
Same rules as on Slide 35, added after the Network Policy was applied. The jumps this slide describes:

    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:zmd0DgaOCBkFFlYd" -m mark --mark 0x0/0x2000000 -j cali-pi-_6OJfcXg5T4SeuT6eE80
    -A cali-tw-cali7e6f9a42eba -m comment --comment "cali:nvKHosQ-U2PISrW5" -j cali-pri-k8s_ns.default

Traffic is allowed if it satisfies the conditions in either of the jump targets, cali-pi-_6OJfcXg5T4SeuT6eE80 or cali-pri-k8s_ns.default.

Slide 37

Slide 37 text

Network Policy: restricting incoming traffic, iptables

    -A cali-pi-_6OJfcXg5T4SeuT6eE80 -m comment --comment "cali:9Zdqkrk8NUVBY_ck" -m set --match-set cali4-s:0Mv1nWHW09z0NgcXya-DCdb src -j MARK --set-xmark 0x1000000/0x1000000
    -A cali-pi-_6OJfcXg5T4SeuT6eE80 -m comment --comment "cali:dR3oD81dXG8jV32f" -m mark --mark 0x1000000/0x1000000 -j RETURN

    $ sudo toolbox ipset list cali4-s:0Mv1nWHW09z0NgcXya-DCdb
    Spawning container root-gcr.io_google-containers_toolbox-20180309-00 on /var/lib/toolbox/root-gcr.io_google-containers_toolbox-20180309-00.
    Press ^] three times within 1s to kill container.
    Name: cali4-s:0Mv1nWHW09z0NgcXya-DCdb
    Type: hash:ip
    Revision: 4
    Header: family inet hashsize 1024 maxelem 1048576
    Size in memory: 136
    References: 3
    Number of entries: 1
    Members:
    10.48.1.16

Traffic whose source IP address is in the ipset cali4-s:0Mv1nWHW09z0NgcXya-DCdb is marked as allowed.
The set holds the IP address of the Pod labeled app=foo.
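A minimal sketch of the mark-based decision shown above, under simplifying assumptions: it models only the accept bit 0x1000000 and the ipset lookup, and skips the conntrack shortcuts, the 0x2000000 pass bit, and the profile chain. The function names are hypothetical and this is not Calico code.

```python
ACCEPT_BIT = 0x1000000

# Members of the ipset on this slide: the Pod labeled app=foo.
ALLOWED_SOURCES = {"10.48.1.16"}


def policy_chain(src_ip, mark, allowed_sources):
    """Like the cali-pi-... chain: set the accept bit if src is in the ipset."""
    if src_ip in allowed_sources:
        mark |= ACCEPT_BIT  # --set-xmark 0x1000000/0x1000000
    return mark


def workload_ingress(src_ip, allowed_sources=ALLOWED_SOURCES):
    """Like the start of the cali-tw-... chain, for a new connection."""
    mark = 0                                   # accept bit cleared first
    mark = policy_chain(src_ip, mark, allowed_sources)
    if mark & ACCEPT_BIT:
        return "ACCEPT"                        # "Return if policy accepted"
    return "DROP"                              # "Drop if no policies passed packet"


print(workload_ingress("10.48.1.16"))  # Client A, app=foo
print(workload_ingress("10.48.1.17"))  # Client B, app=bar
```

Keeping Pod addresses in an ipset means the iptables rules stay fixed while set membership changes as Pods labeled app=foo come and go.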

Slide 38

Slide 38 text

Dropping packets with iptables
[Diagram: Client A (10.48.1.16, label: app=foo) and Client B (10.48.1.17, label: app=bar) both send to hello-app (label: app=hello). At iptables on the Worker, traffic from Client A passes through and traffic from Client B is dropped.]

Slide 39

Slide 39 text

Even Kubernetes' basic features require the knowledge and experience to decipher complex Linux networking configuration.
[Diagram: two Workers, each hosting several Pods and each labeled "extremely messy configuration".]

Slide 40

Slide 40 text

THANK YOU