Slide 1

Slide 1 text

Rook/Ceph in an ML Environment
Tenzen
2020/07/03 Japan Rook Meetup #3

Slide 2

Slide 2 text

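SELF INTRODUCTION
▸ Affiliation: Kindai University Graduate School (a private university in the Kansai region)
▸ Graduate School of Science and Engineering Research, Major in Electronic Engineering
▸ Computer Vision Laboratory, first-year master's student
▸ Twitter: @AokiTenzen
▸ Blog: https://tenzen.hatenablog.com/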

Slide 3

Slide 3 text

AGENDA
I. How we use Rook/Ceph
II. OSDs on PVCs with Pod Topology Spread Constraints
III. Failure domains in Rook/CephFS
This talk covers Rook/Ceph on the Kubernetes Cluster that we run experimentally in our lab.

Slide 4

Slide 4 text

Note
The environment presented here is not one for running applications that use ML. It is an environment for doing research and development with ML.

Slide 5

Slide 5 text

How we use Rook/Ceph

Slide 6

Slide 6 text

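The Kubernetes Cluster we operate (partial view)
▸ Overview
  ▸ JupyterLab is provided to users.
  ▸ Access control is handled by Ingress.
  ▸ Main storage is dynamically provisioned with Rook/Ceph.
  ▸ Sub-storage is dynamically provisioned from an external NFS server.
  ▸ GitOps with ArgoCD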

Slide 7

Slide 7 text

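Storage system selection criteria
▸ Requirements
  ▸ Several people want to analyze the same data.
  ▸ We want to use as much storage as we need, when we need it.
  ▸ We want to analyze data on JupyterLab while manipulating it interactively, file by file.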

Slide 8

Slide 8 text

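Storage system selection criteria
▸ Requirements
  ▸ Several people want to analyze the same data.
  ▸ We want to use as much storage as we need, when we need it.
  ▸ We want to analyze data on JupyterLab while manipulating it interactively, file by file.
Result: we chose Rook/CephFS.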

Slide 9

Slide 9 text

Types of Ceph storage
▸ Ceph
  ▸ Object Storage (RGW)
    ▸ Provides an interface compatible with the Amazon S3 RESTful API and the OpenStack Swift API.
  ▸ Block Devices (RBD)
    ▸ Provides block devices.
  ▸ FileSystem (CephFS)
    ▸ Provides a POSIX-compliant file system.

Slide 10

Slide 10 text

Types of Ceph Cluster in Rook
▸ Rook has two types of Ceph Cluster:
  ▸ Host-based Cluster
  ▸ PVC-based Cluster

Slide 11

Slide 11 text

Types of Ceph Cluster in Rook
Host-based Cluster
The cluster administrator specifies devices directly in the manifest, and OSD Pods are created from them.

    nodes:
    - name: "node0"
      devices:
      - name: "/dev/disk/by-id/ata-hoge0"
      - name: "/dev/disk/by-id/ata-hoge1"
      - name: "/dev/disk/by-id/ata-hoge2"
    - name: "node1"
      devices:
      - name: "/dev/disk/by-id/ata-fuga0"
      - name: "/dev/disk/by-id/ata-fuga1"
      - name: "/dev/disk/by-id/ata-fuga2"
    - name: "node2"
      devices:
      - name: "/dev/disk/by-id/ata-foo0"
      - name: "/dev/disk/by-id/ata-foo1"
      - name: "/dev/disk/by-id/ata-foo2"

Rook v1.2 and later support "Persistent Block Device Naming".

Slide 12

Slide 12 text

Types of Ceph Cluster in Rook
PVC-based Cluster
The cluster creates PVCs, and OSD Pods are created using the PVs that the CSI driver provisions for those PVCs.

    storage:
      storageClassDeviceSets:
      - name: japan_rook_sets
        count: 6                    # number of OSDs
        # …(omitted)
        volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            resources:
              requests:
                storage: 10Gi       # capacity per OSD
            storageClassName: gp2   # storage class
            volumeMode: Block
            accessModes:
            - ReadWriteOnce

Slide 13

Slide 13 text

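Types of Ceph Cluster in Rook
▸ We currently use a Host-based Cluster.
▸ We would like to migrate to a PVC-based Cluster in the future.
  The appeal is that it can be used without having to be aware of the detailed device layout on each Node.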

Slide 14

Slide 14 text

OSDs on PVCs with Pod Topology Spread Constraints

Slide 15

Slide 15 text

OSDs on PVCs: the usefulness of Pod Topology Spread Constraints
▸ Up to Rook v1.2, placing OSDs evenly in a PVC-based cluster was difficult.
[Diagram: racks rackA, rackB, rackC; nodes Nodeα, Nodeβ, Nodeγ, Nodeδ, Nodeε, Nodeζ; OSD0, OSD1, OSD2, OSD3, OSD4, OSD5]

Slide 16

Slide 16 text

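OSDs on PVCs: the usefulness of Pod Topology Spread Constraints
▸ Up to Rook v1.2, placing OSDs evenly in a PVC-based cluster was difficult.
[Diagram: racks rackA, rackB, rackC; nodes Nodeα, Nodeβ, Nodeγ, Nodeδ, Nodeε, Nodeζ; OSD0, OSD1, OSD2, OSD3, OSD4, OSD6]
"Surely Rook will sort this out nicely for us one way or another? Is it really not possible?"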

Slide 17

Slide 17 text

OSDs on PVCs: the usefulness of Pod Topology Spread Constraints
‣ Trying it out in practice…

Nodes per rack:
    Rack     Nodes
    rackA    pvcbased-w0, pvcbased-w1
    rackB    pvcbased-w2, pvcbased-w3
    rackC    pvcbased-w4, pvcbased-w5

Devices per node:
    Node           Devices
    pvcbased-w0    3
    pvcbased-w1    3
    pvcbased-w2    3
    pvcbased-w3    3
    pvcbased-w4    3
    pvcbased-w5    3

Number of OSDs: 12

Slide 18

Slide 18 text

OSDs on PVCs: the usefulness of Pod Topology Spread Constraints
rackA: 4×OSD
  pvcbased-w0: 3×OSD
  pvcbased-w1: 1×OSD
rackB: 5×OSD
  pvcbased-w2: 3×OSD
  pvcbased-w3: 2×OSD
rackC: 3×OSD
  pvcbased-w4: 1×OSD
  pvcbased-w5: 2×OSD

Slide 19

Slide 19 text

OSDs on PVCs: the usefulness of Pod Topology Spread Constraints
rackA: 4×OSD
  pvcbased-w0: 3×OSD
  pvcbased-w1: 1×OSD
rackB: 5×OSD
  pvcbased-w2: 3×OSD
  pvcbased-w3: 2×OSD
rackC: 3×OSD
  pvcbased-w4: 1×OSD
  pvcbased-w5: 2×OSD

    Rack     OSDs
    rackA    4
    rackB    5
    rackC    3

The OSD placement has ended up skewed…
Even the capable Rook has things it finds difficult.
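One way to put a number on the imbalance above is the skew, that is, the difference between the most loaded and the least loaded topology domain. This is the quantity that the Kubernetes `maxSkew` field bounds. The sketch below recomputes it from the per-node counts on this slide; the helper names `per_rack` and `skew` are made up for illustration and are not part of Rook or Kubernetes.

```python
from collections import Counter

# Rack membership and observed OSD counts, copied from the slides.
RACK_OF = {
    "pvcbased-w0": "rackA", "pvcbased-w1": "rackA",
    "pvcbased-w2": "rackB", "pvcbased-w3": "rackB",
    "pvcbased-w4": "rackC", "pvcbased-w5": "rackC",
}
OBSERVED_PER_NODE = {
    "pvcbased-w0": 3, "pvcbased-w1": 1,
    "pvcbased-w2": 3, "pvcbased-w3": 2,
    "pvcbased-w4": 1, "pvcbased-w5": 2,
}


def per_rack(per_node):
    """Aggregate per-node OSD counts into per-rack counts."""
    totals = Counter()
    for node, count in per_node.items():
        totals[RACK_OF[node]] += count
    return dict(totals)


def skew(counts):
    """Difference between the fullest and the emptiest domain."""
    return max(counts.values()) - min(counts.values())


if __name__ == "__main__":
    racks = per_rack(OBSERVED_PER_NODE)
    print("per rack:", racks)                      # rackA 4, rackB 5, rackC 3
    print("rack skew:", skew(racks))               # 2
    print("node skew:", skew(OBSERVED_PER_NODE))   # 2
```

Both values are 2, so this placement would not satisfy a constraint with `maxSkew: 1` at either the rack level or the node level.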

Slide 20

Slide 20 text

OSDs on PVCs with Pod Topology Spread Constraints
▸ Verify the effect of Pod Topology Spread Constraints.
▸ Built a home Kubernetes Cluster as follows.

Physical configuration:
    CPU     Intel Xeon Platinum 8167M (26C/52T)
    Disk    Intel OPTANE SSD 900P 280GB ×1
            Intel DC P3700 800GB ×2
    DRAM    ECC Registered 192GB

Cluster configuration:
    Role      Count    CPU      DRAM
    Master    1        4vCPU    12GB
    Worker    6        4vCPU    12GB

Slide 21

Slide 21 text

OSDs on PVCs with Pod Topology Spread Constraints
Cluster configuration:
    Rack     Node           Devices
    rackA    pvcbased-w0    3
             pvcbased-w1    3
    rackB    pvcbased-w2    3
             pvcbased-w3    3
    rackC    pvcbased-w4    3
             pvcbased-w5    3

    Rook version: v1.3.5
    Number of OSDs: 12
    CSI: TopoLVM

Slide 22

Slide 22 text

OSDs on PVCs with Pod Topology Spread Constraints
Manifest to apply:

    # …(omitted)
    topologySpreadConstraints:
    # Spread across racks using topology.rook.io/rack
    - maxSkew: 1
      topologyKey: topology.rook.io/rack
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - rook-ceph-osd
          - rook-ceph-osd-prepare
    # Spread across the nodes within each rack
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - rook-ceph-osd
          - rook-ceph-osd-prepare
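To see why two `maxSkew: 1` constraints with `DoNotSchedule` force an even layout, the following sketch simulates placement under a deliberately simplified model. It is not the real Kubernetes scheduler: it places one OSD at a time on the first node (in name order) where adding a Pod keeps both the rack count and the node count within `maxSkew` of the current minimum. The function names `fits` and `place_osds` are hypothetical.

```python
# Rack membership copied from the cluster configuration in the slides.
RACK_OF = {
    "pvcbased-w0": "rackA", "pvcbased-w1": "rackA",
    "pvcbased-w2": "rackB", "pvcbased-w3": "rackB",
    "pvcbased-w4": "rackC", "pvcbased-w5": "rackC",
}


def fits(domain_counts, domain, max_skew):
    """DoNotSchedule-style check (simplified): after adding one Pod to
    `domain`, its count minus the global minimum must not exceed max_skew."""
    return domain_counts[domain] + 1 - min(domain_counts.values()) <= max_skew


def place_osds(num_osds, max_skew=1):
    """Place OSDs one by one on the first node that satisfies both the
    rack-level and the node-level constraint. Returns (per_node, per_rack)."""
    per_node = {node: 0 for node in RACK_OF}
    per_rack = {rack: 0 for rack in set(RACK_OF.values())}
    for _ in range(num_osds):
        for node in sorted(RACK_OF):
            rack = RACK_OF[node]
            if fits(per_rack, rack, max_skew) and fits(per_node, node, max_skew):
                per_node[node] += 1
                per_rack[rack] += 1
                break
        else:
            # No node satisfies both constraints; the Pod would stay Pending.
            raise RuntimeError("no node satisfies both constraints")
    return per_node, per_rack


if __name__ == "__main__":
    nodes, racks = place_osds(12)
    print(nodes)  # every node ends up with 2 OSDs
    print(racks)  # every rack ends up with 4 OSDs
```

Even with a naive "first node that fits" choice, which would otherwise pile everything onto `pvcbased-w0`, the constraints leave 2 OSDs per node and 4 per rack for 12 OSDs. The real scheduler additionally scores the feasible nodes and only counts Pods matched by the `labelSelector`, which this sketch ignores.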

Slide 23

Slide 23 text

Hands ON!!
OSDs on PVCs with Pod Topology Spread Constraints

Slide 24

Slide 24 text

OSDs on PVCs with Pod Topology Spread Constraints
[Diagram labels: PVC-based Cluster; PVs for OSDs; OSD Pods]

Slide 25

Slide 25 text

OSDs on PVCs with Pod Topology Spread Constraints
rackA: 4×OSD
  pvcbased-w0: 2×OSD
  pvcbased-w1: 2×OSD
rackB: 4×OSD
  pvcbased-w2: 2×OSD
  pvcbased-w3: 2×OSD
rackC: 4×OSD
  pvcbased-w4: 2×OSD
  pvcbased-w5: 2×OSD

Slide 26

Slide 26 text

Failure domains in Rook/CephFS

Slide 27

Slide 27 text

Configuring failure domains in Rook/CephFS
▸ Rook/Ceph provides the failure domains below, which can be used by labeling Nodes.
  • topology.kubernetes.io/region
  • topology.kubernetes.io/zone
  • topology.rook.io/datacenter
  • topology.rook.io/room
  • topology.rook.io/pod
  • topology.rook.io/pdu
  • topology.rook.io/row
  • topology.rook.io/rack
  • topology.rook.io/chassis

Slide 28

Slide 28 text

Configuring failure domains in Rook/CephFS
▸ Rook/Ceph provides the failure domains below, which can be used by labeling Nodes.
  • topology.kubernetes.io/region
  • topology.kubernetes.io/zone
  • topology.rook.io/datacenter
  • topology.rook.io/room
  • topology.rook.io/pod
  • topology.rook.io/pdu
  • topology.rook.io/row
  • topology.rook.io/rack
  • topology.rook.io/chassis
[Callout on one entry of the list: "This is the one used here."]

Slide 29

Slide 29 text

Configuring failure domains in Rook/CephFS
▸ Caveat
  ▸ On Kubernetes versions before v1.17, use the "failure-domain.beta.kubernetes.io" labels.
  • topology.kubernetes.io/region
  • topology.kubernetes.io/zone
  • topology.rook.io/datacenter
  • topology.rook.io/room
  • topology.rook.io/pod
  • topology.rook.io/pdu
  • topology.rook.io/row
  • topology.rook.io/rack
  • topology.rook.io/chassis
  • failure-domain.beta.kubernetes.io/zone
  • failure-domain.beta.kubernetes.io/region
Next, check whether data is distributed across failure domains on the Cluster created earlier.
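As a small illustration of the labeling step and the version caveat, the sketch below picks the zone and region label keys based on the Kubernetes minor version and generates `kubectl label node` commands for the rack layout used in this talk. The helper names and the rack mapping dictionary are assumptions for illustration; the version check assumes a `v1.x` version string, and the commands are only printed, not executed.

```python
# Rack membership copied from the cluster configuration in the slides.
RACK_OF = {
    "pvcbased-w0": "rackA", "pvcbased-w1": "rackA",
    "pvcbased-w2": "rackB", "pvcbased-w3": "rackB",
    "pvcbased-w4": "rackC", "pvcbased-w5": "rackC",
}


def zone_region_keys(k8s_version):
    """Return (zone_key, region_key) for a version string such as 'v1.16.9'.
    Before v1.17 the failure-domain.beta.kubernetes.io labels apply."""
    minor = int(k8s_version.lstrip("v").split(".")[1])
    if minor >= 17:
        return ("topology.kubernetes.io/zone", "topology.kubernetes.io/region")
    return (
        "failure-domain.beta.kubernetes.io/zone",
        "failure-domain.beta.kubernetes.io/region",
    )


def rack_label_commands(rack_of):
    """Build one `kubectl label node` command per node for the rack label."""
    return [
        f"kubectl label node {node} topology.rook.io/rack={rack}"
        for node, rack in sorted(rack_of.items())
    ]


if __name__ == "__main__":
    print(zone_region_keys("v1.16.9"))
    print(zone_region_keys("v1.18.3"))
    for command in rack_label_commands(RACK_OF):
        print(command)
```

Generating the commands from a single mapping keeps the rack assignment in one place, which makes it easier to compare against the cluster table when something looks misplaced.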

Slide 30

Slide 30 text

Configuring failure domains in Rook/CephFS
[3, 11, 5]
Properly distributed!!

Slide 31

Slide 31 text

References
▸ Types of Ceph storage
  ▸ https://docs.ceph.com/docs/mimic/architecture/
▸ Japan Rook Meetup #2, "Rook/Ceph upstream: latest status"
  ▸ https://speakerdeck.com/sat/ceph-upstreamzui-xin-zhuang-kuang
▸ Persistent Block Device Naming
  ▸ https://wiki.archlinux.org/index.php/Persistent_block_device_naming
▸ Failure domains in Rook/Ceph
  ▸ https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md#osd-topology

Slide 32

Slide 32 text

THANK YOU