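These are the slides used in Session 3 of Japan Rook Meetup #3 (https://rook.connpass.com/event/174294/). The first half introduces the Rook/Ceph setup used in our ML environment; the second half takes a deeper, hands-on look at the Rook/Ceph features we use today or plan to use.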
Rook/Ceph in an ML Environment
Tenzen
2020/07/03, Japan Rook Meetup #3
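SELF INTRODUCTION
▸ Affiliation: Kindai University Graduate School (a private university in the Kansai region)
▸ Graduate School of Science and Engineering, Electronics Engineering major
▸ Computer Vision Laboratory, first-year master's student
▸ Twitter: @AokiTenzen
▸ Blog: https://tenzen.hatenablog.com/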
AGENDA
I. How we use Rook/Ceph
II. OSDs on PVCs with Pod Topology Spread Constraints
III. Failure domains in Rook/CephFS
This talk covers Rook/Ceph on a Kubernetes cluster that our lab runs experimentally.
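Note: the environment introduced here is not one for running applications that use ML. It is an environment for doing research and development with ML.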
How we use Rook/Ceph
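The Kubernetes cluster we operate (partial view)
▸ Overview
  ▸ Provides JupyterLab to users
  ▸ Access control through Ingress
  ▸ Main storage is provisioned dynamically with Rook/Ceph.
  ▸ Secondary storage is provisioned dynamically from an external NFS server.
  ▸ GitOps with ArgoCD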
Storage system selection criteria
▸ Requirements
  ▸ Several people want to analyze the same data.
  ▸ We want to use as much storage as we need, when we need it.
  ▸ We want to analyze data in JupyterLab while manipulating it interactively, file by file.
▸ Decision: we chose Rook/CephFS.
Types of Ceph storage
▸ Ceph
  ▸ Object Storage (RGW)
    ▸ Provides interfaces compatible with the Amazon S3 RESTful API and the OpenStack Swift API.
  ▸ Block Devices (RBD)
    ▸ Provides block devices.
  ▸ FileSystem (CephFS)
    ▸ Provides a POSIX-compliant file system.
Types of Ceph cluster in Rook
▸ Rook has two types of Ceph cluster:
  ▸ Host-based Cluster
  ▸ PVC-based Cluster
Types of Ceph cluster in Rook: Host-based Cluster
▸ The cluster administrator specifies the devices in the manifest, and OSD Pods are created for them.
▸ Since Rook v1.2, "Persistent Block Device Naming" (for example /dev/disk/by-id/...) is supported.

    nodes:
    - name: "node0"
      devices:
      - name: "/dev/disk/by-id/ata-hoge0"
      - name: "/dev/disk/by-id/ata-hoge1"
      - name: "/dev/disk/by-id/ata-hoge2"
    - name: "node1"
      devices:
      - name: "/dev/disk/by-id/ata-fuga0"
      - name: "/dev/disk/by-id/ata-fuga1"
      - name: "/dev/disk/by-id/ata-fuga2"
    - name: "node2"
      devices:
      - name: "/dev/disk/by-id/ata-foo0"
      - name: "/dev/disk/by-id/ata-foo1"
      - name: "/dev/disk/by-id/ata-foo2"
Types of Ceph cluster in Rook: PVC-based Cluster
▸ The cluster creates PVCs, and OSD Pods are created on the PVs that CSI provisions for those PVCs.

    storage:
      storageClassDeviceSets:
      - name: japan_rook_sets
        count: 6                      # number of OSDs
        …(omitted)
        volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            resources:
              requests:
                storage: 10Gi         # capacity per OSD
            storageClassName: gp2     # storage class to use
            volumeMode: Block
            accessModes:
            - ReadWriteOnce
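Types of Ceph cluster in Rook
▸ We currently use a Host-based Cluster.
▸ We want to migrate to a PVC-based Cluster in the future. Its appeal is that it can be used without worrying about the details of device placement on each Node.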
OSDs on PVCs with Pod Topology Spread Constraints
Why Pod Topology Spread Constraints are useful for OSDs on PVCs
▸ In Rook v1.2 and earlier, it was difficult to place OSDs evenly in a PVC-based cluster.
(Diagram: three racks, rackA to rackC, containing Node α through Node ζ, with OSD0 through OSD5 placed across them.)
▸ "Surely Rook sorts this out nicely on its own, one way or another? Is it really impossible?"
Why Pod Topology Spread Constraints are useful for OSDs on PVCs
▸ Let's actually try it…
▸ Nodes per rack
  rackA: pvcbased-w0, pvcbased-w1
  rackB: pvcbased-w2, pvcbased-w3
  rackC: pvcbased-w4, pvcbased-w5
▸ Devices per Node: 3 on each of pvcbased-w0 through pvcbased-w5
▸ Number of OSDs: 12
Why Pod Topology Spread Constraints are useful for OSDs on PVCs
▸ Resulting placement
  rackA: 4×OSD
    pvcbased-w0: 3×OSD
    pvcbased-w1: 1×OSD
  rackB: 5×OSD
    pvcbased-w2: 3×OSD
    pvcbased-w3: 2×OSD
  rackC: 3×OSD
    pvcbased-w4: 1×OSD
    pvcbased-w5: 2×OSD
▸ OSDs per rack: rackA 4, rackB 5, rackC 3
▸ Even the capable Rook struggles here: the OSD placement ends up skewed…
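The imbalance in the placement above can be put into a single number. The sketch below uses the per-node OSD counts from the slide and computes the difference between the fullest and emptiest rack, and between the fullest and emptiest node. Treating "skew" as max minus min is a simplification chosen for illustration; the variable and function names are hypothetical and do not come from Rook or Kubernetes.

```python
# OSD counts per node, copied from the placement shown above.
osds_per_node = {
    "pvcbased-w0": 3, "pvcbased-w1": 1,
    "pvcbased-w2": 3, "pvcbased-w3": 2,
    "pvcbased-w4": 1, "pvcbased-w5": 2,
}

# Rack membership of each node, copied from the cluster layout above.
rack_of = {
    "pvcbased-w0": "rackA", "pvcbased-w1": "rackA",
    "pvcbased-w2": "rackB", "pvcbased-w3": "rackB",
    "pvcbased-w4": "rackC", "pvcbased-w5": "rackC",
}


def skew(counts):
    """Difference between the most and least loaded domain (illustrative definition)."""
    values = list(counts.values())
    return max(values) - min(values)


# Aggregate the per-node counts into per-rack counts.
osds_per_rack = {}
for node, count in osds_per_node.items():
    rack = rack_of[node]
    osds_per_rack[rack] = osds_per_rack.get(rack, 0) + count

rack_skew = skew(osds_per_rack)
node_skew = skew(osds_per_node)

print(osds_per_rack)         # {'rackA': 4, 'rackB': 5, 'rackC': 3}
print(rack_skew, node_skew)  # 2 2
```

Both values are 2, so this placement would not satisfy a constraint that allows a difference of at most 1 at either the rack level or the node level.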
OSDs on PVCs with Pod Topology Spread Constraints
▸ Goal: verify the effect of Pod Topology Spread Constraints.
▸ We built the following home Kubernetes cluster.
Physical configuration
  ▸ CPU: Intel Xeon Platinum 8167M (26C/52T)
  ▸ Disks: Intel OPTANE SSD 900P 280GB ×1, Intel DC P3700 800GB ×2
  ▸ DRAM: ECC Registered 192GB
Cluster configuration
  ▸ Master: 1 node, 4 vCPU, 12GB DRAM
  ▸ Worker: 6 nodes, 4 vCPU, 12GB DRAM
OSDs on PVCs with Pod Topology Spread Constraints
Cluster configuration
▸ Nodes per rack
  rackA: pvcbased-w0, pvcbased-w1
  rackB: pvcbased-w2, pvcbased-w3
  rackC: pvcbased-w4, pvcbased-w5
▸ Devices per Node: 3
▸ Rook version: v1.3.5
▸ Number of OSDs: 12
▸ CSI: TopoLVM
OSDs on PVCs with Pod Topology Spread Constraints
Manifest to apply

    …(omitted)
    topologySpreadConstraints:
    # Spread OSDs across racks (topology.rook.io/rack).
    - maxSkew: 1
      topologyKey: topology.rook.io/rack
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - rook-ceph-osd
          - rook-ceph-osd-prepare
    # Spread OSDs across the nodes within each rack.
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - rook-ceph-osd
          - rook-ceph-osd-prepare
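To build intuition for what two `maxSkew: 1` constraints with `DoNotSchedule` do to 12 OSDs on this cluster, here is a minimal simulation. It is a deliberately simplified model, not the Kubernetes scheduler: it places one Pod at a time on the first node where adding the Pod keeps both the rack count and the node count within `maxSkew` of the emptiest domain. The real scheduler also filters and scores nodes in other ways, and every name below is hypothetical.

```python
# Cluster layout from the slides: 3 racks, 2 worker nodes each, 3 devices per node.
RACK_OF = {
    "pvcbased-w0": "rackA", "pvcbased-w1": "rackA",
    "pvcbased-w2": "rackB", "pvcbased-w3": "rackB",
    "pvcbased-w4": "rackC", "pvcbased-w5": "rackC",
}
MAX_SKEW = 1          # maxSkew in both constraints of the manifest
DEVICES_PER_NODE = 3  # a node cannot host more OSDs than it has devices
OSD_COUNT = 12


def would_fit(counts, domain):
    """True if one more Pod in `domain` stays within MAX_SKEW of the emptiest domain."""
    return counts[domain] + 1 - min(counts.values()) <= MAX_SKEW


def place_osds():
    per_node = {node: 0 for node in RACK_OF}
    per_rack = {rack: 0 for rack in set(RACK_OF.values())}
    for _ in range(OSD_COUNT):
        for node, rack in RACK_OF.items():
            if (per_node[node] < DEVICES_PER_NODE
                    and would_fit(per_rack, rack)    # rack-level constraint
                    and would_fit(per_node, node)):  # hostname-level constraint
                per_node[node] += 1
                per_rack[rack] += 1
                break
        else:
            # Models DoNotSchedule: the Pod stays pending rather than breaking a constraint.
            raise RuntimeError("no node satisfies the constraints")
    return per_node, per_rack


per_node, per_rack = place_osds()
print(sorted(per_rack.items()))  # [('rackA', 4), ('rackB', 4), ('rackC', 4)]
print(sorted(per_node.items()))  # every node ends up with 2 OSDs
```

Under this model the only reachable outcome is 4 OSDs per rack and 2 per node, because any placement that would put a rack or node two ahead of the emptiest one is rejected.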
OSDs on PVCs with Pod Topology Spread Constraints
Hands-on!!
(Screenshot: the PVC-based cluster, showing the PVs for the OSDs and the OSD Pods.)
OSDs on PVCs with Pod Topology Spread Constraints
▸ Resulting placement
  rackA: 4×OSD
    pvcbased-w0: 2×OSD
    pvcbased-w1: 2×OSD
  rackB: 4×OSD
    pvcbased-w2: 2×OSD
    pvcbased-w3: 2×OSD
  rackC: 4×OSD
    pvcbased-w4: 2×OSD
    pvcbased-w5: 2×OSD
Failure domains in Rook/CephFS
Configuring Rook/CephFS failure domains
▸ Rook/Ceph provides the failure domains below; they become available once the corresponding labels are applied to Nodes.
  • topology.kubernetes.io/region
  • topology.kubernetes.io/zone
  • topology.rook.io/datacenter
  • topology.rook.io/room
  • topology.rook.io/pod
  • topology.rook.io/pdu
  • topology.rook.io/row
  • topology.rook.io/rack
  • topology.rook.io/chassis
▸ This time we use one of these labels (highlighted on the original slide).
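As an illustration of labeling Nodes with a failure domain, the sketch below generates `kubectl label node` commands that would assign the `topology.rook.io/rack` label to the six workers, using the rack layout from the cluster-configuration slide. How the labels were actually applied in the talk is not stated, so the use of `kubectl label` here is an assumption, and the helper name `label_commands` is hypothetical.

```python
# Label key taken from the list of failure domains above.
RACK_LABEL_KEY = "topology.rook.io/rack"

# Rack layout from the cluster-configuration slide.
NODES_BY_RACK = {
    "rackA": ["pvcbased-w0", "pvcbased-w1"],
    "rackB": ["pvcbased-w2", "pvcbased-w3"],
    "rackC": ["pvcbased-w4", "pvcbased-w5"],
}


def label_commands(nodes_by_rack, key=RACK_LABEL_KEY):
    """Return one `kubectl label node` command per node (assumed workflow)."""
    return [
        f"kubectl label node {node} {key}={rack}"
        for rack, nodes in nodes_by_rack.items()
        for node in nodes
    ]


commands = label_commands(NODES_BY_RACK)
for command in commands:
    print(command)
# First line printed:
# kubectl label node pvcbased-w0 topology.rook.io/rack=rackA
```

Printing the commands instead of running them keeps the sketch free of side effects; they can be reviewed and then pasted into a shell that has access to the cluster.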
Configuring Rook/CephFS failure domains
▸ Note
  ▸ On Kubernetes earlier than v1.17, use the "failure-domain.beta.kubernetes.io" labels for zone and region:
    • failure-domain.beta.kubernetes.io/zone
    • failure-domain.beta.kubernetes.io/region
▸ Next, we check whether placement is actually distributed on the cluster built earlier.
Configuring Rook/CephFS failure domains
▸ Result: [3, 11, 5]
▸ Properly distributed!!
References
▸ Types of Ceph storage
  ▸ https://docs.ceph.com/docs/mimic/architecture/
▸ Japan Rook Meetup #2, "Rook/Ceph upstream: latest status"
  ▸ https://speakerdeck.com/sat/ceph-upstreamzui-xin-zhuang-kuang
▸ Persistent Block Device Naming
  ▸ https://wiki.archlinux.org/index.php/Persistent_block_device_naming
▸ Failure domains in Rook/Ceph
  ▸ https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md#osd-topology
THANK YOU