Rook/Ceph in an ML Environment

tenzen
July 03, 2020

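These are the slides used in the third session of Japan Rook Meetup #3 (https://rook.connpass.com/event/174294/).
The first half introduces how Rook/Ceph is used in an ML environment. The rest takes a closer, hands-on look at the Rook/Ceph features that are already in use or planned for future use.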

Transcript

  1. Rook/Ceph in an ML Environment
    Tenzen, 2020/07/03, Japan Rook Meetup #3

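  2. SELF INTRODUCTION
    ▸ Affiliation: Kindai University Graduate School (a private university in the Kansai region)
    ▸ Graduate School of Science and Engineering Research, Electronic Engineering major
    ▸ Computer Vision Laboratory, first-year master's student
    ▸ Twitter: @AokiTenzen
    ▸ Blog: https://tenzen.hatenablog.com/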
  3. AGENDA
    I. Rook/Ceph usage
    II. OSDs on PVCs with Pod Topology Spread Constraints
    III. Failure domains in Rook/CephFS
    This talk covers Rook/Ceph on a Kubernetes Cluster that runs experimentally in our laboratory.
  4. Note
    The environment introduced here is not one for running applications that use ML. It is an environment for doing research and development with ML.

  5. Rook/Ceph usage

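  6. The Kubernetes Cluster in operation (partial view)
    ▸ Overview
      ▸ JupyterLab is provided to users.
      ▸ Access control is handled by Ingress.
      ▸ Main storage is dynamically provisioned with Rook/Ceph.
      ▸ Sub-storage is dynamically provisioned from an external NFS server.
      ▸ GitOps with ArgoCD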
  7. Storage system selection criteria
    ▸ Requirements
      ▸ Several people want to analyze the same data.
      ▸ Users want as much storage as they need, when they need it.
      ▸ Users want to analyze data on JupyterLab while manipulating it interactively, file by file.
  8. Storage system selection criteria
    ▸ Requirements: the same three as on slide 7
    ▸ Result: Rook/CephFS was selected.
  9. Types of Ceph storage
    ▸ Ceph
      ▸ Object Storage (RGW)
        ▸ Provides interfaces compatible with the Amazon S3 RESTful API and the OpenStack Swift API.
      ▸ Block Devices (RBD)
        ▸ Provides block devices.
      ▸ FileSystem (CephFS)
        ▸ Provides a POSIX-compliant file system.
  10. Types of Ceph Cluster in Rook
    ▸ Rook has two types of Ceph Cluster:
      ▸ Host-based Cluster
      ▸ PVC-based Cluster
  11. Types of Ceph Cluster in Rook: Host-based Cluster
    The cluster administrator specifies the devices directly in the manifest, and OSD Pods are created from them.
    Since Rook v1.2, "Persistent Block Device Naming" is supported.

    nodes:
      - name: "node0"
        devices:
          - name: "/dev/disk/by-id/ata-hoge0"
          - name: "/dev/disk/by-id/ata-hoge1"
          - name: "/dev/disk/by-id/ata-hoge2"
      - name: "node1"
        devices:
          - name: "/dev/disk/by-id/ata-fuga0"
          - name: "/dev/disk/by-id/ata-fuga1"
          - name: "/dev/disk/by-id/ata-fuga2"
      - name: "node2"
        devices:
          - name: "/dev/disk/by-id/ata-foo0"
          - name: "/dev/disk/by-id/ata-foo1"
          - name: "/dev/disk/by-id/ata-foo2"
  12. Types of Ceph Cluster in Rook: PVC-based Cluster
    The Cluster creates PVCs, and OSD Pods are created using the PVs that the CSI driver creates for those PVCs.

    storage:
      storageClassDeviceSets:
        - name: japan_rook_sets
          count: 6                        # number of OSDs
          # …(omitted)
          volumeClaimTemplates:
            - metadata:
                name: data
              spec:
                resources:
                  requests:
                    storage: 10Gi         # capacity per OSD
                storageClassName: gp2     # storage class to use
                volumeMode: Block
                accessModes:
                  - ReadWriteOnce
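  13. Types of Ceph Cluster in Rook
    ▸ A Host-based Cluster is currently in use.
    ▸ The plan is to migrate to a PVC-based Cluster. Its appeal is that it can be used without having to know the details of the device layout on each Node.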
  14. OSDs on PVCs with Pod Topology Spread Constraints

  15. Usefulness of Pod Topology Spread Constraints for OSDs on PVCs
    ▸ In Rook v1.2 and earlier, it was difficult to place OSDs evenly in a PVC-based Cluster.
    (Diagram: rackA, rackB, and rackC holding Node α through Node ζ, with OSD0 through OSD5.)
  16. Usefulness of Pod Topology Spread Constraints for OSDs on PVCs
    ▸ In Rook v1.2 and earlier, it was difficult to place OSDs evenly in a PVC-based Cluster.
    (Same diagram; the OSDs are labeled OSD0, OSD1, OSD2, OSD3, OSD4, OSD6.)
    "Surely Rook will somehow work it out nicely for us, won't it? Is it really not possible?"
  17. Usefulness of Pod Topology Spread Constraints for OSDs on PVCs
    ‣ Trying it in practice…

    Nodes per rack:
      rack     Node
      rackA    pvcbased-w0, pvcbased-w1
      rackB    pvcbased-w2, pvcbased-w3
      rackC    pvcbased-w4, pvcbased-w5

    Devices per Node:
      Node           Devices
      pvcbased-w0    3
      pvcbased-w1    3
      pvcbased-w2    3
      pvcbased-w3    3
      pvcbased-w4    3
      pvcbased-w5    3

    Number of OSDs: 12
  18. Usefulness of Pod Topology Spread Constraints for OSDs on PVCs
    rackA: 4×OSD
      pvcbased-w0: 3×OSD
      pvcbased-w1: 1×OSD
    rackB: 5×OSD
      pvcbased-w2: 3×OSD
      pvcbased-w3: 2×OSD
    rackC: 3×OSD
      pvcbased-w4: 1×OSD
      pvcbased-w5: 2×OSD
  19. Usefulness of Pod Topology Spread Constraints for OSDs on PVCs
    Same placement as slide 18, summarized per rack:
      rack     OSDs
      rackA    4
      rackB    5
      rackC    3
    The OSD placement has ended up skewed. Even the excellent Rook finds some things difficult.
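
The imbalance on slides 18 and 19 can be expressed as a skew, the quantity that Pod Topology Spread Constraints bound with `maxSkew`: the Pod count in the fullest domain minus the count in the emptiest one. The following is a minimal Python sketch using the OSD counts from the slides; the helper name `skew` is hypothetical and is not part of Rook or Kubernetes.

```python
def skew(counts):
    """Largest gap between any domain's Pod count and the emptiest domain."""
    return max(counts.values()) - min(counts.values())


# OSD counts per rack and per node, copied from slides 18 and 19.
osds_per_rack = {"rackA": 4, "rackB": 5, "rackC": 3}
osds_per_node = {
    "pvcbased-w0": 3, "pvcbased-w1": 1,
    "pvcbased-w2": 3, "pvcbased-w3": 2,
    "pvcbased-w4": 1, "pvcbased-w5": 2,
}

print(skew(osds_per_rack))  # 2
print(skew(osds_per_node))  # 2
```

Both values are larger than 1, so this placement would not satisfy a constraint with `maxSkew: 1` at either the rack or the node level.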
  20. OSDs on PVCs with Pod Topology Spread Constraints
    ▸ Goal: verify the effect of Pod Topology Spread Constraints.
    ▸ A home Kubernetes Cluster was built as follows.

    Physical configuration:
      CPU      Intel Xeon Platinum 8167M (26C/52T)
      Disks    Intel Optane SSD 900P 280GB ×1, Intel DC P3700 800GB ×2
      DRAM     ECC Registered 192GB

    Cluster configuration:
      role      Count    CPU      DRAM
      Master    1        4vCPU    12GB
      Worker    6        4vCPU    12GB
  21. OSDs on PVCs with Pod Topology Spread Constraints

    Cluster configuration:
      rack     Node           Devices
      rackA    pvcbased-w0    3
      rackA    pvcbased-w1    3
      rackB    pvcbased-w2    3
      rackB    pvcbased-w3    3
      rackC    pvcbased-w4    3
      rackC    pvcbased-w5    3

    Rook version: v1.3.5
    Number of OSDs: 12
    CSI: TopoLVM
  22. OSDs on PVCs with Pod Topology Spread Constraints
    The manifest to apply:

    # …(omitted)
    topologySpreadConstraints:
      # Spread across racks (topology.rook.io/rack)
      - maxSkew: 1
        topologyKey: topology.rook.io/rack
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchExpressions:
            - key: app
              operator: In
              values:
                - rook-ceph-osd
                - rook-ceph-osd-prepare
      # Spread across the nodes within a rack
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchExpressions:
            - key: app
              operator: In
              values:
                - rook-ceph-osd
                - rook-ceph-osd-prepare
  23. Hands ON!! OSDs on PVCs with Pod Topology Spread Constraints
  24. OSDs on PVCs with Pod Topology Spread Constraints
    (Diagram labels: PVC-based Cluster, PVs for OSDs, OSD Pods)
  25. OSDs on PVCs with Pod Topology Spread Constraints
    rackA: 4×OSD
      pvcbased-w0: 2×OSD
      pvcbased-w1: 2×OSD
    rackB: 4×OSD
      pvcbased-w2: 2×OSD
      pvcbased-w3: 2×OSD
    rackC: 4×OSD
      pvcbased-w4: 2×OSD
      pvcbased-w5: 2×OSD
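
The even placement on slide 25 matches what a simplified model of the two `maxSkew: 1` constraints from slide 22 predicts. The sketch below is an illustration under stated assumptions, not the kube-scheduler algorithm: it places OSDs one at a time on the first node where adding a Pod keeps both the rack count and the node count within `maxSkew` of the emptiest domain. The rack layout comes from the slides; `fits` and `place` are hypothetical names.

```python
# Rack layout from the slides; the placement logic is a simplified model.
RACK_OF = {
    "pvcbased-w0": "rackA", "pvcbased-w1": "rackA",
    "pvcbased-w2": "rackB", "pvcbased-w3": "rackB",
    "pvcbased-w4": "rackC", "pvcbased-w5": "rackC",
}
MAX_SKEW = 1


def fits(count_in_domain, all_counts):
    """True if one more Pod keeps this domain within MAX_SKEW of the emptiest domain."""
    return count_in_domain + 1 - min(all_counts) <= MAX_SKEW


def place(num_osds):
    """Greedily place OSDs while honoring both the rack and the node constraint."""
    per_node = dict.fromkeys(RACK_OF, 0)
    per_rack = dict.fromkeys(RACK_OF.values(), 0)
    for _ in range(num_osds):
        for node, rack in RACK_OF.items():
            rack_ok = fits(per_rack[rack], per_rack.values())
            node_ok = fits(per_node[node], per_node.values())
            if rack_ok and node_ok:
                per_node[node] += 1
                per_rack[rack] += 1
                break
        else:
            # Corresponds to whenUnsatisfiable: DoNotSchedule in this model.
            raise RuntimeError("no node satisfies both constraints")
    return per_node, per_rack


per_node, per_rack = place(12)
print(per_rack)  # {'rackA': 4, 'rackB': 4, 'rackC': 4}
print(per_node)  # every node ends up with 2 OSDs
```

The real scheduler also scores nodes and accounts for PV capacity, so this only shows why two nested `maxSkew: 1` constraints rule out the skewed result seen earlier.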
  26. Failure domains in Rook/CephFS

  27. Configuring failure domains for Rook/CephFS
    ▸ Rook/Ceph provides the following failure domains, which can be used by labeling Nodes:
      • topology.kubernetes.io/region
      • topology.kubernetes.io/zone
      • topology.rook.io/datacenter
      • topology.rook.io/room
      • topology.rook.io/pod
      • topology.rook.io/pdu
      • topology.rook.io/row
      • topology.rook.io/rack
      • topology.rook.io/chassis
  28. Configuring failure domains for Rook/CephFS
    ▸ Same list as slide 27, with a callout on one of the labels: "This is the one used this time."
  29. Configuring failure domains for Rook/CephFS
    ▸ Caveat
      ▸ On Kubernetes versions before v1.17, the "failure-domain.beta.kubernetes.io" labels are used instead:
        • failure-domain.beta.kubernetes.io/zone
        • failure-domain.beta.kubernetes.io/region
    Next, check whether data is distributed as intended on the Cluster created earlier.
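
Labeling Nodes, as described on slides 27 to 29, is typically done with `kubectl label node`. As a hedged sketch, the Python snippet below only generates those commands for the rack layout used in this deck. The label values (`rackA` and so on) are assumed to match the rack names on the slides, and `label_commands` is a hypothetical helper, not something the talk shows.

```python
# Node-to-rack mapping from the slides; the label values are an assumption.
RACK_OF = {
    "pvcbased-w0": "rackA", "pvcbased-w1": "rackA",
    "pvcbased-w2": "rackB", "pvcbased-w3": "rackB",
    "pvcbased-w4": "rackC", "pvcbased-w5": "rackC",
}


def label_commands(rack_of, key="topology.rook.io/rack"):
    """Build one `kubectl label node` command per node for the given topology key."""
    return [
        f"kubectl label node {node} {key}={rack}"
        for node, rack in rack_of.items()
    ]


for command in label_commands(RACK_OF):
    print(command)
```

The same helper could be pointed at another key from the list, for example `topology.kubernetes.io/zone`, by passing a different `key` argument.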
  30. Configuring failure domains for Rook/CephFS
    [3, 11, 5]
    It is properly distributed!!

  31. References
    ▸ Types of Ceph storage
      ▸ https://docs.ceph.com/docs/mimic/architecture/
    ▸ Japan Rook Meetup #2, "Rook/Ceph upstream最新状況" (Latest status of Rook/Ceph upstream)
      ▸ https://speakerdeck.com/sat/ceph-upstreamzui-xin-zhuang-kuang
    ▸ Persistent Block Device Naming
      ▸ https://wiki.archlinux.org/index.php/Persistent_block_device_naming
    ▸ Failure domains in Rook/Ceph
      ▸ https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md#osd-topology
  32. THANK YOU