Building a Kubernetes GPU Cluster with Rancher (Rancherでつくる KubernetesのGPUクラスタ)

sasaki
September 20, 2019

Transcript

  1. Rancher Meetup in Kyoto: Building a Kubernetes GPU Cluster with Rancher, by Shinya Sasaki

  2. Who? Shinya Sasaki, Head of Infrastructure Engineering at AlpacaJapan Co., Ltd., Osaka, Japan

  3. Previous presentations

  4. None
  5. None
  6. None
  7. Previous presentations

  8. None
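  9. Today's talk is about this part

  10. Installation environment
    • A certain data center (a warehouse?)
    • We are allowed to use a rack, power, and an internet connection
    • On-site staff handle equipment installation, LAN cabling, parts replacement, and so on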
  11. None
  12. None
  13. None
  14. None
  15. None
  16. None
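  17. Hardware
    • Existing GPU servers are being set up again and moved into the cluster one by one
    • More will be added later

  18. Motherboard
    • Apparently no other board could take this many GPU cards

  19. GPU
    • GeForce GTX
    • GeForce RTX

  20. Disk
    • USB-attached SSDs
    • Because they fail all the time

  21. Power management
    • No remote management feature such as IPMI or iLO
    • Power is switched off and on from a smartphone via Meross

  22. Required tasks
    • Build the Rancher environment
    • Establish a setup procedure for GPU nodes
    • Automate the setup

  23. Building the Rancher environment

  24. Building the Rancher environment: omitted

  25. Establishing a setup procedure for GPU nodes

  26. GPU node setup
    • Install the OS
    • Install CUDA
    • Install docker
    • Install nvidia-docker
    • Join the Rancher cluster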
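The setup steps above can be followed by a quick sanity check before a node joins the cluster. The following is a minimal sketch, not part of the talk: the function name `missing_commands` is hypothetical, and the list of commands to check is an assumption about which binaries the steps above leave on the node.

```shell
#!/bin/sh
# Print every command from the argument list that is NOT found on PATH.
# Returns 0 only when all commands are present.
missing_commands() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return "$status"
}

# Example, run on a freshly installed node (command list is an assumption):
#   missing_commands nvidia-smi docker nvidia-container-runtime \
#       || echo "node is not ready to join the cluster"
```

Checking for binaries only shows that the packages are installed; it does not prove that the driver actually loaded, which is what running `nvidia-smi` confirms.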
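  27. OS installation
    • Ubuntu

  28. OS installation
    • Ubuntu LTS (first release tried): does not recognize the SSD

  29. OS installation
    • Ubuntu LTS (first release tried)
    • Ubuntu LTS (second release tried)

  30. OS installation
    • Ubuntu LTS (first release tried)
    • Ubuntu LTS (second release tried): error partway through the installation

  31. OS installation
    • Ubuntu LTS (first release tried)
    • Ubuntu LTS (second release tried)
    • Ubuntu (third release tried)

  32. OS installation
    • Ubuntu LTS (first release tried)
    • Ubuntu LTS (second release tried)
    • Ubuntu (third release tried): installation succeeded

  33. OS installation
    • Ubuntu LTS (first release tried)
    • Ubuntu LTS (second release tried)
    • Ubuntu (third release tried): installation succeeded, but nvidia-docker does not support that release, so install the build for a different release, or...

  34. OS installation
    • Ubuntu LTS (first release tried)
    • Ubuntu LTS (second release tried)
    • Ubuntu (third release tried): for the time being...

  35. Wasn't avoiding exactly this kind of work the reason I became a cloud engineer...?

  36. And just as I got started...
    • A new Ubuntu LTS was released

  37. OS installation
    • A new Ubuntu LTS was released

  38. Wasn't avoiding exactly this kind of work (and so on)

  39. nvidia-smi (Thu Sep ...): NVIDIA-SMI and Driver Version header, followed by four GeForce GTX GPUs, each showing Persistence-M Off, Disp.A Off, Volatile Uncorr. ECC N/A, memory usage in MiB, and Compute M. Default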
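For scripted checks, the table printed by `nvidia-smi` is awkward to parse; its `--query-gpu` and `--format=csv,noheader` options give one line per GPU instead. The sketch below is an illustration rather than something shown in the talk: the helper name `count_gpus_by_model` is hypothetical, and it simply tallies identical model names read from standard input.

```shell
#!/bin/sh
# Tally GPU model names read from stdin (one name per line), for example
# the output of: nvidia-smi --query-gpu=name --format=csv,noheader
count_gpus_by_model() {
    sort | uniq -c | awk '{ n = $1; $1 = ""; sub(/^ +/, ""); print n " x " $0 }'
}

# Example on a GPU node:
#   nvidia-smi --query-gpu=name --format=csv,noheader | count_gpus_by_model
```

A node that reports fewer GPUs than are physically installed is a useful early sign of a seating or driver problem.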
  40. Automating the setup

  41. This! http://kysmo.hatenablog.jp/entry/…

  42. Management server

  43. None
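  44. MAAS was good
    • It was not that difficult
    • It manages everything, DHCP included
    • Hardware information is visible too

  45. But there were problems...
    • poweroff runs during setup
    • The motherboard does not power back on when power is restored
    • Recovery is possible only with the mysterious switch
    • Cannot be handled remotely

  46. The mysterious switch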

  47. None
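  48. In the end, gave up and used preseed
    • Installed DHCP, TFTP, and preseed separately
    • Text-based configuration files
    • With handover to someone else in mind, I would rather not have done it this way...

  49. dhcpd.conf
    host nvxl… {
      hardware ethernet …:xx:xx:xx;
      fixed-address …;
      option host-name "serv…";
    }
    host nvxl… {
      hardware ethernet …:xx:xx:yy;
      fixed-address …;
      option host-name "serv…";
    }
    I did not want to manage IP addresses, but once you consider things like requesting a replacement when a machine fails, fixed IP assignment keyed on MAC addresses is necessary.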
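Since every node needs its own `host` block, the entries can be generated from a small inventory instead of being typed by hand. This is a minimal sketch under stated assumptions: the function names, the whitespace-separated "name mac ip" inventory format, and the sample addresses are all hypothetical, and unlike the slide it reuses the host label as the `host-name` option for simplicity.

```shell
#!/bin/sh
# Emit one dhcpd.conf host block. Arguments: name, MAC address, IP address.
dhcp_host_entry() {
    printf 'host %s {\n' "$1"
    printf '    hardware ethernet %s;\n' "$2"
    printf '    fixed-address %s;\n' "$3"
    printf '    option host-name "%s";\n' "$1"
    printf '}\n'
}

# Read "name mac ip" lines from stdin and emit a host block for each,
# skipping blank lines and lines starting with "#".
dhcp_hosts_from_inventory() {
    while read -r name mac ip; do
        case "$name" in
            ''|'#'*) continue ;;
        esac
        dhcp_host_entry "$name" "$mac" "$ip"
    done
}

# Example (hypothetical inventory file):
#   dhcp_hosts_from_inventory < nodes.txt >> /etc/dhcp/dhcpd.conf
```

Keeping the inventory as the single source of truth also gives on-site staff a list to match MAC addresses against when a machine is swapped.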
  50. None
  51. preseed.cfg
    : (the OS installation settings are omitted)
    in-target /bin/mkdir /home/ubuntu/.ssh ;\
    in-target /bin/chmod 700 /home/ubuntu/.ssh ;\
    in-target /bin/sh -c 'echo "ssh-rsa xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" > /home/ubuntu/.ssh/authorized_keys' ;\
    in-target /bin/chown -R ubuntu:ubuntu /home/ubuntu/.ssh ;\
    in-target /bin/sh -c 'echo "ubuntu ALL = NOPASSWD: ALL" > /etc/sudoers.d/ubuntu' ;\
    in-target /bin/sh -c 'curl -fsSL https://xxxxxxxxxxxxxxxxxxxxxx/install-node.sh | bash'
    d-i finish-install/reboot_in_progress note

  52. preseed.cfg (same listing), callout on the .ssh and authorized_keys lines: make SSH login possible

  53. preseed.cfg (same listing), callout on the curl line: the install script is kept external so that it is easy to manage

  54. preseed.cfg (same listing), callout on the finish-install line: reboot
  55. install-node.sh
    apt-get update
    hostnamectl set-hostname localhost
    apt-get -y install curl apt-transport-https ca-certificates gnupg-agent software-properties-common open-iscsi
    # Docker repository
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
    # NVIDIA driver PPA and nvidia-docker repository
    add-apt-repository -y ppa:graphics-drivers
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu18.04/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    apt-get update
    apt-get install -y vim nvidia-docker2 ubuntu-drivers-common nvidia-cuda-toolkit docker-ce docker-ce-cli containerd.io
    ubuntu-drivers autoinstall
    # Make nvidia the default Docker runtime unless it is already configured
    if ! grep -q "default-runtime" /etc/docker/daemon.json; then
      sed -i -e "2i \ \ \ \ \"default-runtime\": \"nvidia\"," /etc/docker/daemon.json
    fi
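The last step of the script, inserting `"default-runtime": "nvidia"` as line 2 of `/etc/docker/daemon.json`, can be wrapped in a function so that it is safe to run more than once. The sketch below is one possible variant, not the author's script: the function name is hypothetical, it swaps `sed -i` for `awk` plus a temporary file, and it assumes that line 1 of the file is the opening `{`.

```shell
#!/bin/sh
# Make "nvidia" the default runtime in a daemon.json-style file.
# Assumption: line 1 of the file is the opening "{".
# Does nothing if a default-runtime key is already present.
ensure_nvidia_default_runtime() {
    file=$1
    if grep -q '"default-runtime"' "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Print the new key just before the original line 2, then every line.
    awk 'NR == 2 { print "    \"default-runtime\": \"nvidia\"," } { print }' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example:
#   ensure_nvidia_default_runtime /etc/docker/daemon.json
```

Writing the result back with `cat` rather than `mv` keeps the original file's owner and permissions. Docker reads `daemon.json` at startup, so the daemon has to be restarted before the new default runtime takes effect.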
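  56. Setup flow
    • On-site staff connect the machine to the LAN and power it on
    • The automated OS installation starts, then the machine reboots
    • Confirm that the machine is reachable remotely, then manually run the command that joins it to the cluster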
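The "confirm that the machine is reachable" step involves waiting for an unattended install and reboot of unknown duration, so a small retry helper is handy. This is a minimal sketch and not from the talk: the function name `retry`, the host placeholder, and the attempt and delay values are all assumptions.

```shell
#!/bin/sh
# Run a command until it succeeds.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_n=1
    while [ "$retry_n" -le "$retry_max" ]; do
        if "$@"; then
            return 0
        fi
        retry_n=$((retry_n + 1))
        # Sleep only if another attempt is still to come.
        if [ "$retry_n" -le "$retry_max" ]; then
            sleep "$retry_delay"
        fi
    done
    return 1
}

# Example: poll SSH every 10 seconds, up to 60 times (NODE is a placeholder):
#   retry 60 10 ssh -o BatchMode=yes -o ConnectTimeout=5 ubuntu@NODE true
```

Once the helper returns successfully, the cluster join command can be run on the node by hand, as in the flow above.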

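  57. Joining the cluster

  58. Role of the GPU nodes once in operation
    • At first they also served as the k8s management nodes, but they fell over so often that the management nodes were moved to EC2 and the GPU nodes became Worker-only
    • It is quite a hassle when the management nodes fall over...

  59. About this part

  60. Management nodes: EC2

  61. EC2 as Workers

  62. GPU nodes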

  63. None
  64. I hope to talk about this part too, if there is another opportunity!

  65. Thank you very much