ScaleShift-jp-2019-summer

An overview of ScaleShift's features and its integration with Kubernetes clusters

ryo nakamaru

July 17, 2019

Transcript

  1. ScaleShift: a machine learning environment built on on-premises and cloud infrastructure. June, 2019

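  2. ScaleShift is a Docker-based, open-source web client application.

    • Model building phase
      - Fetch machine learning Docker images from NGC or your own repository with one click
      - Launch any of those Docker images as a Jupyter notebook container
    • Model training phase
      - Bake the libraries used during model building into a Docker image and save it to a repository
      - Submit large-scale computation tasks to a Kubernetes cluster or Rescale with just a click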
  3. Basic operation (How does it work?)

  4. Starting ScaleShift: a web server starts up on the local machine.

  5. Installing machine learning software: download from NGC or a private registry with one click.

  6. Building models in Jupyter notebook: containers wrapped with Jupyter start easily. Each container gets a clean environment with its own port and its own workspace.
  7. Wrapping for large-scale computation: dependent libraries and source code are bundled and baked into a single image.

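  8. Submitting computation tasks to an in-house cluster or the cloud: the APIs required by the chosen destination are called. Decide how many resources to use, then submit the task to the cluster.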

  9. Kubernetes integration (Integration with a Kubernetes cluster)

  10. Machine learning and Kubernetes: k8s has become the de facto standard for container orchestration, mainly in the web industry. Container use is also growing in machine learning, and the number of applied cases is increasing.

    - NVIDIA officially announced support [ GTC 2018 Keynote, March 27 ]
    - Mercari ML Ops Night Vol.1 [ Mercari, Inc. / May 23, 2018 ]
      https://mercari.connpass.com/event/85931/presentation/
    - A platform for deploying machine learning to production services with Jupyter alone [ Recruit Lifestyle Co., Ltd. ]
      https://engineer.recruit-lifestyle.co.jp/techblog/2018-10-04-ml-platform/
    - Taking on a machine learning platform with Kubernetes [ Preferred Networks, Inc. / Dec 4, 2018 ]
      https://www.slideshare.net/pfi/kubernetes-125013757
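  11. ScaleShift + Kubernetes: example configuration. Diagram labels: storage, management node, compute nodes, in-house network, NGC, DockerHub, private registry, Kubernetes, research / development team, local machines with ScaleShift installed.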
  12. Step 1. Selecting machine learning software (diagram of the slide 11 configuration): just pick an image in the GUI and the download starts.
  13. Step 2. Model building (diagram of the slide 11 configuration): ScaleShift launches the notebook.
  14. Step 3. Transferring the execution environment and input data (diagram of the slide 11 configuration): ScaleShift performs the necessary transfers internally.
  15. Step 4. Requesting the large-scale computation (diagram of the slide 11 configuration): the computation conditions are sent as a Kubernetes Job.
  16. Step 5. Running the large-scale computation (diagram of the slide 11 configuration).
  17. Step 6. Checking the computation results (diagram of the slide 11 configuration).
  18. Kubernetes settings / task execution screen

  19. ScaleShift settings (Configurations)

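  20. External integrations (per service: integration features and settings)

    • NVIDIA GPU CLOUD
      - List and fetch details of the machine learning Docker images managed by NVIDIA; download images
      - Settings: API key & user settings
    • Private registry
      - List the machine learning Docker images managed in-house; download images
      - Settings: endpoint & user settings
    • AWS
      - Download machine learning Docker images
      - Data exchange between the local file system and S3 (planned)
    • Kubernetes
      - Run large-scale computations on an in-house cluster or in the cloud
      - Settings: kubecfg
    • Rescale
      - Run large-scale computations on the Rescale platform
      - Settings: region & API key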
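  21. Startup options (excerpt): name, description, default value

    - SS_JUPYTER_MINIMUM_PORT: first port number used when dynamically allocating connection ports to containers (default: 30000)
    - SS_LOG_LEVEL: application log output level (default: warn)
    - SS_WORKSPACE_HOST_DIR: host-side directory for storing working data (no default; must be specified)
    - SS_NGC_REGISTRY_ENDPOINT: NGC endpoint (default: https://registry.nvidia.com)
    - SS_NGC_REGISTRY_USER_NAME: NGC user name (default: $oauthtoken)
    - SS_RESCALE_SINGULARITY_VERSION: Singularity runtime version on Rescale (default: 3.2.0)
    - SS_RESCALE_JOB_WALLTIME: maximum task execution time on Rescale (default: 3600)

    Write these settings in docker-compose.yml to start ScaleShift.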
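
As a rough illustration of how the startup options on slide 21 could be resolved, the following Python sketch merges environment variables over the defaults listed there. The `load_options` helper and the `DEFAULTS` table are hypothetical and not part of ScaleShift; only the option names and default values come from the slide.

```python
import os

# Defaults as listed on slide 21. None marks an option that has no default
# and therefore must be supplied by the user.
DEFAULTS = {
    "SS_JUPYTER_MINIMUM_PORT": "30000",
    "SS_LOG_LEVEL": "warn",
    "SS_WORKSPACE_HOST_DIR": None,
    "SS_NGC_REGISTRY_ENDPOINT": "https://registry.nvidia.com",
    "SS_NGC_REGISTRY_USER_NAME": "$oauthtoken",
    "SS_RESCALE_SINGULARITY_VERSION": "3.2.0",
    "SS_RESCALE_JOB_WALLTIME": "3600",
}


def load_options(env=None):
    """Hypothetical helper: return each option's value from `env`,
    falling back to the slide's default when the variable is unset."""
    env = os.environ if env is None else env
    options = {}
    for name, default in DEFAULTS.items():
        value = env.get(name, default)
        if value is None:
            # Only reached for options without a default.
            raise ValueError(f"{name} must be set")
        options[name] = value
    return options


if __name__ == "__main__":
    # Example: only the required option is provided explicitly.
    for key, val in load_options({"SS_WORKSPACE_HOST_DIR": "/tmp/work"}).items():
        print(f"{key}={val}")
```

Values are kept as strings here because that is how environment variables arrive; converting the port or walltime to integers would be a separate step.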
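
Slide 15 says that ScaleShift sends the computation conditions as a Kubernetes Job. Below is a minimal sketch of what such a manifest might look like, built as a plain Python dictionary. The `build_job_manifest` helper, the job name, image, command, and resource figures are all assumptions for illustration, and the manifest ScaleShift actually generates may differ. The field names follow the standard `batch/v1` Job schema.

```python
import json


def build_job_manifest(name, image, command, cpu="1", memory="4Gi", gpus=0):
    """Hypothetical helper: build a batch/v1 Job manifest for one container."""
    limits = {"cpu": cpu, "memory": memory}
    if gpus > 0:
        # Extended resource name exposed by the NVIDIA device plugin.
        limits["nvidia.com/gpu"] = gpus
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed run
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "main",
                            "image": image,
                            "command": command,
                            "resources": {"limits": limits},
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Image and command are placeholders, not values taken from the slides.
    manifest = build_job_manifest(
        name="train-demo",
        image="registry.example.com/ml/train:latest",
        command=["python", "train.py"],
        gpus=1,
    )
    print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f -` accepts JSON as well as YAML, the printed manifest could be piped to it directly for a manual test against a cluster.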