tesorflow-v1.0-on-ec2

ryo nakamaru
February 20, 2017

Slides for MaruLabo × JAWS-UG AI Chapter #2.

Transcript

  1. TensorFlow v1.0 with GPU on AWS: MaruLabo × JAWS-UG AI #2 @ 2017.02.20
  2. @pottava SUPINF Inc.

  3. Today is about TensorFlow, but there is MXNet too. http://qiita.com/pottava/items/0d40747287ff31b8db77

  4. AWS Batch: expectations are rising for deep learning training as well! https://jawsug-cli.doorkeeper.jp/events/52026

  5. March 11: please come and join us!! http://jawsdays2017.jaws-ug.jp/

  6. Why use a GPU in the first place?

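  7. (Today, at least) because this is a hands-on session
    - We want deep learning training to finish within a fixed amount of time
    - GPUs are much faster than CPUs at matrix computation
    - Most of training is matrix computation
    - Using a GPU raises the odds of getting through the whole hands-on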
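
To make the point that most of training is matrix computation concrete, here is a minimal plain-Python sketch of the forward pass of one fully connected layer. The function name `dense_forward` and the tiny sizes are illustrative assumptions, not taken from the slides. An `m x k` input batch times a `k x n` weight matrix needs `m * k * n` multiply-adds, which is the kind of work a GPU parallelizes well.

```python
def dense_forward(x, w):
    """Multiply an input batch x (m x k) by a weight matrix w (k x n).

    Returns the output matrix and the number of multiply-adds performed.
    """
    m, k, n = len(x), len(w), len(w[0])
    out = [[0.0] * n for _ in range(m)]
    multiply_adds = 0
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += x[i][p] * w[p][j]
                multiply_adds += 1
            out[i][j] = acc
    return out, multiply_adds


# Hypothetical toy data: 2 samples with 3 features, mapped to 2 units.
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

y, multiply_adds = dense_forward(x, w)
print(y)              # [[4.0, 5.0], [10.0, 11.0]]
print(multiply_adds)  # 12, i.e. m * k * n = 2 * 3 * 2
```

Real layers have thousands of rows and columns, so the triple loop above grows quickly; every inner iteration is independent of the others apart from the final sum, which is why the work maps well onto many GPU cores.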

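  8. In the cloud, might the GPU actually be cheaper??
    - Cloud services are generally billed by time
    - A job that takes an hour and a half on a CPU takes 45 minutes on a GPU. Which is cheaper?
    - Choose according to your program and its scale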
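
The question on this slide comes down to simple arithmetic. The sketch below uses the run times from the slide (1.5 hours on CPU, 45 minutes on GPU); the hourly prices are hypothetical placeholders, not real AWS prices, and billing is assumed to be exactly proportional to run time.

```python
def job_cost(hours, price_per_hour):
    """Cost of one job when billing is proportional to run time."""
    return hours * price_per_hour


CPU_HOURS = 1.5   # from the slide: an hour and a half on CPU
GPU_HOURS = 0.75  # from the slide: 45 minutes on GPU

# Hypothetical hourly prices, chosen only for illustration.
cpu_price = 0.50
gpu_price = 0.75

cpu_cost = job_cost(CPU_HOURS, cpu_price)
gpu_cost = job_cost(GPU_HOURS, gpu_price)

# The GPU wins whenever its hourly price is below
# (CPU_HOURS / GPU_HOURS) times the CPU price.
break_even_ratio = CPU_HOURS / GPU_HOURS

print(cpu_cost)          # 0.75
print(gpu_cost)          # 0.5625
print(break_even_ratio)  # 2.0
```

With these run times the GPU instance is cheaper as long as it costs less than twice the CPU instance per hour. A different speedup shifts that break-even point, which is why the choice depends on the program and its scale.
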
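  9. Hidden costs
    - "But isn't writing code for a GPU a lot of work..?"
    - With TensorFlow and similar frameworks you can write code without thinking about the GPU
    - A slow trial-and-error loop is a quiet but real source of stress
    - Faster is simply better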

  10. Now then

  11. Topics
    1. A review of AWS GPU instances & NVIDIA products
    2. Three ways to use TensorFlow v1.0 on the g2 family
    3. How to use them cheaply
  12. 1. A review of AWS GPU instances & NVIDIA products

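  13. GPU instances: AWS has two families (current generation)
    - g2 family: NVIDIA GRID K520
      - 1,536 CUDA cores per GPU; two GPUs make up one K520
      - The GPUs available on g2 are originally meant for graphics and gaming
    - p2 family: NVIDIA Tesla K80
      - Up to 2.91 TFLOPS double precision, up to 8.74 TFLOPS single precision
      - 2,496 CUDA cores per GPU; two GPUs make up one K80
      - The p2 GPUs are meant for general-purpose computing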
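  14. Running a GPU on EC2
    - Install the GPU driver and you are good to go!
    - That said, driving it directly is another matter.. In practice you also need the CUDA Toolkit.
    - TensorFlow also uses cuDNN internally, so you need that as well.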
  15. Drivers?
    - NVIDIA distributes drivers per GPU: the GRID K520 driver for g2, the Tesla K80 driver for p2
    - Driver version numbers run in one shared sequence. Example: the latest version cannot recognize the g2 family's GPU
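  16. CUDA?
    - Pronounced "koo-da"
    - NVIDIA's integrated C-language development environment for its GPUs
    - A set of compilers, libraries, and handy tools
    - TensorFlow and others also drive the GPU through CUDA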
  17. CUDA and NVIDIA driver compatibility: using a newer CUDA requires a reasonably recent driver. https://github.com/NVIDIA/nvidia-docker/wiki/CUDA#requirements

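  18. Three ways to install
    - Install the driver, then install the CUDA Toolkit
    - Use the CUDA Runfile installer to install the driver along with CUDA
    - Install only the driver and use Docker for everything above it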
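  19. Caveats
    - Drivers are provided per GPU; CUDA is provided per OS
    - With the Runfile installer, watch out for compatibility with your GPU
    - If you build one AMI for both the g2 and p2 families, use a driver that works with both the GRID K520 and the Tesla K80 and is as new as possible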
  20. Looks hard

  21. It is hard

  22. Isn't there an easier way?

  23. For those who want an easier path
    - AWS has machine images called AMIs, and some already include both the NVIDIA driver and CUDA!
    - Official NVIDIA AMI → go to the AWS Marketplace
    - Official AWS AMI → search for "Deep Learning AMI"
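  24. Hints for using AMIs
    - Other clouds do not yet offer official images with CUDA preinstalled
    - Both the NVIDIA and the AWS AMIs currently ship CUDA 7.5
    - Making your own AMI public is tricky..
      - NVIDIA's license terms apply
      - One option is the driver-only NVIDIA AMI + Docker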
  25. 2. Three ways to use TensorFlow v1.0 on the g2 family

  26. How to get TensorFlow in the first place
    - pip install
    - ./configure followed by pip install (build it yourself)
    - nvidia-docker run
  27. TensorFlow v1.0 dependencies: the GPU build of TensorFlow depends on CUDA and cuDNN. Since v0.12 it has been built against CUDA 8.0, so it needs the 8.0-series libraries (the whole Toolkit does not have to be 8.0) and NVIDIA driver 367.48 or later.
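  28. Which means..

  29. You need one of the following
    - Set up the server so that it satisfies the dependencies
    - Build TF yourself targeting CUDA 7.5
    - Install only a driver that meets the requirement and launch with Docker

  30. g2 family + CUDA 8.0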

  31. Delicate driver versions: the latest driver that supports the g2 family's GRID K520 is 367.57. Satisfying what the prebuilt TensorFlow binaries require, CUDA 8.0 + NVIDIA Driver (>= 367.48), is surprisingly tricky. Ubuntu 16.04 + the following Runfile works: https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_linux-run
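
The two constraints from the slides (prebuilt TensorFlow binaries need driver 367.48 or later, while the newest driver supporting the GRID K520 is 367.57) leave a narrow window. The sketch below checks whether a driver version falls inside it. The helper names and the sample version strings in the loop are illustrative assumptions, and the upper bound reflects the slides as of February 2017.

```python
def parse_version(text):
    """Turn a driver version string such as '367.57' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Both bounds are taken from the slides.
TF_MIN_DRIVER = parse_version("367.48")    # required by prebuilt TensorFlow v1.0
K520_MAX_DRIVER = parse_version("367.57")  # latest driver supporting GRID K520


def usable_on_g2(driver_version):
    """True if the driver satisfies TensorFlow and still supports the K520."""
    version = parse_version(driver_version)
    return TF_MIN_DRIVER <= version <= K520_MAX_DRIVER


# Sample inputs chosen only to exercise each side of the window.
for candidate in ["352.99", "367.48", "367.57", "375.26"]:
    print(candidate, usable_on_g2(candidate))
```

Comparing integer tuples rather than raw strings or floats avoids surprises such as "367.5" sorting above "367.48".
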
  32. g2 family + CUDA 7.5

  33. If you build it yourself: the compute capability is 3.0 for the g2 family's K520 and 3.7 for the p2 family's K80. You specify it when building TensorFlow. https://en.wikipedia.org/wiki/CUDA#GPUs_supported
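
As a rough sketch of how the compute capability feeds into a build, the snippet below maps instance families to the values on this slide and joins them into a comma-separated list. The environment variable names `TF_NEED_CUDA` and `TF_CUDA_COMPUTE_CAPABILITIES` are, to my understanding, what TensorFlow's `./configure` reads, but treat them as an assumption and check the build documentation for your version; the helper function is hypothetical.

```python
# Compute capabilities as stated on the slide.
COMPUTE_CAPABILITY = {
    "g2": "3.0",  # GRID K520
    "p2": "3.7",  # Tesla K80
}


def capabilities_for(families):
    """Comma-separated, de-duplicated capability list for the given families."""
    unique = {COMPUTE_CAPABILITY[family] for family in families}
    return ",".join(sorted(unique, key=float))


# Assumed variable names for TensorFlow's ./configure (verify before use).
build_env = {
    "TF_NEED_CUDA": "1",
    "TF_CUDA_COMPUTE_CAPABILITIES": capabilities_for(["p2", "g2"]),
}

print(capabilities_for(["g2"]))                   # 3.0
print(build_env["TF_CUDA_COMPUTE_CAPABILITIES"])  # 3.0,3.7
```

Listing both values is one way to produce a single binary usable on either family, at the price of a longer build.
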
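  34. If you use Docker: install NVIDIA driver 367.57, which supports the K520, set up nvidia-docker, and you are done! Alternatively, use the Ubuntu edition of the official AWS Deep Learning AMI and you do not even need to install the driver.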
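
A minimal sketch of the Docker route, assuming the driver and nvidia-docker are already set up as described. The image tag `tensorflow/tensorflow:1.0.0-gpu` is an assumption about what is published on Docker Hub, and the helper `build_command` is hypothetical. The script only prints the command unless `RUN_IT` is switched on and `nvidia-docker` is found on the PATH.

```python
import shutil
import subprocess

IMAGE = "tensorflow/tensorflow:1.0.0-gpu"  # assumed image tag, verify before use
RUN_IT = False                             # set to True on a prepared GPU host

# Python code executed inside the container: list the devices TensorFlow sees.
CHECK_SCRIPT = (
    "from tensorflow.python.client import device_lib; "
    "print(device_lib.list_local_devices())"
)


def build_command(image, script):
    """Build an nvidia-docker invocation that runs a Python one-liner."""
    return ["nvidia-docker", "run", "--rm", image, "python", "-c", script]


cmd = build_command(IMAGE, CHECK_SCRIPT)
print(" ".join(cmd[:4]))

if RUN_IT and shutil.which("nvidia-docker"):
    # A GPU device entry in the output means the container can see the GPU.
    subprocess.run(cmd, check=True)
else:
    print("command printed only; nothing was executed")
```

If the device list printed inside the container includes a GPU entry, the driver on the host and the CUDA libraries in the image are working together.
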
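  35. What about the p2 family?

  36. Basically the same. On top of that, server setup is considerably easier with a Tesla K80, so it is omitted here.

  37. 3. How to use them cheaply

  38. Spot Instances: if you use GPUs on AWS, you will definitely want to use these. There is plenty of material available in Japanese, so please look into them.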

  39. SUPINF Inc.: turning ideas into reality!

  40. http://prtimes.jp/main/html/rd/p/000000007.000007768.html Comfy for Docker: we help projects adopt Docker, support development, and take over operations and monitoring. (GCP / Azure are of course supported too..) https://www.supinf.co.jp/service/dockersupport/
  41. Feel free to contact us here.. <Thank you !! https://www.supinf.co.jp/service/dockersupport/