GPU on OpenStack: Best Practices for a GPU Internal Cloud

Slides from OpenStack Days Tokyo 2017.

masafumi_ohta

July 27, 2017

Transcript

  1. GPU on OpenStack: Best Practices for a GPU Internal Cloud
    Presented by Masafumi Ohta (@masafumiohta)

  2. Masafumi Ohta
    A "Stacker" looking into GPGPU use.
    Volunteer for the Raspberry Pi Foundation :)

  3. Presentation updated
    This presentation has been updated for OpenStack Days Tokyo 2017.
    It includes feedback from the session at #LC3 China in Beijing, 2017.

  4. This project is…
    Organized by VTJ, bringing together me, NVIDIA (ask), Dell EMC, NESIC, and
    other companies interested in GPU use on OpenStack environments for our customers.

  5. Now evaluating…
    Tesla M60 + Dell EMC PowerEdge C4130 + OpenStack, used to evaluate GPU on an
    OpenStack environment.
    Thanks to both for helping out and providing the servers and cards.

  7. Introduction: Why GPU on OpenStack?

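  8. Demand for specialized uses of OpenStack
    Demand for specialized uses of OpenStack is growing:
    Hadoop (Sahara), HPC, and so on.
    For most of these there is no consolidated documentation,
    so the only option is to search the web…
    This is the so-called "documentation lost" state.
    These topics should be consolidated in the OpenStack Docs.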

  9. What is "GPU on OpenStack"?
    How do GPUs work in an OpenStack environment?

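  10. A recap of the GPU trend
    • Workloads use many GPU cores
    • For some computations, using many MPU cores is effective even if each individual MPU unit is small and slow
    • Each server can be made high-performance and compact
    • Low power consumption matters to HPC end users
    • It becomes easier to own many power-efficient, high-performance servers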

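  11. How does the GPU work?
    • At present the options are PCI passthrough or NVIDIA (GPU) Docker
    • "PCI passthrough" depends on the underlying bare-metal virtualization environment (the hypervisor)
    • vSphere and Xen can each split GPU cores in arbitrary units and assign them to VMs
    • KVM, the standard bare-metal virtualization environment for OpenStack, cannot split cores, so whole GPU units are assigned to VMs
    • NVIDIA (GPU) Docker, a deployment on the container virtualization in common use today, shares one GPU unit among Docker containers but does not partition it explicitly
    • Windows does not run as a "docker vm"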

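  12. What "GPU on OpenStack" means
    • Instant HPC use
    • Run a few computations, then destroy the VM itself.
    • Try out an HPC grid on a few VMs as a quick experiment, then destroy them as soon as it is done.
    • GPU use as a GPU internal cloud
    • Mainly in manufacturing, some systems cannot be moved out to a public cloud for information-management reasons.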

  13. Setup: GPU on OpenStack
    What GPU mechanisms are at work in an OpenStack environment?

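  14. Ways to run GPUs on OpenStack (at present)
    • PCI passthrough
      • Attaches the PCI device directly
      • Depends on the ComputeNode's hypervisor, not on OpenStack
      • GPU cores can be split with Xen; KVM cannot split cores on either NVIDIA or AMD
      • Intel GPU core splitting via Intel GVT-g (Xen) / GVT-d (KVM)
    • Containers
      • Use NVIDIA Docker
      • Multiple containers can use a GPU, but GPU cores cannot be split explicitly
      • GPU use depends on the task
      • Managed in combination with Kubernetes, Mesos, Docker Swarm, etc.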

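  15. How does PCI passthrough work on a VM?
    • PCI devices are attached directly, through the Linux host
    • The device must be detached from the physical host
    • Unlike (NVIDIA) Docker, the device is detached from the host, so monitoring and management have to be done in each VM
    • This depends on the Linux bare-metal environment, not on OpenStack
    • On KVM, one GPU device goes to one VM
    • Unlike Docker/Xen/vSphere, a GPU can be neither split nor shared
    • This is a limitation of KVM, not of OpenStack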

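  16. PCI Passthrough on OpenStack
    • Red Hat supports it officially
    • However, it does not seem to be promoted much...
    • Ubuntu does not even have decent documentation…
    • The only option is to search the web (that's Ubuntu for you.. orz)
    • At present NVIDIA Japan, Dell EMC, VTJ, and I plan to re-verify everything in detail and compile documentation for the OpenStack community (companies that agree with this aim and would like to take part are welcome)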

  17. [Figure 1: How OpenStack implements GPU passthrough (KVM case).
    Diagram labels: ControllerNode (Linux OS) with Nova API, Nova Scheduler, Nova Conductor;
    ComputeNode (Linux OS for KVM hypervisor) with Nova Compute, VMM/KVM, IOMMU/VT-d, pci-stub/vfio-pci, GPU driver, and a GPU card on PCI Express x16;
    VM (Linux/Win OS) with GPU driver and app.]

  18. Step 1: First check the state of the GPUs on the host
    • Start by checking the GPUs with lspci -nn | grep -i nvidia
    lspci -nn | grep -i nvidia
    88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
    88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
    • All GPU units must be passed through
    • Not only the GPU itself but every device recognized as part of the GPU unit must be passed through
    • Otherwise the VM will not run (the passthrough is incomplete)

  19. Check that the GPU ports are passed through
    • Note that on cards such as Quadro, HDMI provides an audio port as well as a video port
    • Everything that lspci lists must be passed through.
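
The vendor:device IDs that lspci prints in brackets are reused in the later steps (pci_stub ids=, pci-stub/new_id, and nova.conf). A minimal sketch of collecting them in one pass, assuming a grep that supports -o (GNU or BSD); nvidia_pci_ids is a hypothetical helper name and the sample text is copied from the slide:

```shell
#!/bin/sh
# Sketch: collect the [vendor:device] IDs of every NVIDIA PCI function.
# nvidia_pci_ids is a hypothetical helper name, not part of any tool.
# Reads `lspci -nn` style text on stdin, prints unique IDs comma-separated.
nvidia_pci_ids() {
    grep -i nvidia \
        | grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}]' \
        | tr -d '[]' \
        | awk '!seen[$0]++' \
        | paste -sd, -
}

# Sample input copied from the lspci output on the slide.
sample='88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)'

printf '%s\n' "$sample" | nvidia_pci_ids   # prints 10de:11b4,10de:0e0a
```

On a real host the input would come from `lspci -nn | nvidia_pci_ids`. Both the VGA and the audio function appear in the result, which matches the point above that every function of the card has to be passed through.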

  20. Step 2: IOMMU setup
    • The IOMMU (Input/Output Memory Management Unit) is required to use physical devices in a virtualized system
    • VT-d must of course be enabled (it is ON by default in EFI/BIOS)
    • intel_iommu and vfio_iommu_type1.allow_unsafe_interrupts must be set in /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"
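
On Ubuntu, a change to /etc/default/grub takes effect only after `sudo update-grub` and a reboot. A small sketch for confirming afterwards that the running kernel was actually booted with the option; iommu_requested is a hypothetical helper name, and the check only looks at the command line, not at whether VT-d is enabled in firmware:

```shell
#!/bin/sh
# Sketch: check whether a kernel command line requests the Intel IOMMU.
# iommu_requested is a hypothetical helper name.
# $1 = kernel command line text; returns 0 if intel_iommu=on is present.
iommu_requested() {
    case " $1 " in
        *" intel_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

# On a real host, pass the running kernel's command line.
if iommu_requested "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "intel_iommu=on is set"
else
    echo "intel_iommu=on is NOT set"
fi
```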

  21. Step 3: pci-stub and VFIO
    • pci-stub prevents the Linux host from using the physical device
    • VFIO (Virtual Function I/O) does the same job as pci-stub; supported in kernel 4.1 and later
    • When the device is unused, VFIO puts it into the D3 state (low-power mode)
    • These are not loaded by default, so edit /etc/modules and add them together with the related components (kvm, kvm_intel)
    pci_stub
    vfio
    vfio_iommu_type1
    vfio_pci
    kvm
    kvm_intel

  22. Step 4-1: Blacklist (1)
    • Prevent the GPU device from being claimed in the initramfs
    • Add the IDs to /etc/initramfs-tools/modules and rebuild the initramfs (on Ubuntu)
    echo 'pci_stub ids=10de:11b4,10de:0e0a' | sudo tee -a /etc/initramfs-tools/modules
    sudo update-initramfs -u && sudo reboot

  23. Step 4-2: Blacklist (2)
    • Prevent the GPU device from being claimed at boot.
    • Add the following to /etc/modprobe.d/blacklist.conf:
    blacklist nvidia
    blacklist nvidia-uvm
    • The open-source driver must be blacklisted as well
    blacklist nouveau

  24. Step 5: Unbind from the physical host
    • Work through pci-stub in order to detach the device from the physical host and bind it for the VM
    1. Enter the PCI ID (vendor and device) of the device to pass through into pci-stub/new_id.
    2. Detach the corresponding PCI addresses from the physical host.
    3. Bind those PCI addresses to pci-stub.
    echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
    echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
    echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
    echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
    echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
    echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind
    • Check that the following "claimed" message appears in the physical host's dmesg for the boot process.
    pci-stub 0000:88:00.1: claimed by stub
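
Besides the dmesg message, the binding can be checked by reading the `driver` symlink that sysfs exposes for each PCI function. A minimal sketch, assuming `readlink` is available; bound_driver is a hypothetical helper name, and its second argument exists only so the function can be tried against a fake directory tree:

```shell
#!/bin/sh
# Sketch: report which kernel driver a PCI function is bound to.
# bound_driver is a hypothetical helper; the second argument (sysfs root)
# exists only so the function can be exercised without real hardware.
# $1 = full PCI address, e.g. 0000:88:00.0; $2 = sysfs root (default /sys)
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Addresses from the slide; after Step 5 both should report pci-stub.
for addr in 0000:88:00.0 0000:88:00.1; do
    echo "$addr -> $(bound_driver "$addr")"
done
```

A result of `none` means no driver currently holds the function, which is the expected state between the unbind and bind writes.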

  25. [Figure 2: Detaching the GPU from the physical host and connecting it to the virtual side.
    Diagram labels: physical device; pci-stub (used for virtualization); GPU unit (all devices).
    "Unbind the GPU from the physical device and bind it to the virtual device"]
    echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
    echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
    echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
    echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
    echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
    echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind

  26. [Figure 3: The GPU blacklisting process at boot (Ubuntu case).
    Stages shown, each tagged IOMMU or BLACKLIST in the figure:
    UEFI/BIOS: VT-d
    GRUB: /etc/default/grub
    ramfs: /etc/initramfs-tools/modules
    modules: /etc/modules
    modprobe: /etc/modprobe.d/blacklist.conf
    pci-stub: /sys/bus/pci/drivers/pci-stub/ and /sys/bus/pci/devices/$(Identifier)/driver/unbind]

  27. Adding a GPU (1)
    • Check the lspci output: two sets of related PCI identifiers should be visible (the identifier numbers depend on the system in use)
    lspci -nn | grep -i nvidia
    88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
    88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
    84:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
    84:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
    • Register the additional GPU to pass through with pci-stub.
    echo 0000:84:00.0 > /sys/bus/pci/devices/0000:84:00.0/driver/unbind
    echo 0000:84:00.1 > /sys/bus/pci/devices/0000:84:00.1/driver/unbind
    echo 0000:84:00.0 > /sys/bus/pci/drivers/pci-stub/bind
    echo 0000:84:00.1 > /sys/bus/pci/drivers/pci-stub/bind

  28. Adding a GPU (2)
    • Verify in the VM that the added GPU works
    ubuntu@guestos$ lspci -nn | grep -i nvidia
    00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
    00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
    • Some applications may require the GPUs to be identical, for example when both are used at the same time.
    ./nbody -benchmark -numdevices=2 -numbodies=65536

  29. OpenStack: configuring nova-api
    • Edit /etc/nova/nova.conf on the ControllerNode and restart nova-api
    • Set the PCI device information and an alias name in pci_alias
    pci_alias={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
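
The value of pci_alias is JSON, so it needs plain ASCII double quotes; typographic quotes pasted from slides will not parse. One way to avoid hand-typing the line is to generate it. A minimal sketch; pci_alias_line is a hypothetical helper name, and the alias and IDs are the ones used on the slide:

```shell
#!/bin/sh
# Sketch: emit a pci_alias line for nova.conf with plain ASCII quotes.
# pci_alias_line is a hypothetical helper name.
# $1 = alias name, $2 = vendor ID, $3 = product ID
pci_alias_line() {
    printf 'pci_alias={"name":"%s","vendor_id":"%s","product_id":"%s"}\n' "$1" "$2" "$3"
}

pci_alias_line K4200 10de 11b4
# prints pci_alias={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
```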

  30. OpenStack: configuring nova-compute
    • Edit /etc/nova/nova.conf on the ComputeNode and restart nova-compute.
    • Set the PCI device information and alias name in pci_passthrough_whitelist
    pci_passthrough_whitelist={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
    * In this case, every device whose vendor ID and product ID match is passed through to VMs.
    • Add the same PCI device information and alias name to pci_alias as well (* Newton and later)
    pci_alias={"name":"K4200","vendor_id":"10de","product_id":"11b4"}

  31. OpenStack: configuring nova-scheduler
    • Edit /etc/nova/nova.conf on the ControllerNode and restart nova-scheduler.
    • To make PciPassthroughFilter usable, append PciPassthroughFilter to scheduler_default_filters
    • Likewise, list PciPassthroughFilter in scheduler_available_filters
    scheduler_available_filters=nova.scheduler.filters.all_filters
    scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
    scheduler_default_filters=DifferentHostFilter,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,AggregateInstanceExtraSpecsFilter,PciPassthroughFilter
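
The filter list is a single comma-separated value, so the new filter has to be appended to whatever list is already configured rather than replacing it. A small sketch of an idempotent append; add_pci_filter is a hypothetical helper name and works only on the list text, not on nova.conf itself:

```shell
#!/bin/sh
# Sketch: append PciPassthroughFilter to a comma-separated filter list
# unless it is already present. add_pci_filter is a hypothetical helper.
# $1 = current value of scheduler_default_filters (may be empty)
add_pci_filter() {
    if [ -z "$1" ]; then
        echo "PciPassthroughFilter"
        return 0
    fi
    case ",$1," in
        *,PciPassthroughFilter,*) printf '%s\n' "$1" ;;
        *) printf '%s,PciPassthroughFilter\n' "$1" ;;
    esac
}

add_pci_filter "RetryFilter,ComputeFilter"
# prints RetryFilter,ComputeFilter,PciPassthroughFilter
```

Running the function on its own output returns the list unchanged, so it is safe to apply more than once.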

  32. [Figure 4: How Nova works with a ComputeNode that has GPU (PCI) passthrough.
    Same layout as Figure 1, annotated with the settings involved:
    • Nova API (pci_alias): makes the PCI device available so that requests to use it can be sent
    • Nova Scheduler (scheduler_default_filters, scheduler_available_filters): uses PciPassthroughFilter to select a ComputeNode with GPU passthrough
    • Nova Compute (pci_alias, pci_passthrough_whitelist): spawns the GPU passthrough instance]

  33. OpenStack: setting the flavor key
    • Set a flavor key that adds the PCI passthrough setting to the flavor so that GPU instances can use it
    • pci_passthrough:alias=$(pci_alias_name):$(number of GPUs to use)
    nova flavor-key $flavor_name set "pci_passthrough:alias"="K4200:$(number_of_gpus)"
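
The flavor property is simply the alias name and a GPU count joined by a colon. A minimal sketch that builds the string and rejects a count that is not a positive integer; gpu_flavor_spec is a hypothetical helper name, and K4200 is the alias defined in nova.conf above:

```shell
#!/bin/sh
# Sketch: build the pci_passthrough:alias extra spec for a flavor.
# gpu_flavor_spec is a hypothetical helper name.
# $1 = alias name from pci_alias, $2 = number of GPUs (positive integer)
gpu_flavor_spec() {
    case "$2" in
        ''|*[!0-9]*|0*)
            echo "GPU count must be a positive integer" >&2
            return 1
            ;;
    esac
    printf 'pci_passthrough:alias=%s:%s\n' "$1" "$2"
}

gpu_flavor_spec K4200 2   # prints pci_passthrough:alias=K4200:2
```

With the command from the slide this could be used as `nova flavor-key "$flavor_name" set "$(gpu_flavor_spec K4200 2)"`, where `$flavor_name` is whichever flavor should receive the GPUs.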

  34. Known issues
    Problems to watch out for when using GPU on OpenStack

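  35. Cloud image issues
    • Cloud images are too small for GPU use and must be resized with qemu-img
    • The CUDA driver needs Perl packages (dev packages) at install time
    • They are needed whether the driver comes as a .deb or an .rpm package, because the CUDA package itself is not a binary package: it builds binaries from source and runs make during package installation
    • According to NVIDIA, a fix is planned for the next CUDA driver release
    • It was supposed to be fixed in CUDA 7.6 or later, but it is still not fixed.. orz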

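  36. Using Windows as VDI
    • CUDA on Windows is faster than expected, but the display stutters
    • Likely causes include disk speed, the network, and various other factors. Ephemeral mode, very fast SSD/NVMe storage, and a network of 10G or more will probably be needed
    • I have not tried any mitigation yet, but plan to run tests soon to find out why it happens.
    • Because VMs basically handle memory and network transfers through context switches, heavy CUDA workloads may stutter.
    • See the demo video for details…
    • Windows behavior with GPU on OpenStack needs more investigation.. I need more time…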

  37. Using Windows as VDI (feedback)
    • Thanks for the feedback on my session at LC3 China!
    • The Windows-related issues below still need to be checked and investigated; I will update later.
    • Windows 10 on KVM issue
    • http://bart.vanhauwaert.org/hints/installing-win10-on-KVM.html
    • Windows 10 deploys successfully on my KVM host, but the same deployment still fails on my OpenStack. I need to investigate in more detail what happened.
    • NVIDIA driver issue (version 337.88 or later)
    • https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#.22Error_43:_Driver_failed_to_load.22_on_Nvidia_GPUs_passed_to_Windows_VMs
    • "Nvidia drivers on Windows check if an hypervisor is running and fail if it detects one, which results in an Error 43 in the Windows device manager." I have not seen this issue on my Windows 7 VMs, so I should check in more detail.
    • Related links to the libvirt code in Nova for adding the driver setting; these should be checked.
    • https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L2025-L2036
    • https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L4262-L4264

  38. Live migration issues
    • VMs that use PCI passthrough cannot be live-migrated: the PCI connection on the pre-migration host is never released.
    • Workaround: delete the stale connections from nova.pci_devices in the MySQL DB, as in the rows below, and reboot the old host.
    • Rebooting affects other hosts, so it is not acceptable. Restarting related processes such as the MySQL DB is another way through, but that also affects other hosts and is equally unacceptable.
    | 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 45 | 21 | 0000:84:00.0 | 11b4 | 10de | type-PCI | pci_0000_84_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
    | 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 48 | 21 | 0000:88:00.0 | 11b4 | 10de | type-PCI | pci_0000_88_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host

  39. Demo (video)
    See how GPUs run in an OpenStack environment!

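  40. Launching an instance and verifying that the GPU works
    • Uses RHEL OpenStack
    • Launch an instance that has passthrough configured
    • Launch an Ubuntu instance and run lspci and deviceQuery to verify that the GPU works
    • Check GPU behavior on Windows with a graphics benchmark. Because the connection goes over RDS to a machine at a remote site, the display stutters, but confirm that the benchmark results, at least, are good.
    • The demo video uses Splashtop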

  41. Related links
    • Attaching physical PCI devices to guests:
    https://docs.openstack.org/admin-guide/compute-pci-passthrough.html
    • Container as a Service on GPU Cloud - Our Decision Among K8s, Mesos, Docker Swarm, and OpenStack Zun:
    https://www.slideshare.net/secret/AiUdO4dLxNTkfI
    • PCI passthrough via OVMF:
    https://goo.gl/icq9mV

  42. Special thanks to:
    • GPU on OpenStack project members
    VirtualTech Japan
    NVIDIA
    DellEMC
    NEC Networks & System Integration
    • @zgock999 at Tokaido-LUG, Nagoya, Japan
    For teaching me some hints on how to use GPGPU on a VM!
    • Matthew Treinish of IBM, who attended my session at LC3 China, pointed out some issues, and gave feedback!
    • Our customers, for giving us the chance to evaluate!

  43. Presented by Masafumi Ohta
    Twitter: @masafumiohta  mailto:masafumi@pid0.org
    Thanks very much for coming to my session!