GPU on OpenStack (Japanese edition)
masafumi_ohta
May 15, 2017
Technology
I made a Japanese version of the deck, with a few corrections.
Transcript
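GPU on OpenStack
  Masafumi Ohta @masafumiohta
Who am I?
  I work at an SIer (systems integrator).
  I have taken part in, and advised on, several OpenStack PoCs.
  I take on OpenStack proposals and consultations.
  I also read and patch OpenStack code...
  More on the web: https://jp.linkedin.com/in/ohtamasafumi
Introduction
  Special-purpose uses of OpenStack, such as Sahara (Hadoop on OpenStack), are mostly not written up in one place, so you end up searching the web.
  This is the so-called "documentation lost" problem.
  I would like to see this material on docs.openstack.org.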
What is GPU on OpenStack?
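GPU trends
  GPUs are many-core processors.
  Each MPU core is small on its own, but GPUs are effective for certain kinds of computation (MATLAB-style workloads, for example).
  The power savings that come with GPU die integration matter a great deal to HPC users.
  Compact systems matter for HPC in Japan, where low power consumption and a small footprint are essential.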
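How is the GPU used?
  Through Linux PCI passthrough.
  AWS is probably built on the same approach.
  It depends on KVM.
  vSphere can split GPU cores.
  Can a GPU be split the way vSphere and Xen do it? Not at present, but Nvidia plans to support it.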
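GPU Docker
  Using GPUs through Docker.
  Cores cannot be partitioned, but GPU core usage is divided automatically according to the tasks running.
  These days, "GPU on OpenStack" usually means this GPU Docker approach.
  However, it does not perform as well as passthrough + KVM.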
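Who is GPU on OpenStack for?
  Temporary HPC use: run a quick computation and destroy the whole VM when it finishes.
  Putting together a simple HPC grid from a few VMs.
  Using HPC the way you would on EC2.
  Internal use, especially for people in companies who are not allowed to put work on a cloud such as EC2.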
Using GPU on OpenStack
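PCI passthrough (1)
  Attach a PCI device directly to a VM.
  The PCI device has to be detached from the physical host.
  What is possible depends on KVM's capabilities; it has nothing to do with OpenStack.
  One device per VM.
  A GPU itself cannot be shared or split among VMs.
  This is strictly a KVM limitation, not an OpenStack one.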
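PCI passthrough (2)
  Officially supported on Red Hat, but its use is not actively recommended (a little dubious?).
  In practice you do run into trouble on Red Hat.
  On Ubuntu it is not documented; you have to search the web for information.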
Figure: How GPU passthrough works in OpenStack (the arrows show the instance launch process).
  ControllerNode (Linux OS): Nova API, Nova Scheduler, AMQP
  ComputeNode (Linux OS for KVM hypervisor): Nova Compute, VMM/KVM, IOMMU/Vt-d, PCI Express x16, GPU cards
  Instance (Linux/Win OS): GPU driver, app
GPUs on the KVM host
  Check the GPUs on the host:
    lspci -nn | grep -i nvidia
    88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
    88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
  The GPU has to be passed through as a whole unit.
  Not just the GPU function but the whole unit, including HDMI audio, must be detached from the host; otherwise it will not work in the VM (passthrough fails).
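The bracketed vendor:device pairs in this output are what the later pci-stub steps need. As a minimal sketch, assuming the lspci output format shown on the slide, a small helper can extract them. The name `extract_pci_ids` is hypothetical and not part of the talk.

```shell
# Sample `lspci -nn | grep -i nvidia` output copied from the slide.
# On a real host you would pipe the live command into the helper instead.
sample_lspci='88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)'

# extract_pci_ids (hypothetical helper): read lspci -nn text on stdin and
# print the unique [vendor:device] pairs as one comma-separated list.
extract_pci_ids() {
  grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' |  # keep only [xxxx:xxxx] tokens
    tr -d '[]' |                             # drop the brackets
    awk '!seen[$0]++' |                      # de-duplicate, keep first-seen order
    paste -sd, -                             # join lines with commas
}

printf '%s\n' "$sample_lspci" | extract_pci_ids
```

The class codes such as `[0300]` contain no colon, so the pattern skips them and only the vendor:device pairs remain.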
Looking at the GPU's ports
  The GPU carries HDMI video and audio, so both have to be passed through together.
Setting up the IOMMU
  A virtualization system that uses physical devices must have the IOMMU (Input/Output Memory Management Unit) set up.
  Intel VT-d (I/O virtualization) must of course be turned on (check the EFI/BIOS settings).
  GRUB also needs to be configured; edit /etc/default/grub:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"
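On Ubuntu an edit to /etc/default/grub normally takes effect only after `sudo update-grub` and a reboot. The following is a minimal sketch of how one might then confirm that the parameter reached the kernel command line. The helper name `has_kernel_param` and the sample command line are assumptions for illustration.

```shell
# has_kernel_param (hypothetical helper): succeed if the space-separated
# kernel command line in $1 contains exactly the parameter in $2.
has_kernel_param() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative command line that matches the GRUB setting on the slide.
sample_cmdline='BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1'

if has_kernel_param "$sample_cmdline" intel_iommu=on; then
  echo "intel_iommu=on is present"
else
  echo "intel_iommu=on is missing"
fi
```

On a running host the same check would read the live value, for example `has_kernel_param "$(cat /proc/cmdline)" intel_iommu=on`.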
pci-stub
  pci-stub is configured so that the host cannot use the physical PCI device.
  It is not in /etc/modules by default, so add it there, together with the related components (vfio, kvm):
    pci_stub
    vfio
    vfio_iommu_type1
    vfio_pci
    kvm
    kvm_intel
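Appending to a modules file by hand tends to leave duplicate lines after repeated runs. The sketch below shows one idempotent way to add the module names listed on the slide. The helper `ensure_line` is hypothetical, and a temporary file stands in for /etc/modules so the example can run without root.

```shell
# ensure_line (hypothetical helper): append line $2 to file $1 only when
# an identical whole line is not already present.
ensure_line() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Stand-in for /etc/modules so the sketch is safe to run anywhere.
modules_file=$(mktemp)

for m in pci_stub vfio vfio_iommu_type1 vfio_pci kvm kvm_intel; do
  ensure_line "$modules_file" "$m"
done

# Running it again for an existing entry changes nothing.
ensure_line "$modules_file" kvm

cat "$modules_file"
```

On a real compute node the target would be /etc/modules and the calls would need root privileges.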
VFIO (1)
  To detach the devices that will be passed through from the physical host, they have to be added to VFIO (Virtual Function IO).
  Keep these devices from being claimed in the initramfs by adding them to /etc/initramfs-tools/modules (Ubuntu):
    echo 'pci_stub ids=10de:11b4,10de:0e0a' >> /etc/initramfs-tools/modules
    sudo update-initramfs -u && sudo reboot
VFIO (2)
  These devices must also not be claimed at boot; block the host from claiming them at every stage of the boot sequence.
  Add to /etc/modprobe.d/blacklist.conf:
    blacklist nvidia
    blacklist nvidia-uvm
  Also blacklist the Nvidia-compatible driver so that it does not claim the devices:
    blacklist nouveau
Detaching from the physical host
  Use pci-stub to detach the devices from the physical host and attach them to the VM. Add the device IDs to new_id, then unbind the related functions and bind them to pci-stub:
    echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
    echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
    echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
    echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
    echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
    echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind
  Check that the devices have been detached from the physical machine and show as 'claimed':
    pci-stub 0000:88:00.1: claimed by stub
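The unbind and bind steps follow the same pattern for every PCI function, so they can be generated rather than typed. This is a dry-run sketch only: the hypothetical helper `gen_stub_cmds` prints the commands for review and writes nothing to /sys.

```shell
# gen_stub_cmds (hypothetical helper): for each PCI address given as an
# argument, print the unbind command followed by the pci-stub bind command.
# Nothing is executed; the output is meant to be reviewed and then run as root.
gen_stub_cmds() {
  for addr in "$@"; do
    printf 'echo %s > /sys/bus/pci/devices/%s/driver/unbind\n' "$addr" "$addr"
    printf 'echo %s > /sys/bus/pci/drivers/pci-stub/bind\n' "$addr"
  done
}

# Both functions of the GPU at 88:00 (video and HDMI audio), as on the slide.
gen_stub_cmds 0000:88:00.0 0000:88:00.1
```

Printing first keeps the sketch safe to run on any machine and makes it easy to check that both functions of each GPU unit are covered.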
Figure: What to configure at each stage of the ComputeNode boot process for GPU passthrough. The original figure tags the stages with "IOMMU" and "BLACK LIST" labels.
  UEFI/BIOS: Vt-d
  GRUB: /etc/default/grub
  ramfs: /etc/initramfs-tools/modules
  modules: /etc/modules
  modprobe: /etc/modprobe.d/blacklist.conf
  pci-stub: /sys/bus/pci/drivers/pci-stub/ and /sys/bus/pci/devices/$(Identifier)/driver/unbind
Figure: Physical device -> pci-stub (used by the virtual side). The GPU unit (the whole device) is detached from the physical device and re-attached as a virtual device.
    echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
    echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
    echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
    echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
    echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
    echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind
Adding more GPUs (1)
  Check the lspci output. Two sets of device IDs now appear; how they are listed depends on the system.
    lspci -nn | grep -i nvidia
    88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
    88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
    84:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
    84:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
Adding more GPUs (2)
  Use pci-stub to pass the additional devices through as well.
    echo 0000:84:00.0 > /sys/bus/pci/devices/0000:84:00.0/driver/unbind
    echo 0000:84:00.1 > /sys/bus/pci/devices/0000:84:00.1/driver/unbind
    echo 0000:84:00.0 > /sys/bus/pci/drivers/pci-stub/bind
    echo 0000:84:00.1 > /sys/bus/pci/drivers/pci-stub/bind
  CUDA applications require the added GPU to be the same model (a different model will not work), and you may be asked to confirm that the GPUs are identical.
    ./nbody -benchmark -numdevices=2 -numbodies=65536
Adding more GPUs (3)
  Once passthrough succeeds, lspci inside the VM shows the devices:
    ubuntu@guestos$ lspci -nn | grep -i nvidia
    00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
    00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
Nova configuration (1)
  The ComputeNode needs a whitelist for PCI passthrough and for deploying GPU VMs.
  Add pci_passthrough_whitelist to /etc/nova/nova.conf:
    pci_passthrough_whitelist={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
Nova configuration (2)
  The controller node needs a nova alias, which the flavor-key described later refers to.
  Add pci_alias to /etc/nova/nova.conf:
    pci_alias={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
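The whitelist and alias entries share the same vendor and product IDs, so generating both from one set of values avoids mismatches. The sketch below only prints the two lines in the form used on the slides; the helper name `nova_pci_lines` is hypothetical, and whether these exact option names apply depends on the Nova release in use.

```shell
# nova_pci_lines (hypothetical helper): print the nova.conf whitelist and
# alias lines for one device type, in the form shown on the slides.
#   $1 = alias name, $2 = PCI vendor ID, $3 = PCI product ID
nova_pci_lines() {
  printf 'pci_passthrough_whitelist={"name":"%s","vendor_id":"%s","product_id":"%s"}\n' "$1" "$2" "$3"
  printf 'pci_alias={"name":"%s","vendor_id":"%s","product_id":"%s"}\n' "$1" "$2" "$3"
}

# Values for the Quadro K4200 used in the talk.
nova_pci_lines K4200 10de 11b4
```

In the talk the whitelist line goes on the ComputeNode and the alias line on the controller node.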
Nova configuration (3)
  On the controller node, add the PCI passthrough filter to Nova.
  In /etc/nova/nova.conf, add the PciPassthroughFilter entries (underlined on the original slide):
    scheduler_available_filters=nova.scheduler.filters.all_filters
    scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
    scheduler_default_filters=DifferentHostFilter,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,AggregateInstanceExtraSpecsFilter,PciPassthroughFilter
nova alias
  To use the GPU, set a flavor-key. On an existing flavor, set the GPU alias name chosen in the PCI passthrough configuration together with the number of GPU units you want.
  The steps so far assume that two Quadro K4200 units have been added and that pci_alias is named K4200, so the command looks like this:
    nova flavor-key $flavor_name set "pci_passthrough:alias"="K4200:$amount_of_gpu"
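To make the placeholders concrete, here is a dry-run sketch that fills in the flavor-key command and prints it instead of running it. The helper `flavor_key_cmd` and the flavor name `g1.gpu` are hypothetical; only the alias K4200 and the count of two units come from the talk.

```shell
# flavor_key_cmd (hypothetical helper): print the flavor-key command for a
# given flavor, PCI alias and GPU count. Nothing is sent to Nova.
#   $1 = flavor name, $2 = pci_alias name, $3 = number of GPU units
flavor_key_cmd() {
  printf 'nova flavor-key %s set "pci_passthrough:alias"="%s:%s"\n' "$1" "$2" "$3"
}

# g1.gpu is an illustrative flavor name, not one from the talk.
flavor_key_cmd g1.gpu K4200 2
```

An instance booted from such a flavor would request two devices matching the K4200 alias.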
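Known issues
Cloud image issues
  Cloud images are far too small for GPU use, so the image has to be resized with qemu-img.
  Installing the CUDA driver requires adding the perl packages to the existing cloud image.
  Even as a .deb or .rpm package the CUDA driver is not a binary package; it builds from source with make at install time, which makes it no different from the .run file.
  Nvidia plans to fix this in a future release by listing the perl-related files in the spec file.
  They said it would be corrected in 7.6 or later... (not tested yet.)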
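As a rough sketch of the resize step, the helper below prints a `qemu-img resize` command for review rather than running it. The helper name `resize_cmd`, the image file name and the 20G increment are all assumptions for illustration; the filesystem inside the guest typically still has to be grown afterwards.

```shell
# resize_cmd (hypothetical helper): print the qemu-img command that would
# grow an image by the given amount. Nothing is executed.
#   $1 = image file, $2 = size increment such as 20G
resize_cmd() {
  printf 'qemu-img resize %s +%s\n' "$1" "$2"
}

# ubuntu-gpu.qcow2 and 20G are illustrative values only.
resize_cmd ubuntu-gpu.qcow2 20G
```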
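Windows as VDI
  CUDA on Windows is fast once it is installed properly, but it stutters from time to time.
  This is a VM disk speed issue; using ephemeral storage or a faster disk (SSD, NVMe, etc.) mostly resolves it.
  The VM performs reads and writes across context switches, so heavy CUDA workloads can trigger this kind of stuttering.
  I later handed this over to someone else, and ephemeral storage apparently improved things a little.
  The customer was reasonably satisfied that it worked out for the time being.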
Live migration
  Live migration does not work: the VM cannot release the connection information of the old host and keeps holding on to it.
  Workaround: delete the records in the nova.pci_devices MySQL table and reboot the old host > pointless!
    | 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 45 | 21 | 0000:84:00.0 | 11b4 | 10de | type-PCI | pci_0000_84_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
    | 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 48 | 21 | 0000:88:00.0 | 11b4 | 10de | type-PCI | pci_0000_88_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
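The rows are hard to read without column headers. The sketch below labels the fields of one row, assuming the column order of Nova's pci_devices table is created_at, updated_at, deleted_at, deleted, id, compute_node_id, address, product_id, vendor_id, dev_type, dev_id, label, status. That ordering is an assumption inferred from the values, not something stated in the talk, and the helper name `summarize_pci_row` is hypothetical.

```shell
# One row copied from the slide (without the trailing "<<-- old-host" note).
row='| 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 45 | 21 | 0000:84:00.0 | 11b4 | 10de | type-PCI | pci_0000_84_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 |'

# summarize_pci_row (hypothetical helper): split a MySQL table row on "|",
# trim the padding, and print the fields that matter for passthrough.
# Field positions follow the assumed column order described above.
summarize_pci_row() {
  awk -F'|' '{
    for (i = 1; i <= NF; i++) gsub(/^ +| +$/, "", $i)
    printf "id=%s compute_node_id=%s address=%s vendor:product=%s:%s status=%s\n", $6, $7, $8, $10, $9, $14
  }'
}

printf '%s\n' "$row" | summarize_pci_row
```

Read this way, both rows on the slide belong to the same compute node and are marked available.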
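Summary
  Whether special-purpose uses of OpenStack work depends on the underlying OS.
  Blaming OpenStack when the underlying OS is at fault is what people who do not get it tend to say...
  Server OSes are surprisingly rough in many ways; quirks abound.
  For GPU use, passthrough is the current approach, but where it goes next depends on Nvidia and Intel (AMD... what will they do?).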
Thank you Masafumi Ohta @masafumiohta