$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
GPU on OpenStack日本語版
Search
masafumi_ohta
May 15, 2017
Technology
0
770
GPU on OpenStack日本語版
日本語バージョン作りました。若干訂正も加えました。
masafumi_ohta
May 15, 2017
Tweet
Share
More Decks by masafumi_ohta
See All by masafumi_ohta
COSCUP25 Intro at OSC Tokyo Spring 25, at Komazawa Univ.
masafumi_ohta
0
23
Deepdiving to Raspberry Pi 5
masafumi_ohta
0
38
树莓派的历史、相关信息及使用案例
masafumi_ohta
0
420
海外カンファレンスのCFPの正しい書き方
masafumi_ohta
4
580
GPD.pdf
masafumi_ohta
0
72
GPD MicroPCのご紹介
masafumi_ohta
0
180
3大あくじょ考察
masafumi_ohta
0
480
GPU on OpenStack GPUインターナルクラウドのベストプラクティス
masafumi_ohta
0
310
これからのSIerに必要なこと
masafumi_ohta
1
530
Other Decks in Technology
See All in Technology
Snowflake導入から1年、LayerXのデータ活用の現在 / One Year into Snowflake: How LayerX Uses Data Today
civitaspo
0
2.5k
MariaDB Connector/C のcaching_sha2_passwordプラグインの仕様について
boro1234
0
1k
NIKKEI Tech Talk #41: セキュア・バイ・デザインからクラウド管理を考える
sekido
PRO
0
220
AI with TiDD
shiraji
1
300
フルカイテン株式会社 エンジニア向け採用資料
fullkaiten
0
9.9k
ActiveJobUpdates
igaiga
1
320
New Relic 1 年生の振り返りと Cloud Cost Intelligence について #NRUG
play_inc
0
240
『君の名は』と聞く君の名は。 / Your name, you who asks for mine.
nttcom
1
120
株式会社ビザスク_AI__Engineering_Summit_Tokyo_2025_登壇資料.pdf
eikohashiba
1
120
Identity Management for Agentic AI 解説
fujie
0
480
障害対応訓練、その前に
coconala_engineer
0
200
20251222_サンフランシスコサバイバル術
ponponmikankan
2
140
Featured
See All Featured
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.8k
Pawsitive SEO: Lessons from My Dog (and Many Mistakes) on Thriving as a Consultant in the Age of AI
davidcarrasco
0
38
Designing for Performance
lara
610
69k
How to make the Groovebox
asonas
2
1.8k
Git: the NoSQL Database
bkeepers
PRO
432
66k
Bootstrapping a Software Product
garrettdimon
PRO
307
120k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.6k
Have SEOs Ruined the Internet? - User Awareness of SEO in 2025
akashhashmi
0
200
SEOcharity - Dark patterns in SEO and UX: How to avoid them and build a more ethical web
sarafernandez
0
89
The AI Search Optimization Roadmap by Aleyda Solis
aleyda
1
5k
How to build a perfect <img>
jonoalderson
0
4.7k
The Impact of AI in SEO - AI Overviews June 2024 Edition
aleyda
5
680
Transcript
GPU on OpenStack ͓͓ͨɹ·͞;Έ @masafumiohta
͓લͩΕʁ> SIerۈ ͍͔ͭ͘ͷOpenStack PoCʹࢀՃɺٴͼΞυόΠε OpenStackͷఏҊɾޚ૬ஊঝͬͯ·͢ɻ OpenStackίʔυಡΜͰमਖ਼ͨ͠Γͱ͔… ଓ͖WebͰ https://jp.linkedin.com/in/ohtamasafumi
͡Ίʹ OpenStackಛघར༻ Sahara(HadoopͷOpenStack)ͳͲͳͲ ΄ͱΜͲ͕·ͱΊΒΕͯͳ͙ͬͯ͘ௐΔ͔͠ͳ͍ ͍ΘΏΔʰυΩϡϝϯτϩετʱ docs.openstack.orgɹʹ͋͛ͯ΄͍͠ɻ
GPU on OpenStackͱ
GPUτϨϯυ GPUϝχʔίΞత ֤MPUίΞҰݸ͋ͨΓখ͍͕͘͞ز͔ͭͷܭࢉ (MATLABܥͳͲ)Ͱ༗ޮ GPUͷμΠूੵʹର͢ΔলిྗHPCϢʔβʹ ͱͬͯେ͖͍ɻ ίϯύΫτͳγεςϜলిྗɾলεϖʔε͕ඞਢͳຊ ͷHPCγεςϜʹେ͖͍
GPUͲ͏ͬͯ͏ͷ͔ʁ LinuxͷPCIύεεϧʔٕज़Ͱ͏ AWSଟ͜ͷϕʔε KVMʹґଘ VSphereGPUίΞͷεϓϦοτ͕Մೳ VSphereXenͷΑ͏ͳGPUεϓϦοτՄೳ͔ʁ ݱߦͰ͖ͳ͍͕Nvidia͕αϙʔτ༧ఆ
GPU Docker DockerΛ͔ͭͬͨGPUͷར༻ ίΞׂͰ͖ͳ͍͕λεΫʹԠͯ͡GPUίΞར༻Λ ࣗಈར༻ׂɻ ࠷ۙͷGPU on OpenStackͱ͍͑͜ͷGPU Dockerར ༻
ͨͩɺύεεϧʔ+KVMΑΓύϑΥʔϚϯεͰͳ͍
GPU OpenStack୭ͷͨΊ HPCͷςϯϙϥϦར༻ ͪΐͬͱܭࢉͬͯऴΘͬͨΒVM͝ͱյ͢ ͍͔ͭ͘ͷVMΛ͔ͭͬͯ؆୯ʹHPCάϦουͱ͔.. EC2ͷΑ͏ʹHPCΛͬͨΓ͢Δ ෦ར༻Ͱ͏ɺಛʹۀͰEC2ͳͲΫϥυʹग़ͪ͠Ό ͍͚ͳ͍ਓ͚
GPU on OpenStackΛ͏
PCIύεεϧʔ(1) PCIσόΠεΛμΠϨΫτʹVMʹͭͳ͙ ཧϗετΑΓ͔ΒPCIσόΠεΛΓ͢ඞཁ͕͋Δɻ KVMͷػೳʹࠨӈ͞ΕΔ͜ͱʹͳ͍ͬͯΔɻ OpenStackͱؔͳ͍ ҰͭͷσόΠεʹҰͭͷVM GPUࣗମ֤VMʹର͠ڞ༗͓ΑͼׂෆՄೳ ͋͘·ͰKVMͷ੍ݶɺOpenStackͰͳ͍ɻ
PCIύεεϧʔ(2) RedhatͰެࣜαϙʔτ ͔͠͠ར༻ੵۃਪ͍ͯ͠ͳ͍ʢ͍͋͠ʁ) ࣮ࡍRedhatͰϋϚΔ.. UbuntuͰυΩϡϝϯτԽ͞Ε͍ͯͳ͍ άάͬͯใΛ͕͞͞ͳ͖Ό͍͚ͳ͍ɻ
K Linux OS for KVM hypervisor GPU Driver App Instance
VMM/KVM IOMMU/Vt-d PCI Express x16 Linux/Win OS ComputeNode GPU Card GPU Card Nova Compute Nova Scheduler AMQP Nova API Linux OS ControllerNode ਤ:GPUύεεϧʔͲ͏ͬͯOpenStackͰಈ͔͘ʢҹ͕Πϯελϯεൃߦϓϩηεʣ
KVMϗετͷGPU ϗετͷGPUΛνΣοΫ͢Δ: lspci -nn | grep -i nvidia lspci -nn
| grep -i nvidia 88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1) 88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1) GPUͷϢχοτ͝ͱύεεϧʔ͢Δඞཁ͕͋Δ GPUϗετ͚ͩͰͳ͘HDMIͳͲϢχοτ͝ͱΓ͢ඞཁ͋Γ Ͱͳ͍ͱVM্Ͱಈ͍ͯ͘Εͳ͍=ύεεϧʔͰ͖ͳ͍
GPUͷϙʔτΛΈΔ GPUʹHDMIϏσΦͱΦʔσΟΦ͕͋ΔͷͰ ͜Ε͝ͱύεεϧʔ͢Δඞཁ͕༗Δ
IOMMUͷηοτΞοϓ ཧσόΠεΛ͏ͨΊͷԾγεςϜʹ͓͍ͯ IOMMU(Input/Output Memory Management Unit) ηοτΞο ϓඞਢ ͪΖΜ intel
vt-d (I/OԾԽ)Φϯʹ (EFI/BIOSνΣοΫ) grubͰͷηοτΞοϓඞཁ:/etc/default/grubΛมߋهड़ GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1”
pci-stub pci-stub ཧPCIσόΠεΛϗετͰར༻Ͱ͖ͳ͍Α͏ʹ͢Δ ͨΊʹઃఆ͢Δ /etc/moduleʹσϑΥϧτͰઃఆ͞Ε͍ͯͳ͍ͷͰઃఆ͢Δɻؔ ࿈͢Δίϯϙʔωϯτઃఆ͢Δ (vfio,kvm)ɻ pci_stub vfio vfio_iommu_type1
vfio_pci kvm kvm_intel
VFIO(1) ύεεϧʔ͞ΕΔͷΛཧσόΠε͔ΒΓͨ͢ ΊʹVFIO(Virtual Function IO)্ʹՃ͢Δඞཁ͕͋Δɻ ͜ΕΒͷσόΠεΛramfsͰೝࣝ͞ΕΔͷΛ͙ /etc/initramfs-tools/modules to initramfs (ubuntu)
echo ‘pci_stub ids=10de:11b4,10de:0e0a’ >> /etc/initramfs-tools/modules sudo update-initramfs -u && sudo reboot
VFIO(2) ͜ΕΒͷσόΠε·ͨbootͷࡍʹೝࣝ͞Εͳ͍ Α͏ʹ͢Δɻͯ͢ͷboot upγʔέϯεͰཧσό ΠεͰͷೝࣝΛࢭ͢Δɻ /etc/modprobe.d/blacklist.conf ʹՃ: blacklist nvidia blacklist
nvidia-uvm NvidiaޓυϥΠόೝࣝͤ͞ͳ͍Α͏ʹՃ blacklist nouveau
ཧ͔ΒΓ͢ pci-stubΛ͔ͭͬͯཧσόΠεΑΓΓ͠VMͱ͚ͬͭ͘Δɻσ όΠεIDΛnew_idʹՃ͢Δɻ·ͨؔ࿈ࣝผࢠΛΓ͠ɺVMͱ ͚ͬͭ͘Δɻ echo 11de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id echo
11de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind ཧϚγϯ͔ΒΓ͞Ε͔ͨɺ’claimed’ʹͳ͍ͬͯΔ͔ௐΔɻ pci-stub 0000:88:00.1: claimed by stub
modprobe /etc/modprobe.d/blacklist.conf pci-stab /sys/bus/pci/drivers/pci-stub/ /sys/bus/pci/devices/$(Identifier)/driver/unbind ramfs /etc/initramfs-tools/modules GRUB /etc/default/grub modules
/etc/modules UEFI/BIOS Vt-d ਤ:GPUύεεϧʔͰComputeNodeͷBOOTϓϩηε্Ͱઃఆ͢Δͷ B O O T ϓ ϩ η ε IOMMU IOMMU BLACK LIST BLACK LIST IOMMU BLACK LIST
ཧσόΠε pci-stub (ԾͰ͏) GPUϢχοτ (σόΠεશମ) GPUΛཧσόΠε͔ΒΓ͠ɺ ԾσόΠεʹ͚ସ͑Δɻ echo 11de 11b4
> /sys/bus/pci/drivers/pci-stub/new_id echo 11de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
͞ΒʹGPUΛՃ͑Δ(1) lspciͷ݁ՌΛ֬ೝɺ2ͭͷσόΠεID͕݁Ռͱͯ͠ݟ͑Δɺ ͜ͷݟ͑ํγεςϜʹґଘɻ lspci -nn | grep -i nvidia 88:00.0
VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1) 88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1) 84:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1) 84:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
͞ΒʹGPUΛՃ͑Δ(2) pci-stubΛར༻͠͞ΒʹσόΠεΛύεεϧʔ͢Δɻ echo 0000:84:00.0 > /sys/bus/pci/devices/0000:84:00.0/driver/unbind echo 0000:84:00.1 > /sys/bus/pci/devices/0000:84:00.1/driver/unbind
echo 0000:84:00.0 > /sys/bus/pci/drivers/pci-stub/bind echo 0000:84:00.1 > /sys/bus/pci/drivers/pci-stub/bind CUDAΞϓϦΛ͏্ͰಉػछͷGPUͷՃ͕ඞਢ(͕ͪ ͏ͷෆՄ)ɺಉ͡ͷͰ͋Δ͔ฉ͔ΕΔ͜ͱ͕͋Δɻ /nbody -benchmark -numdevices=2 -num bodies=65536
͞ΒʹGPUΛՃ͑Δ(3) ύεεϧʔʹޭ͢ΔͱVM্Ͱಈ͍͍ͯΔͷ͕lspciʹͯ νΣοΫͰ͖Δɻ ubuntu@guestos$ lspci -nn | grep -i nvidia
00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1) 00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
Novaͷه(1) ComputeNodeͷwhitelist͕PCIύεεϧʔͱGPUͷ VMͷσϓϩΠϝϯτͰඞཁɻ /etc/nova/nova.conf ʹ pci_passthrough_whitelist Λه pci_passthrough_whitelist={"name":"K4200","vendor_id":"10de","product_id": "11b4"}
Novaͷه(2) ίϯτϩʔϥʔϊʔυͷnova aliasͷઃఆ͕ඞཁɺޙ ड़͢Δflavor-key Ͱར༻͢Δɻ to /etc/nova/nova.confʹ pci_aliasesΛه pci_alias={“name”:”K4200”,"vendor_id":"10de","product_id":"11b4"}
Novaͷه(3) ίϯτϩʔϥϊʔυʹͯPCIύεεϧʔϑΟϧλʔΛ novaʹՃɻ /etc/nova/nova.conf ʹΞϯμʔϥΠϯ෦Ճ scheduler_available_filters=nova.scheduler.filters.all_filters scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciP assthroughFilter scheduler_default_filters=DifferentHostFilter,RetryFilter,AvailabilityZoneFilter, RamFilter,CoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,Imag
ePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,Aggre gateInstanceExtraSpecsFilter,PciPassthroughFilter
nova alias GPUΛ͏ͨΊʹflavor-keyΛه͢ΔɻϑϨʔόʹset໊ Ͱطଘflavorʹର͠pciύεεϧʔͷͰܾΊͨGPUͷalias ໊ɺ͍͍ͨGPUϢχοτΛՃ͢Δɻ ্هલड़·ͰͷաఔͰɺQuadro K4200Λ2ϢχοτՃ͍ͯ͠ Δ͜ͱ͕લఏͳͷͰɺҎԼͷΑ͏ͳهड़ʹͳΔɻ pci_aliasΛK4200ͱ͍ͯ͠ΔͨΊҎԼͷهड़ͱͳΔɻ nova
flavor-key $flavor_name set “pci_passthrough:alias”=“K4200:$amount_of_gpu”
طͷ
ΫϥυΠϝʔδͷ Πϝʔδ͕GPUΛ͏ʹͱͯখ͘͞qemu-imgͰΠϝʔδͷϦα ΠζΛͯ͠Δඞཁ͕༗Δɻ CUDA driverΛΠϯετʔϧ͢Δʹperl-packagesΛطଘͷΫϥυ Πϝʔδʹ͍ΕͯΔඞཁ͕͋Δɻ CUDAυϥΠό.deb or .rpMύοέʔδͰόΠφϦϑΝΠϧͰͳ͘ɺΠϯε τʔϧ࣌ʹιʔεΛmakeͰΠϯετʔϧ͢ΔܗΛͱ͍ͬͯΔ=.runϑΝΠϧͱม
ΘΒͣɻ NvidiaকདྷͷϦϦʔεͰspecϑΝΠϧʹperlؔ࿈ϑΝΠϧͷهΛ͢ΔܗͰfix ͢Δ༧ఆɻ 7.6Ҏ߱Ͱగਖ਼ͯ͘͠ΕΔɺͱ͍͍͕ͬͯͨ…(·ͩςετͯ͠ͳ͍ɻʣ
Windows as VDI Windows্ͷCUDAͪΌΜͱΠϯετʔϧͰ͖Εૣ ͘ͳΔ͚Ͳ࣌ͨ·ΧΫΧΫͯ͠͠·͏ɻ vmͷdisk speedͷɺΤϑΣϝϥϧʹ͢ΔͳΓૣ͍σΟεΫΛ ͏ͳΓͰͦͦ͜͜ղܾ͢Δ(SSD,NVMe or..etc) vmcontext
switchͰr/wΛߦ͍ͬͯΔͷͰϔϏʔϫʔΫϩʔυ ͷCUDAͳΓ͜ͷΑ͏ͳΧΫΧΫঢ়ଶΛى͜͢Մೳੑ͕͋Δɻ ͜ͷ͋ͱޙʹ·͔͕ͤͨɺগ͠ΤϑΣϝϥϧͰվળͨ͠Β͍͠ ސ٬·ɺͳΜͱ͔ͳͬͨͱͱΓ͋͑ͣͷຬΒͬͨɻ
ϥΠϒϚΠάϨʔγϣϯ ϥΠϒϚΠΫϨʔγϣϯͰ͖ͳ͍ɺvm͕ݹ͍ϗετͷଓ ใΛΓͤͳ͍ɻݹ͍ϗετͷใΛѲͬͨ··ʹͳΔɻ ϫʔΫΞϥϯυ:nova.pci_devicesͷMySQLͷDBใΛফ͠ ͯɺݹ͍ϗετΛ࠶ىಈ͢Δʼҙຯͳ͠ʂ | 2016-08-11 00:54:45 | 2016-08-19
04:58:01 | NULL | 0 | 45 | 21 | 0000:84:00.0 | 11b4 | 10de | type-PCI | pci_0000_84_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host | 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 48 | 21 | 0000:88:00.0 | 11b4 | 10de | type-PCI | pci_0000_88_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
·ͱΊ OpenStackͷಛघར༻͕Ͱ͖Δ͔Ͳ͏͔ͷOS࣍ୈ Ͱ͢ɻ ͷOS͕μϝͳͷʹOpenStack͕..ͱ͍͏ͷΠέͯͳ͍ਓ͕͍ ͏͓ݴ༿… αʔόܥOSҙ֎ʹ͍Ζ͍ΖΠέͯ·ͤΜ…͋Γ͋ΓͰ͢ɻ GPUͷར༻ࠓ͜ͷύεεϧʔ͕ݱߦͰ͕͢ɺNvidia Intel࣍ୈ͔ͱ(AMD…Ͳ͏͢ΔΜͩΖ͏…)
Thank you Masafumi Ohta @masafumiohta