masafumi_ohta
May 15, 2017
GPU on OpenStack (Japanese edition)
I made a Japanese version, with a few corrections added.
Transcript
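GPU on OpenStack
Masafumi Ohta @masafumiohta
Who are you?
Works at a systems integrator (SIer).
Has taken part in, and advised on, several OpenStack PoCs.
Takes on OpenStack proposals and consultations.
Reads and patches OpenStack code, among other things.
More on the web: https://jp.linkedin.com/in/ohtamasafumi
Introduction
Special-purpose uses of OpenStack, such as Sahara (Hadoop on OpenStack), are mostly not written up anywhere, so the only option is to search the web.
This is the so-called "document lost" problem.
I would like to see this material published on docs.openstack.org.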
What is GPU on OpenStack?
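GPU trends
GPUs are many-core processors.
Each MPU core is small on its own, but the cores are effective for certain kinds of computation (MATLAB-style workloads and the like).
The power savings that come with GPU die integration matter a great deal to HPC users.
Compact systems matter for HPC in Japan, where low power consumption and a small footprint are essential.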
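How is the GPU used?
Through Linux PCI passthrough.
AWS is probably based on this approach.
It depends on KVM.
vSphere can split GPU cores.
Can a GPU be split the way vSphere or Xen allow? Not at present, but Nvidia plans to support it.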
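GPU with Docker
Using GPUs through Docker.
Cores cannot be partitioned, but GPU core usage is divided automatically according to the task.
These days, "GPU on OpenStack" usually means this GPU Docker approach.
However, performance is lower than with passthrough plus KVM.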
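Who is GPU on OpenStack for?
Temporary HPC use: run a short computation, then destroy the whole VM when it finishes.
Building a simple HPC grid out of a few VMs.
Using HPC the way you would on EC2.
Internal use, especially for people whose work must not be sent out to a public cloud such as EC2.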
Using GPU on OpenStack
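PCI passthrough (1)
Attaches a PCI device directly to a VM.
The PCI device has to be detached from the physical host.
What is possible is determined by KVM's capabilities; it has nothing to do with OpenStack.
One device goes to one VM.
The GPU itself cannot be shared or split among VMs.
This is strictly a KVM limitation, not an OpenStack one.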
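PCI passthrough (2)
Officially supported on Red Hat, but not actively recommended (suspicious?).
In practice you run into trouble on Red Hat.
Not documented on Ubuntu; you have to search the web for information.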
Figure: How GPU passthrough works in OpenStack (the arrows show the instance launch process).
Controller node (Linux OS): Nova API, Nova Scheduler, AMQP.
Compute node (Linux OS for the KVM hypervisor): Nova Compute, VMM/KVM, IOMMU/VT-d, GPU cards on PCI Express x16.
Instance (Linux/Windows OS): GPU driver, application.
GPUs on the KVM host
Check the host's GPUs:
  lspci -nn | grep -i nvidia
  88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
  88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
The GPU has to be passed through as a whole unit.
Not only the GPU function but everything on the unit, including HDMI, must be detached from the host.
Otherwise the GPU will not work inside the VM, which means passthrough has failed.
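The address and ID columns of the lspci output on this slide are the values that later steps reuse. The following is a minimal sketch, not part of the original deck, of pulling them out with standard tools. The function name `nvidia_functions` and the `sample` variable are hypothetical, and the sample text is copied from the slide.

```shell
# Print "<PCI address> <vendor:device>" for every NVIDIA function found in
# `lspci -nn` output supplied on stdin. The function name is hypothetical.
nvidia_functions() {
  grep -i nvidia |
    sed -E 's/^([0-9a-fA-F:.]+) .*\[([0-9a-fA-F]{4}:[0-9a-fA-F]{4})\].*$/\1 \2/'
}

# Sample output copied from the slide. On a real host you would run:
#   lspci -nn | nvidia_functions
sample='88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)'

printf '%s\n' "$sample" | nvidia_functions
```

Both functions of the card (VGA and HDMI audio) show up as separate lines, which is a reminder that each one has to be handed over to the VM.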
Looking at the GPU's ports
The GPU carries HDMI video and audio functions, so these have to be passed through together with it.
IOMMU setup
In a virtualized system that uses physical devices, setting up the IOMMU (Input/Output Memory Management Unit) is mandatory.
Intel VT-d (I/O virtualization) must of course be switched on (check the EFI/BIOS settings).
GRUB also has to be configured. Edit /etc/default/grub:
  GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"
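After changing /etc/default/grub, the new kernel parameters only take effect once the GRUB configuration is regenerated (on Ubuntu, typically `sudo update-grub`) and the host is rebooted. A small sketch, assuming a hypothetical helper name `iommu_enabled`, for checking whether a kernel command line carries the flag from this slide:

```shell
# Succeed if the given kernel command line contains intel_iommu=on.
# The helper name is hypothetical; it only inspects the string passed in.
iommu_enabled() {
  case " $1 " in
    *" intel_iommu=on "*) return 0 ;;
    *) return 1 ;;
  esac
}

# On a real host, check the running kernel instead:
#   iommu_enabled "$(cat /proc/cmdline)"
if iommu_enabled "quiet splash intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"; then
  echo "IOMMU flag present"
else
  echo "IOMMU flag missing"
fi
```

This only confirms that the parameter was passed to the kernel. Whether VT-d is actually active still depends on the firmware setting mentioned on the slide.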
pci-stub
pci-stub is configured so that the host cannot use the physical PCI device.
It is not listed in /etc/modules by default, so add it there, together with the related components (vfio, kvm):
  pci_stub
  vfio
  vfio_iommu_type1
  vfio_pci
  kvm
  kvm_intel
VFIO (1)
To detach the devices being passed through from the physical host, they have to be added under VFIO (Virtual Function IO).
Prevent these devices from being claimed in the initramfs by adding them to /etc/initramfs-tools/modules (Ubuntu):
  echo 'pci_stub ids=10de:11b4,10de:0e0a' | sudo tee -a /etc/initramfs-tools/modules
  sudo update-initramfs -u && sudo reboot
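VFIO (2)
The devices must also stay unclaimed at boot time: stop the host from recognizing them as physical devices at every stage of the boot sequence.
Add to /etc/modprobe.d/blacklist.conf:
  blacklist nvidia
  blacklist nvidia-uvm
Also blacklist the Nvidia-compatible driver so that it is not loaded:
  blacklist nouveau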
Detaching from the physical host
Use pci-stub to detach the device from the physical host and attach it to the VM.
Add the device IDs to new_id, then detach the related functions and bind them to pci-stub:
  echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
  echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
  echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
  echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
  echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
  echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind
Check that the device has been detached from the physical machine and is reported as 'claimed':
  pci-stub 0000:88:00.1: claimed by stub
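The same three sysfs writes repeat for every PCI function on the card. As a rough sketch (the helper name `stub_commands` is hypothetical), the sequence can be generated per function and reviewed before anything is written to sysfs. The vendor ID used here is 10de, matching the lspci output earlier in the deck.

```shell
# Print, without executing, the sysfs writes that hand one PCI function
# over to pci-stub. The helper name is hypothetical.
#   $1 = full PCI address (for example 0000:88:00.0)
#   $2 = vendor ID, $3 = device ID (as shown by lspci -nn)
stub_commands() {
  addr="$1"; vendor="$2"; device="$3"
  echo "echo $vendor $device > /sys/bus/pci/drivers/pci-stub/new_id"
  echo "echo $addr > /sys/bus/pci/devices/$addr/driver/unbind"
  echo "echo $addr > /sys/bus/pci/drivers/pci-stub/bind"
}

# GPU function and its HDMI audio function from the slide.
stub_commands 0000:88:00.0 10de 11b4
stub_commands 0000:88:00.1 10de 0e0a
```

Printing instead of executing keeps the sketch safe to run anywhere. On a real compute node the printed lines would be run as root, and unbinding comes before binding because a function that still has a driver attached cannot be bound to another one.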
Figure: What has to be configured at each stage of the compute node's boot process for GPU passthrough (the figure tags the stages with IOMMU and BLACK LIST labels).
UEFI/BIOS: VT-d
GRUB: /etc/default/grub
ramfs: /etc/initramfs-tools/modules
modules: /etc/modules
modprobe: /etc/modprobe.d/blacklist.conf
pci-stub: /sys/bus/pci/drivers/pci-stub/ and /sys/bus/pci/devices/$(Identifier)/driver/unbind
Figure: Physical device, pci-stub (used for virtualization), GPU unit (the whole device).
Detach the GPU from the physical device and re-attach it as a virtual device:
  echo 10de 11b4 > /sys/bus/pci/drivers/pci-stub/new_id
  echo 10de 0e0a > /sys/bus/pci/drivers/pci-stub/new_id
  echo 0000:88:00.0 > /sys/bus/pci/devices/0000:88:00.0/driver/unbind
  echo 0000:88:00.1 > /sys/bus/pci/devices/0000:88:00.1/driver/unbind
  echo 0000:88:00.0 > /sys/bus/pci/drivers/pci-stub/bind
  echo 0000:88:00.1 > /sys/bus/pci/drivers/pci-stub/bind
Adding more GPUs (1)
Check the lspci output. Two device IDs show up per card; exactly how they appear depends on the system.
  lspci -nn | grep -i nvidia
  88:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
  88:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
  84:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:11b4] (rev a1)
  84:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)
Adding more GPUs (2)
Use pci-stub to pass through the additional device:
  echo 0000:84:00.0 > /sys/bus/pci/devices/0000:84:00.0/driver/unbind
  echo 0000:84:00.1 > /sys/bus/pci/devices/0000:84:00.1/driver/unbind
  echo 0000:84:00.0 > /sys/bus/pci/drivers/pci-stub/bind
  echo 0000:84:00.1 > /sys/bus/pci/drivers/pci-stub/bind
For CUDA applications the added GPU must be the same model (a different model will not work), and the application may check that the devices are identical.
  ./nbody -benchmark -numdevices=2 -numbodies=65536
Adding more GPUs (3)
Once passthrough succeeds, lspci inside the VM shows the devices:
  ubuntu@guestos$ lspci -nn | grep -i nvidia
  00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
  00:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
Nova configuration (1)
The compute node needs a whitelist for PCI passthrough and for deploying GPU VMs.
Set pci_passthrough_whitelist in /etc/nova/nova.conf:
  pci_passthrough_whitelist={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
Nova configuration (2)
The controller node needs a nova alias, which the flavor-key step described later refers to.
Set pci_alias in /etc/nova/nova.conf:
  pci_alias={"name":"K4200","vendor_id":"10de","product_id":"11b4"}
Nova configuration (3)
On the controller node, add the PCI passthrough filter to nova.
Add the following to /etc/nova/nova.conf (the slide underlines the added parts):
  scheduler_available_filters=nova.scheduler.filters.all_filters
  scheduler_available_filters=nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter
  scheduler_default_filters=DifferentHostFilter,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,AggregateInstanceExtraSpecsFilter,PciPassthroughFilter
nova alias
To use the GPU, set a flavor key.
With set, add to an existing flavor the GPU alias name chosen in the PCI passthrough configuration and the number of GPU units you want.
The steps so far assume that two Quadro K4200 units have been added and that pci_alias is K4200, so the command looks like this:
  nova flavor-key $flavor_name set "pci_passthrough:alias"="K4200:$amount_of_gpu"
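The value of the `pci_passthrough:alias` key is the alias name and the GPU count joined by a colon. A minimal sketch of assembling that command line with a basic sanity check on the count follows. The helper name `flavor_key_command` and the flavor name `gpu.large` are hypothetical, and the command is only printed, not run.

```shell
# Print the nova CLI call that requests N passthrough GPUs for a flavor.
# The helper name is hypothetical; nothing is sent to nova.
#   $1 = flavor name, $2 = PCI alias name, $3 = number of GPUs
flavor_key_command() {
  case "$3" in
    ''|*[!0-9]*|0)
      echo "GPU count must be a positive integer" >&2
      return 1
      ;;
  esac
  printf 'nova flavor-key %s set "pci_passthrough:alias"="%s:%s"\n' "$1" "$2" "$3"
}

# Two Quadro K4200 units, as assumed in the deck; gpu.large is a made-up flavor.
flavor_key_command gpu.large K4200 2
```

The alias name has to match the name given to pci_alias in nova.conf, which is K4200 in this deck.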
Known issues
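Cloud image issues
Stock cloud images are far too small for GPU use, so the image has to be resized with qemu-img.
Installing the CUDA driver requires adding perl-packages to the existing cloud image.
Even as a .deb or .rpm package the CUDA driver is not shipped as binaries: it builds from source with make at install time, which makes it no different from the .run file.
Nvidia plans to fix this in a future release by declaring the perl-related files in the spec file.
They said it would be corrected in 7.6 or later, but I have not tested that yet.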
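Windows as VDI
CUDA on Windows is fast once it is installed properly, but it stutters from time to time.
This is a VM disk speed issue. Switching to ephemeral storage or to faster disks (SSD, NVMe, etc.) mostly resolves it.
The VM performs reads and writes across context switches, so heavy CUDA workloads can trigger this kind of stuttering.
I handed the work over to a colleague after this; apparently ephemeral storage improved things a little, and the customer was reasonably satisfied for the time being.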
Live migration
Live migration does not work: the VM cannot release its connection information for the old host and keeps holding on to it.
Workaround: delete the records in the nova.pci_devices table in MySQL and reboot the old host. Which defeats the purpose!
Rows in nova.pci_devices that still refer to the old host:
  | 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 45 | 21 | 0000:84:00.0 | 11b4 | 10de | type-PCI | pci_0000_84_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
  | 2016-08-11 00:54:45 | 2016-08-19 04:58:01 | NULL | 0 | 48 | 21 | 0000:88:00.0 | 11b4 | 10de | type-PCI | pci_0000_88_00_0 | label_10de_11b4 | available | {} | NULL | NULL | 1 | <<-- old-host
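Summary
Whether a special-purpose use of OpenStack works depends on the OS underneath.
Blaming OpenStack when the underlying OS is what falls short is something people say when they have missed the point.
Server OSes are surprisingly rough in many areas, and problems are plentiful.
For GPUs, passthrough is the current approach, but where it goes next depends on Nvidia and Intel (and AMD... what will they do?).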
Thank you Masafumi Ohta @masafumiohta