Slide 1

Slide 1 text

Mac GPU Programming in Go
Lightning Talks
Yusuke Miyake (三宅 悠介) / Pepabo R&D Institute, GMO Pepabo, Inc.
2023.12.18 Fukuoka.go#19 Reboot

Slide 2

Slide 2 text

Yusuke Miyake (三宅 悠介) / @monochromegane
Principal Engineer
Researcher, Pepabo R&D Institute
https://blog.monochromegane.com

Slide 3

Slide 3 text

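Introduction
• Generating random numbers that follow a multivariate normal distribution $y \sim \mathcal{N}(\mu, \Sigma)$, with $\mu \in \mathbb{R}^{D}$ and $\Sigma \in \mathbb{R}^{D \times D}$, takes time
• One procedure for generating these random numbers is as follows:
  1. Draw a random vector $z = \{z_i\}_{1 \le i \le D}$ whose elements each follow the standard normal distribution, $z_i \sim \mathcal{N}(0, 1)$
  2. Apply a Cholesky decomposition to the covariance matrix $\Sigma$ ($\Sigma = LL^\top$) to obtain the triangular matrix $L$
  3. Compute $y = \mu + Lz$
• Generation becomes particularly slow when the distribution parameters ($\mu$ and $\Sigma$) change on every call, or when the number of dimensions $D$ is large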
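Why step 3 yields the desired distribution can be checked in two lines. This is the standard argument, added here for reference; it uses only the symbols defined on the slide, plus the identity matrix $I$ for the covariance of $z$.

```latex
\begin{align*}
\mathbb{E}[y] &= \mu + L\,\mathbb{E}[z] = \mu, \\
\operatorname{Cov}[y] &= L\,\operatorname{Cov}[z]\,L^\top = L I L^\top = LL^\top = \Sigma .
\end{align*}
```

Because $y$ is an affine transform of the Gaussian vector $z$, it is itself Gaussian, so $y \sim \mathcal{N}(\mu, \Sigma)$.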

Slide 4

Slide 4 text

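Introduction
• Matrix computations contain many independent tasks that can run side by side, so parallelization is an effective way to speed them up
• That is, parallelization via SIMD, multiple CPU cores, GPUs, and so on
• The work is CPU-bound and the tasks are fine-grained, so goroutines are not a good fit (I think)
• Gonum, a matrix computation library for Go, provides bindings to BLAS implementations that support multiple CPU cores
• Apple silicon (M1) includes a GPU, so I would like to make use of that as well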

Slide 5

Slide 5 text

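Metal: GPU programming on Mac
• A framework that ships with the OS and provides access to the GPU
• Besides graphics processing, it can also handle parallel computation on the GPU
• From Objective-C or Swift, you call shader functions that describe the processing to run on the GPU
• Shader functions are written in the C++-based Metal Shading Language (MSL)
• A collection of ready-made shader functions, Metal Performance Shaders (MPS), is also provided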

Slide 6

Slide 6 text

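Metal: GPU programming on Mac
• The basic flow is to register shader functions, in units called command buffers, on the command queue of the device (GPU), and then receive the results
• Dedicated buffers are provided for exchanging values between the CPU and the GPU
Objective-C implementation example (annotated steps):
  1. Prepare the command queue
  2. Prepare the buffers used for data exchange
  3. Create matrix instances from the buffer data
  4. Initialize the MPS shader function, encode it into a command buffer, and register it on the queue
  5. Receive the result from the buffer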

Slide 7

Slide 7 text

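Cgo: Mac GPU programming in Go
• Using cgo, it appears that this Objective-C code can be called from Go
• https://github.com/a-h/gpu can be used as a library, but it does not support MPS
• https://github.com/mikecvet/go-mm implements calls to MPS, but contains benchmark code only
• With these as references, generating multivariate normal random numbers on the GPU from Go looks feasible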

Slide 8

Slide 8 text

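Cgo: Mac GPU programming in Go
Implementation example in Go:
1. Include the Objective-C header file and specify the Metal framework in LDFLAGS
2. (If needed) embed your own shader functions with go:embed
3. Call, as C.xx, the functions written in Objective-C that perform initialization and run the shader functions. Runtime parameters and results are accessed through the unsafe package.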
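The unsafe access mentioned in step 3 can be sketched without cgo or Metal. In the sketch below, `writeThroughPointer` is a hypothetical pure-Go stand-in for an Objective-C function that receives a raw pointer and a length; in real cgo code the same pointer would typically be cast to a C type such as `*C.float` before the call. The function names are assumptions made for illustration and do not come from the talk.

```go
package main

import (
	"fmt"
	"unsafe"
)

// writeThroughPointer is a hypothetical stand-in for a C/Objective-C function
// that only sees a raw pointer and an element count. It rebuilds a slice view
// over that memory and multiplies every element by factor in place.
func writeThroughPointer(p unsafe.Pointer, n int, factor float32) {
	buf := unsafe.Slice((*float32)(p), n)
	for i := range buf {
		buf[i] *= factor
	}
}

// scale passes the backing array of data across the "boundary" as a raw
// pointer, which is the same shape of call a cgo wrapper would make.
func scale(data []float32, factor float32) {
	if len(data) == 0 {
		return // &data[0] would panic on an empty slice
	}
	writeThroughPointer(unsafe.Pointer(&data[0]), len(data), factor)
}

func main() {
	data := []float32{1, 2, 3}
	scale(data, 2)
	fmt.Println(data) // prints [2 4 6]
}
```

Because the callee writes directly into the Go slice's memory, no copy is needed to read the result back, which matters when the buffers hold large matrices.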

Slide 9

Slide 9 text

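Cgo: Mac GPU programming in Go
• Multivariate normal random number generation on the GPU from Go:
• Pass $z, \mu, \Sigma$ from Go code to an Objective-C function via cgo
• Perform the Cholesky decomposition with MPS's MPSMatrixDecompositionCholesky
• Use a custom shader function to fill everything outside the lower triangle with 0
• Compute $Lz$ with MPS's MPSMatrixVectorMultiplication
• Compute $\mu + Lz$ with MPS's MPSMatrixSum
• Receive the result in Go code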
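The pipeline above can be mirrored on the CPU as a small reference, which is handy for checking GPU results. This is a minimal pure-Go sketch and not the MPS implementation from the talk: all function names are hypothetical, and the in-place factorization that leaves the strict upper triangle untouched is only an assumption about why a separate zero-fill pass is useful.

```go
package main

import (
	"fmt"
	"math"
)

// choleskyInPlace overwrites the lower triangle (diagonal included) of the
// row-major n x n matrix a with L such that a = L * L^T. The strict upper
// triangle keeps its original entries. It returns false if a is not
// positive definite.
func choleskyInPlace(a []float64, n int) bool {
	for j := 0; j < n; j++ {
		d := a[j*n+j]
		for k := 0; k < j; k++ {
			d -= a[j*n+k] * a[j*n+k]
		}
		if d <= 0 {
			return false
		}
		a[j*n+j] = math.Sqrt(d)
		for i := j + 1; i < n; i++ {
			s := a[i*n+j]
			for k := 0; k < j; k++ {
				s -= a[i*n+k] * a[j*n+k]
			}
			a[i*n+j] = s / a[j*n+j]
		}
	}
	return true
}

// zeroUpper sets every entry above the diagonal to 0.
func zeroUpper(a []float64, n int) {
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			a[i*n+j] = 0
		}
	}
}

// transformNorm maps a standard normal vector z to y = mu + L z,
// where sigma = L * L^T. sigma is not modified.
func transformNorm(z, mu, sigma []float64, n int) ([]float64, bool) {
	l := append([]float64(nil), sigma...) // work on a copy
	if !choleskyInPlace(l, n) {
		return nil, false
	}
	zeroUpper(l, n)
	y := make([]float64, n)
	for i := 0; i < n; i++ {
		y[i] = mu[i]
		for j := 0; j < n; j++ {
			y[i] += l[i*n+j] * z[j] // (L z)_i accumulated onto mu_i
		}
	}
	return y, true
}

func main() {
	sigma := []float64{4, 2, 2, 5} // [[4 2] [2 5]], so L = [[2 0] [1 2]]
	mu := []float64{1, -1}
	z := []float64{1, 1} // fixed here; normally drawn with rand.NormFloat64
	y, ok := transformNorm(z, mu, sigma, 2)
	fmt.Println(y, ok) // prints [3 2] true
}
```

Without the zero-fill step, the stale upper-triangle entries of the copy would be multiplied into $Lz$ and corrupt the result.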

Slide 10

Slide 10 text

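Comparison of the transformation speed for standard normal random numbers
• Compared the execution speed of the Gonum and Metal implementations (1000 dimensions)
  BenchmarkTransformNormMetal-8               9    117668310 ns/op
  BenchmarkTransformNormGonumBLAS-8          55     21494668 ns/op
  BenchmarkTransformNormGonum-8              21     54288034 ns/op
• Comparison when the Cholesky decomposition result is passed in separately
  BenchmarkTransformNormCholMetal-8        1140      1070094 ns/op
  BenchmarkTransformNormCholGonumBLAS-8   15124        78960 ns/op
  BenchmarkTransformNormCholGonum-8        6712       177368 ns/op
• The MPS Cholesky decomposition may be somewhat slow, but what accounts for the remaining difference?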

Slide 11

Slide 11 text

Comparison of matrix multiplication speed
• Compared the multiplication speed of a matrix (1000x1000) by a matrix (1000x1000)
  BenchmarkMatrixMultipicationMetal-8                494      2222134 ns/op
  BenchmarkMatrixMultipicationGonumBLAS-8             60     22063894 ns/op
  BenchmarkMatrixMultipicationGonum-8                 19     59507228 ns/op
• Compared the multiplication speed of a matrix (1000x1000) by a vector (1000x1)
  BenchmarkMatrixVectorMultipicationMetal-8         1497       792244 ns/op
  BenchmarkMatrixVectorMultipicationGonumBLAS-8    10000       114843 ns/op
  BenchmarkMatrixVectorMultipicationGonum-8          972      1239177 ns/op
• At the computational volume of a matrix-matrix product, the GPU is faster. For a matrix-vector product such as $Lz$, the overhead of offloading to the GPU was probably larger than the gain at this number of dimensions
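To make the two regimes concrete, the ns/op figures above can be turned into speedup ratios. The sketch below only does arithmetic on the numbers reported on this slide; the helper name `speedup` is hypothetical.

```go
package main

import "fmt"

// speedup reports how many times faster the candidate is than the baseline,
// given the ns/op of each. A value below 1 means the candidate is slower.
func speedup(baselineNs, candidateNs float64) float64 {
	return baselineNs / candidateNs
}

func main() {
	// ns/op values copied from the benchmark output on this slide.
	const (
		mmMetal = 2222134.0
		mmBLAS  = 22063894.0
		mvMetal = 792244.0
		mvBLAS  = 114843.0
	)
	fmt.Printf("matrix x matrix, Metal vs GonumBLAS: %.1fx\n", speedup(mmBLAS, mmMetal))
	fmt.Printf("matrix x vector, Metal vs GonumBLAS: %.2fx\n", speedup(mvBLAS, mvMetal))
}
```

On these figures Metal is roughly 10 times faster than the BLAS-backed Gonum for the matrix-matrix product, and roughly 7 times slower for the matrix-vector product.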

Slide 12

Slide 12 text

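Summary
• Introduced a way to do Mac GPU programming in Go
• Through a simple speed comparison, got a feel for where this approach is worth using
• According to the MPS documentation there is also support for neural networks, so interesting things seem possible depending on the idea
• I would like to build a library that can also call MPS
• It seems to leak memory, so I would like to improve that as well
• 三Go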
