Slide 1

Slide 1 text

Modern Vulkan
Naomasa Matsubayashi
Twitter: @fadis_
Source code: https://github.com/Fadis/gct/tree/kernelvm-online-4

Slide 2

Slide 2 text

Vulkan: a cross-platform API for operating GPUs. https://www.vulkan.org/

Slide 3

Slide 3 text

Vulkan: a cross-platform API for operating GPUs. https://www.vulkan.org/
Supported platforms: Windows, Nintendo Switch, Stadia, Android, Linux, and macOS / iOS / iPadOS through MoltenVK. Fuchsia and QNX are supported as well.

Slide 4

Slide 4 text

The 20th-century GPU = dedicated hardware for drawing 3D graphics + a mechanism that sends the framebuffer contents to the screen.

Slide 5

Slide 5 text

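GPU = dedicated hardware for drawing 3D graphics + a mechanism that sends the framebuffer contents to the screen.
As the computation demanded by 3D graphics grew more complex, this design broke down in no time.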

Slide 6

Slide 6 text

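The 21st-century GPU = processors that perform arbitrary computation + memory that holds executable binaries and data + a mechanism that sends the framebuffer contents to the screen.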

Slide 7

Slide 7 text

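The 21st-century GPU = processors that perform arbitrary computation + memory that holds executable binaries and data + a mechanism that sends the framebuffer contents to the screen.
Why can this approach compute faster than a CPU?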

Slide 8

Slide 8 text

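The 21st-century GPU = a large number of processors that perform arbitrary computation + memory that holds executable binaries and data + a mechanism that sends the framebuffer contents to the screen.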

Slide 9

Slide 9 text

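Warp (Subgroup), on a GeForce RTX 3080: float x32 ALUs, a Tensor Core, load/store units, dispatch, an instruction cache, and a register bank.
It carries none of the complex dependency-checking logic needed for superscalar execution, so the transistor count of a single processor stays small.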

Slide 10

Slide 10 text

Streaming Multiprocessor (Work Group), on a GeForce RTX 3080: float x128, shared memory, L1 cache, and an RT Core.

Slide 11

Slide 11 text

Texture Processing Cluster, on a GeForce RTX 3080: float x256, plus PolyMorph.

Slide 12

Slide 12 text

Graphics Processing Clusters: float x1536, plus a rasterizer and Raster Operators.

Slide 13

Slide 13 text

Graphics Processing Unit (Physical Device): float x10752, L2 cache, a PCI-Express host interface, and an NVLink host interface.

Slide 14

Slide 14 text

Device Group: float x21504, with the devices linked over PCI-Express or NVLink.

Slide 15

Slide 15 text

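A GPU operates on a large amount of data in a single clock, so even if each individual processor is somewhat slow, it can overwhelm a CPU.
"xx times faster than a CPU!" "Seriously?"
Then why doesn't the CPU do the same?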

Slide 16

Slide 16 text

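Unless at least as many data items as can be computed in one clock are available at the same time, some ALUs sit idle and the GPU becomes nothing more than a slow computer.
(Diagram: value 1, value 2, value 3 occupy three ALUs; the rest are unused because there are only 3 data items in total.)
Whether this condition can be met depends on the task.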
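The idle-ALU argument can be put into numbers. The following is a minimal sketch, not taken from the talk: `lane_utilization` is a hypothetical helper that assumes the hardware processes a fixed number of elements (`lanes`) per clock and that a partially filled clock still costs a full clock.

```cpp
#include <cassert>
#include <cstddef>

// Fraction of ALU lanes doing useful work when `items` data elements are
// processed on hardware that handles `lanes` elements per clock.
// (Hypothetical helper for illustration; not part of Vulkan or gct.)
double lane_utilization(std::size_t items, std::size_t lanes) {
  assert(lanes > 0);
  if (items == 0) return 0.0;
  // Clocks needed: ceil(items / lanes). A partially filled clock is still a clock.
  const std::size_t clocks = (items + lanes - 1) / lanes;
  return static_cast<double>(items) / static_cast<double>(clocks * lanes);
}
```

With the slide's example of 3 data items on a 32-wide warp, `lane_utilization(3, 32)` is 3/32, so more than 90% of the lanes do nothing.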

Slide 17

Slide 17 text

Given a new kind of task, ask: does it have sufficient parallelism? (Yes / No)

Slide 18

Slide 18 text

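Division of labor.
GPU: "I'll leave the tedious stuff like the OS to you. I'm only going to do things like deep learning."
CPU: "That's harsh."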

Slide 19

Slide 19 text

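How to drive a GPU (an extremely rough outline)
1. Send data to the GPU's memory (input).
2. Run an executable binary on the GPU.
3. Retrieve the result from the GPU's memory (output).
There are GPUs from many vendors; Vulkan is the API that performs these operations regardless of vendor.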

Slide 20

Slide 20 text

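Sending data to GPU memory
Memory obtained with an ordinary malloc does not look like a contiguous region from a PCI-Express device, because the device sees memory through a different MMU (the IOMMU).
(Diagram: the CPU says "go fetch the data at 0x4000", but 0x4000 behind the MMU and 0x4000 behind the IOMMU are not the same place.)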

Slide 21

Slide 21 text

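Sending data to GPU memory
Allocate a region in main memory that looks the same from both the MMU and the IOMMU, then copy the data you want to send to the GPU into that region.
(Diagram: the data at 0x4000 in the process is copied to a region that both the MMU and the IOMMU see at 0x1000.)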

Slide 22

Slide 22 text

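Sending data to GPU memory
The GPU cannot cache memory that the CPU might rewrite.
So the data in the region in CPU memory is copied to a region allocated in GPU memory.
(Diagram: 0x1000 in CPU memory, seen through the IOMMU, is copied to 0x5000 in GPU memory.)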

Slide 23

Slide 23 text

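Sending data to GPU memory
Allocating the original region (0x4000): malloc is fine.
Copying from it into the shared region (0x1000): memcpy is fine.
Allocating the shared region (0x1000): needs a dedicated API.
Allocating the region in GPU memory (0x5000): also needs a dedicated API.
Copying from the shared region to GPU memory: needs a dedicated API.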

Slide 24

Slide 24 text

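Sending data to GPU memory
Allocating the original region (0x4000): malloc is fine.
Copying from it into the shared region (0x1000): memcpy is fine.
Allocating the shared region (0x1000): vkAllocateMemory.
Allocating the region in GPU memory (0x5000): vkAllocateMemory.
Copying from the shared region to GPU memory: vkCmdCopyBuffer.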

Slide 25

Slide 25 text

Sending data to GPU memory
A region like this one (the shared region at 0x1000, visible to both the MMU and the IOMMU) is called a staging buffer.

Slide 26

Slide 26 text

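Retrieving the result from GPU memory
Returning data to the CPU works the same way, in reverse: vkAllocateMemory for both the GPU-side region and the staging buffer, vkCmdCopyBuffer from GPU memory to the staging buffer, then memcpy into memory obtained with malloc.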

Slide 27

Slide 27 text

The GPU must be able to interpret signed integers and floating-point numbers copied from the CPU in the same way as the CPU does, without any conversion.

Slide 28

Slide 28 text

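From the Vulkan 1.0 specification:
https://www.khronos.org/registry/vulkan/specs/1.0/html/chap3.html#fundamentals-host-environment
https://www.khronos.org/registry/vulkan/specs/1.0/html/chap36.html#spirvenv-precision-operation
32-bit and 64-bit floating-point numbers follow IEEE Std 754-2008.
Signed integers use two's complement representation.
The CPU and the GPU support the same endianness.
Any CPU and GPU in a Vulkan-capable environment are required to behave this way, so values copied as-is can be read directly.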
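The host-side half of these requirements can be checked from ordinary C++. This is a minimal sketch under the assumption that the checks run on the CPU that will feed the GPU; the function names are hypothetical and nothing here calls Vulkan.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// True if float and double are 32-bit and 64-bit IEEE 754 (IEC 559) types.
bool floats_are_ieee754() {
  return std::numeric_limits<float>::is_iec559 && sizeof(float) == 4 &&
         std::numeric_limits<double>::is_iec559 && sizeof(double) == 8;
}

// True if signed integers are two's complement: -1 is stored as all ones.
bool ints_are_twos_complement() {
  const std::int32_t value = -1;
  std::uint32_t bits = 0;
  std::memcpy(&bits, &value, sizeof bits);  // inspect the stored bit pattern
  return bits == 0xFFFFFFFFu;
}

// True if the least significant byte of an integer is stored first.
bool host_is_little_endian() {
  const std::uint32_t value = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &value, 1);
  return first_byte == 1;
}
```

If the first two functions return true, a plain byte copy of floats and signed integers into a staging buffer is read back by the shader with the same values.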

Slide 29

Slide 29 text

Use vkGetPhysicalDeviceMemoryProperties to find out which memory is available.

"memory_props": {
  "basic": {
    "memoryHeaps": [
      { "flags": 1, "size": 8589934592 },
      { "flags": 0, "size": 12528737280 },
      { "flags": 1, "size": 257949696 }
    ],
    "memoryTypes": [
      { "heapIndex": 1, "propertyFlags": 0 },
      { "heapIndex": 1, "propertyFlags": 0 },
      { "heapIndex": 1, "propertyFlags": 0 },
      { "heapIndex": 1, "propertyFlags": 0 },
      { "heapIndex": 1, "propertyFlags": 0 },
      { "heapIndex": 1, "propertyFlags": 0 },
      { "heapIndex": 1, "propertyFlags": 0 },
      { "heapIndex": 0, "propertyFlags": 1 },
      { "heapIndex": 1, "propertyFlags": 6 },
      { "heapIndex": 1, "propertyFlags": 14 },
      { "heapIndex": 2, "propertyFlags": 7 }
    ]
  }
}

memoryHeaps: GPU memory has two independent heaps (the entries with flags 1); CPU memory has one independent heap (the entry with flags 0).

Slide 30

Slide 30 text

Use vkGetPhysicalDeviceMemoryProperties to find out which memory is available (same output as on the previous slide).
memoryTypes: the memory types, which describe what kind of behavior the memory you allocate will have.
Some of the entries are for special purposes, so ignore them for now.

Slide 31

Slide 31 text

Use vkGetPhysicalDeviceMemoryProperties to find out which memory is available (same output as before). Reading the last four memoryTypes entries:
heapIndex 0, propertyFlags 1: a region in GPU memory that is visible only from the GPU.
heapIndex 1, propertyFlags 6: a region in CPU memory that is visible from the CPU and not cached by the CPU.
heapIndex 1, propertyFlags 14: a region in CPU memory that is visible from the CPU and cached by the CPU.
heapIndex 2, propertyFlags 7: a region in GPU memory that is visible from the CPU and not cached by the CPU.

Slide 32

Slide 32 text

Special memory is allocated with vkAllocateMemory.

VkResult vkAllocateMemory(
    VkDevice device,                            // for this GPU
    const VkMemoryAllocateInfo* pAllocateInfo,
    const VkAllocationCallbacks* pAllocator,
    VkDeviceMemory* pMemory );

typedef struct VkMemoryAllocateInfo {
    VkStructureType sType;
    const void* pNext;
    VkDeviceSize allocationSize;                // give me memory of this size
    uint32_t memoryTypeIndex;                   // and of this memory type
} VkMemoryAllocateInfo;
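Choosing the value for memoryTypeIndex means scanning the memoryTypes table for an entry with the wanted properties. Below is a minimal sketch of that search without any Vulkan headers. `MemoryType`, `slide_memory_types`, and `find_memory_type` are hypothetical names; the table is the one printed on the earlier slides, and the constants use the bit values of Vulkan's VK_MEMORY_PROPERTY_* flags. The `type_bits` parameter stands in for the memoryTypeBits mask that vkGetBufferMemoryRequirements reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same bit values as Vulkan's VK_MEMORY_PROPERTY_* flags.
constexpr std::uint32_t kDeviceLocal  = 1;  // DEVICE_LOCAL
constexpr std::uint32_t kHostVisible  = 2;  // HOST_VISIBLE
constexpr std::uint32_t kHostCoherent = 4;  // HOST_COHERENT
constexpr std::uint32_t kHostCached   = 8;  // HOST_CACHED

struct MemoryType {
  std::uint32_t heap_index;
  std::uint32_t property_flags;
};

// The memoryTypes array from the slides, in the same order.
std::vector<MemoryType> slide_memory_types() {
  return {
    {1, 0}, {1, 0}, {1, 0}, {1, 0}, {1, 0}, {1, 0}, {1, 0},
    {0, 1}, {1, 6}, {1, 14}, {2, 7}
  };
}

// Returns the index of the first memory type that is allowed by `type_bits`
// and has every flag in `required`, or -1 if there is none.
int find_memory_type(const std::vector<MemoryType>& types,
                     std::uint32_t type_bits, std::uint32_t required) {
  assert(types.size() <= 32);
  for (std::size_t i = 0; i < types.size(); ++i) {
    const bool allowed = ((type_bits >> i) & 1u) != 0;
    const bool suitable = (types[i].property_flags & required) == required;
    if (allowed && suitable) return static_cast<int>(i);
  }
  return -1;
}
```

On this table, asking for `kDeviceLocal` yields index 7 (GPU-only memory), and asking for `kHostVisible | kHostCoherent` yields index 8, which is the kind of memory a staging buffer would use.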

Slide 33

Slide 33 text

Declare that the allocated memory will be used as a buffer that holds data for computation.
(Diagram: VkDeviceMemory wrapped in a VkBuffer: "the contents of this memory are general-purpose data".)

VkResult vkCreateBuffer(
    VkDevice device,                            // for this GPU
    const VkBufferCreateInfo* pCreateInfo,
    const VkAllocationCallbacks* pAllocator,
    VkBuffer* pBuffer );

typedef struct VkBufferCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkBufferCreateFlags flags;
    VkDeviceSize size;                          // create a buffer of this size
    VkBufferUsageFlags usage;                   // for this kind of use
    VkSharingMode sharingMode;
    uint32_t queueFamilyIndexCount;
    const uint32_t* pQueueFamilyIndices;
} VkBufferCreateInfo;

Slide 34

Slide 34 text

Declare that the allocated memory will be used as a buffer that holds data for computation.
After creating the buffer with vkCreateBuffer (size, usage, and device as described for VkBufferCreateInfo), bind memory to it:

VkResult vkBindBufferMemory(
    VkDevice device,
    VkBuffer buffer,                            // this buffer
    VkDeviceMemory memory,                      // uses this memory
    VkDeviceSize memoryOffset );

Slide 35

Slide 35 text

Memory types that carry the CPU-visible property (from the vkGetPhysicalDeviceMemoryProperties output shown earlier):
heapIndex 0, propertyFlags 1: a region in GPU memory that is visible only from the GPU (not CPU-visible).
heapIndex 1, propertyFlags 6: a region in CPU memory that is visible from the CPU and not cached by the CPU.
heapIndex 1, propertyFlags 14: a region in CPU memory that is visible from the CPU and cached by the CPU.
heapIndex 2, propertyFlags 7: a region in GPU memory that is visible from the CPU and not cached by the CPU.

Slide 36

Slide 36 text

Memory with the CPU-visible property can be mapped with vkMapMemory.

VkResult vkMapMemory(
    VkDevice device,
    VkDeviceMemory memory,                      // of this memory,
    VkDeviceSize offset,                        // starting at this position,
    VkDeviceSize size,                          // a range of this length
    VkMemoryMapFlags flags,
    void** ppData );                            // its start address is returned here

From vkMapMemory until vkUnmapMemory, the memory is mapped into the process's address space.

Slide 37

Slide 37 text

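To make the GPU do something, you push commands into a queue.
(Diagram: commands flow to the GPU, results flow back.)
Here we want the GPU to pull in data that sits in CPU memory, using vkCmdCopyBuffer.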

Slide 38

Slide 38 text

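Commands are bundled into a command buffer and sent together.
For each command buffer, exactly one completion notification comes back: "the contents of the command buffer have completed".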

Slide 39

Slide 39 text

A single GPU may have multiple queues.
Writes to the same queue must be performed exclusively, but writes to different queues may be performed from multiple CPUs at the same time.

Slide 40

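Slide 40 text

Use vkGetPhysicalDeviceQueueFamilyProperties to find out which queues are available.

"queue_family": [
  { "basic": { "minImageTransferGranularity": { ... }, "queueCount": 16, "queueFlags": 15, "timestampValidBits": 64 } },
  { "basic": { "minImageTransferGranularity": { ... }, "queueCount": 2, "queueFlags": 12, "timestampValidBits": 64 }

queueFlags is a bit mask: 1 = graphics, 2 = compute, 4 = transfer, 8 = sparse binding.
queueFlags 15, queueCount 16: there are 16 queues that accept graphics commands, commands for computing on the GPU, and commands for transferring data.
queueFlags 12, queueCount 2: there are 2 queues that accept commands for transferring data (but not compute or graphics commands).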

Slide 41

Slide 41 text

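  { "basic": { "minImageTransferGranularity": { ... }, "queueCount": 2, "queueFlags": 12, "timestampValidBits": 64 } },
  { "basic": { "minImageTransferGranularity": { ... }, "queueCount": 8, "queueFlags": 14, "timestampValidBits": 64 } },

queueFlags 12 (transfer + sparse binding), queueCount 2: 2 queues that accept only data transfer commands.
queueFlags 14 (compute + transfer + sparse binding), queueCount 8: 8 queues that accept commands for computing on the GPU and commands for transferring data.
A family that accepts transfer commands but not compute indicates DMA engines that can run independently of the GPU's ALUs.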
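The queueFlags values are easier to read once decoded bit by bit. This is a minimal sketch with no Vulkan headers; `describe_queue_flags` is a hypothetical helper, and the four masks are the values of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT, VK_QUEUE_TRANSFER_BIT, and VK_QUEUE_SPARSE_BINDING_BIT. Bits beyond those four are ignored here.

```cpp
#include <cstdint>
#include <string>

// Turns a queueFlags bit mask into a readable list of capabilities.
std::string describe_queue_flags(std::uint32_t flags) {
  struct Bit {
    std::uint32_t mask;
    const char* name;
  };
  static const Bit bits[] = {
    {1u, "graphics"},        // VK_QUEUE_GRAPHICS_BIT
    {2u, "compute"},         // VK_QUEUE_COMPUTE_BIT
    {4u, "transfer"},        // VK_QUEUE_TRANSFER_BIT
    {8u, "sparse_binding"},  // VK_QUEUE_SPARSE_BINDING_BIT
  };
  std::string out;
  for (const Bit& bit : bits) {
    if ((flags & bit.mask) != 0) {
      if (!out.empty()) out += '|';
      out += bit.name;
    }
  }
  return out;
}
```

Applied to the three families in the dump, 15 decodes to all four capabilities, 12 to transfer and sparse binding, and 14 to compute, transfer, and sparse binding.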

Slide 43

Slide 43 text

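Commands sometimes have to be recorded into dedicated memory, so command buffers are allocated from a dedicated memory pool, the command pool.
vkCreateCommandPool: create a pool from the device.
vkAllocateCommandBuffers: obtain command buffers from the pool.
vkFreeCommandBuffers: return the command buffers once you are done with them.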

Slide 44

Slide 44 text

Create a command pool (vkCreateCommandPool), allocate a command buffer from it (vkAllocateCommandBuffers), record vkCmdCopyBuffer into the command buffer, and submit it to a queue to execute it.

VkResult vkQueueSubmit(
    VkQueue queue,                              // to this queue
    uint32_t submitCount,
    const VkSubmitInfo* pSubmits,
    VkFence fence );

Slide 45

Slide 45 text

Record vkCmdCopyBuffer into a command buffer and submit it to a queue with vkQueueSubmit to execute it. The VkSubmitInfo passed in pSubmits names the command buffers to push:

typedef struct VkSubmitInfo {
    VkStructureType sType;
    const void* pNext;
    uint32_t waitSemaphoreCount;
    const VkSemaphore* pWaitSemaphores;
    const VkPipelineStageFlags* pWaitDstStageMask;
    uint32_t commandBufferCount;
    const VkCommandBuffer* pCommandBuffers;     // push these command buffers
    uint32_t signalSemaphoreCount;
    const VkSemaphore* pSignalSemaphores;
} VkSubmitInfo;

Slide 46

Slide 46 text

Create a fence to receive the completion notification.

VkResult vkCreateFence(
    VkDevice device,
    const VkFenceCreateInfo* pCreateInfo,
    const VkAllocationCallbacks* pAllocator,
    VkFence* pFence );

VkResult vkQueueSubmit(
    VkQueue queue,
    uint32_t submitCount,
    const VkSubmitInfo* pSubmits,
    VkFence fence );                            // the fence to signal

VkResult vkWaitForFences(
    VkDevice device,
    uint32_t fenceCount,
    const VkFence* pFences,
    VkBool32 waitAll,
    uint64_t timeout );

vkWaitForFences waits until the contents of the command buffers submitted with that fence have completed, or until the timeout has elapsed.

Slide 47

Slide 47 text

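How to drive a GPU (an extremely rough outline)
1. Send data to the GPU's memory (input).
2. Run an executable binary on the GPU. This binary is the shader.
3. Retrieve the result from the GPU's memory (output).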

Slide 48

Slide 48 text

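"If it runs with a GeForce plugged in, it'll run with a RADEON plugged in too, right?"
That is how the typical PC builder thinks.
GPU instruction sets differ from vendor to vendor, but it is hard to get people to understand this.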

Slide 49

Slide 49 text

Diff of the vector ALU instructions of AMD GCN and AMD RDNA2 (two excerpts):

--- gcn.list 2021-11-09 02:04:47.899271324 +0900
+++ rdna2.list 2021-11-09 02:22:47.976688357 +0900
@@ -1,29 +1,41 @@
-V_ADDC_U32
+V_ADD3_U32
+V_ADD_CO_CI_U32
+V_ADD_CO_U32
+V_ADD_F16
 V_ADD_F32
 V_ADD_F64
-V_ADD_I32
+V_ADD_LSHL_U32
+V_ADD_NC_I16
+V_ADD_NC_I32
+V_ADD_NC_U16
+V_ADD_NC_U32
 V_ALIGNBIT_B32
 V_ALIGNBYTE_B32
 V_AND_B32
-V_ASHRREV_I32
-V_ASHR_I32
-V_ASHR_I64
+V_AND_OR_B32
+V_ASHRREV_B32
+V_ASHRREV_I16
+V_ASHRREV_I64
 V_BCNT_U32_B32
 V_BFE_I32
 V_BFE_U32
 V_BFI_B32
 V_BFM_B32
 V_BFREV_B32
+V_CEIL_F16
 V_CEIL_F32
 V_CEIL_F64
 V_CLREXCP
 V_CNDMASK_B32
+V_COS_F16
 V_COS_F32
 V_CUBEID_F32
 V_CUBEMA_F32
 V_CUBESC_F32
 V_CUBETC_F32
 V_CVT_F16_F32
+V_CVT_F16_I16
+V_CVT_F16_U16
 V_CVT_F32_F16
 V_CVT_F32_F64
 V_CVT_F32_I32
@@ -36,135 +48,205 @@
 V_CVT_F64_I32
 V_CVT_F64_U32
 V_CVT_FLR_I32_F32
+V_CVT_I16_F16
 V_CVT_I32_F32
 V_CVT_I32_F64
+V_CVT_NORM_I16_F16

 V_MAC_F32
-V_MAC_LEGACY_F32
-V_MADAK_F32
-V_MADI64_I32
-V_MADMK_F32
-V_MADU64_U32
-V_MAD_F32
+V_MAD_I16
+V_MAD_I32_I16
 V_MAD_I32_I24
-V_MAD_LEGACY_F32
+V_MAD_I64_I32
+V_MAD_U16
+V_MAD_U32_U16
 V_MAD_U32_U24
+V_MAD_U64_U32
+V_MAX3_F16
 V_MAX3_F32
+V_MAX3_I16
 V_MAX3_I32
+V_MAX3_U16
 V_MAX3_U32
+V_MAX_F16
 V_MAX_F32
 V_MAX_F64
+V_MAX_I16
 V_MAX_I32
-V_MAX_LEGACY_F32
+V_MAX_U16
 V_MAX_U32
 V_MBCNT_HI_U32_B32
 V_MBCNT_LO_U32_B32
+V_MED3_F16
 V_MED3_F32
 V_MED3_I32
 V_MED3_U32
+V_MIN3_F16
 V_MIN3_F32
+V_MIN3_I16
 V_MIN3_I32
+V_MIN3_U16
 V_MIN3_U32
+V_MIN_F16
 V_MIN_F32
 V_MIN_F64
+V_MIN_I16
 V_MIN_I32
-V_MIN_LEGACY_F32
+V_MIN_U16
 V_MIN_U32
 V_MOVRELD_B32
+V_MOVRELSD_2_B32
 V_MOVRELSD_B32
 V_MOVRELS_B32
 V_MOV_B32
+V_MOV_FED_B32
 V_MQSAD_PK_U16_U8

Quite a number of instructions have been removed in the newer RDNA2.
Even when GPUs come from the same vendor, instruction-set compatibility tends to be lost.

Slide 50

Slide 50 text

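If you prepare a GPU executable binary directly and run it, it only works on one specific GPU.
(Diagram: at compile time, an executable binary for GPU A is produced; at run time it works on GPU A but not on GPU B or GPU C.)
Home game consoles, where the hardware can be fixed, do exactly this.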

Slide 51

Slide 51 text

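In the case of OpenGL: GLSL (a high-level language) is shipped as-is and the shader is compiled at run time, for whichever of GPU A, GPU B, or GPU C is present. This takes time.

void main() {
  vec3 normal = normalize( input_normal.xyz );
  vec3 pos = input_position.xyz;
  vec3 N = normal;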

Slide 52

Slide 52 text

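A compiler's work divides broadly into four stages, from the high-level language (the GLSL source) to the executable binary:
1. Lexical and syntactic analysis: high-level language to AST.
2. Target-independent optimization: AST to AST.
3. Target-specific optimization: AST to AST.
4. Target binary generation: AST to executable binary.
(The slide draws each AST as the tree for a × b + 3.)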

Slide 53

Slide 53 text

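The four compiler stages again: lexical and syntactic analysis, target-independent optimization, target-specific optimization, target binary generation.
The target-specific optimization and binary generation must be done for each GPU, so they have no choice but to happen at run time.
The lexical and syntactic analysis and the target-independent optimization can be taken care of in advance without any problem.
So let's serialize the AST at that stage into a binary format and carry that around.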

Slide 54

Slide 54 text

The four compiler stages again: lexical and syntactic analysis, target-independent optimization, target-specific optimization, target binary generation.
The first two stages can be taken care of in advance without any problem.
The AST at that stage, serialized into a binary format and carried around, is SPIR-V.

Slide 55

Slide 55 text

In the case of Vulkan: at compile time, glslc translates GLSL (a high-level language) into SPIR-V. At run time, the SPIR-V is passed to vkCreateShaderModule, and the driver of GPU A, GPU B, or GPU C finishes the compilation.

Slide 56

Slide 56 text

A simple GLSL example:

#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
layout(local_size_x_id = 1, local_size_y_id = 2 ) in;
layout(std430, binding = 1) buffer layout1 {
  float output_data[];
};
layout(constant_id = 3) const float value = 1;
void main() {
  const uint x = gl_GlobalInvocationID.x;
  const uint y = gl_GlobalInvocationID.y;
  const uint width = gl_WorkGroupSize.x * gl_NumWorkGroups.x;
  const uint index = x + y * width;
  output_data[ index ] += value;
}

Slide 57

Slide 57 text

The same shader, highlighting the buffer declaration:

layout(std430, binding = 1) buffer layout1 {
  float output_data[];
};

Slide 58

Slide 58 text

The same shader: from the thread ID, decide where in the buffer to write.

  const uint x = gl_GlobalInvocationID.x;
  const uint y = gl_GlobalInvocationID.y;
  const uint width = gl_WorkGroupSize.x * gl_NumWorkGroups.x;
  const uint index = x + y * width;

Slide 59

Slide 59 text

The same shader: add 1 to one element of the buffer. value is 1, so every execution increments the values in the buffer.

layout(constant_id = 3) const float value = 1;
  output_data[ index ] += value;

Slide 60

Slide 60 text

The same shader: the buffer at binding = 1 is tied to output_data.

layout(std430, binding = 1) buffer layout1 {
  float output_data[];
};

But which buffer is "the buffer at binding = 1"?

Slide 61

Slide 61 text

Descriptor set
A descriptor set associates the shader's bindings with the buffers created by vkCreateBuffer. In the diagram, buffers A, B, and C are each attached to a binding, and the shader's writes to output_data (binding = 1) land in the buffer registered for that binding.
The associations are registered with vkUpdateDescriptorSets.

Slide 62

Slide 62 text

Descriptor pool
A descriptor set may use a limited number of hardware registers, so descriptor sets are allocated from a descriptor pool.
Allocate with vkAllocateDescriptorSets; when a set is no longer needed, return it with vkFreeDescriptorSets.

Slide 63

Slide 63 text

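Descriptor set layout
A descriptor set layout expresses which kind of descriptor set you want: how many descriptors it provides, and for associating what.
Example request to the descriptor pool: "Please give me a descriptor set that has three descriptors for buffers."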

Slide 64

Slide 64 text

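Descriptor set layout
A descriptor set layout expresses which kind of descriptor set you want: how many descriptors it provides, and for associating what.
Example request to the descriptor pool: "Please give me a descriptor set that has three descriptors for buffers."
Couldn't that be worked out by reading the SPIR-V?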

Slide 65

Slide 65 text

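Q. Couldn't that be worked out by reading the SPIR-V?
A. Yes, it can. That is why there is a library that digs the bindings out of SPIR-V:
SPIRV-Reflect https://github.com/KhronosGroup/SPIRV-Reflect
This way, the feature does not have to be implemented in every vendor's GPU driver.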

Slide 66

Slide 66 text

Compute pipeline
A compute pipeline joins a shader module and descriptor set layouts together. If they join, no binding is left unattended.

VkResult vkCreateComputePipelines(
    VkDevice device,
    VkPipelineCache pipelineCache,
    uint32_t createInfoCount,
    const VkComputePipelineCreateInfo* pCreateInfos,
    const VkAllocationCallbacks* pAllocator,
    VkPipeline* pPipelines );

typedef struct VkComputePipelineCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkPipelineCreateFlags flags;
    VkPipelineShaderStageCreateInfo stage;
    VkPipelineLayout layout;
    VkPipeline basePipelineHandle;
    int32_t basePipelineIndex;
} VkComputePipelineCreateInfo;

Slide 67

Slide 67 text

Compute pipeline
The stage member of VkComputePipelineCreateInfo is a VkPipelineShaderStageCreateInfo, which carries the shader module:

typedef struct VkPipelineShaderStageCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkPipelineShaderStageCreateFlags flags;
    VkShaderStageFlagBits stage;
    VkShaderModule module;                      // the shader module
    const char* pName;
    const VkSpecializationInfo* pSpecializationInfo;
} VkPipelineShaderStageCreateInfo;

Slide 68

Slide 68 text

Compute pipeline
The layout member of VkComputePipelineCreateInfo is a VkPipelineLayout, which carries the descriptor set layouts:

VkResult vkCreatePipelineLayout(
    VkDevice device,
    const VkPipelineLayoutCreateInfo* pCreateInfo,
    const VkAllocationCallbacks* pAllocator,
    VkPipelineLayout* pPipelineLayout );

typedef struct VkPipelineLayoutCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkPipelineLayoutCreateFlags flags;
    uint32_t setLayoutCount;
    const VkDescriptorSetLayout* pSetLayouts;   // the descriptor set layouts
    uint32_t pushConstantRangeCount;
    const VkPushConstantRange* pPushConstantRanges;
} VkPipelineLayoutCreateInfo;

Slide 69

Slide 69 text

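Pipeline cache
The pipelineCache argument of vkCreateComputePipelines remembers things such as executable binaries that have already been built once.

VkResult vkCreateComputePipelines(
    VkDevice device,
    VkPipelineCache pipelineCache,              // this one
    uint32_t createInfoCount,
    const VkComputePipelineCreateInfo* pCreateInfos,
    const VkAllocationCallbacks* pAllocator,
    VkPipeline* pPipelines );

If creation of a pipeline with the same contents as before is requested, the contents of the cache are used.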

Slide 70

Slide 70 text

Pipeline cache

VkResult vkCreatePipelineCache(
    VkDevice device,
    const VkPipelineCacheCreateInfo* pCreateInfo,
    const VkAllocationCallbacks* pAllocator,
    VkPipelineCache* pPipelineCache );

typedef struct VkPipelineCacheCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkPipelineCacheCreateFlags flags;
    size_t initialDataSize;
    const void* pInitialData;
} VkPipelineCacheCreateInfo;

Slide 71

Slide 71 text

Pipeline cache
The cache contents can be read out with vkGetPipelineCacheData and written to secondary storage. On the next launch, they are passed back through initialDataSize and pInitialData of VkPipelineCacheCreateInfo, which avoids compiling the shaders again.

VkResult vkGetPipelineCacheData(
    VkDevice device,
    VkPipelineCache pipelineCache,
    size_t* pDataSize,
    void* pData );
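The secondary-storage half of this round trip is ordinary file I/O. The sketch below shows only that half and calls no Vulkan function: `save_blob` and `load_blob` are hypothetical helpers, the byte vector stands in for the data that vkGetPipelineCacheData writes into pData, and the file name is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Writes an opaque blob (for example, pipeline cache data) to a file.
bool save_blob(const std::string& path, const std::vector<std::uint8_t>& data) {
  assert(!path.empty());
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out) return false;
  out.write(reinterpret_cast<const char*>(data.data()),
            static_cast<std::streamsize>(data.size()));
  return out.good();
}

// Reads the blob back. An empty result means "no cache yet"; in that case
// initialDataSize would simply be 0 when creating the pipeline cache.
std::vector<std::uint8_t> load_blob(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return {};
  return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                   std::istreambuf_iterator<char>());
}
```

At shutdown the application would call `save_blob` with the bytes from vkGetPipelineCacheData; at startup it would pass the result of `load_blob` as pInitialData. A driver is free to reject data it does not recognize, so a missing or stale file only costs a recompile.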

Slide 73

Slide 73 text

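[v1 , v2 , v3 , v4 , v5 , v6 , v7 , v8 , v9 , v10]
What remains is to say how many threads to run.

void vkCmdDispatch(
    VkCommandBuffer commandBuffer,              // into this command buffer
    uint32_t groupCountX,
    uint32_t groupCountY,
    uint32_t groupCountZ );

vkCmdDispatch records a request to start execution with groupCountX × groupCountY × groupCountZ work groups; each work group contains as many threads as the shader's local size specifies.
When this command is pushed into a queue, the shader runs on the GPU.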
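What one dispatch of the example shader does can be emulated on the CPU. This is a minimal sketch, not the talk's code: `Extent` and `emulate_dispatch` are hypothetical names, only the X and Y dimensions are modeled, and the loop runs the invocations one after another whereas the GPU runs them in parallel. The local size is a specialization constant in the shader, so any concrete value used with this function is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Extent {
  std::uint32_t x;
  std::uint32_t y;
};

// Emulates one dispatch of the example shader: every invocation adds `value`
// to the element selected by its global invocation ID.
void emulate_dispatch(std::vector<float>& output_data, Extent group_count,
                      Extent local_size, float value) {
  // gl_WorkGroupSize.x * gl_NumWorkGroups.x in the shader.
  const std::uint32_t width = local_size.x * group_count.x;
  const std::uint32_t height = local_size.y * group_count.y;
  assert(output_data.size() >= static_cast<std::size_t>(width) * height);
  for (std::uint32_t y = 0; y < height; ++y) {    // gl_GlobalInvocationID.y
    for (std::uint32_t x = 0; x < width; ++x) {   // gl_GlobalInvocationID.x
      const std::size_t index = x + static_cast<std::size_t>(y) * width;
      output_data[index] += value;
    }
  }
}
```

For example, with a group count of 4 × 2 and an assumed local size of 16 × 8, the dispatch covers 64 × 16 = 1024 elements, and a zero-filled buffer of 1024 floats ends up holding 1.0 everywhere.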

Slide 74

Slide 74 text

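When several vkCmdDispatch commands are pushed into a queue through a command buffer, there is no guarantee that they execute in order.
If the GPU's processors have capacity to spare, several vkCmdDispatch commands may run at the same time (for example a 32-thread dispatch next to a 64-thread dispatch).
A vkCmdDispatch that has stalled may also be put off until later.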

Slide 75

Slide 75 text

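When there is a data dependency between several vkCmdDispatch commands, stating the dependency explicitly with vkCmdPipelineBarrier makes them execute in the appropriate order.
(Command buffer: vkCmdDispatch, vkCmdPipelineBarrier, vkCmdDispatch.)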

Slide 76

Slide 76 text

void vkCmdPipelineBarrier(
    VkCommandBuffer commandBuffer,
    VkPipelineStageFlags srcStageMask,
    VkPipelineStageFlags dstStageMask,
    VkDependencyFlags dependencyFlags,
    uint32_t memoryBarrierCount,
    const VkMemoryBarrier* pMemoryBarriers,
    uint32_t bufferMemoryBarrierCount,
    const VkBufferMemoryBarrier* pBufferMemoryBarriers,
    uint32_t imageMemoryBarrierCount,
    const VkImageMemoryBarrier* pImageMemoryBarriers );

typedef struct VkBufferMemoryBarrier {
    VkStructureType sType;
    const void* pNext;
    VkAccessFlags srcAccessMask;
    VkAccessFlags dstAccessMask;
    uint32_t srcQueueFamilyIndex;
    uint32_t dstQueueFamilyIndex;
    VkBuffer buffer;                            // this buffer
    VkDeviceSize offset;
    VkDeviceSize size;
} VkBufferMemoryBarrier;

Slide 77

Slide 77 text

The buffer member of VkBufferMemoryBarrier names the buffer the barrier applies to.
Meaning of the barrier: until the commands that touched this buffer before the barrier have completed, commands that touch this buffer after the barrier must not start.

Slide 78

Slide 78 text

{
  // Zero-clear the staging memory.
  auto mapped = staging_buffer->map< float >();
  std::fill( mapped.begin(), mapped.end(), 0.f );
}
{
  auto rec = command_buffer->begin();
  // Send it to the GPU.
  rec.copy( staging_buffer, device_local_buffer );
  // Wait for the copy to complete before going on.
  rec.barrier(
    vk::AccessFlagBits::eTransferWrite,
    vk::AccessFlagBits::eShaderRead,
    vk::PipelineStageFlagBits::eTransfer,
    vk::PipelineStageFlagBits::eComputeShader,
    vk::DependencyFlagBits( 0 ),
    { device_local_buffer },
    {}
  );
  rec.bind_descriptor_set(
    vk::PipelineBindPoint::eCompute,
    pipeline_layout,
    descriptor_set
  );

Slide 79

Slide 79 text

  // Specify the descriptor set.
  rec.bind_descriptor_set(
    vk::PipelineBindPoint::eCompute,
    pipeline_layout,
    descriptor_set
  );
  // Specify the pipeline.
  rec.bind_pipeline(
    vk::PipelineBindPoint::eCompute,
    pipeline
  );
  // Execute.
  rec->dispatch( 4, 2, 1 );
  // Wait for the execution to complete before copying the result back.
  rec.barrier(
    vk::AccessFlagBits::eShaderWrite,
    vk::AccessFlagBits::eTransferRead,
    vk::PipelineStageFlagBits::eComputeShader,
    vk::PipelineStageFlagBits::eTransfer,
    vk::DependencyFlagBits( 0 ),
    { device_local_buffer },
    {}
  );
  rec.copy( device_local_buffer, staging_buffer );
}

Slide 80

Slide 80 text

  rec.copy( device_local_buffer, staging_buffer );
}
// Push everything recorded so far into the queue.
command_buffer->execute( gct::submit_info_t() );
// Wait for the commands to complete.
command_buffer->wait_for_executed();
std::vector< float > host;
host.reserve( 1024 );
{
  // Copy the data that came back from the GPU to the CPU side.
  auto mapped = staging_buffer->map< float >();
  std::copy( mapped.begin(), mapped.end(), std::back_inserter( host ) );
}
unsigned int count;
// Dump it as JSON.
nlohmann::json json = host;
std::cout << json.dump( 2 ) << std::endl;

Slide 81

Slide 81 text

$ ./src/compute
[ 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, ... 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0 ]
Every element has been incremented.

Slide 82

Slide 82 text

Graphics Processing Unit
It is often forgotten, but the G in GPU is the G of Graphics.

Slide 83

Slide 83 text

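vkBindBufferMemory (VkDeviceMemory to VkBuffer): "the contents of this memory are general-purpose computation data".
vkBindImageMemory (VkDeviceMemory to VkImage): "the contents of this memory are an image".
A VkImage states explicitly that the data placed in memory is an image.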

Slide 84

Slide 84 text

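vkBindBufferMemory (VkDeviceMemory to VkBuffer): the data is placed on the GPU in exactly the order in which the CPU sent it.
vkBindImageMemory (VkDeviceMemory to VkImage): the data is converted to the arrangement best suited to the image's usage and then placed on the GPU.
When you specify the usage of a VkImage, Vulkan lays the pixels out in memory in an order suited to that usage.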

Slide 85

Slide 85 text

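For example, when an image is used as a texture:
To determine the color at position p, the pixels that must be read depend on the filter. With nearest-neighbor sampling, only the pixel closest to p is read. With linear interpolation, the pixels surrounding p are read as well. With cubic interpolation, a still wider ring of pixels around those is read too.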
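To make the access pattern concrete, here is a minimal sketch of linear (bilinear) filtering on the CPU. It is not how the GPU's texture unit is implemented; `sample_bilinear` is a hypothetical helper for a single-channel image stored row by row, with texel centers assumed to sit at integer coordinates and coordinates clamped to the image.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Blends the four texels around (x, y). Note that two of the four reads
// come from the next row of the image.
float sample_bilinear(const std::vector<float>& img, int w, int h,
                      float x, float y) {
  assert(w > 0 && h > 0);
  assert(img.size() == static_cast<std::size_t>(w) * static_cast<std::size_t>(h));
  x = std::min(std::max(x, 0.0f), static_cast<float>(w - 1));
  y = std::min(std::max(y, 0.0f), static_cast<float>(h - 1));
  const int x0 = static_cast<int>(std::floor(x));
  const int y0 = static_cast<int>(std::floor(y));
  const int x1 = std::min(x0 + 1, w - 1);
  const int y1 = std::min(y0 + 1, h - 1);
  const float fx = x - static_cast<float>(x0);
  const float fy = y - static_cast<float>(y0);
  auto texel = [&](int tx, int ty) {
    return img[static_cast<std::size_t>(ty) * static_cast<std::size_t>(w) +
               static_cast<std::size_t>(tx)];
  };
  const float top = texel(x0, y0) + (texel(x1, y0) - texel(x0, y0)) * fx;
  const float bottom = texel(x0, y1) + (texel(x1, y1) - texel(x0, y1)) * fx;
  return top + (bottom - top) * fy;
}
```

Every sample touches two adjacent rows, which is why the distance in memory between vertically adjacent pixels matters for cache behavior.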

Slide 86

Slide 86 text

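For example, when an image is used as a texture:
If the image is placed in memory one row at a time along the x axis, then although only the values in a small range around the sample are needed, pixels that are adjacent in the y direction are stored far apart in memory.
The probability that the next pixel to be read is already in the cache goes down.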

Slide 87

Slide 87 text

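For example, when an image is used as a texture:
If instead the pixels of the image are arranged in memory in an order like the one in the diagram, then after reading the value of one pixel, a read of a neighboring pixel is more likely to find that pixel already in the cache.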
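One well-known ordering with this property is the Z-order (Morton) curve. Whether the slide's diagram shows exactly this curve is an assumption, and real GPUs use their own vendor-specific tilings; the sketch below only compares a row-major index with a Morton index. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Index of pixel (x, y) when the image is stored one row after another.
std::uint32_t row_major_index(std::uint32_t x, std::uint32_t y,
                              std::uint32_t width) {
  assert(x < width);
  return y * width + x;
}

// Spreads the low 16 bits of v so that there is a zero bit between each of them.
std::uint32_t spread_bits(std::uint32_t v) {
  v &= 0x0000FFFFu;
  v = (v | (v << 8)) & 0x00FF00FFu;
  v = (v | (v << 4)) & 0x0F0F0F0Fu;
  v = (v | (v << 2)) & 0x33333333u;
  v = (v | (v << 1)) & 0x55555555u;
  return v;
}

// Index of pixel (x, y) on a Z-order curve: the bits of x and y interleaved.
std::uint32_t morton_index(std::uint32_t x, std::uint32_t y) {
  assert(x <= 0xFFFFu && y <= 0xFFFFu);
  return spread_bits(x) | (spread_bits(y) << 1);
}
```

In a 1024-pixel-wide row-major image, the pixel directly below (0, 0) sits 1024 elements away; on the Z-order curve it sits at index 2, so both pixels are likely to share a cache line.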

Slide 88

Slide 88 text

The usage is specified when the VkImage is created.

VkResult vkCreateImage(
    VkDevice device,
    const VkImageCreateInfo* pCreateInfo,
    const VkAllocationCallbacks* pAllocator,
    VkImage* pImage );

typedef struct VkImageCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkImageCreateFlags flags;
    VkImageType imageType;
    VkFormat format;
    VkExtent3D extent;
    uint32_t mipLevels;
    uint32_t arrayLayers;
    VkSampleCountFlagBits samples;
    VkImageTiling tiling;
    VkImageUsageFlags usage;                    // the usage
    VkSharingMode sharingMode;
    uint32_t queueFamilyIndexCount;
    const uint32_t* pQueueFamilyIndices;
    VkImageLayout initialLayout;
} VkImageCreateInfo;

The usage is a set of bit flags, and several may be specified at once.
Example: VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT means the image is both the destination of vkCmdCopyImage and a target of texture sampling.

Slide 89

Slide 89 text

While placing a barrier, you can also change the layout of an image.

void vkCmdPipelineBarrier(
    VkCommandBuffer commandBuffer,
    VkPipelineStageFlags srcStageMask,
    VkPipelineStageFlags dstStageMask,
    VkDependencyFlags dependencyFlags,
    uint32_t memoryBarrierCount,
    const VkMemoryBarrier* pMemoryBarriers,
    uint32_t bufferMemoryBarrierCount,
    const VkBufferMemoryBarrier* pBufferMemoryBarriers,
    uint32_t imageMemoryBarrierCount,
    const VkImageMemoryBarrier* pImageMemoryBarriers );

typedef struct VkImageMemoryBarrier {
    VkStructureType sType;
    const void* pNext;
    VkAccessFlags srcAccessMask;
    VkAccessFlags dstAccessMask;
    VkImageLayout oldLayout;                    // from this layout
    VkImageLayout newLayout;                    // to this layout
    uint32_t srcQueueFamilyIndex;
    uint32_t dstQueueFamilyIndex;
    VkImage image;                              // change this image
    VkImageSubresourceRange subresourceRange;
} VkImageMemoryBarrier;

Slide 90

Slide 90 text

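Command buffer:
1. Generate the image. It is emitted in VK_IMAGE_LAYOUT_GENERAL, a layout suited to reading and writing by the GPU.
2. vkCmdPipelineBarrier. Converts the image to VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, a layout suited to transfer.
3. Copy to the CPU side. This step wants the image in VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL.
With tiling disabled, a single layer, no mipmaps, and a layout suited to transfer, the pixels are laid out row-major, in order, without padding. That layout is easy to read from the CPU.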

Slide 91

Slide 91 text

With a storage image, a compute pipeline can read and write an image.

#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
layout(local_size_x_id = 1, local_size_y_id = 2 ) in;
layout(std430, binding = 1) buffer layout1 {
  float output_data[];
};
layout(set = 0, binding = 0, rgba8) uniform writeonly image2D img;
void main() {
  ...
  imageStore( img, ivec2( pos.xy ), color );
}

color is written to wherever the pixel at position pos.xy is supposed to be stored.

Slide 92

Slide 92 text

A GPU carries various pieces of dedicated hardware for drawing 3D graphics efficiently.
PolyMorph (tessellator): splits a large triangle into several small triangles.

Slide 93

Slide 93 text

A GPU carries various pieces of dedicated hardware for drawing 3D graphics efficiently.
Rasterizer: determines which pixels a triangle defined by three vertices corresponds to.

Slide 94

Slide 94 text

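A GPU carries various pieces of dedicated hardware for drawing 3D graphics efficiently.
Raster Operators: aggregate the results of shading and determine the color to be recorded in the final image.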

Slide 95

Slide 95 text

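The 21st-century GPU = a large number of processors that perform arbitrary computation + memory that holds executable binaries and data + a mechanism that sends the framebuffer contents to the screen + hardware that makes up for the parts where the processors are inefficient.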

Slide 96

Slide 96 text

Input Assembly, Vertex Shader, Tessellation Control Shader, Tessellation, Tessellation Evaluation Shader, Geometry Shader, Rasterization, Fragment Shader, Color Blend
At several points in the procedure for drawing 3D graphics we want to use dedicated hardware. Four of the steps are marked "hardware" on the slide.

Slide 97

Slide 97 text

Input Assembly, Vertex Shader, Tessellation Control Shader, Tessellation, Tessellation Evaluation Shader, Geometry Shader, Rasterization, Fragment Shader, Color Blend. [Diagram: four stages are labeled "hardware"; each of the others carries a small program, drawn as "a × b + 3".] Bind SPIR-V to each of the remaining steps.

Slide 98

Slide 98 text

Graphics pipeline

Slide 99

Slide 99 text

Input Assembly Vertex Shader Tessellation Control Shader Tessellation Tessellation Evaluation Shader Geometry Shader Rasterization Fragment Shader Color Blend

Slide 100

Slide 100 text

Input Assembly, Vertex Shader, Tessellation Control Shader, Tessellation, Tessellation Evaluation Shader, Geometry Shader, Rasterization, Fragment Shader, Color Blend. Specify which settings need to be changeable dynamically at run time.

Slide 101

Slide 101 text

What is a render pass?

Slide 102

Slide 102 text

[Diagram: three graphics pipelines (Input Assembly, VS, TCS, Tessellation, TES, GS, Rasterization, FS, Color Blend) grouped together.] Render pass: a bundle of multiple graphics pipelines.

Slide 103

Slide 103 text

[Diagram: graphics pipeline → VkImage → graphics pipeline → VkImage.] Multipass rendering: the result of the first rendering stage is used as the input of the second rendering stage.

Slide 104

Slide 104 text

[Diagram: one graphics pipeline writes the G-buffer, a set of VkImages holding position, normal, depth, and material; three further graphics pipelines, one per light, each read the G-buffer and write a VkImage.]

Slide 105

Slide 105 text

[Diagram: the VkImages produced by the three per-light pipelines are summed (∑) by a final graphics pipeline into one VkImage, the rendering result.]

Slide 106

Slide 106 text

[Diagram: the same G-buffer, per-light pipelines, and summation as on the previous slides.] This scales better than computing all the lights one after another in the first pipeline.

Slide 107

Slide 107 text

[Diagram: G-buffer (position, normal, depth, material) feeding the per-light pipelines.] Pixels that did not survive into the G-buffer (= hidden behind something else) never appear in the later computations.

Slide 108

Slide 108 text

[Diagram: G-buffer (position, normal, depth, material) and per-light pipelines, plus an extra pipeline that renders from the position of light 1 into a depth VkImage.] If a point does not show up in the rendering from light 1's position, the light from light 1 does not reach it.

Slide 109

Slide 109 text

[Diagram: rendering result (VkImage) → graphics pipeline → post-processed rendering result (VkImage).] Apply image processing to the rendering result: depth-of-field effects, tone mapping, and so on.

Slide 110

Slide 110 text

Command buffer: run a pipeline → vkCmdPipelineBarrier → run a pipeline. Couldn't we just use barriers to create dependencies between the executions of multiple graphics pipelines? That works too. However, this approach performs poorly on mobile GPUs.

Slide 111

Slide 111 text

Mobile GPU. [Diagram: the CPU and GPU share main memory over a narrow link.]

Slide 112

Slide 112 text

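[Diagram: CPU and GPU connected to main memory by a narrow link; the GPU connected to on-chip SRAM by a wide link.] The SRAM is too small to hold a full screen's worth of rendering results, so the rendering result (VkImage) has no choice but to live in main memory.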

Slide 113

Slide 113 text

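[Diagram: CPU, GPU, narrow link to main memory, wide link to SRAM.] Render only one part of the screen (a tile) in SRAM at a time. Tiles are rendered in order and the results are written out.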

Slide 114

Slide 114 text

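[Diagram: CPU, GPU, narrow link to main memory, wide link to SRAM; pass 1, barrier, pass 2.] Multipass using barriers: the first pass is written out to main memory for the whole screen, and only then does the second pass start computing by reading main memory back.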

Slide 115

Slide 115 text

[Diagram: a render pass containing three graphics pipelines A, B, and C.] Multiple pipelines inside a render pass can have dependencies between their inputs and outputs. However, when B or C computes the pixel at (x, y), the only value guaranteed to be readable is A's value at position (x, y).

Slide 116

Slide 116 text

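[Diagram: CPU, GPU, narrow link to main memory, wide link to SRAM; passes 1 and 2 both run in SRAM.] With a render pass: the processing of multiple pipelines for one tile runs in one go, and main memory is written only once, at the end.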
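The difference between the two approaches can be put into numbers with a very crude traffic model. This is only a sketch under stated assumptions: it counts one main-memory access per pixel per pass, ignores caches and compression, assumes at least one pass, and the names `Traffic`, `barrier_multipass`, and `render_pass_tiled` are made up for this example.

```cpp
#include <cstdint>

struct Traffic {
  std::uint64_t main_memory_writes;  // pixels written to main memory
  std::uint64_t main_memory_reads;   // pixels read back from main memory
};

// Barrier-based multipass: every pass writes the whole screen to main memory,
// and every pass after the first reads the previous result back.
// Assumption: passes >= 1.
Traffic barrier_multipass(std::uint64_t pixels, std::uint32_t passes) {
  return {pixels * passes, pixels * (passes - 1)};
}

// Render pass on a tiled GPU: intermediate results stay in on-chip SRAM,
// so each pixel reaches main memory once, after the last pass.
Traffic render_pass_tiled(std::uint64_t pixels, std::uint32_t /*passes*/) {
  return {pixels, 0};
}
```

For a 1920x1080 target and two passes, the barrier version moves three screens' worth of pixels over the narrow link, while the render pass version moves one.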

Slide 117

Slide 117 text

[Diagram: render passes separated by two barriers.]

Slide 118

Slide 118 text

Command buffer: execute render pass 1 → vkCmdPipelineBarrier (barrier) → execute render pass 2 → vkCmdPipelineBarrier (barrier) → execute render pass 3. Execution happens per render pass, not per pipeline.

Slide 119

Slide 119 text

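A 21st-century GPU = processors that perform arbitrary computation + memory that holds executable binaries and data + a mechanism that sends the framebuffer contents to the screen. We want to put the rendering result on the screen.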

Slide 120

Slide 120 text

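[Diagram: a Vulkan application and a compositor (X Window System, Wayland compositor, Windows DWM, etc.); the compositor owns the memory labeled "write here and it shows up".] In most cases the memory used for writing the picture sent to the screen is owned by the compositor.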

Slide 121

Slide 121 text

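[Diagram: Vulkan application and compositor (X Window System, Wayland compositor, Windows DWM, etc.).] The application obtains from the compositor a surface, the destination to which it hands its drawing. Application: "Please give me somewhere to write my drawing." Compositor: "Hand your drawing over here." (the surface)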

Slide 122

Slide 122 text

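The application obtains from the compositor a surface, the destination to which it hands its drawing, in the form of a platform-specific handle. Windows: HWND. X11: xcb_window_t*. Wayland: wl_surface*. Android: ANativeWindow*. Fuchsia: zx_handle_t. iOS: CAMetalLayer*. GGP: GgpStreamDescriptor. Nintendo Switch: void*.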

Slide 123

Slide 123 text

Each platform handle is converted into a VkSurfaceKHR by a platform-specific function. HWND: vkCreateWin32SurfaceKHR. xcb_window_t*: vkCreateXcbSurfaceKHR. wl_surface*: vkCreateWaylandSurfaceKHR. ANativeWindow*: vkCreateAndroidSurfaceKHR. zx_handle_t: vkCreateImagePipeSurfaceFUCHSIA. CAMetalLayer*: vkCreateIOSSurfaceMVK. GgpStreamDescriptor: vkCreateStreamDescriptorSurfaceGGP. void*: vkCreateViSurfaceNN.

Slide 124

Slide 124 text

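[Diagram: the Vulkan application is writing to the same memory the compositor is reading.] If the application writes directly into memory the compositor is reading, half-drawn content ends up on the screen.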

Slide 125

Slide 125 text

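[Diagram: the Vulkan application writes into one image while the compositor reads another.] Swapchain: once writing is finished, switch images; once the switch has happened, reclaim the old image.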
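The write, switch, reclaim cycle can be modeled without any Vulkan calls. The class below is a toy sketch, not the Vulkan API: `ToySwapchain`, `acquire`, and `present` are hypothetical names, images are represented by plain indices, and it assumes C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <deque>
#include <optional>

// Toy model of a swapchain: image indices move between a free list
// (writable by the application) and the single image being displayed.
class ToySwapchain {
 public:
  explicit ToySwapchain(std::uint32_t image_count) {
    for (std::uint32_t i = 0; i < image_count; ++i) free_.push_back(i);
  }

  // Hand out an image the compositor is not reading, if there is one.
  std::optional<std::uint32_t> acquire() {
    if (free_.empty()) return std::nullopt;
    std::uint32_t index = free_.front();
    free_.pop_front();
    return index;
  }

  // Switch the displayed image; the previously displayed one is reclaimed.
  void present(std::uint32_t index) {
    if (displayed_) free_.push_back(*displayed_);
    displayed_ = index;
  }

  std::optional<std::uint32_t> displayed() const { return displayed_; }

 private:
  std::deque<std::uint32_t> free_;
  std::optional<std::uint32_t> displayed_;
};
```

With two images, the application can always write into the image that is not on screen, which is exactly what prevents half-drawn frames from being shown.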

Slide 126

Slide 126 text

VkResult vkCreateSwapchainKHR( VkDevice device, const VkSwapchainCreateInfoKHR* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSwapchainKHR* pSwapchain ); typedef struct VkSwapchainCreateInfoKHR { VkStructureType sType; const void* pNext; VkSwapchainCreateFlagsKHR flags; VkSurfaceKHR surface; uint32_t minImageCount; VkFormat imageFormat; VkColorSpaceKHR imageColorSpace; VkExtent2D imageExtent; uint32_t imageArrayLayers; VkImageUsageFlags imageUsage; VkSharingMode imageSharingMode; uint32_t queueFamilyIndexCount; const uint32_t* pQueueFamilyIndices; VkSurfaceTransformFlagBitsKHR preTransform; VkCompositeAlphaFlagBitsKHR compositeAlpha; VkPresentModeKHR presentMode; VkBool32 clipped; VkSwapchainKHR oldSwapchain; } VkSwapchainCreateInfoKHR; Annotations: "give me this many (minImageCount) images for handing over to this surface (surface)."

Slide 127

Slide 127 text

Swapchain: a bundle of images (VkImage) that already have memory (VkDeviceMemory) allocated. "This image can only take this layout." "This memory is shared with the compositor's process." The layouts are restricted to suit the compositor.

Slide 128

Slide 128 text

[Diagram: a graphics pipeline writing into one of the swapchain's VkImages.] Render into a swapchain image with the graphics pipeline.

Slide 129

Slide 129 text

Framebuffer. [Diagram: a graphics pipeline writing into a swapchain image and a second VkImage.] The graphics pipeline outputs color, depth, and stencil. Prepare an image to receive depth and stencil yourself, and join it with the swapchain image to form a framebuffer.

Slide 130

Slide 130 text

Framebuffer. VkResult vkCreateFramebuffer( VkDevice device, const VkFramebufferCreateInfo* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkFramebuffer* pFramebuffer ); typedef struct VkFramebufferCreateInfo { VkStructureType sType; const void* pNext; VkFramebufferCreateFlags flags; VkRenderPass renderPass; uint32_t attachmentCount; const VkImageView* pAttachments; uint32_t width; uint32_t height; uint32_t layers; } VkFramebufferCreateInfo; Annotation: pAttachments = the array of views of the images to use.

Slide 131

Slide 131 text

VkResult vkQueuePresentKHR( VkQueue queue, const VkPresentInfoKHR* pPresentInfo ); typedef struct VkPresentInfoKHR { VkStructureType sType; const void* pNext; uint32_t waitSemaphoreCount; const VkSemaphore* pWaitSemaphores; uint32_t swapchainCount; const VkSwapchainKHR* pSwapchains; const uint32_t* pImageIndices; VkResult* pResults; } VkPresentInfoKHR; Annotations: "once drawing is done (pWaitSemaphores), send this image (pImageIndices) of this swapchain (pSwapchains) to the compositor."

Slide 132

Slide 132 text

Swapchain. VkResult vkAcquireNextImageKHR( VkDevice device, VkSwapchainKHR swapchain, uint64_t timeout, VkSemaphore semaphore, VkFence fence, uint32_t* pImageIndex ); Writes to a swapchain image must wait until the compositor side has finished with it. "Can I write yet?"

Slide 133

Slide 133 text

VkResult vkAcquireNextImageKHR( VkDevice device, VkSwapchainKHR swapchain, uint64_t timeout, VkSemaphore semaphore, VkFence fence, uint32_t* pImageIndex ); VkResult vkCreateSemaphore( VkDevice device, const VkSemaphoreCreateInfo* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkSemaphore* pSemaphore ); typedef struct VkSubmitInfo { VkStructureType sType; const void* pNext; uint32_t waitSemaphoreCount; const VkSemaphore* pWaitSemaphores; const VkPipelineStageFlags* pWaitDstStageMask; uint32_t commandBufferCount; const VkCommandBuffer* pCommandBuffers; uint32_t signalSemaphoreCount; const VkSemaphore* pSignalSemaphores; } VkSubmitInfo; Annotations: vkAcquireNextImageKHR: "when the image is ready, signal this semaphore." VkSubmitInfo::pWaitSemaphores: "the commands submitted from now on must wait for the semaphore to be signaled before executing." Synchronization outside a queue or between queues uses semaphores, not barriers.

Slide 135

Slide 135 text

No content

Slide 136

Slide 136 text

Modern Vulkan NAOMASA MATSUBAYASHI Twitter: @fadis_

Slide 137

Slide 137 text

Vulkan 1.1

Slide 138

Slide 138 text

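16-bit storage. [Diagram: host data is copied into buffer A, which the shader sees at its binding.] #version 450 #extension GL_EXT_shader_16bit_storage : require layout(std430, binding = 1) buffer layout1 { uint16_t output_data[]; }; ... std::vector< std::uint16_t > data; Write 16-bit integers into the buffer and read them from the shader as 16-bit integers. The computation itself is done with 32-bit integers.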

Slide 139

Slide 139 text

16-bit storage. typedef struct VkPhysicalDevice16BitStorageFeatures { VkStructureType sType; void* pNext; VkBool32 storageBuffer16BitAccess; VkBool32 uniformAndStorageBuffer16BitAccess; VkBool32 storagePushConstant16; VkBool32 storageInputOutput16; } VkPhysicalDevice16BitStorageFeatures; A GPU may not be able to perform 16-bit loads and stores. Querying the newly added VkPhysicalDevice16BitStorageFeatures tells you whether the GPU can do 16-bit loads and stores in each situation.

Slide 140

Slide 140 text

16-bit storage. If 16-bit loads and stores are supported, half-precision floating-point values can be loaded and stored as well: #version 450 #extension GL_EXT_shader_16bit_storage : require layout(std430, binding = 1) buffer layout1 { float16_t output_data[]; }; ... Vector types are fine too: #version 450 #extension GL_EXT_shader_16bit_storage : require layout(std430, binding = 1) buffer layout1 { f16vec4 output_data[]; }; ...

Slide 141

Slide 141 text

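Subgroup Operation. GPU processors have SIMD instructions that process 32 to 64 values at once. Vulkan counts this as 32 threads: 32 threads of a function that operates on a single value are mapped onto the execution of one SIMD instruction. These 32 threads are called a subgroup.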

Slide 142

Slide 142 text

Subgroup Operation: vertical addition. [Diagram: the lanes of a and b are added element by element.] A plain a+b does this.

Slide 143

Slide 143 text

Subgroup Operation: horizontal addition. subgroupAdd(a) gives every lane the sum of a over all lanes n of the subgroup, ∑_n a_n.

Slide 144

Slide 144 text

Subgroup Operation: horizontal addition. subgroupInclusiveAdd(a) gives lane i the running sum a_0 + ... + a_i.

Slide 145

Slide 145 text

Subgroup Operation: horizontal addition. subgroupExclusiveAdd(a) gives lane i the running sum a_0 + ... + a_(i-1), which excludes the lane's own value.

Slide 146

Slide 146 text

Subgroup Operation: horizontal addition. subgroupClusteredAdd(a,2) adds the lanes two at a time.

Slide 147

Slide 147 text

Subgroup Operation: shuffle. subgroupShuffle(a,b) rearranges the values of a in the order given by b.

Slide 148

Slide 148 text

Subgroup Operation: broadcast. After subgroupBroadcast(a,0), every lane holds a_0.

Slide 149

Slide 149 text

Subgroup Operation: broadcast. subgroupQuadBroadcast(a) broadcasts within groups of four lanes.
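The semantics of these horizontal operations can be checked on the CPU. The following is only a reference sketch, assuming a subgroup is modeled as a `std::vector<int>` with one element per lane and that every lane is active; the function names are hypothetical C++ stand-ins for the GLSL built-ins, not part of any Vulkan or GLSL API.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

using Lanes = std::vector<int>;  // one value per invocation in the subgroup

// subgroupAdd: every lane receives the sum over all lanes.
Lanes subgroup_add(const Lanes& a) {
  int sum = std::accumulate(a.begin(), a.end(), 0);
  return Lanes(a.size(), sum);
}

// subgroupInclusiveAdd: lane i receives a[0] + ... + a[i].
Lanes subgroup_inclusive_add(const Lanes& a) {
  Lanes out(a.size());
  std::partial_sum(a.begin(), a.end(), out.begin());
  return out;
}

// subgroupExclusiveAdd: lane i receives a[0] + ... + a[i-1]; lane 0 gets 0.
Lanes subgroup_exclusive_add(const Lanes& a) {
  Lanes out(a.size());
  int running = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = running;
    running += a[i];
  }
  return out;
}

// subgroupClusteredAdd: sums inside consecutive clusters of cluster_size lanes.
// Assumption: cluster_size >= 1.
Lanes subgroup_clustered_add(const Lanes& a, std::size_t cluster_size) {
  Lanes out(a.size());
  for (std::size_t base = 0; base < a.size(); base += cluster_size) {
    std::size_t end = std::min(base + cluster_size, a.size());
    int sum = std::accumulate(a.begin() + base, a.begin() + end, 0);
    std::fill(out.begin() + base, out.begin() + end, sum);
  }
  return out;
}

// subgroupShuffle: lane i reads the value held by lane id[i].
Lanes subgroup_shuffle(const Lanes& a, const std::vector<std::size_t>& id) {
  Lanes out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[id[i]];
  return out;
}

// subgroupBroadcast: every lane receives the value held by lane id.
Lanes subgroup_broadcast(const Lanes& a, std::size_t id) {
  return Lanes(a.size(), a[id]);
}
```

On a GPU each of these is a single operation across the SIMD lanes; the loops here exist only to spell out what each lane ends up holding.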

Slide 150

Slide 150 text

Subgroup Operation. struct VkPhysicalDeviceSubgroupProperties { VkStructureType sType; void* pNext; uint32_t subgroupSize; VkShaderStageFlags supportedStages; VkSubgroupFeatureFlags supportedOperations; VkBool32 quadOperationsInAllStages; }; Applications now have to be aware of the subgroup size, so it is made queryable (subgroupSize).

Slide 151

Slide 151 text

Subgroup Operation. struct VkPhysicalDeviceSubgroupProperties { VkStructureType sType; void* pNext; uint32_t subgroupSize; VkShaderStageFlags supportedStages; VkSubgroupFeatureFlags supportedOperations; VkBool32 quadOperationsInAllStages; }; Some GPUs may not support every horizontal operation, so which ones are usable is made queryable (supportedOperations).

Slide 152

Slide 152 text

This physical device + this API version (Vulkan 1.0 with the VK_KHR_SWAPCHAIN_EXTENSION_NAME extension) = VkDevice (logical device)

Slide 153

Slide 153 text

This is already possible in Vulkan 1.0: one GPU + Vulkan 1.0 with the VK_KHR_SWAPCHAIN_EXTENSION_NAME extension = VkDevice (logical device), and another GPU + Vulkan 1.0 with the VK_KHR_SWAPCHAIN_EXTENSION_NAME extension = a separate VkDevice (logical device).

Slide 154

Slide 154 text

Device Group. Two GPUs (a DeviceGroup) + Vulkan 1.1 = one VkDevice (logical device). A single logical device is created from multiple GPUs connected by NVLink or similar.

Slide 155

Slide 155 text

Device Group. [Diagram: a command buffer whose commands go to both GPUs of the DeviceGroup.] Commands submitted to the queue are executed on every GPU in the DeviceGroup.

Slide 156

Slide 156 text

Device Group. [Diagram: a command buffer marked "execute on the first GPU only".] The GPUs that execute the commands can be restricted per command buffer.

Slide 157

Slide 157 text

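Device Group. [Diagram: one command buffer executes on the first GPU only, one on the second GPU only, and a command buffer containing a barrier executes on both.] There are multiple GPUs, but the queue is the same, so barriers can be used to synchronize them.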

Slide 158

Slide 158 text

In VR, the distortion introduced by the headset's lenses is cancelled out on the rendering side.

Slide 159

Slide 159 text

Displayed large = resolution is needed. Displayed small = raising the resolution is wasted.

Slide 160

Slide 160 text

So let's draw just the edges small from the start.

Slide 161

Slide 161 text

Multiview. [Diagram: a render pass containing nine graphics pipelines, followed by a barrier and one more graphics pipeline that performs the warp.] A draw request for the same vertex array is fed to multiple pipelines of the render pass at once.

Slide 162

Slide 162 text

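Protected Memory. [Diagram: memory split into Unprotected and Protected regions.] Data created inside protected memory cannot be taken out of the GPU. Apparently the goal is to prevent copy-protected images and video from being read out of GPU memory.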

Slide 163

Slide 163 text

Vulkan 1.2

Slide 164

Slide 164 text

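8-bit storage. [Diagram: host data is copied into buffer A, which the shader sees at its binding.] #version 450 #extension GL_EXT_shader_8bit_storage : require layout(std430, binding = 1) buffer layout1 { uint8_t output_data[]; }; ... std::vector< std::uint8_t > data; Write 8-bit integers into the buffer and read them from the shader as 8-bit integers. As with 16-bit, vectors of 8-bit integers (e.g. u8vec4) are fine too.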

Slide 165

Slide 165 text

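8-bit storage. Why add support for short integers? In neural networks, the number of weights affects performance far more than the precision of each individual weight. If you have enough memory for one float weight, you are better off storing four uint8_t weights in it.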
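To make the trade-off concrete, here is a small sketch of linear 8-bit quantization. It is an illustration only: the scheme (mapping a known range [lo, hi] onto 0..255), the function names, and the range used in the usage note are assumptions, not something the slides prescribe. It needs C++17 for `std::clamp` and assumes hi > lo.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Map a weight in [lo, hi] to an 8-bit code (linear quantization).
// Assumption: hi > lo; values outside the range are clamped.
std::uint8_t quantize(float w, float lo, float hi) {
  float t = (std::clamp(w, lo, hi) - lo) / (hi - lo);
  return static_cast<std::uint8_t>(std::lround(t * 255.0f));
}

// Recover an approximate weight from its 8-bit code.
float dequantize(std::uint8_t q, float lo, float hi) {
  return lo + (hi - lo) * (static_cast<float>(q) / 255.0f);
}

// Number of weights of type T that fit into a given number of bytes.
template <typename T>
std::size_t weights_that_fit(std::size_t bytes) {
  return bytes / sizeof(T);
}
```

On platforms where `float` is 4 bytes, `weights_that_fit<std::uint8_t>(n)` is four times `weights_that_fit<float>(n)`, at the cost of a per-weight error of about one quantization step.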

Slide 166

Slide 166 text

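Buffer device address. [Diagram: a VkBuffer backed by VkDeviceMemory at address 0x8000000.] Obtain the starting address, inside the GPU, of a buffer that lives in GPU memory. Use 1: put addresses into debug information.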

Slide 167

Slide 167 text

Buffer device address. #version 450 ... #extension GL_EXT_buffer_reference : enable layout(buffer_reference) buffer node_t; layout(buffer_reference, std430, buffer_reference_align = 16) buffer node_t { int value; node_t next; }; layout(std430) buffer uniforms_t { node_t root; } uniforms; void main() { node_t node = uniforms.root; node = node.next.next; ... } Use 2: write the address of another buffer into a buffer's data. This makes it possible to build a linked list that can be traversed on the GPU. The addresses are read using GLSL's buffer_reference extension.
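The idea of storing addresses inside buffer data can be mimicked on the CPU. The sketch below is a toy model under stated assumptions: `ToyDeviceMemory` stands in for GPU memory, addresses are a made-up base plus a byte offset, address 0 marks the end of the list, and `Node` merely mirrors the shape of the GLSL `node_t` above. None of these names come from Vulkan.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Mirrors the GLSL node_t: a value plus the address of the next node.
struct Node {
  std::int32_t value;
  std::uint64_t next;  // address of the next node; 0 marks the end (assumption)
};

// Toy "device memory": an address is base + byte offset into the vector.
struct ToyDeviceMemory {
  std::uint64_t base;
  std::vector<std::uint8_t> bytes;

  Node load(std::uint64_t address) const {
    Node n{};
    std::memcpy(&n, bytes.data() + (address - base), sizeof(Node));
    return n;
  }

  void store(std::uint64_t address, const Node& n) {
    std::memcpy(bytes.data() + (address - base), &n, sizeof(Node));
  }
};

// Follow the next addresses starting at root and sum the values.
std::int64_t sum_list(const ToyDeviceMemory& mem, std::uint64_t root) {
  std::int64_t sum = 0;
  for (std::uint64_t a = root; a != 0; a = mem.load(a).next) {
    sum += mem.load(a).value;
  }
  return sum;
}
```

Because the nodes refer to each other by address rather than by descriptor binding, the list can span any number of nodes without the shader interface changing.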

Slide 168

Slide 168 text

#version 450 ... layout(binding = 1) uniform sampler2D tex1; layout(binding = 2) uniform sampler2D tex2; layout(binding = 3) uniform sampler2D tex3; layout(binding = 4) uniform sampler2D tex4; layout(binding = 5) uniform sampler2D tex5; layout(binding = 6) uniform sampler2D tex6; layout(binding = 7) uniform sampler2D tex7; layout(binding = 8) uniform sampler2D tex8; layout(binding = 9) uniform sampler2D tex9; layout(binding = 10) uniform sampler2D tex10; layout(binding = 11) uniform sampler2D tex11; layout(binding = 12) uniform sampler2D tex12; layout(binding = 13) uniform sampler2D tex13; layout(binding = 14) uniform sampler2D tex14; layout(binding = 15) uniform sampler2D tex15; layout(binding = 16) uniform sampler2D tex16; ... void main() { vec4 value = texture( tex5, tex_coord ); } As the number of resources passed to a shader grows, the code becomes painful.

Slide 169

Slide 169 text

Descriptor Indexing. #version 450 ... layout(binding = 1) uniform sampler2D tex[]; ... void main() { vec4 value = texture( tex[ 4 ], tex_coord ); } Make it possible to create arrays of descriptors.

Slide 170

Slide 170 text

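Descriptor Indexing. #version 450 ... layout(binding = 1) uniform sampler2D tex[]; ... void main() { vec4 value = texture( tex[ 4 ], tex_coord ); } Make it possible to create arrays of descriptors. Relaxed descriptor set requirements: descriptors the shader never touches do not have to be bound to actual resources, and descriptors that are not currently in use may be updated even while a command buffer is being recorded.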

Slide 171

Slide 171 text

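Descriptor Indexing. void main() { vec4 value = texture( tex[ 4 ], tex_coord ); } Make it possible to create arrays of descriptors. Relaxed descriptor set requirements: descriptors the shader never touches do not have to be bound to actual resources, and descriptors that are not currently in use may be updated even while a command buffer is being recorded. This enables a workflow in which you create one huge descriptor set up front and set resources into whichever elements you need, when you need them.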

Slide 172

Slide 172 text

Framebuffer. VkResult vkCreateFramebuffer( VkDevice device, const VkFramebufferCreateInfo* pCreateInfo, const VkAllocationCallbacks* pAllocator, VkFramebuffer* pFramebuffer ); typedef struct VkFramebufferCreateInfo { VkStructureType sType; const void* pNext; VkFramebufferCreateFlags flags; VkRenderPass renderPass; uint32_t attachmentCount; const VkImageView* pAttachments; uint32_t width; uint32_t height; uint32_t layers; } VkFramebufferCreateInfo; Annotation: pAttachments = the array of views of the images to use. The images are needed before the framebuffer can be created.

Slide 173

Slide 173 text

Imageless framebuffer. In VkFramebufferCreateInfo, set flags to VK_FRAMEBUFFER_CREATE_IMAGELESS_BIT_KHR (meaning: "later") and pAttachments to NULL. typedef struct VkFramebufferAttachmentsCreateInfo { VkStructureType sType; const void* pNext; uint32_t attachmentImageInfoCount; const VkFramebufferAttachmentImageInfo* pAttachmentImageInfos; } VkFramebufferAttachmentsCreateInfo; typedef struct VkFramebufferAttachmentImageInfo { VkStructureType sType; const void* pNext; VkImageCreateFlags flags; VkImageUsageFlags usage; uint32_t width; uint32_t height; uint32_t layerCount; uint32_t viewFormatCount; const VkFormat* pViewFormats; } VkFramebufferAttachmentImageInfo; (meaning: "image views like this will be attached")

Slide 174

Slide 174 text

Imageless framebuffer. typedef struct VkFramebufferAttachmentImageInfo { VkStructureType sType; const void* pNext; VkImageCreateFlags flags; VkImageUsageFlags usage; uint32_t width; uint32_t height; uint32_t layerCount; uint32_t viewFormatCount; const VkFormat* pViewFormats; } VkFramebufferAttachmentImageInfo; (meaning: "image views like this will be attached") typedef struct VkRenderPassAttachmentBeginInfo { VkStructureType sType; const void* pNext; uint32_t attachmentCount; const VkImageView* pAttachments; } VkRenderPassAttachmentBeginInfo; Annotation: pAttachments = the array of views of the images to use. Attach this structure when the render pass is begun (through the pNext chain of VkRenderPassBeginInfo) to decide which image views are actually used.

Slide 175

Slide 175 text

Framebuffer. [Diagram: one VkImage holds color, another holds depth and stencil.] In Vulkan, depth and stencil are recorded in the same image. Depth is commonly 24 bits and 8 bits are enough for stencil, so packing the two together into 32 bits fits nicely.
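The 24 + 8 = 32 arithmetic can be shown with a pair of pack and unpack helpers. This is a sketch only: placing depth in the low 24 bits and stencil in the high 8 bits is an assumption chosen for illustration, since the actual in-memory arrangement of a format such as VK_FORMAT_D24_UNORM_S8_UINT is up to the implementation.

```cpp
#include <cstdint>

// Pack a 24-bit depth value and an 8-bit stencil value into one 32-bit word.
// Assumption: depth occupies the low 24 bits, stencil the high 8 bits.
std::uint32_t pack_d24s8(std::uint32_t depth24, std::uint8_t stencil) {
  return (static_cast<std::uint32_t>(stencil) << 24) | (depth24 & 0x00FFFFFFu);
}

// Extract the 24-bit depth value.
std::uint32_t unpack_depth(std::uint32_t packed) {
  return packed & 0x00FFFFFFu;
}

// Extract the 8-bit stencil value.
std::uint8_t unpack_stencil(std::uint32_t packed) {
  return static_cast<std::uint8_t>(packed >> 24);
}
```

Both values share one word, which is compact but also means a reader of either value touches the same memory as the other.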

Slide 176

Slide 176 text

[Diagram: a VkImage holding depth and stencil, a barrier, and a graphics pipeline.] Packing them together creates a dependency on data that is not actually depended on. "I only need the depth, but since the two are stuck together I have no choice but to depend on both."

Slide 177

Slide 177 text

Separate Depth Stencil Layouts. Packing depth and stencil together creates a dependency on data that is not actually depended on. typedef struct VkAttachmentDescriptionStencilLayout { VkStructureType sType; void* pNext; VkImageLayout stencilInitialLayout; VkImageLayout stencilFinalLayout; } VkAttachmentDescriptionStencilLayout; This makes it possible to state explicitly that only one of the two parts of a depth-stencil image is depended on.

Slide 178

Slide 178 text

Atomic 64bit. #version 450 #extension GL_ARB_gpu_shader_int64 : enable #extension GL_EXT_shader_atomic_int64 : enable ... void main() { uint64_t result = atomicCompSwap( data, 0, 1 ); ... } This performs "if the value stored in data is 0, set it to 1" atomically. If the GPU supports it, 64-bit integer atomic operations like this one can be used in shaders.

Slide 179

Slide 179 text

#version 450 ... #extension GL_EXT_shader_16bit_storage : require layout(std430, binding = 1) buffer layout1 { f16vec4 input_buffer[]; }; layout(std430, binding = 2) buffer layout22 { f16vec4 output_buffer[]; }; ... void main() { vec4 value = input_buffer[ gl_GlobalInvocationID.x ]; output_buffer[ gl_GlobalInvocationID.x ] = value * 2.0; } Annotations: input_buffer is half precision, output_buffer is half precision, value is single precision. The 16-bit storage of Vulkan 1.1 meant keeping data in memory as 16 bits and computing in 32 bits.

Slide 180

Slide 180 text

Float16 Int8. #version 450 ... #extension GL_EXT_shader_16bit_storage : require layout(std430, binding = 1) buffer layout1 { f16vec4 input_buffer[]; }; layout(std430, binding = 2) buffer layout22 { f16vec4 output_buffer[]; }; ... void main() { f16vec4 value = input_buffer[ gl_GlobalInvocationID.x ]; output_buffer[ gl_GlobalInvocationID.x ] = value * 2.0; } Annotations: input_buffer, output_buffer, and value are all half precision. In Vulkan 1.2, if the device supports it, the computation can stay in half precision.

Slide 181

Slide 181 text

Float16 Int8. #version 450 ... #extension GL_EXT_shader_8bit_storage : require layout(std430, binding = 1) buffer layout1 { uint8_t input_buffer[]; }; layout(std430, binding = 2) buffer layout22 { uint8_t output_buffer[]; }; ... void main() { uint8_t value = input_buffer[ gl_GlobalInvocationID.x ]; output_buffer[ gl_GlobalInvocationID.x ] = value * 2; } Annotations: input_buffer, output_buffer, and value are all 8-bit integers. In Vulkan 1.2, if the device supports it, the computation can stay in 8-bit integers.

Slide 182

Slide 182 text

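[Diagram: five command buffers with a separate semaphore between each pair.] To synchronize with commands on another queue, you need as many semaphores as there are synchronization points: this one, and this one, and this one, and that one too.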

Slide 183

Slide 183 text

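Timeline Semaphore. [Diagram: five command buffers sharing one semaphore.] A single semaphore keeps counting up: each command buffer adds 1 to the semaphore when it finishes, and the next ones start when the semaphore reaches 1, 2, 3, and 4 respectively. This makes management easier when there are many synchronization points.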

Slide 184

Slide 184 text

Timeline Semaphore. [Diagram: three preceding command buffers each add 1 to the semaphore; a fourth starts when the semaphore reaches 2.] Once two of the three preceding command buffers have completed, the fourth may be submitted.
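The counting behaviour can be modeled in a few lines. The class below is a single-threaded toy, not the Vulkan timeline semaphore API: `ToyTimeline`, `submit_when`, and `signal` are hypothetical names, work items are plain callables, and each signal simply adds 1 to the counter as in the diagrams.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Toy single-threaded model of a timeline semaphore: a counter that only
// increases, plus work that waits for the counter to reach a given value.
class ToyTimeline {
 public:
  std::uint64_t value() const { return value_; }

  // Run task as soon as the counter is >= wait_value.
  void submit_when(std::uint64_t wait_value, std::function<void()> task) {
    if (value_ >= wait_value) {
      task();
      return;
    }
    waiting_.emplace(wait_value, std::move(task));
  }

  // Add 1 to the counter and release every task whose wait value is reached.
  void signal() {
    ++value_;
    while (!waiting_.empty() && waiting_.begin()->first <= value_) {
      std::function<void()> task = std::move(waiting_.begin()->second);
      waiting_.erase(waiting_.begin());
      task();
    }
  }

 private:
  std::uint64_t value_ = 0;
  std::multimap<std::uint64_t, std::function<void()>> waiting_;
};
```

A task submitted with a wait value of 2 starts after any two signals, which matches the "two out of three predecessors" case without needing a separate semaphore per dependency.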

Slide 185

Slide 185 text

Hot extensions that are not yet part of the standard

Slide 186

Slide 186 text

VK_KHR_video_queue. Use the hardware video encoder and decoder the GPU provides. [Diagram: a command buffer on a video-capable queue, a VkBuffer, and a row of VkImages.] "Decode the video stream stored in this buffer and write it out to this sequence of images."

Slide 188

Slide 188 text

Conventional interactive 3D graphics ignores indirect lighting

Slide 189

Slide 189 text

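To compute the indirect lighting at p, we have to discover that the line segment extending from position p in some direction v intersects another surface at position q. [Diagram: points p and q and the direction v.]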

Slide 190

Slide 190 text

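"Does the vertex array intersect line segment v?" There is no test more efficient than walking through the triangles of the vertex array one by one. "Decide that in real time." "Can't be done!"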

Slide 191

Slide 191 text

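Convert the vertex array into a tree structure ahead of time. "Does the vertex array intersect line segment v? Decide that in real time." "Can do." "Then keep up with deformations in real time." "Can't be done!" With the tree, the test becomes feasible, but...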

Slide 192

Slide 192 text

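RT Core, found in recent NVIDIA GPUs (shown next to the shared memory and L1 cache): dedicated hardware that builds a BVH (tree structure) from a vertex array at blazing speed and tests it for intersection with line segments at blazing speed.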

Slide 193

Slide 193 text

VK_KHR_acceleration_structure. VkAccelerationStructureKHR, backed by VkDeviceMemory like VkImage and VkBuffer: "This memory will be used as the place to store the tree structure the GPU generates for intersection tests. The concrete format is left to Vulkan." Vulkan calls this an acceleration structure.

Slide 194

Slide 194 text

VK_KHR_acceleration_structure void vkCmdBuildAccelerationStructuresKHR( VkCommandBuffer commandBuffer, uint32_t infoCount, const VkAccelerationStructureBuildGeometryInfoKHR* pInfos, const VkAccelerationStructureBuildRangeInfoKHR* const* ppBuildRangeInfos ); typedef struct VkAccelerationStructureBuildGeometryInfoKHR { VkStructureType sType; const void* pNext; VkAccelerationStructureTypeKHR type; VkBuildAccelerationStructureFlagsKHR flags; VkBuildAccelerationStructureModeKHR mode; VkAccelerationStructureKHR srcAccelerationStructure; VkAccelerationStructureKHR dstAccelerationStructure; uint32_t geometryCount; const VkAccelerationStructureGeometryKHR* pGeometries; const VkAccelerationStructureGeometryKHR* const* ppGeometries; VkDeviceOrHostAddressKHR scratchData; } VkAccelerationStructureBuildGeometryInfoKHR; Annotation: dstAccelerationStructure = "build into this one".

Slide 195

Slide 195 text

VK_KHR_acceleration_structure onStructureGeometryKHR* pGeometries; onStructureGeometryKHR* const* ppGeometries; essKHR scratchData; ctureBuildGeometryInfoKHR; typedef struct VkAccelerationStructureGeometryKHR { VkStructureType sType; const void* pNext; VkGeometryTypeKHR geometryType; VkAccelerationStructureGeometryDataKHR geometry; VkGeometryFlagsKHR flags; } VkAccelerationStructureGeometryKHR; typedef union VkAccelerationStructureGeometryDataKHR { VkAccelerationStructureGeometryTrianglesDataKHR triangles; VkAccelerationStructureGeometryAabbsDataKHR aabbs; VkAccelerationStructureGeometryInstancesDataKHR instances; } VkAccelerationStructureGeometryDataKHR;

Slide 196

Slide 196 text

VK_KHR_acceleration_structure typedef struct VkAccelerationStructureGeometryTrianglesDataKHR { VkStructureType sType; const void* pNext; VkFormat vertexFormat; VkDeviceOrHostAddressConstKHR vertexData; VkDeviceSize vertexStride; uint32_t maxVertex; VkIndexType indexType; VkDeviceOrHostAddressConstKHR indexData; VkDeviceOrHostAddressConstKHR transformData; } VkAccelerationStructureGeometryTrianglesDataKHR; Annotation: vertexData = "from the vertex array stored at this address". This queues a command that builds the tree structure from the vertex array.

Slide 197

Slide 197 text

VK_KHR_acceleration_structure typedef struct VkAccelerationStructureGeometryAabbsDataKHR { VkStructureType sType; const void* pNext; VkDeviceOrHostAddressConstKHR data; VkDeviceSize stride; } VkAccelerationStructureGeometryAabbsDataKHR; Annotation: data = "from the array of AABBs stored at this address". It is also possible to build a tree structure that tests for intersection with AABBs rather than with surfaces.
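For reference, an intersection test between a line segment and an axis-aligned bounding box can be written with the well-known slab method. This is a CPU sketch for illustration only and says nothing about how a GPU or its driver implements the test; `Vec3`, `Aabb`, and `segment_hits_aabb` are hypothetical names, and the parameters follow the origin, direction, near, far convention used by ray queries.

```cpp
#include <algorithm>
#include <array>
#include <utility>

using Vec3 = std::array<float, 3>;

struct Aabb {
  Vec3 min;
  Vec3 max;
};

// Slab test: does origin + t * direction touch the box for some t in
// [t_near, t_far]? Each axis narrows the interval of valid t values.
bool segment_hits_aabb(const Vec3& origin, const Vec3& direction,
                       float t_near, float t_far, const Aabb& box) {
  for (int axis = 0; axis < 3; ++axis) {
    if (direction[axis] == 0.0f) {
      // Parallel to this pair of planes: the origin must lie between them.
      if (origin[axis] < box.min[axis] || origin[axis] > box.max[axis]) {
        return false;
      }
      continue;
    }
    float inv = 1.0f / direction[axis];
    float t0 = (box.min[axis] - origin[axis]) * inv;
    float t1 = (box.max[axis] - origin[axis]) * inv;
    if (t0 > t1) std::swap(t0, t1);
    t_near = std::max(t_near, t0);
    t_far = std::min(t_far, t1);
    if (t_near > t_far) return false;  // the interval became empty
  }
  return true;
}
```

A tree of such boxes lets a traversal skip every triangle inside a box the segment misses, which is what makes the test cheaper than visiting the triangles one by one.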

Slide 198

Slide 198 text

VK_KHR_ray_query. #version 450 #extension GL_EXT_ray_query : enable ... void main() { rayQueryEXT ray_query; rayQueryInitializeEXT( ray_query, acceleration_structure, gl_RayFlagsTerminateOnFirstHitEXT, cull_mask, pos, near, direction, far ); while( rayQueryProceedEXT( ray_query ) ) { if( rayQueryGetIntersectionTypeEXT( ray_query, false ) == gl_RayQueryCandidateIntersectionTriangleEXT ) { rayQueryConfirmIntersectionEXT( ray_query ); } } if( rayQueryGetIntersectionTypeEXT( ray_query, true) == gl_RayQueryCommittedIntersectionNoneEXT ) { ... } } Annotations: rayQueryInitializeEXT: "using this acceleration structure, check whether the line segment that starts at pos, points along direction, and spans the distances near to far intersects anything." rayQueryConfirmIntersectionEXT: "when an intersecting triangle is found." Final if: "handling for when there is an occluder."

Slide 199

Slide 199 text

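Unless an object's surface is a perfect mirror, light that hits the surface scatters in many directions. In ray tracing, the parallelism of the data grows every time a ray hits an object's surface.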

Slide 200

Slide 200 text

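In ray tracing, the parallelism of the data grows every time a ray hits an object's surface. Do this with the existing pipelines? [Diagram: the graphics pipeline (Input Assembly, VS, TCS, Tessellation, TES, GS, Rasterization, FS, Color Blend) and the compute pipeline (CS).]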

Slide 201

Slide 201 text

[Diagram: the graphics pipeline, the compute pipeline (CS) with Ray Query, and the new ray tracing pipeline (RayGen Shader, Closest Hit Shader, Miss Shader).] Doing this with the existing pipelines looked infeasible, so a new pipeline sprouted: the ray tracing pipeline, VK_KHR_ray_tracing_pipeline.

Slide 202

Slide 202 text

[Diagram: a Vulkan application drawing through a compositor (X Window System, Wayland compositor, Windows DWM, etc.).] The overhead of going through the compositor is intolerable.

Slide 203

Slide 203 text

VK_EXT_full_screen_exclusive. [Diagram: Vulkan application and compositor (Windows DWM).] While an application is displayed full screen, why not let it touch the display output directly? vkAcquireFullScreenExclusiveModeEXT (meaning: "hand over the whole screen")

Slide 204

Slide 204 text

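VK_KHR_display. [Diagram: a Vulkan application on Linux without X running, with no compositor, driving display 1.] If there is no compositor in the first place, why not let the application take control of the display? "How many displays are connected, and which modes can they show?"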

Slide 205

Slide 205 text

VK_KHR_display_swapchain. A thin wrapper over Linux Kernel Mode Setting is added to Vulkan. "Set the output to display 1 to 1920x1080@60Hz 24-bit and create a swapchain for writing to it." [Diagram: a swapchain of VkImages with VkDeviceMemory feeding display 1.]

Slide 206

Slide 206 text

Away from the boundaries of a mesh, many pixels end up with a color similar to that of their neighbors

Slide 207

Slide 207 text

Fragment Density Map. When we know in advance where the boundaries will fall, we want to thin out fragment shader executions accordingly.

Slide 208

Slide 208 text

VK_EXT_fragment_density_map. [Figure: the thinned-out result ≃ the result with everything computed.]

Slide 209

Slide 209 text

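VK_EXT_fragment_density_map. Human vision does not resolve fine outlines outside the center of the visual field. On a VR headset that can track the user's gaze, we want to draw in fine detail only near the center.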

Slide 210

Slide 210 text

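VK_KHR_fragment_shading_rate. MSAA and supersampling keep multiple fragment shader results per pixel for the sake of anti-aliasing. This is effective at boundaries but wasted everywhere else, so we want to vary the number of results by location.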
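How much work thinning out can save is easy to estimate. The function below is a back-of-the-envelope sketch, not the semantics of either extension: it assumes the screen is divided into square tiles, that each tile has one block size (1 = one invocation per pixel, 2 = one per 2x2 block, and so on), and that the block size is at least 1 and divides the tile size. `count_invocations` is a made-up name.

```cpp
#include <cstdint>
#include <vector>

// Each entry is the side length, in pixels, of the block covered by one
// fragment shader invocation inside that tile (1 = full rate).
// Assumptions: every block size is >= 1 and divides tile_size evenly.
std::uint64_t count_invocations(
    const std::vector<std::uint32_t>& block_size_per_tile,
    std::uint32_t tile_size) {
  std::uint64_t total = 0;
  for (std::uint32_t block : block_size_per_tile) {
    std::uint64_t per_side = tile_size / block;
    total += per_side * per_side;
  }
  return total;
}
```

With four 8x8 tiles at block sizes 1, 2, 4, and 4, the shader runs 88 times instead of the 256 needed at full rate.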

Slide 211

Slide 211 text

VK_EXT_transform_feedback. [Diagram: Input Assembly, Vertex Shader, Tessellation Control Shader, Tessellation, Tessellation Evaluation Shader, Geometry Shader, then a VkBuffer backed by VkDeviceMemory instead of Rasterization, Fragment Shader, Color Blend.] Stop the graphics pipeline after the geometry shader and write the geometry shader's output to a buffer. This is the feature OpenGL had as standard.

Slide 212

Slide 212 text

[Diagram: four render passes A, B, C, and D, each containing a single graphics pipeline.] On GPUs other than mobile GPUs there is little point in exploiting render passes, so you tend to end up with large numbers of render passes that contain just one pipeline. Creating render passes is a chore.

Slide 213

Slide 213 text

VK_KHR_dynamic_rendering. Allow the render pass to be NULL when a graphics pipeline is created.

Slide 214

Slide 214 text

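VK_KHR_dynamic_rendering void vkCmdBeginRenderingKHR( VkCommandBuffer commandBuffer, const VkRenderingInfoKHR* pRenderingInfo ); void vkCmdEndRenderingKHR( VkCommandBuffer commandBuffer ); Annotations: vkCmdBeginRenderingKHR = "from here on, use a render pass made on the spot"; vkCmdEndRenderingKHR = "the render pass made on the spot is used up to here". If a render pass contains just one pipeline, it can be created on the spot at the moment it is recorded into the command buffer.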

Slide 215

Slide 215 text

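At Gijutsushoten 12 (Tech Book Fest 12), Version 3.0 of "A Book That Explains the 3D Graphics API Vulkan as Gently as Possible", updated with recent Vulkan topics, is scheduled for release. Note: the image on the left is from Version 2.0. If you own the electronic edition of 1.0 or 2.0, you can receive the update for free.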

Slide 216

Slide 216 text

Summary. A GPU is a computer packed with many processors. With Vulkan you can perform the full range of GPU operations. Vulkan continues to be improved.