Slide 1

Slide 1 text

Let's Learn to Read NNEF
NNEF is a registered trademark of the Khronos Group Inc.
NAOMASA MATSUBAYASHI (@fadis_)

Slide 2

Slide 2 text

Neural network (figure: inputs x0, x1, x2; outputs y0, y1, y2)

Slide 3

Slide 3 text

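Formal neuron (figure): each input (input 0, input 1, input 2, input 3, ...) is multiplied by its weight (w01, w02, w03, w04), the products are summed (∑), and the sum is passed through an activation function to produce the output.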
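The formal neuron on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the step activation, and the sample numbers are assumptions, not taken from the slides.

```python
def formal_neuron(inputs, weights, activation):
    """Weighted sum of the inputs, passed through an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return activation(total)


def step(value):
    """Hypothetical activation: 1.0 for a positive sum, otherwise 0.0."""
    return 1.0 if value > 0.0 else 0.0


# Four inputs and four weights, as in the figure (the values are made up).
output = formal_neuron([1.0, 0.0, 1.0, 1.0], [0.5, -1.0, 0.25, -0.5], step)
```

Any other activation function can be passed in place of `step`.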

Slide 4

Slide 4 text

x0 x1 x2

Slide 5

Slide 5 text

Figure: inputs x0, x1, x2 enter an input layer, followed by three fully connected layers.

Slide 6

Slide 6 text

x0 x1 x2

Slide 7

Slide 7 text

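Data that seems to be correlated somehow, but whose exact relationship is not well understood.
Images of handwritten characters → ? → the digits written in them (2, 7, 4)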

Slide 8

Slide 8 text

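For now, turn the poorly understood relationship into a neural network.
Figure: input layer → fully connected layer (w) → fully connected layer (w) → fully connected layer (w) → 2, 7, 4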

Slide 9

Slide 9 text

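Find the weights w that make the correct output come out. This becomes a mathematical optimization problem.
Training: adjust this (w) so that this (2, 7, 4) comes out.
Figure: input layer → fully connected layer (w) → fully connected layer (w) → fully connected layer (w) → 2, 7, 4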
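As a toy illustration of "adjust w until the correct output comes out", the sketch below fits a single weight by gradient descent on a squared error. The slides only say that training is a mathematical optimization problem. Gradient descent, the learning rate, and the sample data are assumptions chosen for the example.

```python
def train_weight(samples, learning_rate=0.1, steps=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(steps):
        # Derivative of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= learning_rate * grad
    return w


# Made-up training data generated by the relationship y = 3 * x.
samples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
learned_w = train_weight(samples)
```

Real frameworks do the same kind of adjustment for millions of weights at once.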

Slide 10

Slide 10 text

Frameworks for carrying out training: TensorFlow, PyTorch, Caffe
Figure: adjust this (the weights w of each layer) so that this (4) comes out.

Slide 11

Slide 11 text

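Training in TensorFlow: "What is this?" → "Four" ≠ 7 → correct the weights w so that 7 comes out in this situation → try again.
In the application: "What is this?" → "Seven"
Once the network can produce correct outputs, we want to embed it in an application and use it.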

Slide 12

Slide 12 text

There are as many save formats as there are frameworks.
TensorFlow: weights w in tf.saved_model format
PyTorch: weights w in pickle format
Caffe: weights w in caffemodel format

Slide 13

Slide 13 text

Using these from an application is painful.
Figure: tf.saved_model format, pickle format, caffemodel format → application

Slide 14

Slide 14 text

Isn't there something like this?
Figure: Adobe Photoshop (PSD format) and GIMP (XCF format) → convert → PNG format, JPEG format → application

Slide 15

Slide 15 text

There is.
Figure: TensorFlow (tf.saved_model format) and Caffe (caffemodel format) → convert → w → application

Slide 16

Slide 16 text

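What must be saved in order to reproduce a training result:
Which layers are connected, and in what order:
input layer → convolution layer ×2 → maxpool layer → convolution layer ×2 → maxpool layer → convolution layer ×3 → maxpool layer → convolution layer ×3 → maxpool layer → convolution layer ×3 → maxpool layer → fully connected layer ×3
And the settings of each layer, for example: a 3x3 filter; converts 3 channels to 64 channels; padding of 1 on each side; dilation and stride are each 1.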

Slide 17

Slide 17 text

What must be saved in order to reproduce a training result:
The weights w of each layer (same layer diagram as Slide 16), for example: a rank-4 tensor of 3x3x3x64; the value of each element is recorded as a 32-bit floating-point number.
And the type in which the weights are written to the file.

Slide 18

Slide 18 text

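Figure: the network's shape and settings reference the weights w: a floating-point tensor, another floating-point tensor, and an integer tensor that in turn references the method for converting integers to floating-point numbers (min=-2.0, max=2.0, bit=8).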
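The "min=-2.0 max=2.0 bit=8" annotation describes how stored integers map back to floating-point values. A minimal sketch follows, assuming a plain linear mapping from the integer range 0..2^bit-1 onto [min, max]. The exact formula and rounding rules used by the format are not given on the slide, so treat this as an assumption.

```python
def dequantize(q, minimum=-2.0, maximum=2.0, bits=8):
    """Map an unsigned integer code to a float, assuming a linear mapping."""
    levels = (1 << bits) - 1  # 255 for 8 bits
    if not 0 <= q <= levels:
        raise ValueError("code out of range for the given bit width")
    return minimum + (maximum - minimum) * q / levels


lowest = dequantize(0)     # the smallest code maps to min
highest = dequantize(255)  # the largest code maps to max
```

Storing 8-bit codes instead of 32-bit floats shrinks the weight data to a quarter of its size, at the cost of precision.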

Slide 19

Slide 19 text

Figure as on Slide 18, with two annotations:
The network's shape and settings: a text format that humans can read and write.
The tensors: a binary format that can be parsed quickly.

Slide 20

Slide 20 text

Figure as on Slide 18, with the layout of a tensor file:
A header, fixed at 128 bytes, followed by the sequence of numeric values.
Data transfer can begin before the file has been parsed.
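Because the header has a fixed size, a reader can jump straight to the numeric payload. The sketch below illustrates only that idea. It assumes little-endian 32-bit floats after the 128 bytes and ignores the header contents entirely, whereas a real reader would take the shape and element type from the header. `read_values` is a hypothetical helper, not part of any NNEF tool.

```python
import io
import struct

HEADER_SIZE = 128  # fixed header size stated on the slide


def read_values(stream, count):
    """Skip the fixed-size header and read `count` float32 values."""
    stream.seek(HEADER_SIZE)  # no parsing needed to find the payload
    payload = stream.read(4 * count)
    if len(payload) != 4 * count:
        raise ValueError("file is shorter than expected")
    # Assumption: little-endian IEEE 754 single precision.
    return list(struct.unpack("<%df" % count, payload))


# In-memory stand-in for a tensor file: zeroed header plus three values.
fake_file = io.BytesIO(bytes(HEADER_SIZE) + struct.pack("<3f", 0.5, -1.0, 2.0))
values = read_values(fake_file, 3)
```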

Slide 21

Slide 21 text

Figure as on Slide 18. The network's shape and settings are stored in graph.nnef:
    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) { ... }
version 1.0: this file is written according to the NNEF 1.0 specification.
VGG_ILSVRC_16_layers: the name of this network.

Slide 22

Slide 22 text

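graph.nnef:
    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) { ... }
{ ... }: this scope.
(data): input values for the network are written to a buffer named data, which is defined inside this scope.
(prob): the output of the network is read from a buffer named prob, which is defined inside this scope.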

Slide 23

Slide 23 text

VGG16
input → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → linear, ReLU → linear, ReLU → linear, softmax
SIMONYAN, Karen; ZISSERMAN, Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. https://arxiv.org/abs/1409.1556

Slide 24

Slide 24 text

https://github.com/KhronosGroup/NNEF-Tools/tree/main/models#nnef-model-zoo
A VGG16 caffemodel trained for ILSVRC, converted to NNEF.

Slide 25

Slide 25 text

graph.nnef:
    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
        variable_15 = variable(label = 'conv4_1_blob2', shape = [1, 512]);
        variable_14 = variable(label = 'conv4_1_blob1', shape = [512, 256, 3, 3]);
        variable_13 = variable(label = 'conv3_3_blob2', shape = [1, 256]);
        ...
        variable_3 = variable(label = 'conv1_2_blob2', shape = [1, 64]);
        data = external(shape = [10, 3, 224, 224]);
        conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
        ...
        linear_1 = linear(relu_13, variable_28, variable_29);
        relu_14 = relu(linear_1);
        linear_2 = linear(relu_14, variable_30, variable_31);
        prob = softmax(linear_2, axes = [1]);
    }
Labels on the slide: input, output

Slide 26

Slide 26 text

graph.nnef (same listing as Slide 25), highlighted:
    external(shape = [10, 3, 224, 224]);
external - defines data that comes in from outside.
Ten 224x224 RGB (3-channel) images come in from outside as one batch.

Slide 27

Slide 27 text

graph.nnef (same listing as Slide 25), highlighted:
    linear(relu_13, variable_28, variable_29);
linear - linear transformation
Multiply the vector relu_13 by the matrix variable_28, then add the vector variable_29.
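The description of linear on this slide (multiply a vector by a matrix, then add a vector) can be sketched in plain Python. The layout with one matrix row per output element is an assumption, chosen to match shapes such as [1000, 4096] for a layer that maps 4096 values to 1000. The sample numbers are made up.

```python
def linear(x, weights, bias):
    """y[j] = sum_i(x[i] * weights[j][i]) + bias[j], one weight row per output."""
    if len(weights) != len(bias):
        raise ValueError("need one bias value per weight row")
    return [
        sum(xi * wi for xi, wi in zip(x, row)) + b
        for row, b in zip(weights, bias)
    ]


# A 2-element input mapped to 3 outputs.
y = linear([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.5, 0.5, 0.5])
```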

Slide 28

Slide 28 text

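When x and y are related like this (figure: every input x0, x1, x2, x3 is connected to every output y0, y1, y2, y3, y4):
Fully connected layer - every combination is connected.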

Slide 29

Slide 29 text

Arranging the weights of each neuron (inputs x0 to x3, outputs y0 to y4) into a matrix gives:
    w00 w01 w02 w03
    w10 w11 w12 w13
    w20 w21 w22 w23
    w30 w31 w32 w33
    w40 w41 w42 w43

Slide 30

Slide 30 text

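\[
\phi\left(
\begin{pmatrix}
w_{00} & w_{01} & w_{02} & w_{03} \\
w_{10} & w_{11} & w_{12} & w_{13} \\
w_{20} & w_{21} & w_{22} & w_{23} \\
w_{30} & w_{31} & w_{32} & w_{33} \\
w_{40} & w_{41} & w_{42} & w_{43}
\end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
\right)
=
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
\]
φ: activation function
Fully connected layer = compute the product of a matrix and a vector, then pass it through an activation function.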

Slide 31

Slide 31 text

graph.nnef (same listing as Slide 25), highlighted:
    relu(linear_1);
ReLU - a popular activation function

Slide 32

Slide 32 text

graph.nnef (same listing as Slide 25), highlighted:
    linear_1 = linear(relu_13, variable_28, variable_29);
    relu_14 = relu(linear_1);
linear: compute the product of the matrix and the vector.
relu: pass the result through the activation function.

Slide 33

Slide 33 text

graph.nnef (same listing as Slide 25), labeled: fully connected layer

Slide 34

Slide 34 text

graph.nnef (same listing as Slide 25), highlighted:
    softmax(linear_2, axes = [1]);
softmax - an activation function whose outputs sum to 1
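To make "outputs sum to 1" concrete, here is a small softmax over a single vector. It is a sketch of the usual definition and is applied to one row only. In the listing, axes = [1] selects the axis that holds the 1000 class scores of each of the 10 images. Subtracting the maximum before exponentiating is a common numerical-stability trick and is an addition to what the slide shows.

```python
import math


def softmax(values):
    """Exponentiate and normalize so that the results sum to 1."""
    peak = max(values)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([1.0, 2.0, 3.0])
```

The largest input always receives the largest probability, so the predicted class is the index of the maximum.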

Slide 35

Slide 35 text

graph.nnef (same listing as Slide 25), highlighted:
    linear_2 = linear(relu_14, variable_30, variable_31);
    prob = softmax(linear_2, axes = [1]);
linear: compute the product of the matrix and the vector.
softmax: pass the result through the activation function.

Slide 36

Slide 36 text

graph.nnef (same listing as Slide 25), labeled: fully connected layer, fully connected layer

Slide 37

Slide 37 text

graph.nnef (listing as on Slide 25, now also showing these declarations):
    variable_31 = variable(label = 'fc8_blob2', shape = [1, 1000]);
    variable_30 = variable(label = 'fc8_blob1', shape = [1000, 4096]);
    variable_29 = variable(label = 'fc7_blob2', shape = [1, 4096]);
    variable_28 = variable(label = 'fc7_blob1', shape = [4096, 4096]);
variable - loads values from a file

Slide 38

Slide 38 text

graph.nnef (same listing as Slide 37), highlighted:
    variable(label = 'fc7_blob2', shape = [1, 4096]);
variable - loads values from a file
Read a 4096-element vector from the file named fc7_blob2.dat and call it variable_29.

Slide 39

Slide 39 text

graph.nnef (same listing as Slide 37), highlighted:
    variable(label = 'fc7_blob1', shape = [4096, 4096]);
variable - loads values from a file
Read a 4096x4096 matrix from the file named fc7_blob1.dat and call it variable_28.

Slide 40

Slide 40 text

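\[
\mathrm{ReLU}\left(
\begin{pmatrix}
w_{00} & w_{01} & w_{02} & w_{03} \\
w_{10} & w_{11} & w_{12} & w_{13} \\
w_{20} & w_{21} & w_{22} & w_{23} \\
w_{30} & w_{31} & w_{32} & w_{33} \\
w_{40} & w_{41} & w_{42} & w_{43}
\end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
+
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}
\right)
=
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
\]
ReLU: the activation function
Matrix: the weights w from fc7_blob1.dat
Added vector: the weights w from fc7_blob2.dat
x: relu_13
y: relu_14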
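The whole fully connected layer from this slide, ReLU applied to a matrix-vector product plus an added vector, fits in a few lines. The function names and sample numbers below are illustrative assumptions. The real layer uses a 4096x4096 matrix rather than the tiny one here.

```python
def relu(values):
    """Replace every negative element with 0."""
    return [max(0.0, v) for v in values]


def fully_connected(x, weights, bias):
    """ReLU(W x + b), with one row of `weights` per output element."""
    pre_activation = [
        sum(xi * wi for xi, wi in zip(x, row)) + b
        for row, b in zip(weights, bias)
    ]
    return relu(pre_activation)


# The first output would be -1.0 before ReLU and is clamped to 0.0.
y = fully_connected([1.0, -2.0], [[1.0, 1.0], [2.0, 0.0]], [0.0, 0.5])
```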

Slide 41

Slide 41 text

VGG16
input → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → linear, ReLU → linear, ReLU → linear, softmax
Marked on the diagram: the part we have just looked at.
SIMONYAN, Karen; ZISSERMAN, Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. https://arxiv.org/abs/1409.1556

Slide 42

Slide 42 text

input → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → linear, ReLU → linear, ReLU → linear, softmax
Convolution layers
VGG is a neural network built for image classification, so layers suited to image processing are placed toward the top of the diagram.

Slide 43

Slide 43 text

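Apply a filter to contiguous pixels of the input image to obtain a value of the output image. This is a common form of image processing.
Filter:
    0.2 0.1 0.1
    0.2 0.7 0.1
    0.1 0.2 0.1
Input:
    0.9 0   0   0
    0   0   0.7 0.4
    0   0.3 0   0
    0   0.8 0   0
Output: 0.31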
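The filtering step can be reproduced with the numbers from this slide. The sketch assumes the nine filter values form a 3x3 grid and the sixteen input values a 4x4 grid, read row by row. With that layout the top-left window gives the 0.31 shown as the output. No padding is applied, and the function name is an assumption.

```python
def filter_image(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


kernel = [[0.2, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.1]]
image = [
    [0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.4],
    [0.0, 0.3, 0.0, 0.0],
    [0.0, 0.8, 0.0, 0.0],
]
result = filter_image(image, kernel)  # 2x2 output for a 4x4 input
```

The top-left element works out as 0.9 × 0.2 + 0.7 × 0.1 + 0.3 × 0.2 = 0.31.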

Slide 44

Slide 44 text

Convolution layer: treat the filter as the weights w (same input, filter, and output figure as Slide 43).
Training acquires filters that can extract the desired information.

Slide 45

Slide 45 text

graph.nnef:
    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
        ...
        variable_2 = variable(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
        variable_1 = variable(label = 'conv1_1_blob2', shape = [1, 64]);
        ...
        variable = variable(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
        variable_3 = variable(label = 'conv1_2_blob2', shape = [1, 64]);
        data = external(shape = [10, 3, 224, 224]);
        conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
        relu = relu(conv);
        conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
        relu_1 = relu(conv_1);
        max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
        ...
    }
Highlighted:
    conv(
        data, variable, variable_1,
        border = 'constant',
        dilation = [1, 1],
        groups = 1,
        padding = [(1, 1), (1, 1)],
        stride = [1, 1]
    );
conv - convolution

Slide 46

Slide 46 text

conv - convolution
    conv(
        data, variable, variable_1,
        border = 'constant',
        dilation = [1, 1],
        groups = 1,
        padding = [(1, 1), (1, 1)],
        stride = [1, 1]
    );
data: the input image
variable: the filter
variable_1: a constant added per channel

Slide 47

Slide 47 text

    conv(
        data, variable, variable_1,
        border = 'constant',
        dilation = [1, 1],
        groups = 1,
        padding = [(1, 1), (1, 1)],
        stride = [1, 1]
    );
Figure: dilation=1 versus dilation=2. "This one" marks the setting used in this call, dilation = [1, 1].

Slide 48

Slide 48 text

    conv(
        data, variable, variable_1,
        border = 'constant',
        dilation = [1, 1],
        groups = 1,
        padding = [(1, 1), (1, 1)],
        stride = [1, 1]
    );
Figure: padding=0 versus padding=1. "This one" marks the setting used in this call, padding = [(1, 1), (1, 1)].

Slide 49

Slide 49 text

    conv(
        data, variable, variable_1,
        border = 'constant',
        dilation = [1, 1],
        groups = 1,
        padding = [(1, 1), (1, 1)],
        stride = [1, 1]
    );
Figure: stride=1 versus stride=2. "This one" marks the setting used in this call, stride = [1, 1].
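Dilation, padding, and stride together determine how large the output of a sliding-window operation is along one axis. The helper below uses the commonly used formula. Whether it matches the format's exact rounding rules in every corner case is an assumption, and `output_size` is a hypothetical name.

```python
def output_size(input_size, kernel, padding=(0, 0), stride=1, dilation=1):
    """Output length along one axis of a sliding-window operation."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    padded = input_size + padding[0] + padding[1]
    return (padded - effective_kernel) // stride + 1


# 3x3 convolution with padding 1, stride 1, dilation 1: the size is preserved.
conv_size = output_size(224, 3, padding=(1, 1), stride=1, dilation=1)
# 2x2 max pooling with stride 2 and no padding: the size is halved.
pool_size = output_size(224, 2, padding=(0, 0), stride=2)
```

With the settings in the listing, every convolution keeps the 224x224 resolution and only the pooling layers shrink it.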

Slide 50

Slide 50 text

With 3 input channels and 4 output channels, 3 × 4 filters of size 3 × 3 are needed.
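The counting rule on this slide, one filter per pair of input and output channel, also gives the number of stored weights for a filter tensor. A small sketch with hypothetical helper names, using shapes that appear in the listing:

```python
def filter_count(in_channels, out_channels):
    """One 2D filter is needed for every (input, output) channel pair."""
    return in_channels * out_channels


def weight_count(shape):
    """Total number of elements in a tensor of the given shape."""
    total = 1
    for extent in shape:
        total *= extent
    return total


filters_on_slide = filter_count(3, 4)      # 3 inputs, 4 outputs
conv1_1 = weight_count([64, 3, 3, 3])      # shape of conv1_1_blob1
conv1_2 = weight_count([64, 64, 3, 3])     # shape of conv1_2_blob1
```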

Slide 51

Slide 51 text

input → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → linear, ReLU → linear, ReLU → linear, softmax
Annotations: the 3 channels of RGB; turn 3 channels into 64 channels; turn 64 channels into 64 channels.

Slide 52

Slide 52 text

graph.nnef (same listing as Slide 45), highlighted:
    variable(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
Use the 3 × 3 × 3 × 64 rank-4 tensor written in the file conv1_1_blob1.dat as the filter.

Slide 53

Slide 53 text

graph.nnef (same listing as Slide 45), labeled: a convolution layer from 3 channels to 64 channels

Slide 54

Slide 54 text

graph.nnef (same listing as Slide 45), highlighted:
    variable(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
Use the 3 × 3 × 64 × 64 rank-4 tensor written in the file conv1_2_blob1.dat as the filter.

Slide 55

Slide 55 text

graph.nnef (same listing as Slide 45), labeled: a convolution layer from 64 channels to 64 channels

Slide 56

Slide 56 text

graph.nnef (same listing as Slide 45), highlighted:
    max_pool(
        relu_1,
        border = 'ignore',
        padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
        size = [1, 1, 2, 2],
        stride = [1, 1, 2, 2]
    );
max_pool - outputs the maximum value within the filter range

Slide 57

Slide 57 text

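Max pooling layer
Input values shown: 4 3 6 5 3 0 1 0 8 6 1 9 1 3 7 0
Output value shown: 8 (the maximum within this range)
The largest value within the filter range is written to the corresponding position of the output image.
This layer has no weights w.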
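Max pooling can be sketched as a sliding window that keeps only the largest value. The example reuses the sixteen input values from this slide and assumes they form a 4x4 grid read row by row. The 2x2 window with stride 2 matches the max_pool settings in the listing, not necessarily the window drawn in the figure.

```python
def max_pool(image, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    return [
        [
            max(
                image[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [
    [4, 3, 6, 5],
    [3, 0, 1, 0],
    [8, 6, 1, 9],
    [1, 3, 7, 0],
]
pooled = max_pool(image)  # 4x4 input becomes 2x2
```

No weights are involved, which is why nothing has to be loaded from a file for this layer.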

Slide 58

Slide 58 text

    max_pool(
        relu_1,
        border = 'ignore',
        padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
        size = [1, 1, 2, 2],
        stride = [1, 1, 2, 2]
    );
Figure: padding=0 versus padding=1. "This one" marks the setting used in this call, a padding of 0 on every side.

Slide 59

Slide 59 text

    max_pool(
        relu_1,
        border = 'ignore',
        padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
        size = [1, 1, 2, 2],
        stride = [1, 1, 2, 2]
    );
Figure: size=2, size=3, size=4. "This one" marks the setting used in this call, a 2x2 window (size = [1, 1, 2, 2]).

Slide 60

Slide 60 text

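    max_pool(
        relu_1,
        border = 'ignore',
        padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
        size = [1, 1, 2, 2],
        stride = [1, 1, 2, 2]
    );
Figure: stride=1 versus stride=2. "This one" marks the setting used in this call, a stride of 2 (stride = [1, 1, 2, 2]).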

Slide 61

Slide 61 text

graph.nnef (same listing as Slide 45), labeled on the max_pool statement:
Max pooling shrinks the image from 224x224 to 112x112.
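Since each 2x2, stride-2 max pooling halves the resolution, the image size after each of the five pooling layers in VGG16 can be traced with a short loop. This sketch assumes, as in the listing, that the convolutions in between keep the size unchanged.

```python
def pooled_size(size, window=2, stride=2):
    """Size along one axis after an unpadded pooling window."""
    return (size - window) // stride + 1


sizes = [224]  # input resolution from external(shape = [10, 3, 224, 224])
for _ in range(5):  # VGG16 has five max pooling layers
    sizes.append(pooled_size(sizes[-1]))
```

The final entry, 7, is the resolution of the feature map that reaches the fully connected layers.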

Slide 62

Slide 62 text

VGG16
input → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → linear, ReLU → linear, ReLU → linear, softmax
Marked on the diagram: the part we have just looked at.
SIMONYAN, Karen; ZISSERMAN, Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. https://arxiv.org/abs/1409.1556

Slide 63

Slide 63 text

input → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → linear, ReLU → linear, ReLU → linear, softmax
The last maxpool: "I output ten images of 7 × 7 × 512."
The first linear: "I want ten vectors of 25088 elements."
The data is the same, but a conversion of the type is required.

Slide 64

Slide 64 text

graph.nnef:
    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
        ...
        relu_12 = relu(conv_12);
        max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
        reshape = reshape(max_pool_4, shape = [10, -1]);
        linear = linear(reshape, variable_26, variable_27);
        relu_13 = relu(linear);
        ...
    }
Highlighted:
    reshape(max_pool_4, shape = [10, -1]);
reshape - changes how the data is interpreted
Divide the contents of this buffer into 10 equal parts, giving 10 vectors.
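The -1 in shape = [10, -1] stands for "whatever size makes the element count match". A minimal sketch of that inference is shown below. `resolve_shape` is a hypothetical helper, and treating -1 as a single inferred extent is the usual convention, assumed here.

```python
def resolve_shape(element_count, shape):
    """Replace a single -1 in `shape` with the extent implied by the data."""
    if shape.count(-1) > 1:
        raise ValueError("at most one extent can be inferred")
    known = 1
    for extent in shape:
        if extent != -1:
            known *= extent
    if element_count % known != 0:
        raise ValueError("element count is not divisible by the known extents")
    inferred = element_count // known
    resolved = [inferred if extent == -1 else extent for extent in shape]
    if -1 not in shape and known != element_count:
        raise ValueError("shape does not match the element count")
    return resolved


# Ten feature maps of 512 channels at 7x7 become ten flat vectors.
flat_shape = resolve_shape(10 * 512 * 7 * 7, [10, -1])
```

The result, 25088 elements per image, matches the second extent of the fc6_blob1 weight shape [4096, 25088].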

Slide 65

Slide 65 text

graph.nnef (annotation on the variable declarations: read values from files):
    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
        variable_15 = variable(label = 'conv4_1_blob2', shape = [1, 512]);
        variable_14 = variable(label = 'conv4_1_blob1', shape = [512, 256, 3, 3]);
        variable_13 = variable(label = 'conv3_3_blob2', shape = [1, 256]);
        variable_31 = variable(label = 'fc8_blob2', shape = [1, 1000]);
        variable_30 = variable(label = 'fc8_blob1', shape = [1000, 4096]);
        variable_29 = variable(label = 'fc7_blob2', shape = [1, 4096]);
        variable_28 = variable(label = 'fc7_blob1', shape = [4096, 4096]);
        variable_27 = variable(label = 'fc6_blob2', shape = [1, 4096]);
        variable_26 = variable(label = 'fc6_blob1', shape = [4096, 25088]);
        variable_25 = variable(label = 'conv5_3_blob2', shape = [1, 512]);
        variable_24 = variable(label = 'conv5_3_blob1', shape = [512, 512, 3, 3]);
        variable_23 = variable(label = 'conv5_2_blob2', shape = [1, 512]);
        variable_22 = variable(label = 'conv5_2_blob1', shape = [512, 512, 3, 3]);
        variable_21 = variable(label = 'conv5_1_blob2', shape = [1, 512]);
        variable_20 = variable(label = 'conv5_1_blob1', shape = [512, 512, 3, 3]);
        variable_19 = variable(label = 'conv4_3_blob2', shape = [1, 512]);
        variable_18 = variable(label = 'conv4_3_blob1', shape = [512, 512, 3, 3]);
        variable_17 = variable(label = 'conv4_2_blob2', shape = [1, 512]);
        variable_16 = variable(label = 'conv4_2_blob1', shape = [512, 512, 3, 3]);
        variable_12 = variable(label = 'conv3_3_blob1', shape = [256, 256, 3, 3]);
        variable_10 = variable(label = 'conv3_2_blob1', shape = [256, 256, 3, 3]);
        variable_9 = variable(label = 'conv3_1_blob2', shape = [1, 256]);
        variable_8 = variable(label = 'conv3_1_blob1', shape = [256, 128, 3, 3]);
        variable_6 = variable(label = 'conv2_2_blob1', shape = [128, 128, 3, 3]);
        variable_11 = variable(label = 'conv3_2_blob2', shape = [1, 256]);
        variable_5 = variable(label = 'conv2_1_blob2', shape = [1, 128]);
        variable_4 = variable(label = 'conv2_1_blob1', shape = [128, 64, 3, 3]);
        variable_2 = variable(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
        variable_1 = variable(label = 'conv1_1_blob2', shape = [1, 64]);
        variable_7 = variable(label = 'conv2_2_blob2', shape = [1, 128]);
        variable = variable(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
        variable_3 = variable(label = 'conv1_2_blob2', shape = [1, 64]);
        data = external(shape = [10, 3, 224, 224]);
        conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
        relu = relu(conv);
        conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
        relu_1 = relu(conv_1);
        max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
        conv_2 = conv(max_pool, variable_4, variable_5, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);

Slide 66

Slide 66 text

variable_6 = variable(label = 'conv2_2_blob1', shape = [128, 128, 3, 3]); variable_11 = variable(label = 'conv3_2_blob2', shape = [1, 256]); variable_5 = variable(label = 'conv2_1_blob2', shape = [1, 128]); variable_4 = variable(label = 'conv2_1_blob1', shape = [128, 64, 3, 3]); variable_2 = variable(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]); variable_1 = variable(label = 'conv1_1_blob2', shape = [1, 64]); variable_7 = variable(label = 'conv2_2_blob2', shape = [1, 128]); variable = variable(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]); variable_3 = variable(label = 'conv1_2_blob2', shape = [1, 64]); data = external(shape = [10, 3, 224, 224]); conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu = relu(conv); conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_1 = relu(conv_1); max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_2 = conv(max_pool, variable_4, variable_5, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_2 = relu(conv_2); conv_3 = conv(relu_2, variable_6, variable_7, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_3 = relu(conv_3); max_pool_1 = max_pool(relu_3, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_4 = conv(max_pool_1, variable_8, variable_9, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_4 = relu(conv_4); conv_5 = conv(relu_4, variable_10, variable_11, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_5 = relu(conv_5); conv_6 = conv(relu_5, variable_12, variable_13, border = 
'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_6 = relu(conv_6); max_pool_2 = max_pool(relu_6, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_7 = conv(max_pool_2, variable_14, variable_15, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_7 = relu(conv_7); conv_8 = conv(relu_7, variable_16, variable_17, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_8 = relu(conv_8); conv_9 = conv(relu_8, variable_18, variable_19, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_9 = relu(conv_9); max_pool_3 = max_pool(relu_9, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_10 = conv(max_pool_3, variable_20, variable_21, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_10 = relu(conv_10); conv_11 = conv(relu_10, variable_22, variable_23, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_11 = relu(conv_11); conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_12 = relu(conv_12); max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); reshape = reshape(max_pool_4, shape = [10, -1]); linear = linear(reshape, variable_26, variable_27); relu_13 = relu(linear); linear_1 = linear(relu_13, variable_28, variable_29); input convolution ReLU maxpool convolution ReLU convolution ReLU maxpool convolution ReLU convolution ReLU

Slide 67

Slide 67 text

variable_2 = variable(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]); variable_1 = variable(label = 'conv1_1_blob2', shape = [1, 64]); variable_7 = variable(label = 'conv2_2_blob2', shape = [1, 128]); variable = variable(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]); variable_3 = variable(label = 'conv1_2_blob2', shape = [1, 64]); data = external(shape = [10, 3, 224, 224]); conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu = relu(conv); conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_1 = relu(conv_1); max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_2 = conv(max_pool, variable_4, variable_5, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_2 = relu(conv_2); conv_3 = conv(relu_2, variable_6, variable_7, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_3 = relu(conv_3); max_pool_1 = max_pool(relu_3, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_4 = conv(max_pool_1, variable_8, variable_9, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_4 = relu(conv_4); conv_5 = conv(relu_4, variable_10, variable_11, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_5 = relu(conv_5); conv_6 = conv(relu_5, variable_12, variable_13, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_6 = relu(conv_6); max_pool_2 = max_pool(relu_6, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_7 = conv(max_pool_2, variable_14, 
variable_15, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_7 = relu(conv_7); conv_8 = conv(relu_7, variable_16, variable_17, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_8 = relu(conv_8); conv_9 = conv(relu_8, variable_18, variable_19, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_9 = relu(conv_9); max_pool_3 = max_pool(relu_9, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_10 = conv(max_pool_3, variable_20, variable_21, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_10 = relu(conv_10); conv_11 = conv(relu_10, variable_22, variable_23, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_11 = relu(conv_11); conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_12 = relu(conv_12); max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); reshape = reshape(max_pool_4, shape = [10, -1]); linear = linear(reshape, variable_26, variable_27); relu_13 = relu(linear); linear_1 = linear(relu_13, variable_28, variable_29); relu_14 = relu(linear_1); linear_2 = linear(relu_14, variable_30, variable_31); prob = softmax(linear_2, axes = [1]); } convolution ReLU convolution ReLU convolution ReLU maxpool convolution ReLU convolution ReLU convolution ReLU maxpool

Slide 68

Slide 68 text

max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_2 = conv(max_pool, variable_4, variable_5, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_2 = relu(conv_2); conv_3 = conv(relu_2, variable_6, variable_7, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_3 = relu(conv_3); max_pool_1 = max_pool(relu_3, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_4 = conv(max_pool_1, variable_8, variable_9, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_4 = relu(conv_4); conv_5 = conv(relu_4, variable_10, variable_11, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_5 = relu(conv_5); conv_6 = conv(relu_5, variable_12, variable_13, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_6 = relu(conv_6); max_pool_2 = max_pool(relu_6, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_7 = conv(max_pool_2, variable_14, variable_15, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_7 = relu(conv_7); conv_8 = conv(relu_7, variable_16, variable_17, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_8 = relu(conv_8); conv_9 = conv(relu_8, variable_18, variable_19, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_9 = relu(conv_9); max_pool_3 = max_pool(relu_9, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); conv_10 = conv(max_pool_3, variable_20, variable_21, border = 'constant', dilation = [1, 1], 
groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_10 = relu(conv_10); conv_11 = conv(relu_10, variable_22, variable_23, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_11 = relu(conv_11); conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]); relu_12 = relu(conv_12); max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]); reshape = reshape(max_pool_4, shape = [10, -1]); linear = linear(reshape, variable_26, variable_27); relu_13 = relu(linear); linear_1 = linear(relu_13, variable_28, variable_29); relu_14 = relu(linear_1); linear_2 = linear(relu_14, variable_30, variable_31); prob = softmax(linear_2, axes = [1]); } convolution ReLU convolution ReLU convolution ReLU maxpool linear ReLU linear softmax ReLU convolution ReLU maxpool

Slide 69

Slide 69 text

constant - creates a buffer in which every element has the specified value
constant( shape = [ 3, 4 ], value = [42.0] );
[ 42.0 42.0 42.0 42.0
  42.0 42.0 42.0 42.0
  42.0 42.0 42.0 42.0 ]

Slide 70

Slide 70 text

conv( data, variable, variable_1, border = 'constant', dilation = [2, 2], groups = 1, padding = [(0, 0), (0, 0)], stride = [1, 1] );
deconv - deconvolution (transposed convolution)
deconv( data, variable, variable_1, border = 'constant', dilation = [2, 2], groups = 1, padding = [(0, 0), (0, 0)], stride = [1, 1] );

Slide 71

Slide 71 text

box - returns the sum of the values inside the filter window
box( data, size = [ 1, 3, 3 ], border = 'constant', dilation = [2, 2], groups = 1, padding = [(0, 0), (0, 0)], stride = [1, 1] );
debox( data, size = [ 1, 3, 3 ], border = 'constant', dilation = [2, 2], groups = 1, padding = [(0, 0), (0, 0)], stride = [1, 1] );
[Figure: grids of ones with example window sums of 9 and 4]

Slide 72

Slide 72 text

argmax_pool - returns the index of the element that holds the maximum value
sample - picks values out of another tensor by index
[Figure: example grid of values with the selected maxima]

Slide 73

Slide 73 text

reduce - aggregates all values along a specified axis into a single value
sum_reduce max_reduce min_reduce argmax_reduce argmin_reduce all_reduce any_reduce
[Figure: the row 9 2 4 7 reduced with each of the operators above]
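To make the idea of reducing along an axis concrete, here is a minimal CPU sketch of a sum reduction over a row-major 2D matrix. The function name `sum_reduce_2d` and the 2D restriction are assumptions made for illustration; this is not the NNEF reference implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a row-major [rows x cols] matrix along one axis.
// axis == 0 collapses the rows (result has `cols` values);
// axis == 1 collapses the columns (result has `rows` values).
std::vector<float> sum_reduce_2d(const std::vector<float>& data,
                                 std::size_t rows, std::size_t cols,
                                 int axis) {
  assert(data.size() == rows * cols);
  assert(axis == 0 || axis == 1);
  std::vector<float> out(axis == 0 ? cols : rows, 0.0f);
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      // Every input element is added to exactly one output slot.
      out[axis == 0 ? c : r] += data[r * cols + c];
    }
  }
  return out;
}
```

Swapping `+=` for a max, min, logical and, or logical or gives the other reduce variants listed on the slide.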

Slide 74

Slide 74 text

split - splits a tensor along a specified axis
split concat pad

Slide 75

Slide 75 text

gather - selects one element along a specified axis
[Figure: indices 1 0 3 2 selecting from the values 3 9 5 4]
cast - changes the element type of a tensor
[Figure: float values such as 9.0 2.0 7.0 7.0 become the integers 9 2 7 7]

Slide 76

Slide 76 text

matmul - matrix product
result = matmul( input1, input2 );
[Figure: 4x4 matrices input1 and input2 and their product result]
transpose - transpose
result = transpose( input, [ 1, 0 ] );
[Figure: 4x4 matrix input and its transpose result]

Slide 77

Slide 77 text

Activation functions: sigmoid relu prelu leaky_relu elu gelu silu softmax softplus
Pooling: max_pool_with_index max_pool avg_pool rms_pool
Normalization: local_response_normalization local_mean_normalization local_variance_normalization local_contrast_normalization l1_normalization l2_normalization batch_normalization
Quantization: min_max_linear_quantize zero_point_linear_quantize logarithmic_quantize

Slide 78

Slide 78 text

Region-of-interest pooling: avg_roi_pool max_roi_pool roi_resample avg_roi_align max_roi_align
Resizing: nearest_downsample area_downsample nearest_upsample multilinear_upsample
Unary operators: copy neg rcp exp log sin cos tan sinh cosh tanh asin acos atan asinh acosh atanh abs sign not floor ceil round
Binary operators: add sub mul div pow lt gt le ge eq ne and or
Ternary operator: select
Other functions: sqr sqrt rsqr rsqrt log2 min max clamp

Slide 79

Slide 79 text

https://github.com/Fadis/gct
GPU Computing Toolkit
A library that uses Vulkan to let GPU applications write their most common operations concisely.

Slide 80

Slide 80 text

https://github.com/Fadis/gct
GPU Computing Toolkit
A library that uses Vulkan to let GPU applications write their most common operations concisely.
Goal: hand it an NNEF model and have the neural network evaluated quickly on the GPU.

Slide 81

Slide 81 text

Of the NNEF operators, the ones that must be implemented to run VGG:
external variable conv relu max_pool reshape linear softmax

Slide 82

Slide 82 text

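Of the NNEF operators, the ones that must be implemented to run VGG:
external variable conv relu max_pool reshape linear softmax
external: only allocates GPU memory
variable: only allocates GPU memory and transfers the data
reshape: only changes the size of the tensor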

Slide 83

Slide 83 text

conv
#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable
layout(local_size_x = 32, local_size_y = 32, local_size_z = 1 ) in;
layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer weight { float weight_data[]; };        // filter
layout(std430, binding = 2) buffer output_vector { float output_data[]; }; // output
layout(std430, binding = 3) buffer bias { float bias_data[]; };            // bias
layout(constant_id = 1) const int filter_size_x = 0;
layout(constant_id = 2) const int filter_size_y = 0;

Slide 84

Slide 84 text

// filter size
layout(constant_id = 1) const int filter_size_x = 0;
layout(constant_id = 2) const int filter_size_y = 0;
layout(constant_id = 3) const int filter_size_z = 0;
// padding
layout(constant_id = 4) const int lpadding = 0;
layout(constant_id = 5) const int rpadding = 0;
layout(constant_id = 6) const int tpadding = 0;
layout(constant_id = 7) const int bpadding = 0;
// stride
layout(constant_id = 8) const int stride_x = 0;
layout(constant_id = 9) const int stride_y = 0;
// dilation
layout(constant_id = 10) const int dilation_x = 0;
layout(constant_id = 11) const int dilation_y = 0;
// input size
layout(constant_id = 12) const int input_dim_x = 0;
layout(constant_id = 13) const int input_dim_y = 0;
layout(constant_id = 14) const int input_dim_z = 0;
// output size
layout(constant_id = 15) const int output_dim_x = 0;
layout(constant_id = 16) const int output_dim_y = 0;
layout(constant_id = 17) const int output_dim_z = 0;
// value used outside the input range
layout(constant_id = 18) const float border_value = 0.0;
// how the bias is applied
layout(constant_id = 19) const int bias_mode = 0;
layout(constant_id = 20) const float bias_value = 0;
int get_filter_length() {
  return filter_size_x * filter_size_y * filter_size_z;
}

Slide 85

Slide 85 text

Each thread is responsible for one element of the output.
Several threads use the same filter, so the filter is loaded into shared memory.
shared float[filter_size_x*filter_size_y*filter_size_z] filter_cache;
void load_filter() {
  const uint local_id =
    gl_LocalInvocationID.x +
    gl_LocalInvocationID.y * gl_WorkGroupSize.x +
    gl_LocalInvocationID.z * gl_WorkGroupSize.y * gl_WorkGroupSize.x;
  const uint local_size = gl_WorkGroupSize.x * gl_WorkGroupSize.y * gl_WorkGroupSize.z;
  const uint filter_size = uint( get_filter_length() );
  const uint cycles = filter_size / local_size + ( bool( filter_size % local_size ) ? 1 : 0 );
  for( uint c = 0u; c != cycles; c++ ) {
    uint i = c * local_size + local_id;
    const int filter_offset = get_filter_offset( int( i ) );
    if( i < filter_size ) {
      filter_cache[ i ] = weight_data[ filter_offset ];
    }
  }
}
float get_input( int i ) {
  const int input_offset = get_input_offset( i );
  return ( input_offset < 0 ) ?

Slide 86

Slide 86 text

  const int output_channel = int( gl_GlobalInvocationID.z % output_dim_z );
  return ( bias_mode == 0 ) ? bias_value : bias_data[ output_channel ];
}
void set_output( float v ) {
  const int output_offset = get_output_offset();
  if( output_offset >= 0 ) {
    output_data[ output_offset ] = v;
  }
}
void main() {
  load_filter();
  barrier();
  const int filter_length = get_filter_length();
  float sum = 0.0;
  for( int i = 0; i != filter_length; i++ ) {
    sum += get_input( i ) * get_filter( i );
  }
  sum += get_bias();
  set_output( sum );
}
Multiply the input values inside the filter window by the filter, take the sum, add the bias, and write the result to the output.
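The per-thread work described above (multiply, sum, add bias) can be sketched on the CPU for a single output element. This is a simplified single-channel 2D version; the function name `conv_at`, the parameter list, and the use of one symmetric padding value are assumptions for illustration and do not mirror the shader's helper functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Value of one output element (ox, oy) of a single-channel 2D convolution.
// Positions that fall outside the input read `border_value` instead,
// which corresponds to border = 'constant' in the NNEF graph.
float conv_at(const std::vector<float>& in, int w, int h,
              const std::vector<float>& filter, int fw, int fh,
              int ox, int oy, int pad, int stride, int dilation,
              float bias, float border_value) {
  assert(in.size() == static_cast<std::size_t>(w) * static_cast<std::size_t>(h));
  assert(filter.size() == static_cast<std::size_t>(fw) * static_cast<std::size_t>(fh));
  float sum = 0.0f;
  for (int fy = 0; fy < fh; ++fy) {
    for (int fx = 0; fx < fw; ++fx) {
      const int ix = ox * stride - pad + fx * dilation;
      const int iy = oy * stride - pad + fy * dilation;
      const bool inside = ix >= 0 && ix < w && iy >= 0 && iy < h;
      const float v = inside ? in[iy * w + ix] : border_value;
      sum += v * filter[fy * fw + fx];
    }
  }
  return sum + bias;
}
```

With a 3x3 input of ones, a 3x3 filter of ones, padding 1, and a zero border, the centre element sees all nine inputs while a corner sees only four.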

Slide 87

Slide 87 text

linear
#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable
layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;
layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer weight { float weight_data[]; };        // filter
layout(std430, binding = 2) buffer output_vector { float output_data[]; }; // output
layout(std430, binding = 3) buffer bias { float bias_data[]; };            // bias
layout(constant_id = 1) const uint input_length = 32;
layout(constant_id = 2) const uint bias_mode = 0;
layout(constant_id = 3) const float bias_value = 0.0;

Slide 88

Slide 88 text

};
layout(constant_id = 1) const uint input_length = 32; // input size
layout(constant_id = 2) const uint bias_mode = 0;     // how the bias is applied
layout(constant_id = 3) const float bias_value = 0.0;
shared float[32] temp; // shared memory used for the horizontal add
float get_bias() {
  const uint output_index = gl_GlobalInvocationID.y;
  return ( bias_mode == 0 ) ? bias_value : bias_data[ output_index ];
}
void main() {
  const uint x = gl_GlobalInvocationID.x;
  const uint y = gl_GlobalInvocationID.y;
  const uint batch = gl_GlobalInvocationID.z;
  float sum = float( 0 );
  const uint x_blocks = input_length / gl_WorkGroupSize.x + ( bool( input_length % gl_WorkGroupSize.x ) ? 1 : 0 );
  for( uint x_index = 0; x_index != x_blocks; x_index++ ) {
    const uint x_global = x_index * gl_WorkGroupSize.x + x;

Slide 89

Slide 89 text

One thread per matrix element.
The horizontal-add (subgroup) instruction computes the dot product of the input and one column of weights.
The thread responsible for the first row writes the result.
  const uint batch = gl_GlobalInvocationID.z;
  float sum = float( 0 );
  const uint x_blocks = input_length / gl_WorkGroupSize.x + ( bool( input_length % gl_WorkGroupSize.x ) ? 1 : 0 );
  for( uint x_index = 0; x_index != x_blocks; x_index++ ) {
    const uint x_global = x_index * gl_WorkGroupSize.x + x;
    const uint batch_offset = input_length * batch;
    bool mask = ( x_global < input_length );
    float result = float( 0 );
    result = mask ? ( weight_data[ x_global + y * input_length ] * input_data[ batch_offset + x_global ] ) : float( 0 );
    result = subgroupAdd( result );
    if( gl_SubgroupInvocationID == 0 ) {
      temp[ gl_SubgroupID ] = result;
    }
    barrier();
    mask = ( x < gl_NumSubgroups );
    result = subgroupAdd( mask ? temp[ x ] : float( 0 ) );
    sum += result;
  }
  if( x == 0 ) {
    const uint output_length = gl_WorkGroupSize.y * gl_NumWorkGroups.y;
    const uint batch_offset = output_length * batch;
    output_data[ batch_offset + y ] = sum + get_bias();
  }
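The two-stage reduction in this shader (sum within each subgroup, then sum the per-subgroup partials) can be imitated sequentially on the CPU. The sketch below assumes a subgroup width of 32 purely for illustration; `dot_two_stage` is a hypothetical name, and real subgroup sizes depend on the GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Dot product computed in two stages, mirroring the structure of the shader:
// stage 1 sums products inside each block of `subgroup_size` lanes,
// stage 2 sums the per-block partial results.
float dot_two_stage(const std::vector<float>& weight,
                    const std::vector<float>& input,
                    std::size_t subgroup_size = 32) {
  assert(weight.size() == input.size());
  assert(subgroup_size > 0);
  std::vector<float> partial;
  for (std::size_t begin = 0; begin < weight.size(); begin += subgroup_size) {
    const std::size_t end = std::min(begin + subgroup_size, weight.size());
    float s = 0.0f;
    for (std::size_t i = begin; i < end; ++i) {
      s += weight[i] * input[i];
    }
    partial.push_back(s);  // plays the role of temp[ gl_SubgroupID ]
  }
  return std::accumulate(partial.begin(), partial.end(), 0.0f);
}
```

One output of the linear layer is then `dot_two_stage(row_of_weights, input) + bias`.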

Slide 90

Slide 90 text

max_pool
#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable
layout(local_size_x = 32, local_size_y = 32, local_size_z = 1 ) in;
layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer output_vector { float output_data[]; }; // output
layout(constant_id = 1) const int filter_size_x = 0;
layout(constant_id = 2) const int filter_size_y = 0;
layout(constant_id = 3) const int filter_size_z = 0;
layout(constant_id = 4) const int filter_size_w = 0;
layout(constant_id = 5) const int lpadding = 0;
layout(constant_id = 6) const int rpadding = 0;
layout(constant_id = 7) const int tpadding = 0;
layout(constant_id = 8) const int bpadding = 0;
layout(constant_id = 9) const int stride_x = 0;
layout(constant_id = 10) const int stride_y = 0;
layout(constant_id = 11) const int stride_z = 0;
layout(constant_id = 12) const int stride_w = 0;

Slide 91

Slide 91 text

};
// filter size
layout(constant_id = 1) const int filter_size_x = 0;
layout(constant_id = 2) const int filter_size_y = 0;
layout(constant_id = 3) const int filter_size_z = 0;
layout(constant_id = 4) const int filter_size_w = 0;
// padding
layout(constant_id = 5) const int lpadding = 0;
layout(constant_id = 6) const int rpadding = 0;
layout(constant_id = 7) const int tpadding = 0;
layout(constant_id = 8) const int bpadding = 0;
// stride
layout(constant_id = 9) const int stride_x = 0;
layout(constant_id = 10) const int stride_y = 0;
layout(constant_id = 11) const int stride_z = 0;
layout(constant_id = 12) const int stride_w = 0;
// input size
layout(constant_id = 13) const int input_dim_x = 0;
layout(constant_id = 14) const int input_dim_y = 0;
layout(constant_id = 15) const int input_dim_z = 0;
layout(constant_id = 16) const int input_dim_w = 0;
// output size
layout(constant_id = 17) const int output_dim_x = 0;
layout(constant_id = 18) const int output_dim_y = 0;
layout(constant_id = 19) const int output_dim_z = 0;
layout(constant_id = 20) const int output_dim_w = 0;
// value used outside the input range
layout(constant_id = 21) const float border_value = 0.0;
int get_filter_length() {
  return filter_size_x * filter_size_y * filter_size_z * filter_size_w;
}
int get_input_offset( int i ) {

Slide 92

Slide 92 text

}
float get_input( int i ) {
  const int input_offset = get_input_offset( i );
  return ( input_offset < 0 ) ? border_value : input_data[ input_offset ];
}
void set_output( float v ) {
  const int output_offset = get_output_offset();
  if( output_offset >= 0 ) {
    output_data[ output_offset ] = v;
  }
}
void main() {
  const int filter_length = get_filter_length();
  float v = -10000.0;
  for( int i = 0; i != filter_length; i++ ) {
    const int input_offset = get_input_offset( i );
    if( input_offset != -1 ) {
      v = max( input_data[ input_offset ], v );
    }
  }
  set_output( v );
}
Write the largest value inside the filter window to the output.
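As a point of comparison, the 2x2, stride-2 pooling that the VGG graph uses can be written on the CPU as follows. This is a sketch restricted to one channel with even width and height; `max_pool_2x2` is a hypothetical name. It starts from the lowest representable float, whereas the shader above starts from -10000.0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// 2x2 max pooling with stride 2 over a single row-major channel.
std::vector<float> max_pool_2x2(const std::vector<float>& in, int w, int h) {
  assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
  assert(in.size() == static_cast<std::size_t>(w) * static_cast<std::size_t>(h));
  const int ow = w / 2;
  const int oh = h / 2;
  std::vector<float> out(static_cast<std::size_t>(ow) * static_cast<std::size_t>(oh));
  for (int oy = 0; oy < oh; ++oy) {
    for (int ox = 0; ox < ow; ++ox) {
      float v = std::numeric_limits<float>::lowest();
      for (int dy = 0; dy < 2; ++dy) {
        for (int dx = 0; dx < 2; ++dx) {
          v = std::max(v, in[(oy * 2 + dy) * w + (ox * 2 + dx)]);
        }
      }
      out[oy * ow + ox] = v;
    }
  }
  return out;
}
```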

Slide 93

Slide 93 text

relu
output = max( input, 0 )
#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable
layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;
layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer output_vector { float output_data[]; }; // output
layout(constant_id = 1) const uint input_length = 32;
void main() {
  const uint index =
    gl_GlobalInvocationID.x +
    gl_NumWorkGroups.x * gl_WorkGroupSize.x * gl_GlobalInvocationID.y +
    gl_NumWorkGroups.x * gl_WorkGroupSize.x * gl_NumWorkGroups.y * gl_WorkGroupSize.y * gl_GlobalInvocationID.z;
  if( index >= input_length ) return;
  float v = input_data[ index ];
  output_data[ index ] = ( v >= 0.0 ) ? v : 0.0;
}

Slide 94

Slide 94 text

softmax
#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable
layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;
layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer output_vector { float output_data[]; }; // output
layout(constant_id = 1) const uint input_length = 32;
shared float[32] temp;
void main() {
  const uint offset = gl_LocalInvocationID.x;
  const uint batch = gl_GlobalInvocationID.z;
  float max = 0.0;
  for( uint i = offset; i < input_length; i += 1024 ) {

Slide 95

Slide 95 text

softmax needs the sum of exp over all input values, so every value is handled inside the 1024 threads that a single block of shared memory can reach.
First, find the maximum of the input.
  const uint offset = gl_LocalInvocationID.x;
  const uint batch = gl_GlobalInvocationID.z;
  float max = 0.0;
  for( uint i = offset; i < input_length; i += 1024 ) {
    float v = input_data[ i + input_length * batch ];
    if( v > max ) {
      max = v;
    }
  }
  const float smax = subgroupMax( max );
  if( gl_SubgroupInvocationID == 0 ) {
    temp[ gl_SubgroupID ] = smax;
  }
  barrier();
  const float gmax = subgroupMax( temp[ gl_SubgroupInvocationID ] );
  float sum = 0.0;
  for( uint i = offset; i < input_length; i += 1024 ) {
    sum += exp( input_data[ i + input_length * batch ] - gmax );
  }
  const float ssum = subgroupAdd( sum );

Slide 96

Slide 96 text

Compute the sum of exp of the input, then divide exp of each input by that sum.
  const float smax = subgroupMax( max );
  if( gl_SubgroupInvocationID == 0 ) {
    temp[ gl_SubgroupID ] = smax;
  }
  barrier();
  const float gmax = subgroupMax( temp[ gl_SubgroupInvocationID ] );
  float sum = 0.0;
  for( uint i = offset; i < input_length; i += 1024 ) {
    sum += exp( input_data[ i + input_length * batch ] - gmax );
  }
  const float ssum = subgroupAdd( sum );
  if( gl_SubgroupInvocationID == 0 ) {
    temp[ gl_SubgroupID ] = ssum;
  }
  barrier();
  const float gsum = subgroupAdd( temp[ gl_SubgroupInvocationID ] );
  for( uint i = offset; i < input_length; i += 1024 ) {
    output_data[ i + input_length * batch ] = exp( input_data[ i + input_length * batch ] - gmax ) / gsum;
  }
}
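The three passes of the shader (maximum, sum of exponentials, normalization) correspond to the usual numerically stable softmax. A minimal sequential sketch follows; the function name `softmax` and its signature are chosen here for illustration. It takes the true maximum of the input, while the shader seeds its maximum with 0.0.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Numerically stable softmax over one vector:
// subtracting the maximum before exp() avoids overflow
// without changing the result.
std::vector<float> softmax(const std::vector<float>& x) {
  assert(!x.empty());
  const float gmax = *std::max_element(x.begin(), x.end());
  float gsum = 0.0f;
  for (float v : x) {
    gsum += std::exp(v - gmax);
  }
  std::vector<float> out;
  out.reserve(x.size());
  for (float v : x) {
    out.push_back(std::exp(v - gmax) / gsum);
  }
  return out;
}
```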

Slide 97

Slide 97 text

https://github.com/KhronosGroup/NNEF-Tools
This is the official NNEF library.
NNEF-Tools includes an NNEF parser.
Both a Python implementation and a C++ implementation are provided.

Slide 98

Slide 98 text

gct/src/gct/dnn/graph.cpp
Load the NNEF model with the NNEF-Tools parser.
graph::graph(
  const std::shared_ptr< device_t > &device,
  const std::shared_ptr< allocator_t > &allocator,
  const std::shared_ptr< descriptor_pool_t > &descriptor_pool,
  const std::shared_ptr< pipeline_cache_t > &pipeline_cache,
  const std::filesystem::path &dir,
  const std::filesystem::path &shader_dir,
  command_buffer_recorder_t &rec
) {
  nnef::Graph parsed;
  std::string error;
  if( !nnef::load_graph( dir.string(), parsed, error, "" ) ) {
    std::cerr << error << std::endl;
    throw -1;
  }
  if( !nnef::infer_shapes( parsed, error ) ) {
    std::cerr << error << std::endl;
    throw -1;
  }
  if( !nnef::allocate_buffers( parsed, error ) ) {
    std::cerr << error << std::endl;
    throw -1;
  }

Slide 99

Slide 99 text

gct/src/gct/dnn/graph.cpp
Send the contents of the required files to GPU memory.
  for( const auto &o: parsed.operations ) {
    if( o.name == "variable" ) {
      const std::string name = get_output_name( o );
      const auto label = std::find_if(
        o.attribs.begin(), o.attribs.end(),
        []( const auto &v ) { return v.first == "label"; }
      );
      if( label == o.attribs.end() ) {
        throw -1;
      }
      const auto data_name = label->second.string();
      const auto data_filename = dir / ( data_name + ".dat" );
      auto nnef_data = rec.load_nnef_data(
        allocator,
        std::filesystem::absolute( data_filename ),
        vk::BufferUsageFlagBits::eStorageBuffer|
        vk::BufferUsageFlagBits::eTransferDst
      );
      bufs.insert( std::make_pair( name, nnef_data ) );
    }

Slide 100

Slide 100 text

gct/src/gct/dnn/graph.cpp
Create the pipeline for each layer and bind the buffers to its descriptor set.
  for( const auto &o: parsed.operations ) {
    if( o.name == "conv" ) {
      const std::string name = get_output_name( o );
      auto op = std::make_shared< operation::convolution >(
        allocator, descriptor_pool, pipeline_cache, o, shaders, bufs
      );
      bufs.insert( std::make_pair( name, op->get_output() ) );
      ops.push_back( op );
    }
    else if( o.name == "linear" ) {
      const std::string name = get_output_name( o );
      auto op = std::make_shared< operation::linear >(
        allocator, descriptor_pool, pipeline_cache, o, shaders, bufs

Slide 101

Slide 101 text

gct/src/gct/dnn/convolution.cpp
Record the required memory barrier and the compute pipeline dispatch into the command buffer.
void convolution::operator()( command_buffer_recorder_t &rec ) {
  rec.compute_barrier( { input.buffer }, {} );
  rec.bind_descriptor_set(
    vk::PipelineBindPoint::eCompute,
    pipeline_layout,
    descriptor_set
  );
  rec.bind_pipeline( pipeline );
  rec.dispatch_threads( exec_dim[ 0 ], exec_dim[ 1 ], exec_dim[ 2 ] );
}

Slide 102

Slide 102 text

gct/src/gct/dnn/load_image.cpp
AlexNet-compatible preprocessing of the input image:
reorder the channels to BGR, then subtract the per-channel mean of the ImageNet training data.
Many image models, including VGG, use this preprocessing.
  std::vector< std::uint8_t > temp( dest.buffer->get_props().get_basic().size, 0u );
  std::unordered_map< std::string, int > channel_order{ { "R", 2 }, { "G", 1 }, { "B", 0 } };
  for( int c = 0; c != spec.nchannels; ++c ) {
    const auto order = channel_order.find( spec.channelnames[ c ] );
    if( order != channel_order.end() ) {
      file->read_image(
        c, c + 1u, type,
        reinterpret_cast< std::uint8_t* >(
          std::next( temp.data(), spec.width * spec.height * dest.type.depth/8u * order->second )
        )
      );
    }
  }
  constexpr std::array< float, 3u > mean{ 123.68f, 116.779f, 103.939f };
  for( int c = 0; c != spec.nchannels; ++c ) {
    for( unsigned int y = 0; y != spec.height; ++y ) {
      for( unsigned int x = 0; x != spec.width; ++x ) {
        const auto index = x + y * spec.width + c * spec.width * spec.height;
        reinterpret_cast< float* >( temp.data() )[ index ] =
          reinterpret_cast< float* >( temp.data() )[ index ] * 255.0f - mean[ c ];
      }
    }
  }
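The same preprocessing can be sketched without any image library. The version below assumes the image is already in memory as interleaved RGB floats in the range [0, 1]; the function name `to_planar_bgr` and the `mean_per_plane` parameter are hypothetical, and the caller decides which mean belongs to which output plane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// rgb: interleaved R,G,B values in [0, 1], size = width * height * 3.
// Returns three planes in B, G, R order, scaled to [0, 255],
// with mean_per_plane[plane] subtracted from every value of that plane.
std::vector<float> to_planar_bgr(const std::vector<float>& rgb,
                                 std::size_t width, std::size_t height,
                                 const std::array<float, 3>& mean_per_plane) {
  const std::size_t pixels = width * height;
  assert(rgb.size() == pixels * 3);
  std::vector<float> out(pixels * 3);
  for (std::size_t p = 0; p < pixels; ++p) {
    for (std::size_t plane = 0; plane < 3; ++plane) {
      // Output plane 0 is B (source channel 2), plane 2 is R (source channel 0).
      const std::size_t src_channel = 2 - plane;
      out[plane * pixels + p] =
          rgb[p * 3 + src_channel] * 255.0f - mean_per_plane[plane];
    }
  }
  return out;
}
```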

Slide 103

Slide 103 text

The evaluation is run on a single image, so the batch size is changed from 10 to 1.
--- /home/fadis/vgg16.orig/graph.nnef 2019-05-21 21:24:49.000000000 +0900
+++ /home/fadis/vgg16/graph.nnef 2023-07-17 20:32:06.938232809 +0900
@@ -34,7 +34,7 @@
 variable_7 = variable(label = 'conv2_2_blob2', shape = [1, 128]);
 variable = variable(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
 variable_3 = variable(label = 'conv1_2_blob2', shape = [1, 64]);
- data = external(shape = [10, 3, 224, 224]);
+ data = external(shape = [1, 3, 224, 224]);
 conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
 relu = relu(conv);
 conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
@@ -66,7 +66,7 @@
 conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
 relu_12 = relu(conv_12);
 max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
- reshape = reshape(max_pool_4, shape = [10, -1]);
+ reshape = reshape(max_pool_4, shape = [1, -1]);
 linear = linear(reshape, variable_26, variable_27);
 relu_13 = relu(linear);
 linear_1 = linear(relu_13, variable_28, variable_29);

Slide 104

Slide 104 text

https://github.com/KhronosGroup/NNEF-Tools/tree/main/models#nnef-model-zoo
This model has been trained to take an image as input and return an ID that indicates what the image shows.

Slide 105

Slide 105 text

https://www.kaggle.com/c/imagenet-object-localization-challenge/data?select=LOC_synset_mapping.txt
Fetch the table that maps each ID to its meaning from the ImageNet distributor.

Slide 106

Slide 106 text

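Build procedure for carbonara
Put egg yolks, grated cheese, and black pepper in a bowl and mix until the egg is uniform.
Put water and salt in a pot, bring it to a boil, and cook the pasta for the time printed on the bag.
Add bacon lightly fried in a frying pan to the bowl.
Drain the contents of the pot in a colander, then move the contents of the colander to the bowl.
Toss well until the grated cheese melts from the residual heat.
Result: a 224 × 224 pixel image of carbonara.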

Slide 107

Slide 107 text

$ dnn_eval -m ~/vgg16 -i ~/002.jpg -l ~/LOC_synset_mapping.txt
959 carbonara 0.997535
923 plate 0.00213662
940 spaghetti squash 0.000186059
937 broccoli 2.2913e-05
762 restaurant, eating house, eating place, eatery 1.6662e-05
809 soup bowl 1.57591e-05
935 mashed potato 1.37334e-05
962 meat loaf, meatloaf 1.27348e-05
934 hotdog, hot dog, red hot 1.23476e-05
925 consomme 6.84554e-06
Probability that this object is carbonara: 99.75%

Slide 108

Slide 108 text

$ dnn_eval -m ~/vgg16 -i ~/001.jpg -l ~/LOC_synset_mapping.txt
951 lemon 0.986053
950 orange 0.0096886
961 dough 0.001014
954 banana 0.00058848
928 ice cream, icecream 0.000531914
953 pineapple, ananas 0.000395049
949 strawberry 0.000154118
952 fig 0.000151808
940 spaghetti squash 0.000140983
948 Granny Smith 0.000132113
Probability that this object is a lemon: 98.61%

Slide 109

Slide 109 text

$ dnn_eval -m ~/vgg16 -i ~/003.jpg -l ~/LOC_synset_mapping.txt
784 screwdriver 0.937934
845 syringe 0.0574703
696 paintbrush 0.000626332
418 ballpoint, ballpoint pen, ballpen, Biro 0.000506295
840 swab, swob, mop 0.000496733
629 lipstick, lip rouge 0.000402968
749 quill, quill pen 0.000344308
731 plunger, plumber's helper 0.000300347
813 spatula 0.000254442
623 letter opener, paper knife, paperknife 0.000160844
Probability that this object is a screwdriver: 93.79%
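Listings like the ones above come from ranking the softmax output by probability. How dnn_eval does this internally is not shown in the slides; the following is only a sketch of one way to pick the k most probable class IDs, with `top_k` as a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the indices of the k largest probabilities, most probable first.
std::vector<std::size_t> top_k(const std::vector<float>& prob, std::size_t k) {
  assert(k <= prob.size());
  std::vector<std::size_t> ids(prob.size());
  std::iota(ids.begin(), ids.end(), std::size_t{0});
  // Only the first k positions need to be fully ordered.
  std::partial_sort(ids.begin(),
                    ids.begin() + static_cast<std::ptrdiff_t>(k),
                    ids.end(),
                    [&prob](std::size_t a, std::size_t b) {
                      return prob[a] > prob[b];
                    });
  ids.resize(k);
  return ids;
}
```

Each returned index would then be looked up in the ID-to-name table (LOC_synset_mapping.txt in the commands above) to print a label next to its probability.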

Slide 110

Slide 110 text

Summary
NNEF is a file format for exporting trained neural networks.
The graph definition is plain text, so humans can read it directly.