Slide 1

Slide 1 text

Contents

Chapter 1  Introduction
Chapter 2  Related Work
  2.1  Perceptron
  2.2  Multilayer Neural Networks
  2.3  CNN (Convolutional Neural Network)
  2.4  Attention (attention mechanism)
  2.5  GAN (Generative Adversarial Network)
    2.5.1  DCGAN (Deep Convolutional Generative Adversarial Network)
    2.5.2  cGAN (Conditional Generative Adversarial Network)
  2.6  Image to Image Translation
    2.6.1  pix2pix
    2.6.2  CycleGAN
  2.7  Attention-Guided Image-to-Image Translation
    2.7.1  Attention-Guided Generator Scheme I
    2.7.2  Attention-Guided Generator Scheme II
  2.8  Rotation Angle Prediction Task (Self-Supervised Task)
Chapter 3  Proposed Method
  3.1  Network Structure

Slide 2

Slide 2 text

  3.2  Loss Functions
    3.2.1  Adversarial Loss
    3.2.2  Cycle Consistency Loss
    3.2.3  Final Loss Function
Chapter 4  Experiments
  4.1  Parameter Settings
  4.2  Datasets
  4.3  Evaluation Method
    4.3.1  Fréchet Inception Distance (FID)
Chapter 5  Experimental Results and Discussion
  5.1  Experimental Results
    5.1.1  Quantitative Evaluation
    5.1.2  Qualitative Evaluation
  5.2  Discussion
    5.2.1  Quantitative Evaluation
    5.2.2  Qualitative Evaluation
    5.2.3  Comparison of Quantitative and Qualitative Evaluation
Chapter 6  Conclusion
Acknowledgments
References

Slide 3

Slide 3 text

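Chapter 1  Introduction

Research on artificial intelligence (AI) began in the 1950s and has advanced remarkably up to the present day. We are now in the midst of the third AI boom, in which deep learning centered on deep neural networks (DNNs) has come to be widely used in research [1]. Deep learning has made it possible to train generative models such as VAEs and GANs.

The GAN (Generative Adversarial Network), one of these generative models, was developed by Ian J. Goodfellow et al. in 2014 and has since been applied in many ways [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]. One application is CycleGAN, which performs image translations such as turning horses in an image into zebras, maps like Google Maps into satellite photos, and photos into painting-style images [7]. Image translation with CycleGAN is unsupervised learning on two datasets. Its advantage is that it does not run short of data the way pix2pix [6], a supervised image translation method, can.

As an extension of CycleGAN, Attention-Guided GAN, which uses Attention in the Generator, was developed [8] [9]. Attention-Guided GAN extracts the part of the image to focus on as the Attention and translates only that region [8] [9]. However, when this part of the image is extracted, regions unrelated to the part of interest are sometimes translated by mistake.

Against this background, this study aims to extract part of an image accurately and map it so that it resembles the reference images. To that end we propose SSAttention-Guided GAN, which adds a rotation angle prediction task (Self-Supervised Task) to the Discriminator of Attention-Guided GAN. Adding the rotation angle prediction task lets the Discriminator capture the geometric features of an image through rotation invariance, with the aim of improving its discrimination accuracy. Moreover,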

Slide 4

Slide 4 text

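because a GAN pits its Discriminator and Generator against each other, we hypothesized that a more accurate Discriminator would also influence the training of the Generator, so that the Generator could produce images in which only the part of interest is translated and the background is left unaffected. The experiments in this thesis test that hypothesis.

Chapter 2 reviews related work on GANs and image translation. Chapter 3 proposes SSAttention-Guided GAN, which adds a rotation angle prediction task to Attention-Guided GAN. Chapter 4 describes the parameter settings, datasets, and evaluation method used in the experiments. Chapter 5 presents the experimental results and discussion. Chapter 6 concludes and discusses future work.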

Slide 5

Slide 5 text

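Chapter 2  Related Work

Section 2.1 introduces the perceptron and Section 2.2 the basics of neural networks. The remaining sections review related work on GANs, the subject of this study.

2.1 Perceptron

The perceptron is an algorithm devised by Frank Rosenblatt in 1957 that expresses part of the brain's neural circuitry as a mathematical formula [12]. A perceptron receives several signals as input and outputs a single signal. As an example, Figure 2.1 shows the structure of a perceptron that receives two signals x_1, x_2 as input.

Each circle in Figure 2.1 is called a neuron or node. Let x_1, x_2 be the input signals, y the output signal, and w_1, w_2 the weights. With threshold \theta, the output y is computed as in equation (2.1) [12].

\[ y = \begin{cases} 1 & (w_1 x_1 + w_2 x_2 > \theta) \\ 0 & (w_1 x_1 + w_2 x_2 \le \theta) \end{cases} \tag{2.1} \]

A bias b = -\theta is sometimes introduced in place of the threshold \theta. The bias may also be treated as one more neuron, giving the structure in Figure 2.2. Equation (2.2) shows the output y of a perceptron with a bias [12].

\[ y = \begin{cases} 1 & (w_1 x_1 + w_2 x_2 + b > 0) \\ 0 & (w_1 x_1 + w_2 x_2 + b \le 0) \end{cases} \tag{2.2} \]

In equation (2.2), the output is 1 if the weighted sum w_1 x_1 + w_2 x_2 of the inputs plus the bias term b is positive, and 0 otherwise. The output y can also be computed with an activation function h(\cdot), as shown in equation (2.3) [12]. To make equation (2.3) give the same output as equation (2.2),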

Slide 6

Slide 6 text

Figure 2.1  Example of a perceptron that receives two signals as input
Figure 2.2  Perceptron with a bias

Slide 7

Slide 7 text

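the step function shown in equation (2.4) is used [12].

\[ y = h(w_1 x_1 + w_2 x_2 + b) \tag{2.3} \]

\[ h(u) = \begin{cases} 1 & (u > 0) \\ 0 & (u \le 0) \end{cases} \tag{2.4} \]

2.2 Multilayer Neural Networks

A multilayer neural network is a mathematical model that imitates part of the brain's neural circuitry. It is built by connecting perceptrons to one another and stacking them in layers [12]. Figure 2.3 shows an example [13]. The left column of Figure 2.3 is called the input layer, the middle column the intermediate (hidden) layer, and the right column the output layer. Figure 2.3 shows the N (N > 0) input variables x_1, ..., x_i, ..., x_N and the bias x_0; the M (M > 0) output variables y_1, ..., y_j, ..., y_M; and the L (L > 0) hidden variables z_1, ..., z_k, ..., z_L and the bias z_0.

Figure 2.3  Example of a neural network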
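The perceptron computation of Section 2.1, equations (2.2) to (2.4), can be sketched in a few lines of Python. This is only a minimal illustration: the function names and the AND-gate weights below are assumptions chosen for the example and do not come from the thesis.

```python
def step(u):
    """Step activation h(u) of equation (2.4): 1 if u > 0, else 0."""
    return 1 if u > 0 else 0


def perceptron(x1, x2, w1, w2, b):
    """Two-input perceptron y = h(w1*x1 + w2*x2 + b), equation (2.3)."""
    return step(w1 * x1 + w2 * x2 + b)


# Hypothetical weights and bias that make the perceptron act as an AND gate.
AND_W1, AND_W2, AND_B = 0.5, 0.5, -0.7

if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, perceptron(x1, x2, AND_W1, AND_W2, AND_B))
```

With these illustrative weights the output is 1 only when both inputs are 1, because 0.5 + 0.5 - 0.7 is the only weighted sum that is positive.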

Slide 8

Slide 8 text

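This section explains the computation in a neural network. First, at the input layer, the weighted sum of the inputs x_1, ..., x_N and the bias x_0 with their corresponding weights w^{(1)}_{ki} is computed, and the result of passing it through the activation function h becomes the value of a hidden variable. Choices for the activation function h include the sigmoid function of equation (2.5) and the ReLU (Rectified Linear Unit) of equation (2.6) [12].

\[ h(u_k) = \frac{1}{1 + \exp(-u_k)} \tag{2.5} \]

\[ h(u_k) = \max(u_k, 0) \tag{2.6} \]

Equation (2.7) gives the computation from the input layer to the intermediate layer. Here the bias is x_0 = 1 and its weight is w^{(1)}_{k0}.

\[ z_k = h\left( \sum_{i=1}^{N} w^{(1)}_{ki} x_i + w^{(1)}_{k0} \right) \tag{2.7} \]

The output y_j is computed in the same manner as equation (2.3). The activation function \sigma used from the intermediate layer to the output layer depends on the problem being solved: the identity function of equation (2.8) or the softmax function of equation (2.9) is used [12].

\[ \sigma(v_j) = v_j \tag{2.8} \]

\[ \sigma(v_j) = \frac{\exp(v_j)}{\sum_{m=1}^{M} \exp(v_m)} \tag{2.9} \]

Equation (2.10) gives the computation from the intermediate layer to the output layer. Here the bias is z_0 = 1 and its weight is w^{(2)}_{j0}.

\[ y_j = \sigma\left( \sum_{k=1}^{L} w^{(2)}_{jk} z_k + w^{(2)}_{j0} \right) \tag{2.10} \]

Equation (2.11) gives the complete computation of the network in Figure 2.3. A computation carried out in the order input layer → intermediate layer → output layer, as in Figure 2.3 and equation (2.11), is called forward propagation [12].

\[ y_j = \sigma\left( \sum_{k=1}^{L} w^{(2)}_{jk}\, h\left( \sum_{i=1}^{N} w^{(1)}_{ki} x_i + w^{(1)}_{k0} \right) + w^{(2)}_{j0} \right) \tag{2.11} \]

The data used with a neural network is mainly of two kinds, training data and test data. Training data is used first, on its own, to train the neural network,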

Slide 9

Slide 9 text

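that is, to adjust its weights. Test data is used to evaluate the output of the neural network. The evaluation generally uses, as its index, the size of the error between the reference values t_n (n = 1, ..., N) and the network outputs y_n (n = 1, ..., N) for the corresponding inputs; this index is called the loss [12]. Examples of loss functions are the sum-of-squares error of equation (2.12) and the cross entropy of equation (2.13) [12]. In equations (2.12) and (2.13), E denotes the loss function.

\[ E = \frac{1}{N} \sum_{n=1}^{N} (y_n - t_n)^2 \tag{2.12} \]

\[ E = -\frac{1}{N} \sum_{n=1}^{N} t_n \log y_n \tag{2.13} \]

In training, the weights are updated so that the loss function approaches its minimum [12]. Gradient methods are used to adjust the weights [12].

2.3 CNN (Convolutional Neural Network)

A CNN is a network that introduces convolutional layers into the neural network structure and is used mainly for image recognition [12]. The multilayer neural network of Section 2.2 consists of fully connected layers and activation functions. A CNN, in contrast, is built from convolutional layers and pooling layers. A convolutional layer can capture features of the data that depend on spatial information, for example the similarities and differences between pixels in an image or the relationships among the RGB channels [12]. Figure 2.4 shows an example CNN configuration.

A convolutional layer applies a convolution operation between the input data and weights called a kernel (filter). In the illustration of Figure 2.5, convolving 4x4 input data with a 3x3 kernel outputs 2x2 data.

The convolution operation of Figure 2.5 proceeds as follows. First, as in Figure 2.6, a 3x3 region is taken from the top-left of the 4x4 input data, and each element of the 3x3 region is multiplied by the corresponding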

Slide 10

Slide 10 text

Figure 2.4  Example configuration of a CNN (convolutional neural network)
Figure 2.5  Illustration of the convolution operation in a convolutional layer

Slide 11

Slide 11 text

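element of the filter, and the products are summed. Next, as in Figure 2.7, the 3x3 region of the input data is shifted one pixel to the right and the same multiply-and-sum with the filter is performed. Then, as in Figure 2.8, the 3x3 region is shifted one pixel down and the operation is repeated. Finally, as in Figure 2.9, the 3x3 region is shifted one pixel to the right and the operation is performed once more.

A pooling layer, on the other hand, reduces the spatial size of the data in the vertical and horizontal directions. In the example of Figure 2.10, the 4x4 input data is divided into four 2x2 regions and the maximum of each region is collected into 2x2 data. Max pooling and average pooling are the main operations used in pooling layers [12]. A pooling layer has no learned parameters and is usually unaffected by small positional shifts in the input data [12].

2.4 Attention (attention mechanism)

Attention is a network that learns to focus on a specific part of an image or a text and to capture its features [14]. When a person looks at an image, for example, they do not look at every part of it equally. They focus on one part of the image and recognize it as an object. Attention applies this human characteristic to machine learning.

Figure 2.11 shows how Attention works. A Query, the feature to focus on, is taken, and feature vectors called Key and Value are extracted from the original image data. The element-wise products of the Query and the Key (their similarity) are used as weights on the Value, emphasizing the part of the Value to focus on [14].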

Slide 12

Slide 12 text

Figure 2.6  Convolution of the top-left 3x3 region of the input data with the filter
Figure 2.7  Convolution of the 3x3 region shifted one pixel to the right with the filter

Slide 13

Slide 13 text

Figure 2.8  Convolution of the 3x3 region shifted one pixel down with the filter
Figure 2.9  Convolution of the 3x3 region shifted one pixel right and one pixel down with the filter
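The sliding-window arithmetic of Section 2.3 (a 3x3 kernel moved over a 4x4 input to give a 2x2 output, and 2x2 max pooling) can be checked with a short sketch. The input and kernel values below are hypothetical, chosen only to make the arithmetic easy to follow; they are not the values shown in the thesis figures. As in the text, the "convolution" here multiplies each window element by the corresponding kernel element without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    return the multiply-and-sum result at each window position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


def max_pool_2x2(image):
    """Split the image into non-overlapping 2x2 regions and keep each maximum.
    Assumes the height and width are both even."""
    return [
        [max(image[i][j], image[i][j + 1], image[i + 1][j], image[i + 1][j + 1])
         for j in range(0, len(image[0]), 2)]
        for i in range(0, len(image), 2)
    ]


# Hypothetical 4x4 input and 3x3 kernel.
IMAGE = [[1, 2, 3, 0],
         [0, 1, 2, 3],
         [3, 0, 1, 2],
         [2, 3, 0, 1]]
KERNEL = [[2, 0, 1],
          [0, 1, 2],
          [1, 0, 2]]

if __name__ == "__main__":
    print(conv2d_valid(IMAGE, KERNEL))  # 2x2 output of the convolution
    print(max_pool_2x2(IMAGE))          # 2x2 output of max pooling
```

The four entries of the convolution output correspond to the four window positions in the figures: top-left, one pixel right, one pixel down, and one pixel right and down.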

Slide 14

Slide 14 text

Figure 2.10  Example of max pooling
Figure 2.11  Structure of Attention

Slide 15

Slide 15 text

2.5 GAN (Generative Adversarial Network)

The GAN is a generative-model architecture devised by Ian J. Goodfellow et al. in 2014 [3]. Figure 2.12 shows its structure. A GAN consists of two networks, a Generator and a Discriminator, which are trained adversarially. The Generator takes noise as input and outputs generated data. The Discriminator decides whether the data it receives is data from the training distribution. Because the Generator and the Discriminator are trained adversarially, the loss function V(D, G) of a GAN is given by equation (2.14) [3].

\[ \min_G \max_D V(D, G) = \mathbb{E}_x[\log D(x)] + \mathbb{E}_z[\log(1 - D(G(z)))] \tag{2.14} \]

In equation (2.14), D(x) is the probability that the input data is real, and G(z) is the data produced by the Generator. D takes a large value when the input is judged to be real and a small value when it is judged to be fake. The Discriminator is therefore trained so that both log D(x) and log(1 - D(G(z))) in equation (2.14) become large. The Generator, in order to produce data G(z) close to real data, is trained so that log(1 - D(G(z))) becomes small. The GAN is trained by repeating these two updates adversarially [3].

2.5.1 DCGAN (Deep Convolutional Generative Adversarial Network)

DCGAN uses convolutional layers in both of the networks of a GAN, the Generator and the Discriminator [4]. Unlike the CNN structure of Section 2.3, a DCGAN network consists only of convolutional layers and activation functions.

Figure 2.13 shows an example network. In this example Generator, the noise vector z is fed to the Generator, which gradually

Slide 16

Slide 16 text

Figure 2.12  Structure of a GAN
Figure 2.13  Example network structure of the Generator in DCGAN

Slide 17

Slide 17 text

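reduces the number of channels while upsampling the pixels with convolutions, and outputs a generated image G(z) of the specified size. In the example of Figure 2.13, a 100-dimensional noise vector is given as input and passes through four upsampling steps, 4x4x1024, 8x8x512, 16x16x256, and 32x32x128, to output a 64x64-pixel color image with 3 channels [4].

2.5.2 cGAN (Conditional Generative Adversarial Network)

A cGAN is a model in which a label corresponding to a class is attached to the data given to the Generator and the Discriminator of a GAN [5]. Its structure is shown in Figure 2.14. Compared with the GAN architecture of Figure 2.12, the label y is attached to the training data x, to the generated data G(z|y), and to the latent variable z. The Generator takes the latent variable z and its corresponding label y as input and outputs labeled generated data G(z|y). The Discriminator then discriminates the generated data G(z|y) on the basis of the training data x and the label y.

Figure 2.14  Structure of cGAN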

Slide 18

Slide 18 text

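The loss function of cGAN is given by equation (2.15) [5].

\[ \min_G \max_D V(D, G) = \mathbb{E}_x[\log D(x|y)] + \mathbb{E}_z[\log(1 - D(G(z|y)))] \tag{2.15} \]

As for the GAN, in equation (2.15) D(\cdot) is the output of the Discriminator, G(\cdot) the output of the Generator, and V(D, G) the loss function; y is the corresponding label. As in a GAN, the Discriminator is trained so that log D(x|y) and log(1 - D(G(z|y))) become large, and the Generator so that log(1 - D(G(z|y))) becomes small [5].

2.6 Image to Image Translation

The GAN of Section 2.5 is an image generation model: it generates an image from noise through the Generator. This section describes Image to Image Translation, a technique that applies GAN image generation to translate a source image into a target image. Section 2.6.1 describes pix2pix and Section 2.6.2 describes CycleGAN.

2.6.1 pix2pix

pix2pix is an Image to Image Translation method developed by Phillip Isola et al. [6]. In pix2pix, the source image is an image obtained by extracting the edges of the target image. pix2pix is an application of the cGAN of Section 2.5.2 and performs image translation with the edge-extracted source image as the label. Figure 2.15 shows its structure. In the edges-to-image example of Figure 2.15, the edge image is used as the label and the target image as the real image. The rest is the same as in cGAN: through adversarial training of the Discriminator and the Generator, the Generator learns to produce images that resemble the target images.

The label image need not be the edges of the source image. A Google Maps map or a label image produced by Semantic Segmentation [15] can also be used. From a Google Maps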

Slide 19

Slide 19 text

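map, an aerial photo can be generated, and from a Semantic Segmentation label image, a photo can be generated. Semantic Segmentation is a method that classifies each pixel into a category on the basis of the information in the surrounding pixels [15]. As an example, Figure 2.16 shows results of image translation by pix2pix produced with the Image-to-Image Demo [6]. In Figure 2.16, photo-like images are generated by texture mapping from label images made by extracting the edges of shoe and handbag images, and from a label image of a building facade.

2.6.2 CycleGAN

CycleGAN is an Image to Image Translation method that connects two GANs to translate the images of one data group so that they resemble the images of another data group, in both directions [7]. Figure 2.17 shows an example of the source-image and target-image pairs used by pix2pix, together with an example of the source image group and target image group used by CycleGAN. In the former, target images such as images of shoes and bags

Figure 2.15  Structure of pix2pix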

Slide 20

Slide 20 text

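Figure 2.16  Examples of Image to Image Translation with pix2pix (created with the Image-to-Image Demo [6])

are paired with the edge image of each target image, which makes this supervised learning. Because pix2pix needs one-to-one pairs of a source label and a target image, the dataset is hard to collect and the amount of data required is enormous. In the latter, no label corresponding to each source image is used. The source image group and the target image group as a whole form the pair, which makes this unsupervised learning. CycleGAN thus improves on pix2pix by translating images without requiring paired training data [7].

The loss function of CycleGAN consists mainly of an adversarial loss and a cycle consistency loss [7]. The adversarial loss aims to match the distribution of the generated images to the distribution of the target data group, and the cycle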

Slide 21

Slide 21 text

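consistency loss aims to keep the two Generators from contradicting each other. Equations (2.16) and (2.17) give the adversarial loss L_{GAN}, and equation (2.18) gives the cycle consistency loss L_{cycle}. Here X and Y are the data groups, G is the Generator that translates X → Y, F is the Generator that translates Y → X, and D_X, D_Y are the Discriminators for X and Y. D_X decides whether its input is an image from image data group X, and D_Y decides whether its input is an image from image data group Y.

\[ L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_Y(y)] + \mathbb{E}_x[\log(1 - D_Y(G(x)))] \tag{2.16} \]

\[ L_{GAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_X(x)] + \mathbb{E}_y[\log(1 - D_X(F(y)))] \tag{2.17} \]

\[ L_{cycle}(G, F) = \mathbb{E}_x[\|F(G(x)) - x\|_1] + \mathbb{E}_y[\|G(F(y)) - y\|_1] \tag{2.18} \]

Training on these losses is repeated adversarially, as in an ordinary GAN. Equation (2.18) for the cycle consistency loss L_{cycle} computes the difference between an image x ∈ X and the image obtained by translating x into data group Y as G(x) and then

Figure 2.17  Comparison of the data used by pix2pix and the data used by CycleGAN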

Slide 22

Slide 22 text

translating it back as F(G(x)); the same computation is applied to an image y ∈ Y. The cycle consistency loss therefore adjusts the two Generators so that the input data and the data reconstructed by passing the input once through each Generator become the same [7].

CycleGAN also introduces an identity loss L_{identity} to preserve the color tone of each image group [7]. The identity loss is given by equation (2.19).

\[ L_{identity}(G, F) = \mathbb{E}_x[\|F(x) - x\|_1] + \mathbb{E}_y[\|G(y) - y\|_1] \tag{2.19} \]

The loss function L of CycleGAN is given by equation (2.20), where \lambda_{cycle} and \lambda_{identity} are hyperparameters that balance the loss terms.

\[ L = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + \lambda_{cycle} L_{cycle}(G, F) + \lambda_{identity} L_{identity}(G, F) \tag{2.20} \]

Figure 2.18 shows the CycleGAN network. CycleGAN is an architecture that combines two GAN networks and uses the Generators G, F and the Discriminators D_X, D_Y. Discriminator D_X decides whether its input is a real image x or a generated image F(y), and Discriminator D_Y decides whether its input is a real image y or a generated image G(x).

2.7 Attention-Guided Image-to-Image Translation

Attention-Guided Image-to-Image Translation brings the concept of Attention (the attention mechanism) into Image to Image Translation [8] [9]. Incorporating Attention into each Generator

Slide 23

Slide 23 text

Figure 2.18  The CycleGAN network

emphasizes the object of interest in the image and makes it possible to generate an image in which the parts not of interest, such as the background, are left unaffected.

There are two kinds of Generator architecture for Attention-Guided Image-to-Image Translation: Attention-Guided Generator Scheme I [8] and Attention-Guided Generator Scheme II [9].

2.7.1 Attention-Guided Generator Scheme I

Attention-Guided Image-to-Image Translation with Attention-Guided Generator Scheme I was devised by Hao Tang et al. in 2019 [8]. Figure 2.19 shows the structure of the Scheme I Generator.

Figure 2.19 uses the Bu3dfe dataset [16] to translate between images of a neutral face and images of a face with raised mouth corners. Two Generators, G and F, are used. G translates a neutral face into a face with raised mouth corners, and F translates a face with raised mouth corners

Slide 24

Slide 24 text

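Figure 2.19  Structure of the Generator in Attention-Guided Generator Scheme I

into a neutral face. Attention is attached to each of the Generators G and F, which output the Attention Masks A_y, A_x and the Content Masks C_y, C_x.

The Attention Masks A_y, A_x are masks that emphasize the part of the image to focus on. In Figure 2.19, A_y focuses on the mouth corners of the face in the input image x. In the same way, A_x emphasizes the mouth corners of the face in the image G(x).

The Content Masks C_y, C_x are images rendered on the basis of the features of the target image group. In Figure 2.19, Generator G outputs as Content Mask C_y an image that depicts a mouth with raised corners. Likewise, Generator F outputs as Content Mask C_x an image that depicts a neutral face.

Finally, in the Fusion part of Figure 2.19, the Attention Mask A_y, the Content Mask C_y, and the input image x are combined to perform the image translation. The image G(x) obtained by translating the input image x with the Generator is given by equation (2.21) [8], where \odot denotes the Hadamard product.

\[ G(x) = A_y \odot C_y + (1 - A_y) \odot x \tag{2.21} \]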
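The fusion step of equation (2.21) is a per-pixel blend between the Content Mask and the input image, weighted by the Attention Mask. The following sketch illustrates it on a single-channel image stored as nested lists. The function name `fuse` and the sample values are assumptions made for this illustration; a real implementation would operate on multi-channel tensors.

```python
def fuse(attention, content, image):
    """Equation (2.21): A * C + (1 - A) * x, applied element by element
    (the Hadamard product), for single-channel 2-D lists of equal shape."""
    return [
        [a * c + (1.0 - a) * x for a, c, x in zip(a_row, c_row, x_row)]
        for a_row, c_row, x_row in zip(attention, content, image)
    ]


# Hypothetical 2x2 example: attention is 1 where the content should replace
# the input, 0 where the input (e.g. background) should be kept unchanged.
ATTENTION = [[1.0, 0.0],
             [0.5, 0.0]]
CONTENT = [[0.9, 0.9],
           [0.8, 0.9]]
INPUT = [[0.1, 0.2],
         [0.4, 0.3]]

if __name__ == "__main__":
    for row in fuse(ATTENTION, CONTENT, INPUT):
        print(row)
```

Where the attention value is 0 the input pixel passes through untouched, which is the mechanism by which the background is meant to stay unaffected.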

Slide 25

Slide 25 text

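Likewise, the Attention Mask A_x, the Content Mask C_x, and the input image y are combined to perform the translation F(y) = A_x \odot C_x + (1 - A_x) \odot y.

The loss function of Attention-Guided GAN with Attention-Guided Generator Scheme I consists of an adversarial loss, a cycle consistency loss, an Attention adversarial loss, an Attention loss, and a pixel loss (identity loss) [8]. The adversarial loss and the cycle consistency loss are the same as those used in CycleGAN: the adversarial loss is given by equations (2.22) and (2.23), and the cycle consistency loss by equation (2.24).

\[ L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_Y(y)] + \mathbb{E}_x[\log(1 - D_Y(G(x)))] \tag{2.22} \]

\[ L_{GAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_X(x)] + \mathbb{E}_y[\log(1 - D_X(F(y)))] \tag{2.23} \]

\[ L_{cycle}(G, F) = \mathbb{E}_x[\|F(G(x)) - x\|_1] + \mathbb{E}_y[\|G(F(y)) - y\|_1] \tag{2.24} \]

The pixel loss, like the identity loss of CycleGAN, is intended to preserve the color tone of each image group [8]. It is the Manhattan distance between the image produced by the Generator and the original input image, as in equation (2.25).

\[ L_{pixel}(G, F) = \mathbb{E}_x[\|G(x) - x\|_1] + \mathbb{E}_y[\|F(y) - y\|_1] \tag{2.25} \]

The Attention adversarial loss is the adversarial loss obtained when the Discriminator takes the Attention into account [8]. It replaces the Discriminators D_X, D_Y with the Attention-Guided Discriminators D_{XAttention}, D_{YAttention} and is given by equations (2.26) and (2.27). D_{YAttention} discriminates between the real-image and Attention pair [A_x, y] and the generated-image and Attention pair [A_x, G(x)]. D_{XAttention} discriminates between the real-image and Attention pair [A_y, x] and the generated-image and Attention pair [A_y, F(y)].

\[ L_{AGAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_{YAttention}([A_x, y])] + \mathbb{E}_x[\log(1 - D_{YAttention}([A_x, G(x)]))] \tag{2.26} \]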

Slide 26

Slide 26 text

\[ L_{AGAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_{XAttention}([A_y, x])] + \mathbb{E}_y[\log(1 - D_{XAttention}([A_y, F(y)]))] \tag{2.27} \]

The Attention loss [8] applies Total Variation regularization to the Attention Mask in order to obtain an accurate Attention. The Attention loss L_{tv}(A_x) for the Attention Mask A_x is given by equation (2.28); the same computation is applied to the Attention Mask A_y. W and H are the width and height of the Attention Mask A_x.

\[ L_{tv}(A_x) = \sum_{w,h=1}^{W,H} |A_x(w+1, h, c) - A_x(w, h, c)| + |A_x(w, h+1, c) - A_x(w, h, c)| \tag{2.28} \]

The loss function of Attention-Guided GAN is given by equation (2.29). Here L_{GAN} is the sum of equations (2.22) and (2.23), and L_{AGAN} is the sum of equations (2.26) and (2.27). \lambda_{GAN}, \lambda_{cycle}, \lambda_{pixel}, and \lambda_{tv} are hyperparameters that balance the loss terms.

\[ L = \lambda_{GAN}(L_{GAN} + L_{AGAN}) + \lambda_{cycle} L_{cycle} + \lambda_{pixel} L_{pixel} + \lambda_{tv} L_{tv} \tag{2.29} \]

2.7.2 Attention-Guided Generator Scheme II

In Attention-Guided Generator Scheme I, each Generator produces one Attention Mask and one Content Mask [8]. However, it has been reported that accuracy drops on some datasets when the Scheme I Generator is used [9]. Attention-Guided Generator Scheme II was devised to overcome this accuracy problem. Figure 2.20 shows the structure of the Scheme II Generator.

The translation procedure is explained using Generator G of Figure 2.20 as an example. The input image x is first transformed by the Parameter Sharing Encoder G_E. In Attention-Guided Generator Scheme I, a single

Slide 27

Slide 27 text

Figure 2.20  Structure of the Generator in Attention-Guided Generator Scheme II

Generator produced the Attention Mask and the Content Mask at the same time. In the Scheme II Generator, the output of G_E is fed to a Content Mask Generator G_C and to an Attention Mask Generator G_A, which output the Content Masks and the Attention Masks respectively [9].

The Attention Mask Generator G_A produces Foreground Attention Masks A_y^f, which emphasize only the parts to focus on, and, conversely, a Background Attention Mask A_y^b, which emphasizes parts such as the background that are unrelated to the image translation. With this improvement, the Foreground Attention Masks A_y^f and the Background Attention Mask A_y^b make it possible to translate only the part of interest without affecting the background.

The architectural differences from Attention-Guided Generator Scheme I can be summarized in the following three points.

1 The Generator is shared up to an intermediate stage, and the output of that stage is then split between an Attention Mask Generator and a Content Mask Generator
2 The Attention Mask Generator produces Foreground Attention Masks and a Background Attention Mask
3 The number of Foreground Attention Masks produced by the Attention Mask Generator and of Content

Slide 28

Slide 28 text

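Masks produced by the Content Mask Generator is increased to more than one (the numbers of Foreground Attention Masks and Content Masks are equal)

The image is generated by adding the element-wise products of each pair of Foreground Attention Mask A_y^f and Content Mask C_y^f to the element-wise product of the single Background Attention Mask A_y^b and the input x. If the Attention Mask Generator produces n Attention Masks in total, one of them is the Background Attention Mask A_y^b. The number of Foreground Attention Masks A_y^f is therefore n - 1, and so is the number of Content Masks C_y^f paired with them.

The image G(x) obtained by translating the input image x with the Generator is therefore given by equation (2.30), where \odot denotes the Hadamard product.

\[ G(x) = \sum_{f=1}^{n-1} A_y^f \odot C_y^f + A_y^b \odot x \tag{2.30} \]

For Generator F, with input y, F(y) is likewise given by equation (2.31).

\[ F(y) = \sum_{f=1}^{n-1} A_x^f \odot C_x^f + A_x^b \odot y \tag{2.31} \]

Like CycleGAN, this Attention-Guided GAN uses an adversarial loss, a cycle consistency loss, and an identity loss [9]. The adversarial loss aims to match the distribution of the generated images to the distribution of the target data group, and the cycle consistency loss aims to keep the two Generators from contradicting each other. In addition, an identity loss L_{identity} is introduced to preserve the color tone of each image group [7]. Equations (2.32) and (2.33) give the adversarial loss L_{GAN}, equation (2.34) the cycle consistency loss L_{cycle}, and equation (2.35) the identity loss. Here X and Y are the data groups, G is the Generator that translates X → Y, F is the Generator that translates Y → X, and D_X, D_Y are the Discriminators for X and Y. D_X decides whether its input is an image from image data group X,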

Slide 29

Slide 29 text

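and D_Y decides whether its input is an image from image data group Y.

\[ L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_Y(y)] + \mathbb{E}_x[\log(1 - D_Y(G(x)))] \tag{2.32} \]

\[ L_{GAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_X(x)] + \mathbb{E}_y[\log(1 - D_X(F(y)))] \tag{2.33} \]

\[ L_{cycle}(G, F) = \mathbb{E}_x[\|F(G(x)) - x\|_1] + \mathbb{E}_y[\|G(F(y)) - y\|_1] \tag{2.34} \]

\[ L_{identity}(G, F) = \mathbb{E}_x[\|F(x) - x\|_1] + \mathbb{E}_y[\|G(y) - y\|_1] \tag{2.35} \]

From equations (2.32), (2.33), (2.34), and (2.35), the loss function of Attention-Guided GAN is given by equation (2.36), where \lambda_{cycle} and \lambda_{identity} are hyperparameters that balance the loss terms.

\[ L = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + \lambda_{cycle} L_{cycle}(G, F) + \lambda_{identity} L_{identity}(G, F) \tag{2.36} \]

2.8 Rotation Angle Prediction Task (Self-Supervised Task)

In the rotation angle prediction task, the input image is the original image rotated by some angle, and the task is to infer by how many degrees the input has been rotated from the original [17]. Adding the task of inferring the rotation angle lets the network grasp rotation invariance and capture the geometric features of the image better.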
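The training data for the rotation angle prediction task can be produced without any manual labels: each image is rotated by a known angle and the angle index becomes the label. The sketch below illustrates this for the four angles 0, 90, 180, and 270 degrees used later in this thesis. The helper names and the clockwise rotation direction are assumptions of this example; the source does not specify a direction.

```python
def rotate90(image):
    """Rotate a 2-D list clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_batch(image):
    """Return [(rotated_image, label)] for 0, 90, 180 and 270 degrees.
    The label k means the image was rotated by k * 90 degrees."""
    batch = []
    current = [list(row) for row in image]
    for label in range(4):
        batch.append((current, label))
        current = rotate90(current)
    return batch


if __name__ == "__main__":
    sample = [[1, 2],
              [3, 4]]
    for rotated, label in rotation_batch(sample):
        print(label * 90, rotated)
```

A rotation-prediction head is then trained to recover the label from the rotated image alone, which is what makes the task self-supervised.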

Slide 30

Slide 30 text

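Figure 2.21  Structure of the Discriminator in Self-Supervised GAN

Self-Supervised GAN introduces the rotation angle prediction task into a GAN [10]. Figure 2.21 shows the structure of its Discriminator. In Self-Supervised GAN, the Discriminator is divided into D, which decides whether the input image is a real image, and D_{rot}, which decides by how many degrees the input image has been rotated from the original. D uses the unrotated real images and the unrotated images generated by the Generator. D_{rot} uses the real images and the generated images each rotated by 0°, 90°, 180°, and 270°. Adding the rotation angle prediction task to the Discriminator, as in Figure 2.21, lets it discriminate generated images on the basis of the positional relationships in real images, which raises its discrimination accuracy. Because the Generator is also trained through adversarial learning against the Discriminator, the quality of the generated images can improve as well.

The loss functions L_D and L_G of Self-Supervised GAN are given by equations (2.37) and (2.38). V(D, G) is the GAN loss function of equation (2.14), to which the rotation angle prediction terms L_{rot_D}, for real images, and L_{rot_G}, for generated images, are added.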

Slide 31

Slide 31 text

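\[ L_D = V(D, G) + \lambda_d L_{rot_D} \tag{2.37} \]

\[ L_G = V(D, G) - \lambda_g L_{rot_G} \tag{2.38} \]

The rotation angle prediction term L_{rot_D} for real images and the term L_{rot_G} for generated images are given by equations (2.39) and (2.40). \mathcal{T} denotes the set of rotation angles, \mathcal{T} = \{0°, 90°, 180°, 270°\}. With T ∈ \mathcal{T} a rotation angle, the images obtained by rotating the real images x, y by T are written x^T, y^T, and G(z)^T likewise denotes the generated image G(z) rotated by T.

Self-Supervised GAN introduces D_{rot}, which predicts the rotation angle, into the Discriminator and trains the Discriminator to predict the rotation angle accurately. As the prediction becomes more accurate, the value of D_{rot}(x) increases.

\[ L_{rot_D} = \mathbb{E}_x \mathbb{E}_T \log D_{rot}(x^T) \tag{2.39} \]

\[ L_{rot_G} = \mathbb{E}_z \mathbb{E}_T \log D_{rot}(G(z)^T) \tag{2.40} \]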

Slide 32

Slide 32 text

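Chapter 3  Proposed Method

This study proposes Self-Supervised Attention-Guided GAN (SSAttention-Guided GAN), which adds to the Discriminator of the Attention-Guided GAN of Section 2.7.2 a task that infers the rotation angle of a rotated image. SSAttention-Guided GAN is evaluated by comparison with the Attention-Guided GAN of Section 2.7.2 from prior work.

For image translation between two image groups X and Y, SSAttention-Guided GAN prepares two Generators, G_{X→Y}: X → Y and G_{Y→X}: Y → X. The images x ∈ X and y ∈ Y are treated as real images and discriminated by the Discriminators D_X and D_Y. D_X discriminates between a real image x of image group X and a generated image G_{Y→X}(y), and D_Y discriminates between a real image y of image group Y and a generated image G_{X→Y}(x).

In the rotation angle prediction task, \mathcal{T} denotes the set of rotation angles, \mathcal{T} = \{0°, 90°, 180°, 270°\}. With T ∈ \mathcal{T} a rotation angle, the images obtained by rotating the real images x, y by T are written x^T, y^T, and the images obtained by rotating the generated images G_{X→Y}(x), G_{Y→X}(y) by T are written G^T_{X→Y}(x), G^T_{Y→X}(y).

The Discriminators that predict the rotation angle are denoted D^{rot}_X and D^{rot}_Y. D^{rot}_X infers the rotation angle of the real image x^T and the generated image G^T_{Y→X}(y), and D^{rot}_Y infers the rotation angle of the real image y^T and the generated image G^T_{X→Y}(x).

3.1 Network Structure

The SSAttention-Guided GAN proposed in this study is modeled on the same network structure as the Attention-Guided GAN of Section 2.7.2 [9]. Figure 3.1 shows its structure. In Figure 3.1, a rotation angle prediction task with the same structure as in Figure 2.21 is introduced into each Discriminator of the CycleGAN structure shown in Figure 2.18. Discriminator D_X decides whether its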

Slide 33

Slide 33 text

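input image is a real image x or a generated image G_{Y→X}(y), and also estimates the rotation angle of x^T and G^T_{Y→X}(y). Likewise, Discriminator D_Y decides whether its input image is a real image y or a generated image G_{X→Y}(x), and also estimates the rotation angle of y^T and G^T_{X→Y}(x).

Each Generator uses Attention-Guided Generator Scheme II, which outputs n Attention Masks and n - 1 Content Masks and combines these Masks with the input image to form the output [9]. The Generator network consists of three convolutional layers, nine Residual Blocks [18], and three transposed convolutional layers [7] [9]. The Residual Blocks are built by chaining together the networks known as ResNet, which were introduced to improve the training accuracy of deep neural networks and which add the input to the output obtained by mapping that input through two consecutive convolutional layers [18] [12].

Each Discriminator consists of five convolutional layers and finally outputs a 512-channel tensor [7]. In this study, a 1-channel output and a 4-channel output are derived from the output of this last 512-channel layer. The former is used to decide whether the input image is real, and the latter is used to predict the rotation angle.

3.2 Loss Functions

The Attention-Guided GAN of Section 2.7.2 uses, like CycleGAN, an adversarial loss and a cycle consistency loss. The SSAttention-Guided GAN presented in this study adds a rotation angle prediction loss (Rotation Loss) to each adversarial loss.

3.2.1 Adversarial Loss

The adversarial loss is divided into the adversarial loss of the Discriminator and the adversarial loss of the Generator. The loss incurred by each Discriminator in discrimination is its adversarial loss. The adversarial losses L_{GAN_{D_X}}, L_{GAN_{D_Y}} of the Discriminators D_X, D_Y are given by equations (3.1) and (3.2).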

Slide 34

Slide 34 text

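Figure 3.1  Structure of SSAttention-Guided GAN

In this study, L_{GAN_{D_X}} and L_{GAN_{D_Y}} use the least-squares formulation rather than cross entropy. Using least squares helps keep training stable [11].

\[ L_{GAN_{D_X}} = \frac{1}{2} \mathbb{E}_x[(D_X(x) - 1)^2] + \frac{1}{2} \mathbb{E}_y[(D_X(G_{Y→X}(y)))^2] \tag{3.1} \]

\[ L_{GAN_{D_Y}} = \frac{1}{2} \mathbb{E}_y[(D_Y(y) - 1)^2] + \frac{1}{2} \mathbb{E}_x[(D_Y(G_{X→Y}(x)))^2] \tag{3.2} \]

Next, the adversarial losses L_{GAN_{G_{Y→X}}}, L_{GAN_{G_{X→Y}}} of the Generators G_{Y→X}, G_{X→Y} are defined by equations (3.3) and (3.4).

\[ L_{GAN_{G_{Y→X}}} = \frac{1}{2} \mathbb{E}_y[(D_X(G_{Y→X}(y)) - 1)^2] \tag{3.3} \]

\[ L_{GAN_{G_{X→Y}}} = \frac{1}{2} \mathbb{E}_x[(D_Y(G_{X→Y}(x)) - 1)^2] \tag{3.4} \]

The adversarial loss proposed in this study is obtained by adding a rotation angle prediction loss to each of these adversarial losses.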

Slide 35

Slide 35 text

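Equations (3.5) and (3.6) define the Discriminator losses obtained by adding the rotation angle prediction losses L_{rot_{D_X}}, L_{rot_{D_Y}} to the adversarial losses L_{GAN_{D_X}}, L_{GAN_{D_Y}}. L_{rot_{D_X}} and L_{rot_{D_Y}} are defined by equations (3.7) and (3.8), analogously to equation (2.39) of Section 2.8. To keep training stable, this study uses the squared error instead of cross entropy for L_{rot_{D_X}} and L_{rot_{D_Y}}. Each of them is multiplied by a hyperparameter, \lambda_{D_X} or \lambda_{D_Y}, that balances the loss terms.

\[ L_{D_X} = L_{GAN_{D_X}} + \lambda_{D_X} L_{rot_{D_X}} \tag{3.5} \]

\[ L_{D_Y} = L_{GAN_{D_Y}} + \lambda_{D_Y} L_{rot_{D_Y}} \tag{3.6} \]

\[ L_{rot_{D_X}} = \mathbb{E}_x \mathbb{E}_T[(D^{rot}_X(x^T) - 1)^2] \tag{3.7} \]

\[ L_{rot_{D_Y}} = \mathbb{E}_y \mathbb{E}_T[(D^{rot}_Y(y^T) - 1)^2] \tag{3.8} \]

Equations (3.9) and (3.10) define the Generator losses obtained by adding the rotation angle prediction losses L_{rot_{G_{Y→X}}}, L_{rot_{G_{X→Y}}} to the adversarial losses L_{GAN_{G_{Y→X}}}, L_{GAN_{G_{X→Y}}}. L_{rot_{G_{Y→X}}} and L_{rot_{G_{X→Y}}} are defined by equations (3.11) and (3.12), analogously to equation (2.40) of Section 2.8. Again to keep training stable, the squared error is used instead of cross entropy for L_{rot_{G_{Y→X}}} and L_{rot_{G_{X→Y}}}. Each of them is multiplied by a hyperparameter, \lambda_{G_{Y→X}} or \lambda_{G_{X→Y}}, that balances the loss terms.

\[ L_{G_{Y→X}} = L_{GAN_{G_{Y→X}}} + \lambda_{G_{Y→X}} L_{rot_{G_{Y→X}}} \tag{3.9} \]

\[ L_{G_{X→Y}} = L_{GAN_{G_{X→Y}}} + \lambda_{G_{X→Y}} L_{rot_{G_{X→Y}}} \tag{3.10} \]

\[ L_{rot_{G_{Y→X}}} = \mathbb{E}_y \mathbb{E}_T[(D^{rot}_X(G^T_{Y→X}(y)) - 1)^2] \tag{3.11} \]

\[ L_{rot_{G_{X→Y}}} = \mathbb{E}_x \mathbb{E}_T[(D^{rot}_Y(G^T_{X→Y}(x)) - 1)^2] \tag{3.12} \]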
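The least-squares losses of equations (3.1), (3.3), (3.5), (3.7), (3.9), and (3.11) can be sketched numerically as follows. This is an illustration under stated assumptions, not the thesis implementation: expectations are replaced by means over small lists of scalar Discriminator outputs, and the rotation head is assumed to return a single score for the correct angle (how the 4-channel output is reduced to that score is not specified in the source). All function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean, standing in for the expectation E[.]."""
    return sum(values) / len(values)


def lsgan_d_loss(d_real, d_fake):
    """Equation (3.1): 0.5*E[(D(x)-1)^2] + 0.5*E[D(G(y))^2]."""
    return (0.5 * mean([(d - 1.0) ** 2 for d in d_real])
            + 0.5 * mean([d ** 2 for d in d_fake]))


def lsgan_g_loss(d_fake):
    """Equation (3.3): 0.5*E[(D(G(y))-1)^2]."""
    return 0.5 * mean([(d - 1.0) ** 2 for d in d_fake])


def rotation_loss(rot_scores):
    """Equations (3.7)/(3.11): E[(D_rot(.) - 1)^2], where each score is
    assumed to be the rotation head's output for the true angle."""
    return mean([(s - 1.0) ** 2 for s in rot_scores])


def discriminator_loss(d_real, d_fake, rot_scores_real, lam_d):
    """Equation (3.5): adversarial loss plus weighted rotation loss."""
    return lsgan_d_loss(d_real, d_fake) + lam_d * rotation_loss(rot_scores_real)


def generator_loss(d_fake, rot_scores_fake, lam_g):
    """Equation (3.9): adversarial loss plus weighted rotation loss."""
    return lsgan_g_loss(d_fake) + lam_g * rotation_loss(rot_scores_fake)


if __name__ == "__main__":
    # Hypothetical Discriminator outputs for two real and two generated images.
    print(discriminator_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam_d=0.5))
    print(generator_loss([0.0, 1.0], [1.0, 1.0], lam_g=0.1))
```

Both losses reach zero only when the adversarial scores hit their targets and the rotation scores equal 1, which makes the role of the balancing weights easy to see.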

Slide 36

Slide 36 text

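3.2.2 Cycle Consistency Loss

The cycle consistency loss is used in CycleGAN and Attention-Guided GAN to keep the Generators G_{X→Y} and G_{Y→X} from contradicting each other [7] [8] [9]. This study uses the same cycle consistency loss L_{cycle} as prior work, given by equation (3.13).

\[ L_{cycle} = \mathbb{E}_x[\|G_{Y→X}(G_{X→Y}(x)) - x\|_1] + \mathbb{E}_y[\|G_{X→Y}(G_{Y→X}(y)) - y\|_1] \tag{3.13} \]

3.2.3 Final Loss Function

The loss functions of the Discriminators are those of equations (3.5) and (3.6). The loss function of the Generators combines the Generator adversarial losses with the cycle consistency loss, as in equation (3.14), where \lambda_{cycle} is a hyperparameter that balances the loss terms.

\[ L_G = L_{G_{X→Y}} + L_{G_{Y→X}} + \lambda_{cycle} L_{cycle} \tag{3.14} \]

This study also introduces an identity loss L_{identity}. The identity loss is intended to preserve the color tone of each image group and is used in the same way in CycleGAN and Attention-Guided GAN [7] [9]. It is given by equation (3.15).

\[ L_{identity} = \mathbb{E}_x[\|G_{Y→X}(x) - x\|_1] + \mathbb{E}_y[\|G_{X→Y}(y) - y\|_1] \tag{3.15} \]

Adding the identity loss to equation (3.14) gives the final Generator loss function used in this study, equation (3.16), where \lambda_{identity} is a hyperparameter that balances the loss terms.

\[ L_G = L_{G_{X→Y}} + L_{G_{Y→X}} + \lambda_{cycle} L_{cycle} + \lambda_{identity} L_{identity} \tag{3.16} \]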
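The way equations (3.13), (3.15), and (3.16) fit together can be illustrated with toy Generators. In the sketch below, expectations are replaced by a single sample, images are small nested lists, and the two "Generators" are hypothetical functions that add or subtract 1 from every pixel. None of these names or values come from the thesis; the weights 10.0 and 0.5 are the values of λ_cycle and λ_identity reported in Section 4.1.

```python
def l1_distance(a, b):
    """L1 norm of a - b: the sum of absolute pixel differences."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def cycle_loss(x, y, g_xy, g_yx):
    """Equation (3.13) for one sample pair (x, y)."""
    return (l1_distance(g_yx(g_xy(x)), x)
            + l1_distance(g_xy(g_yx(y)), y))


def identity_loss(x, y, g_xy, g_yx):
    """Equation (3.15) for one sample pair (x, y)."""
    return l1_distance(g_yx(x), x) + l1_distance(g_xy(y), y)


def total_generator_loss(loss_g_xy, loss_g_yx, cyc, idt, lam_cycle, lam_identity):
    """Equation (3.16): weighted sum of all Generator loss terms."""
    return loss_g_xy + loss_g_yx + lam_cycle * cyc + lam_identity * idt


def toy_g_xy(img):
    """Hypothetical X -> Y Generator: brighten every pixel by 1."""
    return [[p + 1 for p in row] for row in img]


def toy_g_yx(img):
    """Hypothetical Y -> X Generator: darken every pixel by 1."""
    return [[p - 1 for p in row] for row in img]


if __name__ == "__main__":
    x, y = [[0, 0]], [[1, 1]]
    cyc = cycle_loss(x, y, toy_g_xy, toy_g_yx)
    idt = identity_loss(x, y, toy_g_xy, toy_g_yx)
    print(cyc, idt, total_generator_loss(0.25, 0.25, cyc, idt, 10.0, 0.5))
```

The toy Generators are exact inverses, so the cycle term vanishes while the identity term does not. This shows that the two terms penalize different behavior.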

Slide 37

Slide 37 text

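Chapter 4  Experiments

This study compares the prior-work Attention-Guided CycleGAN with Attention-Guided Generator Scheme II [9] against the proposed SSAttention-Guided GAN.

4.1 Parameter Settings

The parameters in this study follow those of the prior-work Attention-Guided GAN [9]. The batch size is 4, as in prior work, the number of epochs is 60, the learning rate is 0.0002, and the number of Attention Masks is 10.

For the loss function, \lambda_{cycle} is 10.0 and \lambda_{identity} is 0.5. The rotation angle prediction parameters are set as in Table 4.1, with the ratio of the Discriminator weights \lambda_{D_X}, \lambda_{D_Y} to the Generator weights \lambda_F, \lambda_G fixed at 5 : 1. In prior work, the balance parameter of the rotation loss is set to 1.0 for the Discriminator and 0.2 for the Generator [10].

Table 4.1  Parameters for rotation angle prediction

  Discriminator rotation loss    Generator rotation loss
  λ_DX       λ_DY                λ_F        λ_G
  0.5        0.5                 0.1        0.1
  1.0        1.0                 0.2        0.2
  1.5        1.5                 0.3        0.3
  2.0        2.0                 0.4        0.4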

Slide 38

Slide 38 text

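4.2 Datasets

This study uses the five datasets used in prior work. Table 4.2 summarizes them.

4.3 Evaluation Method

This study carries out a quantitative evaluation with the Fréchet Inception Distance (FID) and a qualitative evaluation of the generated images.

4.3.1 Fréchet Inception Distance (FID)

The Fréchet Inception Distance (FID) is an evaluation method based on the distance between the distributions of two sets of images [19]. That distance is the FID, and the smaller its value, the closer the generated images are to the real images. For a GAN, the FID is obtained by computing the Fréchet distance between the distribution p_{data} of the real-image dataset and the distribution p_g of the generated-image dataset [19]. Equation (4.1) gives the computation, using the mean vector and covariance matrix (\mu_r, C_r) of p_{data} and the mean vector and covariance matrix (\mu_g, C_g) of p_g.

\[ \mathrm{FID}(p_{data}, p_g) = \|\mu_r - \mu_g\|_2^2 + \mathrm{tr}\left( C_r + C_g - 2 (C_r C_g)^{1/2} \right) \tag{4.1} \]

As in prior work, this study uses the images translated on the horse2zebra and apple2orange datasets to evaluate real and generated images quantitatively with FID [9].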

Slide 39

Slide 39 text

Table 4.2  Datasets used

  Dataset name                        Split   Contents    Size
  horse2zebra (horse → zebra)         train   horse       1067
                                      train   zebra       1334
                                      test    horse        120
                                      test    zebra        140
  apple2orange (apple → orange)       train   apple        995
                                      train   orange      1019
                                      test    apple        266
                                      test    orange       248
  facades (building facade → label)   train   facade       400
                                      train   label        400
                                      test    facade       106
                                      test    label        106
  map2photo (map → aerial photo)      train   map         1096
                                      train   photo       1096
                                      test    map         1098
                                      test    photo       1098
  cityscapes (cityscape → label)      train   cityscape   2975
                                      train   label       2975
                                      test    cityscape    500
                                      test    label        500

Slide 40

Slide 40 text

Chapter 5  Experimental Results and Discussion

In this study, the SSAttention-Guided GAN program was implemented on top of the prior-work program for the Attention-Guided GAN with Attention-Guided Generator Scheme II, and the prior work and the proposed method were evaluated.

5.1 Experimental Results

5.1.1 Quantitative Evaluation

Table 5.1 shows the quantitative evaluation of horse2zebra and apple2orange by FID. It lists the FID of horse to zebra (horse → zebra) and apple to orange (apple → orange) for each setting of the rotation parameters. The rotation parameters \lambda_D and \lambda_G stand for \lambda_{D_X}, \lambda_{D_Y} and \lambda_{G_{X→Y}}, \lambda_{G_{Y→X}} respectively. The setting (\lambda_D, \lambda_G) = (0.0, 0.0) is the prior-work Attention-Guided GAN [9].

Table 5.1 shows that for both horse to zebra and apple to orange, the FID at (\lambda_D, \lambda_G) = (0.5, 0.1) and (2.0, 0.4) is smaller than the FID of the prior-work Attention-Guided GAN. At (\lambda_D, \lambda_G) = (1.5, 0.3), the FID of apple to orange is relatively small whereas the FID of horse to zebra is relatively large. At (\lambda_D, \lambda_G) = (1.0, 0.2), the FID is relatively large for both horse to zebra and apple to orange.

Slide 41

Slide 41 text

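Table 5.1  Quantitative evaluation by FID

  Rotation parameters (λ_D, λ_G)   horse to zebra   apple to orange
  (0.0, 0.0)                        79.64            156.88
  (0.5, 0.1)                        54.58            149.88
  (1.0, 0.2)                       117.60            158.46
  (1.5, 0.3)                        83.37            154.71
  (2.0, 0.4)                        66.87            155.53

5.1.2 Qualitative Evaluation

This section describes the generated images and the Attention obtained after image translation on each dataset: horse2zebra, apple2orange, map2photo, facades, and cityscapes.

Tables 5.2 and 5.3 show examples of horse to zebra translation, and Tables 5.4 and 5.5 show examples of apple to orange translation. In the horse2zebra examples, the translated image and the Attention of the conventional Attention-Guided GAN show that the background has also been turned into stripes. With SSAttention-Guided GAN, the stripes on the background decrease in both examples and the Attention takes the shape of the object to be translated. In Table 5.2, at (\lambda_D, \lambda_G) = (0.5, 0.1) and (1.0, 0.2), the striping of the upper half of the background seen with the prior-work Attention-Guided GAN has almost disappeared, while at the larger settings a different kind of artifact appears. Similarly in Table 5.3, after the rotation angle prediction task is introduced, the Attention takes the shape of the horse to be translated and there are places where the translation into a zebra is comparatively successful.

In Table 5.4, which shows translation results for apple2orange, at (\lambda_D, \lambda_G) = (0.5, 0.1) an Attention Mask appears on the rear part of the image that was not seen in the results of the prior-work Attention-Guided GAN, and the apple at the back is translated. On the other hand, as in Table 5.5, there were also cases in which little changed between before and after introducing the rotation angle prediction task.

For the remaining datasets, an example of map to photo translation is shown in Table 5.6, an example of translation from a building facade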

Slide 42

Slide 42 text

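to a label in Table 5.7, and an example of translation from a cityscape to a segmentation in Table 5.8. For map2photo, neither the Attention nor the generated image changed much from prior work at any setting of the rotation parameters. For facades, as Table 5.7 shows in particular, the labels in the generated image at (\lambda_D, \lambda_G) = (1.5, 0.3) take on a three-dimensional appearance. For cityscapes, as in Table 5.8, objects such as cars and buildings are translated into labels comparatively more accurately than with the prior-work Attention-Guided GAN, and each label takes the shape of the car or building. At (\lambda_D, \lambda_G) = (1.0, 0.2), however, three white dots appear in the Attention Mask. They correspond to parts of the generated image that remain unchanged.

Next, the qualitative evaluation of the generated images themselves is described for horse2zebra and apple2orange. Figures 5.1, 5.2, 5.3, 5.4, and 5.5 show sampled generated images of horse2zebra for each setting of the rotation parameters. Comparing these samples, most generated images show less effect on the background than the images translated by the prior-work Attention-Guided GAN. Which setting of the rotation parameters reduced the effect on the background, however, differed from image to image. There were also parts where the effect on the background was larger than with the prior-work Attention-Guided GAN, and parts where the stripes were not applied to the target region.

Figures 5.6, 5.7, 5.8, 5.9, and 5.10 show sampled generated images of apple2orange for each setting of the rotation parameters. Some generated images did not change at all. In general, though, compared with the images generated by the prior-work Attention-Guided GAN, many images at (\lambda_D, \lambda_G) = (0.5, 0.1) and (1.5, 0.3) had the whole target region translated. In addition, red apples were translated while apples of other colors and the insides of apples were distinguished from them.

Sampled generated images of map2photo, facades, and cityscapes are also shown for each setting of the rotation parameters. map2photo is shown in Figures 5.11, 5.12, 5.13, 5.14, and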

Slide 43

Slide 43 text

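First, on map2photo, the colors of the image changed in some places depending on the rotation-angle parameters, but no change was observed in positional relationships. Next, on facades, with (λD, λG) = (1.5, 0.3) many labels took on a three-dimensional appearance resembling the source building facade. With rotation-angle parameters other than (λD, λG) = (1.5, 0.3), the results differed little from those of the prior Attention-Guided GAN. Finally, on cityscapes, particularly with (λD, λG) = (1.0, 0.2) and (2.0, 0.4), there were regions where the blue car labels and the green building labels were reflected more accurately in the generated image. At the same time, the red human-figure regions seen with the prior Attention-Guided GAN decreased comparatively. In addition, with (λD, λG) = (0.5, 0.1) and (1.5, 0.3), the pink regions increased compared with the images generated by the prior Attention-Guided GAN.

5.2 Discussion

5.2.1 Quantitative evaluation

Compared with the images generated by the prior Attention-Guided GAN, the images generated by SSAttention-Guided GAN can improve the FID score for some parameter settings. In Table 5.1 in particular, both horse to zebra and apple to orange reach their lowest value with rotation-angle parameters (λD, λG) = (0.5, 0.1). The rotation-angle prediction task therefore appears to contribute to improving FID, and (λD, λG) = (0.5, 0.1) is considered the best of the rotation-angle parameter settings tested.

5.2.2 Qualitative evaluation

In the qualitative evaluation on horse2zebra, as in Tables 5.2 and 5.3, there were cases where the prior Attention-Guided GAN produced images in which the background was caught up in the translation, whereas the images generated by SSAttention-Guided GAN showed less effect on the background.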

Slide 44

Slide 44 text

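Accordingly, the Attention also takes a shape close to that of the original horse, which suggests that the rotation-angle prediction task allows the network to capture the geometric features and positional relationships of the original horse properly.

However, the translation accuracy of each translated image differed depending on the rotation-angle parameters, which is taken to mean that the appropriate rotation-angle parameters differ from image to image. Indeed, in images where horses overlap or where objects other than horses, such as humans, appear, the stripes were consistently not applied well.

In the qualitative evaluation on apple2orange, with rotation-angle parameters (λD, λG) = (0.5, 0.1) and (1.5, 0.3), the texture translation that turns the color of the apples orange is performed stably and well. In addition, because the translation distinguishes apples of other colors, apple cross-sections, leaves, and so on, the rotation-angle prediction is considered to help capture the features of apples. Some source apples were already close to orange in color, and these showed little change. A possible reason is that, because the network is trained on red apples, it recognizes these as apples of a different color.

As for the other datasets, on map2photo none of the rotation-angle parameter settings produced much change compared with prior work. A likely cause is that the translation preserves the orientation of the source map in the target satellite photo, so there is no need to capture features such as orientation through rotation-angle prediction. On facades, as Table 5.7 shows in particular, many generated images with (λD, λG) = (1.5, 0.3) had labels with a three-dimensional appearance. On cityscapes too, there are places where the car and building labels are applied more accurately than in prior work. These results suggest that introducing the rotation-angle prediction task also helps capture features within the image on the facades and cityscapes datasets.

5.2.3 Comparison of the quantitative and qualitative evaluations

Comparing the quantitative and qualitative evaluations on horse2zebra, there is no correlation between the generated images and the FID score: the accuracy of the generated images varies regardless of the magnitude of the FID score.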

Slide 45

Slide 45 text

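For example, the FID on horse2zebra with rotation-angle parameters (λD, λG) = (2.0, 0.4) improved over prior work, but checking against Table 5.2 reveals cases where regions other than the intended target were newly caught up in the translation. One cause is that the horse2zebra dataset contains images in which humans or grassland appear or in which horses overlap, and for which the translation from horse to zebra did not actually succeed. Training on the humans, grassland, and pairs of overlapping horses in the horse2zebra dataset is thought to affect FID and to produce differences in the accuracy of the generated images.

On apple2orange, by contrast, the rotation-angle parameter settings with better FID scores also translated images better in the qualitative evaluation. For example, the FID with (λD, λG) = (0.5, 0.1), (1.5, 0.3), and (2.0, 0.4) is better than in prior work, and in the qualitative evaluation the translation from red apples to orange ones also worked well. We therefore conclude that on apple2orange FID and the quality of the generated images are largely correlated.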
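The comparison against prior work in this section can be checked directly from Table 5.1. The following is a minimal sketch, not part of the original experiments: it assumes that the (0.0, 0.0) row corresponds to the prior Attention-Guided GAN (no rotation-angle prediction), and the names `FID`, `BASELINE`, `improved_settings`, and `best_setting` are introduced here only for illustration.

```python
# FID scores transcribed from Table 5.1 (lower is better).
FID = {
    (0.0, 0.0): {"horse2zebra": 79.64, "apple2orange": 156.88},
    (0.5, 0.1): {"horse2zebra": 54.58, "apple2orange": 149.88},
    (1.0, 0.2): {"horse2zebra": 117.60, "apple2orange": 158.46},
    (1.5, 0.3): {"horse2zebra": 83.37, "apple2orange": 154.71},
    (2.0, 0.4): {"horse2zebra": 66.87, "apple2orange": 155.53},
}

# Assumption: (lambda_D, lambda_G) = (0.0, 0.0) is the prior-work baseline.
BASELINE = (0.0, 0.0)


def improved_settings(dataset):
    """Return the parameter settings whose FID is lower than the baseline."""
    base = FID[BASELINE][dataset]
    return [p for p, row in FID.items() if p != BASELINE and row[dataset] < base]


def best_setting(dataset):
    """Return the parameter setting with the lowest FID for a dataset."""
    return min(FID, key=lambda p: FID[p][dataset])


if __name__ == "__main__":
    for name in ("horse2zebra", "apple2orange"):
        print(name, "best:", best_setting(name), "improved:", improved_settings(name))
```

Under this reading of the table, (0.5, 0.1) is the best setting on both datasets, and the settings that improve on the baseline for apple2orange are the three named in the text.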

Slide 46

Slide 46 text

Table 5.2 Results of horse to zebra translation on the horse2zebra dataset (1)
Columns: (λD, λG), horse, zebra, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Slide 47

Slide 47 text

Table 5.3 Results of horse to zebra translation on the horse2zebra dataset (2)
Columns: (λD, λG), horse, zebra, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Slide 48

Slide 48 text

Table 5.4 Results of apple to orange translation on the apple2orange dataset (1)
Columns: (λD, λG), apple, orange, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Slide 49

Slide 49 text

Table 5.5 Results of apple to orange translation on the apple2orange dataset (2)
Columns: (λD, λG), apple, orange, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Slide 50

Slide 50 text

Table 5.6 Results of map to photo translation on the map2photo dataset
Columns: (λD, λG), map, photo, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Slide 51

Slide 51 text

Table 5.7 Results of translation from building facades to labels on the facades dataset
Columns: (λD, λG), building facade, label, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Slide 52

Slide 52 text

Table 5.8 Results of translation from urban scenes to segmentation on the cityscapes dataset
Columns: (λD, λG), urban scene, segmentation, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Slide 53

Slide 53 text

Figure 5.1 Generated images for horse2zebra with rotation-angle parameters (λD, λG) = (0.0, 0.0)
Figure 5.2 Generated images for horse2zebra with rotation-angle parameters (λD, λG) = (0.5, 0.1)

Slide 54

Slide 54 text

Figure 5.3 Generated images for horse2zebra with rotation-angle parameters (λD, λG) = (1.0, 0.2)
Figure 5.4 Generated images for horse2zebra with rotation-angle parameters (λD, λG) = (1.5, 0.3)

Slide 55

Slide 55 text

Figure 5.5 Generated images for horse2zebra with rotation-angle parameters (λD, λG) = (2.0, 0.4)
Figure 5.6 Generated images for apple2orange with rotation-angle parameters (λD, λG) = (0.0, 0.0)

Slide 56

Slide 56 text

Figure 5.7 Generated images for apple2orange with rotation-angle parameters (λD, λG) = (0.5, 0.1)
Figure 5.8 Generated images for apple2orange with rotation-angle parameters (λD, λG) = (1.0, 0.2)

Slide 57

Slide 57 text

Figure 5.9 Generated images for apple2orange with rotation-angle parameters (λD, λG) = (1.5, 0.3)
Figure 5.10 Generated images for apple2orange with rotation-angle parameters (λD, λG) = (2.0, 0.4)

Slide 58

Slide 58 text

Figure 5.11 Generated images for map2photo with rotation-angle parameters (λD, λG) = (0.0, 0.0)
Figure 5.12 Generated images for map2photo with rotation-angle parameters (λD, λG) = (0.5, 0.1)

Slide 59

Slide 59 text

Figure 5.13 Generated images for map2photo with rotation-angle parameters (λD, λG) = (1.0, 0.2)
Figure 5.14 Generated images for map2photo with rotation-angle parameters (λD, λG) = (1.5, 0.3)

Slide 60

Slide 60 text

Figure 5.15 Generated images for map2photo with rotation-angle parameters (λD, λG) = (2.0, 0.4)
Figure 5.16 Generated images for facades with rotation-angle parameters (λD, λG) = (0.0, 0.0)

Slide 61

Slide 61 text

Figure 5.17 Generated images for facades with rotation-angle parameters (λD, λG) = (0.5, 0.1)
Figure 5.18 Generated images for facades with rotation-angle parameters (λD, λG) = (1.0, 0.2)

Slide 62

Slide 62 text

Figure 5.19 Generated images for facades with rotation-angle parameters (λD, λG) = (1.5, 0.3)
Figure 5.20 Generated images for facades with rotation-angle parameters (λD, λG) = (2.0, 0.4)

Slide 63

Slide 63 text

Figure 5.21 Generated images for cityscapes with rotation-angle parameters (λD, λG) = (0.0, 0.0)
Figure 5.22 Generated images for cityscapes with rotation-angle parameters (λD, λG) = (0.5, 0.1)

Slide 64

Slide 64 text

Figure 5.23 Generated images for cityscapes with rotation-angle parameters (λD, λG) = (1.0, 0.2)
Figure 5.24 Generated images for cityscapes with rotation-angle parameters (λD, λG) = (1.5, 0.3)

Slide 65

Slide 65 text

Figure 5.25 Generated images for cityscapes with rotation-angle parameters (λD, λG) = (2.0, 0.4)

Slide 66

Slide 66 text

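Chapter 6 Conclusion

In this study we built SSAttention-Guided GAN, which adds a rotation-angle prediction task to the Discriminator of the prior Attention-Guided GAN, and attempted to improve the discrimination accuracy for generated images. We compared the generated images and Attention Masks of the prior Attention-Guided GAN and SSAttention-Guided GAN, and evaluated quantitatively and qualitatively whether the features of the translation target are reflected in the Attention and the generated images.

In the quantitative evaluation using FID, FID decreased relative to the prior Attention-Guided GAN. In this study, both horse2zebra and apple2orange tended to improve most with rotation-angle parameters (λD, λG) = (0.5, 0.1).

In the qualitative evaluation, there were many cases in which the distorted Attention Masks and generated images observed with the prior Attention-Guided GAN improved after the rotation-angle prediction task was introduced. On horse2zebra, translation of non-target regions such as the background decreased after the task was introduced, and the corresponding Attention Masks took the shape of the horse. On apple2orange as well, some red apples that the prior Attention-Guided GAN failed to translate well were translated completely to orange after the task was introduced.

On apple2orange, the smaller the FID value, the better the red apples in the generated images tended to be translated to orange. This suggests that on apple2orange the FID value reflects the accuracy of the generated images. On horse2zebra, by contrast, the accuracy of the generated images varied regardless of the FID value. One factor is that the horse2zebra dataset contains humans, grassland, and pairs of overlapping horses, and training on these images affects both the accuracy of the generated images and FID.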

Slide 67

Slide 67 text

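In this study as well, regardless of whether rotation-angle prediction was added, there were cases in which stripes were applied to humans or in which two overlapping horses were not translated well. In light of this, improving translation accuracy for overlapping objects and for objects outside the translation target is left as future work.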
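The rotation-angle prediction task summarized in this chapter can be illustrated with a small sketch. It follows the general idea of the auxiliary rotation loss of Chen et al. [10]: each image is rotated by 0, 90, 180, and 270 degrees, and a classifier is trained to predict which rotation was applied. The way λD and λG enter the losses is defined in Chapter 3 and is not reproduced here; the additive weighting in `weighted_total`, like all function names below, is an assumption made for illustration only.

```python
import math


def rot90(img, k):
    """Rotate a 2-D list-of-lists image counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counterclockwise quarter turn.
        img = [list(row) for row in zip(*img)][::-1]
    return img


def rotation_batch(img):
    """Return the four rotated copies of img with their class labels 0..3."""
    return [(rot90(img, k), k) for k in range(4)]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true rotation class."""
    return -math.log(probs[label])


def rotation_loss(predict, img):
    """Mean cross-entropy of a rotation classifier over the four rotations.

    `predict` is a hypothetical stand-in for the discriminator's rotation
    head: it maps an image to a list of four class probabilities.
    """
    batch = rotation_batch(img)
    return sum(cross_entropy(predict(x), y) for x, y in batch) / len(batch)


def weighted_total(adversarial_loss, rot_loss, lam):
    """Assumed additive weighting; lam = 0 leaves the adversarial loss unchanged."""
    return adversarial_loss + lam * rot_loss


if __name__ == "__main__":
    image = [[1, 2], [3, 4]]
    uniform = lambda x: [0.25, 0.25, 0.25, 0.25]  # untrained classifier
    print(rotation_loss(uniform, image))
```

With a weight of zero the rotation term vanishes, which is how the (0.0, 0.0) setting is read in this sketch.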

Slide 68

Slide 68 text

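Acknowledgments

I am deeply indebted to Professor Minoru Kawahara for his guidance and support. Professor Takashi Sasaki gave me advice and guidance throughout the past year, mainly through the research meetings. I thank the members of the laboratory who helped proofread this thesis. Finally, I express my deep gratitude to Professor Hirohisa Aman and Professor Koji Kinoshita, who reviewed this research.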

Slide 69

Slide 69 text

References

[1] Information-technology Promotion Agency, Japan (IPA), AI White Paper Editorial Committee: AI White Paper 2019 (in Japanese), KADOKAWA (2019).
[2] Gui, J., Sun, Z., Wen, Y., Tao, D. and Ye, J.: A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications, CoRR, Vol. abs/2001.06937 (2020).
[3] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y.: Generative Adversarial Nets, in Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. and Weinberger, K. Q. eds., Advances in Neural Information Processing Systems, Vol. 27, Curran Associates, Inc. (2014).
[4] Radford, A., Metz, L. and Chintala, S.: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, in Bengio, Y. and LeCun, Y. eds., 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings (2016).
[5] Mirza, M. and Osindero, S.: Conditional Generative Adversarial Nets, CoRR, Vol. abs/1411.1784 (2014).
[6] Knyaz, V. A., Kniaz, V. V. and Remondino, F.: Image-to-Voxel Model Translation with Conditional Adversarial Networks, in Leal-Taixé, L. and Roth, S. eds., Computer Vision - ECCV 2018 Workshops - Munich, Germany, September 8-14, 2018, Proceedings, Part I, Vol. 11129 of Lecture Notes in Computer Science, pp. 601–618, Springer (2018).

Slide 70

Slide 70 text

[7] Zhu, J., Park, T., Isola, P. and Efros, A. A.: Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks, in IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pp. 2242–2251, IEEE Computer Society (2017).
[8] Tang, H., Xu, D., Sebe, N. and Yan, Y.: Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation, in International Joint Conference on Neural Networks, IJCNN 2019 Budapest, Hungary, July 14-19, 2019, pp. 1–8, IEEE (2019).
[9] Tang, H., Liu, H., Xu, D., Torr, P. H. S. and Sebe, N.: AttentionGAN: Unpaired Image-to-Image Translation using Attention-Guided Generative Adversarial Networks, CoRR, Vol. abs/1911.11897 (2019).
[10] Chen, T., Zhai, X., Ritter, M., Lucic, M. and Houlsby, N.: Self-Supervised GANs via Auxiliary Rotation Loss, in IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pp. 12154–12163, Computer Vision Foundation / IEEE (2019).
[11] Mao, X., Li, Q., Xie, H., Lau, R. Y. K., Wang, Z. and Smolley, S. P.: Least Squares Generative Adversarial Networks, in IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pp. 2813–2821, IEEE Computer Society (2017).
[12] Saito, K.: Deep Learning from Scratch: Theory and Implementation of Deep Learning with Python (in Japanese), O'Reilly Japan (2017).
[13] Bishop, C. M.: Pattern Recognition and Machine Learning, Vol. 1: Statistical Prediction Based on Bayesian Theory (Japanese edition), Maruzen Publishing (2014).
[14] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł. and Polosukhin, I.: Attention is All you Need, in Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S. and Garnett, R. eds., Advances in Neural Information Processing Systems, Vol. 30, Curran Associates, Inc. (2017).

Slide 71

Slide 71 text

[15] Long, J., Shelhamer, E. and Darrell, T.: Fully Convolutional Networks for Semantic Segmentation, CoRR, Vol. abs/1411.4038 (2014).
[16] Yin, L., Wei, X., Sun, Y., Wang, J. and Rosato, M. J.: A 3D Facial Expression Database For Facial Behavior Research, in Seventh IEEE International Conference on Automatic Face and Gesture Recognition (FGR 2006), 10-12 April 2006, Southampton, UK, pp. 211–216, IEEE Computer Society (2006).
[17] Gidaris, S., Singh, P. and Komodakis, N.: Unsupervised Representation Learning by Predicting Image Rotations, in 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings, OpenReview.net (2018).
[18] He, K., Zhang, X., Ren, S. and Sun, J.: Deep Residual Learning for Image Recognition, CoRR, Vol. abs/1512.03385 (2015).
[19] Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. and Hochreiter, S.: GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, in Guyon, I., Luxburg, von U., Bengio, S., Wallach, H. M., Fergus, R., Vishwanathan, S. V. N. and Garnett, R. eds., Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pp. 6626–6637 (2017).