Self-Supervised Attention-Guided GAN

My Graduation Thesis of B4

since1998

May 31, 2021

Transcript

  1. Table of Contents

    Chapter 1 Introduction
    Chapter 2 Related Work
      2.1 Perceptron
      2.2 Multilayer Neural Networks
      2.3 CNN (Convolutional Neural Network)
      2.4 Attention
      2.5 GAN (Generative Adversarial Network)
        2.5.1 DCGAN (Deep Convolutional Generative Adversarial Network)
        2.5.2 cGAN (Conditional Generative Adversarial Network)
      2.6 Image to Image Translation
        2.6.1 pix2pix
        2.6.2 CycleGAN
      2.7 Attention-Guided Image-to-Image Translation
        2.7.1 Attention-Guided Generator Scheme I
        2.7.2 Attention-Guided Generator Scheme II
      2.8 Rotation Angle Prediction Task (Self-Supervised Task)
    Chapter 3 Proposed Method
      3.1 Network Structure
  2. Table of Contents (continued)

      3.2 Loss Functions
        3.2.1 Adversarial Loss
        3.2.2 Cycle Consistency Loss
        3.2.3 Final Loss Function
    Chapter 4 Experiments
      4.1 Parameter Settings
      4.2 Datasets
      4.3 Evaluation Method
        4.3.1 Fréchet Inception Distance (FID)
    Chapter 5 Results and Discussion
      5.1 Results
        5.1.1 Quantitative Evaluation
        5.1.2 Qualitative Evaluation
      5.2 Discussion
        5.2.1 Quantitative Evaluation
        5.2.2 Qualitative Evaluation
        5.2.3 Comparison of Quantitative and Qualitative Evaluation
    Chapter 6 Conclusion
    Acknowledgements
    References
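  3. Chapter 1 Introduction

    Artificial intelligence (AI) has advanced remarkably since research on it began in the 1950s. We are now in the middle of the third AI boom, and deep learning built around deep neural networks (DNNs) has become a common research tool [1]. Deep learning has made it possible to train generative models such as VAEs and GANs.

    The GAN (Generative Adversarial Network) is a generative model developed by Ian J. Goodfellow in 2014, and it has been applied in many ways since [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]. One such application is CycleGAN, which performs image translations such as turning a horse into a zebra, a Google Map style map into a satellite photo, or a photograph into a painting [7]. CycleGAN translates images by unsupervised learning on two datasets. Its advantage is that, unlike pix2pix [6], a translation method based on supervised learning, it does not run into a shortage of data.

    As an extension of CycleGAN, Attention-Guided GAN, which uses Attention in the Generator, was developed [8] [9]. Attention-Guided GAN extracts the part of the image to focus on as the Attention and translates only that part [8] [9]. When it extracts that part, however, it can mistakenly translate regions unrelated to the target.

    Against this background, this study aims to extract the relevant part of an image accurately and map it as in the reference image. We propose SSAttention-Guided GAN, which adds a rotation angle prediction task (Self-Supervised Task) to the Discriminator of Attention-Guided GAN. Adding this task lets the Discriminator grasp the geometric features of an image through rotation invariance, with the aim of improving its discrimination accuracy.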
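  4. Because a GAN trains its Discriminator and Generator in competition, we hypothesized that a more accurate Discriminator also influences the training of the Generator, so that the generated images are translated only in the target part and leave the background unaffected. The experiments in this thesis test that hypothesis.

    Chapter 2 reviews related work on GANs and image translation. Chapter 3 proposes SSAttention-Guided GAN, which adds the rotation angle prediction task to Attention-Guided GAN. Chapter 4 describes the parameter settings, datasets, and evaluation method used in the experiments. Chapter 5 reports the results and discusses them. Chapter 6 concludes and describes future work.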
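  5. Chapter 2 Related Work

    Section 2.1 introduces the perceptron and Section 2.2 covers the basics of neural networks. The remaining sections review related work on GANs, the subject of this study.

    2.1 Perceptron

    The perceptron is an algorithm devised by Frank Rosenblatt in 1957 that expresses part of the brain's neural circuitry as a formula [12]. It receives several signals as input and outputs a single signal. As an example, Figure 2.1 shows a perceptron that takes two input signals $x_1, x_2$.

    Each circle in Figure 2.1 is called a neuron or a node. $x_1, x_2$ are the input signals, $y$ is the output signal, and $w_1, w_2$ are the weights. With threshold $\theta$, the output $y$ is computed as in equation (2.1) [12].

    $$ y = \begin{cases} 1 & (w_1 x_1 + w_2 x_2 > \theta) \\ 0 & (w_1 x_1 + w_2 x_2 \le \theta) \end{cases} \tag{2.1} $$

    A bias $b = -\theta$ is sometimes introduced in place of the threshold $\theta$, and the bias may be drawn as a neuron of its own, as in Figure 2.2. Equation (2.2) gives the output $y$ of a perceptron with a bias [12].

    $$ y = \begin{cases} 1 & (w_1 x_1 + w_2 x_2 + b > 0) \\ 0 & (w_1 x_1 + w_2 x_2 + b \le 0) \end{cases} \tag{2.2} $$

    In equation (2.2), the output is 1 if the weighted sum $w_1 x_1 + w_2 x_2$ plus the bias term $b$ is positive, and 0 otherwise. The output $y$ can also be computed with an activation function $h(\cdot)$, as shown in equation (2.3) [12].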
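  6. For equation (2.3) to give the same output as equation (2.2), the step function of equation (2.4) is used as the activation function [12].

    $$ y = h(w_1 x_1 + w_2 x_2 + b) \tag{2.3} $$

    $$ h(u) = \begin{cases} 1 & (u > 0) \\ 0 & (u \le 0) \end{cases} \tag{2.4} $$

    2.2 Multilayer Neural Networks

    A multilayer neural network is a mathematical model that imitates part of the brain's neural circuitry: perceptrons are connected to one another and stacked in layers [12]. Figure 2.3 shows an example [13]. The left column of Figure 2.3 is the input layer, the middle column is the intermediate (hidden) layer, and the right column is the output layer. Figure 2.3 shows $N$ ($N > 0$) input variables $x_1, ..., x_i, ..., x_N$ with a bias $x_0$, $M$ ($M > 0$) output variables $y_1, ..., y_j, ..., y_M$, and $L$ ($L > 0$) hidden variables $z_1, ..., z_k, ..., z_L$ with a bias $z_0$.

    Figure 2.3: Example of a neural network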
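
    The perceptron of equations (2.2) to (2.4), the unit from which the multilayer network above is built, can be sketched in a few lines of NumPy. The weights and bias below are hypothetical values chosen so that the unit behaves like a logical AND gate; they do not come from the thesis.

```python
import numpy as np

def step(u):
    """Step activation of equation (2.4): 1 if u > 0, otherwise 0."""
    return np.where(u > 0, 1, 0)

def perceptron(x, w, b):
    """Equation (2.3): y = h(w1*x1 + w2*x2 + b)."""
    return step(np.dot(w, x) + b)

# Hypothetical parameters (an assumption for illustration): an AND gate.
w = np.array([0.5, 0.5])
b = -0.7

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [int(perceptron(np.array(x), w, b)) for x in inputs]
print(outputs)
```

    Only the input (1, 1) gives a positive weighted sum (0.5 + 0.5 - 0.7), so it is the only input for which the unit fires.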
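  7. This section explains the computation in a neural network. First, the inputs $x_1, ..., x_N$ and the bias $x_0$ are multiplied by their weights $w^{(1)}_{ki}$ and summed, and the result is passed through the activation function $h$ to give the value of each hidden variable. Activation functions $h$ include the sigmoid function of equation (2.5) and the ReLU (Rectified Linear Unit) of equation (2.6) [12].

    $$ h(u_k) = \frac{1}{1 + \exp(-u_k)} \tag{2.5} $$

    $$ h(u_k) = \max(u_k, 0) \tag{2.6} $$

    Equation (2.7) gives the computation from the input layer to the hidden layer. The bias is $x_0 = 1$ and its weight is $w^{(1)}_{k0}$.

    $$ z_k = h\left( \sum_{i=1}^{N} w^{(1)}_{ki} x_i + w^{(1)}_{k0} \right) \tag{2.7} $$

    The output $y_j$ is computed in the same way as in equation (2.3). Depending on the problem, the activation function $\sigma$ between the hidden layer and the output layer is the identity function of equation (2.8) or the softmax function of equation (2.9) [12].

    $$ \sigma(v_j) = v_j \tag{2.8} $$

    $$ \sigma(v_j) = \frac{\exp(v_j)}{\sum_{m=1}^{M} \exp(v_m)} \tag{2.9} $$

    Equation (2.10) gives the computation from the hidden layer to the output layer. The bias is $z_0 = 1$ and its weight is $w^{(2)}_{j0}$.

    $$ y_j = \sigma\left( \sum_{k=1}^{L} w^{(2)}_{jk} z_k + w^{(2)}_{j0} \right) \tag{2.10} $$

    Equation (2.11) gives the full computation of the network in Figure 2.3. A computation that proceeds in the order input layer, hidden layer, output layer, as in Figure 2.3 and equation (2.11), is called forward propagation [12].

    $$ y_j = \sigma\left( \sum_{k=1}^{L} w^{(2)}_{jk}\, h\left( \sum_{i=1}^{N} w^{(1)}_{ki} x_i + w^{(1)}_{k0} \right) + w^{(2)}_{j0} \right) \tag{2.11} $$

    The data used with a neural network are mainly of two kinds: training data and test data.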
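  8. Training data are used first, on their own, to train the network and adjust its weights. Test data are used to evaluate the output of the network. The evaluation generally uses the size of the error between the test data $t_n$ ($n = 1, ..., N$) and the outputs $y_n$ ($n = 1, ..., N$) of the network given the training data as input; this measure is called the loss [12]. Examples of loss functions are the sum of squared errors in equation (2.12) and the cross entropy in equation (2.13) [12], where $E$ denotes the loss function.

    $$ E = \frac{1}{N} \sum_{n=1}^{N} (y_n - t_n)^2 \tag{2.12} $$

    $$ E = -\frac{1}{N} \sum_{n=1}^{N} t_n \log y_n \tag{2.13} $$

    Training updates the weights so that the loss function approaches its minimum [12]. The weights are adjusted by gradient methods [12].

    2.3 CNN (Convolutional Neural Network)

    A CNN is a neural network that includes convolutional layers and is used mainly for image recognition [12]. The multilayer network of Section 2.2 consists of fully connected layers and activation functions, whereas a CNN is built from convolutional layers and pooling layers. A convolutional layer can capture features that depend on spatial information, such as the similarities and differences between pixels in an image or the relationships among the RGB channels [12]. Figure 2.4 shows an example CNN.

    A convolutional layer convolves the input data with a set of weights called a kernel (filter). In the example of Figure 2.5, convolving a 4x4 input with a 3x3 kernel produces a 2x2 output. The steps of this convolution are as follows.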
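  9. First, as in Figure 2.6, a 3x3 region is taken from the top-left corner of the 4x4 input, each element of the region is multiplied by the corresponding element of the filter, and the products are summed. Next, as in Figure 2.7, the 3x3 region is shifted one pixel to the right and the same multiply-and-sum is performed. Then, as in Figure 2.8, the 3x3 region is shifted one pixel down and the operation is repeated. Finally, as in Figure 2.9, the region is shifted one pixel to the right again and the operation is repeated.

    A pooling layer shrinks the data along its height and width. In the example of Figure 2.10, the 4x4 input is divided into four 2x2 regions and the maximum of each region is collected into a 2x2 output. Max pooling and average pooling are the operations mainly used [12]. A pooling layer has no learnable parameters and is largely insensitive to small positional shifts in the input [12].

    2.4 Attention

    Attention is a network that learns to focus on specific parts of an image or a text and to capture their features [14]. When people look at an image, they do not look at every part equally; they focus on one part and recognize it as an object. Attention applies this human trait to machine learning.

    Figure 2.11 shows how Attention works. A Query, the feature to focus on, is taken together with feature vectors called Key and Value, which are extracted from the source image data. The element-wise products of Query and Key (the similarity) are used as weights on Value, which emphasizes the parts of Value to focus on [14].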
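
    The Query, Key, and Value weighting described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the network of [14]: the similarity is taken to be a dot product, and the softmax normalization of the weights is an assumption added so that the weights sum to one. The function name and the toy vectors are hypothetical.

```python
import numpy as np

def attention(query, keys, values):
    """Weight each Value by the similarity between Query and its Key."""
    scores = keys @ query                    # one dot-product similarity per key
    weights = np.exp(scores - scores.max())  # softmax (an assumption), shifted for stability
    weights = weights / weights.sum()
    return weights @ values, weights

# Toy data: the query matches the first key more closely than the second.
query = np.array([1.0, 0.0])
keys = np.array([[1.0, 0.0],
                 [0.0, 1.0]])
values = np.array([[10.0, 0.0],
                   [0.0, 10.0]])

output, weights = attention(query, keys, values)
print(weights, output)
```

    The first Value receives the larger weight because its Key is more similar to the Query, which is the "emphasis" the text refers to.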
  10. Figure 2.6: Convolution of the top-left 3x3 region of the input with the filter

    Figure 2.7: Convolution of the 3x3 region shifted one pixel to the right with the filter
  11. Figure 2.8: Convolution of the 3x3 region shifted one pixel down with the filter

    Figure 2.9: Convolution of the 3x3 region shifted one pixel right and one pixel down with the filter
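
    The four sliding-window steps of Figures 2.6 to 2.9, and the 2x2 max pooling of Figure 2.10, can be reproduced with a short NumPy sketch. The input values and the all-ones kernel are hypothetical; only the shapes (4x4 input, 3x3 kernel, 2x2 output) follow the text.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Slide the kernel over x (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(x):
    """Split x into non-overlapping 2x2 regions and keep each maximum."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)  # hypothetical 4x4 input
kernel = np.ones((3, 3))                      # hypothetical 3x3 filter

conv_out = conv2d_valid(x, kernel)
pool_out = max_pool_2x2(x)
print(conv_out)
print(pool_out)
```

    With these values the convolution yields [[45, 54], [81, 90]], one entry per window position, and the pooling yields [[5, 7], [13, 15]].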
  12. 2.5 GAN (Generative Adversarial Network)

    The GAN is a generative-model architecture devised by Ian J. Goodfellow et al. in 2014 [3]. Figure 2.12 shows its structure. A GAN consists of two networks, a Generator and a Discriminator, which are trained adversarially. The Generator takes noise as input and outputs generated data. The Discriminator decides whether its input is real training data. Because the two networks are trained adversarially, the loss function $V(D, G)$ of a GAN is given by equation (2.14) [3].

    $$ \min_G \max_D V(D, G) = \mathbb{E}_x[\log D(x)] + \mathbb{E}_z[\log(1 - D(G(z)))] \tag{2.14} $$

    In equation (2.14), $D(x)$ is the probability that the input is real, and $G(z)$ is the data produced by the Generator. $D$ takes a large value when the input is judged to be real and a small value when it is judged to be fake. The Discriminator is therefore trained so that both $\log D(x)$ and $\log(1 - D(G(z)))$ become large. The Generator, in order to produce data $G(z)$ close to real data, is trained so that $\log(1 - D(G(z)))$ becomes small. A GAN is trained by repeating these two updates adversarially [3].

    2.5.1 DCGAN (Deep Convolutional Generative Adversarial Network)

    DCGAN uses convolutional layers in both GAN networks, the Generator and the Discriminator [4]. Unlike the CNN of Section 2.3, the network consists only of convolutional layers and activation functions. Figure 2.13 shows an example network.
  13. Figure 2.12: Structure of a GAN

    Figure 2.13: Example network structure of the Generator in DCGAN
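  14. Figure 2.14: Structure of cGAN

    In the DCGAN Generator of Figure 2.13, a noise vector $z$ is fed to the Generator, and convolutions gradually reduce the number of channels while upsampling the pixels until a generated image $G(z)$ of the specified size is output. In the example, a 100-dimensional noise vector passes through 4x4x1024, 8x8x512, 16x16x256, and 32x32x128 feature maps, four upsampling steps in all, and the final output is a 64x64 color image with 3 channels [4].

    2.5.2 cGAN (Conditional Generative Adversarial Network)

    A cGAN is a GAN in which a label corresponding to the class is attached to the data fed to the Generator and the Discriminator [5]. Figure 2.14 shows its structure. Compared with the GAN architecture of Figure 2.12, the label $y$ is attached to the training data $x$, the generated data $G(z|y)$, and the latent variable $z$. The Generator takes the latent variable $z$ and its label $y$ and outputs labelled generated data $G(z|y)$. The Discriminator then judges the generated data $G(z|y)$ against the training data $x$ and the label $y$.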
  15. The loss function of cGAN is given by equation (2.15) [5].

    $$ \min_G \max_D V(D, G) = \mathbb{E}_x[\log D(x|y)] + \mathbb{E}_z[\log(1 - D(G(z|y)))] \tag{2.15} $$

    As for the GAN, $D(\cdot)$ is the output of the Discriminator, $G(\cdot)$ is the output of the Generator, and $V(D, G)$ is the loss function; $y$ is the corresponding label. As in a GAN, the Discriminator is trained so that $\log D(x|y)$ and $\log(1 - D(G(z|y)))$ become large, and the Generator is trained so that $\log(1 - D(G(z|y)))$ becomes small [5].

    2.6 Image to Image Translation

    The GAN of Section 2.5 generates images from noise through its Generator. This section describes Image to Image Translation, which applies GAN image generation to convert a source image into a target image. Section 2.6.1 covers pix2pix and Section 2.6.2 covers CycleGAN.

    2.6.1 pix2pix

    pix2pix is an Image to Image Translation method developed by Phillip Isola et al. [6]. In pix2pix, the source image is obtained by extracting edges from the target image. pix2pix applies the cGAN of Section 2.5.2 and uses the edge-extracted source image as the label for translation. Figure 2.15 shows its structure. In the edges-to-image example of Figure 2.15, the edge image is used as the label and the target image as the real image. From there, as in cGAN, adversarial training of the Discriminator and the Generator makes the Generator produce images that resemble the target images.

    The label image need not be an edge map of the source image: a Google Map style map or a label image produced by Semantic Segmentation [15] can also be used.
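  16. This makes it possible to generate aerial photographs from Google Map style maps, or photographs from Semantic Segmentation label images. Semantic Segmentation is a method that assigns each pixel to a category based on information from the surrounding pixels [15]. As an example, Figure 2.16 shows the results of pix2pix translation run with the Image-to-Image Demo [6]. Starting from label images made by extracting edges from images of shoes and handbags, and from a label image of a building facade, photo-like images are generated by texture mapping.

    2.6.2 CycleGAN

    CycleGAN is an Image to Image Translation method that connects two GANs so that images from one data domain and images from another can be translated into each other [7]. Figure 2.17 shows an example of the paired source and target images used by pix2pix and an example of the source and target image sets used by CycleGAN.

    Figure 2.15: Structure of pix2pix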
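  17. Figure 2.16: Examples of Image to Image Translation with pix2pix (created with the Image-to-Image Demo [6])

    In the pix2pix case, target images such as shoes or bags are paired with their own edge images, so the training is supervised. pix2pix therefore needs one-to-one pairs of a source label and a target image, which makes datasets hard to collect and the amount of data enormous.

    In the CycleGAN case, no label is attached to each source image; the set of source images and the set of target images are paired as a whole, and the training is unsupervised. CycleGAN is thus an improvement over pix2pix in that it can translate images without paired training data [7].

    The loss function of CycleGAN consists mainly of an adversarial loss and a cycle consistency loss [7]. The adversarial loss aims to match the distribution of the generated images to the distribution of the target data domain.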
  18. The cycle consistency loss aims to keep the two Generators from contradicting each other. The adversarial loss $L_{GAN}$ is given by equations (2.16) and (2.17), and the cycle consistency loss $L_{cycle}$ by equation (2.18). Here $X$ and $Y$ are the data domains, $G$ is the Generator that translates $X \to Y$, $F$ is the Generator that translates $Y \to X$, and $D_X$, $D_Y$ are the Discriminators for $X$ and $Y$. $D_X$ decides whether its input is an image from domain $X$, and $D_Y$ decides whether its input is an image from domain $Y$.

    $$ L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_Y(y)] + \mathbb{E}_x[\log(1 - D_Y(G(x)))] \tag{2.16} $$

    $$ L_{GAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_X(x)] + \mathbb{E}_y[\log(1 - D_X(F(y)))] \tag{2.17} $$

    $$ L_{cycle}(G, F) = \mathbb{E}_x[\| F(G(x)) - x \|_1] + \mathbb{E}_y[\| G(F(y)) - y \|_1] \tag{2.18} $$

    These losses are optimized by repeated adversarial training, as in an ordinary GAN.

    Figure 2.17: Comparison of the data used by pix2pix and by CycleGAN
  19. The cycle consistency loss of equation (2.18) computes the difference between an image $x \in X$ and $F(G(x))$, the result of translating $x$ into domain $Y$ with $G$ and back again with $F$; the same is done for an image $y \in Y$. The cycle consistency loss therefore tunes the two Generators so that an input and its reconstruction, obtained by passing it once through each Generator, are the same [7].

    CycleGAN also introduces an identity loss $L_{identity}$ to preserve the color tone of each image domain [7]. It is given by equation (2.19).

    $$ L_{identity}(G, F) = \mathbb{E}_x[\| F(x) - x \|_1] + \mathbb{E}_y[\| G(y) - y \|_1] \tag{2.19} $$

    The full CycleGAN loss $L$ is given by equation (2.20), where $\lambda_{cycle}$ and $\lambda_{identity}$ are hyperparameters that balance the loss terms.

    $$ L = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + \lambda_{cycle} L_{cycle}(G, F) + \lambda_{identity} L_{identity}(G, F) \tag{2.20} $$

    Figure 2.18 shows the CycleGAN network. CycleGAN combines two GAN networks and uses Generators $G$, $F$ and Discriminators $D_X$, $D_Y$. $D_X$ decides whether its input is a real image $x$ or a generated image $F(y)$, and $D_Y$ decides whether its input is a real image $y$ or a generated image $G(x)$.

    2.7 Attention-Guided Image-to-Image Translation

    Attention-Guided Image-to-Image Translation brings the idea of Attention into Image to Image Translation [8] [9].
  20. Figure 2.18: The CycleGAN network

    Adding Attention to each Generator emphasizes the object of interest in the image and makes it possible to generate images in which the parts outside the object, such as the background, are left unaffected.

    Two Generator architectures exist: Attention-Guided Generator Scheme I [8] and Attention-Guided Generator Scheme II [9].

    2.7.1 Attention-Guided Generator Scheme I

    Attention-Guided Image-to-Image Translation with Attention-Guided Generator Scheme I was devised by Hao Tang et al. in 2019 [8]. Figure 2.19 shows the structure of its Generator.

    In Figure 2.19, the Bu3dfe dataset [16] is used to translate between images of a neutral face and images of a face with raised mouth corners. Two Generators, $G$ and $F$, are used; $G$ translates a neutral face into a face with raised mouth corners.
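  21. Figure 2.19: Structure of the Generator in Attention-Guided Generator Scheme I

    $F$ is the Generator that translates a face with raised mouth corners back into a neutral face. Attention is attached to both Generators $G$ and $F$, which output the Attention Masks $A_y, A_x$ and the Content Masks $C_y, C_x$.

    The Attention Masks $A_y, A_x$ emphasize the part of the image to focus on. In Figure 2.19, $A_y$ focuses on the mouth corners of the face in the input image $x$. $A_x$ likewise emphasizes the mouth corners of the face in the image $G(x)$.

    The Content Masks $C_y, C_x$ are images rendered from the features of the target image domain. In Figure 2.19, Generator $G$ outputs an image of a mouth with raised corners as Content Mask $C_y$, and Generator $F$ outputs an image of a neutral face as Content Mask $C_x$.

    Finally, in the Fusion part of Figure 2.19, the Attention Mask $A_y$, the Content Mask $C_y$, and the input image $x$ are combined to perform the translation. The translated image $G(x)$ is given by equation (2.21) [8], where $\odot$ denotes the Hadamard (element-wise) product.

    $$ G(x) = A_y \odot C_y + (1 - A_y) \odot x \tag{2.21} $$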
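
    The fusion step of equation (2.21) is an element-wise blend, which the following NumPy sketch illustrates. The mask, content, and input arrays are hypothetical 2x2 toy values; in the real model they are produced by the Generator.

```python
import numpy as np

def fuse(attention_mask, content_mask, x):
    """Equation (2.21): G(x) = A_y * C_y + (1 - A_y) * x, element-wise."""
    return attention_mask * content_mask + (1.0 - attention_mask) * x

# Hypothetical toy values: attention is 1 in the left column, 0 in the right.
x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
content = np.full((2, 2), 9.0)
attention = np.array([[1.0, 0.0],
                      [1.0, 0.0]])

translated = fuse(attention, content, x)
print(translated)
```

    Where the attention is 1 the output takes the content value, and where it is 0 the input pixel passes through unchanged, which is how the background is preserved.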
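  22. In the same way, the Attention Mask $A_x$, the Content Mask $C_x$, and the input image $y$ are combined as $F(y) = A_x \odot C_x + (1 - A_x) \odot y$.

    The loss function of Attention-Guided GAN with Scheme I consists of the adversarial loss, the cycle consistency loss, the Attention adversarial loss, the Attention loss, and the pixel loss (identity loss) [8]. The adversarial loss and the cycle consistency loss are the same as in CycleGAN: the adversarial loss is given by equations (2.22) and (2.23), and the cycle consistency loss by equation (2.24).

    $$ L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_Y(y)] + \mathbb{E}_x[\log(1 - D_Y(G(x)))] \tag{2.22} $$

    $$ L_{GAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_X(x)] + \mathbb{E}_y[\log(1 - D_X(F(y)))] \tag{2.23} $$

    $$ L_{cycle}(G, F) = \mathbb{E}_x[\| F(G(x)) - x \|_1] + \mathbb{E}_y[\| G(F(y)) - y \|_1] \tag{2.24} $$

    Like the identity loss of CycleGAN, the pixel loss is there to preserve the color tone of each image domain [8]. It is the Manhattan distance between the image produced by the Generator and the original input image, as in equation (2.25).

    $$ L_{pixel}(G, F) = \mathbb{E}_x[\| G(x) - x \|_1] + \mathbb{E}_y[\| F(y) - y \|_1] \tag{2.25} $$

    The Attention adversarial loss is the adversarial loss obtained when the Discriminator also takes the Attention into account [8]. It replaces the Discriminators $D_X, D_Y$ with the Attention-Guided Discriminators $D_{X\mathrm{Attention}}, D_{Y\mathrm{Attention}}$ and is given by equations (2.26) and (2.27). $D_{Y\mathrm{Attention}}$ distinguishes the pair of real image and Attention $[A_x, y]$ from the pair of generated image and Attention $[A_x, G(x)]$, and $D_{X\mathrm{Attention}}$ distinguishes the pair $[A_y, x]$ from the pair $[A_y, F(y)]$.

    $$ L_{AGAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_{Y\mathrm{Attention}}([A_x, y])] + \mathbb{E}_x[\log(1 - D_{Y\mathrm{Attention}}([A_x, G(x)]))] \tag{2.26} $$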
  23. $$ L_{AGAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_{X\mathrm{Attention}}([A_y, x])] + \mathbb{E}_y[\log(1 - D_{X\mathrm{Attention}}([A_x, F(y)]))] \tag{2.27} $$

    The Attention loss [8] applies Total Variation regularization to the Attention Mask so that the Attention becomes well formed. The Attention loss $L_{tv}(A_x)$ of Attention Mask $A_x$ is given by equation (2.28); the same computation is applied to $A_y$. $W$ and $H$ are the width and height of the Attention Mask $A_x$.

    $$ L_{tv}(A_x) = \sum_{w,h=1}^{W,H} |A_x(w+1, h, c) - A_x(w, h, c)| + |A_x(w, h+1, c) - A_x(w, h, c)| \tag{2.28} $$

    The loss function of Attention-Guided GAN is given by equation (2.29). Here $L_{GAN}$ is the sum of equations (2.22) and (2.23), and $L_{AGAN}$ is the sum of equations (2.26) and (2.27). $\lambda_{GAN}, \lambda_{cycle}, \lambda_{pixel}, \lambda_{tv}$ are hyperparameters that balance the loss terms.

    $$ L = \lambda_{GAN}(L_{GAN} + L_{AGAN}) + \lambda_{cycle} L_{cycle} + \lambda_{pixel} L_{pixel} + \lambda_{tv} L_{tv} \tag{2.29} $$

    2.7.2 Attention-Guided Generator Scheme II

    In Attention-Guided Generator Scheme I, each Generator produces one Attention Mask and one Content Mask [8]. With this Generator, however, accuracy has been reported to drop on some datasets [9]. Attention-Guided Generator Scheme II was devised to overcome this problem. Figure 2.20 shows the structure of its Generator.

    The translation procedure is explained using Generator $G$ of Figure 2.20 as an example. The input image $x$ is first transformed by the Parameter Sharing Encoder $G_E$.
  24. Figure 2.20: Structure of the Generator in Attention-Guided Generator Scheme II

    In Scheme I a single Generator produced the Attention Mask and the Content Mask at the same time. In Scheme II, the output of $G_E$ is fed to a Content Mask Generator $G_C$ and to an Attention Mask Generator $G_A$, which output the Content Masks and the Attention Masks respectively [9].

    The Attention Mask Generator $G_A$ produces Foreground Attention Masks $A^f_y$, which emphasize only the parts to focus on, and a Background Attention Mask $A^b_y$, which conversely emphasizes parts unrelated to the translation, such as the background. With this improvement, the Foreground and Background Attention Masks make it possible to translate only the target part without affecting the background.

    The architecture therefore differs from Attention-Guided Generator Scheme I in three ways:

    1. The Generator is shared up to an intermediate stage, and the output of that stage is split between an Attention Mask Generator and a Content Mask Generator.
    2. The Attention Mask Generator produces Foreground Attention Masks and a Background Attention Mask.
    3. The number of Foreground Attention Masks produced by the Attention Mask Generator and the number of Content Masks produced by the Content Mask Generator are increased to more than one (the two numbers are equal).
  25. The image is generated by adding the element-wise products of each pair of Foreground Attention Mask $A^f_y$ and Content Mask $C^f_y$ to the element-wise product of the single Background Attention Mask $A^b_y$ and the input $x$. If the Attention Mask Generator produces $n$ Attention Masks in total, one of them is the Background Attention Mask $A^b_y$. There are therefore $n - 1$ Foreground Attention Masks $A^f_y$ and $n - 1$ paired Content Masks $C^f_y$.

    The translated image $G(x)$ is thus given by equation (2.30), where $\odot$ denotes the Hadamard product.

    $$ G(x) = \sum_{f=1}^{n-1} A^f_y \odot C^f_y + A^b_y \odot x \tag{2.30} $$

    For Generator $F$ with input $y$, $F(y)$ is likewise given by equation (2.31).

    $$ F(y) = \sum_{f=1}^{n-1} A^f_x \odot C^f_x + A^b_x \odot y \tag{2.31} $$

    As in CycleGAN, the loss function of this Attention-Guided GAN uses the adversarial loss, the cycle consistency loss, and the identity loss [9]. The adversarial loss aims to match the distribution of the generated images to that of the target domain, and the cycle consistency loss aims to keep the two Generators from contradicting each other. The identity loss $L_{identity}$ is added to preserve the color tone of each image domain [7]. The adversarial loss $L_{GAN}$ is given by equations (2.32) and (2.33), the cycle consistency loss $L_{cycle}$ by equation (2.34), and the identity loss by equation (2.35). Here $X$ and $Y$ are the data domains, $G$ is the Generator that translates $X \to Y$, $F$ is the Generator that translates $Y \to X$, and $D_X$, $D_Y$ are the Discriminators for $X$ and $Y$. $D_X$ decides whether its input is an image from domain $X$, and $D_Y$ decides whether its input is an image from domain $Y$.
  26. $$ L_{GAN}(G, D_Y, X, Y) = \mathbb{E}_y[\log D_Y(y)] + \mathbb{E}_x[\log(1 - D_Y(G(x)))] \tag{2.32} $$

    $$ L_{GAN}(F, D_X, Y, X) = \mathbb{E}_x[\log D_X(x)] + \mathbb{E}_y[\log(1 - D_X(F(y)))] \tag{2.33} $$

    $$ L_{cycle}(G, F) = \mathbb{E}_x[\| F(G(x)) - x \|_1] + \mathbb{E}_y[\| G(F(y)) - y \|_1] \tag{2.34} $$

    $$ L_{identity}(G, F) = \mathbb{E}_x[\| F(x) - x \|_1] + \mathbb{E}_y[\| G(y) - y \|_1] \tag{2.35} $$

    From equations (2.32) to (2.35), the loss function of Attention-Guided GAN is given by equation (2.36), where $\lambda_{cycle}$ and $\lambda_{identity}$ are hyperparameters that balance the loss terms.

    $$ L = L_{GAN}(G, D_Y, X, Y) + L_{GAN}(F, D_X, Y, X) + \lambda_{cycle} L_{cycle}(G, F) + \lambda_{identity} L_{identity}(G, F) \tag{2.36} $$

    2.8 Rotation Angle Prediction Task (Self-Supervised Task)

    In the rotation angle prediction task, an image rotated by some angle is used as the input, and the network predicts by how many degrees the input has been rotated from the original [17]. Adding this task lets the network grasp rotation invariance and capture the geometric features of the image better.
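
    The input side of the rotation task can be sketched with `np.rot90`: every image is turned into four rotated copies, and the index of the rotation serves as a label that needs no human annotation. This is a minimal sketch; the helper name and the toy image are hypothetical, and the four angles 0°, 90°, 180°, 270° are an assumption for the illustration.

```python
import numpy as np

ANGLES = (0, 90, 180, 270)  # rotation classes, in degrees

def rotated_copies(image):
    """Return the image rotated by each angle and the matching class labels."""
    copies = [np.rot90(image, k) for k in range(len(ANGLES))]
    labels = list(range(len(ANGLES)))
    return copies, labels

image = np.arange(6).reshape(2, 3)  # hypothetical 2x3 single-channel image
copies, labels = rotated_copies(image)
for angle, copy in zip(ANGLES, copies):
    print(angle, copy.shape)
```

    `np.rot90` rotates counterclockwise in the plane of the first two axes, so the same helper also works for height x width x channel arrays.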
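  27. Figure 2.21: Structure of the Discriminator in Self-Supervised GAN

    Self-Supervised GAN introduces the rotation angle prediction task into a GAN [10]. Figure 2.21 shows the structure of its Discriminator. The Discriminator is divided into $D$, which decides whether the input image is real, and $D_{rot}$, which decides by how many degrees the input image has been rotated from the original. $D$ uses the unrotated real images and the unrotated images produced by the Generator, while $D_{rot}$ uses the real and generated images rotated by 0°, 90°, 180°, and 270°. Adding the rotation prediction task to the Discriminator, as in Figure 2.21, lets it judge generated images using the spatial layout of real images, which raises its discrimination accuracy. Because the Generator is trained adversarially against the Discriminator, the quality of the generated images can improve as well.

    The loss functions $L_D$, $L_G$ of Self-Supervised GAN are given by equations (2.37) and (2.38). $V(D, G)$ is the GAN loss function of equation (2.14), to which the rotation prediction task on real images, $L_{rot_D}$, and the rotation prediction task on generated images, $L_{rot_G}$, are added respectively.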
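  28. $$ L_D = V(D, G) + \lambda_d L_{rot_D} \tag{2.37} $$

    $$ L_G = V(D, G) - \lambda_g L_{rot_G} \tag{2.38} $$

    The rotation prediction task on real images, $L_{rot_D}$, and on generated images, $L_{rot_G}$, are given by equations (2.39) and (2.40). $\mathcal{T}$ is the set of rotation angles, $\mathcal{T} = \{0°, 90°, 180°, 270°\}$. With a rotation angle $T \in \mathcal{T}$, the real images $x, y$ rotated by $T$ are written $x^T, y^T$.

    Self-Supervised GAN adds $D_{rot}$, which predicts the rotation angle, to the Discriminator and trains the Discriminator to predict the angle correctly. The more accurate the prediction becomes, the larger the value of $D_{rot}(x)$.

    $$ L_{rot_D} = \mathbb{E}_x \mathbb{E}_T \log(D_{rot}(x^T)) \tag{2.39} $$

    $$ L_{rot_G} = \mathbb{E}_x \mathbb{E}_T \log(D_{rot}(x^T)) \tag{2.40} $$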
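
    Equations (2.37) to (2.40) can be evaluated numerically once the rotation head has produced a probability for each angle. The sketch below assumes that $D_{rot}(x^T)$ is the probability assigned to the true angle; the probability table, the value of $V(D, G)$, and the function names are hypothetical.

```python
import numpy as np

def rotation_log_likelihood(probs, labels):
    """Average of log D_rot(x^T) over images and angles, as in (2.39)/(2.40).

    probs[i] is the predicted distribution over the four angles for sample i,
    labels[i] is the index of the true angle (an assumed representation).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(np.log(picked)))

def discriminator_loss(v, l_rot_d, lambda_d):
    """Equation (2.37): L_D = V(D, G) + lambda_d * L_rotD."""
    return v + lambda_d * l_rot_d

def generator_loss(v, l_rot_g, lambda_g):
    """Equation (2.38): L_G = V(D, G) - lambda_g * L_rotG."""
    return v - lambda_g * l_rot_g

# Hypothetical predictions for two rotated samples with true labels 0 and 1.
probs = [[0.7, 0.1, 0.1, 0.1],
         [0.1, 0.7, 0.1, 0.1]]
l_rot = rotation_log_likelihood(probs, [0, 1])
print(l_rot, discriminator_loss(-1.0, l_rot, 1.0), generator_loss(-1.0, l_rot, 0.2))
```

    As the predicted probability of the true angle approaches 1, the log-likelihood rises toward 0, matching the remark that $D_{rot}$ grows as the prediction improves.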
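  29. Chapter 3 Proposed Method

    This study proposes Self-Supervised Attention-Guided GAN (SSAttention-Guided GAN), which adds a task of predicting the rotation angle of a rotated image to the Discriminator of the Attention-Guided GAN of Section 2.7.2. SSAttention-Guided GAN is evaluated by comparison with that Attention-Guided GAN from prior work.

    For translation between two image domains $X$ and $Y$, SSAttention-Guided GAN uses two Generators, $G_{X \to Y} : X \to Y$ and $G_{Y \to X} : Y \to X$. The images $x \in X$ and $y \in Y$ serve as real images for the Discriminators $D_X$ and $D_Y$. $D_X$ distinguishes a real image $x$ of domain $X$ from a generated image $G_{Y \to X}(y)$, and $D_Y$ distinguishes a real image $y$ of domain $Y$ from a generated image $G_{X \to Y}(x)$.

    In the rotation angle prediction task, $\mathcal{T} = \{0°, 90°, 180°, 270°\}$ is the set of rotation angles. With a rotation angle $T \in \mathcal{T}$, the real images $x, y$ rotated by $T$ are written $x^T, y^T$, and the generated images $G_{X \to Y}(x), G_{Y \to X}(y)$ rotated by $T$ are written $G^T_{X \to Y}(x), G^T_{Y \to X}(y)$. The Discriminators that predict the rotation angle are $D^{rot}_X$ and $D^{rot}_Y$. $D^{rot}_X$ predicts the rotation angle of the real image $x^T$ and of the generated image $G^T_{Y \to X}(y)$, and $D^{rot}_Y$ predicts the rotation angle of the real image $y^T$ and of the generated image $G^T_{X \to Y}(x)$.

    3.1 Network Structure

    The network structure of SSAttention-Guided GAN follows that of the Attention-Guided GAN of Section 2.7.2 [9]. Figure 3.1 shows the structure. The rotation angle prediction task, with the same structure as in Figure 2.21, is added to each Discriminator of the CycleGAN structure shown in Figure 2.18.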
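  30. Discriminator $D_X$ decides whether its input is a real image $x$ or a generated image $G_{Y \to X}(y)$, and also estimates the rotation angle of $x^T$ and $G^T_{Y \to X}(y)$. Likewise, Discriminator $D_Y$ decides whether its input is a real image $y$ or a generated image $G_{X \to Y}(x)$, and also estimates the rotation angle of $y^T$ and $G^T_{X \to Y}(x)$.

    Each Generator uses Attention-Guided Generator Scheme II, which outputs $n$ Attention Masks and $n - 1$ Content Masks and combines them with the input image to form the output [9]. The Generator network consists of 3 convolutional layers, 9 Residual Blocks [18], and 3 transposed convolutional layers [7] [9]. Residual Blocks were introduced to improve the training accuracy of deep neural networks: the input is passed through two consecutive convolutional layers and added to their output, and the network built by chaining such blocks is called ResNet [18] [12].

    Each Discriminator consists of 5 convolutional layers and finally outputs a tensor with 512 channels [7]. In this study, two outputs are derived from that final 512-channel layer: one with 1 channel and one with 4 channels. The former is used to decide whether the input image is real, and the latter is used to predict the rotation angle.

    3.2 Loss Functions

    Like CycleGAN, the Attention-Guided GAN of Section 2.7.2 uses an adversarial loss and a cycle consistency loss. SSAttention-Guided GAN adds a rotation angle prediction loss (Rotation Loss) to each adversarial loss.

    3.2.1 Adversarial Loss

    The adversarial loss is divided into the adversarial loss of the Discriminators and that of the Generators. The loss incurred by each Discriminator in its real/fake decision is the Discriminator's adversarial loss. The adversarial losses $L_{GAN_{D_X}}, L_{GAN_{D_Y}}$ of Discriminators $D_X, D_Y$ are given by equations (3.1) and (3.2).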
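  31. Figure 3.1: Structure of SSAttention-Guided GAN

    In this study, $L_{GAN_{D_X}}$ and $L_{GAN_{D_Y}}$ use a least-squares formulation instead of cross entropy, which keeps training stable [11].

    $$ L_{GAN_{D_X}} = \frac{1}{2} \mathbb{E}_x[(D_X(x) - 1)^2] + \frac{1}{2} \mathbb{E}_y[(D_X(G_{Y \to X}(y)))^2] \tag{3.1} $$

    $$ L_{GAN_{D_Y}} = \frac{1}{2} \mathbb{E}_y[(D_Y(y) - 1)^2] + \frac{1}{2} \mathbb{E}_x[(D_Y(G_{X \to Y}(x)))^2] \tag{3.2} $$

    The adversarial losses $L_{GAN_{G_{Y \to X}}}, L_{GAN_{G_{X \to Y}}}$ of the Generators $G_{Y \to X}, G_{X \to Y}$ are defined by equations (3.3) and (3.4).

    $$ L_{GAN_{G_{Y \to X}}} = \frac{1}{2} \mathbb{E}_y[(D_X(G_{Y \to X}(y)) - 1)^2] \tag{3.3} $$

    $$ L_{GAN_{G_{X \to Y}}} = \frac{1}{2} \mathbb{E}_x[(D_Y(G_{X \to Y}(x)) - 1)^2] \tag{3.4} $$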
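  32. The adversarial loss proposed in this study adds a rotation angle prediction loss to each adversarial loss. Equations (3.5) and (3.6) define the Discriminator losses obtained by adding the rotation prediction losses $L_{rot_{D_X}}, L_{rot_{D_Y}}$ to $L_{GAN_{D_X}}, L_{GAN_{D_Y}}$. $L_{rot_{D_X}}, L_{rot_{D_Y}}$ are defined by equations (3.7) and (3.8), following equation (2.39) of Section 2.8. To keep training stable, they use the squared error instead of cross entropy. Each is multiplied by a hyperparameter, $\lambda_{D_X}$ or $\lambda_{D_Y}$, that balances the loss terms.

    $$ L_{D_X} = L_{GAN_{D_X}} + \lambda_{D_X} L_{rot_{D_X}} \tag{3.5} $$

    $$ L_{D_Y} = L_{GAN_{D_Y}} + \lambda_{D_Y} L_{rot_{D_Y}} \tag{3.6} $$

    $$ L_{rot_{D_X}} = \mathbb{E}_x \mathbb{E}_T[(D^{rot}_X(x^T) - 1)^2] \tag{3.7} $$

    $$ L_{rot_{D_Y}} = \mathbb{E}_y \mathbb{E}_T[(D^{rot}_Y(y^T) - 1)^2] \tag{3.8} $$

    Equations (3.9) and (3.10) define the Generator losses obtained by adding the rotation prediction losses $L_{rot_{G_{Y \to X}}}, L_{rot_{G_{X \to Y}}}$ to $L_{GAN_{G_{Y \to X}}}, L_{GAN_{G_{X \to Y}}}$. $L_{rot_{G_{Y \to X}}}, L_{rot_{G_{X \to Y}}}$ are defined by equations (3.11) and (3.12), following equation (2.40) of Section 2.8. To keep training stable, they also use the squared error instead of cross entropy. Each is multiplied by a hyperparameter, $\lambda_{G_{Y \to X}}$ or $\lambda_{G_{X \to Y}}$, that balances the loss terms.

    $$ L_{G_{Y \to X}} = L_{GAN_{G_{Y \to X}}} + \lambda_{G_{Y \to X}} L_{rot_{G_{Y \to X}}} \tag{3.9} $$

    $$ L_{G_{X \to Y}} = L_{GAN_{G_{X \to Y}}} + \lambda_{G_{X \to Y}} L_{rot_{G_{X \to Y}}} \tag{3.10} $$

    $$ L_{rot_{G_{Y \to X}}} = \mathbb{E}_y \mathbb{E}_T[(D^{rot}_X(G^T_{Y \to X}(y)) - 1)^2] \tag{3.11} $$

    $$ L_{rot_{G_{X \to Y}}} = \mathbb{E}_x \mathbb{E}_T[(D^{rot}_Y(G^T_{X \to Y}(x)) - 1)^2] \tag{3.12} $$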
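
    The least-squares losses of equations (3.1), (3.3), (3.5), and (3.7) reduce to a few array operations once the Discriminator scores are available. The sketch below is an illustration under assumptions, not the thesis implementation: the scores are hypothetical numbers, and `d_rot_true` is taken to be the score the rotation head gives to the true angle.

```python
import numpy as np

def d_adv_loss(d_real, d_fake):
    """Equation (3.1): push scores of real images to 1 and of generated images to 0."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def g_adv_loss(d_fake):
    """Equation (3.3): the Generator wants its images scored as 1."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

def rot_loss(d_rot_true):
    """Equation (3.7): squared error between the true-angle score and 1."""
    return np.mean((d_rot_true - 1.0) ** 2)

def add_rotation(adv, rot, lam):
    """Equations (3.5)/(3.9): adversarial loss plus weighted rotation loss."""
    return adv + lam * rot

# Hypothetical scores for a batch of two images.
d_real = np.array([1.0, 1.0])
d_fake = np.array([0.0, 0.0])
d_rot_true = np.array([1.0, 0.5])

l_dx = add_rotation(d_adv_loss(d_real, d_fake), rot_loss(d_rot_true), lam=0.5)
print(l_dx, g_adv_loss(d_fake))
```

    With a perfect real/fake decision the adversarial part is 0, so the remaining Discriminator loss comes entirely from the rotation term.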
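  33. 3.2.2 Cycle Consistency Loss

    The cycle consistency loss is used in CycleGAN and Attention-Guided GAN so that the Generators $G_{X \to Y}$ and $G_{Y \to X}$ do not contradict each other [7] [8] [9]. This study uses the same cycle consistency loss $L_{cycle}$ as prior work, given by equation (3.13).

    $$ L_{cycle} = \mathbb{E}_x[\| G_{Y \to X}(G_{X \to Y}(x)) - x \|_1] + \mathbb{E}_y[\| G_{X \to Y}(G_{Y \to X}(y)) - y \|_1] \tag{3.13} $$

    3.2.3 Final Loss Function

    The loss functions of the Discriminators are equations (3.5) and (3.6). The loss function of the Generators combines the Generator adversarial losses with the cycle consistency loss, as in equation (3.14), where $\lambda_{cycle}$ is a hyperparameter that balances the loss terms.

    $$ L_G = L_{G_{X \to Y}} + L_{G_{Y \to X}} + \lambda_{cycle} L_{cycle} \tag{3.14} $$

    This study also introduces the identity loss $L_{identity}$. The identity loss preserves the color tone of each image domain and is used in the same way in CycleGAN and Attention-Guided GAN [7] [9]. It is given by equation (3.15).

    $$ L_{identity} = \mathbb{E}_x[\| G_{Y \to X}(x) - x \|_1] + \mathbb{E}_y[\| G_{X \to Y}(y) - y \|_1] \tag{3.15} $$

    Adding the identity loss to equation (3.14) gives the final Generator loss function used in this study, equation (3.16), where $\lambda_{identity}$ is a hyperparameter that balances the loss terms.

    $$ L_G = L_{G_{X \to Y}} + L_{G_{Y \to X}} + \lambda_{cycle} L_{cycle} + \lambda_{identity} L_{identity} \tag{3.16} $$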
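
    How the terms of equations (3.13), (3.15), and (3.16) combine can be seen with toy generators. Everything in this sketch is hypothetical: the two lambda functions stand in for $G_{X \to Y}$ and $G_{Y \to X}$, the adversarial terms are placeholder numbers, and the L1 norm is summed per image and averaged over the batch.

```python
import numpy as np

def l1(a, b):
    """Mean over the batch of the per-image L1 norm ||a - b||_1."""
    return float(np.abs(a - b).reshape(len(a), -1).sum(axis=1).mean())

def cycle_loss(g_xy, g_yx, x, y):
    """Equation (3.13)."""
    return l1(g_yx(g_xy(x)), x) + l1(g_xy(g_yx(y)), y)

def identity_loss(g_xy, g_yx, x, y):
    """Equation (3.15)."""
    return l1(g_yx(x), x) + l1(g_xy(y), y)

def generator_total(l_g_xy, l_g_yx, l_cycle, l_identity,
                    lambda_cycle=10.0, lambda_identity=0.5):
    """Equation (3.16)."""
    return l_g_xy + l_g_yx + lambda_cycle * l_cycle + lambda_identity * l_identity

# Toy stand-ins for the two Generators (assumptions, exact inverses of each other).
g_xy = lambda img: img + 1.0
g_yx = lambda img: img - 1.0

x = np.zeros((2, 4))  # batch of two 4-pixel "images" from domain X
y = np.ones((2, 4))   # batch from domain Y

l_cyc = cycle_loss(g_xy, g_yx, x, y)
l_id = identity_loss(g_xy, g_yx, x, y)
total = generator_total(0.25, 0.25, l_cyc, l_id)
print(l_cyc, l_id, total)
```

    Because the toy generators invert each other the cycle term vanishes, while the identity term penalizes them for altering images that already belong to their output domain.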
  34. Chapter 4 Experiments

    This study compares the prior-work Attention-Guided CycleGAN that uses Attention-Guided Generator Scheme II [9] with the proposed SSAttention-Guided GAN.

    4.1 Parameter Settings

    The parameters are the same as those of the prior-work Attention-Guided GAN [9]: a batch size of 4, 60 epochs, a learning rate of 0.0002, and 10 Attention Masks.

    For the loss function, $\lambda_{cycle}$ is 10.0 and $\lambda_{identity}$ is 0.5. The rotation prediction parameters are fixed as in Table 4.1 so that the ratio of the Discriminator parameters $\lambda_{D_X}, \lambda_{D_Y}$ to the Generator parameters $\lambda_F, \lambda_G$ is 5 : 1. In prior work, the balance parameter of the rotation loss is set to 1.0 for the Discriminator and 0.2 for the Generator [10].

    Table 4.1: Parameters for rotation angle prediction

      Discriminator rotation loss    Generator rotation loss
      λDX      λDY                   λF       λG
      0.5      0.5                   0.1      0.1
      1.0      1.0                   0.2      0.2
      1.5      1.5                   0.3      0.3
      2.0      2.0                   0.4      0.4
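  35. 4.2 Datasets

    This study uses the five datasets used in prior work, summarized in Table 4.2.

    4.3 Evaluation Method

    The evaluation consists of a quantitative evaluation with the Fréchet Inception Distance (FID) and a qualitative evaluation of the generated images.

    4.3.1 Fréchet Inception Distance (FID)

    The Fréchet Inception Distance (FID) is an evaluation method based on the distance between the distributions of two sets of images [19]. That distance is the FID, and the smaller it is, the closer the generated images are to real images. For a GAN, the FID is the Fréchet distance between the distribution $p_{data}$ of the real image dataset and the distribution $p_g$ of the generated image dataset [19]. It is computed as in equation (4.1) from the mean vector and covariance matrix $(\mu_r, C_r)$ of $p_{data}$ and the mean vector and covariance matrix $(\mu_g, C_g)$ of $p_g$.

    $$ \mathrm{FID}(p_{data}, p_g) = \| \mu_r - \mu_g \|^2 + \mathrm{tr}\left( C_r + C_g - 2 (C_r C_g)^{1/2} \right) \tag{4.1} $$

    Following prior work, this study uses the images translated on the horse2zebra and apple2orange datasets and evaluates real and generated images quantitatively with FID [9].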
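
    The distance of equation (4.1) can be sketched in NumPy once the mean vectors and covariance matrices are known. This is only the final formula: a real FID computation first extracts Inception features from the images, which is omitted here. The sketch uses the squared Euclidean norm of the standard FID definition and obtains the trace of the matrix square root from eigenvalues; the helper names and toy statistics are assumptions.

```python
import numpy as np

def gaussian_stats(features):
    """Mean vector and covariance matrix of a (samples x dims) feature array."""
    return features.mean(axis=0), np.cov(features, rowvar=False)

def fid(mu_r, c_r, mu_g, c_g):
    """||mu_r - mu_g||^2 + tr(C_r + C_g - 2 (C_r C_g)^(1/2))."""
    diff = mu_r - mu_g
    # tr((C_r C_g)^(1/2)) is the sum of the square roots of the eigenvalues of
    # C_r C_g, which are real and non-negative for positive semi-definite inputs.
    eigvals = np.linalg.eigvals(c_r @ c_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(c_r) + np.trace(c_g) - 2.0 * tr_sqrt)

# Hypothetical statistics: identical covariances, means one unit apart.
eye = np.eye(2)
same = fid(np.zeros(2), eye, np.zeros(2), eye)
shifted = fid(np.zeros(2), eye, np.array([1.0, 0.0]), eye)
print(same, shifted)
```

    Identical distributions give a distance of 0, and shifting one mean by a unit vector while keeping the covariances equal gives 1, so smaller values indicate closer distributions.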
  36. Table 4.2: Datasets used

      Dataset name                          Split   Contents    Size
      horse2zebra (horse → zebra)           train   horse       1067
                                            train   zebra       1334
                                            test    horse       120
                                            test    zebra       140
      apple2orange (apple → orange)         train   apple       995
                                            train   orange      1019
                                            test    apple       266
                                            test    orange      248
      facades (building facade → label)     train   facade      400
                                            train   label       400
                                            test    facade      106
                                            test    label       106
      map2photo (map → aerial photo)        train   map         1096
                                            train   photo       1096
                                            test    map         1098
                                            test    photo       1098
      cityscapes (cityscape → label)        train   cityscape   2975
                                            train   label       2975
                                            test    cityscape   500
                                            test    label       500
  37. Chapter 5 Results and Discussion

    SSAttention-Guided GAN was implemented on top of the program of the prior-work Attention-Guided GAN with Attention-Guided Generator Scheme II, and the prior work and the proposed method were evaluated.

    5.1 Results

    5.1.1 Quantitative Evaluation

    Table 5.1 shows the quantitative evaluation of horse2zebra and apple2orange with FID. It lists the FID of horse to zebra and apple to orange for each setting of the rotation parameters. The rotation parameters $\lambda_D$ and $\lambda_G$ stand for $\lambda_{D_X}, \lambda_{D_Y}$ and $\lambda_{G_{X \to Y}}, \lambda_{G_{Y \to X}}$ respectively. The setting $(\lambda_D, \lambda_G) = (0.0, 0.0)$ is the prior-work Attention-Guided GAN [9].

    For both horse to zebra and apple to orange, the FID at $(\lambda_D, \lambda_G) = (0.5, 0.1)$ and $(2.0, 0.4)$ is lower than that of the prior-work Attention-Guided GAN. At $(\lambda_D, \lambda_G) = (1.5, 0.3)$, the FID of apple to orange is comparatively low while that of horse to zebra is comparatively high. At $(\lambda_D, \lambda_G) = (1.0, 0.2)$, the FID is comparatively high for both.
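  38. Table 5.1: Quantitative evaluation with FID

      Rotation parameters (λD, λG)   horse to zebra   apple to orange
      (0.0, 0.0)                     79.64            156.88
      (0.5, 0.1)                     54.58            149.88
      (1.0, 0.2)                     117.60           158.46
      (1.5, 0.3)                     83.37            154.71
      (2.0, 0.4)                     66.87            155.53

    5.1.2 Qualitative Evaluation

    This section describes the generated images and the Attention obtained after translation on each dataset: horse2zebra, apple2orange, map2photo, facades, and cityscapes.

    Tables 5.2 and 5.3 show examples of horse to zebra translation, and Tables 5.4 and 5.5 show examples of apple to orange translation. In the horse2zebra examples, the translated image and the Attention of the conventional Attention-Guided GAN show stripes applied to the background as well. With SSAttention-Guided GAN, the stripes on the background decrease in both examples and the Attention takes the shape of the translation target. In Table 5.2, at $(\lambda_D, \lambda_G) = (0.5, 0.1)$ and $(1.0, 0.2)$ the striping of the upper half of the background seen with the prior-work Attention-Guided GAN has almost disappeared, while the larger settings show other artifacts. In Table 5.3 as well, after the rotation prediction task is added the Attention takes the shape of the horse to be translated, and some regions are translated into zebra comparatively well.

    In Table 5.4, which shows translation results on apple2orange, at $(\lambda_D, \lambda_G) = (0.5, 0.1)$ an Attention Mask appears on the far part of the scene, which was not seen with the prior-work Attention-Guided GAN, and the apples at the back are translated. In other cases, as in Table 5.5, little change is seen between before and after adding the rotation prediction task.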
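  39. For the remaining datasets, Table 5.6 shows an example of map to photo translation, Table 5.7 an example of translation from building facade to label, and Table 5.8 an example of translation from cityscape to segmentation. On map2photo, none of the rotation parameter settings changed the Attention or the generated image much compared with prior work. On facades, as Table 5.7 shows in particular, the labels in the generated image look three-dimensional at $(\lambda_D, \lambda_G) = (1.5, 0.3)$. On cityscapes, as in Table 5.8, translation from objects such as cars and buildings to labels is comparatively more accurate than with the prior-work Attention-Guided GAN, and each label takes the shape of the car or building. At $(\lambda_D, \lambda_G) = (1.0, 0.2)$, however, three white dots appear in the Attention Mask; they correspond to parts of the generated image that were left unchanged.

    Next, the generated images themselves are evaluated qualitatively for horse2zebra and apple2orange. Figures 5.1, 5.2, 5.3, 5.4, and 5.5 show sampled horse2zebra images for each rotation parameter setting. Comparing these samples, most generated images show less effect on the background than the images translated by the prior-work Attention-Guided GAN. Which setting improves the background, however, differs from image to image. There are also regions where the effect on the background grew compared with the prior-work Attention-Guided GAN, and regions where the stripes were not applied to the target.

    Figures 5.6, 5.7, 5.8, 5.9, and 5.10 show sampled apple2orange images for each rotation parameter setting. Some generated images did not change at all, but on the whole, compared with the prior-work Attention-Guided GAN, more images had the entire target translated at $(\lambda_D, \lambda_G) = (0.5, 0.1)$ and $(1.5, 0.3)$. Red apples were also distinguished from apples of other colors and from the inside of cut apples during translation.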
  40. Sampled images for each rotation parameter setting are shown for map2photo in Figures 5.11, 5.12, 5.13, 5.14, and 5.15, for facades in Figures 5.16, 5.17, 5.18, 5.19, and 5.20, and for cityscapes in Figures 5.21, 5.22, 5.23, 5.24, and 5.25. On map2photo, the color of the image changed in places for some settings, but the spatial layout did not change. On facades, at $(\lambda_D, \lambda_G) = (1.5, 0.3)$ many labels looked three-dimensional, like the source building facade; for the other settings the results differed little from the prior-work Attention-Guided GAN. On cityscapes, especially at $(\lambda_D, \lambda_G) = (1.0, 0.2)$ and $(2.0, 0.4)$, the blue car labels and green building labels were reflected more accurately in parts of the generated images. The red human-silhouette regions seen with the prior-work Attention-Guided GAN decreased comparatively. In addition, at $(\lambda_D, \lambda_G) = (0.5, 0.1)$ and $(1.5, 0.3)$ the pink regions increased compared with the images generated by the prior-work Attention-Guided GAN.

    5.2 Discussion

    5.2.1 Quantitative Evaluation

    Depending on the parameters, images generated by SSAttention-Guided GAN achieve a better FID than those of the prior-work Attention-Guided GAN. In Table 5.1, the setting $(\lambda_D, \lambda_G) = (0.5, 0.1)$ gives the lowest value for both horse to zebra and apple to orange. The rotation prediction task therefore contributes to improving the FID, and $(\lambda_D, \lambda_G) = (0.5, 0.1)$ appears to be the best rotation parameter setting.

    5.2.2 Qualitative Evaluation
  41. ୈ 5 ষ ࣮ݧ݁Ռͱߟ࡯ 42 ͸എܠ΁ͷӨڹ͕վળͨ͠ࣄྫ͕ݟΒΕͨɽͦΕʹ൐͍ɼAttention ΋ݩͷഅͷ෦෼ ʹ͍ۙܗʹͳ͓ͬͯΓɼճస֯౓༧ଌλεΫʹΑͬͯɼݩͷഅͷزԿతಛ௃΍Ґஔ ؔ܎Λ͖ͪΜͱ೺ѲͰ͖͍ͯΔͱߟ͑ΒΕΔɽ ͨͩ͠ɼճస֯ύϥϝʔλʹΑͬͯ͸֤ม׵ը૾ͷม׵ͷਫ਼౓͕ͦΕͧΕҧͬͯ

    ͍Δ͜ͱ͕֬ೝ͞Ε͓ͯΓɼੜ੒ը૾ʹΑͬͯద੾ͳճస֯ύϥϝʔλ͸ҟͳΔ͜ ͱΛҙຯ͍ͯ͠Δͱߟ͑ΒΕΔɽ࣮ࡍʹഅ͕ॏͳ͍ͬͯͨΓɼਓؒͱ͍ͬͨഅҎ֎ ͷ෺ମ͕Ҡ͍ͬͯΔը૾͸ɼҰ؏ࣶͯ͠໛༷͕͏·͍͍ͭͯ͘ͳ͔ͬͨɽ apple2orangeͷఆੑతධՁʹ͓͍ͯɼ ճస֯ύϥϝʔλ (λD , λG ) = (0.5, 0.1), (1.5, 0.3) ͷͱ͖Ξοϓϧͷ৭ΛΦϨϯδʹ͢Δ texture ม׵͸҆ఆ্ͯ͠ख͘ߦΘΕ͍ͯΔɽ ͦΕʹՃ͑ͯผͷ৭ͷΞοϓϧ΍Ξοϓϧͷஅ໘ɼ༿ͳͲΛ۠ผͯ͠ը૾ม׵Λߦͬ ͍ͯΔ͜ͱ͔Βɼճస֯౓ͷ༧ଌʹΑͬͯΞοϓϧͷಛ௃͕೺ѲͰ͖͍ͯΔͱߟ͑ ΒΕΔɽͳ͓ม׵ݩͷΞοϓϧͷதͰ΋ɼΦϨϯδ৭ʹ͍ۙ΋ͷ͕ଘࡏ͍ͯ͠Δ͕ɼ ͦΕΒ͸͋·ΓมԽ͕ݟΒΕͳ͔ͬͨɽཧ༝ͱͯ͠͸ɼ੺͍ΞοϓϧΛֶश͍ͯ͠ ΔͨΊɼผͷ৭ͷΞοϓϧͱೝࣝͯ͠͠·͍ͬͯΔ͜ͱ͕ߟ͑ΒΕΔɽ ͦͷଞͷσʔληοτʹ͍ͭͯ͸ɼ·ͣ map2photo Ͱ͸ઌߦݚڀͱൺ΂ɼͲͷճ స֯ύϥϝʔλ΋͋·ΓมԽ͕ݟΒΕͳ͔ͬͨɽݪҼͱͯ͠͸ɼม׵ݩͷ஍ਤͷํ ֯ͱม׵ઌͷӴ੕ࣸਅͷը૾಺ͷํ֯Λͦͷ··ʹͯ͠ը૾ม׵͢ΔͨΊɼվΊͯ ճస֯౓༧ଌΛͯ͠ํ֯ͱ͍ͬͨಛ௃Λ೺Ѳ͢Δඞཁ͕ͳ͍ͱ͍͏͜ͱ͕ڍ͛ΒΕ Δɽfacades ͸ಛʹද 5.7 ͷΑ͏ʹɼ(λD , λG ) = (1.5, 0.3) ͷੜ੒ը૾ͷϥϕϧཱ͕ମ తʹͳ͍ͬͯΔ͜ͱ͕ଟ਺֬ೝ͞Ε͍ͯͨɽcityscapes Ͱ΋ઌߦݚڀͱൺ΂ɼं΍ݐ ෺ϥϕϧ͕ΑΓਖ਼֬ʹ͍͍ͭͯΔՕॴ͕ଘࡏ͍ͯ͠Δɽ͜ΕΒͷ݁ՌΑΓɼfacades ΍ cityscapes σʔληοτʹ͍ͭͯ΋ճస֯౓༧ଌλεΫΛಋೖ͢Δ͜ͱͰը૾಺ ͷಛ௃Λ͔ͭΉ͜ͱ͕Ͱ͖͍ͯΔͱߟ͑ΒΕΔɽ 5.2.3 ఆྔతධՁͱఆੑతධՁͷൺֱ horse2zebra ͷఆྔతධՁͱఆੑతධՁΛൺֱ͢Δͱɼੜ੒ը૾ͱ FID ͷ਺஋ʹ૬ ؔੑ͸ͳ͘ɼFID ͷ਺஋ͷେ͖͞ʹ͔͔ΘΒͣɼੜ੒ը૾ͷਫ਼౓͕ͦΕͧΕҟͳͬ
  42. ୈ 5 ষ ࣮ݧ݁Ռͱߟ࡯ 43 ͍ͯΔ͜ͱ͕෼͔Δɽྫ͑͹ horse2zebra ͷճస֯ύϥϝʔλ (λD ,

    λG ) = (2.0, 0.4) ͷ ͱ͖ͷ FID ͸ઌߦݚڀͱൺ΂ͯվળ͞Ε͍͕ͯͨɼද 5.2 ͱরΒ͠߹ΘͤͯݟͯΈ Δͱɼ৽ͨʹม׵͍ͨ͠ର৅Ҏ֎ͷ෦෼Λר͖ࠐΜͰ͠·͍ͬͯΔͱ͍ͬͨࣄྫ΋ ֬ೝ͞Εͨɽ͜ͷݪҼͷҰͭͱͯ͠ɼhorse2zebra σʔληοτͷதʹ͸ਓؒ΍૲ݪ ͕͍ࣸͬͯͨΓɼഅ͕ॏͳ͍ͬͯͨΓͯ͠ɼ࣮ࡍʹഅ͔ΒγϚ΢Ϛ΁ͷม׵্͕ख ͍͔͘ͳ͔ͬͨ΋ͷͷଘࡏ͕ڍ͛ΒΕΔɽhorse2zebra σʔληοτதͷਓؒ΍૲ݪɼ ̎ͭͷॏͳͬͨഅͷֶश͕ FID ʹӨڹΛ༩͓͑ͯΓɼੜ੒ը૾ͷਫ਼౓ʹ͕ࠩग़͍ͯ ΔͷͰ͸ͳ͍͔ͱߟ͑ΒΕΔɽ ରͯ͠ɼapple2orange Ͱ͸ɼFID ͷ਺஋͕ྑ͍ճస֯ύϥϝʔλ΄ͲఆੑతධՁͷ ը૾ม׵্͕ख͘ߦΘΕ͍ͯͨɽ ྫ͑͹ճస֯ύϥϝʔλ (λD , λG ) = (0.5, 0.1), (1.5, 0.3), (2.0, 0.4) ͷͱ͖ͷ FID ͸ઌߦݚڀͱൺ΂ͯྑ͍਺஋Ͱ͋ΓɼఆੑతධՁ΋੺͍Ξοϓ ϧ͔ΒΦϨϯδͷΞοϓϧ΁ͷม׵্͕ख͘Ͱ͖͍ͯͨɽ͕ͨͬͯ͠ɼapple2orange ʹ͓͍ͯ͸ FID ͱੜ੒ը૾ͷ૬ؔੑ͸΄ͱΜͲݟΒΕΔͱ݁࿦෇͚ΒΕΔɽ
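Since the discussion above hinges on how FID behaves, a minimal sketch of the underlying Fréchet distance may help. It assumes that feature vectors (in standard FID, Inception activations) have already been extracted for the real and generated images; the function name `frechet_distance` and the random toy features are illustrative and are not taken from the thesis code.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (num_samples, feature_dim), feature_dim >= 2.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Toy check with random stand-in features (hypothetical data, not thesis results).
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 8))
same = frechet_distance(feats, feats)           # identical sets: close to 0
shifted = frechet_distance(feats, feats + 3.0)  # mean shift only: close to 8 * 3**2
```

Because the distance compares only the mean and covariance of pooled features, a lower value does not guarantee that each individual image was translated well, which is consistent with the horse2zebra observation above.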
Table 5.2: Results of horse to zebra translation on the horse2zebra dataset (1). Columns: (λD, λG), horse, zebra, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Table 5.3: Results of horse to zebra translation on the horse2zebra dataset (2). Columns: (λD, λG), horse, zebra, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Table 5.4: Results of apple to orange translation on the apple2orange dataset (1). Columns: (λD, λG), apple, orange, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Table 5.5: Results of apple to orange translation on the apple2orange dataset (2). Columns: (λD, λG), apple, orange, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Table 5.6: Results of map to photo translation on the map2photo dataset. Columns: (λD, λG), map, photo, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Table 5.7: Results of translation from building exteriors to labels using the facades dataset. Columns: (λD, λG), building exterior, label, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).

Table 5.8: Results of translation from cityscapes to segmentation using the cityscapes dataset. Columns: (λD, λG), cityscape, segmentation, Attention Mask. Rows: (0.0, 0.0), (0.5, 0.1), (1.0, 0.2), (1.5, 0.3), (2.0, 0.4).
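The Attention Mask column in Tables 5.2 to 5.8 is easier to read with the blending step in mind. The following is a minimal sketch of the single-mask blending used in attention-guided generators, output = attention × content + (1 − attention) × source. The names `blend_with_attention` and `changed_fraction` are hypothetical, and the mask polarity (1 means "use the generated content") is an assumption of this sketch; the masks visualized in the thesis may use the opposite convention.

```python
import numpy as np

def blend_with_attention(source, content, attention):
    """Mix generated content into the source image, pixel by pixel.

    source, content: arrays of shape (H, W, C).
    attention: array of shape (H, W) with values in [0, 1]; in this sketch
    1 takes the generated content and 0 keeps the source pixel (assumed polarity).
    """
    attention = np.clip(attention, 0.0, 1.0)[..., None]  # broadcast over channels
    return attention * content + (1.0 - attention) * source

def changed_fraction(source, output, tol=1e-6):
    """Fraction of pixels whose value differs from the source by more than tol."""
    return float((np.abs(output - source).max(axis=-1) > tol).mean())

# Toy example: only the central 2x2 block of a 4x4 image is attended to.
source = np.zeros((4, 4, 3))
content = np.ones((4, 4, 3))
attention = np.zeros((4, 4))
attention[1:3, 1:3] = 1.0
output = blend_with_attention(source, content, attention)
fraction = changed_fraction(source, output)  # 4 of 16 pixels change
```

Under this reading, a mask that hugs the horse or apple leaves the background pixels identical to the input, which is what the qualitative evaluation describes as a reduced effect on the background.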
Figure 5.1: Generated images for horse2zebra at rotation-angle parameters (λD, λG) = (0.0, 0.0).
Figure 5.2: Generated images for horse2zebra at rotation-angle parameters (λD, λG) = (0.5, 0.1).
Figure 5.3: Generated images for horse2zebra at rotation-angle parameters (λD, λG) = (1.0, 0.2).
Figure 5.4: Generated images for horse2zebra at rotation-angle parameters (λD, λG) = (1.5, 0.3).
Figure 5.5: Generated images for horse2zebra at rotation-angle parameters (λD, λG) = (2.0, 0.4).
Figure 5.6: Generated images for apple2orange at rotation-angle parameters (λD, λG) = (0.0, 0.0).
Figure 5.7: Generated images for apple2orange at rotation-angle parameters (λD, λG) = (0.5, 0.1).
Figure 5.8: Generated images for apple2orange at rotation-angle parameters (λD, λG) = (1.0, 0.2).
Figure 5.9: Generated images for apple2orange at rotation-angle parameters (λD, λG) = (1.5, 0.3).
Figure 5.10: Generated images for apple2orange at rotation-angle parameters (λD, λG) = (2.0, 0.4).
Figure 5.11: Generated images for map2photo at rotation-angle parameters (λD, λG) = (0.0, 0.0).
Figure 5.12: Generated images for map2photo at rotation-angle parameters (λD, λG) = (0.5, 0.1).
Figure 5.13: Generated images for map2photo at rotation-angle parameters (λD, λG) = (1.0, 0.2).
Figure 5.14: Generated images for map2photo at rotation-angle parameters (λD, λG) = (1.5, 0.3).
Figure 5.15: Generated images for map2photo at rotation-angle parameters (λD, λG) = (2.0, 0.4).
Figure 5.16: Generated images for facades at rotation-angle parameters (λD, λG) = (0.0, 0.0).
Figure 5.17: Generated images for facades at rotation-angle parameters (λD, λG) = (0.5, 0.1).
Figure 5.18: Generated images for facades at rotation-angle parameters (λD, λG) = (1.0, 0.2).
Figure 5.19: Generated images for facades at rotation-angle parameters (λD, λG) = (1.5, 0.3).
Figure 5.20: Generated images for facades at rotation-angle parameters (λD, λG) = (2.0, 0.4).
Figure 5.21: Generated images for cityscapes at rotation-angle parameters (λD, λG) = (0.0, 0.0).
Figure 5.22: Generated images for cityscapes at rotation-angle parameters (λD, λG) = (0.5, 0.1).
Figure 5.23: Generated images for cityscapes at rotation-angle parameters (λD, λG) = (1.0, 0.2).
Figure 5.24: Generated images for cityscapes at rotation-angle parameters (λD, λG) = (1.5, 0.3).
Figure 5.25: Generated images for cityscapes at rotation-angle parameters (λD, λG) = (2.0, 0.4).
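The figures above are indexed by the rotation-angle parameters (λD, λG). One plausible reading, following the self-supervised GAN formulation of Chen et al. [10], is that these weight an auxiliary rotation-classification loss added to the discriminator and generator objectives. The sketch below illustrates that idea only: the function names, the four-rotation setup (0°, 90°, 180°, 270°), and the additive combination are assumptions, not the loss actually defined in the thesis.

```python
import numpy as np

def rotate_batch(images):
    """Return the batch rotated by 0/90/180/270 degrees, plus labels 0..3.

    images: array of shape (N, H, W, C) with H == W (square images assumed).
    """
    rotated = [np.rot90(images, k=k, axes=(1, 2)) for k in range(4)]
    labels = np.repeat(np.arange(4), images.shape[0])
    return np.concatenate(rotated, axis=0), labels

def rotation_cross_entropy(logits, labels):
    """Mean cross-entropy of 4-way rotation logits against rotation labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def total_loss(adversarial_loss, rotation_loss, lam):
    """Adversarial loss plus the rotation loss weighted by lam (lambda_D or lambda_G)."""
    return adversarial_loss + lam * rotation_loss

# Toy usage with a hypothetical batch and an untrained rotation head.
images = np.arange(2 * 4 * 4 * 3, dtype=float).reshape(2, 4, 4, 3)
batch, labels = rotate_batch(images)
uniform_loss = rotation_cross_entropy(np.zeros((8, 4)), labels)  # log(4) for uniform logits
d_loss = total_loss(1.0, uniform_loss, lam=0.5)
```

With lam set to 0.0 the rotation term vanishes, which would match the (0.0, 0.0) rows and figures serving as the baseline without the rotation-angle prediction task.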
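Chapter 6 Conclusion

This study built a program for SSAttention-Guided GAN, which adds a rotation-angle prediction task to the Discriminator of the prior work's Attention-Guided GAN, and attempted to improve the discrimination accuracy for generated images. The generated images and Attention Masks of the prior work's Attention-Guided GAN and of SSAttention-Guided GAN were compared, and quantitative and qualitative evaluations were carried out on whether the features of the translation target are reflected in the Attention and in the generated images.

In the quantitative evaluation using FID, a decrease in FID relative to the prior work's Attention-Guided GAN was confirmed. In this study, both horse2zebra and apple2orange tended to improve the most at rotation-angle parameters (λD, λG) = (0.5, 0.1).

In the qualitative evaluation, there were many cases in which the distorted Attention Masks and generated images observed with the prior work's Attention-Guided GAN improved after the rotation-angle prediction task was introduced. On horse2zebra, translation of non-target regions such as the background decreased after the task was introduced, and the corresponding Attention Masks were seen to settle into the shape of the horse. On apple2orange as well, some red apples that the prior work's Attention-Guided GAN did not translate well to orange were translated completely to orange after the task was introduced.

On apple2orange, the smaller the FID value, the better the red apples in the generated images tended to be translated to orange. This suggests that, on apple2orange, the FID value reflects the accuracy of the generated images. On horse2zebra, in contrast, the accuracy of the generated images varied regardless of the FID value. One factor is that the horse2zebra dataset contains a mix of people, grassland, and pairs of overlapping horses, and training on these images affects both the accuracy of the generated images and FID. In this study as well, regardless of whether rotation-angle prediction was added, …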
References

[1] Information-technology Promotion Agency, Japan (IPA), AI White Paper Editorial Committee: AI 白書 2019 (AI White Paper 2019), KADOKAWA (2019).

[2] Gui, J., Sun, Z., Wen, Y., Tao, D. and Ye, J.: A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications, CoRR, Vol. abs/2001.06937 (2020).

[3] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y.: Generative Adversarial Nets, in Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. and Weinberger, K. Q. eds., Advances in Neural Information Processing Systems, Vol. 27, Curran Associates, Inc. (2014).

[4] Radford, A., Metz, L. and Chintala, S.: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, in Bengio, Y. and LeCun, Y. eds., 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings (2016).

[5] Mirza, M. and Osindero, S.: Conditional Generative Adversarial Nets, CoRR, Vol. abs/1411.1784 (2014).

[6] Knyaz, V. A., Kniaz, V. V. and Remondino, F.: Image-to-Voxel Model Translation with Conditional Adversarial Networks, in Leal-Taixé, L. and Roth, S. eds., Computer Vision - ECCV 2018 Workshops - Munich, Germany, September 8-14, 2018, Proceedings, Part I, Vol. 11129 of Lecture Notes in Computer Science, pp. 601–618, Springer (2018).

[7] Zhu, J., Park, T., Isola, P. and Efros, A. A.: Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks, in IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pp. 2242–2251, IEEE Computer Society (2017).

[8] Tang, H., Xu, D., Sebe, N. and Yan, Y.: Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation, in International Joint Conference on Neural Networks, IJCNN 2019 Budapest, Hungary, July 14-19, 2019, pp. 1–8, IEEE (2019).

[9] Tang, H., Liu, H., Xu, D., Torr, P. H. S. and Sebe, N.: AttentionGAN: Unpaired Image-to-Image Translation using Attention-Guided Generative Adversarial Networks, CoRR, Vol. abs/1911.11897 (2019).

[10] Chen, T., Zhai, X., Ritter, M., Lucic, M. and Houlsby, N.: Self-Supervised GANs via Auxiliary Rotation Loss, in IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pp. 12154–12163, Computer Vision Foundation / IEEE (2019).

[11] Mao, X., Li, Q., Xie, H., Lau, R. Y. K., Wang, Z. and Smolley, S. P.: Least Squares Generative Adversarial Networks, in IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pp. 2813–2821, IEEE Computer Society (2017).

[12] Saito, K.: ゼロから作る Deep Learning: Python で学ぶディープラーニングの理論と実装 (Deep Learning from Scratch: Theory and Implementation of Deep Learning with Python), O'Reilly Japan (2017).

[13] Bishop, C. M.: パターン認識と機械学習 上 ベイズ理論による統計的予測 (Pattern Recognition and Machine Learning, Vol. 1, Japanese edition), Maruzen Publishing (2014).

[14] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł. and Polosukhin, I.: Attention is All you Need, in Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S. and Garnett, R. eds., Advances in Neural Information Processing Systems, Vol. 30, Curran Associates, Inc. (2017).

[15] Long, J., Shelhamer, E. and Darrell, T.: Fully Convolutional Networks for Semantic Segmentation, CoRR, Vol. abs/1411.4038 (2014).

[16] Yin, L., Wei, X., Sun, Y., Wang, J. and Rosato, M. J.: A 3D Facial Expression Database For Facial Behavior Research, in Seventh IEEE International Conference on Automatic Face and Gesture Recognition (FGR 2006), 10-12 April 2006, Southampton, UK, pp. 211–216, IEEE Computer Society (2006).

[17] Gidaris, S., Singh, P. and Komodakis, N.: Unsupervised Representation Learning by Predicting Image Rotations, in 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings, OpenReview.net (2018).

[18] He, K., Zhang, X., Ren, S. and Sun, J.: Deep Residual Learning for Image Recognition, CoRR, Vol. abs/1512.03385 (2015).

[19] Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B. and Hochreiter, S.: GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, in Guyon, I., von Luxburg, U., Bengio, S., Wallach, H. M., Fergus, R., Vishwanathan, S. V. N. and Garnett, R. eds., Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pp. 6626–6637 (2017).