Presentation slides for the 9th All-Japan Computer Vision Study Group (第9回全日本コンピュータビジョン勉強会): “StyleNeRF: A Style-based 3D Aware Generator for High-resolution Image Synthesis”

maguro27
May 15, 2022

This deck, presented at the 9th All-Japan Computer Vision Study Group, gives a fairly thorough walkthrough of “StyleNeRF: A Style-based 3D Aware Generator for High-resolution Image Synthesis”.

Transcript

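  1. NeRF (slide 10): NeRF renders images using a technique called volume rendering. At the risk of oversimplifying, volume rendering casts a ray from each viewpoint and draws only what is visible along that ray (the interior of an object cannot be seen, so it is not drawn). Following the equations makes this clear (the equations are taken from StyleNeRF [3]).

    Rendered image: the weight is p_w(t) = exp(−∫_0^t σ_w(r(s)) ds) · σ_w(r(t)), where σ_w(·) is the density and r(s) = o + s·d (0 ≤ s < t) is the ray; o is the origin of the viewpoint and d is the viewing direction in angular form (θ, φ). Qualitatively, the exp term becomes small once the ray has passed through a high-density region, so the contribution of the density σ_w(r(t)) also becomes small. This is how the invisibility of an object's interior is expressed.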
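  2. NeRF (slide 11): NeRF renders images using volume rendering, which casts a ray from each viewpoint and draws only what is visible along it (equations from StyleNeRF [3]).

    Rendered image: I_w(r) = ∫_0^∞ p_w(t) · c_w(r(t), d) dt. Because the radiance c_w(r(t), d) is weighted by p_w(t), the interior of an object does not appear in the rendered image. Here σ_w(·) is the density and r(s) = o + s·d (0 ≤ s < t) is the ray, with o the origin of the viewpoint and d the viewing direction (θ, φ). Where the accumulated density is high the exp term is small, so the contribution of σ_w(r(t)) is small as well, which expresses that the interior cannot be seen.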
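The weighting described on these two slides can be checked numerically. The following is a minimal sketch, assuming the standard discrete quadrature commonly used for NeRF-style rendering (per-sample opacity times accumulated transmittance) and a toy one-dimensional scene; the function name `ray_weights` and all scene values are illustrative, not taken from the slides.

```python
import numpy as np

def ray_weights(sigma, dt):
    """Discrete counterpart of p(t) = exp(-integral of sigma) * sigma(t).

    sigma: densities at evenly spaced samples along one ray
    dt:    spacing between samples
    """
    alpha = 1.0 - np.exp(-sigma * dt)                   # opacity of each sample
    optical_depth = np.cumsum(sigma * dt) - sigma * dt  # density accumulated *before* each sample
    transmittance = np.exp(-optical_depth)              # the "exp" term from the slide
    return transmittance * alpha

# Toy scene (assumed values): empty space, then a dense object between samples 100 and 300.
t = np.linspace(0.0, 4.0, 401)
dt = t[1] - t[0]
sigma = np.zeros_like(t)
sigma[100:300] = 50.0

# Radiance along the ray: bright near the surface, dark deeper inside the object.
color = np.zeros_like(t)
color[100:150] = 1.0

weights = ray_weights(sigma, dt)
pixel = float(np.sum(weights * color))

print("weight at the surface:", weights[100])
print("weight deep inside   :", weights[200])
print("rendered pixel value :", pixel)
```

The surface sample receives a large weight while samples deep inside the object receive a vanishing one, so the pixel takes the surface colour regardless of what the interior radiance is.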
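  3. Method (slide 16): The first half of the items below is the main story of this paper. Artifact-free upsampler; approximating volume rendering to cut computation cost; Revisiting Progressive Growing;

    distillation to lighten the computation cost; dropping the view-condition input to the radiance prediction network; moving the noise input from 2D to 3D.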
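  4. Method (slide 17): The first half of the items below is the main story of this paper. Artifact-free upsampler; approximating volume rendering to cut computation cost; Revisiting Progressive Growing;

    distillation to lighten the computation cost; dropping the view-condition input to the radiance prediction network; moving the noise input from 2D to 3D.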
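  5. Method (slide 18): Artifact-free upsampler; approximating volume rendering to cut computation cost; Revisiting Progressive Growing;

    distillation to lighten the computation cost; dropping the view-condition input to the radiance prediction network; moving the noise input from 2D to 3D.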
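  6. Approximating volume rendering to cut computation cost (slide 21): First, a recap of the notation we need. p_w(·): the density-based weight applied to the radiance. h: a 2-layer MLP. φ_w^n(·): an n-layer NN.

    r(t) = o + d·t: the ray, expressed by the viewpoint origin o, the direction d, and the distance t from the origin. ξ(·): positional encoding.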
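  7. Approximating volume rendering to cut computation cost (slide 22): p_w(·): the density-based weight applied to the radiance. h: a 2-layer MLP. φ_w^n(·): an n-layer NN. r(t) = o + d·t: the ray, expressed by the viewpoint origin o, the direction d, and the distance t from the origin. ξ(·): positional encoding.

    Rendered image: h and φ_w^{n_c, n_σ}(·) are moved outside the integral. 𝒜(·) contains the NN that computes the density, which is why the superscript carries the two indices n_c and n_σ; 𝒜(·) is described on the next page.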
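Why moving a network outside the integral is an approximation, and why it saves computation, can be seen in a toy setting. This is a minimal sketch under stated assumptions: the random features, the weights, and the two heads `linear_head` and `relu_head` are hypothetical stand-ins, not the StyleNeRF networks. For a linear head, "apply the head at every sample, then integrate" and "integrate the features, then apply the head once" agree exactly; with a nonlinearity they only agree approximately.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, feat_dim, hidden_dim = 64, 8, 16

# Per-sample features along one ray and rendering weights that sum to 1 (toy values).
features = rng.normal(size=(n_samples, feat_dim))
weights = rng.random(n_samples)
weights /= weights.sum()

W1 = rng.normal(size=(feat_dim, hidden_dim))
W2 = rng.normal(size=(hidden_dim, 3))

def linear_head(x):
    """Two linear layers with no activation: commutes with weighted sums."""
    return x @ W1 @ W2

def relu_head(x):
    """Two-layer MLP with ReLU: does not commute with weighted sums."""
    return np.maximum(x @ W1, 0.0) @ W2

def head_then_render(head):
    # Exact ordering: evaluate the head at all n_samples points, then integrate.
    return weights @ head(features)

def render_then_head(head):
    # Approximate ordering: integrate the features first, evaluate the head once.
    return head(weights @ features)

linear_gap = np.abs(head_then_render(linear_head) - render_then_head(linear_head)).max()
relu_gap = np.abs(head_then_render(relu_head) - render_then_head(relu_head)).max()
print("gap with linear head:", linear_gap)
print("gap with ReLU head  :", relu_gap)
```

In the second ordering the head runs once per ray instead of once per sample, which is the source of the cost reduction; the nonzero gap for the ReLU head is the price paid in exactness.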
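  8. Method (slide 24): Artifact-free upsampler; approximating volume rendering to cut computation cost; Revisiting Progressive Growing;

    distillation to lighten the computation cost; dropping the view-condition input to the radiance prediction network; moving the noise input from 2D to 3D.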
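  9. Artifact-free upsampler (slide 25): StyleNeRF considered the following three upsamplers. Pixel Shuffle [5] and LIIF [6] are learnable upsamplers.

    With these, checkerboard-like artifacts appear and textures stick to the image plane; the checkerboard artifact is easy to see in [7], and texture sticking in [8]. Bilinear is an upsampler that involves no learning. Because a bilinear upsampler acts as a low-pass filter, it outputs smooth images and avoids the artifacts above, but it produces bubble-like artifacts instead.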
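The two ingredients discussed on this slide, a pixel-shuffle rearrangement and a low-pass filter, can be sketched in a few lines of NumPy. This is an illustration only: the channel ordering follows the common sub-pixel convolution convention, and the 3-tap kernel [1, 2, 1]/4 is an assumed example of a fixed blur, not the filter used in StyleNeRF.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r)."""
    c_in, h, w = x.shape
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (C, row offset, col offset)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (C, H, row offset, W, col offset)
    return x.reshape(c, h * r, w * r)

def blur_axis(x, axis):
    """Apply the fixed kernel [1, 2, 1] / 4 along one axis with edge padding."""
    pad = [(0, 0)] * x.ndim
    pad[axis] = (1, 1)
    p = np.pad(x, pad, mode="edge")
    n = x.shape[axis]
    left = np.take(p, np.arange(0, n), axis=axis)
    mid = np.take(p, np.arange(1, n + 1), axis=axis)
    right = np.take(p, np.arange(2, n + 2), axis=axis)
    return 0.25 * left + 0.5 * mid + 0.25 * right

def low_pass(x):
    """Separable blur over the two spatial axes of a (C, H, W) array."""
    return blur_axis(blur_axis(x, 1), 2)

features = np.arange(16, dtype=np.float64).reshape(4, 2, 2)  # 4 channels of 2x2
upsampled = pixel_shuffle(features, 2)                        # 1 channel of 4x4
smoothed = low_pass(upsampled)
print(upsampled[0])
print(smoothed[0])
```

Each 2x2 block of the upsampled map is filled from four different input channels, which is where checkerboard patterns can originate; the blur then averages neighbouring pixels across those block boundaries.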
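  10. Method (slide 27): Artifact-free upsampler; approximating volume rendering to cut computation cost; Revisiting Progressive Growing;

    distillation to lighten the computation cost; dropping the view-condition input to the radiance prediction network; moving the noise input from 2D to 3D.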
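  11. Revisiting Progressive Growing (slide 28): Progressive Growing is a training scheme that raises the generated resolution step by step as training proceeds. StyleGAN2 [9] abolished it to prevent the phenomenon in which details such as the orientation of teeth are generated at the layer of one particular resolution and therefore do not come out in the correct orientation.

    Even so, Progressive Growing remains effective for stabilizing training, so StyleNeRF adopts a partially modified version of it.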
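  12. Method (slide 30): Artifact-free upsampler; approximating volume rendering to cut computation cost; Revisiting Progressive Growing;

    distillation to lighten the computation cost; dropping the view-condition input to the radiance prediction network; moving the noise input from 2D to 3D.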
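  13. Distillation to lighten the computation cost (slide 34): In practice, however, a NN that upsamples from low to high resolution cannot preserve the 3D representation. StyleNeRF therefore takes the MSE between its high-resolution output image and the image obtained by computing radiance fields from the StyleNeRF output and applying NeRF's volume rendering to them. This yields 3D consistency (NeRF-path regularization).

    The mean is taken over S randomly chosen points. R_in is the low-resolution radiance field, which goes through StyleNeRF's volume rendering. R_out is the StyleNeRF output image, from which radiance fields are computed and rendered with NeRF's volume rendering. In both terms the index [i, j] is the pixel coordinate in the output image.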
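The structure of this loss, a mean squared error averaged over S randomly chosen pixel coordinates, can be sketched as follows. This is a simplified illustration under stated assumptions: both renderings are supplied here as precomputed arrays named `img_approx` and `img_nerf` (hypothetical names), whereas the point of sampling in practice is that the expensive NeRF-path rendering only needs to be evaluated at the sampled pixels.

```python
import numpy as np

def nerf_path_loss(img_approx, img_nerf, num_pixels, rng):
    """MSE between two (H, W, 3) renderings at num_pixels random coordinates [i, j]."""
    height, width = img_approx.shape[:2]
    ii = rng.integers(0, height, size=num_pixels)   # random row indices
    jj = rng.integers(0, width, size=num_pixels)    # random column indices
    diff = img_approx[ii, jj] - img_nerf[ii, jj]    # (num_pixels, 3)
    # Squared error per pixel (summed over colour channels), averaged over the samples.
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

rng = np.random.default_rng(0)
img_approx = rng.random((32, 32, 3))

loss_same = nerf_path_loss(img_approx, img_approx.copy(), num_pixels=64, rng=rng)
loss_shifted = nerf_path_loss(img_approx, img_approx + 0.5, num_pixels=64, rng=rng)
print(loss_same, loss_shifted)
```

Identical renderings give a loss of zero, and a uniform offset of 0.5 in all three channels gives 3 × 0.5² = 0.75 whichever pixels are sampled.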
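  14. Method (slide 35): Artifact-free upsampler; approximating volume rendering to cut computation cost; Revisiting Progressive Growing;

    distillation to lighten the computation cost; dropping the view-condition input to the radiance prediction network; moving the noise input from 2D to 3D.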
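  15. Method summary and other settings (slide 37): <Proposed method> Approximation of volume rendering (generating a high-resolution image from low-resolution radiance fields); an upsampler combining Pixel Shuffle

    with a low-pass filter; Progressive Growing with the residual blocks removed; distillation to compensate for the loss of 3D consistency caused by the volume rendering approximation (NeRF-path regularization); removal of the viewpoint input to the radiance prediction network; a 3D-aware noise input. <Other settings> The mapping network, the discriminator, and the objective function are the same as in StyleGAN2. NeRF++ [10] is used as the NeRF representation: NeRF++ models the background and the foreground with separate networks, and this separate foreground/background modeling is also used in BlockGAN [11] and GIRAFFE [12].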
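  16. HoloGAN (slide 44): One of the 3D-aware GANs that could be called the originator of the field; it does not use NeRF. 1. A constant with a 3D shape is the input (as in StyleGAN, generation starts from learnable parameters). 2. The camera pose is given as input, and rotation and similar transforms are applied in the 3D feature space.

    3. The features are rendered to 2D with bilinear resampling (details of bilinear resampling are omitted). 4. The rest follows the same process as 2D image generation. 5. The input camera pose is predicted from the generated image (self-supervised learning).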
  17. Contents (slide 50): TL;DR; Background and objective; Method; Results:

    Ablation Study, random image generation, camera pose control, various applications (Style Mixing, Style Interpolation, GAN inversion); Outlook.
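  18. Ablation Study (slide 51): (a) w/o Progressive Growing: the parting of the hair does not follow the viewpoint, a result that runs counter to StyleGAN2.

    (b) w/o NeRF-path Regularization: 3D consistency is lost. (c) w/ view condition: 3D consistency is lost because of spurious correlations. (table) Comparison of upsamplers: the proposed pixel shuffle + low-pass filter gives the best quality.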
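  19. Various applications (slide 56): Style Mixing. These are results of Style Mixing, which was also done in StyleGAN [18].

    Features of the person in Source B are injected into the person in Source A. Injecting them before the radiance fields computation changes the person's identity features; injecting them after the radiance fields computation changes fine features such as the skin.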
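The mechanics of style mixing can be sketched with per-layer style vectors. This is a hypothetical illustration: the layer count, the style dimension, and the index `rf_layers` separating "before the radiance fields computation" from "after" are assumed values chosen for the example, not StyleNeRF's actual architecture.

```python
import numpy as np

def mix_styles(styles_a, styles_b, layers_from_b):
    """Return A's per-layer styles with the listed layers replaced by B's."""
    mixed = styles_a.copy()
    idx = list(layers_from_b)
    mixed[idx] = styles_b[idx]
    return mixed

num_layers, style_dim = 8, 4                   # assumed sizes
styles_a = np.zeros((num_layers, style_dim))   # stand-in for Source A's styles
styles_b = np.ones((num_layers, style_dim))    # stand-in for Source B's styles
rf_layers = 5                                  # assumed: layers 0..4 precede the radiance field output

# Inject B before the radiance fields computation (identity-level change on the slide).
coarse_mix = mix_styles(styles_a, styles_b, range(0, rf_layers))
# Inject B after the radiance fields computation (fine detail such as skin on the slide).
fine_mix = mix_styles(styles_a, styles_b, range(rf_layers, num_layers))
print(coarse_mix[:, 0])
print(fine_mix[:, 0])
```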
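  20. Outlook (slide 61): <Limitation> Only a low-resolution 3D mesh is available, and in this respect the method is inferior to π-GAN and similar models. This is solved by the concurrent work EG3D [20].

    For complex shapes such as those in CompCars, artifacts are still quite noticeable.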
  21. Reference (slide 62):

    [1] Mildenhall et al., “NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis”, ECCV, 2020.
    [2] Mildenhall et al., “NeRF project page”, https://www.matthewtancik.com/nerf, accessed May 14, 2022.
    [3] Gu et al., “StyleNeRF: A Style-based 3D Aware Generator for High-resolution Image Synthesis”, ICLR, 2022.
    [4] Tancik et al., “Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains”, NeurIPS, 2020.
    [5] Shi et al., “Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network”, CVPR, 2016.
    [6] Chen et al., “Learning Continuous Image Representation with Local Implicit Image Function”, CVPR, 2021.
    [7] Odena et al., “Deconvolution and Checkerboard Artifacts”, https://distill.pub/2016/deconv-checkerboard/, accessed May 15, 2022.
    [8] Karras et al., “Alias-Free Generative Adversarial Networks (StyleGAN3)”, https://nvlabs.github.io/stylegan3/, accessed May 15, 2022.
    [9] Karras et al., “Analyzing and Improving the Image Quality of StyleGAN”, CVPR, 2020.
    [10] Zhang et al., “NeRF++: Analyzing and Improving Neural Radiance Fields”, arXiv preprint, 2020.
    [11] Phuoc et al., “BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images”, NeurIPS, 2020.
    [12] Niemeyer and Geiger, “GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields”, CVPR, 2021.
    [13] Phuoc et al., “HoloGAN: Unsupervised Learning of 3D Representations From Natural Images”, ICCV, 2019.
    [14] Schwarz et al., “GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis”, NeurIPS, 2020.
    [15] Chan et al., “pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis”, CVPR, 2021.
    [16] Xue et al., “GIRAFFE HD: A High-Resolution 3D-aware Generative Model”, CVPR, 2022.
    [17] Jahanian et al., “On the "steerability" of generative adversarial networks”, ICLR, 2020.
    [18] Karras et al., “A Style-Based Generator Architecture for Generative Adversarial Networks”, CVPR, 2019.
    [19] Patashnik et al., “StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery”, ICCV, 2021.
    [20] Chan et al., “EG3D: Efficient Geometry-aware 3D Generative Adversarial Networks”, arXiv preprint, 2021.
    <The definitive Japanese-language explanation of NeRF>
    [21] Yamauchi (山内), “三次元空間のニューラルな表現とNeRF” (Neural representations of 3D space and NeRF), https://blog.albert2005.co.jp/2020/05/08/nerf/, accessed May 14, 2022.