
Survey of Image Editing with GANs in SIGGRAPH'21

Udon
September 12, 2021


Yakiniku tabetai YO!


Transcript

  1. 2021.9.12 Daichi Horita Image Editing with GANs SIGGRAPH 2021 study group —

    https://siggraph.xyz/s2021/ 1
  2. Self-introduction • I’m “Udon” https://twitter.com/udoooom 2 https://www.shikoku-np.co.jp/udon/shop/890 “Udon Baka Ichidai” in Kagawa is my recommendation

  3. Contents Image Editing with GANs • What is “Image Editing

    with GANs”? • Introduction • StyleGAN Architecture • Projection into StyleGAN • Enjoy SIGGRAPH’21 accepted papers! • Conclusion • Appendix — More recent papers (I ran out of steam, so only titles and one-line notes) 3
  4. What is — “Image Editing with GANs”? 4 Virtual Try

    On Cartoonization Appearance / Pose Editing Attribute Editing Input Reference Output Output Input Input Appearance Editing Pose Editing Input + Illumination + Pose - Pose + Expression
  5. StyleGAN Architecture 5 Introduction — Original StyleGAN[Karras+ CVPR19] PGGAN[Karras+ CVPR18]

    StyleGAN[Karras+ CVPR19] W+ W Generated Images by StyleGAN
  6. StyleGAN Architecture 6 Introduction — Style Mixing 4 × 4

    8 × 8 16 × 16 … 512 × 512 1024 × 1024 Space W Space W+ Mapping Net AdaIN Structure (Source A) Style (Source B) const. Output
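The style-mixing diagram above hinges on AdaIN: each resolution block normalizes its feature maps per sample, then re-styles them with a per-channel scale and bias derived from w. A minimal PyTorch sketch (function name and tensor shapes are my own assumptions, not the deck's):

```python
import torch

def adain(content_feat, style_scale, style_bias, eps=1e-5):
    """Adaptive Instance Normalization [Huang+ ICCV17]: normalize each
    feature map per sample, then re-scale/shift with style parameters."""
    # content_feat: (N, C, H, W); style_scale, style_bias: (N, C)
    mean = content_feat.mean(dim=(2, 3), keepdim=True)
    std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    normalized = (content_feat - mean) / std
    return style_scale[:, :, None, None] * normalized + style_bias[:, :, None, None]
```

In StyleGAN, `style_scale` and `style_bias` come from a learned affine transform of w; style mixing is just feeding one source's w to the coarse blocks and another's to the fine blocks.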
  7. StyleGAN Architecture 7 StyleGAN[Karras+ CVPR19] Introduction — StyleGAN vs StyleGAN2

    StyleGAN2[Karras+ CVPR20] Feature Modulation by AdaIN[Huang+ ICCV17] Weight Demodulation (Simplify Style Block) W+ W+
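StyleGAN2's weight demodulation folds AdaIN's normalization into the convolution weights: scale the weights by the style, then rescale each output filter to unit norm. A hedged sketch of just the weight computation (shapes are assumptions; the grouped-conv trick used for batching is omitted):

```python
import torch

def modulated_conv_weights(weight, style, eps=1e-8):
    """StyleGAN2-style weight (de)modulation: modulate conv weights by
    the per-sample style, then normalize each output filter (demodulate)."""
    # weight: (out_c, in_c, kh, kw); style: (batch, in_c)
    w = weight[None] * style[:, None, :, None, None]         # modulate
    demod = torch.rsqrt((w ** 2).sum(dim=(2, 3, 4)) + eps)   # (batch, out_c)
    return w * demod[:, :, None, None, None]                 # demodulate
```

This removes the explicit feature-map normalization of the original style block, which is the "Simplify Style Block" change the slide refers to.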
  8. Projection into StyleGAN 8 Introduction — Motivation Real-world Face Image

    W+ Space How can we embed real images into the StyleGAN prior? If we succeed, we can edit real-world images! (*Successfully reconstructing an image via embedding ≠ semantic editability)
  9. Projection into StyleGAN 9 Introduction — Image2StyleGAN[Abdal+ ICCV19] (Iterative Optimization)

    G W+ Space 1. Init 2. Generate 3. Calc loss 4. Update w* Final: accurate reconstruction, but slow and with no generalization across the space
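The four steps on this slide (init, generate, calc loss, update) are plain gradient descent on the latent. A toy sketch, assuming `generator` is any differentiable function from a (1, 18, 512) w+ code to an image; the paper also adds a perceptual (VGG) loss, omitted here:

```python
import torch

def invert_image(generator, target, num_steps=500, lr=0.01):
    """Image2StyleGAN-style inversion: optimize a w+ latent so that
    G(w+) reconstructs the target, leaving G's weights frozen."""
    w_plus = torch.zeros(1, 18, 512, requires_grad=True)   # 1. init
    opt = torch.optim.Adam([w_plus], lr=lr)
    for _ in range(num_steps):
        recon = generator(w_plus)                          # 2. generate
        loss = ((recon - target) ** 2).mean()              # 3. calc loss
        opt.zero_grad()
        loss.backward()
        opt.step()                                         # 4. update w*
    return w_plus.detach()
```

Accurate but slow (hundreds of forward/backward passes per image), and the optimized w* transfers to no other image — exactly the drawbacks the slide lists.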
  10. Projection into StyleGAN 10 Introduction — pSp[Alaluf+ CVPR21] (Learning Encoder) Fast inference, but

    can’t reconstruct details, doesn’t work for OoD samples, and can’t ignore the strong prior Input Output https://twitter.com/notlewistbh/status/1432936600745431041?s=20 Please search “Face Depixelizer” on Twitter
  11. Enjoy SIGGRAPH’21 accepted papers! 11 TryOnGAN[Lewis+] StyleCariGAN[Jang+] AgileGAN[Song+] e4e[Tov+] SWAGAN[Gal+]

    StyleFlow[Abdal+]
  12. TryOnGAN[Lewis+] 12

  13. TryOnGAN[Lewis+] 13 • Previous: paired image-to-image-translation-based try-on • Proposed:

    StyleGAN-based try-on • Contribution: design a pose-conditioned StyleGAN2 with segmentation / image-generation branches
  14. TryOnGAN[Lewis+] 14 G Overview Input Ip Reference g G Output

    Ig Generated Image Real Image • Loses high-frequency details • High quality!
  15. TryOnGAN[Lewis+] 15 Method 1st: Train StyleGAN2 2nd: Optimize σp, σq

    Style mixing per layer!
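The second stage above ("optimize σp, σq — style mixing per layer") amounts to blending the person and garment latents with one coefficient per layer. A sketch under my own naming (the paper optimizes these coefficients against losses on the generated try-on result):

```python
import torch

def per_layer_style_mix(w_person, w_garment, sigma):
    """Blend two w+ codes with one coefficient per layer: coarse layers
    can keep the person's pose/identity while garment-relevant layers
    take the reference's style."""
    # w_person, w_garment: (n_layers, 512); sigma: (n_layers,) in [0, 1]
    return sigma[:, None] * w_garment + (1.0 - sigma[:, None]) * w_person
```

Setting `sigma` to 1 only on a band of middle layers, for example, swaps in the garment style while leaving the rest of the person's code untouched.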
  16. TryOnGAN[Lewis+] Results 16 Real-Image In. Ref. Previous methods Ours In.

    Ref. Ours Generated-Image Original StyleGAN2 (not bad) Failure case on a real image (can’t handle the Harajuku style)
  17. StyleCariGAN[Jang+] 17

  18. StyleCariGAN[Jang+] Contribution • Shape Exaggeration Blocks • Modulate coarse features

    to produce caricature shape exaggerations • Novel Architecture for Caricature Generation 18
  19. StyleCariGAN[Jang+] Method 19 Layer Swap / Style Mixing Results • CycleGAN-style approach •

    Keeps content but transfers caricature style w/ WebCariA 50 labels α1 α4
  20. StyleCariGAN[Jang+] Result 20 vs. StyleGAN Inversion vs. I2I Translation Caricature

    to Real FID vs. I2I Translation
  21. AgileGAN[Song+] 21

  22. AgileGAN[Song+] Contribution • Generates high-quality stylistic portraits.

    • Introduces a hierarchical VAE that embeds into Z+ (rather than W+) to enforce that the inverse-mapped distribution conforms to the original prior. 22
  23. AgileGAN[Song+] Method 23 Novel VAE Architecture (Embed in Z+)

    Reparameterization Trick t-SNE Visualization Content Difference
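The reparameterization trick on this slide is the standard VAE one: sample via z = μ + σ·ε with ε ~ N(0, I), so the sampling step stays differentiable in μ and log σ². A minimal sketch:

```python
import torch

def reparameterize(mu, logvar):
    """VAE reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I),
    so gradients flow through mu and logvar despite the sampling."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + std * eps
```

AgileGAN applies this per StyleGAN layer, which is what makes the embedding hierarchical (one Gaussian code per layer rather than a single z).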
  24. AgileGAN[Song+] Results 24 vs. I2I Translation vs. Inversion (Use fine-tuned

    stylization model) Ablation Semantic Editing
  25. e4e[Tov+] 25

  26. e4e[Tov+] Contribution • Study the latent space of StyleGAN •

    Propose to consider both distortion and the perceptual quality of the reconstructed image. • Propose two principles for designing encoders — control proximity to W based on the distortion-editability and distortion-perception tradeoffs within the StyleGAN latent space. 26
  27. e4e[Tov+] Method 27 W : prior, Wk : OoD latent

    OoD latents achieve lower distortion but worse editability (tradeoff) Restrict variance (blue arrow) Guide towards the prior (red arrow) Overview of objectives for latents W, Wk End-to-End Architecture
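The two arrows in the objective figure map to two latent regularizers: an L2 penalty on the per-layer offsets (restrict variance, blue) and an adversarial loss from a latent discriminator (guide towards the W prior, red). A sketch with assumed names and shapes, not e4e's actual code:

```python
import torch

def e4e_latent_regularizers(base_w, deltas, latent_disc):
    """Two regularizers in the spirit of e4e: keep per-layer offsets small
    so every w_i stays near a single base code, and push base codes toward
    the real W prior via a latent discriminator."""
    # base_w: (N, 512); deltas: (N, n_layers, 512); w_i = base_w + delta_i
    delta_reg = (deltas ** 2).sum(dim=-1).mean()          # restrict variance
    logits = latent_disc(base_w)                          # (N, 1) real/fake logits
    adv = torch.nn.functional.softplus(-logits).mean()    # guide towards prior
    return delta_reg, adv
```

Weighting these two terms is how the encoder trades distortion against editability along the slide's Wk-vs-W axis.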
  28. e4e[Tov+] Results 28 (pSp[Alaluf+ CVPR21]) vs. pSp Editing vs. Optimization

    (Optimization is unsuitable for editing.)
  29. SWAGAN[Gal+] 29

  30. SWAGAN[Gal+] Contribution • Previous GANs suffer from degradation in quality

    for high-frequency content… • Propose a Style and WAvelet based GAN (SWAGAN) that performs progressive generation in the Haar-wavelet frequency domain (not RGB). • Achieves faster training (×0.25 time) • (Weakness) Encoder-based inversion methods suffer from acute high-frequency shortcomings due to their L2-based losses. 30
  31. SWAGAN[Gal+] Method 31 Overall architecture Generator Discriminator For upsampling, features are first

    converted to the RGB domain
  32. SWAGAN[Gal+] Results (1/2) 32 Generated samples Comparison of time [s]

    to process 1,000 imgs Bi: Proposed, NWD: Non-Wavelet-Discriminator, NU: Neural Upsample SWAGAN-Bi SWAGAN-NU
  33. SWAGAN[Gal+] Results (2/2) 33 Optimized latent code interpolation

  34. StyleFlow[Abdal+] 34

  35. StyleFlow[Abdal+] Contributions • Propose StyleFlow — controls the generation process

    under attribute conditions and is formulated as conditional continuous normalizing flows. • Reference: • Normalizing Flow introduction series (in Japanese), https://tatsy.github.io/blog/ • The best overview articles on normalizing flows! 35 Unknown Distribution ↔ Known Distribution via an invertible normalizing flow
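A normalizing flow is an invertible map with a tractable Jacobian log-determinant, so exact likelihoods follow from the change-of-variables formula log p(x) = log p(z) + log|det ∂z/∂x|. A minimal element-wise affine flow step, as a toy sketch of that machinery (not StyleFlow's conditional continuous flow):

```python
import torch

class AffineFlow:
    """Minimal invertible flow step: z = x * exp(s) + t.
    Element-wise, so log|det dz/dx| is simply sum(s)."""

    def __init__(self, dim):
        self.s = torch.zeros(dim, requires_grad=True)  # learnable log-scale
        self.t = torch.zeros(dim, requires_grad=True)  # learnable shift

    def forward(self, x):
        z = x * torch.exp(self.s) + self.t
        log_det = self.s.sum().expand(x.shape[0])      # per-sample log-det
        return z, log_det

    def inverse(self, z):
        return (z - self.t) * torch.exp(-self.s)
```

StyleFlow replaces this fixed step with a continuous-time flow (a neural ODE) conditioned on attributes, but the invertibility and log-det bookkeeping play the same role.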
  36. StyleFlow[Abdal+] Method 36 Optimized by Neural ODE Solver[Chen+ NeurIPS18 Best

    Paper]
  37. StyleFlow[Abdal+] Results 37

  38. Conclusion • Introduced the StyleGAN architecture and projection • Explained the six

    papers of the “Image Editing with GANs” session. • Surprisingly, all the papers employ StyleGAN!! 38
  39. Appendix GAN Inversion • Image2StyleGAN: How to Embed Images Into

    the StyleGAN Latent Space?[Abdal+ ICCV19] • The first paper to embed images into StyleGAN • Image2StyleGAN++: How to Edit the Embedded Images?[Abdal+ CVPR20] • About semantic editing • ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement[Alaluf+ ICCV21] • Iteratively adds residuals to the predictions of pSp[Alaluf+ CVPR21] 39
  40. Appendix Space of StyleGAN • StyleSpace Analysis: Disentangled Controls

    for StyleGAN Image Generation[Wu+ CVPR21] • Defines the Conv feature space as the S space and finds it more disentangled than W+ 40