
Survey of Image Editing with GANs in SIGGRAPH'21

Udon
September 12, 2021


Transcript

  1. 2021.9.12 Daichi Horita — Image Editing with GANs, SIGGRAPH 2021 Study Group https://siggraph.xyz/s2021/
  2. Self-introduction • I'm "Udon" https://twitter.com/udoooom • Recommended: the udon shop "Udon Baka Ichidai" in Kagawa https://www.shikoku-np.co.jp/udon/shop/890

  3. Contents — Image Editing with GANs • What is "Image Editing with GANs"? • Introduction • StyleGAN Architecture • Projection into StyleGAN • Enjoy SIGGRAPH'21 accepted papers! • Conclusion • Appendix — more recent papers (I ran out of stamina, so titles and one-line notes only)
  4. What is "Image Editing with GANs"? • Virtual Try-On (input + reference → output) • Cartoonization • Appearance / Pose Editing • Attribute Editing (e.g. + illumination, ± pose, + expression)
  5. StyleGAN Architecture — Introduction: Original StyleGAN [Karras+ CVPR19] • PGGAN [Karras+ CVPR18] → StyleGAN [Karras+ CVPR19] • Latent spaces W and W+ • [Figure: images generated by StyleGAN]
  6. StyleGAN Architecture — Introduction: Style Mixing • A mapping network produces w (space W); supplying a different w to each resolution layer (4×4, 8×8, 16×16, …, 512×512, 1024×1024) gives the extended space W+ • Styles are injected via AdaIN, starting from a learned constant input: structure from source A, style from source B
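To make the AdaIN step concrete, here is a minimal NumPy sketch (not StyleGAN's actual code): the content features are instance-normalized per channel, then scaled and shifted by per-channel style parameters, which in StyleGAN come from a learned affine transform of w.

```python
import numpy as np

def adain(content, style_scale, style_bias, eps=1e-5):
    """Adaptive Instance Normalization (AdaIN), simplified sketch.

    content: feature map of shape (C, H, W).
    style_scale, style_bias: per-channel style parameters of shape (C,)
    (in StyleGAN these come from an affine transform of w).
    """
    mu = content.mean(axis=(1, 2), keepdims=True)      # per-channel mean
    sigma = content.std(axis=(1, 2), keepdims=True)    # per-channel std
    normalized = (content - mu) / (sigma + eps)        # instance-normalize
    return style_scale[:, None, None] * normalized + style_bias[:, None, None]

x = np.random.randn(3, 8, 8)
# after AdaIN, each channel has (approximately) the requested statistics
y = adain(x, style_scale=np.array([2.0, 2.0, 2.0]),
          style_bias=np.array([1.0, 1.0, 1.0]))
```

The point of style mixing is simply that different layers can receive style parameters computed from different latents.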
  7. StyleGAN Architecture — Introduction: StyleGAN vs. StyleGAN2 • StyleGAN [Karras+ CVPR19]: feature modulation by AdaIN [Huang+ ICCV17] • StyleGAN2 [Karras+ CVPR20]: weight demodulation (simplified style block) • Both use W+
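The weight-demodulation idea can be sketched in NumPy (a simplified version of the operation described in the StyleGAN2 paper, not its official code): the style scales the convolution weights per input channel, and each output filter is then rescaled to roughly unit norm, replacing the explicit AdaIN normalization.

```python
import numpy as np

def demodulate(weight, style, eps=1e-8):
    """StyleGAN2-style weight modulation + demodulation (sketch).

    weight: conv weights of shape (out_ch, in_ch, k, k).
    style:  per-input-channel scales of shape (in_ch,), assumed to come
            from an affine transform of w as in the paper.
    """
    w = weight * style[None, :, None, None]                  # modulate
    demod = 1.0 / np.sqrt((w ** 2).sum(axis=(1, 2, 3)) + eps)
    return w * demod[:, None, None, None]                    # demodulate

weight = np.random.randn(4, 3, 3, 3)
style = np.random.rand(3) + 0.5
w2 = demodulate(weight, style)
# each output filter now has (approximately) unit L2 norm
```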
  8. Projection into StyleGAN — Introduction: Motivation • How can we embed a real-world face image into the StyleGAN prior (W+ space)? • If we succeed, we can edit real-world images! • (Note: succeeding at reconstruction via embedding ≠ editable semantics)
  9. Projection into StyleGAN — Introduction: Image2StyleGAN [Abdal+ ICCV19] (iterative optimization) • Loop in W+ space: 1. initialize w → 2. generate G(w) → 3. compute loss → 4. update w, yielding w* • Accurate reconstruction, but slow, with no generalization across the space
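The init/generate/loss/update loop above can be sketched as follows. This is a toy stand-in, not Image2StyleGAN itself: the "generator" is a fixed random linear map and the loss is plain L2, whereas the real method optimizes w through StyleGAN with perceptual + pixel losses.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 4))      # toy linear "generator": G(w) = A @ w
w_true = rng.standard_normal(4)
target = A @ w_true                   # the "real image" to reconstruct

w = np.zeros(4)                       # 1. init
lr = 0.01
for _ in range(500):
    recon = A @ w                     # 2. generate
    grad = 2 * A.T @ (recon - target) # 3. gradient of the L2 loss
    w -= lr * grad                    # 4. update

loss = np.sum((A @ w - target) ** 2)  # near-perfect reconstruction
```

Even in this toy setting the per-image optimization takes hundreds of steps, which illustrates why the iterative approach is slow compared to a learned encoder.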
  10. Projection into StyleGAN — Introduction: pSp [Alaluf+ CVPR21] (learned encoder) • Fast inference, but can't reconstruct fine details and doesn't work for out-of-distribution samples • The strong face prior can't be ignored (search "Face Depixelizer" on Twitter): https://twitter.com/notlewistbh/status/1432936600745431041?s=20
  11. Enjoy SIGGRAPH'21 accepted papers! • TryOnGAN [Lewis+] • StyleCariGAN [Jang+] • AgileGAN [Song+] • e4e [Tov+] • SWAGAN [Gal+] • StyleFlow [Abdal+]
  12. TryOnGAN [Lewis+]

  13. TryOnGAN [Lewis+] — Contribution • Previous: paired image-to-image-translation-based try-on • Proposed: StyleGAN-based try-on • Designs a pose-conditioned StyleGAN2 with segmentation and image-generation branches
  14. TryOnGAN [Lewis+] — Overview • Input person image Ip and reference garment image Ig → G → output • Compared to the real image, the generated image loses some high-frequency details, but is still high quality!
  15. TryOnGAN [Lewis+] — Method • 1st stage: train StyleGAN2 • 2nd stage: optimize σp, σq — style mixing per layer!
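Per-layer style mixing can be sketched as a per-layer convex combination of two W+ latents, one per source image; the layer names and coefficient layout below are illustrative assumptions, not TryOnGAN's actual variables.

```python
import numpy as np

def mix_styles(w_person, w_garment, sigmas):
    """Blend two W+ latents with one coefficient per generator layer.

    w_person, w_garment: latents of shape (num_layers, dim).
    sigmas: (num_layers,) mixing coefficients in [0, 1].
    """
    s = sigmas[:, None]
    return s * w_garment + (1.0 - s) * w_person

num_layers, dim = 14, 8
wp = np.zeros((num_layers, dim))      # stand-in "person" latent
wg = np.ones((num_layers, dim))       # stand-in "garment" latent
# e.g. take garment styles only in some middle layers (hypothetical choice)
sigmas = np.array([0.0] * 4 + [1.0] * 6 + [0.0] * 4)
mixed = mix_styles(wp, wg, sigmas)
```

Optimizing the sigmas (instead of hand-picking them) is what lets the method find which layers should carry garment information.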
  16. TryOnGAN [Lewis+] — Results • Real-image and generated-image settings: input / reference / previous methods / ours • The original StyleGAN2 baseline is not bad • Failure case on a real image ("can't do Harajuku")
  17. StyleCariGAN [Jang+]

  18. StyleCariGAN [Jang+] — Contribution • Shape exaggeration blocks: modulate coarse features to produce caricature shape exaggerations • A novel architecture for caricature generation
  19. StyleCariGAN [Jang+] — Method • CycleGAN-style approach: keep content but transfer caricature style, using the 50 labels of WebCariA • Layer swap and style mixing (exaggeration degrees α1 … α4)
  20. StyleCariGAN [Jang+] — Results • vs. StyleGAN inversion • vs. I2I translation (including caricature-to-real) • FID comparison vs. I2I translation
  21. AgileGAN [Song+]

  22. AgileGAN [Song+] — Contribution • Generates high-quality stylistic portraits • Introduces a hierarchical VAE that embeds images into Z+ (rather than W+), enforcing that the inverse-mapped distribution follows the original prior
  23. AgileGAN [Song+] — Method • Novel VAE architecture (embedding in Z+) using the reparameterization trick • t-SNE visualization • Content difference
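As a reminder, the reparameterization trick samples z = μ + σ·ε with ε ~ N(0, I), so gradients can flow through μ and σ. The sketch below shows a single latent; AgileGAN's hierarchical VAE applies this per layer in Z+ (the dimensions here are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps, eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

mu = np.full(512, 2.0)           # encoder-predicted mean (toy values)
log_var = np.zeros(512)          # log variance 0 => sigma = 1
samples = np.stack([reparameterize(mu, log_var) for _ in range(2000)])
# empirical statistics match N(mu, sigma^2)
```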
  24. AgileGAN [Song+] — Results • vs. I2I translation • vs. inversion (using a fine-tuned stylization model) • Ablation • Semantic editing
  25. e4e [Tov+]

  26. e4e [Tov+] — Contribution • Studies the latent space of StyleGAN • Proposes to consider both the distortion and the perceptual quality of the reconstructed image • Proposes two principles for designing encoders that control proximity to W, based on a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space
  27. e4e [Tov+] — Method • W: prior, Wk: OoD latents • OoD latents achieve better distortion at the cost of editability (tradeoff) • Objectives on the latents: restrict variance (blue arrow) and guide towards the prior W (red arrow) • End-to-end architecture
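The "proximity to W" control can be sketched as follows, in a deliberately simplified form (an assumption for illustration, not e4e's actual code): the encoder predicts one base code plus per-layer offsets, and an L2 penalty on the offsets pulls the W+ code toward a single shared w, i.e. toward W.

```python
import numpy as np

def expand_code(w_base, deltas):
    """Build a W+ code from a base code plus per-layer offsets.

    w_base: (dim,) shared code; deltas: (num_layers, dim) offsets.
    """
    return w_base[None, :] + deltas

def offset_penalty(deltas):
    """L2 penalty: driving deltas to 0 collapses W+ back onto W."""
    return (deltas ** 2).sum()

w_base = np.ones(8)
deltas = np.zeros((14, 8))           # zero offsets: the code lies exactly in W
code = expand_code(w_base, deltas)
```

Weighting this penalty in the training loss is one knob on the distortion-editability tradeoff: larger offsets reconstruct better but edit worse.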
  28. e4e [Tov+] — Results • Editing: vs. pSp [Alaluf+ CVPR21] • vs. optimization (optimization is unsuitable for editing)
  29. SWAGAN [Gal+]

  30. SWAGAN [Gal+] — Contribution • Previous GANs suffer from quality degradation in high-frequency content • Proposes a Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the Haar-wavelet frequency domain (not RGB) • Achieves faster training (~0.25× the time) • (Weakness) Encoder-based inversion methods suffer from acute high-frequency shortcomings, due to their L2-based losses
  31. SWAGAN [Gal+] — Method • Overall architecture: generator and discriminator operate on wavelet coefficients • To upsample, features are first converted back to the RGB domain
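To show what "the Haar-wavelet frequency domain" means concretely, here is a minimal single-level 2D Haar transform and its inverse in NumPy (a sketch of the representation SWAGAN generates in, not its implementation): each 2×2 block is split into a low-frequency band and three detail bands.

```python
import numpy as np

def haar2d(img):
    """Single-level 2D Haar transform of a (2h, 2w) image."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    ll = (a + b + c + d) / 2     # low-frequency (coarse) band
    lh = (a - b + c - d) / 2     # horizontal detail
    hl = (a + b - c - d) / 2     # vertical detail
    hh = (a - b - c + d) / 2     # diagonal detail
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse transform: perfect reconstruction of the original image."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w))
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2
    img[0::2, 1::2] = (ll - lh + hl - hh) / 2
    img[1::2, 0::2] = (ll + lh - hl - hh) / 2
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return img

x = np.random.rand(8, 8)
bands = haar2d(x)
x_rec = ihaar2d(*bands)
```

Because the detail bands carry the high frequencies explicitly, a generator trained on these coefficients is pushed to model high-frequency content directly rather than as a by-product of RGB synthesis.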
  32. SWAGAN [Gal+] — Results (1/2) • Generated samples • Comparison of time [s] to process 1,000 images • Bi: proposed, NWD: non-wavelet discriminator, NU: neural upsample (SWAGAN-Bi vs. SWAGAN-NU)
  33. SWAGAN [Gal+] — Results (2/2) • Optimized latent code interpolation

  34. StyleFlow [Abdal+]

  35. StyleFlow [Abdal+] — Contributions • Proposes StyleFlow, which controls the generation process with attribute conditions and is formulated with conditional continuous normalizing flows • An invertible normalizing flow maps an unknown distribution to a known distribution • Reference: the "Introduction to Normalizing Flows" series at https://tatsy.github.io/blog/ — the best overview articles on normalizing flows!
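The invertibility idea can be shown with the simplest possible flow, a 1-D affine map (my toy example, far simpler than the conditional CNFs StyleFlow uses): y = a·x + b is invertible, and the change-of-variables formula turns a known base density into the density of y via the log-determinant of the Jacobian.

```python
import numpy as np

def forward(x, a, b):
    """Affine flow y = a*x + b with its per-sample log|det Jacobian|."""
    y = a * x + b
    log_det = np.log(np.abs(a)) * np.ones_like(x)   # dy/dx = a
    return y, log_det

def inverse(y, a, b):
    return (y - b) / a

def log_prob(y, a, b):
    """Density of y under the flow, base distribution N(0, 1)."""
    x = inverse(y, a, b)                            # map back to base space
    base = -0.5 * (x ** 2 + np.log(2 * np.pi))      # standard normal log-density
    return base - np.log(np.abs(a))                 # change of variables

a, b = 2.0, 1.0
x = np.linspace(-3, 3, 7)
y, _ = forward(x, a, b)
```

A continuous normalizing flow replaces the fixed affine map with an ODE-defined transformation, which is why StyleFlow trains with a neural ODE solver.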
  36. StyleFlow [Abdal+] — Method • Optimized with a neural ODE solver [Chen+ NeurIPS18 Best Paper]
  37. StyleFlow [Abdal+] — Results

  38. Conclusion • Introduced the StyleGAN architecture and projection methods • Explained the six papers of the "Image Editing with GANs" session • Surprisingly, all of the papers employ StyleGAN!!
  39. Appendix — GAN Inversion • Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? [Abdal+ ICCV19]: the first paper to embed images into StyleGAN • Image2StyleGAN++: How to Edit the Embedded Images? [Abdal+ CVPR20]: on semantic editing • ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement [Alaluf+ ICCV21]: iteratively adds residuals to the predictions of pSp [Alaluf+ CVPR21]
  40. Appendix — Space of StyleGAN • StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation [Wu+ CVPR21]: defines the Conv feature space as the S space and finds it more disentangled than W+