Projection into StyleGAN
Introduction — Motivation
Real-world face image → W+ space
How can we embed real images into the StyleGAN prior?
If this succeeds, we can edit real-world images!
(*Successfully reconstructing an image via embedding ≠ semantic editability)
Projection into StyleGAN
Introduction — Image2StyleGAN[Abdal+ ICCV19] (Iterative Optimization)
1. Init (latent in W+ space)
2. Generate (with G)
3. Calc loss
4. Update w*
Final
Accurate reconstructed result
Slow
No generalization to space
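The four-step loop above can be sketched as follows. This is a toy under stated assumptions, not the Image2StyleGAN implementation: the generator G is replaced by a hypothetical fixed linear map `A`, the loss is plain MSE, and plain gradient descent stands in for the optimizer. The names `generate`, `mse`, and `project` are illustrative only.

```python
def generate(A, w):
    """Toy stand-in for G: a fixed linear map x = A @ w (assumption)."""
    return [sum(a * wj for a, wj in zip(row, w)) for row in A]


def mse(x, target):
    """Mean squared error between a generated and a target 'image'."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target)) / len(target)


def project(A, target, steps=200, lr=1.0):
    """Find a latent w* whose generated output matches `target`."""
    n, dim = len(A), len(A[0])
    w = [0.0] * dim                                    # 1. Init
    for _ in range(steps):
        x = generate(A, w)                             # 2. Generate
        # 3. Calc loss: MSE = mean(residual^2); its gradient is (2/n) A^T residual
        residual = [xi - ti for xi, ti in zip(x, target)]
        grad = [2.0 / n * sum(A[i][j] * residual[i] for i in range(n))
                for j in range(dim)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]  # 4. Update w*
    return w


# Hypothetical 8-pixel "image" generated from a 4-dimensional latent.
A = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, -0.25, 0.1, 0.3],
    [-0.2, 0.4, 0.3, -0.1],
    [0.1, 0.2, -0.5, 0.25],
    [0.3, -0.1, 0.2, 0.45],
]
w_true = [0.5, -1.0, 2.0, 0.25]
target = generate(A, w_true)

w_star = project(A, target)
final_loss = mse(generate(A, w_star), target)
```

The whole loop runs again for every new image, which is where the "Slow" point comes from: nothing learned for one image carries over to the next.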
Projection into StyleGAN
Fast inference
Can’t reconstruct details
Does not work for OoD samples
Introduction — pSp[Alaluf+ CVPR21] (Learning Encoder)
Input Output
https://twitter.com/notlewistbh/status/1432936600745431041?s=20
Please search for “Face Depixelizer” on Twitter
Can’t ignore strong prior
StyleCariGAN[Jang+]
Result
vs. StyleGAN Inversion
vs. I2I Translation
Caricature to Real
FID vs. I2I Translation
AgileGAN[Song+]
Contribution
• Generate high-quality stylized portraits.
• Introduce a hierarchical VAE, which embeds into Z+, to enforce that the inverse-mapped
distribution conforms to the original prior.
W+
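A common way to make encoded latents conform to a Gaussian prior is the VAE KL term, applied here to each layer's latent. The sketch below shows that term only. Whether AgileGAN uses exactly this form per layer is an assumption, and the function names are hypothetical.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def hierarchical_kl(mus, log_vars):
    """Sum the KL term over all layers (one Gaussian latent per layer)."""
    return sum(kl_to_standard_normal(m, lv) for m, lv in zip(mus, log_vars))


# Two hypothetical layers with 1-dimensional latents:
# the first is shifted away from the prior, the second matches it exactly.
kl_total = hierarchical_kl(mus=[[1.0], [0.0]], log_vars=[[0.0], [0.0]])
```

The term is zero only when every layer's predicted distribution equals the standard normal prior, so minimizing it pulls the inverse-mapped codes back towards the distribution the generator was trained on.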
AgileGAN[Song+]
Results
vs. I2I Translation
vs. Inversion (Use fine-tuned stylization model)
Ablation
Semantic Editing
e4e[Tov+]
Contribution
• Study the latent space of StyleGAN
• Propose considering both the distortion and the perceptual quality of the reconstructed image.
• Propose two principles for designing encoders that control proximity to W, based on the
distortion-editability tradeoff and the distortion-perception tradeoff within the StyleGAN
latent space.
e4e[Tov+]
Method
W: prior, Wk: OoD latent
OoD latents achieve lower distortion but worse editability (tradeoff)
Overview of objectives for latents:
Restrict variance (blue arrow)
Guide towards prior W (red arrow)
End-to-End Architecture
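The "restrict variance" objective can be illustrated with a small sketch: the encoder predicts one base code plus per-layer offsets, and a regularizer keeps the offsets small so that all layer codes stay close to each other. The names `build_style_codes` and `delta_regularizer` are hypothetical, and the second objective (guiding codes towards the prior W) is omitted here.

```python
import math


def build_style_codes(w, deltas):
    """Return [w, w + d_1, ..., w + d_{N-1}]: a base code plus per-layer offsets."""
    return [list(w)] + [[wi + di for wi, di in zip(w, d)] for d in deltas]


def delta_regularizer(deltas):
    """Sum of L2 norms of the offsets; zero means every layer shares one code."""
    return sum(math.sqrt(sum(di * di for di in d)) for d in deltas)


# Hypothetical 2-dimensional codes for a 3-layer generator.
w = [1.0, 2.0]
deltas = [[0.0, 0.0], [3.0, 4.0]]

codes = build_style_codes(w, deltas)
penalty = delta_regularizer(deltas)
```

With all offsets at zero the encoder is restricted to a single repeated code. Relaxing the penalty lets each layer drift independently, which trades editability for lower distortion.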
e4e[Tov+]
Results
(pSp[Alaluf+ CVPR21])
vs. pSp
Editing vs. Optimization
(Optimization is unsuitable for editing.)
SWAGAN[Gal+]
Contribution
• Previous GANs suffer from degraded quality in high-frequency content…
• Propose the Style and WAvelet based GAN (SWAGAN), which implements progressive
generation in the frequency domain of the Haar wavelet (not RGB).
• Achieve faster training (x0.25 time)
• (Weakness) Encoder-based inversion methods suffer from acute high-frequency
shortcomings because they use L2-based losses.
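To make "the frequency domain of the Haar wavelet" concrete, here is a minimal single-level 2D Haar transform and its inverse in plain Python. The normalization and the band names are assumptions (conventions vary), and this is not SWAGAN's own code.

```python
def haar_dwt2(img):
    """Single-level 2D Haar transform of an image with even height and width.

    Each 2x2 block [[a, b], [c, d]] maps to four coefficients:
      ll = (a + b + c + d) / 2   low-frequency average
      lh = (a - b + c - d) / 2   difference between left and right columns
      hl = (a + b - c - d) / 2   difference between top and bottom rows
      hh = (a - b - c + d) / 2   diagonal difference
    """
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 2
            lh[i // 2][j // 2] = (a - b + c - d) / 2
            hl[i // 2][j // 2] = (a + b - c - d) / 2
            hh[i // 2][j // 2] = (a - b - c + d) / 2
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-resolution image."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
bands = haar_dwt2(image)
reconstructed = haar_idwt2(*bands)
flat_bands = haar_dwt2([[3, 3], [3, 3]])
```

A flat image puts all of its energy in `ll` and leaves the three detail bands at zero, so a network that predicts wavelet coefficients represents high-frequency content explicitly instead of implicitly through RGB pixels.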
SWAGAN[Gal+]
Method
Overall architecture
Generator / Discriminator
To upsample, the wavelet coefficients are first converted to the RGB domain
SWAGAN[Gal+]
Results (1/2)
Generated samples
Comparison of time [s] to process 1,000 imgs
Bi: Proposed, NWD: Non-Wavelet-Discriminator, NU: Neural Upsample
SWAGAN-Bi
SWAGAN-NU
StyleFlow[Abdal+]
Contributions
• Propose StyleFlow, which controls the generation process conditioned on attributes and is
formulated as conditional continuous normalizing flows.
• Reference
• Introduction to Normalizing Flows series, https://tatsy.github.io/blog/
• The best overview article on normalizing flows!
Unknown Distribution
Known Distribution
Invertible Normalizing Flow
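The idea of mapping a known distribution to an unknown one through an invertible function rests on the change-of-variables formula. Below is a minimal 1D sketch with a single affine flow. The class `AffineFlow` and its parameters are hypothetical, and real flows stack many learned layers.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the known base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


class AffineFlow:
    """Invertible map x = scale * z + shift between base and data space."""

    def __init__(self, scale, shift):
        if scale == 0:
            raise ValueError("scale must be non-zero for the flow to be invertible")
        self.scale = scale
        self.shift = shift

    def forward(self, z):
        """Known distribution -> modeled distribution."""
        return self.scale * z + self.shift

    def inverse(self, x):
        """Modeled distribution -> known distribution."""
        return (x - self.shift) / self.scale

    def log_prob(self, x):
        """Change of variables: log p_x(x) = log p_z(f^-1(x)) - log|scale|."""
        return standard_normal_logpdf(self.inverse(x)) - math.log(abs(self.scale))


flow = AffineFlow(scale=2.0, shift=1.0)
round_trip = flow.inverse(flow.forward(0.3))
log_density_at_mean = flow.log_prob(1.0)
```

Because the map is invertible, the same object both generates samples (`forward`) and evaluates exact densities (`log_prob`), which is the property that separates flows from GANs.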
StyleFlow[Abdal+]
Method
Optimized by Neural ODE Solver[Chen+ NeurIPS18 Best Paper]
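A continuous normalizing flow replaces discrete layers with an ODE: the state follows dz/dt = f(z, t) and the log-density changes by minus the trace of the Jacobian of f. The sketch below integrates a toy 1D linear dynamics with fixed-step Euler. The function `integrate_cnf` and the scalar `rate` (standing in for a network conditioned on attributes) are assumptions for illustration, not StyleFlow's solver or model.

```python
import math


def integrate_cnf(z0, rate, steps=10000):
    """Euler-integrate dz/dt = rate * z and d(log p)/dt = -rate over t in [0, 1]."""
    dt = 1.0 / steps
    z, delta_logp = z0, 0.0
    for _ in range(steps):
        z += dt * rate * z       # state update along the flow
        delta_logp -= dt * rate  # minus the trace of df/dz (a 1x1 Jacobian here)
    return z, delta_logp


# Forward pass, then the reverse pass obtained by negating the dynamics.
z1, dlogp = integrate_cnf(1.0, rate=0.5)
z_back, _ = integrate_cnf(z1, rate=-0.5)
```

For this linear case the exact solution is z(1) = z(0) * exp(rate), so the Euler result can be checked against it. Running the same dynamics backwards recovers the starting point up to discretization error, which is what makes the flow invertible.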
StyleFlow[Abdal+]
Results
Conclusion
• Introduced the StyleGAN architecture and projection
• Explained six papers from the “Image Editing with GANs” session.
• Surprisingly, all papers employ StyleGAN!!
Appendix
GAN Inversion
• Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?[Abdal+ ICCV19]
• The first paper to embed images into StyleGAN
• Image2StyleGAN++: How to Edit the Embedded Images?[Abdal+ CVPR20]
• On semantic editing
• ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement[Alaluf+ ICCV21]
• Iteratively adds residuals to the predictions of pSp[Alaluf+ CVPR21]
Appendix
Space of StyleGAN
• StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation[Wu+ CVPR21]
• Defines the feature space of the conv layers as the S space and finds that it is more
disentangled than W+