Slide 1

Optimal Transport-driven CycleGAN for Unsupervised Learning in Inverse Problems
Jong Chul Ye, Ph.D., Professor
BISPL - BioImaging, Signal Processing, and Learning Lab
Dept. of Bio/Brain Engineering, Dept. of Mathematical Sciences
KAIST, Korea

Slide 2

Classical Learning vs Deep Learning
Classical machine learning: feature engineering → diagnosis
Deep learning: no feature engineering → diagnosis
Esteva et al., Nature Medicine (2019)

Slide 3

Deep Learning for Scientific Discovery
From diagnosis to diagnosis & analysis
New frontiers of deep learning: inverse problems

Slide 4

Penalized LS for Inverse Problems
Likelihood term (data fidelity term) + prior term (regularization term)
• Classical approaches for inverse problems: Tikhonov, TV, compressed sensing
• Top-down model
• Transductive → non-inductive
• Computationally expensive
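For reference, the penalized least-squares (PLS) objective these terms refer to can be written as follows (a standard form, assuming a linear forward operator $A$ and a regularizer $R$; the slide itself does not spell out the equation):

$$\min_{x} \; \underbrace{\|y - Ax\|_2^2}_{\text{data fidelity (likelihood)}} \;+\; \lambda \underbrace{R(x)}_{\text{regularization (prior)}}$$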

Slide 5

Feed-Forward Neural Network Approaches
• CNN as a direct inverse operation
• The simplest and fastest method
• Supervised learning with lots of data

Slide 6

Model-Based, PnP Using CNN Prior
Likelihood term (data fidelity term) + CNN-based regularization
• CNN is used as a denoiser
• Can use relatively small CNNs → fewer training data
• Supervised learning
• Still iterative
Aggarwal et al., IEEE TMI, 2018; Liu et al., IEEE JSTSP, 2020; Wu et al., IEEE JSTSP, 2020
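As a concrete illustration, a plug-and-play (PnP) ADMM iteration replaces the proximal step of the prior with a learned denoiser. The sketch below is schematic, not the exact algorithm of the cited papers; `A`, `At` (forward operator and its adjoint), and `denoiser` are hypothetical callables, and the inner gradient steps and step size are illustrative.

```python
import numpy as np

def pnp_admm(y, A, At, denoiser, rho=1.0, n_iter=50):
    """PnP-ADMM sketch: a CNN denoiser stands in for the prior's prox."""
    x = At(y)                 # initialize from the adjoint of the measurement
    v, u = x.copy(), np.zeros_like(x)
    for _ in range(n_iter):
        # Data-fidelity step: approximately minimize
        # ||y - Ax||^2 + rho * ||x - (v - u)||^2 by a few gradient steps.
        for _ in range(5):
            grad = At(A(x) - y) + rho * (x - (v - u))
            x = x - 0.1 * grad
        v = denoiser(x + u)   # learned CNN replaces the proximal operator
        u = u + x - v         # dual (multiplier) update
    return x
```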

Slide 7

Deep Image Prior (DIP)
• CNN architecture itself acts as the regularization
• Unsupervised learning
• Extensive computation (≫ PLS)
Ulyanov et al., CVPR, 2018
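In DIP, a randomly initialized network is fitted to a single measurement, so the architecture alone regularizes the solution. A minimal sketch under stated assumptions (`net` is an untrained CNN, `A` the known forward operator, `z` a fixed random input tensor; all hypothetical names):

```python
import torch

def deep_image_prior(y, A, net, z, n_iter=2000, lr=1e-3):
    """DIP sketch: fit an untrained CNN so that A(net(z)) matches y."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(n_iter):
        opt.zero_grad()
        x = net(z)                        # candidate image from the CNN
        loss = ((A(x) - y) ** 2).mean()   # data fidelity only; no prior term
        loss.backward()
        opt.step()
    return net(z).detach()
```

This heavy per-image optimization loop is why the slide notes computation far exceeding PLS.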

Slide 8

Unsupervised Feed-forward CNN?

Slide 9

Yann LeCun’s Cake Analogy
Slide courtesy of Yann LeCun’s ICIP 2019 talk

Slide 10

Why Unsupervised Learning in Inverse Problems?
Applications: low-dose CT, remote sensing, metal artifact removal, blind deconvolution

Slide 11

Our Geometric View of Unsupervised Learning
Forward physics (unknown, partially known, or known) ↔ inverse solution
Sim et al., arXiv:1909.12116, 2019; Lee et al., arXiv:2007.03480, 2020

Slide 12

Geometry of GAN
Lei et al., arXiv:1710.05488 (2017)
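The geometric picture here, following the OT view of generative models, is that the generator pushes a latent distribution forward into the sample space. Schematically (my rendering of the standard formulation; the slide's own equation did not survive extraction):

$$\min_{G} \; d\big(G_{\#}\mu_Z,\, \nu\big),$$

where $\mu_Z$ is the latent distribution, $\nu$ the data distribution, and $d$ a statistical distance between measures.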

Slide 13

Statistical Distances
• f-divergence (a divergence) → GAN, f-GAN
• Wasserstein-1 (a metric) → W-GAN

Slide 14

Absolute Continuity
f-divergences are only informative when the two distributions are absolutely continuous with respect to each other (overlapping supports); the Wasserstein metric has no such requirement, which motivates its use in what follows.

Slide 15

Optimal Transport: A Gentle Review

Slide 16

Optimal Transport
Transportation map

Slide 17

Push-Forward of a Measure
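The standard definition this slide introduces: given a measurable map $T: X \to Y$ and a measure $\mu$ on $X$, the push-forward measure $T_{\#}\mu$ is defined by

$$(T_{\#}\mu)(B) = \mu\big(T^{-1}(B)\big) \quad \text{for every measurable } B \subseteq Y.$$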

Slide 18

Optimal Transport: Monge
Monge’s original OT
• Difficulty comes from the push-forward constraint
Transportation cost
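Monge’s formulation, in standard notation (the slide’s equation did not survive extraction): find a transport map $T$ minimizing the transportation cost subject to the push-forward constraint,

$$\min_{T:\; T_{\#}\mu = \nu} \; \int_X c\big(x, T(x)\big)\, d\mu(x).$$

The constraint $T_{\#}\mu = \nu$ is non-convex and may even be infeasible (mass cannot be split), which is the difficulty the slide refers to.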

Slide 19

Optimal Transport: Kantorovich
Kantorovich’s OT
• Allows mass splitting
• Probabilistic OT
• Linear programming → Nobel Prize in Economics
Transportation cost; joint distribution
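Kantorovich relaxes Monge’s map to a joint distribution (transport plan), giving a linear program over couplings:

$$\min_{\pi \in \Pi(\mu,\nu)} \; \int_{X \times Y} c(x, y)\, d\pi(x, y),$$

where $\Pi(\mu,\nu)$ denotes the set of joint distributions whose marginals are $\mu$ and $\nu$.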

Slide 20

Kantorovich Dual Formulation
c-transform · marginal distribution · Kantorovich potential
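In standard form, the Kantorovich dual maximizes over a potential $\varphi$ and its c-transform:

$$\max_{\varphi} \; \int_X \varphi(x)\, d\mu(x) + \int_Y \varphi^{c}(y)\, d\nu(y), \qquad \varphi^{c}(y) := \inf_{x}\big( c(x,y) - \varphi(x) \big),$$

where $\varphi$ is the Kantorovich potential and the integrals are taken against the marginal distributions.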

Slide 21

Wasserstein-1 Metric and Its Dual
Kantorovich dual with a 1-Lipschitz (Lip1) potential
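For the cost $c(x,y) = \|x - y\|$, the c-transform collapses to $\varphi^{c} = -\varphi$ over 1-Lipschitz functions, giving the Kantorovich-Rubinstein duality:

$$W_1(\mu,\nu) = \sup_{\|\varphi\|_{\mathrm{Lip}} \le 1} \; \int \varphi\, d\mu - \int \varphi\, d\nu.$$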

Slide 22

Wasserstein GAN with Gradient Penalty
Generator · discriminator · 1-Lipschitz penalty
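The 1-Lipschitz constraint on the discriminator is commonly enforced with a gradient penalty (WGAN-GP). A minimal sketch, assuming a PyTorch discriminator `D` and batches `real`/`fake` of matching shape:

```python
import torch

def gradient_penalty(D, real, fake, lam=10.0):
    """WGAN-GP sketch: push the discriminator's gradient norm toward 1
    along random interpolates between real and fake samples."""
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)[0]
    return lam * ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
```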

Slide 23

Generative Adversarial Nets (GANs)
Generator (transport map) vs. discriminator (Kantorovich potential)
https://www.cfml.se/blog/generative_adversarial_networks/

Slide 24

Geometry of CycleGAN
Sim et al., arXiv:1909.12116, 2019; Khan et al., arXiv:2006.14773, 2020; Lee et al., arXiv:2007.03480, 2020

Slide 25

Geometry of CycleGAN
Forward physics (unknown, partially known, or known) ↔ inverse solution
Sim et al., arXiv:1909.12116, 2019; Lee et al., arXiv:2007.03480, 2020

Slide 26

Two Wasserstein Metrics in Unsupervised Learning

Slide 27

Joint Minimization → CycleGAN
Dual formulation
Sim et al., arXiv:1909.12116, 2019
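Schematically, with $F: Y \to X$ the inverse (reconstruction) generator and $G: X \to Y$ the forward generator, the joint minimization of the two Wasserstein-1 metrics from the previous slide reads (my notation, consistent with the slide titles rather than verbatim from Sim et al.):

$$\min_{G, F} \; W_1\big(\mu_X,\, F_{\#}\mu_Y\big) + W_1\big(\mu_Y,\, G_{\#}\mu_X\big).$$

A dual formulation of this joint problem leads to the CycleGAN-type loss, with discriminator terms and a cycle-consistency term, shown on the next slide.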

Slide 28

CycleGAN: Loss Function
1-Lipschitz discriminators + cycle-consistency
Forward operator may be:
• Unknown
• Partially known
• Known
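A schematic of this loss in code form (a sketch under stated assumptions, not the authors' implementation): `G`, `F` are the two generators and `D_X`, `D_Y` WGAN-style discriminators, kept approximately 1-Lipschitz in practice, e.g. via the gradient penalty sketched earlier.

```python
import torch

def cyclegan_losses(G, F, D_X, D_Y, x, y, lam=10.0):
    """CycleGAN objective sketch: cycle-consistency + Wasserstein terms."""
    fake_y, fake_x = G(x), F(y)
    # Cycle-consistency: F(G(x)) should return to x, and G(F(y)) to y.
    l_cycle = (F(fake_y) - x).abs().mean() + (G(fake_x) - y).abs().mean()
    # Generators minimize cycle loss plus negated critic scores on fakes.
    gen_loss = lam * l_cycle - D_Y(fake_y).mean() - D_X(fake_x).mean()
    # Critics maximize the Wasserstein-1 estimates (minimize the negation);
    # fakes are detached so critic updates do not touch the generators.
    disc_loss = (D_Y(fake_y.detach()) - D_Y(y)).mean() \
              + (D_X(fake_x.detach()) - D_X(x)).mean()
    return gen_loss, disc_loss
```

In the partially known or known cases listed above, the forward-direction generator can incorporate the physics model instead of being learned from scratch.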

Slide 29

CycleGAN vs Penalized LS
PLS: data fidelity term + regularization term
CycleGAN: data fidelity term + regularization term
CycleGAN can be considered as a stochastic generalization of PLS

Slide 30

Case 1: Unsupervised Denoising for Low-Dose CT
• Multiphase cardiac CT denoising
– Phases 1-2: low dose; phases 3-10: normal dose
– Goal: capture dynamic changes of heart structure
– No reference available
Kang et al., Medical Physics, 2018

Slide 31

Unsupervised Denoising by CycleGAN
Kang et al., Medical Physics, 2019

Slide 32

Low dose (5%) → high dose
Panels: input (phase 1), proposed, target (phase 8), input − output difference
Kang et al., unpublished data

Slide 33

Ablation Study
Panels (a)-(h): input (phase 1), proposed, without identity loss, GAN
Kang et al., Medical Physics, 2018

Slide 34

Ablation Study
Panels (a)-(h): input (phase 1), proposed, without identity loss, GAN
Kang et al., Medical Physics, 2018

Slide 35

Case 2: Unsupervised Deconvolution Microscopy
Lim et al., IEEE TCI, 2020

Slide 36

Results on Real Microscopy Data
✓ Qualitative results: transverse and sagittal views
• Runtime for 512 × 512 × 50 volume inference: 15 s
Lim et al., IEEE TCI, 2020

Slide 37

Case 3: Unsupervised Learning for Accelerated MRI
Sim et al., arXiv:1909.12116, 2019

Slide 38

Results on Fast MR Data Set
Sim et al., arXiv:1909.12116, 2019

Slide 39

Extensions
• β-CycleGAN
• Multi-domain CycleGAN
• Wavelet directional CycleGAN

Slide 40

Variational Auto-Encoder (VAE)
Loss function (ELBO: evidence lower bound) = likelihood term + latent-space distance
Kingma et al., arXiv:1312.6114 (2013)
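The ELBO referenced here, in standard notation from Kingma & Welling:

$$\log p_\theta(x) \;\ge\; \underbrace{\mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big]}_{\text{likelihood (reconstruction)}} \;-\; \underbrace{D_{\mathrm{KL}}\big(q_\phi(z|x)\,\|\,p(z)\big)}_{\text{latent-space distance}}.$$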

Slide 41

Latent Space Geometry of VAEs
KL divergence in latent space; ℓ2 distance in sample space

Slide 42

β-Variational Auto-Encoder (β-VAE)
Feature space disentanglement
Higgins et al., ICLR, 2017
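The β-VAE objective (Higgins et al.) reweights the KL term of the ELBO:

$$\mathcal{L}_\beta = \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] - \beta\, D_{\mathrm{KL}}\big(q_\phi(z|x)\,\|\,p(z)\big),$$

where $\beta > 1$ trades reconstruction fidelity for a more disentangled latent representation.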

Slide 43

β-CycleGAN for Unsupervised Metal Artifact Reduction
Metal artifact images ↔ artifact-free images
Metal artifact generation physics (beam hardening, photon starvation, etc.) → highly complicated to learn
Motivation:
• Less focus on the artifact generation
• More emphasis on MAR
Lee et al., arXiv:2007.03480, 2020

Slide 44

β-CycleGAN: Disentangled Representation
Dual formulation
Lee et al., arXiv:2007.03480, 2020

Slide 45

β-CycleGAN for Unsupervised Metal Artifact Removal
Discriminator: 1/β-Lipschitz

Slide 46

Attention-guided β-CycleGAN

Slide 47

Comparison panels (two cases): input, LI, NMAR

Slide 48

Preservation of Original Details
Non-metallic cases

Slide 49

Multiple-Domain CycleGAN: Asymmetric Transfer
Motivation: ultrasound artifact removal across domains (DAS image, despeckled, deconvolved)
Huh et al., arXiv:2007.05205 (2020)

Slide 50

Multiple-Domain CycleGAN: Asymmetric Transfer
Diagram: CycleGAN vs. multi-domain CycleGAN over domains 1-3 with generator(s) G
Huh et al., arXiv:2007.05205 (2020)

Slide 51

Multiple-Domain CycleGAN
Diagram: StarGAN (symmetric; Choi et al., CVPR, 2019) vs. proposed (asymmetric) transfer over domains 1-3 with generator G
Huh et al., arXiv:2007.05205 (2020)

Slide 52

Geometry of Multi-Domain CycleGAN
Diagram: geometry between domain 1 and domain 2
Huh et al., arXiv:2007.05205 (2020)

Slide 53

Multi-Domain CycleGAN: Loss Function
KL divergence + Wasserstein distance terms
Huh et al., arXiv:2007.05205 (2020)

Slide 54

Multi-Domain CycleGAN: Dual Formulation
1-Lipschitz penalty
Huh et al., arXiv:2007.05205 (2020)

Slide 55

Multi-Domain CycleGAN for US Artifact Removal
Huh et al., arXiv:2007.05205 (2020)

Slide 56

No content

Slide 57

Wavelet Directional CycleGAN
Song et al., arXiv:2002.09847 (2020)

Slide 58

Unsupervised Learning with CycleGAN
Song et al., arXiv:2002.09847 (2020)

Slide 59

Agricultural area

Slide 60

Cloud

Slide 61

Ocean

Slide 62

Summary
• Unsupervised learning has become a very important topic in deep image reconstruction
• Our theoretical findings:
– Optimal transport is an important mathematical tool for designing unsupervised networks
– CycleGAN can be derived by minimizing two Wasserstein-1 distances in the input and target spaces
– Various extensions of CycleGAN
– The geometric view can be generalized to other problems

Slide 63

Questions?