Slide 1

Slide 1 text

Physical Regularization of Optimal Transport Nicolas Papadakis Transport optimal en apprentissage statistique et traitement du signal 1 / 50

Slide 2

Slide 2 text

Optimal Transport (OT) Basic ingredients • OT defines a family of distances between probability densities • Transport a mass ρ0 onto ρ1: • Define a cost C(x, y) of mass transport between locations x and y • OT: the map with minimal global cost that transfers ρ0 onto ρ1 • If $C(x, y) = \|x - y\|^p$: $L^p$ Wasserstein distance • Concave cost (Economy), Truncated cost (Computer Vision) 1 / 50

Slide 3

Slide 3 text

Applications in IP, CV and ML Robust dissimilarity measure (Optimal transport cost) • Image retrieval (EMD) [Rubner et al. ’00] • 3D shape recognition [Ruzon and Tomasi ’01] • SIFT matching [Pele and Werman ’08] • Object segmentation [Ni et al. ’09, Rabin et al. ’11, ’15] • Denoising [Burger et al. ’12, Tartavel et al. ’16] • Loss function [Frogner et al. ’15, Genevay et al. ’17] • Generative models [Arjovsky et al. ’17] 2 / 50

Slide 4

Slide 4 text

Applications in IP, CV and ML Robust dissimilarity measure (Optimal transport cost) • Image retrieval (EMD) [Rubner et al. ’00] • 3D shape recognition [Ruzon and Tomasi ’01] • SIFT matching [Pele and Werman ’08] • Object segmentation [Ni et al. ’09, Rabin et al. ’11, ’15] • Denoising [Burger et al. ’12, Tartavel et al. ’16] • Loss function [Frogner et al. ’15, Genevay et al. ’17] • Generative models [Arjovsky et al. ’17] Why is it robust? Discrete bin-to-bin metrics are not informative for disjoint supports 2 / 50

Slide 5

Slide 5 text

Applications in IP, CV and ML Robust dissimilarity measure (Optimal transport cost) • Image retrieval (EMD) [Rubner et al. ’00] • 3D shape recognition [Ruzon and Tomasi ’01] • SIFT matching [Pele and Werman ’08] • Object segmentation [Ni et al. ’09, Rabin et al. ’11, ’15] • Denoising [Burger et al. ’12, Tartavel et al. ’16] • Loss function [Frogner et al. ’15, Genevay et al. ’17] • Generative models [Arjovsky et al. ’17] Why is it robust? Discrete bin-to-bin metrics are not informative for disjoint supports The transport map T explains how far apart the distributions are 2 / 50

Slide 6

Slide 6 text

Optimal Transport Map • The transport map: • Interpolate between densities, compute barycenters or geodesics in the Wasserstein space ρ0 ρ1 3 / 50

Slide 7

Slide 7 text

Applications in IP, CV and ML Tool for matching/interpolation (Optimal transport map) • Image interpolation, registration [Angenent et al. ’04] Medical image registration [Rehman et al. ’09] • Color transfer [Delon, ’04, Pitié et al. ’07, Bonneel et al. ‘11] • Shape matching [Rabin et al. ’10, Schmitzer and Schnörr ’14] • Texture synthesis [Xia et al. ’13, Galerne et al. ’18, Leclaire et al. ’19] • Geodesic PCA [Bigot et al. ‘13, Seguy et al. ‘15, Cazelles et al. ‘18] • Domain adaptation [Courty et al. ’15, Redko et al. ’17] • Generative models [Seguy et al. ’18] 4 / 50

Slide 8

Slide 8 text

Applications in IP, CV and ML Tool for matching/interpolation (Optimal transport map) • Image interpolation, registration [Angenent et al. ’04] Medical image registration [Rehman et al. ’09] • Color transfer [Delon, ’04, Pitié et al. ’07, Bonneel et al. ‘11] • Shape matching [Rabin et al. ’10, Schmitzer and Schnörr ’14] • Texture synthesis [Xia et al. ’13, Galerne et al. ’18, Leclaire et al. ’19] • Geodesic PCA [Bigot et al. ‘13, Seguy et al. ‘15, Cazelles et al. ‘18] • Domain adaptation [Courty et al. ’15, Redko et al. ’17] • Generative models [Seguy et al. ’18] Today: Use of the transport map for Image Processing applications 4 / 50

Slide 9

Slide 9 text

Formulations, Numerical methods, Limitations Continuous [Benamou - Brenier ’00] Semi-discrete [Mérigot ’11] Discrete [Cuturi ’13] 5 / 50

Slide 10

Slide 10 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Interpolation µt between images: ρ0 ρ1 5 / 50

Slide 11

Slide 11 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Interpolation µt between images: ρ0 ρ1 ⇒ Objects contained in the scene are not preserved 5 / 50

Slide 12

Slide 12 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Exact Wasserstein for GAN 5 / 50

Slide 13

Slide 13 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Exact Wasserstein for GAN Generator 5 / 50

Slide 14

Slide 14 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Exact Wasserstein for GAN Wasserstein discriminator Generator 5 / 50

Slide 15

Slide 15 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Exact Wasserstein for GAN ⇒ Over-fitting 5 / 50

Slide 16

Slide 16 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Transfer of colors between images 5 / 50

Slide 17

Slide 17 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Irregularity of the transport map Transfer of colors between images ⇒ Artifacts appear with exact prescription of color histograms 5 / 50

Slide 18

Slide 18 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete Objective: Generalized OT models with regularized transport maps 5 / 50

Slide 19

Slide 19 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete • Images: densities on support Ω • Mass transport in a fluid mechanics framework on Ω Velocity field T : Ω → Ω 5 / 50

Slide 20

Slide 20 text

Formulations, Numerical methods, Limitations Continuous Semi-discrete Discrete • Images: densities on support Ω • Mass transport in a fluid mechanics framework on Ω Velocity field T : Ω → Ω • Distributions of image features • Transport between normalized histograms of size N and M Coupling matrix P of size M × N 5 / 50

Slide 21

Slide 21 text

Overview Part I - Continuous formulation • Dynamic optimal transport • Generalization of the transport cost • Non-convex model with physical priors ⇒ Application to data interpolation Part II - Discrete formulation • Relaxation and regularization of static transport matrix • Non-convex model to cancel mass spreading ⇒ Application to color transfer 6 / 50

Slide 22

Slide 22 text

Part I Continuous Optimal Transport 7 / 50

Slide 23

Slide 23 text

Context • National project on numerical algorithms for optimal transport • Collaborations with oceanographers: Sea Surface Height: creation of vortexes in Cap Point (output of model NEMO) • Objective: Image interpolation • Problem: How to deal with the coast? 8 / 50

Slide 24

Slide 24 text

Context • National project on numerical algorithms for optimal transport • Collaborations with oceanographers: Sea Surface Height: creation of vortexes in Cap Point (output of model NEMO) • Objective: Image interpolation • Problem: How to deal with the coast? Optical flow 8 / 50

Slide 25

Slide 25 text

Context • National project on numerical algorithms for optimal transport • Collaborations with oceanographers: Sea Surface Height: creation of vortexes in Cap Point (output of model NEMO) • Objective: Image interpolation • Problem: How to deal with the coast? Optical flow (crossed out) 8 / 50

Slide 26

Slide 26 text

Context • National project on numerical algorithms for optimal transport • Collaborations with oceanographers: Sea Surface Height: creation of vortexes in Cap Point (output of model NEMO) • Objective: Image interpolation • Problem: How to deal with the coast? Optical flow (crossed out), Discrete OT 8 / 50

Slide 27

Slide 27 text

Context • National project on numerical algorithms for optimal transport • Collaborations with oceanographers: Sea Surface Height: creation of vortexes in Cap Point (output of model NEMO) • Objective: Image interpolation • Problem: How to deal with the coast? Optical flow (crossed out), Discrete OT (crossed out) 8 / 50

Slide 28

Slide 28 text

Overview • Dynamic optimal transport • Generalization of the transport cost • Optimal transport with physical priors 9 / 50

Slide 29

Slide 29 text

Continuous Optimal Transport • Densities ρ0 and ρ1 defined from $x \in [0,1]^d$ to $[0,1]$ • Mass preserving transport map T: $\mathcal{T}(\rho_0,\rho_1) := \{T : [0,1]^d \to [0,1]^d \text{ such that } \rho_1 = T_\# \rho_0\}$ • An optimal transport T solves $\min_{T \in \mathcal{T}(\rho_0,\rho_1)} \int C(x, T(x))\,\rho_0(x)\,dx$, where $C(x,y) \ge 0$ is the cost of assigning $x \in [0,1]^d$ to $y \in [0,1]^d$ 10 / 50

Slide 30

Slide 30 text

Lp optimal transport Properties $C(x, y) = \|x - y\|^p$ ⇒ p-Wasserstein distance between ρ0 and ρ1 • For p > 1, T is unique • For p = 2, T = ∇ψ with ψ convex [Brenier ’91], and optimal mass transfer follows straight lines 11 / 50

Slide 31

Slide 31 text

Lp optimal transport Properties $C(x, y) = \|x - y\|^p$ ⇒ p-Wasserstein distance between ρ0 and ρ1 • For p > 1, T is unique • For p = 2, T = ∇ψ with ψ convex [Brenier ’91], and optimal mass transfer follows straight lines Explicit computation in 1D with cumulative functions 11 / 50

Slide 41

Slide 41 text

Lp optimal transport Properties $C(x, y) = \|x - y\|^p$ ⇒ p-Wasserstein distance between ρ0 and ρ1 • For p > 1, T is unique • For p = 2, T = ∇ψ with ψ convex [Brenier ’91], and optimal mass transfer follows straight lines Explicit computation in 1D with cumulative functions Cannot be extended to higher dimensions $x \in \mathbb{R}^d$, d > 1 11 / 50
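The 1D computation with cumulative functions can be sketched as follows. This is a minimal illustration, assuming both densities are histograms on the same grid; the function name `ot_map_1d` and the bin-level rounding are choices made for this example, not the speaker's implementation. Each source bin is sent to the target bin given by the generalized inverse of the target cumulative function, so mass splitting inside a bin is ignored.

```python
from bisect import bisect_left
from itertools import accumulate


def ot_map_1d(rho0, rho1):
    """Monotone transport map between two 1D histograms on the same grid.

    Returns, for each source bin i, the index of the target bin
    T(i) = F1^{-1}(F0(i)), where F0 and F1 are the cumulative functions.
    """
    total0, total1 = sum(rho0), sum(rho1)
    # Cumulative functions, normalized so that both end at 1.
    cdf0 = [c / total0 for c in accumulate(rho0)]
    cdf1 = [c / total1 for c in accumulate(rho1)]
    last = len(cdf1) - 1
    # Generalized inverse of F1: first target bin whose cumulative value
    # reaches F0(i); the small tolerance guards against rounding ties.
    return [min(bisect_left(cdf1, c - 1e-12), last) for c in cdf0]


if __name__ == "__main__":
    # Mass on the two left bins is moved to the two right bins.
    print(ot_map_1d([1, 1, 0, 0], [0, 0, 1, 1]))
```

Because the map is built from sorted cumulative values, it is monotone, which is the property that has no direct analogue in dimension d > 1.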

Slide 42

Slide 42 text

Estimation of optimal transport map • A transport map $T \in \mathcal{T}(\rho_0,\rho_1)$ satisfies the gradient equation $\rho_0(x) = \rho_1(T(x))\,|\det(\partial T(x))|$ • For p = 2, T = ∇ψ ⇒ Monge-Ampère equation: $\det(D^2\psi) = \rho_0(x) / \rho_1(\nabla\psi(x))$ [Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’12, ’16] 12 / 50

Slide 43

Slide 43 text

Estimation of optimal transport map • A transport map $T \in \mathcal{T}(\rho_0,\rho_1)$ satisfies the gradient equation $\rho_0(x) = \rho_1(T(x))\,|\det(\partial T(x))|$ • For p = 2, T = ∇ψ ⇒ Monge-Ampère equation: $\det(D^2\psi) = \rho_0(x) / \rho_1(\nabla\psi(x))$ [Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’12, ’16] Fast algorithms (second order methods) 12 / 50

Slide 44

Slide 44 text

Estimation of optimal transport map • A transport map $T \in \mathcal{T}(\rho_0,\rho_1)$ satisfies the gradient equation $\rho_0(x) = \rho_1(T(x))\,|\det(\partial T(x))|$ • For p = 2, T = ∇ψ ⇒ Monge-Ampère equation: $\det(D^2\psi) = \rho_0(x) / \rho_1(\nabla\psi(x))$ [Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’12, ’16] Fast algorithms (second order methods) ρ1 should be Lipschitz continuous with convex support ρ0 ρ1 12 / 50

Slide 45

Slide 45 text

Estimation of optimal transport map • A transport map $T \in \mathcal{T}(\rho_0,\rho_1)$ satisfies the gradient equation $\rho_0(x) = \rho_1(T(x))\,|\det(\partial T(x))|$ • For p = 2, T = ∇ψ ⇒ Monge-Ampère equation: $\det(D^2\psi) = \rho_0(x) / \rho_1(\nabla\psi(x))$ [Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’12, ’16] Fast algorithms (second order methods) ρ1 should be Lipschitz continuous with convex support ρ0 ρ1 ρ0 ρ1 12 / 50

Slide 46

Slide 46 text

Estimation of optimal transport map • A transport map $T \in \mathcal{T}(\rho_0,\rho_1)$ satisfies the gradient equation $\rho_0(x) = \rho_1(T(x))\,|\det(\partial T(x))|$ • For p = 2, T = ∇ψ ⇒ Monge-Ampère equation: $\det(D^2\psi) = \rho_0(x) / \rho_1(\nabla\psi(x))$ [Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’12, ’16] Fast algorithms (second order methods) ρ1 should be Lipschitz continuous with convex support ρ0 ρ1 ρ0 ρ1 • Regularized potential ψ [Paty et al. ’19] 12 / 50

Slide 47

Slide 47 text

Estimation of optimal transport map Use of Knothe rearrangement • The Knothe transport solves: $\min_{T \in \mathcal{T}(\rho_0,\rho_1)} \sum_{i=1}^{d} \int_x C(x_i, T(x)_i)$ • Can be computed explicitly 13 / 50

Slide 48

Slide 48 text

Estimation of optimal transport map Use of Knothe rearrangement • The Knothe transport solves: $\min_{T \in \mathcal{T}(\rho_0,\rho_1)} \sum_{i=1}^{d} \int_x C(x_i, T(x)_i)$ • Can be computed explicitly PDE initialized with Knothe rearrangement • T = ∇ψ, then curl(T) = ∇ × T = 0 - Penalization of curl(T) [Angenent et al. ’03, Haber et al. ’10] - PDE on T in the transport map space • Lagrangian formulation using straight lines [Iollo and Lombardi ’11] • PDE on ψ on the torus [Carlier et al. ’10, Bonnotte ’13] 13 / 50

Slide 49

Slide 49 text

Estimation of optimal transport map Use of Knothe rearrangement • The Knothe transport solves: $\min_{T \in \mathcal{T}(\rho_0,\rho_1)} \sum_{i=1}^{d} \int_x C(x_i, T(x)_i)$ • Can be computed explicitly PDE initialized with Knothe rearrangement • T = ∇ψ, then curl(T) = ∇ × T = 0 - Penalization of curl(T) [Angenent et al. ’03, Haber et al. ’10] - PDE on T in the transport map space • Lagrangian formulation using straight lines [Iollo and Lombardi ’11] • PDE on ψ on the torus [Carlier et al. ’10, Bonnotte ’13] All these methods are limited to non-vanishing densities 13 / 50

Slide 50

Slide 50 text

Fluid mechanics formulation [Benamou-Brenier ’00] • Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t): $\rho(\cdot, t) = ((1-t)\,\mathrm{Id} + t\,T)_\# \rho_0$ 14 / 50

Slide 51

Slide 51 text

Fluid mechanics formulation [Benamou-Brenier ’00] • Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t): $\rho(\cdot, t) = ((1-t)\,\mathrm{Id} + t\,T)_\# \rho_0$ • Non-convex problem over $\rho(x,t) \in \mathbb{R}$ and velocity field $v(x,t) \in \mathbb{R}^2$: $W_2(\rho_0,\rho_1)^2 = \min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho(x,t)\,\|v(x,t)\|^2\,dt\,dx$, under the set of non-linear constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = 0,\ v(0,\cdot) = v(1,\cdot) = 0,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ 14 / 50

Slide 52

Slide 52 text

Fluid mechanics formulation [Benamou-Brenier ’00] • Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t): $\rho(\cdot, t) = ((1-t)\,\mathrm{Id} + t\,T)_\# \rho_0$ • Non-convex problem over $\rho(x,t) \in \mathbb{R}$ and velocity field $v(x,t) \in \mathbb{R}^2$: $W_2(\rho_0,\rho_1)^2 = \min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho(x,t)\,\|v(x,t)\|^2\,dt\,dx$, under the set of non-linear constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = 0,\ v(0,\cdot) = v(1,\cdot) = 0,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ Change of variable (v, ρ) → (m, ρ), with m = ρv: convex cost J and linear constraints C 14 / 50

Slide 53

Slide 53 text

Fluid mechanics formulation [Benamou-Brenier ’00] • Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t): $\rho(\cdot, t) = ((1-t)\,\mathrm{Id} + t\,T)_\# \rho_0$ • Non-convex problem over $\rho(x,t) \in \mathbb{R}$ and velocity field $v(x,t) \in \mathbb{R}^2$: $W_2(\rho_0,\rho_1)^2 = \min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho(x,t)\,\|v(x,t)\|^2\,dt\,dx$, under the set of non-linear constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = 0,\ v(0,\cdot) = v(1,\cdot) = 0,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ Change of variable (v, ρ) → (m, ρ), with m = ρv: convex cost J and linear constraints C No estimation of the transport map T, only the geodesic ρ(x, t) 14 / 50
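For reference, the change of variable to the momentum m = ρv can be written out as below. This is a sketch of the standard form of the reformulation, not a transcription of the slide: wherever ρ > 0, the kinetic energy ρ‖v‖² equals ‖m‖²/ρ, and the continuity equation becomes linear in (m, ρ). Boundary conditions on m are omitted here.

```latex
\[
J(m,\rho) = \frac{1}{2}\int_{[0,1]^2}\int_0^1 \frac{\|m(x,t)\|^2}{\rho(x,t)}\,dt\,dx,
\qquad
C = \bigl\{(m,\rho) : \partial_t \rho + \operatorname{div}_x m = 0,\
\rho(\cdot,0)=\rho_0,\ \rho(\cdot,1)=\rho_1 \bigr\}.
\]
```

The integrand (m, ρ) ↦ ‖m‖²/ρ is the perspective function of the squared norm, which is jointly convex on ρ > 0; this is the source of the convexity of J.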

Slide 54

Slide 54 text

Minimization The problem is $\min_{(m,\rho)} J(m,\rho) + \iota_C(m,\rho)$, where J and $\iota_C$ are non-smooth convex functions 15 / 50

Slide 55

Slide 55 text

Minimization The problem is $\min_{(m,\rho)} J(m,\rho) + \iota_C(m,\rho)$, where J and $\iota_C$ are non-smooth convex functions [P., Peyré and Oudet, 2014] • Staggered grid discretization • Optimization with proximal splitting algorithms: - ADMM/Douglas-Rachford [Lions and Mercier ’79, Combettes and Pesquet ’07] - Generalized Forward-Backward [Raguet et al. ’13] - Primal-Dual [Chambolle and Pock ’11] • Generalized costs 15 / 50

Slide 56

Slide 56 text

Illustration: linear transport 16 / 50

Slide 57

Slide 57 text

Comparison of optimization algorithms [Plots versus iterations k (logarithmic axes) for ADMM, DR and PD: transport cost J of the iterates; minimum value of the density iterate ρ] Convergence of transport cost is fast... 17 / 50

Slide 58

Slide 58 text

Convergence speed [Plots: distance of the iterates to the reference solution, for ρ and for m] ... but convergence of iterates is slow ((ρ∗, m∗) is the reference solution) 18 / 50

Slide 59

Slide 59 text

Example: vanishing densities 19 / 50

Slide 60

Slide 60 text

Example: vanishing densities Limitation: Dealing with non convex domains 19 / 50

Slide 63

Slide 63 text

Overview • Dynamic optimal transport • Generalization of the transport cost • Optimal transport with physical priors 19 / 50

Slide 64

Slide 64 text

Generalization of the transport cost Functional definition • Transport cost function: $\min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho(x,t)\,\|v(x,t)\|^2\,dt\,dx$ • Set of constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = 0,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ 20 / 50

Slide 65

Slide 65 text

Generalization of the transport cost Functional definition • Generalization of the transport cost function: $\min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho^{\beta}(x,t)\,\|v(x,t)\|^2\,dt\,dx$ • Set of constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = 0,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ • β ∈ [0, 1]: from $H^{-1}$ to $L^2$-Wasserstein distances [Dolbeault et al. ’09, Cardaliaguet et al. ’12] 20 / 50

Slide 66

Slide 66 text

Generalization of the transport cost Functional definition • Generalization of the transport cost function: $\min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 w(x,t)\,\rho(x,t)\,\|v(x,t)\|^2\,dt\,dx$ • Set of constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = 0,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ • β ∈ [0, 1]: from $H^{-1}$ to $L^2$-Wasserstein distances [Dolbeault et al. ’09, Cardaliaguet et al. ’12] • Riemannian manifold with $0 < w(x,t) = w(x) \le +\infty$ (existence and uniqueness [McCann ’01]): deal with obstacles 20 / 50

Slide 67

Slide 67 text

Generalization of the transport cost Functional definition • Generalization of the transport cost function: $\min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho(x,t)\,\|A(x,t)\,v(x,t)\|^2\,dt\,dx$ • Set of constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = 0,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ • β ∈ [0, 1]: from $H^{-1}$ to $L^2$-Wasserstein distances [Dolbeault et al. ’09, Cardaliaguet et al. ’12] • Riemannian manifold with $0 < w(x,t) = w(x) \le +\infty$ (existence and uniqueness [McCann ’01]): deal with obstacles • Anisotropic transport [Hug et al. ’15] 20 / 50

Slide 68

Slide 68 text

Generalization of the transport cost Functional definition • Generalization of the transport cost function: $\min_{(v,\rho) \in C_v} \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho(x,t)\,\|v(x,t)\|^2\,dt\,dx + \frac{1}{2} \int_{[0,1]^2} \int_0^1 \rho(x,t)\,\|f(x,t)\|^2\,dt\,dx$ • Set of constraints $C_v = \{(v,\rho) : \partial_t \rho + \operatorname{div}_x(\rho v) = f,\ \rho(\cdot,0) = \rho_0,\ \rho(\cdot,1) = \rho_1\}$ • β ∈ [0, 1]: from $H^{-1}$ to $L^2$-Wasserstein distances [Dolbeault et al. ’09, Cardaliaguet et al. ’12] • Riemannian manifold with $0 < w(x,t) = w(x) \le +\infty$ (existence and uniqueness [McCann ’01]): deal with obstacles • Anisotropic transport [Hug et al. ’15] • Unbalanced Transport [Chizat et al. ’18] 20 / 50

Slide 69

Slide 69 text

Generalization of the transport cost Example: linear transport with $\|m\|^2/\rho^{\beta}$ β = 0 β = 0.5 β = 0.25 β = 0.75 21 / 50

Slide 70

Slide 70 text

Anisotropic Optimal Transport Illustration: • Rotations: OT (A=Id) A1 A2 A3 22 / 50

Slide 71

Slide 71 text

Generalization of the transport cost Synthetic oceanography application 2D OT 23 / 50

Slide 72

Slide 72 text

Generalization of the transport cost Synthetic oceanography application 2D OT in a complex domain, with w(x, t) = w(x) ∈ {1; +∞} 24 / 50

Slide 73

Slide 73 text

Generalization of the transport cost Example: labyrinth 1 w(x, t) = w(x) ∈ {1; +∞} 25 / 50

Slide 74

Slide 74 text

Generalization of the transport cost Example: labyrinth 1 Piecewise straight lines 25 / 50

Slide 75

Slide 75 text

Generalization of the transport cost Example: labyrinth 2 Moving walls w(x, t) ∈ {1; +∞} 26 / 50

Slide 76

Slide 76 text

Generalization of the transport cost Example: labyrinth 2 Piecewise constant speed 26 / 50

Slide 77

Slide 77 text

Generalization of the transport cost Example: labyrinth 3 w(x, t) = w(x) ∈ {1; +∞} 27 / 50

Slide 78

Slide 78 text

Generalization of the transport cost Example: labyrinth 3 w(x, t) = w(x) ∈ {1; +∞} Deal with obstacles Preservation of structures 27 / 50

Slide 79

Slide 79 text

Overview • Dynamic optimal transport • Generalization of the transport cost • Optimal transport with physical priors 27 / 50

Slide 80

Slide 80 text

Optimal transport with physical priors Problems • Optimal mass transfer follows straight lines • Can we include other physical priors on the transport ? 28 / 50

Slide 81

Slide 81 text

Optimal transport with physical priors Problems • Optimal mass transfer follows straight lines • Can we include other physical priors on the transport? Reintroduction of the velocity v [Hug et al. ’15, Maas et al. ’15] • Coupling with a smooth and non-convex penalization: $K(m,\rho,v) = \frac{1}{2} \int_{[0,1]^2} \int_0^1 \|m - \rho v\|^2\,dt\,dx$ • Regularity priors R(v): incompressibility (div(v) = 0), rigidity... 28 / 50

Slide 82

Slide 82 text

Optimal transport with physical priors Problems • Optimal mass transfer follows straight lines • Can we include other physical priors on the transport? Reintroduction of the velocity v [Hug et al. ’15, Maas et al. ’15] • Coupling with a smooth and non-convex penalization: $K(m,\rho,v) = \frac{1}{2} \int_{[0,1]^2} \int_0^1 \|m - \rho v\|^2\,dt\,dx$ • Regularity priors R(v): incompressibility (div(v) = 0), rigidity... • Non-convex model $F(m,\rho,v) = J(m,\rho) + \iota_C(m,\rho) + \lambda K(m,\rho,v) + \alpha R(v)$ ⇒ Block coordinate descent [Tseng ’01, Ochs et al. ’14] 28 / 50
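The block coordinate descent on F alternates between the pair (m, ρ) and the velocity v. A schematic form is given below, assuming exact minimization over each block; the cited methods may use proximal or inexact steps instead, and the iteration counter k is notation introduced for this sketch.

```latex
\begin{align*}
(m^{k+1}, \rho^{k+1}) &\in \operatorname*{argmin}_{(m,\rho)} \; J(m,\rho) + \iota_C(m,\rho) + \lambda K(m,\rho,v^{k}), \\
v^{k+1} &\in \operatorname*{argmin}_{v} \; \lambda K(m^{k+1},\rho^{k+1},v) + \alpha R(v).
\end{align*}
```

For fixed v, the residual m − ρv is linear in (m, ρ), so K is a convex quadratic in that block; for fixed (m, ρ) it is a convex quadratic in v. Each subproblem is therefore convex whenever R is convex, even though F is not jointly convex.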

Slide 83

Slide 83 text

Optimal transport with physical priors Illustration OT Incompressible Rigid 29 / 50

Slide 84

Slide 84 text

Optimal transport with physical priors Oceanography application 2D OT in a complex domain with divergence-free penalization 30 / 50

Slide 85

Slide 85 text

Conclusion Proximal splitting methods • Solving dynamical OT problem • Adding constraints and generalizing cost function 31 / 50

Slide 86

Slide 86 text

Conclusion Proximal splitting methods • Solving dynamical OT problem • Adding constraints and generalizing cost function Extension of Dynamic OT • Discrete surfaces [Lavenant et al. ’18] • Sphere [Lang and P. ’19?] 31 / 50

Slide 87

Slide 87 text

Conclusion Proximal splitting methods • Solving dynamical OT problem • Adding constraints and generalizing cost function Extension of Dynamic OT • Discrete surfaces [Lavenant et al. ’18] • Sphere [Lang and P. ’19?] Open problems • Study the existence of solution for non static domains • Modeling of data occlusions (clouds) for data assimilation in oceanography 31 / 50

Slide 88

Slide 88 text

Part II Discrete Optimal Transport 32 / 50

Slide 89

Slide 89 text

Context General color transfer problem • Manipulate some statistical features of an image (color, luminance, texture, etc) • Preserve the other characteristics (geometry, edges, contrast, etc) and avoid artifacts 33 / 50

Slide 90

Slide 90 text

Examples of color transfer applications Color Harmonization (before) u1 u2 34 / 50

Slide 91

Slide 91 text

Examples of color transfer applications Color Harmonization (after) T(u1) u2 34 / 50

Slide 92

Slide 92 text

Examples of color transfer applications Color Harmonization (after) T(u1) u2 3D Reconstruction [P., Provenzi and Caselles ’11] 34 / 50

Slide 93

Slide 93 text

Examples of color transfer applications • Transfer style from movies Amélie Poulain Transformers Result Sources: [Bonneel et al. ’13] • Prior on foreground/background segmentation [Frigo et al. ’15] 35 / 50

Slide 94

Slide 94 text

Examples of color transfer applications Attracting students 36 / 50

Slide 95

Slide 95 text

Examples of color transfer applications Attracting students Place near a random university 36 / 50

Slide 96

Slide 96 text

Examples of color transfer applications Attracting students Place near Color Palette a random university 36 / 50

Slide 97

Slide 97 text

Examples of color transfer applications Attracting students Place near Color Palette Warm place a random university near the sea 36 / 50

Slide 98

Slide 98 text

Examples of color transfer applications Attracting students Place near Color Palette Warm place a random university near the sea Advertisement: Erasmus Mundus Master Program IPCV (Bordeaux, Budapest, Madrid) 36 / 50

Slide 99

Slide 99 text

Unsupervised color transfer u v Tu→v (u) Same color mean 37 / 50

Slide 100

Slide 100 text

Unsupervised color transfer u v Tu→v (u) Same color mean Parametric methods: transfer of color statistics [Reinhard ’01, Tai et al. ’05] 37 / 50

Slide 101

Slide 101 text

Unsupervised color transfer u v Tu→v (u) Same color histogram Optimal transport: transfer of color palette [Pitié et al. ’07, Rabin et al. ’11] 37 / 50

Slide 102

Slide 102 text

Monge-Kantorovich problem for d ≥ 1 • Histograms: $\mu = \sum_{i=1}^{M} \mu_i \delta_{X_i}$ and $\nu = \sum_{j=1}^{N} \nu_j \delta_{Y_j}$, with $X_i, Y_j \in \mathbb{R}^d$ • Wasserstein distance $W_2(\mu,\nu)^2 = \min_{P \in \mathcal{P}_{\mu,\nu}} \{\langle P, C \rangle = \sum_{i,j} P_{i,j} C_{i,j}\}$, with $\mathcal{P}_{\mu,\nu} = \{P \in \mathbb{R}^{M \times N} : P_{i,j} \ge 0,\ \sum_{i,j} P_{i,j} = 1,\ \sum_j P_{i,j} = \mu_i,\ \sum_i P_{i,j} = \nu_j\}$ • $P_{i,j}$ is the mass transported from µi to νj • Cost matrix between locations Xi and Yj: $C_{i,j} = \sum_{k=1}^{d} \|X_i^k - Y_j^k\|^2$ Does not depend on feature dimension d Limited to low dimensions M and N 38 / 50

Slide 104

Slide 104 text

Solving the discrete problem • Linear programming: simplex, interior points • Assignment problems: Hungarian, Auction • Acceleration for L1 costs [Ling and Okada ’07, Pele and Werman ’08] • Sliced Wasserstein [Rabin et al. ’11] • Multiscale [Oberman and Ruan ’15] • Fast approximation: Sinkhorn [Cuturi ’13] • Shortcuts L2 [Schmitzer ’15] 39 / 50
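Of the solvers listed, the Sinkhorn approximation is the shortest to write down. The sketch below is a plain-Python illustration of entropic regularization with alternating scalings, assuming histograms that sum to 1 and a cost matrix given as a list of rows. The function name `sinkhorn` and the defaults `eps` and `n_iter` are choices made for this example and do not come from the talk.

```python
import math


def sinkhorn(mu, nu, C, eps=0.1, n_iter=500):
    """Entropic approximation of the discrete OT coupling.

    mu, nu: histograms (lists summing to 1); C: cost matrix as a list of rows.
    eps (regularization strength) and n_iter are illustrative defaults.
    """
    M, N = len(mu), len(nu)
    # Gibbs kernel K = exp(-C / eps).
    K = [[math.exp(-C[i][j] / eps) for j in range(N)] for i in range(M)]
    u, v = [1.0] * M, [1.0] * N
    for _ in range(n_iter):
        # Alternate scalings: match the row sums to mu, then the column sums to nu.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(N)) for i in range(M)]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(M)) for j in range(N)]
    # Coupling P = diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(N)] for i in range(M)]


if __name__ == "__main__":
    P = sinkhorn([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
    print(P)  # mass concentrates on the zero-cost diagonal
```

Smaller values of `eps` bring the coupling closer to the unregularized optimum but make the kernel entries underflow sooner; log-domain variants are the usual remedy and are not shown here.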

Slide 105

Slide 105 text

Limitations of Optimal Transport • Exact transfer of color proportions • Irregularity of transport map (spatial / color) • Dimension: limited to low dimensional images Solutions • Relaxation of mass preservation constraint • Estimation of regularized transport maps (color consistency) • Pixel clustering (spatial consistency) 40 / 50

Slide 107

Slide 107 text

Relaxation of mass conservation constraint • Transport of color histogram $\mu = \sum_{i=1}^{M} \mu_i \delta_{X_i}$ to ν; µi: proportion of color for bin Xi • Introduction of capacity variables κj • Joint estimation of P and κ: $\{P^\star, \kappa^\star\} \in \operatorname{argmin}_{P \in \mathcal{P}_\kappa(\mu,\nu),\ \kappa \in \mathbb{R}^N,\ \kappa \ge 0,\ \langle \kappa, \nu \rangle = 1} \langle P, C \rangle + \mu \|\kappa - 1\|_1$ • Relaxed constraints [Ferradans et al. ’14]: $\mathcal{P}_\kappa(\mu,\nu) = \{P : P_{i,j} \ge 0,\ \sum_{i,j} P_{i,j} = 1,\ \sum_j P_{i,j} = \mu_i,\ \sum_i P_{i,j} = \kappa_j \nu_j\}$ Still a linear program 41 / 50

Slide 109

Slide 109 text

Relaxation of mass conservation constraint • Transport of color histogram $\mu = \sum_{i=1}^{M} \mu_i \delta_{X_i}$ to ν; µi: proportion of color for bin Xi • Introduction of capacity variables κj • Joint estimation of P and κ: $\{P^\star, \kappa^\star\} \in \operatorname{argmin}_{P \in \mathcal{P}_\kappa(\mu,\nu),\ \kappa \in \mathbb{R}^N,\ \kappa \ge 0,\ \langle \kappa, \nu \rangle = 1} \langle P, C \rangle + \mu \|\kappa - 1\|_1$ • Relaxed constraints [Ferradans et al. ’14]: $\mathcal{P}_\kappa(\mu,\nu) = \{P : P_{i,j} \ge 0,\ \sum_{i,j} P_{i,j} = 1,\ \sum_j P_{i,j} = \mu_i,\ \sum_i P_{i,j} = \kappa_j \nu_j\}$ Optimal transport Still a linear program 41 / 50

Slide 110

Slide 110 text

Relaxation of mass conservation constraint • Transport of color histogram $\mu = \sum_{i=1}^{M} \mu_i \delta_{X_i}$ to ν; µi: proportion of color for bin Xi • Introduction of capacity variables κj • Joint estimation of P and κ: $\{P^\star, \kappa^\star\} \in \operatorname{argmin}_{P \in \mathcal{P}_\kappa(\mu,\nu),\ \kappa \in \mathbb{R}^N,\ \kappa \ge 0,\ \langle \kappa, \nu \rangle = 1} \langle P, C \rangle + \mu \|\kappa - 1\|_1$ • Relaxed constraints [Ferradans et al. ’14]: $\mathcal{P}_\kappa(\mu,\nu) = \{P : P_{i,j} \ge 0,\ \sum_{i,j} P_{i,j} = 1,\ \sum_j P_{i,j} = \mu_i,\ \sum_i P_{i,j} = \kappa_j \nu_j\}$ Relaxed optimal transport Still a linear program 41 / 50

Slide 111

Slide 111 text

Illustration of relaxed model Target OT Relaxed OT Source Joint estimation of the proportion of color to transfer No spatial nor colorimetric regularisation 42 / 50

Slide 113

Slide 113 text

Regularity of transport map • Global regularization: NP-hard problem • Mean transport map [Ferradans et al. ’13, Seguy et al. ’17] $T_P(X_i) = \bar{Y}_i = \frac{1}{\sum_j P_{ij}} \sum_j P_{ij} Y_j = (D_\mu P Y)_i$ 43 / 50
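The mean transport map sends each source point to the average of the target points, weighted by the corresponding row of the coupling. A minimal sketch follows, assuming P is a list of M rows of length N and Y a list of N points in R^d; the name `mean_transport_map` and the choice to return None for rows without mass are illustrative assumptions.

```python
def mean_transport_map(P, Y):
    """Barycentric projection T_P(X_i) = sum_j P_ij Y_j / sum_j P_ij.

    P: coupling matrix (M rows, N columns); Y: list of N target points in R^d.
    Rows of P carrying no mass are mapped to None.
    """
    d = len(Y[0])
    mapped = []
    for row in P:
        mass = sum(row)  # equals mu_i when P satisfies the row marginal constraint
        if mass == 0:
            mapped.append(None)
            continue
        # Weighted average of the target points, coordinate by coordinate.
        mapped.append([sum(p * y[k] for p, y in zip(row, Y)) / mass
                       for k in range(d)])
    return mapped


if __name__ == "__main__":
    P = [[0.25, 0.25], [0.0, 0.5]]
    Y = [[0.0, 0.0], [2.0, 4.0]]
    print(mean_transport_map(P, Y))
```

Dividing by the row mass is the role played by the diagonal normalization written as D_µ in the formula.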

Slide 116

Slide 116 text

Regularity of transport map for color transfer
• Mean transport: $V_i = T_P(X_i) - X_i$
• Spatial consistency: graph of similarity $\omega_{ij}$ between $X_i$ and $X_j$ ⇒ close pixels with similar colors should be matched together
• Graph Laplacian of the mean transport field $V$, with $V_i \in \mathbb{R}^d$:
$$(\Delta V)_i := \sum_{j \in E_X(i)} \omega_{ij} \, (V_i - V_j)$$
• Color consistency: penalize color shifts to avoid artifacts
$$R(P) = \sum_i |\Delta V|_i$$
• Still a linear program
• Symmetric formulation
• Barycenter computation
[Figure: graph]
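The regulariser can be evaluated directly from a plan. The sketch below makes two assumptions that the slide does not state: the graph is given as a dense M×M weight matrix `W` with `W[i, j] = ω_ij` and zeros outside the neighbourhood E_X(i), and `|·|` is read as the ℓ1 norm over color channels, which keeps R(P) representable in a linear program.

```python
import numpy as np


def transport_regularity(P, X, Y, W):
    """R(P) = sum_i |(Delta V)_i| for the mean transport field V."""
    Ybar = (P @ Y) / P.sum(axis=1, keepdims=True)  # mean transport map
    V = Ybar - X                                   # displacement of each source color
    degree = W.sum(axis=1, keepdims=True)
    # sum_j w_ij (V_i - V_j) = degree_i * V_i - (W V)_i
    laplacian = degree * V - W @ V
    return float(np.abs(laplacian).sum())
```

A uniform color shift gives R(P) = 0, while neighbouring bins sent in different directions are penalised in proportion to their similarity weight.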

Slide 119

Slide 119 text

Regularity of transport map for color transfer
[Figure: relaxed OT result]

Slide 120

Slide 120 text

Regularity of transport map for color transfer
[Figure: relaxed and regularized OT result]

Slide 121

Slide 121 text

Color transfer algorithm
1. Superpixel clustering [Achanta et al. ’12, Giraud et al. ’18]
   [Figure panels: Image, Superpixels]
2. Graph built from superpixel similarities (spatial + color)
3. Estimation of relaxed and regularised transport map
4. Final synthesis at pixel scale
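Step 2 can be sketched as follows. The slide only says that the graph combines spatial and color similarities, so every detail here is an assumption made for illustration: a Gaussian affinity on mean superpixel color and centroid position, the bandwidths `sigma_c` and `sigma_s`, and keeping only the `k` strongest neighbours of each superpixel.

```python
import numpy as np


def similarity_graph(colors, positions, k=4, sigma_c=0.1, sigma_s=0.2):
    """Hypothetical k-nearest-neighbour graph with Gaussian color+space weights.

    colors:    (n, 3) mean color of each superpixel
    positions: (n, 2) centroid of each superpixel, in normalised coordinates
    Returns a dense (n, n) matrix W with W[i, j] = omega_ij (0 if not neighbours).
    """
    n = len(colors)
    dc = ((colors[:, None] - colors[None]) ** 2).sum(-1)        # squared color distances
    ds = ((positions[:, None] - positions[None]) ** 2).sum(-1)  # squared spatial distances
    affinity = np.exp(-dc / sigma_c**2 - ds / sigma_s**2)
    np.fill_diagonal(affinity, 0.0)  # no self-loops

    # Keep, for each node, only its k highest-affinity neighbours.
    nearest = np.argsort(-affinity, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), nearest.shape[1])
    W = np.zeros_like(affinity)
    W[rows, nearest.ravel()] = affinity[rows, nearest.ravel()]
    return W
```

The resulting matrix has the form a graph-Laplacian regulariser on the mean transport field expects: large weights between nearby superpixels of similar color, zero elsewhere.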

Slide 122

Slide 122 text

Color transfer
Images: Nicolas Le Dilhuit

Slide 124

Slide 124 text

Limitation
Creation of new drab colors:
• Interpolation $X_i \to \bar{Y}_i$
• Amplified with explicit regularisation
Implicit regularisation of the transport matrix elements does not help:
• Entropic regularisation [Cuturi ’13]
• Convex/sparse regularisation [Blondel et al. ’17, Dessein et al. ’19]
Solution: deal with the color dispersion
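The effect of entropic smoothing on the mean colors can be reproduced on a toy example. This is an illustrative sketch rather than an experiment from the talk: it uses plain Sinkhorn iterations for the entropic problem of [Cuturi ’13], and the parameter names `eps` and `n_iter` and the two-color test data are assumptions.

```python
import numpy as np


def sinkhorn(mu, nu, C, eps, n_iter=500):
    """Entropy-regularised transport plan via Sinkhorn scaling iterations."""
    K = np.exp(-C / eps)
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)
        u = mu / (K @ v)
    return u[:, None] * K * v[None, :]


def mean_color_spread(eps):
    """Distance between the two mean transported colors for a given eps."""
    X = np.array([[0.9, 0.1, 0.1], [0.1, 0.1, 0.9]])  # reddish and bluish sources
    Y = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # pure red and pure blue targets
    mu = nu = np.array([0.5, 0.5])
    C = ((X[:, None] - Y[None]) ** 2).sum(-1)
    P = sinkhorn(mu, nu, C, eps)
    Ybar = (P @ Y) / P.sum(axis=1, keepdims=True)
    return float(np.linalg.norm(Ybar[0] - Ybar[1]))
```

For a small `eps` the two mean colors stay close to pure red and pure blue; for a large `eps` the plan becomes dense and both means drift toward the same purple, which is the drab-color effect described above.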

Slide 127

Slide 127 text

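Color dispersion
• Measure the variance of the transferred color [Rabin and P. ’15], reading the bar as the $P$-weighted mean over $j$, as in $\bar{Y}_i$:
$$\mathrm{Var}(Y)_i := \overline{\|Y - \bar{Y}_i\|^2}_i = \frac{1}{\sum_j P_{ij}} \sum_j P_{ij} \|Y_j - \bar{Y}_i\|^2$$
• Minimisation of a concave term of mass dispersion: $\alpha \sum_i \mu_i \mathrm{Var}(Y)_i$ ⇒ associate a single color to $X_i$
• Non-smooth and non-convex problem: Forward-Backward [Attouch et al. ’13, Ochs et al. ’14], DC programming [Tao ’05]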
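The dispersion term can be computed from a plan in a few lines. This sketch assumes the variance is the P-weighted mean of squared deviations from the mean transported color, and that `P` has strictly positive row sums; the function name and return layout are illustrative.

```python
import numpy as np


def color_dispersion(P, Y, alpha=1.0):
    """Return (Var(Y)_i for each source bin, alpha * sum_i mu_i Var(Y)_i)."""
    mu = P.sum(axis=1)
    Ybar = (P @ Y) / mu[:, None]
    # sq[i, j] = ||Y_j - Ybar_i||^2
    sq = ((Y[None, :, :] - Ybar[:, None, :]) ** 2).sum(axis=2)
    var = (P * sq).sum(axis=1) / mu
    return var, float(alpha * (mu @ var))
```

A bin that sends all its mass to one target color has zero variance, so the penalty vanishes exactly when every source color is associated with a single target color.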

Slide 129

Slide 129 text

Influence of color dispersion with parameter $\alpha$
[Figure panels: Input, result with $\alpha = 0$]
Example with high regularisation parameter

Slide 130

Slide 130 text

Influence of color dispersion with parameter $\alpha$
[Figure panels: Input, result with $\alpha = 10$]
Example with high regularisation parameter

Slide 131

Slide 131 text

Influence of color dispersion with parameter $\alpha$
[Figure panels: Input, result with $\alpha = 100$]
Example with high regularisation parameter

Slide 132

Slide 132 text

Conclusion
For image processing applications:
• Relaxation of the mass conservation constraint is necessary
• Spatial regularization of the transport map deals with artifacts
• Non-convex models prevent the creation of new dull colors
To be fair:
• Color transfer with optimal transport is currently time consuming (1 minute for an HD image)
• Semi-automatic methods (high level segmentation, semantic analysis, simple optimal transport) [Bonneel et al. ’13, Frigo et al. ’14] give fast and accurate color transfer results, even for videos
But:
• Enhancing the OT framework will improve semi-automatic methods
• Dealing with artifacts allows defining robust dissimilarity measures