Optimal Transport for Image Assimilation
Nicolas Papadakis
Workshop on AI for Ocean, Atmosphere and Climate Dynamics
January 20th 2020
Optimal Transport for Image Assimilation 1/38

1 Introduction
2 Image Assimilation
3 Wasserstein distance
4 Wasserstein Image Assimilation
5 Conclusion
Optimal Transport for Image Assimilation Introduction 2/38

How to use images for ocean state prediction?
Optimal Transport for Image Assimilation Introduction 3/38

Assimilation: how to predict the state of a dynamical
system?
Meteorology
Glaciology Health sciences
Optimal Transport for Image Assimilation Introduction 4/38

Heat diffusion
Physical variable: u(x, t) room temperature Ω at time t
Initial conditions
Homogeneous temperature u(., 0) = 15◦
Add an Heater at xH
: u(xH, 0) = 20◦
Optimal Transport for Image Assimilation Introduction 5/38

Heat diffusion
Physical variable: u(x, t) room temperature Ω at time t
Initial conditions
Homogeneous temperature u(., 0) = 15◦
Add an Heater at xH
: u(xH, 0) = 20◦
Optimal Transport for Image Assimilation Introduction 5/38

Heat diffusion
Dynamical model: heat equation during 3 hours
∂t u(x, t) = ∆u(x, t) x ∈ Ω\xh
u(xH, t) = 20◦
∂u
∂n
(x) = 0 x ∈ ∂Ω
Optimal Transport for Image Assimilation Introduction 5/38

Heat diffusion
Dynamical model: heat equation during 3 hours
∂t u(x, t) = ∆u(x, t) x ∈ Ω\xh
u(xH, t) = 20◦
∂u
∂n
(x) = 0 x ∈ ∂Ω
Optimal Transport for Image Assimilation Introduction 5/38

Heat diffusion
Dynamical model: heat equation during 3 hours
∂t u(x, t) = ∆u(x, t) x ∈ Ω\xh
u(xH, t) = 20◦
∂u
∂n
(x) = 0 x ∈ ∂Ω
Observations: Thermometer measure at θ: 17◦ = u(xθ, 3)
Optimal Transport for Image Assimilation Introduction 5/38

Heat diffusion
Dynamical model: heat equation during 3 hours
∂t u(x, t) = ∆u(x, t) x ∈ Ω\xh
u(xH, t) = 20◦
∂u
∂n
(x) = g(x) x ∈ ∂Ω
Observations: Thermometer measure at θ: 17◦ = u(xθ, 3)
Uncertainties on boundary conditions g (ventilation...)
Optimal Transport for Image Assimilation Introduction 5/38

Heat diffusion
Dynamical model: heat equation during 3 hours
∂t u(x, t) = ∆u(x, t) x ∈ Ω\xh
u(xH, t) = 20◦
∂u
∂n
(x) = g(x) x ∈ ∂Ω
Observations: Thermometer measure at θ: 17◦ = u(xθ, 3)
Uncertainties on boundary conditions g (ventilation...)
Optimal Transport for Image Assimilation Introduction 5/38

Heat diffusion
Dynamical model: heat equation during 3 hours
∂t u(x, t) = ∆u(x, t) x ∈ Ω\xh
u(xH, t) = 20◦
∂u
∂n
(x) = g(x) x ∈ ∂Ω
Observations: Thermometer measure at θ: 17◦ = u(xθ, 3)
Uncertainties on boundary conditions g (ventilation...)
Correction of boundary conditions g
- Learning of g
- Better knowledge of x
- More reliable predictions
Optimal Transport for Image Assimilation Introduction 5/38

Oceanography
Variables: velocity, surface height, temperature, salinity,
phytoplankton...
Numerical model of ocean dynamic
Non homogeneous spatial interactions
Complex boundary conditions
- Littoral
- Seabed
- Atmosphere
Observations
- in situ: Drifting buoys/ships...
- Bathymetry
- Satellite data
Optimal Transport for Image Assimilation Introduction 6/38

Oceanography
Variables: velocity, surface height, temperature, salinity,
phytoplankton...
Numerical model of ocean dynamic
Non homogeneous spatial interactions
Complex boundary conditions
- Littoral
- Seabed
- Atmosphere
Observations
- in situ: Drifting buoys/ships...
- Bathymetry
- Satellite data
Optimal Transport for Image Assimilation Introduction 6/38

Oceanography
Variables: velocity, surface height, temperature, salinity,
phytoplankton...
Numerical model of ocean dynamic
Non homogeneous spatial interactions
Complex boundary conditions
- Littoral
- Seabed
- Atmosphere
Observations
- in situ: Drifting buoys/ships...
- Bathymetry
- Satellite data
Optimal Transport for Image Assimilation Introduction 6/38

Satellite observations of ocean surface
Information spatially structured (fronts, vortex...)
Altimetry Temperature Chlorophyll
Temporal deformations (occlusions)
SST image Temporal variation
Source: [Béréziat and Herlin ‘11]
Optimal Transport for Image Assimilation Introduction 7/38

Satellite observations of ocean surface
Information spatially structured (fronts, vortex...)
Altimetry Temperature Chlorophyll
Temporal deformations (occlusions)
SST image Temporal variation
Source: [Béréziat and Herlin ‘11]
Optimal Transport for Image Assimilation Introduction 7/38

Satellite observations
Jason
Resolution ≈ 10km
(CNES/NASA, 2001-)
SWOT
High resolution ≈ 1km
(CNES/NASA, 2021-)
Optimal Transport for Image Assimilation Introduction 8/38

Satellite observations
Jason
Resolution ≈ 10km
(CNES/NASA, 2001-)
SWOT
High resolution ≈ 1km
(CNES/NASA, 2021-)
Optimal Transport for Image Assimilation Introduction 8/38

Ocean images
Optimal Transport for Image Assimilation Introduction 9/38

Prediction with data assimilation
Constrained minimization problem
min
∂t u=M(u) T
||y(t) − H(u(t))||2
R
dt + ||u(t0) − f||2
B
Physical variable u: velocity, temperature, pressure...
Dynamical model ∂t u = M(u): temporal evolution
Background information at t0
: u(t0) ≈ f
Observations y: radar, stations, balloons...
Observation operator H
Choice of a norm ||.||
Modeling of error covariance matrices on the background B and
the observations R
Optimal Transport for Image Assimilation Introduction 10/38

Assimilation of ocean surface images
Satellite images y
Assimilation
Loss function:
||y − H(u)||2
R
Challenges
Choice of the norm to compare the
relevant information contained in images
Optimal Transport for Image Assimilation Introduction 11/38

Assimilation of ocean surface images
Satellite images y
Assimilation
Loss function:
||y − H(u)||2
R
Challenges
Observed quantities y not in the state u :
Observation operator H?
Choice of the norm to compare the
relevant information contained in images
Optimal Transport for Image Assimilation Introduction 11/38

Assimilation of ocean surface images
Satellite images y
Assimilation
Loss function:
||y − H(u)||2
R
Challenges
Observed quantities y not in the state u :
Observation operator H?
Error modeling R: KaRIN noise, satellite
roll, atmospheric conditions...
Denoised image y
[Gómez-Navarro, Cosme, Le Sommer et al. ‘18- ]
Choice of the norm to compare the
relevant information contained in images
Optimal Transport for Image Assimilation Introduction 11/38

Assimilation of ocean surface images
Satellite images y
Assimilation
Loss function:
||y − H(u)||2
R
Challenges
Observed quantities y not in the state u :
Observation operator H?
Error modeling R: KaRIN noise, satellite
roll, atmospheric conditions...
Denoised image y
[Gómez-Navarro, Cosme, Le Sommer et al. ‘18- ]
Choice of the norm to compare the
relevant information contained in images
Optimal Transport for Image Assimilation Introduction 11/38

Assimilation of ocean surface images
Satellite images y
Assimilation
Loss function:
||y − H(u)||2
R
Challenges
Observed quantities y not in the state u :
Observation operator H?
Error modeling R: KaRIN noise, satellite
roll, atmospheric conditions...
Denoised image y
[Gómez-Navarro, Cosme, Le Sommer et al. ‘18- ]
Choice of the norm to compare the
relevant information contained in images
Optimal Transport for Image Assimilation Introduction 11/38

1 Introduction
2 Image Assimilation
3 Wasserstein distance
4 Wasserstein Image Assimilation
5 Conclusion
Optimal Transport for Image Assimilation Image Assimilation 12/38

Image assimilation for oceanography
Short review :
Assimilation of structured information contained in images with
standard Euclidean norm
Main contributors:
groups of R. Fablet, I. Herlin, F.-X. Le Dimet, É. Mémin, J. Verron...
Optimal Transport for Image Assimilation Image Assimilation 13/38

Image assimilation
in situ conﬁguration
- Image I: observation of a subset of the state variable u
- Occlusion mask M, observation operator H
- Pixelwise assimilation:
||M(I − Hu)||2
Example [Corpetti et al. ‘09]
- Estimate the velocity ﬁelds of a multilayer shallow-water model
- Assimilation of pressure images
The structured information contained in images is not used
Optimal Transport for Image Assimilation Image Assimilation 14/38

Image assimilation
in situ conﬁguration
- Image I: observation of a subset of the state variable u
- Occlusion mask M, observation operator H
- Pixelwise assimilation:
||M(I − Hu)||2
Example [Corpetti et al. ‘09]
- Estimate the velocity ﬁelds of a multilayer shallow-water model
- Assimilation of pressure images
The structured information contained in images is not used
Optimal Transport for Image Assimilation Image Assimilation 14/38

Image assimilation
in situ conﬁguration
- Image I: observation of a subset of the state variable u
- Occlusion mask M, observation operator H
- Pixelwise assimilation:
||M(I − Hu)||2
Example [Corpetti et al. ‘09]
- Estimate the velocity ﬁelds of a multilayer shallow-water model
- Assimilation of pressure images
The structured information contained in images is not used
Optimal Transport for Image Assimilation Image Assimilation 14/38

How to assimilate the structures contained in images?
Indirect assimilation: use image processing techniques to produce
pseudo-observations
Direct comparison of images and state variables in a dedicated
space
Optimal Transport for Image Assimilation Image Assimilation 15/38

Indirect assimilation
1 Extract information y from image observations I
2 Assimilation of pseudo-observations y:
||H(u) − y||2
Example: fronts [Gaultier et al. ‘13]
(a) SST I1 (b) SSS I2 (c) SPICE u
(d) gradient SST y1 (e) gradient SSS y2 (f) FLSE H(u)
- I: SST or SSS image
- y: thresholded image
gradients
- H(u): Binarized
Finite-Size Lyapunov
Exponents (FLSE)
Optimal Transport for Image Assimilation Image Assimilation 16/38

Indirect assimilation
1 Extract information y from image observations I
2 Assimilation of pseudo-observations y:
||H(u) − y||2
Example: fronts [Gaultier et al. ‘13]
(a) SST I1 (b) SSS I2 (c) SPICE u
(d) gradient SST y1 (e) gradient SSS y2 (f) FLSE H(u)
- I: SST or SSS image
- y: thresholded image
gradients
- H(u): Binarized
Finite-Size Lyapunov
Exponents (FLSE)
Optimal Transport for Image Assimilation Image Assimilation 16/38

Direct assimilation
Deﬁne a coupling between image I and state variable u:
||H(u, I)||2
Optimal Transport for Image Assimilation Image Assimilation 17/38

Direct assimilation
Deﬁne a coupling between image I and state variable u:
||H(u, I)||2
Temporal information [P. and Mémin ‘07, Béréziat and Herlin ‘09]
- Layer velocity u(x, t), image sequence I(x, t)
- Coupling with optical ﬂow equation:
||I(x + u(x, t), t + 1) − I(x, t)||2
I(x, t) I(x, t + 1) u(x, t)
Optimal Transport for Image Assimilation Image Assimilation 17/38

Direct assimilation
Deﬁne a coupling between image I and state variable u:
||H(u, I)||2
Temporal information [P. and Mémin ‘07, Béréziat and Herlin ‘09]
- Layer velocity u(x, t), image sequence I(x, t)
- Coupling with optical ﬂow equation:
||I(x + u(x, t), t + 1) − I(x, t)||2
I(x, t) I(x, t + 1) u(x, t)
Optimal Transport for Image Assimilation Image Assimilation 17/38

Direct assimilation
Deﬁne a coupling between image I and state variable u:
||H(u, I)||2
Temporal information [P. and Mémin ‘07, Béréziat and Herlin ‘09]
- Layer velocity u(x, t), image sequence I(x, t)
- Coupling with optical ﬂow equation:
||I(x + u(x, t), t + 1) − I(x, t)||2
I(x, t) I(x, t + 1) u(x, t)
Not a norm on I or u : hard to model observation errors in a
covariance matrix
Optimal Transport for Image Assimilation Image Assimilation 17/38

Direct assimilation
Deﬁne a coupling between image I and state variable u:
||H(u, I)||2
Multiscale decomposition [Titaud et al. ‘10]
- Additional passive state variable u
- Project u and I in a structured space S (Wavelets, Curvelets...):
||PS
(u) − PS
(I)||2
Threshold coefﬁcients: dimension reduction, robustness to noise
Optimal Transport for Image Assimilation Image Assimilation 17/38

Direct assimilation
Deﬁne a coupling between image I and state variable u:
||H(u, I)||2
Multiscale decomposition [Titaud et al. ‘10]
- Additional passive state variable u
- Project u and I in a structured space S (Wavelets, Curvelets...):
||PS
(u) − PS
(I)||2
Threshold coefﬁcients: dimension reduction, robustness to noise
For observation noise correlated in space: diagonal error covariance
matrix in wavelet space [Chabot et al. ‘15]
Optimal Transport for Image Assimilation Image Assimilation 17/38

Norm for assimilation
Type of compromise between prediction p and observation y:
min
u
||u − p||2
B
+ ||u − y||2
R
Standard assimilation: Euclidean distance weighted by a positive
deﬁnite matrix A:
||x||2
A
= x, Ax
For a conﬁdence of degree α ∈ [0; 1], B = αId and R = (1 − α)Id:
min
u
α||u − p||2 + (1 − α)||u − y||2
Solution: linear interpolation
u∗ = αp + (1 − α)y
- Prediction p = 18◦, observation y = 17◦
- Assimilation is expected to produce something within [17◦, 18◦]
Optimal Transport for Image Assimilation Image Assimilation 18/38

Norm for assimilation
Type of compromise between prediction p and observation y:
min
u
||u − p||2
B
+ ||u − y||2
R
Standard assimilation: Euclidean distance weighted by a positive
deﬁnite matrix A:
||x||2
A
= x, Ax
For a conﬁdence of degree α ∈ [0; 1], B = αId and R = (1 − α)Id:
min
u
α||u − p||2 + (1 − α)||u − y||2
Solution: linear interpolation
u∗ = αp + (1 − α)y
- Prediction p = 18◦, observation y = 17◦
- Assimilation is expected to produce something within [17◦, 18◦]
Optimal Transport for Image Assimilation Image Assimilation 18/38

Norm for assimilation
Type of compromise between prediction p and observation y:
min
u
||u − p||2
B
+ ||u − y||2
R
Standard assimilation: Euclidean distance weighted by a positive
deﬁnite matrix A:
||x||2
A
= x, Ax
For a conﬁdence of degree α ∈ [0; 1], B = αId and R = (1 − α)Id:
min
u
α||u − p||2 + (1 − α)||u − y||2
Solution: linear interpolation
u∗ = αp + (1 − α)y
- Prediction p = 18◦, observation y = 17◦
- Assimilation is expected to produce something within [17◦, 18◦]
Optimal Transport for Image Assimilation Image Assimilation 18/38

Euclidean distance
Data deﬁned on a 1D domain
Prediction Observation Euclidean
Interpolation
Interpolation → computation of geodesic in Euclidean space
Not adapted to errors in the position of structures
Optimal Transport for Image Assimilation Image Assimilation 19/38

Euclidean distance
Data deﬁned on a 1D domain
Prediction Observation Euclidean
Interpolation
Interpolation → computation of geodesic in Euclidean space
Not adapted to errors in the position of structures
Optimal Transport for Image Assimilation Image Assimilation 19/38

Euclidean distance
Data deﬁned on a 1D domain
Prediction Observation Euclidean
Interpolation
Interpolation → computation of geodesic in Euclidean space
Not adapted to errors in the position of structures
Optimal Transport for Image Assimilation Image Assimilation 19/38

Wasserstein distance
Data deﬁned on a 1D domain
Prediction Observation Wasserstein
Interpolation
Interpolation → computation of geodesic in Wasserstein space
Adapted to errors in the position of structures
Optimal Transport for Image Assimilation Image Assimilation 19/38

Wasserstein in 2D
Sea Surface Height
creation of vortexes in Cap Point
(output of NEMO model)
How to deal with the coast?
Optimal Transport for Image Assimilation Image Assimilation 20/38

Wasserstein in 2D
Sea Surface Height
creation of vortexes in Cap Point
(output of NEMO model)
How to deal with the coast?
Optimal Transport for Image Assimilation Image Assimilation 20/38

1 Introduction
2 Image Assimilation
3 Wasserstein distance
4 Wasserstein Image Assimilation
5 Conclusion
Optimal Transport for Image Assimilation Wasserstein distance 21/38

Optimal transport and Wasserstein distance
Transport T: application moving a density ρ0 onto another one ρ1
(same mass), deﬁned for x ∈ Ω as:
ρ1 = T#ρ0
Optimal transport: Application of minimal cost
Ω
c(x, T(x))dx
Wasserstein distance Wp : c(x, y) = ||x − y||p
Concave cost (Economy), Truncated cost (Computer Vision),
W1
(Deep Learning)
Optimal Transport for Image Assimilation Wasserstein distance 22/38

Optimal transport and Wasserstein distance
Transport T: application moving a density ρ0 onto another one ρ1
(same mass), deﬁned for x ∈ Ω as:
ρ1 = T#ρ0
Optimal transport: Application of minimal cost
Ω
c(x, T(x))dx
Wasserstein distance Wp : c(x, y) = ||x − y||p
Concave cost (Economy), Truncated cost (Computer Vision),
W1
(Deep Learning)
Optimal Transport for Image Assimilation Wasserstein distance 22/38

Optimal transport and Wasserstein distance
Transport T: application moving a density ρ0 onto another one ρ1
(same mass), deﬁned for x ∈ Ω as:
ρ1 = T#ρ0
Optimal transport: Application of minimal cost
Ω
c(x, T(x))dx
Wasserstein distance Wp : c(x, y) = ||x − y||p
Concave cost (Economy), Truncated cost (Computer Vision),
W1
(Deep Learning)
Optimal Transport for Image Assimilation Wasserstein distance 22/38

Applications in IP, CV, ML and DL
Robust dissimilarity measure (Optimal transport cost)
• Image retrieval (EMD) [Rubner et al. ’00]
• 3D shape recognitions [Ruzon and Tomasi, ’01]
• SIFT matching [Pele and Werman ’08]
• Object segmentation [Ni et al. ’09, Rabin et al. ’11, ’15],
• Denoising [Burger et al. ’12, Tartavel et al. ’16]
• Generative models [Arjovsky et al. ’17, Genevay et al. ’18]
Optimal Transport for Image Assimilation Wasserstein distance 23/38

Applications in IP, CV, ML and DL
Robust dissimilarity measure (Optimal transport cost)
• Image retrieval (EMD) [Rubner et al. ’00]
• 3D shape recognitions [Ruzon and Tomasi, ’01]
• SIFT matching [Pele and Werman ’08]
• Object segmentation [Ni et al. ’09, Rabin et al. ’11, ’15],
• Denoising [Burger et al. ’12, Tartavel et al. ’16]
• Generative models [Arjovsky et al. ’17, Genevay et al. ’18]
Why is it robust?
Discrete bin-to-bin metrics are not informative for disjoint supports
Optimal Transport for Image Assimilation Wasserstein distance 23/38

Applications in IP, CV, ML and DL
Robust dissimilarity measure (Optimal transport cost)
• Image retrieval (EMD) [Rubner et al. ’00]
• 3D shape recognitions [Ruzon and Tomasi, ’01]
• SIFT matching [Pele and Werman ’08]
• Object segmentation [Ni et al. ’09, Rabin et al. ’11, ’15],
• Denoising [Burger et al. ’12, Tartavel et al. ’16]
• Generative models [Arjovsky et al. ’17, Genevay et al. ’18]
Why is it robust?
Discrete bin-to-bin metrics are not informative for disjoint supports
Transport map T explains
how far are the distributions
Optimal Transport for Image Assimilation Wasserstein distance 23/38

Optimal Transport Map
• The Monge transport map:
• Interpolate between densities, compute barycenters or geodesics
in the Wasserstein space
ρ0 ρ1
Optimal Transport for Image Assimilation Wasserstein distance 24/38

Applications in IP, CV, ML and DL
Tool for matching/interpolation (Optimal transport map)
• Image interpolation, registration [Angenent et al. ’04]
Medical image registration [Rehman et al. ’09]
• Color transfer [Delon, ’04, Pitié et al. ’07, Bonneel et al. ‘11]
• Shape matching [Rabin et al. ’10, Schmitzer and Schnörr ’14]
• Texture synthesis [Ferradans et al. ’13, Leclaire et al. ’19]
• Geodesic PCA [Bigot et al. ‘13, Seguy et al. ‘15, Cazelles et al. ‘18]
• Domain adaptation [Courty et al. ’15, Redko et al. ’17]
• Generative models [Seguy et al. ’18]
Optimal Transport for Image Assimilation Wasserstein distance 25/38

Formulations
Continuous Semi-discrete Discrete
Optimal Transport for Image Assimilation Wasserstein distance 26/38

Formulations
Continuous
[J.-D. Benamou]
Semi-discrete Discrete
• Images: densities on support Ω
• Mass transport in a ﬂuid mechanics framework on Ω
Velocity ﬁeld T : Ω → Ω
Optimal Transport for Image Assimilation Wasserstein distance 26/38

Formulations
Continuous
Semi-discrete
[Q. Mérigot]
Discrete
Wasserstein for GAN
Optimal Transport for Image Assimilation Wasserstein distance 26/38

Formulations
Continuous
Semi-discrete
[Q. Mérigot]
Discrete
Wasserstein for GAN
Optimal Transport for Image Assimilation Wasserstein distance 26/38

Formulations
Continuous
Semi-discrete
[Q. Mérigot]
Discrete
Wasserstein for GAN
Optimal Transport for Image Assimilation Wasserstein distance 26/38

Formulations
Continuous Semi-discrete
Discrete
[M. Cuturi]
• Distributions of image features
• Transport between normalized histograms of size M and N
Coupling matrix P of size M × N
Optimal Transport for Image Assimilation Wasserstein distance 26/38

Formulations
Continuous Semi-discrete Discrete
Optimal Transport for Image Assimilation Wasserstein distance 26/38

Continuous Optimal Transport
Densities ρ0 and ρ1 deﬁned from x ∈ [0, 1]d to [0, 1]
Mass preserving transport map T:
T (ρ0, ρ1) := {T : [0, 1]d → [0, 1]d such that ρ1 = T ρ0}
T ∈ T (ρ0, ρ1) satisﬁes the gradient equation
ρ0(x) = ρ1(T(x))| det(∂T(x))|
An optimal transport T solves
min
T∈T (ρ0,ρ1)
C(x, T(x))ρ0(x) dx
where C(x, y) 0 is the cost of assigning x ∈ [0, 1]d to y ∈ [0, 1]d
For C(x, y) = ||x − y||p: distance Wp, T unique if p > 1
Optimal Transport for Image Assimilation Wasserstein distance 27/38

Estimation of optimal transport map
For p = 2,T = ∇ψ, with ψ convex [Brenier ’91] and optimal mass
transfer follows straight lines → Monge-Ampère equation:
det(D2ψ) =
ρ0(x)
ρ1(∇ψ(x))
[Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’16]
Optimal Transport for Image Assimilation Wasserstein distance 28/38

Estimation of optimal transport map
For p = 2,T = ∇ψ, with ψ convex [Brenier ’91] and optimal mass
transfer follows straight lines → Monge-Ampère equation:
det(D2ψ) =
ρ0(x)
ρ1(∇ψ(x))
[Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’16]
Fast algorithms (second order methods)
Optimal Transport for Image Assimilation Wasserstein distance 28/38

Estimation of optimal transport map
For p = 2,T = ∇ψ, with ψ convex [Brenier ’91] and optimal mass
transfer follows straight lines → Monge-Ampère equation:
det(D2ψ) =
ρ0(x)
ρ1(∇ψ(x))
[Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’16]
Fast algorithms (second order methods)
ρ1 should be lipschitz continuous with convex support
ρ0 ρ1
Optimal Transport for Image Assimilation Wasserstein distance 28/38

Estimation of optimal transport map
For p = 2,T = ∇ψ, with ψ convex [Brenier ’91] and optimal mass
transfer follows straight lines → Monge-Ampère equation:
det(D2ψ) =
ρ0(x)
ρ1(∇ψ(x))
[Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’16]
Fast algorithms (second order methods)
ρ1 should be lipschitz continuous with convex support
ρ0 ρ1 ρ0 ρ1
Optimal Transport for Image Assimilation Wasserstein distance 28/38

Estimation of optimal transport map
For p = 2,T = ∇ψ, with ψ convex [Brenier ’91] and optimal mass
transfer follows straight lines → Monge-Ampère equation:
det(D2ψ) =
ρ0(x)
ρ1(∇ψ(x))
[Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’16]
Fast algorithms (second order methods)
ρ1 should be lipschitz continuous with convex support
ρ0 ρ1 ρ0 ρ1
• Regularized potential ψ [Paty et al. ’19]
Optimal Transport for Image Assimilation Wasserstein distance 28/38

Estimation of optimal transport map
For p = 2,T = ∇ψ, with ψ convex [Brenier ’91] and optimal mass
transfer follows straight lines → Monge-Ampère equation:
det(D2ψ) =
ρ0(x)
ρ1(∇ψ(x))
[Oliker and Prussner ’88, Oberman ’08, Froese ’12, Benamou et al. ’16]
Fast algorithms (second order methods)
ρ1 should be lipschitz continuous with convex support
ρ0 ρ1 ρ0 ρ1
• Regularized potential ψ [Paty et al. ’19]
All these methods are limited to non vanishing densities
Optimal Transport for Image Assimilation Wasserstein distance 28/38

Fluid mechanics formulation [Benamou-Brenier ’00]
• Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t):
ρ(x, t) = ((1 − t)Id + tT(x)) ρ0
Optimal Transport for Image Assimilation Wasserstein distance 29/38

Fluid mechanics formulation [Benamou-Brenier ’00]
• Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t):
ρ(x, t) = ((1 − t)Id + tT(x)) ρ0
• Non-convex problem over ρ(x, t) ∈ R and velocity ﬁeld v(x, t) ∈ R2:
W2(ρ0, ρ1)2 = min
(v,ρ)∈Cv
1
2 [0,1]2
1
0
ρ(x, t)||v(x, t)||2dtdx,
under the set of non-linear constraints
Cv = (v, ρ) \ ∂t ρ + divx (ρv) = 0, v(0, ·) = v(1, ·) = 0, ρ(·, 0) = ρ0, ρ(·, 1) = ρ1
Optimal Transport for Image Assimilation Wasserstein distance 29/38

Fluid mechanics formulation [Benamou-Brenier ’00]
• Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t):
ρ(x, t) = ((1 − t)Id + tT(x)) ρ0
• Non-convex problem over ρ(x, t) ∈ R and velocity ﬁeld v(x, t) ∈ R2:
W2(ρ0, ρ1)2 = min
(v,ρ)∈Cv
1
2 [0,1]2
1
0
ρ(x, t)||v(x, t)||2dtdx,
under the set of non-linear constraints
Cv = (v, ρ) \ ∂t ρ + divx (ρv) = 0, v(0, ·) = v(1, ·) = 0, ρ(·, 0) = ρ0, ρ(·, 1) = ρ1
Change of variable (v, ρ) → (m, ρ), with m = ρv:
Convex cost J and linear constraints C
Optimal Transport for Image Assimilation Wasserstein distance 29/38

Fluid mechanics formulation [Benamou-Brenier ’00]
• Parameterization with t ∈ [0, 1] of the geodesic path ρ(x, t):
ρ(x, t) = ((1 − t)Id + tT(x)) ρ0
• Non-convex problem over ρ(x, t) ∈ R and velocity ﬁeld v(x, t) ∈ R2:
W2(ρ0, ρ1)2 = min
(v,ρ)∈Cv
1
2 [0,1]2
1
0
ρ(x, t)||v(x, t)||2dtdx,
under the set of non-linear constraints
Cv = (v, ρ) \ ∂t ρ + divx (ρv) = 0, v(0, ·) = v(1, ·) = 0, ρ(·, 0) = ρ0, ρ(·, 1) = ρ1
Change of variable (v, ρ) → (m, ρ), with m = ρv:
Convex cost J and linear constraints C
No estimation of the transport map T, only the geodesic ρ(x, t)
Optimal Transport for Image Assimilation Wasserstein distance 29/38

Wasserstein distance
Pollutant on a convex 2D domain
Prédiction Observation Interpolation
Optimal Transport for Image Assimilation Wasserstein distance 30/38

Wasserstein distance
Pollutant on a non-convex 2D domain
Prédiction Observation Interpolation
Obstacles with w : x → {1; +∞} [P., Peyré and Oudet, ‘14]
Ω
w(x)c(x, T(x))dx
Optimal Transport for Image Assimilation Wasserstein distance 30/38

Wasserstein distance
Optimal Transport for Image Assimilation Wasserstein distance 31/38

Wasserstein distance
Deal with structured data in complex domain
Optimal Transport for Image Assimilation Wasserstein distance 31/38

Wasserstein distance
Deal with structured data in complex domain
Computational cost
Optimal Transport for Image Assimilation Wasserstein distance 31/38

1 Introduction
2 Image Assimilation
3 Wasserstein distance
4 Wasserstein Image Assimilation
5 Conclusion
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 32/38

Image assimilation with Wasserstein distance
Quantity observed I included in the state [Feyeux et al. ‘18]:
W2(u, I)
Clean formulation: gradient ﬂow in Wasserstein space
Initial time Final time
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 33/38

Image assimilation with Wasserstein distance
Quantity observed I included in the state [Feyeux et al. ‘18]:
W2(u, I)
Clean formulation: gradient ﬂow in Wasserstein space
Initial time Final time
Only deﬁned for densities: same mass and non negative
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 33/38

Contour assimilation with Wasserstein distance
Deal with pollutant position shift [Li et al. ‘19]
Level set representation [Ba et al. ‘10, Gautama et al. ‘16] of a contour
C(u) of interest, transported by the model
Pseudo-observations: extraction of a contour C(I) from the image
Normalization of the mass:
W2
C(u)
|C(u)|
,
C(I)
|C(I)|
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 34/38

Contour assimilation with Wasserstein distance
Deal with pollutant position shift [Li et al. ‘19]
Level set representation [Ba et al. ‘10, Gautama et al. ‘16] of a contour
C(u) of interest, transported by the model
Pseudo-observations: extraction of a contour C(I) from the image
Normalization of the mass:
W2
C(u)
|C(u)|
,
C(I)
|C(I)|
Reduced non-linearities [P. and Rabin ‘17]: W2 C(u), C(I)
|C(I)|
|C(u)|
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 34/38

Contour assimilation with Wasserstein distance
Deal with pollutant position shift [Li et al. ‘19]
Level set representation [Ba et al. ‘10, Gautama et al. ‘16] of a contour
C(u) of interest, transported by the model
Pseudo-observations: extraction of a contour C(I) from the image
Normalization of the mass:
W2
C(u)
|C(u)|
,
C(I)
|C(I)|
Reduced non-linearities [P. and Rabin ‘17]: W2 C(u), C(I)
|C(I)|
|C(u)|
Limited to segmentation masks
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 34/38

Generalized Wasserstein models
Extended W1
for seismograms of varying mass [Métivier et al. ‘16]
Full waveform
inversion
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 35/38

Generalized Wasserstein models
Extended W1
for seismograms of varying mass [Métivier et al. ‘16]
Full waveform
inversion
Unbalanced transport [Chizat et al. ‘18]
- Deal with mass variations (source term in transport equation)
- Still a distance
Euclidean Wasserstein Unbalanced
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 35/38

Generalized Wasserstein models
Extended W1
for seismograms of varying mass [Métivier et al. ‘16]
Full waveform
inversion
Unbalanced transport [Chizat et al. ‘18]
- Deal with mass variations (source term in transport equation)
- Still a distance
Euclidean Wasserstein Unbalanced
Transport-based distance [Thorpe et al. ‘17]
- Can deal with negative values (lifting)
- Adapted to phase shifts
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 35/38

Computational schemes
Dynamical Optimal Transport (geodesic estimation)
Fixed grid [Benamou and Brenier ‘00, P. et al. ‘14,
Osher et al. ‘17, Chizat et al ‘18, Hug et al. ‘20]
Discrete surfaces [Lavenant et al. ’18]
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 36/38

Computational schemes
Dynamical Optimal Transport (geodesic estimation)
Fixed grid [Benamou and Brenier ‘00, P. et al. ‘14,
Osher et al. ‘17, Chizat et al ‘18, Hug et al. ‘20]
Discrete surfaces [Lavenant et al. ’18]
Parametric: sphere [Lang and P. ‘20?]
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 36/38

Computational schemes
Dynamical Optimal Transport (geodesic estimation)
Fixed grid [Benamou and Brenier ‘00, P. et al. ‘14,
Osher et al. ‘17, Chizat et al ‘18, Hug et al. ‘20]
Discrete surfaces [Lavenant et al. ’18]
Parametric: sphere [Lang and P. ‘20?]
- Multi-scale representation, adaptive meshes
- Controlling errors on a reduced number of
quadrature points
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 36/38

Computational schemes
Dynamical Optimal Transport (geodesic estimation)
Fixed grid [Benamou and Brenier ‘00, P. et al. ‘14,
Osher et al. ‘17, Chizat et al ‘18, Hug et al. ‘20]
Discrete surfaces [Lavenant et al. ’18]
Parametric: sphere [Lang and P. ‘20?]
- Multi-scale representation, adaptive meshes
- Controlling errors on a reduced number of
quadrature points
Fast Optimal Transport map estimation
Machine Learning: discrete setting [Cuturi & Peyré ‘19, Flamary,
Courty et al. ‘19, P. ‘19, Weed & Rigolet ‘19...]
Deep Learning: density estimation, mass transport
[Dinh et al. ‘16, Zhang et al. ‘18, Grathwohl et al. ‘19...]
Optimal Transport for Image Assimilation Wasserstein Image Assimilation 36/38

1 Introduction
2 Image Assimilation
3 Wasserstein distance
4 Wasserstein Image Assimilation
5 Conclusion
Optimal Transport for Image Assimilation Conclusion 37/38

Conclusions and current challenges
Wasserstein distance
Measure between structured images
Generalization: mass variation, negative
Efﬁcient computation
Dealing with covariance matrices for assimilation
Ocean images (SWOT mission, 2021)
Observation of small scale dynamic
Modeling observation errors
Optimal Transport for Image Assimilation Conclusion 38/38