Olga Mula (Eindhoven University of Technology, Netherlands) Reduced Models in Wasserstein Spaces for Forward and Inverse Problems
WORKSHOP ON OPTIMAL TRANSPORT
FROM THEORY TO APPLICATIONS
INTERFACING DYNAMICAL SYSTEMS, OPTIMIZATION, AND MACHINE LEARNING
Venue: Humboldt University of Berlin, Dorotheenstraße 24
Olga Mula (TU Eindhoven). Workshop on Optimal Transport: Theory and Applications, Berlin, 2024/03/11-15.
Setting: let (V, d) be a metric space over a compact domain D ⊂ R^d.

Parametrized PDE: given parameters θ ∈ Θ, solve B_θ(u) = 0 ⇝ u(θ) ∈ V.

Solution set: M := {u(θ) ∈ V : θ ∈ Θ}.

Observation operator: given linear functionals {ℓ_i}_{i=1}^m in V′,
  M : V → R^m, u ↦ M(u) = (ℓ_1(u), ..., ℓ_m(u))^T = z.

Tasks:
(i) Forward problem: θ ↦ u(θ).
(ii) Forward model reduction: M ≈ V_n ⊂ V.
(iii) Inverse state estimation: (z, M) ↦ u ≈ M^{-1}(z) ∩ M.
(iv) Inverse parameter estimation: (z, M) ↦ θ†.

Goal: numerical solvers for (ii) and (iii) when V = (P_2(D), W_2).
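As a concrete illustration of the observation operator M, here is a minimal sketch in which u is a discrete measure and the ℓ_i are hypothetical Gaussian window functionals ℓ_i(u) = ∫_D exp(−(x − c_i)²) u(dx); the sensor profile is an assumption for illustration, not the choice made in the talk.

```python
import numpy as np

def observe(points, weights, centers):
    """Apply M(u) = (l_1(u), ..., l_m(u))^T to the discrete measure
    u = sum_k weights[k] * delta_{points[k]}.
    Each functional integrates a Gaussian window against u
    (an illustrative choice of sensor profile)."""
    return np.array([np.sum(weights * np.exp(-(points - c) ** 2))
                     for c in centers])

# Uniform discrete measure on [0, 1], observed by m = 3 sensors.
pts = np.linspace(0.0, 1.0, 101)
w = np.full(101, 1.0 / 101)
z = observe(pts, w, centers=[0.25, 0.5, 0.75])
```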
Developments in Wasserstein spaces: 1. Forward reduced modeling. 2. State estimation.
[CDMN22] Nonlinear reduced models for state and parameter estimation (SIAM/ASA Journal on Uncertainty Quantification, 2022)
Example: V = RKHS. (Figure: station locations of a sensor network.)
B_θ(u) = div(A∇u + vu) + c, θ = (A, v, c) ∈ Θ.
z_i = ℓ_i(u) = δ_{x_i}(u) = ⟨ω_i, u⟩_V.
When V is a Hilbert space, we can define the observation space W = span{ω_i}_{i=1}^m, where the ω_i are the Riesz representers of the ℓ_i, and
z = (z_i)_{i=1}^m, z_i = ℓ_i(u) = ⟨ω_i, u⟩_V ⇔ ω = P_W u.
Observing z is equivalent to knowing that u lies in the affine set (ω + W^⊥) ∩ M.
Task: build A : W → V such that A(P_W u) ≈ u for all u ∈ M.
Equivalently, build A : R^m → V such that A(M(u)) ≈ u for all u ∈ M.
Quality of A : R^m → V: E(A, M) = max_{u∈M} ‖u − A(M(u))‖.
Optimal performance among all algorithms: E*(M) = min_{A : W → V} E(A, M).
The optimal map A* is not easily computable, since M may have a complicated geometry which is, in general, not given explicitly.
Affine reconstruction algorithms: let ū + V_n be an affine subspace with 1 ≤ n ≤ m. The mapping
A : W → V, ω ↦ A(ω) := argmin_{v ∈ ω + W^⊥} dist(v, ū + V_n)
is called an affine reconstruction algorithm. The name comes from the fact that
Im(A) = ū + V_n ⊕ (W ∩ V_n^⊥) and A(· − P_W ū) ∈ L(W, V_n ⊕ (W ∩ V_n^⊥)).
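In a Hilbert setting this algorithm reduces to linear algebra. The following sketch takes ū = 0, with R^d standing in for V and orthonormal bases for V_n and W; the matrices and dimensions are illustrative assumptions.

```python
import numpy as np

# Sketch of the affine algorithm with ubar = 0: given z = W^T u (the
# coordinates of P_W u), solve the n x m least-squares problem
#   c* = argmin_c || z - (W^T V) c ||,
# and return u* = V c* + W (z - W^T V c*): the reduced-model part plus
# a measurement correction lying in W.

rng = np.random.default_rng(0)
d, n, m = 50, 3, 10                                # ambient, reduced, sensor dims
V, _ = np.linalg.qr(rng.standard_normal((d, n)))   # orthonormal basis of V_n
W, _ = np.linalg.qr(rng.standard_normal((d, m)))   # orthonormal basis of W

def reconstruct(z):
    G = W.T @ V                                    # cross-Gramian (m x n)
    c = np.linalg.lstsq(G, z, rcond=None)[0]
    v = V @ c
    return v + W @ (z - W.T @ v)

u = V @ rng.standard_normal(n)                     # a state lying exactly in V_n
u_star = reconstruct(W.T @ u)                      # exact when beta_{n,m} > 0
```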
Computation: a least-squares problem of size n × m.

Characterization: the family of affine algorithms is generated by all affine subspaces ū + V_n with 1 ≤ n ≤ m [CDD+20].

Performance: d_{m+1}(M) ≤ E(A, M) ≤ β_{n,m}^{-1} ε_n, where
d_{m+1}(M) := min_{V_n ⊆ V, 1 ≤ n ≤ m+1} max_{u∈M} dist(u, V_n)  (Kolmogorov (m+1)-width),
ε_n := max_{u∈M} dist(u, ū + V_n),
β_{n,m} := inf_{v ∈ V_n} ‖P_{W_m} v‖ / ‖v‖ = cos(θ_{V_n, W_m}) ∈ (0, 1].

Good strategy for elliptic problems:
- d_n(M) ≲ e^{-n^α} [CD16].
- We can build V_n such that ε_n ≲ e^{-n^α} [BCD+11].
- Need to pay attention to β_{n,m} (sensor placement, [BCMN18]).
Nonlinear algorithms: partition the parameter domain in K subsets, Θ = Θ_1 ∪ ... ∪ Θ_K ⇝ M = M_1 ∪ ... ∪ M_K. We omit the details on how to build the partition and how to select the index k̂.
Theorem ([CDMN22]): for a given target tolerance σ > 0, we can build a partition of M such that
E*(M) ≤ E(A_k̂, M) ≤ E*(M_σ),
where k̂ comes from our model selection on the residual. In the presence of noise and model error, σ ≥ σ_0 > 0, and σ → σ_0 as K grows.

Merits and limitations:
✓ Very general algorithm.
✓ Elliptic and parabolic problems with possibly weak coercivity: few partitions expected.
✗ Transport-dominated problems: many partitions expected.
✗ Only for V a Hilbert space.
Example of a transport-dominated problem on the real line:
ℓ_i(u) = "cell averages", V_n = "parametric curves on each cell" (a nonlinear set).
Forward model reduction to build V_n:
[ELMV20] Nonlinear model reduction on metric spaces. Application to one-dimensional conservative PDEs in Wasserstein spaces (M2AN, 2020).
[DFM22] Approximation and Structured Prediction with Sparse Wasserstein Barycenters (preprint).
We consider conservative PDEs, or Wasserstein gradient flows, and we view solutions in P_2(D). We want to approximate
u : Θ → V, θ ↦ u(θ)  with  û_n : Θ → V_n, θ ↦ û_n(θ),
and the approximation is efficient if the error e_n = max_{θ∈Θ} W_2(u(θ), û_n(θ)) decays fast with n.
Wasserstein barycenters: given weights Λ_n = (λ_1, ..., λ_n) in the simplex Σ_n := {λ ∈ R^n : Σ_{i=1}^n λ_i = 1, λ_i ≥ 0} and a set of measures U_n = {u_1, ..., u_n} ⊂ P_2(D), the barycenter associated to (Λ_n, U_n) is
Bar(Λ_n, U_n) = argmin_{v ∈ P_2(D)} Σ_{i=1}^n λ_i W_2^2(v, u_i).
We want to identify a good U_n and work with
V_n = Bar(Σ_n, U_n) := {Bar(Λ_n, U_n) : Λ_n ∈ Σ_n}.
Greedy algorithm: assume we are given a set of N ≫ 1 snapshots M = {u(θ_1), ..., u(θ_N)} ⊂ {u(θ) : θ ∈ Θ}. We start by finding
(u_1, u_2) ∈ argmax_{(v_1, v_2) ∈ M×M} W_2(v_1, v_2)
and set U_2 = {u_1, u_2}, V_2 = Bar(Σ_2, U_2). For n > 2, given U_{n-1} and V_{n-1}, find
u_n ∈ argmax_{v ∈ M} W_2(v, V_{n-1})
and set U_n = U_{n-1} ∪ {u_n}, V_n = Bar(Σ_n, U_n).
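A minimal sketch of the greedy selection in 1D, assuming measures are stored as quantile functions sampled on a uniform grid (so that W_2 is the L2 distance between quantiles). As a cheap surrogate, the distance to V_{n-1} is replaced here by the distance to the already selected snapshots (farthest-point sampling); the algorithm above uses dist(v, V_{n-1}) itself.

```python
import numpy as np

def w2_1d(q1, q2):
    # In 1D, W2(u, v) is the L2 distance between quantile functions.
    return np.sqrt(np.mean((q1 - q2) ** 2))

def greedy_select(snapshots, n):
    """Return the indices of n snapshots, starting from the most distant
    pair and then adding the snapshot farthest from the current
    selection (a surrogate for the distance to V_{n-1})."""
    N = len(snapshots)
    d = np.array([[w2_1d(snapshots[i], snapshots[j]) for j in range(N)]
                  for i in range(N)])
    i0, j0 = np.unravel_index(np.argmax(d), d.shape)
    sel = [int(i0), int(j0)]
    while len(sel) < n:
        mind = d[:, sel].min(axis=1)   # distance to current selection
        mind[sel] = -1.0               # never re-select
        sel.append(int(np.argmax(mind)))
    return sel

# Translates of the uniform measure: quantile q_s(t) = t + s.
t = np.linspace(0.0, 1.0, 101)
snaps = [t + s for s in np.linspace(0.0, 1.0, 11)]
sel = greedy_select(snaps, 3)   # the two extremes first, then the middle
```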
For each u(θ) ∈ M, we compute the best barycentric weights:
min_{v ∈ V_n} W_2(u(θ), v) = min_{Λ_n ∈ Σ_n} W_2(u(θ), Bar(Λ_n, U_n)) ⇝ Λ_n*(θ).
We then build an interpolation operator Λ_n : Θ → Σ_n such that Λ_n(θ_i) = Λ_n*(θ_i) at the training parameters, and we define û_n(θ) := Bar(Λ_n(θ), U_n).
Practical computation:
- 1d: leverage closed-form expressions for W_2 and Bar(·, U_n).
- > 1d: entropic regularization ([DFM23]).
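The 1d closed forms can be sketched as follows: W_2 is the L2 distance between quantile functions, and the barycenter with weights Λ_n has quantile function Σ_i λ_i q_i. The measures and grid below are illustrative.

```python
import numpy as np

def bar_1d(lams, quantiles):
    # 1D Wasserstein barycenter: the weighted average of the
    # quantile functions (rows of `quantiles`).
    return np.tensordot(lams, quantiles, axes=1)

t = np.linspace(0.0, 1.0, 201)
q0 = t          # quantile of the uniform measure on [0, 1]
q1 = t + 2.0    # quantile of the uniform measure on [2, 3]
b = bar_1d(np.array([0.5, 0.5]), np.stack([q0, q1]))
# the midpoint barycenter is the uniform measure on [1, 2]
```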
Numerical examples:
Viscous Burgers' equation: ∂_t u + (1/2) ∂_x(u²) − θ ∂_{xx} u = 0.
Camassa-Holm: ∂_t m + u ∂_x m + 2 m ∂_x u = 0, with m = u − α² ∂_{xx} u.
Korteweg-de Vries: ∂_t u + 6 u ∂_x u + ∂_x³ u = 0.
Benchmark: the barycentric Kolmogorov-type width
d_n(M) := inf_{V_n = Bar(Σ_n, U_n), U_n ⊂ P_2(D)} sup_{u ∈ M} W_2(u, V_n).
Decay rate with the V_n from the greedy algorithm: e_n = max_{θ∈Θ} W_2(u(θ), û_n(θ)).
Can we prove that e_n ≤ C d_n(M) for some C > 0?
Best n-term barycentric approximation using M: for a given θ ∈ Θ, we define
Λ_N^n(θ) ∈ argmin_{Λ_N ∈ Σ_N ∩ Δ_N^n} W_2^2(u(θ), Bar(Λ_N, M)),
where
Δ_N^n := {v ∈ R^N : ‖v‖_0 ≤ n}, Σ_N := {v ∈ R^N : v_i ≥ 0, Σ_{i=1}^N v_i = 1}.
We obtain a sparse vector Λ_N^n(θ) = (0, ..., 0, λ_{i_1^θ}, 0, ..., 0, λ_{i_n^θ}, 0, ..., 0)^T ∈ Σ_N. We next define
Λ_n(θ) = (λ_{i_1^θ}, ..., λ_{i_n^θ}) ∈ Σ_n, U_n(θ) = {u(θ_{i_1^θ}), ..., u(θ_{i_n^θ})}.
The best n-term barycentric approximation of u(θ) is Bar(Λ_n(θ), U_n(θ)).
State estimation: building A : R^m → V_n.
[MR24] State estimation in the Wasserstein space (with P. Rai, in progress).
Given V_n = Bar(Σ_n, U_n), we consider the reconstruction algorithm
A : (R^m, ‖·‖_Z) → P_2(D), z ↦ A(z) = argmin_{v ∈ V_n} ‖z − M(v)‖_Z,
where
M : V → R^m, u ↦ M(u) = (∫_D ω_i(x) u(dx))_{i=1}^m = z,
for some given ω_i ∈ C¹(D).
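A minimal sketch of this reconstruction for n = 2 in 1D, searching the weight simplex on a grid. The two atoms, the Gaussian windows ω_i, and the grid resolution are illustrative assumptions; measures are stored as quantile functions q, so that M(u)_i = ∫_0^1 ω_i(q(t)) dt.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 401)
q0, q1 = t, t + 2.0                      # U_2: uniforms on [0,1] and [2,3]
windows = [lambda x: np.exp(-x ** 2),    # illustrative omega_1, omega_2
           lambda x: np.exp(-(x - 2.0) ** 2)]

def M(q):
    # Observations of the measure with quantile q: M(u)_i = mean of w_i(q).
    return np.array([np.mean(w(q)) for w in windows])

def reconstruct(z, grid=101):
    """A(z) = argmin over the 1-parameter family of barycenters
    Bar((lam, 1-lam), U_2) of the data misfit ||z - M(v)||."""
    best, best_lam = np.inf, None
    for lam in np.linspace(0.0, 1.0, grid):
        qb = lam * q0 + (1.0 - lam) * q1   # 1D barycenter quantile
        r = np.linalg.norm(z - M(qb))
        if r < best:
            best, best_lam = r, lam
    return best_lam

# Noiseless data generated by the barycenter with lam = 0.3 is recovered.
z_obs = M(0.3 * q0 + 0.7 * q1)
lam_hat = reconstruct(z_obs)
```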
Stability: for ‖z‖_Z = inf{W_2(u, v) : M(u − v) = z} and any ū ∈ V_n,
C_S = sup_{(u,v) ∈ V_n²} W_2(u, v) / ‖M(u) − M(v)‖_Z ≥ γ(ū) := max_{v ∈ T_ū V_n} ‖v‖_{L²_ū(D)} / ‖(dM)(v)‖_Z,
which shows that C_S < ∞ if n ≤ m. If ‖z‖_Z² = z^T G^{-1}(ū) z, with G(ū) = (⟨∇ω_i, ∇ω_j⟩_{L²_ū(D)})_{1≤i,j≤m}, then
γ(ū) = max_{v ∈ T_ū P_2(D)} ‖v‖_{L²_ū(D)} / ‖P_W v‖_{L²_ū(D)} = β_{n,m}^{-1}, with W = span{∇ω_1, ..., ∇ω_m}.
This means that M : P_2(D) → (R^m, G^{-1}) is a submersion.
Optimal sensor placement: assuming that we can choose the sensor locations, we can view ω_i(·) = ω_i(r_i, ·), 1 ≤ i ≤ m. Thus
M(u, r) = (∫_D ω_i(r_i, x) u(dx))_{i=1}^m,
and we can optimize the locations by computing
C_S^{opt} = min_{r = (r_1, ..., r_m)} C_S(r), with C_S(r) = sup_{(u,v) ∈ V_n²} W_2(u, v) / ‖M(u, r) − M(v, r)‖_Z.
In practice, we use consensus-based optimization.
Outlook (with Daan Bon, Gautam Pai, Gijs Bellard, Remco Duits): V = (P_2(SE(2)), W_2(SE(2))). We lift images to SE(2), and we want to do parameter and state estimation on that space.
Figure: top, gradient flow in SE(2); bottom, gradient flow in R².
Conclusions: reduced models are crucial to develop viable algorithms. We have taken a first step in the Wasserstein space with an algorithm based on barycenters. Plenty of approximation-theory results are still missing.
References:
[BCD+11] P. Binev, A. Cohen, W. Dahmen, R. DeVore, G. Petrova, and P. Wojtaszczyk, Convergence rates for greedy algorithms in reduced basis methods, SIAM Journal on Mathematical Analysis 43 (2011), no. 3, 1457-1472.
[BCD+17] P. Binev, A. Cohen, W. Dahmen, R. DeVore, G. Petrova, and P. Wojtaszczyk, Data assimilation in reduced modeling, SIAM/ASA Journal on Uncertainty Quantification 5 (2017), no. 1, 1-29.
[BCMN18] P. Binev, A. Cohen, O. Mula, and J. Nichols, Greedy algorithms for optimal measurements selection in state estimation using reduced models, SIAM/ASA Journal on Uncertainty Quantification 6 (2018), no. 3, 1101-1126.
[BPMD24] D. Bon, G. Pai, G. Bellard, O. Mula, and R. Duits, Optimal transport in the Lie group of roto-translations SE(2), submitted (2024).
[BRB13] S. Bahmani, B. Raj, and P. T. Boufounos, Greedy sparsity-constrained optimization, Journal of Machine Learning Research 14 (2013), 807-841.
[CD16] A. Cohen and R. DeVore, Kolmogorov widths under holomorphic mappings, IMA Journal of Numerical Analysis 36 (2016), no. 1, 1-12.
[CDD+20] A. Cohen, W. Dahmen, R. DeVore, J. Fadili, O. Mula, and J. Nichols, Optimal reduced model algorithms for data-based state estimation, SIAM Journal on Numerical Analysis 58 (2020), no. 6, 3355-3381.
[CDMN22] A. Cohen, W. Dahmen, O. Mula, and J. Nichols, Nonlinear reduced models for state and parameter estimation, SIAM/ASA Journal on Uncertainty Quantification 10 (2022), no. 1, 227-267.
[CMS24] A. Cohen, O. Mula, and A. Somacal, High order recovery of geometric interfaces from cell-average data, arXiv:2402.00946 (2024).
[DFM22] H. Do, J. Feydy, and O. Mula, Approximation and structured prediction with sparse Wasserstein barycenters, 2022.
[DFM23] H. Do, J. Feydy, and O. Mula, Approximation and structured prediction with sparse Wasserstein barycenters, arXiv:2302.05356 (2023).
[ELMV20] V. Ehrlacher, D. Lombardi, O. Mula, and F.-X. Vialard, Nonlinear model reduction on metric spaces. Application to one-dimensional conservative PDEs in Wasserstein spaces, ESAIM M2AN 54 (2020), no. 6, 2159-2197.
[KBCK13] A. Kyrillidis, S. Becker, V. Cevher, and C. Koch, Sparse projections onto the simplex, International Conference on Machine Learning, PMLR, 2013, pp. 235-243.
[MR24] O. Mula and P. Rai, State estimation in the Wasserstein space, in progress.
[NT09] D. Needell and J. A. Tropp, CoSaMP: iterative signal recovery from incomplete and inaccurate samples, Applied and Computational Harmonic Analysis 26 (2009), no. 3, 301-321.
Reminder:
Λ_N^n(θ) ∈ argmin_{Λ_N ∈ Σ_N ∩ Δ_N^n} W_2^2(u(θ), Bar(Λ_N, M)).
This is reminiscent of compressed-sensing problems posed on V = R^N of the type
min_{z ∈ R^N} ‖z‖_0 s.t. Az = y_obs.
Difficult optimization problem:
- Nonconvex.
- Possibly plenty of local minima; non-unique minimizer.
- ℓ¹-regularization is not possible because it conflicts with Σ_N.
We have built: a minimization algorithm that delivers satisfactory results, and a surrogate version that involves θ instead of u(θ) in the loss function.
Projected gradient descent on the sparse simplex.
Data: target u ∈ P_2(D), training set M, sparsity degree n.
Result: Λ_N^n(u) ∈ Σ_N ∩ Δ_N^n.
Initialize Λ ∈ Σ_N;
repeat
  Gradient step: Λ ← Λ − τ ∇_Λ W_2^2(u, Bar(Λ, M));
  Projection onto Δ_N^n ∩ Σ_N: Λ ← P_{Δ_N^n ∩ Σ_N}(Λ);
until convergence.
The algorithm is a generalization of CoSaMP and refinements such as GSSP (see [NT09, KBCK13, BRB13]). We can build variants in which the sparsity degree n is learned adaptively. A rigorous convergence analysis is still missing; it is difficult due to nonconvexity.
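The projection step P_{Δ_N^n ∩ Σ_N} can be sketched as "keep the n largest entries, then project their support onto the simplex", using the standard sort-based Euclidean simplex projection. This is one common realization of the projection, shown as an assumption rather than the exact variant used in the talk.

```python
import numpy as np

def proj_simplex(v):
    # Euclidean projection of v onto the probability simplex
    # (standard sort-based algorithm).
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * ks > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def proj_sparse_simplex(v, n):
    # Keep the n largest coordinates, zero out the rest, then project
    # the retained support onto the simplex.
    idx = np.argsort(v)[::-1][:n]
    out = np.zeros_like(v)
    out[idx] = proj_simplex(v[idx])
    return out

p = proj_sparse_simplex(np.array([0.5, 0.3, 0.1, 0.05]), 2)
```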
Numerical illustration: the training set Y_N^train = {u(θ_i)}_{i=1}^N contains solutions of a Burgers' equation, and the target function is a barycenter with support 2 from the training set.
Figure: reference barycentric weights ("ref") versus fitted weights ("fit").
Here the target is a barycenter with support 10 from the training set. Some functions of the target barycenter are probably redundant.
The strategy for computing Λ_N^n(θ) is based on solving
Λ_N^n(θ) ∈ argmin_{Λ_N ∈ Σ_N ∩ Δ_N^n} W_2^2(u(θ), Bar(Λ_N, Y_N^train)).
The computation requires u(θ), so we can only use it for θ ∈ Θ_N. For a general θ ∈ Θ, we cannot assume that u(θ) is given: we need an extra approximation step.
Offline: for each training parameter, compute Λ_N^n(θ). Find a local Euclidean embedding:
M(θ) ∈ argmin_{M ≥ 0} Σ_{u(α) ∈ N_n(u(θ))} |(θ − α)^T M (θ − α) − W_2^2(u(θ), u(α))|².
Online: given θ ∈ Θ, find the n nearest neighbors of u(θ) by using the Euclidean embedding. For this, evaluate
(α − θ)^T M(α) (α − θ) ≈ W_2^2(u(θ), u(α)), ∀α ∈ Θ_N,
and pick the n smallest values to define the neighbors. For the neighbors u(α) ∈ N_n(u(θ)), we have Λ_n^z, X_n^z, Y_n^z. We want to use this information to define an interpolation strategy for the weights.
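The offline metric fit can be sketched in the scalar-parameter case, where M(θ) reduces to a nonnegative scalar m with a closed-form least-squares solution; the neighbor set and distances below are illustrative.

```python
import numpy as np

def fit_local_metric(theta, alphas, w2_sq):
    """Fit m >= 0 minimizing
    sum_i | m (theta - alpha_i)^2 - W2^2(u(theta), u(alpha_i)) |^2,
    the scalar analogue of the PSD matrix M(theta)."""
    g = (theta - alphas) ** 2
    m = float(np.dot(g, w2_sq) / np.dot(g, g))   # closed-form least squares
    return max(m, 0.0)

# If the u(theta) are translates of a fixed measure, then
# W2(u(theta), u(alpha)) = |theta - alpha| and the fitted metric is 1.
alphas = np.array([0.0, 0.2, 0.5])
m_hat = fit_local_metric(1.0, alphas, (1.0 - alphas) ** 2)
```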
Using the embedding, for a candidate barycenter b,
Σ_{u(α) ∈ N_n(u(θ))} |W_2^2(u(θ), u(α)) − W_2^2(b, u(α))|² ≈ Σ_{u(α) ∈ N_n(u(θ))} |(θ − α)^T M(α) (θ − α) − W_2^2(b, u(α))|²,
so we look for
min_{Λ_n ∈ Σ_n} Σ_{u(α) ∈ N_n(u(θ))} |(θ − α)^T M(α) (θ − α) − W_2^2(Bar(Λ_n, N_n(u(θ))), u(α))|² ⇝ Λ_N^n(θ).