Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Laetitia Chapel (Université de Bretagne-Sud & Institut Agro, France) Fast Optimal Transport through Sliced Generalized Wasserstein Geodesics

Jia-Jie Zhu
March 24, 2024
36

Laetitia Chapel (Université de Bretagne-Sud & Institut Agro, France) Fast Optimal Transport through Sliced Generalized Wasserstein Geodesics

WORKSHOP ON OPTIMAL TRANSPORT
FROM THEORY TO APPLICATIONS
INTERFACING DYNAMICAL SYSTEMS, OPTIMIZATION, AND MACHINE LEARNING
Venue: Humboldt University of Berlin, Dorotheenstraße 24

Berlin, Germany. March 11th - 15th, 2024

Jia-Jie Zhu

March 24, 2024
Tweet

More Decks by Jia-Jie Zhu

Transcript

  1. Fast Optimal Transport through Sliced Generalized Wasserstein Geodesics Joint work

    with Guillaume Mahey, Gilles Gasso, Clément Bonet and Nicolas Courty NeurIPS 2023 [5] Laetitia Chapel [email protected] IRISA, Rennes, France Institut Agro Rennes-Angers Workshop on Optimal Transport: from theory to applications, Berlin 2024
  2. Table of Contents Background on Optimal Transport Optimal transport and

    Wasserstein distance Transport map and Wasserstein Geodesics Curvature of the Wasserstein space Wasserstein Generalized Geodesic Computational Optimal Transport Sliced Wasserstein Generalized Geodesic SWGG with a PWD-like formulation SWGG with a Generalized Geodesic formulation Experimental results Computational aspects Gradient flows Pan sharpening / image colorization Point cloud matchings Optimal transport dataset distances Conclusion Bibliography L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 2 / 24
  3. Background Wasserstein distance Background on Optimal Transport Optimal transport and

    Wasserstein distance Optimal transport and Wasserstein distance OT (µ1 , µ2 ) ≜ inf γ∈Γ(µ1,µ2) X×Y c(x, y) dγ(x, y) where Γ(µ1 , µ2 ) def = {γ ∈ M+ (X × Y) s.t. (πx)# γ = µ1 and (πy)# γ = µ2 } with πx : X × Y → X. Linear loss Marginal constraints L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 3 / 24
  4. Background Wasserstein distance Background on Optimal Transport Optimal transport and

    Wasserstein distance Optimal transport and Wasserstein distance OT (µ1 , µ2 ) ≜ inf γ∈Γ(µ1,µ2) X×Y c(x, y) dγ(x, y) where Γ(µ1 , µ2 ) def = {γ ∈ M+ (X × Y) s.t. (πx)# γ = µ1 and (πy)# γ = µ2 } with πx : X × Y → X. Linear loss Marginal constraints µ1 µ2 γi,j > 0 and (πy)# γ = µ2 with (πx)# γ = µ1 The transport plan γ(x, y) specifies for each pair (x, y) how many particles go from x to y Wasserstein distance when c(x, y) = |x − y|p Wp(µ1 , µ2 ) ≜ inf γ∈Γ(µ1,µ2) X×Y c(x, y) dγ(x, y) 1/p L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 3 / 24
  5. Background Wasserstein geodesics Background on Optimal Transport Transport map and

    Wasserstein Geodesics In some cases, the optimal plan γ∗ is a Monge map of the form (Id, T)#µ1 , e.g. for p = 2 Wp p (µ1 , µ2 ) ≜ inf T ∥x − T(x)∥2 2 dµ1 (x) where T is a transport map and T# µ1 = µ2 µ1 µ2 L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 4 / 24
  6. Background Wasserstein geodesics Background on Optimal Transport Transport map and

    Wasserstein Geodesics In some cases, the optimal plan γ∗ is a Monge map of the form (Id, T)#µ1 , e.g. for p = 2 Wp p (µ1 , µ2 ) ≜ inf T ∥x − T(x)∥2 2 dµ1 (x) where T is a transport map and T# µ1 = µ2 x T(x) µ1 T(x) = µ2 Defines for each particle located at x what is its destination T(x) L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 4 / 24
  7. Background Wasserstein geodesics Background on Optimal Transport Transport map and

    Wasserstein Geodesics In some cases, the optimal plan γ∗ is a Monge map of the form (Id, T)#µ1 , e.g. for p = 2 Wp p (µ1 , µ2 ) ≜ inf T ∥x − T(x)∥2 2 dµ1 (x) where T is a transport map and T# µ1 = µ2 Wasserstein geodesics µ1→2(t) ≜ (tT1→2 + (1 − t)Id)#µ1 with T1→2 the optimal map For short, we denote µ1→2 for t = 0.5 L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 4 / 24 µ1 µ2
  8. Background Wasserstein space Background on Optimal Transport Curvature of the

    Wasserstein space The Wasserstein space is of positive curvature W2 2 (µ1→2, ν) ≥ 1 2 W2 2 (µ1 , ν) + 1 2 W2 2 (ν, µ2 ) − 1 4 W2 2 (µ1 , µ2 ) or equivalently W2 2 (µ1 , µ2 ) ≥ 2W2 2 (µ1 , ν) + 2W2 2 (ν, µ2 ) − 4 W2 2 (µ1→2, ν) for ν a pivot measure. ν µ2 µ1 µ1→2 Tν→µ1 Tν→µ2 Tµ1→µ2 Positive curvature of W space y x2 x1 (x1 + x2 )/2 Parallelogram law in Rd L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 5 / 24
  9. Background Wasserstein space Background on Optimal Transport Curvature of the

    Wasserstein space The Wasserstein space is of positive curvature W2 2 (µ1→2, ν) ≥ 1 2 W2 2 (µ1 , ν) + 1 2 W2 2 (ν, µ2 ) − 1 4 W2 2 (µ1 , µ2 ) or equivalently W2 2 (µ1 , µ2 ) ≥ 2W2 2 (µ1 , ν) + 2W2 2 (ν, µ2 ) − 4 W2 2 (µ1→2, ν) for ν a pivot measure. The Wasserstein space is flat when µ1 , µ2 , ν are 1d W2 2 (µ1 , µ2 ) = 2W2 2 (µ1 , ν) + 2W2 2 (ν, µ2 ) − 4W2 2 (µ1→2, ν) L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 5 / 24
  10. Background Wasserstein Generalized Geodesic Background on Optimal Transport Wasserstein Generalized

    Geodesics Has been introduced by Ambrosio et al. [1] Wasserstein Geodesic: µ1→2(t) ≜ (t T1→2 +(1 − t)Id)#µ1 Wasserstein Generalized Geodesic: µ1→2 g (t) ≜ (t Tν→µ2 +(1 − t) Tν→µ1 )#ν for ν a pivot measure. L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 6 / 24
  11. Background Wasserstein Generalized Geodesic Background on Optimal Transport Wasserstein Generalized

    Geodesics Has been introduced by Ambrosio et al. [1] Wasserstein Geodesic: µ1→2(t) ≜ (t T1→2 +(1 − t)Id)#µ1 Wasserstein Generalized Geodesic: µ1→2 g (t) ≜ (t Tν→µ2 +(1 − t) Tν→µ1 )#ν for ν a pivot measure. Negative curvature W2 2 (µ1→2 g , ν) ≤ 1 2 W2 2 (µ1 , ν) + 1 2 W2 2 (ν, µ2 ) − 1 4 W2 2 (µ1 , µ2 ) ν µ2 µ1 µ1→2 g Tν→µ1 Tν→µ2 Tµ1→µ2 ν = Tν→µ2 ◦ Tµ1→ν L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 6 / 24
  12. Background Wasserstein Generalized Geodesic Background on Optimal Transport Wasserstein Generalized

    Geodesics Has been introduced by Ambrosio et al. [1] Wasserstein Geodesic: µ1→2(t) ≜ (t T1→2 +(1 − t)Id)#µ1 Wasserstein Generalized Geodesic: µ1→2 g (t) ≜ (t Tν→µ2 +(1 − t) Tν→µ1 )#ν for ν a pivot measure. Negative curvature W2 2 (µ1→2 g , ν) ≤ 1 2 W2 2 (µ1 , ν) + 1 2 W2 2 (ν, µ2 ) − 1 4 W2 2 (µ1 , µ2 ) ν µ2 µ1 µ1→2 g Tν→µ1 Tν→µ2 Tµ1→µ2 ν = Tν→µ2 ◦ Tµ1→ν ν−Wasserstein distance: W2 ν (µ1 , µ2 ) = 2W2 2 (µ1 , ν) + 2W2 2 (ν, µ2 ) − 4W2 2 (µ1→2 g , ν) with W2 ν (µ1 , µ2 ) ≥ W2 2 (µ1 , µ2 ) L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 6 / 24
  13. Background Computational OT Computational Optimal Transport Discrete formulation of OT

    For µ1 = n i=1 hi δxi and µ2 = m j=1 gj δyj and a quadratic cost, we solve W2 2 (µ1 , µ2 ) ≜ minγ∈Γ(µ1,µ2) i,j c(xi , yj )γi,j → linear solvers with O(n3 log(n)) complexity L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 7 / 24
  14. Background Computational OT Computational Optimal Transport Discrete formulation of OT

    For µ1 = n i=1 hi δxi and µ2 = m j=1 gj δyj and a quadratic cost, we solve W2 2 (µ1 , µ2 ) ≜ minγ∈Γ(µ1,µ2) i,j c(xi , yj )γi,j → linear solvers with O(n3 log(n)) complexity When µ1 and µ2 are 1D distributions and n = m with uniform masses, the solution is given by W2 2 (µ1 , µ2 ) ≜ 1 n n i=1 (xσ(i) − yτ(i) )2 → the optimal transport plan respects the ordering of the elements xσ(i−1) ≤ xσ(i) and yτ(i−1) ≤ yτ(i) , complexity O(n log(n)) and O(n + n log(n)) for computing the distance xσ(1) yτ(1) xσ(2) yτ(2) xσ(3) yτ(3) xσ(4) yτ(4) xσ(5) yτ(5) xσ(6) yτ(6) L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 7 / 24
  15. Background Computational OT Computational Optimal Transport Geodesic in 1D In

    1D, the middle of the geodesic can be easily computed (xσ(i) + yτ(i) )/2 And when we take the pivot measure ν to be the middle of the geodesic µ1→2, we have W2 2 (µ1 , µ2 ) = W2 ν (µ1 , µ2 ) = 2W2 2 (µ1 , ν) + 2W2 2 (ν, µ2 ) xσ(1) +yτ(1) 2 xσ(2) +yτ(2) 2 xσ(3) +yτ(3) 2 xσ(4) +yτ(4) 2 xσ(5) +yτ(5) 2 xσ(6) +yτ(6) 2 L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 8 / 24
  16. Background Computational OT Computational Optimal Transport Sliced Wasserstein on Rd

    1. Slice the distribution along lines θ ∈ Sd−1 2. Project µ1 and µ2 onto θ: Pθ # µ, with Pθ : Rd → R, x → ⟨x, θ⟩ 3. Compute 1d Wasserstein onto the projected samples in 1d 4. Average all the distances SW2 2 (µ1 , µ2 ) ≜ Sd−1 W2 2 (Pθ # µ1 , Pθ # µ2 )dω(θ), with ω uniform distribution on Sd−1. µ1 µ2 Pθ # µ1 = x, θ Pθ # µ2 = y, θ θ ∈ Sd−1 → provides a lower bound of W2 2 (µ1 , µ2 ) with complexity O(Ln + Ln log(n)), L number of lines L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 9 / 24
  17. Background Computational OT Computational Optimal Transport Projected Wasserstein Distance on

    Rd 1. Slice the distribution along lines θ ∈ Sd−1 2. Project µ1 and µ2 onto θ: Pθ # µ, with Pθ : Rd → R, x → ⟨x, θ⟩ 3. Compute Rd Wasserstein onto the permutations obtained by sorting the projections 4. Average all the distances (mettre un theta en indice dans les sigma) PWD2 2 (µ1 , µ2 ) ≜ Sd−1 1 n n i=1 xσθ(i) − yτθ(i) 2 2 dω(θ), with ω uniform distribution on Sd−1. µ1 µ2 σ τ θ ∈ Sd−1 → provides an upper bound of W2 2 (µ1 , µ2 ) with complexity O(Ln d +Ln log(n)), L number of lines L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 10 / 24
  18. SWGG SWGG with a PWD-like formulation Sliced Wasserstein Generalized Geodesic

    SWGG with a PWD-like formulation 1. Slice the distribution along lines θ ∈ Sd−1 2. Project µ1 and µ2 onto θ: Pθ # µ, with Pθ : Rd → R, x → ⟨x, θ⟩ 3. Compute Rd Wasserstein onto the permutations obtained by sorting the projections 4. Take the minimum over all the distances SWGG2 2 (µ1 , µ2 , θ) ≜ 1 n n i=1 xσθ(i) − yτθ(i) 2 2 , min-SWGG2 2 (µ1 , µ2 ) ≜ min θ∈Sd−1 SWGG2 2 (µ1 , µ2 , θ) µ1 µ2 σ τ θ ∈ Sd−1 L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 11 / 24
  19. SWGG SWGG with a PWD-like formulation Sliced Wasserstein Generalized Geodesic

    SWGG with a PWD-like formulation Properties of min-SWGG It comes with a transport map: let θ∗ be the optimal projection direction T(xi ) = y τ−1 θ∗ (σθ∗ (i)) , ∀1 ≤ i ≤ n. It is an upper bound of W and a lower bound of PWD W2 2 ≤ min-SWGG2 2 ≤ PWD2 2 and W2 2 = min-SWGG2 2 when d > 2n [2] Complexity O(Lnd + Ln log(n)) with L number of lines The Monte-Carlo search over the L lines is effective in low dimension only → how to design gradient descent techniques for finding θ∗? → further properties, such as sample complexity? L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 12 / 24
  20. SWGG SWGG with a Generalized Geodesic formulation Sliced Wasserstein Generalized

    Geodesic SWGG with a Generalized Geodesic formulation 1. Slice the distribution along lines θ ∈ Sd−1 2. Project µ1 and µ2 onto θ: Qθ # µ, with Qθ : Rd → Rd, x → θ⟨x, θ⟩ 3. Define the pivot measure ν to be the Wasserstein mean of the measure Qθ # µ1 and Qθ # µ2 ν = µ1→2 θ ≜ arg min µ W2 2 (Qθ # µ1 , µ) + W2 2 (µ, Qθ # µ2 ) 4. Take the minimum over all the following distances SWGG2 2 (µ1 , µ2 , θ) = 2W2 2 (µ1 , µ1→2 θ ) + 2W2 2 (µ1→2 θ , µ2 ) − 4W2 2 (µ1→2 g,θ , µ1→2 θ ) µ1 µ2 µ1→2 g,θ Qθ # µ1 Qθ # µ2 ν = µ1→2 θ θ ∈ Sd−1 → the two formulations are equivalent (for continuous or discrete distributions) L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 13 / 24
  21. SWGG SWGG with a Generalized Geodesic formulation Sliced Wasserstein Generalized

    Geodesic SWGG with a Generalized Geodesic formulation Why this reformulation? Define a gradient descent algorithm for optimizing over θ Rewrite the problem as an OT formulation with a restricted constraint set Define new properties for SWGG Properties of min-SWGG Weak convergence Translation invariance SWGG is equal to W when one of the distributions (µ2 ) is supported on a line of direction θ: W2 2 (µ1 , µ2 ) = W2 2 (µ1 , Qθ # µ1 ) + W2 2 (Qθ # µ1 , µ2 ) that can be computed with a closed form L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 14 / 24
  22. SWGG SWGG with a Generalized Geodesic formulation Sliced Wasserstein Generalized

    Geodesic SWGG with a Generalized Geodesic formulation Gradient descent for optimizing over θ min-SWGG2 2 (µ1 , µ2 ) = minθ∈Sd−1 1 n n i=1 xσθ(i) − yτθ(i) 2 2 is not amenable to optimization µ1 Qθ1 # µ2 µ2 Qθ2 # µ1 Qθ1 # µ1 Qθ2 # µ2 0 π/2 π 3π/2 0 7 SWGG value min-SWGG2 2 (µ1 , µ2 ) = minθ∈Sd−1 2W2 2 (µ1 , µ1→2 θ ) + 2W2 2 (µ1→2 θ , µ2 ) − 4W2 2 (µ1→2 g,θ , µ1→2 θ ) can be computed with a O(dn + n log(n)) complexity, but W2 2 (µ1→2 g,θ , µ1→2 θ ) is still piecewise linear with θ → rely on the blurred Wasserstein distance [3] θ1 θ2 Generalized Wasserstein mean µ1 µ2 µ1→2 g,θ1 µ1→2 g,θ2 Smooth generalized Wasserstein mean µ1→2 g,θk 0 π/2 π 3π/2 5 10 SWGG2 2 SWGG 2 2 L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 15 / 24
  23. SWGG SWGG with a Generalized Geodesic formulation Sliced Wasserstein Generalized

    Geodesic SWGG with a Generalized Geodesic formulation OT with a restricted constraint set Discrete optimal transport, with n = m and uniform masses W2 2 (µ1 , µ2 ) = minγ∈Γ(µ1,µ2) i,j c(xi , yj )γi,j where Γ(µ1 , µ2 ) = {γ ∈ Rn×n s.t. γ1n = 1n /n, γ⊤1n = 1n /n} (Birkhoff polytope). min-SWGG min-SWGG2 2 (µ1 , µ2 ) = minγθ∈Π(µ1,µ2) i,j c(xi , yj )γθi,j where Π(µ1 , µ2 ) = {γθ ∈ Rn×n s.t. it is constructed from the permutahedron of the proj. distributions} 0 50 100 150 200 250 300 Dimension 1.025 1.050 1.075 1.100 1.125 1.150 1.175 1.200 1.225 Ratio min − SWGG W2 2 for n = 50 0 50 100 150 200 250 300 Dimension 0 200 400 600 800 1000 1200 1400 log10 number of permutations Number permutations n = 310 Number of permutation from a line n = 310 L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 16 / 24
  24. SWGG SWGG with a Generalized Geodesic formulation Sliced Wasserstein Generalized

    Geodesic SWGG with a Generalized Geodesic formulation OT with a restricted constraint set Discrete optimal transport, with n = m and uniform masses W2 2 (µ1 , µ2 ) = minγ∈Γ(µ1,µ2) i,j c(xi , yj )γi,j where Γ(µ1 , µ2 ) = {γ ∈ Rn×n s.t. γ1n = 1n /n, γ⊤1n = 1n /n} (Birkhoff polytope). min-SWGG min-SWGG2 2 (µ1 , µ2 ) = minγθ∈Π(µ1,µ2) i,j c(xi , yj )γθi,j where Π(µ1 , µ2 ) = {γθ ∈ Rn×n s.t. it is constructed from the permutahedron of the proj. distributions} Π(µ1 , µ2 ) ⊂ Γ(µ1 , µ2 ) Gives a sample complexity similar to Sinkhorn n−1/2 measures lying on smaller dimensional subspaces has a better sample complexity than between the original measures L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 16 / 24
  25. Experimental results Computational aspects Experimental results Computational aspects Two Gaussian

    distributions µ1 and µ2 10 1000 0 10 20 30 40 Distance W2 2 =32.4 10 1000 Number of projections 0 100 200 300 W2 2 =346.1 maxSW (optimized) SW (Monte Carlo) SWGG Optimized 10 1000 0 1000 2000 3000 4000 W2 2 =3836.0 d = 20 d = 2 d = 200 102 103 104 105 Number of samples in each distribution 10−3 10−2 10−1 100 101 102 103 Seconds SW, L =200 min-SWGG, L =200 min-SWGG optim Factored Coupling Wasserstein Sinkhorn max-SW SRW L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 17 / 24
  26. Experimental results Gradient flows Experimental results Gradient flows Initial µ1

    : uniform distribution, different target distributions 0 250 500 750 1000 1250 1500 1750 2000 Number of iterations −4 −3 −2 −1 0 Log10 (W2 ) SW max-SW min-SWGG (random search) min-SWGG (optim) PWD 0 250 500 750 1000 1250 1500 1750 2000 −4 −3 −2 −1 0 0 250 500 750 1000 1250 1500 1750 2000 Number of iterations −3 −2 −1 0 Log10 (W2 ) 0 250 500 750 1000 1250 1500 1750 2000 0.5 1.0 1.5 2.0 2.5 3.0 Gaussian 500d L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 18 / 24
  27. Experimental results Pan sharpening / image colorization Experimental results Pan

    sharpening / image colorization, using the map One distribution is supported on a line Construct a super-resolution multi-chromatic satellite image from a high-resolution mono-chromatic image (source) and low-resolution multi-chromatic image (target) L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 19 / 24
  28. Experimental results Point cloud matchings Experimental results Point cloud matchings,

    using the map Iterative Closest Point iterative algorithm for aligning point clouds Based on several one-to-one correspondences between points n 500 3000 150 000 NN 3.54 (0.02) 96.9 (0.30) 23.3 (59.37) OT 0.32 (0.18) 48.4 (58.46) · min-SWGG 0.05 (0.04) 37.6 (0.90) 6.7 (105.75) (the lower the better, timings into parenthesis) L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 20 / 24
  29. Experimental results Optimal transport dataset distances Experimental results Optimal transport

    dataset distances For computing distances between datasets Cumbersome to compute in practice since it lays down on solving multiple OT problems 1 1 0.8 1.1 1.2 1.2 1.3 1 1.3 1.2 1.3 0.8 1 1.2 1.2 1 1 1.3 1.2 1 MNIST EMNIST Fashion KMNIST USPS MNIST EMNIST Fashion KMNIST USPS 0.9 0.9 0.7 0.9 1.1 1 1.2 0.9 1.2 1.3 1.2 0.8 0.8 1.1 1.1 0.9 0.9 1.2 1.1 0.9 MNIST EMNIST Fashion KMNIST USPS MNIST EMNIST Fashion KMNIST USPS Figure: OTDD results (×102) distances for min-SWGG (left) and Sinkhorn divergence (right) for various datasets. L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 21 / 24
  30. Conclusion Conclusion Sliced Wasserstein Generalized Geodesic provides an upper bound

    for Wasserstein comes with an associated transport map has a O(Lnd + n log(n)) complexity has good statistical properties Not the only approximation method based on a pivot measure Factored coupling [4], where ν = arg minµ∈P(Rk) W2 2 (µ, µ1) + W2 2 (µ, µ1) Exact OT Source samples Target samples Factored OT Template samples HROT (exact) HROT (thresholded) Partial OT Subspace detours [6], where ν = arg minν∈P(Rd) W2 2 (PE # µ1, ν) + W2 2 (ν, PE # µ2) Some open questions how do the Birkhoff polytope and the considered permutahedron relate? concentration results? extension to incomparable spaces through a pivot measure? L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 22 / 24
  31. Fast Optimal Transport through Sliced Generalized Wasserstein Geodesics Joint work

    with Guillaume Mahey, Gilles Gasso, Clément Bonet and Nicolas Courty NeurIPS 2023 [5] Laetitia Chapel [email protected] IRISA, Rennes, France Institut Agro Rennes-Angers Workshop on Optimal Transport: from theory to applications, Berlin 2024
  32. Bibliography Bibliography I [1] Luigi Ambrosio, Nicola Gigli, and Giuseppe

    Savaré. Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media, 2005. [2] Thomas M Cover. “The number of linearly inducible orderings of points in d-space”. In: SIAM Journal on Applied Mathematics 15.2 (1967), pp. 434–439. [3] Jean Feydy. “Geometric data analysis, beyond convolutions”. PhD thesis. École Normale Supérieure de Cachan, 2020. [4] Aden Forrow et al. “Statistical optimal transport via factored couplings”. In: The 22nd International Conference on Artificial Intelligence and Statistics. PMLR. 2019, pp. 2454–2465. [5] Guillaume Mahey et al. “Fast Optimal Transport through Sliced Generalized Wasserstein Geodesics”. In: Advances in Neural Information Processing Systems 36 (2024). [6] Boris Muzellec and Marco Cuturi. “Subspace detours: Building transport plans that are optimal on subspace projections”. In: Advances in Neural Information Processing Systems 32 (2019). L. Chapel ·Fast OT through SWGG ·Workshop on Optimal Transport: from theory to applications, Berlin 2024 24 / 24