Julie Delon (Université Paris-Cité, France) Optimal Transport with Invariances between Gaussian Mixture Models
WORKSHOP ON OPTIMAL TRANSPORT
FROM THEORY TO APPLICATIONS
INTERFACING DYNAMICAL SYSTEMS, OPTIMIZATION, AND MACHINE LEARNING
Venue: Humboldt University of Berlin, Dorotheenstraße 24
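measures.
Cost: $c(x, y) = d(x, y)^p$ with $p \geq 1$ and $d$ a distance.
$$W_p(\mu_0, \mu_1) = \left( \inf_{\gamma \in \Pi(\mu_0, \mu_1)} \iint c(x, y)\, d\gamma(x, y) \right)^{1/p} = \inf_{(X,Y) \sim (\mu_0, \mu_1)} \left[ \mathbb{E}\left[ d^p(X, Y) \right] \right]^{1/p}$$
[Figure labels: $\mu_0$, $\mu_1$, $c(x, y)$, $x$, $y$]
$p = 2$ or $1$ often used in applications.
Wasserstein distances and barycenters [Solomon et al. 2015]
Wasserstein barycenter of $(\nu_i)_{i \in \{1,\dots,p\}}$ for weights $\lambda_i$ with $\sum_i \lambda_i = 1$:
$$\nu^* \in \operatorname{argmin}_{\rho} \sum_{i=1}^{p} \lambda_i W_2^2(\nu_i, \rho)$$
[Agueh, Carlier 2011]: existence and uniqueness of $\nu^*$ if the $\nu_i$ vanish on small sets.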
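The definitions above become explicit on the real line. As a minimal sketch, assuming two uniform empirical measures with the same number of samples and $d(x, y) = |x - y|$, the optimal coupling matches sorted samples, and the $W_2$ barycenter averages sorted samples with the weights $\lambda_i$. The function names `wasserstein_1d` and `barycenter_1d` are hypothetical helpers introduced for illustration, not part of the talk.

```python
def wasserstein_1d(xs, ys, p=2):
    """W_p between two uniform empirical measures on the real line.

    Assumes both sample lists have the same length, so the monotone
    (sorted) matching is an optimal coupling for any p >= 1.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("both sample lists must be non-empty and of equal length")
    if p < 1:
        raise ValueError("p must be >= 1")
    pairs = zip(sorted(xs), sorted(ys))
    mean_cost = sum(abs(x - y) ** p for x, y in pairs) / len(xs)
    return mean_cost ** (1.0 / p)


def barycenter_1d(samples, weights):
    """W_2 barycenter of uniform empirical measures on the real line.

    In 1D the barycenter's quantile function is the weighted average of
    the input quantile functions, i.e. a weighted average of sorted samples.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all sample lists must have the same length")
    if len(weights) != len(samples) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("need one weight per measure, summing to 1")
    sorted_samples = [sorted(s) for s in samples]
    return [
        sum(w * s[k] for w, s in zip(weights, sorted_samples))
        for k in range(n)
    ]
```

For instance, `wasserstein_1d([0.0, 1.0], [2.0, 1.0])` compares a two-point measure with its translate by 1. In higher dimension no such sorting shortcut exists, and the infimum over couplings has to be solved as a linear program.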
$\tfrac{1}{2}(\delta_{-1} + \delta_{1})$
Density of $\mu_t$:
$$f_t(x) = \frac{1}{1-t} \left( g\left( \frac{x+t}{1-t} \right) \mathbf{1}_{x < -t} + g\left( \frac{x-t}{1-t} \right) \mathbf{1}_{x > t} \right)$$
$\mu_0 = \mathcal{N}(0, 1)$
$\mu_1 = \tfrac{1}{2} \mathcal{N}(-5, 0.1) + \tfrac{1}{2} \mathcal{N}(5, 0.1)$
OT plans / barycenters between GMM are usually not GMM themselves
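The density of $\mu_t$ given on this slide can be evaluated directly. The sketch below rests on two assumptions that the extracted text does not state outright: that $g$ is the standard Gaussian density, and that the interpolation runs toward two Dirac masses at $-1$ and $1$, so that $f_t(x) = \frac{1}{1-t}\left( g\left(\frac{x+t}{1-t}\right) \mathbf{1}_{x<-t} + g\left(\frac{x-t}{1-t}\right) \mathbf{1}_{x>t} \right)$. The helper names are hypothetical.

```python
import math


def gaussian_density(x):
    """Standard normal density (assumed here to be the slide's g)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def interpolated_density(x, t, g=gaussian_density):
    """Evaluate f_t(x) for 0 <= t < 1, as reconstructed from the slide."""
    if not 0.0 <= t < 1.0:
        raise ValueError("t must lie in [0, 1)")
    scale = 1.0 - t
    if x < -t:
        return g((x + t) / scale) / scale
    if x > t:
        return g((x - t) / scale) / scale
    return 0.0  # no mass on the gap [-t, t]


def total_mass(t, lo=-10.0, hi=10.0, n=200_000):
    """Midpoint-rule check that f_t integrates to (approximately) 1."""
    h = (hi - lo) / n
    return h * sum(interpolated_density(lo + (i + 0.5) * h, t) for i in range(n))
```

For $0 < t < 1$ the density vanishes on $[-t, t]$ and consists of two truncated, rescaled half-Gaussians. No finite Gaussian mixture has this shape, which illustrates the slide's remark that interpolants between GMMs usually leave the GMM family.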
than $K_0 + K_1 - 1$ components!
Optimal plan for $MW_2$:
$$\gamma^*(x, y) = \sum_{k,l} w^*_{k,l}\, p_{\mu_0^k}(x)\, \delta_{y = T_{k,l}(x)}$$
$w^*$ = solution of the $K_0 \times K_1$ OT pb.
$T_{k,l}$ = optimal map between $\mu_0^k$ and $\mu_1^l$
Barycenters for $MW_2$: GMM solution with fewer than $K_0 + K_1 + \dots + K_{I-1} - I + 1$ components
[Figure panels: $MW_2$, $W_2$]
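The weights $w^*$ come from a small discrete OT problem between the mixture weights, with the squared $W_2$ distance between Gaussian components as ground cost. A minimal sketch for one-dimensional GMMs follows, where $W_2^2(\mathcal{N}(m_0, s_0^2), \mathcal{N}(m_1, s_1^2)) = (m_0 - m_1)^2 + (s_0 - s_1)^2$. Solving the problem with `scipy.optimize.linprog` is a choice made for this illustration, not necessarily the solver used in the talk, and `mw2_squared_1d` is a hypothetical name.

```python
import numpy as np
from scipy.optimize import linprog


def mw2_squared_1d(weights0, means0, stds0, weights1, means1, stds1):
    """Squared MW2 between two 1D GMMs, plus the optimal K0 x K1 plan w*."""
    w0, m0, s0 = (np.asarray(a, dtype=float) for a in (weights0, means0, stds0))
    w1, m1, s1 = (np.asarray(a, dtype=float) for a in (weights1, means1, stds1))
    k0, k1 = len(w0), len(w1)

    # Ground cost: closed-form W2^2 between 1D Gaussian components.
    cost = (m0[:, None] - m1[None, :]) ** 2 + (s0[:, None] - s1[None, :]) ** 2

    # Marginal constraints on the row-major flattened plan.
    row_sums = np.kron(np.eye(k0), np.ones((1, k1)))
    col_sums = np.kron(np.ones((1, k0)), np.eye(k1))
    # Drop one column constraint: it is implied by the others.
    a_eq = np.vstack([row_sums, col_sums[:-1]])
    b_eq = np.concatenate([w0, w1[:-1]])

    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun, res.x.reshape(k0, k1)


# Usage: one Gaussian against a two-component mixture.
# Reading 0.1 as a standard deviation is an assumption of this example.
value, plan = mw2_squared_1d([1.0], [0.0], [1.0],
                             [0.5, 0.5], [-5.0, 5.0], [0.1, 0.1])
```

Each nonzero entry of `plan` contributes one Gaussian-to-Gaussian transport term to $\gamma^*$. A vertex solution of this linear program has at most $K_0 + K_1 - 1$ nonzero entries, which is where the bound on the number of components comes from.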
2013), (Solomon et al., 2016), (Vayer et al., 2019).
Computational GW:
• Entropic GW [Peyré et al., 2016, Solomon et al., 2016]
• Sliced GW [Vayer et al., 2019]
• Minibatch GW [Fatras et al., 2021]
• Low-Rank GW [Scetbon et al., 2022]
• Quantized GW [Chowdhury et al., 2022]…
$MGW_2$: Fit GMMs on data and compute the $MGW_2$ dist. between the two GMMs
Computational cost of the method ≈ fitting the GMMs