Discrete Entropic Wasserstein Flows

Gabriel Peyré
February 19, 2015

Talk given at the Banff Optimal Transport Workshop.

Transcript

  3. Entropy Regularized Transport

    (minus) Entropy: $E(\pi) \overset{\text{def.}}{=} \sum_{i,j} \pi_{i,j}(\log(\pi_{i,j}) - 1) + \iota_{\mathbb{R}^+}(\pi_{i,j})$
    Regularized distance: $W_\gamma(p,q) \overset{\text{def.}}{=} \min \{ \langle \pi, c \rangle + \gamma E(\pi) \;;\; \pi \in C(p,q) \}$
    $\pi_\gamma \overset{\text{def.}}{=} \operatorname{argmin} \{ \langle \pi, c \rangle + \gamma E(\pi) \;;\; \pi \in C(p,q) \}$
    [Schrödinger 1931] Used in economics [Galichon, Salanié 2008] and machine learning [Cuturi 2013].
    (Figure: cost $c$ and couplings $\pi_\gamma$.)
  6. The Impact of Regularization

    $S \overset{\text{def.}}{=} \operatorname{argmin} \{ \langle \pi, c \rangle \;;\; \pi \in C(p,q) \}$
    Proposition:
    $\pi_\gamma \to \operatorname{argmin}_{\pi \in S} E(\pi)$ and $W_\gamma(p,q) \to W(p,q)$ as $\gamma \to 0$.
    $\pi_\gamma \to p q^T$ and $\frac{1}{\gamma} W_\gamma(p,q) \to E(p) + E(q)$ as $\gamma \to +\infty$.
    (Figure: marginals $p$, $q$ and couplings $\pi_\gamma$.)
  10. Kullback-Leibler Projections

    KL divergence: $\mathrm{KL}(\pi|\xi) \overset{\text{def.}}{=} \sum_{i,j} \pi_{i,j} \log\left( \frac{\pi_{i,j}}{\xi_{i,j}} \right) + \xi_{i,j} - \pi_{i,j}$
    One has: $\langle \pi, c \rangle + \gamma E(\pi) = \gamma\, \mathrm{KL}(\pi|\xi) + C$ where $\xi = e^{-c/\gamma}$ and $C = -\gamma \sum_{i,j} \xi_{i,j}$ does not depend on $\pi$.
    Proposition: $W_\gamma(p,q) = \gamma \min \{ \mathrm{KL}(\pi|\xi) \;;\; \pi \in C(p,q) \} + C$ and
    $\pi_\gamma = \mathrm{Proj}_{C(p,q)}(\xi) \overset{\text{def.}}{=} \operatorname{argmin} \{ \mathrm{KL}(\pi|\xi) \;;\; \pi \in C(p,q) \}$
    Constraint splitting: $C(p,q) = C_1 \cap C_2$ with
    $C_1 = \{ \pi \in (\mathbb{R}^+)^{N \times N} \;;\; \pi \mathbb{1} = p \}$, $C_2 = \{ \pi \in (\mathbb{R}^+)^{N \times N} \;;\; \pi^T \mathbb{1} = q \}$.
    (Figure: coupling $\pi$ with marginals $p$ and $q$.)
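The identity relating the regularized cost to the KL divergence can be checked numerically. The sketch below is only an illustration: the size `N`, the value of `gamma`, the random cost and the random positive matrix are arbitrary assumptions, and the constant is taken to be $C = -\gamma \sum_{i,j} \xi_{i,j}$, which follows from expanding the definition of KL.

```python
import numpy as np

def neg_entropy(pi):
    """(minus) entropy E(pi) = sum_ij pi_ij (log(pi_ij) - 1), for pi > 0."""
    return float(np.sum(pi * (np.log(pi) - 1.0)))

def kl(pi, xi):
    """KL(pi|xi) = sum_ij pi_ij log(pi_ij / xi_ij) + xi_ij - pi_ij."""
    return float(np.sum(pi * np.log(pi / xi) + xi - pi))

rng = np.random.default_rng(0)     # arbitrary test data (assumption)
N, gamma = 5, 0.3
c = rng.random((N, N))             # cost matrix
pi = rng.random((N, N)) + 0.1      # any strictly positive matrix
xi = np.exp(-c / gamma)            # Gibbs kernel xi = exp(-c / gamma)

lhs = float(np.sum(pi * c)) + gamma * neg_entropy(pi)
const = -gamma * float(np.sum(xi)) # the constant C, independent of pi
rhs = gamma * kl(pi, xi) + const
print(abs(lhs - rhs))              # difference is at rounding level
```

Because the constant does not depend on $\pi$, minimizing the regularized cost over $C(p,q)$ and KL-projecting $\xi$ onto $C(p,q)$ select the same coupling.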
  13. Sinkhorn / IPFP Algorithm

    Iterative Bregman projections [Bregman 1957]: $\pi^{(0)} = \xi$, $\pi^{(\ell+1)} = \mathrm{Proj}_{C_{\ell \,\%\, K}}(\pi^{(\ell)})$
    Theorem: If $\{C_i\}_i$ are affine sets, $\pi^{(\ell)} \to \mathrm{Proj}_{C_1 \cap \ldots \cap C_K}(\xi)$.
    Fixed marginals: $C_1 \overset{\text{def.}}{=} \{ \pi \;;\; \pi \mathbb{1} = p \}$, $C_2 \overset{\text{def.}}{=} \{ \pi \;;\; \pi^T \mathbb{1} = q \}$.
    Proposition: $\mathrm{Proj}_{C_1}(\pi) = \operatorname{diag}\left( \frac{p}{\pi \mathbb{1}} \right) \pi$, $\mathrm{Proj}_{C_2}(\pi) = \pi \operatorname{diag}\left( \frac{q}{\pi^T \mathbb{1}} \right)$.
    (Figure: iterates $\xi, \pi^{(1)}, \pi^{(2)}, \pi^{(3)}, \pi^{(4)}, \pi^{(5)}$ converging to $\pi$.)
  21. Diagonal Scaling, Fast Implementation

    Sinkhorn algorithm [Sinkhorn 1967] [Deming, Stephan 1940]: $\pi^{(0)} = \xi$,
    $\pi^{(2\ell+1)} = \operatorname{diag}(p / \pi^{(2\ell)} \mathbb{1})\, \pi^{(2\ell)}$, $\pi^{(2\ell+2)} = \pi^{(2\ell+1)} \operatorname{diag}(q / \pi^{(2\ell+1),T} \mathbb{1})$
    Proposition: $\pi_\gamma = \operatorname{diag}(u_\gamma)\, \xi \operatorname{diag}(v_\gamma)$ where $\xi = e^{-c/\gamma}$, and $\pi^{(\ell)} = \operatorname{diag}(u^{(\ell)})\, \xi \operatorname{diag}(v^{(\ell)})$.
    Sinkhorn, revisited: $v^{(0)} = \mathbb{1}$, $u^{(\ell)} = \frac{p}{\xi v^{(\ell)}}$, $v^{(\ell+1)} = \frac{q}{\xi^T u^{(\ell)}}$
    → Only matrix-vector multiplications.
    → Highly parallelizable.
    → Extension to Riemannian manifolds [Solomon et al 2015].
    → Extension to barycenters and more [Benamou et al 2015].
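The scaling form of Sinkhorn's iterations translates almost line for line into NumPy. The following is a minimal sketch, not the speaker's implementation: the function name `sinkhorn`, the fixed iteration count, and the 1-D example data (grid, Gaussian-shaped marginals, `gamma = 0.2`) are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn(p, q, c, gamma, n_iter=1000):
    """Entropic transport by diagonal scaling: pi = diag(u) xi diag(v)."""
    xi = np.exp(-c / gamma)          # Gibbs kernel xi = exp(-c / gamma)
    v = np.ones_like(q)              # v^(0) = 1
    for _ in range(n_iter):
        u = p / (xi @ v)             # u^(l)   = p / (xi v^(l))
        v = q / (xi.T @ u)           # v^(l+1) = q / (xi^T u^(l))
    return u[:, None] * xi * v[None, :]

# Hypothetical example: two bumps on a 1-D grid, squared-distance cost.
x = np.linspace(0.0, 1.0, 20)
c = (x[:, None] - x[None, :]) ** 2
p = np.exp(-(x - 0.3) ** 2 / 0.05); p /= p.sum()
q = np.exp(-(x - 0.7) ** 2 / 0.05); q /= q.sum()

pi = sinkhorn(p, q, c, gamma=0.2)
print(np.abs(pi.sum(axis=1) - p).max())  # row-marginal error
print(np.abs(pi.sum(axis=0) - q).max())  # column-marginal error
```

Each iteration costs two matrix-vector products. Smaller values of `gamma` give sharper couplings but need more iterations, and very small values can underflow in `np.exp`.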
  25. Translation-invariant Ground Metrics

    Assuming $c_{i,j} = \varphi_{i-j}$ on a discrete grid (e.g. periodic b.c.):
    $\xi v = k \star v$ where $k \overset{\text{def.}}{=} e^{-\varphi/\gamma}$
    Example: $c_{i,j} = \|x_i - x_j\|^2$, $k$ = Gaussian filter.
    Convolutive Sinkhorn: $v^{(\ell+1)} = q \odot \left( k \star \left( p \odot (k \star v^{(\ell)})^{-1} \right) \right)^{-1}$
    where $a \odot b \overset{\text{def.}}{=} (a_i b_i)_i$ and $\star \overset{\text{def.}}{=}$ convolution.
    → $\xi v$ computed in $O(N \log(N))$ operations (FFT, IIR approximation).
    (Figure: marginals $p$, $q$ and iterates $\pi^{(\ell)}$ as $\ell$ increases.)
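On a periodic grid the kernel product reduces to a circular convolution, which the FFT evaluates in $O(N \log N)$. The sketch below illustrates this in 1-D under stated assumptions: the grid size, `gamma`, the two densities and the helper name `apply_xi` are illustrative choices, and the dense kernel is built only to check the FFT shortcut.

```python
import numpy as np

N, gamma = 64, 0.05
t = np.arange(N) / N                     # periodic grid on [0, 1)
phi = np.minimum(t, 1.0 - t) ** 2        # phi_k = squared periodic distance
k = np.exp(-phi / gamma)                 # filter k = exp(-phi / gamma)
k_hat = np.fft.rfft(k)                   # precomputed once

def apply_xi(v):
    """Compute xi v = k * v (circular convolution) via FFT."""
    return np.fft.irfft(k_hat * np.fft.rfft(v), n=N)

# Hypothetical marginals on the circle.
p = 1.5 + np.cos(2 * np.pi * t); p /= p.sum()
q = 1.5 - np.cos(2 * np.pi * t); q /= q.sum()

v = np.ones(N)
for _ in range(1000):
    u = p / apply_xi(v)
    v = q / apply_xi(u)   # phi is even, so xi is symmetric: xi^T u = k * u

# Dense kernel xi_ij = k_{(i - j) mod N}, only for verification.
idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
xi_dense = k[idx]
print(np.abs(xi_dense @ v - apply_xi(v)).max())
```

The coupling $\operatorname{diag}(u)\, \xi \operatorname{diag}(v)$ is never formed during the iterations; only the two scaling vectors are stored.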
  30. JKO Flow - Theory

    Implicit Euler step [Jordan, Kinderlehrer, Otto 1998]: $p_{t+1} = \operatorname{argmin}_{p \in \Sigma_N} W(p_t, p) + \tau f(p)$
    Formal limit $\tau \to 0$: $\partial_t p = \operatorname{div}(p \nabla (f'(p)))$
    $f(p) = \int p\, w$ (advection): $\partial_t p = \operatorname{div}(p \nabla w)$
    $f(p) = \int p \log(p)$ (heat diffusion): $\partial_t p = \Delta p$
    $f(p) = \frac{1}{m-1} \int p^m$ (non-linear diffusion): $\partial_t p = \Delta p^m$
    (Figure: two panels, each showing the potential $\cos(w)$ and the evolution $p_t$.)
  33. JKO Flow - Numerics

    $(\star)$ $p_{t+1} = \operatorname{argmin}_p W(p_t, p) + \tau f(p)$
    Pros:
    → intrinsic discretization (mass conservation).
    → deals with non-smooth energies.
    → (sometimes) exposes displacement convexity.
    → no CFL condition (implicit stepping).
    Cons:
    → $(\star)$ is hard to solve . . .
    Approach labels: 1-D, Lagrangian, Eulerian, (moving meshes), (warpings), (particle systems), (finite volumes), (gradient of convex functions), (linearization), [Burger, Franek, Schönlieb 2012] (interior point).
    References: [Kinderlehrer, Walkington 1999] [Blanchet, Calvez, Carrillo 2008] [Agueh, Bowles 2013] [Matthes and Osberger 2014] [Carrillo and Moll 2009] [Benamou, Carlier, Mérigot, Oudet 2014] [Westdickenberg and Wilkening 2010] [Budd, Cullen and Walsh 2012] [Burger, Carrillo, Wolfram 2010] [Carrillo, Chertock and Huang 2014]
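  34. Entropic JKO and KL Optimization

    $\min_{p \in \Sigma_N} W_\gamma(q, p) + \tau f(p) \iff \min_\pi \mathrm{KL}(\pi|\xi) + \varphi_1(\pi) + \varphi_2(\pi)$, with $p = \pi \mathbb{1}$
    $\xi \overset{\text{def.}}{=} e^{-c/\gamma} \in \mathbb{R}^{N \times N}_{+,*}$
    $\varphi_1(\pi) = \iota_{C_q}(\pi)$, $C_q \overset{\text{def.}}{=} \{ \pi \;;\; \pi^T \mathbb{1} = q \}$
    $\varphi_2(\pi) \overset{\text{def.}}{=} \frac{\tau}{\gamma} f(\pi \mathbb{1})$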
  38. Dykstra’s Algorithm

    $(\star)$ $\min_\pi \mathrm{KL}(\pi|\xi) + \varphi_1(\pi) + \varphi_2(\pi)$
    Proximal operator: $\mathrm{Prox}_g(\pi) \overset{\text{def.}}{=} \operatorname{argmin}_{\tilde\pi} \mathrm{KL}(\tilde\pi|\pi) + g(\tilde\pi)$
    Initialization: $\pi^{(0)} \overset{\text{def.}}{=} y$ (with $y = \xi$ for $(\star)$), $z^{(0)} = z^{(-1)} \overset{\text{def.}}{=} \mathbb{1}$
    Iterations: $\pi^{(\ell)} \overset{\text{def.}}{=} \mathrm{Prox}_{\varphi_{\ell \,\%\, 2}}(\pi^{(\ell-1)} \odot z^{(\ell-2)})$, $z^{(\ell)} \overset{\text{def.}}{=} z^{(\ell-2)} \odot \frac{\pi^{(\ell-1)}}{\pi^{(\ell)}}$
    Theorem: $\pi^{(\ell)} \to \pi$, solution of $(\star)$.
    Proof: Dykstra is block-coordinate minimization on the dual,
    $\min_{u_1, u_2} E^*(\nabla E(y) - u_1 - u_2) + \varphi_1^*(u_1) + \varphi_2^*(u_2)$
  40. Proximal Maps For Entropic Wasserstein Flows

    $\min_\pi \mathrm{KL}(\pi|\xi) + \varphi_1(\pi) + \varphi_2(\pi)$
    $\varphi_1(\pi) = \iota_{C_q}(\pi)$, $C_q \overset{\text{def.}}{=} \{ \pi \;;\; \pi^T \mathbb{1} = q \}$
    Proposition: $\mathrm{Prox}_{\varphi_1}(\pi) = \pi \operatorname{diag}\left( \frac{q}{\pi^T \mathbb{1}} \right)$
    $\varphi_2(\pi) \overset{\text{def.}}{=} \frac{\tau}{\gamma} f(\pi \mathbb{1})$
    Proposition: $\mathrm{Prox}_{\varphi_2}(\pi) = \operatorname{diag}\left( \frac{\mathrm{Prox}_{\frac{\tau}{\gamma} f}(\pi \mathbb{1})}{\pi \mathbb{1}} \right) \pi$
  43. Dykstra For Entropic Wasserstein Flows

    Dykstra’s iterates: $(\pi^{(\ell)}, z^{(\ell)})$
    Proposition: One has $\pi^{(\ell)} = \operatorname{diag}(a^{(\ell)})\, \xi \operatorname{diag}(b^{(\ell)})$ and $z^{(\ell)} = u^{(\ell)} v^{(\ell),T}$, where
    $a^{(0)} = b^{(0)} = u^{(0)} = v^{(0)} = \mathbb{1}$ (and $u^{(-1)} = v^{(-1)} = \mathbb{1}$),
    $u^{(\ell)} = u^{(\ell-2)} \odot \frac{a^{(\ell-1)}}{a^{(\ell)}}$, $v^{(\ell)} = v^{(\ell-2)} \odot \frac{b^{(\ell-1)}}{b^{(\ell)}}$
    Odd $\ell$: $a^{(\ell)} = a^{(\ell-1)} \odot u^{(\ell-2)}$, $b^{(\ell)} = \frac{q}{\xi^T(a^{(\ell)})}$
    Even $\ell$: $b^{(\ell)} = b^{(\ell-1)} \odot v^{(\ell-2)}$, $a^{(\ell)} = \frac{p^{(\ell)}}{\xi(b^{(\ell)})}$, $p^{(\ell)} \overset{\text{def.}}{=} \mathrm{Prox}^{\mathrm{KL}}_{\frac{\tau}{\gamma} f}\left( a^{(\ell-1)} \odot u^{(\ell-2)} \odot \xi(b^{(\ell)}) \right)$
    → Only matrix/vector multiplications $\xi(a)$, $\xi^T(a)$.
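One entropic JKO step can be sketched by running Dykstra's iterations directly on dense matrices, which is less efficient than the scaling form with vectors $(a, b, u, v)$ but easier to read. Everything specific below is an assumption made for illustration: the function names, the 1-D grid, the potential `w`, the values of `gamma` and `tau`, and the choice $f(p) = \iota_{[0,\kappa]^N}(p) + \langle w, p \rangle$, whose KL proximal map for a weight $\sigma$ is the clipped reweighting $\min(e^{-\sigma w} \odot p, \kappa)$.

```python
import numpy as np

def prox_kl_crowd(r, w, kappa, sigma):
    """KL prox of sigma * f for f(p) = indicator([0, kappa]^N) + <w, p>."""
    return np.minimum(np.exp(-sigma * w) * r, kappa)

def jko_step(q, xi, prox_f, n_iter=200):
    """One entropic JKO step by Dykstra's algorithm on N x N matrices.

    Odd iterations enforce the column marginal pi^T 1 = q; even iterations
    apply the KL prox of (tau / gamma) * f to the row marginal pi 1.
    """
    pi = xi.copy()                    # pi^(0) = xi
    z_old = np.ones_like(xi)          # z^(l-2), initially z^(-1) = 1
    z_cur = np.ones_like(xi)          # z^(l-1), initially z^(0) = 1
    for l in range(1, n_iter + 1):
        y = pi * z_old                # pi^(l-1) * z^(l-2), entrywise
        if l % 2 == 1:
            new = y * (q / y.sum(axis=0))[None, :]
        else:
            r = y.sum(axis=1)
            new = (prox_f(r) / r)[:, None] * y
        z_old, z_cur = z_cur, y / new # z^(l) = z^(l-2) * pi^(l-1) / pi^(l)
        pi = new
    return pi.sum(axis=1), pi         # new density p = pi 1, and coupling

# Hypothetical 1-D example; all values are illustrative assumptions.
N, gamma, tau = 30, 0.05, 0.1
x = np.linspace(0.0, 1.0, N)
xi = np.exp(-(x[:, None] - x[None, :]) ** 2 / gamma)
w = np.cos(2 * np.pi * x)                          # potential
q = np.exp(-(x - 0.2) ** 2 / 0.02); q /= q.sum()   # previous density p_t
kappa = q.max()                                    # congestion bound
sigma = tau / gamma

# Pure advection (no congestion bound), then advection with congestion.
p_adv, _ = jko_step(q, xi, lambda r: prox_kl_crowd(r, w, np.inf, sigma))
p_crowd, _ = jko_step(q, xi, lambda r: prox_kl_crowd(r, w, kappa, sigma))
print(p_adv.max(), p_crowd.max(), kappa)
```

With an even `n_iter` the last update is the proximal step on the row marginal, so the returned density satisfies the bound `kappa` up to rounding, while the column marginal matches `q` only in the limit.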
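  44. Example: Crowd Motion

    Congestion-inducing function [Maury, Roudneff-Chupin, Santambrogio 2010]: $f(p) = \iota_{[0,\kappa]^N}(p) + \langle w, p \rangle$
    Proposition: $\mathrm{Prox}_{f/\gamma}(p) = \min(e^{-w/\gamma} \odot p, \kappa)$
    (Figure: potential $\cos(w)$ and evolutions for $\kappa = \|p_{t=0}\|_\infty$, $\kappa = 2\|p_{t=0}\|_\infty$, $\kappa = 4\|p_{t=0}\|_\infty$.)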
  47. Non-Linear Diffusions

    Generalized entropies: $e_m(s) \overset{\text{def.}}{=} \begin{cases} s(\log(s) - 1) & \text{if } m = 1, \\ s\, \frac{s^{m-1} - m}{m-1} & \text{if } m > 1. \end{cases}$
    $f(p) \overset{\text{def.}}{=} \sum_i b_i e_{m_i}(p_i)$
    (Figure: plots of the functions $e_m$ and of $\mathrm{Prox}_{e_m}$ on $[0, 2]$ for $m = 1, 2, 5, 10$; flow panels "Varying m" and "Varying b".)
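The generalized entropies and a KL proximal map for them can be sketched as follows. The slide only plots $\mathrm{Prox}_{e_m}$, so the solver here is an assumption: it uses the first-order condition $\log(s/p) + \sigma e_m'(s) = 0$ of $\min_s \mathrm{KL}(s|p) + \sigma e_m(s)$, whose root lies between $p$ and $1$, and finds it by bisection. The names `e_m_prime`, `prox_kl_e_m` and the weight `sigma` are hypothetical.

```python
import numpy as np

def e_m(s, m):
    """Generalized entropy e_m(s), for s > 0 and m >= 1."""
    s = np.asarray(s, dtype=float)
    if m == 1:
        return s * (np.log(s) - 1.0)
    return s * (s ** (m - 1) - m) / (m - 1)

def e_m_prime(s, m):
    """Derivative of e_m; tends to log(s) as m -> 1."""
    if m == 1:
        return np.log(s)
    return m * (s ** (m - 1) - 1.0) / (m - 1)

def prox_kl_e_m(p, m, sigma, n_bisect=200):
    """argmin_s KL(s|p) + sigma * e_m(s), entrywise, by bisection."""
    p = np.asarray(p, dtype=float)
    lo = np.minimum(p, 1.0)          # optimality residual is <= 0 here
    hi = np.maximum(p, 1.0)          # optimality residual is >= 0 here
    for _ in range(n_bisect):
        mid = np.sqrt(lo * hi)       # geometric midpoint keeps mid > 0
        g = np.log(mid / p) + sigma * e_m_prime(mid, m)
        lo = np.where(g < 0, mid, lo)
        hi = np.where(g < 0, hi, mid)
    return np.sqrt(lo * hi)

p = np.array([0.1, 0.5, 1.0, 2.0])
s1 = prox_kl_e_m(p, m=1, sigma=0.5)  # closed form: p ** (1 / (1 + sigma))
s2 = prox_kl_e_m(p, m=2, sigma=0.5)
print(s1, s2)
```

For $m = 1$ the condition reduces to $(1 + \sigma) \log s = \log p$, which gives a closed form that the bisection can be compared against.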
  49. Optimal Transport on Surfaces

    Triangulated mesh: $M$. Geodesic distance: $d_M$.
    Ground cost: $c_{i,j} = d_M(x_i, x_j)^2$.
    Computing $c$ (Fast-Marching): $N^2 \log(N)$ → too costly.
    (Figure: level sets of $d(x_i, \cdot)$ around a point $x_i$.)
  51. Entropic Transport on Surfaces

    [Solomon et al 2015]
    Heat equation on $M$: $\partial_\gamma u_\gamma(x, \cdot) = \Delta_M u_\gamma(x, \cdot)$, $u_0(x, \cdot) = \delta_x$
    Theorem [Varadhan]: $-\gamma \log(u_\gamma) \to d_M^2$ as $\gamma \to 0$.
    Sinkhorn kernel: $\xi = e^{-d_M^2/\gamma} \approx \left( \mathrm{Id} - \frac{\gamma}{L} \Delta_M \right)^{-L}$
    Caveat: proved if $M$ is diffeomorphic to a disk . . .
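Applying the approximate kernel $(\mathrm{Id} - \frac{\gamma}{L} \Delta_M)^{-L}$ amounts to $L$ implicit Euler steps of the heat equation, each one a linear solve. The sketch below only illustrates that mechanism: a path-graph Laplacian on a 1-D grid stands in for the mesh Laplacian $\Delta_M$, and the function names, grid size, `gamma` and `n_steps` are assumptions. It does not attempt to reproduce the time scaling in Varadhan's formula.

```python
import numpy as np

def path_laplacian(n, h):
    """Graph Laplacian of a path with n nodes and spacing h (rows sum to 0)."""
    lap = np.zeros((n, n))
    i = np.arange(n - 1)
    lap[i, i + 1] = 1.0
    lap[i + 1, i] = 1.0
    lap -= np.diag(lap.sum(axis=1))
    return lap / h ** 2

def apply_heat_kernel(v, lap, gamma, n_steps):
    """Apply (Id - gamma / n_steps * lap)^(-n_steps) to v by repeated solves."""
    a = np.eye(lap.shape[0]) - (gamma / n_steps) * lap
    for _ in range(n_steps):
        v = np.linalg.solve(a, v)
    return v

n = 40
lap = path_laplacian(n, h=1.0 / (n - 1))
delta = np.zeros(n)
delta[10] = 1.0                       # Dirac mass at node 10

row = apply_heat_kernel(delta, lap, gamma=0.01, n_steps=10)
kernel = apply_heat_kernel(np.eye(n), lap, gamma=0.01, n_steps=10)
print(row.sum(), row.min())           # mass is preserved, entries stay >= 0
```

In Sinkhorn-type iterations only products of the kernel with a vector are needed, so one would factor the matrix `a` once and reuse the factorization rather than form `kernel` explicitly.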
  52. Crowd Motion with Obstacles

    $M$ = sub-domain of $\mathbb{R}^2$.
    (Figure: potential $\cos(w)$ and evolutions for $\kappa / \|p_{t=0}\|_\infty = 1, 2, 4, 6$.)
  56. Crowd Motion on a Surface

    $M$ = triangulated mesh.
    (Figure: potential $\cos(w)$ and evolutions for $\kappa / \|p_{t=0}\|_\infty = 1$ and $6$.)
  58. Non-convex Functionals

    Congestion-inducing function:
    $h(p) = \iota_{[0,\kappa]^N}(p)$ (convex)
    $h(p) = \iota_{\{0,\kappa\}^N}(p)$ (non-convex)
    (Figure: plots of $\mathrm{Prox}_h$ in the convex and non-convex cases, with axis marks at $\kappa$ and $\kappa/e$.)
  63. Conclusion

    JKO discrete flows:
    → Advection, diffusion, non-smooth nonlinearities.
    Entropic regularization:
    → Trade Wasserstein vs. KL divergence.
    Heat kernel approximation:
    → Seamless computations on manifolds.
    Open problem:
    → $W_\gamma$ is not a metric: no limiting flow as $\tau \to 0$.
    → Requires $\gamma \sim \tau^2 \to 0$.