Hamiltonian flows on graphs

Slide 1

Slide 1 text

Hamiltonian flows on graphs

Wuchen Li, UCLA, 2019. Joint work with Shui-Nee Chow (GT) and Haomin Zhou (GT).

Slide 2

Slide 2 text

Introduction: Hamiltonian flow

Consider a second-order ODE $\ddot{x} = -\nabla V(x)$. Denote the momentum $p = \dot{x}$; then $(x, p)$ satisfies the Hamiltonian system

$$\dot{x} = p, \qquad \dot{p} = -\nabla V(x),$$

which conserves the Hamiltonian $H(x, p) = \frac{p^2}{2} + V(x)$.
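As a minimal numerical sketch of the conservation claim, the snippet below integrates the system with a leapfrog (Störmer-Verlet) scheme. The harmonic potential $V(x) = x^2/2$, the step size, and the function names are illustrative assumptions, not part of the talk.

```python
import math

def grad_V(x):
    # Illustrative assumption: harmonic potential V(x) = x^2 / 2, so grad V(x) = x.
    return x

def hamiltonian(x, p):
    # H(x, p) = p^2 / 2 + V(x) for the harmonic potential above.
    return 0.5 * p * p + 0.5 * x * x

def leapfrog(x, p, dt, steps):
    """Integrate x' = p, p' = -grad V(x) with the kick-drift-kick leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * grad_V(x)   # half kick
        x += dt * p                 # drift
        p -= 0.5 * dt * grad_V(x)   # half kick
    return x, p

x0, p0 = 1.0, 0.0
x1, p1 = leapfrog(x0, p0, dt=0.01, steps=1000)   # integrate up to t = 10
energy_drift = abs(hamiltonian(x1, p1) - hamiltonian(x0, p0))
print(energy_drift)
```

The scheme is symplectic, so the energy error stays bounded at the order of the squared step size instead of growing with time.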

Slide 3

Slide 3 text

Introduction: Law of Hamiltonian flows

The finite-dimensional Hamiltonian flow connects to a pair of PDEs

$$\partial_t \rho(t, x) + \nabla_x \cdot \big(\rho(t, x)\nabla_x S(t, x)\big) = 0,$$
$$\partial_t S(t, x) + \frac{1}{2}|\nabla_x S(t, x)|^2 + V(x) = 0,$$

which conserve the total Hamiltonian

$$H(\rho, S) = \int_{\mathbb{R}^d} \Big(\frac{1}{2}|\nabla S(x)|^2 \rho(x) + V(x)\rho(x)\Big)\,dx.$$

In this talk, we build a dynamical-systems viewpoint of Hamiltonian flows via optimal transport: the law of a Hamiltonian flow is itself a Hamiltonian flow in probability space.

Slide 4

Slide 4 text

Hamiltonian system + Optimal transport

Related to mean field games (Lasry, Lions, Gangbo); related to weak KAM theory (Evans); related to the 2-Wasserstein metric (Brenier, Villani, Ambrosio); related to Schrödinger equations (Nelson, Carlen, Lafferty); related to the Schrödinger bridge problem (Carlen, Yasue, Léonard).

Slide 5

Slide 5 text

History Remark

Brownian motion (1905); Schrödinger equation (1926); Schrödinger bridge (1931); Nelson process (1966); Optimal transport + Hamiltonian system (recently).

Slide 6

Slide 6 text

Optimal transport

What is the optimal way to move or transport a mountain with shape X and density $\rho^0(x)$ to another shape Y with density $\rho^1(y)$? The problem was first introduced by Monge in 1781 and relaxed by Kantorovich in the 1940s. It induces a metric on the set of probability densities, called the optimal transport distance, Wasserstein metric, or Earth Mover's distance.

Slide 7

Slide 7 text

Overview

Optimal transport has many different formulations from various angles: mapping/Monge-Ampère equation; linear programming; geometry/fluid dynamics. These have been considered by Otto, Kinderlehrer, Villani, McCann, Carlen, Lott, Sturm, Gangbo, Jordan, Evans, Brenier, Benamou, Ambrosio, Gigli, Savaré and many more. In this talk, we mainly follow its symplectic geometry formulation in a discrete setting, and build the Hamiltonian flows for modeling and numerical computations.

Slide 8

Slide 8 text

Density manifold

Rewrite

$$\|x - y\|^2 = \inf_{\gamma(t)} \Big\{ \int_0^1 \|v(t)\|^2\,dt \;:\; \dot{\gamma}(t) = v(t),\ \gamma(0) = x,\ \gamma(1) = y \Big\}.$$

The distance has an optimal control formulation (Benamou-Brenier 2000). Let $X_0(x) = x$ and $X_1(x) = y$; then the distance is given by

$$\inf_{v} \int_0^1 \mathbb{E}_{x \sim \rho^0} \|v(t, X_t(x))\|^2\,dt,$$

where $\mathbb{E}$ is the expectation operator and the infimum runs over all vector fields $v_t$ such that

$$\dot{X}_t(x) = v(t, X_t(x)), \qquad X_0 \sim \rho^0, \qquad X_1 \sim \rho^1.$$

Under this metric, the probability set has a Riemannian geometry structure¹.

¹ John D. Lafferty, The density manifold and configuration space quantization, 1988.
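A short worked step, added here as a reading aid, shows why the path formulation recovers the Euclidean distance. It uses only the Cauchy-Schwarz (Jensen) inequality on the unit time interval.

```latex
% For any admissible path with \gamma(0) = x, \gamma(1) = y and \dot\gamma = v:
\|y - x\|^2
  = \Big\| \int_0^1 v(t)\,dt \Big\|^2
  \le \int_0^1 \|v(t)\|^2\,dt ,
% with equality if and only if v is constant in time, i.e. v(t) = y - x,
% which corresponds to the straight line \gamma(t) = (1 - t)\,x + t\,y .
```

The infimum is therefore attained by the constant-velocity straight line, and its value is $\|x - y\|^2$.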

Slide 9

Slide 9 text

Density manifold

Slide 10

Slide 10 text

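Brownian motion and Entropy Production

The gradient flow of the entropy

$$H(\rho) = \int_{\mathbb{R}^d} \rho(x)\log\rho(x)\,dx$$

with respect to the optimal transport metric is

$$\frac{\partial \rho}{\partial t} = \nabla \cdot (\rho \nabla \log \rho) = \Delta \rho.$$

Entropy dissipation:

$$-\frac{d}{dt} H(\rho) = \int_{\mathbb{R}^d} |\nabla \log \rho|^2 \rho\,dx = I(\rho).$$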
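The dissipation identity follows from one integration by parts. The derivation below is a sketch that assumes $\rho$ is smooth, positive, and decays fast enough for boundary terms to vanish.

```latex
\frac{d}{dt} H(\rho)
  = \int_{\mathbb{R}^d} (\log\rho + 1)\,\partial_t\rho \,dx
  = \int_{\mathbb{R}^d} (\log\rho + 1)\,\nabla\cdot(\rho\nabla\log\rho)\,dx
  = -\int_{\mathbb{R}^d} \nabla\log\rho \cdot \rho\,\nabla\log\rho \,dx
  = -I(\rho).
```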

Slide 11

Slide 11 text

Goal: Hamiltonian flows on graphs

Question: Can we consider Hamiltonian flows, e.g. the Schrödinger equation or the Schrödinger bridge problem, on finite graphs?

Answer: Yes. We need to build a discrete optimal transport metric on the probability simplex over finite states. Using this metric, we build the associated Hamiltonian flows in the probability simplex.

Recent developments on Hamiltonian flows and optimal transport: Chow, Li, Zhou, Gangbo, Leger, Mou...

Slide 12

Slide 12 text

Basic setting

Graph with finite vertices: $G = (V, E, \omega)$, where $V = \{1, \cdots, n\}$, $E$ is the edge set, and $\omega$ is the weight set.

Probability set:

$$P(G) = \Big\{ (\rho_i)_{i=1}^n \;\Big|\; \sum_{i=1}^n \rho_i = 1,\ \rho_i \ge 0 \Big\}.$$

Example: consider the discrete space $\{1, 2, 3\}$ with a graph structure. [Figure: graph on $\{1, 2, 3\}$ and the probability simplex it forms.]

Slide 13

Slide 13 text

Definition I

We plan to find the discrete analog of the density manifold (Maas, Mielke, Chow). First, it is natural to define a vector field on a graph,

$$v = (v_{ij})_{(i,j)\in E}, \qquad v_{ij} = -v_{ji}.$$

Given a potential $S = (S_i)_{i=1}^n$, a gradient vector field refers to

$$(\nabla_G S)_{ij} = \sqrt{\omega_{ij}}\,(S_i - S_j),$$

where $\omega_{ij} = \omega_{ji}$ is the weight function on an edge.
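The definition of the gradient vector field can be sketched in a few lines. The three-vertex path graph, its weights, and the name `graph_gradient` are hypothetical choices made only for illustration.

```python
import math

# Hypothetical path graph 0 - 1 - 2 with symmetric edge weights (an assumption).
weights = {(0, 1): 1.0, (1, 2): 4.0}

def graph_gradient(S, weights):
    """Return (grad_G S)_{ij} = sqrt(w_ij) * (S_i - S_j) for both edge orientations."""
    grad = {}
    for (i, j), w in weights.items():
        grad[(i, j)] = math.sqrt(w) * (S[i] - S[j])
        grad[(j, i)] = -grad[(i, j)]   # antisymmetry: v_ij = -v_ji
    return grad

S = [3.0, 1.0, 0.0]
g = graph_gradient(S, weights)
print(g)
```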

Slide 14

Slide 14 text

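Definition II

We next define an inner product of two vector fields $v^1$, $v^2$:

$$(v^1, v^2)_\rho := \frac{1}{2} \sum_{(i,j)\in E} v^1_{ij}\, v^2_{ij}\, \theta_{ij}(\rho),$$

and a divergence of a vector field $v$ at $\rho \in P(G)$:

$$\mathrm{div}_G(\rho v) := -\Big( \sum_{j \in N(i)} \sqrt{\omega_{ij}}\, v_{ij}\, \theta_{ij}(\rho) \Big)_{i=1}^n.$$

Here $\theta$ represents the probability weight on the edge,

$$\theta_{ij}(\rho) = \frac{1}{2}\Big( \frac{\omega_{ij}}{\sum_{i' \in N(i)} \omega_{ii'}}\,\rho_i + \frac{\omega_{ij}}{\sum_{j' \in N(j)} \omega_{jj'}}\,\rho_j \Big).$$

Other choices of $\theta$ are possible.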
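The inner product and divergence are built so that the gradient and the negative divergence are adjoint: $(\nabla_G S, v)_\rho = -\sum_i S_i\, \mathrm{div}_G(\rho v)_i$. The sketch below checks this numerically. It assumes a hypothetical triangle graph, sums over both orientations of every edge, and uses the arithmetic mean $\theta_{ij} = (\rho_i + \rho_j)/2$ as one admissible choice of $\theta$; the identity only needs $\theta_{ij} = \theta_{ji}$.

```python
import math

# Hypothetical triangle graph with symmetric weights (an assumption).
weights = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 0.5}
edges = {}
for (i, j), w in weights.items():   # store both orientations of each edge
    edges[(i, j)] = w
    edges[(j, i)] = w

def theta(rho, i, j):
    # Arithmetic-mean edge weight: one possible symmetric choice (an assumption).
    return 0.5 * (rho[i] + rho[j])

def inner(v1, v2, rho):
    """(v1, v2)_rho = 1/2 * sum over oriented edges of v1_ij * v2_ij * theta_ij."""
    return 0.5 * sum(v1[e] * v2[e] * theta(rho, *e) for e in edges)

def divergence(v, rho):
    """div_G(rho v)_i = - sum_{j in N(i)} sqrt(w_ij) * v_ij * theta_ij."""
    out = [0.0] * len(rho)
    for (i, j), w in edges.items():
        out[i] -= math.sqrt(w) * v[(i, j)] * theta(rho, i, j)
    return out

def gradient(S):
    return {(i, j): math.sqrt(w) * (S[i] - S[j]) for (i, j), w in edges.items()}

rho = [0.2, 0.3, 0.5]
S = [1.0, -2.0, 0.5]
v = {(0, 1): 0.7, (1, 0): -0.7, (1, 2): -1.1, (2, 1): 1.1, (0, 2): 0.4, (2, 0): -0.4}

lhs = inner(gradient(S), v, rho)
rhs = -sum(S[i] * d for i, d in enumerate(divergence(v, rho)))
total_divergence = sum(divergence(v, rho))   # vanishes, so total mass is preserved
print(lhs, rhs, total_divergence)
```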

Slide 15

Slide 15 text

Optimal transport distance on a graph

The metric for any $\rho^0, \rho^1 \in P_o(G)$ is

$$W(\rho^0, \rho^1)^2 := \inf_{v} \Big\{ \int_0^1 (v, v)_\rho\,dt \;:\; \frac{d\rho}{dt} + \mathrm{div}_G(\rho v) = 0,\ \rho(0) = \rho^0,\ \rho(1) = \rho^1 \Big\}.$$

Here $W$ is the proposed Wasserstein metric on the graph. Later on, we show that it introduces a metric tensor structure on the density manifold.

Slide 16

Slide 16 text

Hodge decomposition

Continuous state: $v(x) = \nabla S(x) + u(x)$, where $v(x)$ is a given vector field, $\nabla S$ is the gradient vector field, and $u$ is divergence free with respect to a density $\rho$, i.e. $\nabla \cdot (\rho u) = 0$.

Graph: $v = \nabla_G S + u$, where $v = (v_{ij})$ is a discrete vector field, $\nabla_G S = \big(\sqrt{\omega_{ij}}\,(S_i - S_j)\big)$ is the discrete gradient vector field, and divergence free on a graph means $\mathrm{div}_G(\rho u) = 0$.
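One way to compute the graph decomposition is to solve the weighted Laplacian system $-\mathrm{div}_G(\rho \nabla_G S) = -\mathrm{div}_G(\rho v)$ for $S$ and set $u = v - \nabla_G S$. The sketch below does this on a hypothetical unit-weight triangle graph with the arithmetic-mean $\theta$ (both assumptions), fixing the last entry of $S$ to zero because $S$ is only determined up to a constant. The helper names are illustrative.

```python
import math

weights = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}   # hypothetical triangle graph
edges = {}
for (i, j), w in weights.items():
    edges[(i, j)] = w
    edges[(j, i)] = w
n = 3

def theta(rho, i, j):
    return 0.5 * (rho[i] + rho[j])   # arithmetic-mean choice (an assumption)

def gradient(S):
    return {(i, j): math.sqrt(w) * (S[i] - S[j]) for (i, j), w in edges.items()}

def divergence(v, rho):
    out = [0.0] * n
    for (i, j), w in edges.items():
        out[i] -= math.sqrt(w) * v[(i, j)] * theta(rho, i, j)
    return out

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    m = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        x[r] = (M[r][m] - sum(M[r][k] * x[k] for k in range(r + 1, m))) / M[r][r]
    return x

def hodge(v, rho):
    """Split v = grad_G S + u with div_G(rho u) = 0, normalised by S[n-1] = 0."""
    L = [[0.0] * n for _ in range(n)]            # L(rho) S = -div_G(rho grad_G S)
    for (i, j), w in edges.items():
        L[i][i] += w * theta(rho, i, j)
        L[i][j] -= w * theta(rho, i, j)
    b = [-d for d in divergence(v, rho)]         # right-hand side -div_G(rho v)
    A = [row[: n - 1] for row in L[: n - 1]]     # drop last row/column (S[n-1] = 0)
    S = solve(A, b[: n - 1]) + [0.0]
    g = gradient(S)
    u = {e: v[e] - g[e] for e in edges}
    return S, u

rho = [0.2, 0.3, 0.5]
v = {(0, 1): 1.0, (1, 0): -1.0, (1, 2): 0.5, (2, 1): -0.5, (0, 2): -0.3, (2, 0): 0.3}
S, u = hodge(v, rho)
residual = max(abs(d) for d in divergence(u, rho))
print(S, residual)
```

The chosen $v$ has non-zero circulation around the triangle, so the divergence-free part $u$ is non-trivial here.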

Slide 17

Slide 17 text

Variational formulation

Lemma. The discrete Wasserstein metric is equivalent to

$$W(\rho^0, \rho^1)^2 = \inf_{\nabla_G S} \int_0^1 (\nabla_G S, \nabla_G S)_\rho\,dt,$$

where the infimum is taken among all discrete potential vector fields $\nabla_G S$ such that

$$\frac{d\rho}{dt} + \mathrm{div}_G(\rho \nabla_G S) = 0, \qquad \rho(0) = \rho^0, \qquad \rho(1) = \rho^1.$$

This metric gives $P_o(G)$ a Riemannian geometry structure.

Slide 18

Slide 18 text

Discrete probability manifold

Denote $-\mathrm{div}_G(\rho \nabla_G S) = L(\rho) S$. Then the metric in $P_o(G)$ is equivalent to

$$W(\rho^0, \rho^1)^2 = \inf \Big\{ \int_0^1 \dot{\rho}^T L(\rho)^{-1} \dot{\rho}\,dt \;:\; \rho(0) = \rho^0,\ \rho(1) = \rho^1 \Big\}.$$

Slide 19

Slide 19 text

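Wasserstein metric tensor

Here $L(\rho) \in \mathbb{R}^{n \times n}$ is the linear weighted Laplacian matrix function

$$L(\rho) = -\mathrm{div}_G(\rho \nabla_G) = D^T \Theta(\rho) D,$$

with the sign fixed so that $S^T L(\rho) S = (\nabla_G S, \nabla_G S)_\rho \ge 0$. Here $D \in \mathbb{R}^{|E| \times |V|}$ is a discrete gradient matrix with

$$D_{(i,j),k} = \begin{cases} 1 & i = k,\ j \in N(i), \\ -1 & j = k,\ j \in N(i), \\ 0 & \text{otherwise}, \end{cases}$$

$-D^T \in \mathbb{R}^{|V| \times |E|}$ is the corresponding discrete divergence matrix, and $\Theta \in \mathbb{R}^{|E| \times |E|}$ is a diagonal weight matrix

$$\Theta_{(i,j),(k,l)} = \begin{cases} \dfrac{\rho_i + \rho_j}{2} & (i, j) = (k, l) \in E, \\ 0 & \text{otherwise}. \end{cases}$$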
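The matrix factorisation can be checked on a small example. The sketch below assembles $D^T \Theta(\rho) D$ for a hypothetical unweighted path graph with one row of $D$ per edge (an assumption), taking the sign for which $L(\rho)$ is positive semidefinite, as the appearance of $L(\rho)^{-1}$ in the metric requires. It then compares the quadratic form $S^T L(\rho) S$ with the edge sum $\sum_{(i,j)} \theta_{ij} (S_i - S_j)^2$.

```python
# Hypothetical unweighted path graph 0 - 1 - 2 (an assumption): one row of D per edge.
edge_list = [(0, 1), (1, 2)]
n = 3

def laplacian(rho):
    """Assemble L(rho) = D^T Theta(rho) D with Theta = diag((rho_i + rho_j) / 2)."""
    D = [[0.0] * n for _ in edge_list]
    for e, (i, j) in enumerate(edge_list):
        D[e][i], D[e][j] = 1.0, -1.0
    th = [0.5 * (rho[i] + rho[j]) for (i, j) in edge_list]
    return [[sum(D[e][a] * th[e] * D[e][b] for e in range(len(edge_list)))
             for b in range(n)] for a in range(n)]

rho = [0.2, 0.3, 0.5]
L = laplacian(rho)
S = [1.0, -1.0, 2.0]

# Quadratic form S^T L(rho) S versus the edge-wise sum of theta_ij * (S_i - S_j)^2.
quad = sum(S[a] * L[a][b] * S[b] for a in range(n) for b in range(n))
edge_sum = sum(0.5 * (rho[i] + rho[j]) * (S[i] - S[j]) ** 2 for (i, j) in edge_list)
print(quad, edge_sum)
```

Each row of $L(\rho)$ sums to zero, so $L(\rho)$ is singular; on the tangent space of the simplex (vectors summing to zero) the inverse in the metric can be read as a pseudo-inverse.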

Slide 20

Slide 20 text

Probability manifold: Geodesics

Consider the Lagrangian $\mathcal{L}(\rho, \dot{\rho}) = \frac{1}{2} \dot{\rho}^T L(\rho)^{-1} \dot{\rho}$. The geodesic satisfies the Euler-Lagrange equation

$$\frac{d}{dt} \nabla_{\dot{\rho}} \mathcal{L}(\rho, \dot{\rho}) = \nabla_\rho \mathcal{L}(\rho, \dot{\rho}),$$

which can be written as the following second-order ODE²

$$\ddot{\rho} + \Gamma^W_\rho(\dot{\rho}, \dot{\rho}) = 0.$$

Here $\Gamma^W_\rho$ is the Christoffel symbol of the probability manifold.

² Wuchen Li, Geometry of probability simplex via optimal transport, 2018.

Slide 21

Slide 21 text

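Christoffel symbol

The Christoffel symbol is the coefficient of the quadratic form

$$\Gamma^W_\rho(\dot{\rho}, \dot{\rho}) = -L(\dot{\rho}) L(\rho)^{-1} \dot{\rho} + \frac{1}{2} L(\rho) \big( \nabla_G L(\rho)^{-1} \dot{\rho},\ \nabla_G L(\rho)^{-1} \dot{\rho} \big).$$

It is also new in the continuous domain:

$$\Gamma^W_\rho(\dot{\rho}, \dot{\rho}) = -\Delta_{\dot{\rho}} \Delta_\rho^{-1} \dot{\rho} - \frac{1}{2} \Delta_\rho \big( \nabla \Delta_\rho^{-1} \dot{\rho},\ \nabla \Delta_\rho^{-1} \dot{\rho} \big).$$

Thus we propose to study the following second-order ODE:

$$\ddot{\rho} - \Delta_{\dot{\rho}} \Delta_\rho^{-1} \dot{\rho} - \frac{1}{2} \Delta_\rho \big( \nabla \Delta_\rho^{-1} \dot{\rho},\ \nabla \Delta_\rho^{-1} \dot{\rho} \big) = 0,$$

where $\Delta_\rho = \nabla \cdot (\rho \nabla)$. Interestingly, it can be rewritten as a first-order ODE system, as the Hamiltonian formulation shows.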

Slide 22

Slide 22 text

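Hamiltonian formulation

By the Legendre transform, i.e.

$$H(\rho, S) = \sup_{\dot{\rho}} \big\{ \dot{\rho}^T S - \mathcal{L}(\rho, \dot{\rho}) \big\},$$

the geodesic satisfies the Hamiltonian system

$$\frac{d}{dt} \begin{pmatrix} \rho \\ S \end{pmatrix} = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} \begin{pmatrix} \frac{\partial}{\partial \rho} H \\ \frac{\partial}{\partial S} H \end{pmatrix},$$

where $H(\rho, S) = \frac{1}{2} S^T L(\rho) S$.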
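The Legendre transform can be carried out explicitly. The steps below are a sketch that writes the Lagrangian as $\frac{1}{2}\dot\rho^T L(\rho)^{-1}\dot\rho$ and reads $L(\rho)^{-1}$ as the inverse on the tangent space $\{\dot\rho : \sum_i \dot\rho_i = 0\}$ (an assumption, since $L(\rho)$ annihilates constants).

```latex
% Stationarity in \dot\rho of  \dot\rho^T S - \tfrac12 \dot\rho^T L(\rho)^{-1} \dot\rho :
S = L(\rho)^{-1}\dot\rho
  \quad\Longleftrightarrow\quad
\dot\rho = L(\rho)\,S .
% Substituting back:
H(\rho, S) = S^T L(\rho) S - \tfrac12 S^T L(\rho) S = \tfrac12 S^T L(\rho) S .
% Hamilton's equations then read
\dot\rho = \partial_S H = L(\rho)\,S ,
\qquad
\dot S_i = -\partial_{\rho_i} H = -\tfrac12\, S^T \big(\partial_{\rho_i} L(\rho)\big) S .
```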

Slide 23

Slide 23 text

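Hamiltonian system on probability manifold

We write the above Hamiltonian system explicitly. Denote

$$H(\rho, S) = \frac{1}{2} \sum_{(i,j)\in E} \omega_{ij} (S_i - S_j)^2\, \theta_{ij}(\rho);$$

then the geodesic satisfies the continuity equation $\dot{\rho} + \mathrm{div}_G(\rho \nabla_G S) = 0$ together with a discrete Hamilton-Jacobi equation:

$$\dot{\rho}_i + \sum_{j \in N(i)} \omega_{ij} (S_j - S_i)\, \theta_{ij}(\rho) = 0,$$
$$\dot{S}_i + \frac{1}{2} \sum_{j \in N(i)} \omega_{ij} (S_i - S_j)^2\, \frac{\partial \theta_{ij}(\rho)}{\partial \rho_i} = 0.$$

They are the discrete analog of

$$H(\rho, S) = \frac{1}{2} \int (\nabla S(x), \nabla S(x))\, \rho(x)\,dx,$$

with continuity equation and Hamilton-Jacobi equation

$$\partial_t \rho(t, x) + \nabla \cdot (\rho(t, x) \nabla S(t, x)) = 0,$$
$$\partial_t S(t, x) + \frac{1}{2} (\nabla S(t, x))^2 = 0.$$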
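A small numerical experiment illustrates the conservation of $H$ and of total mass along the discrete flow. The sketch assumes a hypothetical unit-weight triangle graph, the arithmetic-mean $\theta_{ij} = (\rho_i + \rho_j)/2$ (so $\partial \theta_{ij} / \partial \rho_i = 1/2$), each undirected edge counted once in $H$, and a classical fourth-order Runge-Kutta integrator. It uses $\dot\rho = \partial_S H$ and $\dot S = -\partial_\rho H$.

```python
# Hypothetical triangle graph with unit weights (an assumption).
edges = [(0, 1), (1, 2), (0, 2)]          # each undirected edge listed once
nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
n = 3

def H(rho, S):
    # H = 1/2 * sum over undirected edges of (S_i - S_j)^2 * theta_ij(rho).
    return 0.5 * sum((S[i] - S[j]) ** 2 * 0.5 * (rho[i] + rho[j]) for i, j in edges)

def rhs(state):
    rho, S = state[:n], state[n:]
    # d rho_i / dt = dH/dS_i = sum_j (S_i - S_j) * theta_ij
    d_rho = [sum((S[i] - S[j]) * 0.5 * (rho[i] + rho[j]) for j in nbrs[i])
             for i in range(n)]
    # d S_i / dt = -dH/d rho_i = -1/2 * sum_j (S_i - S_j)^2 * (1/2)
    d_S = [-0.5 * sum((S[i] - S[j]) ** 2 * 0.5 for j in nbrs[i]) for i in range(n)]
    return d_rho + d_S

def rk4_step(state, dt):
    def shifted(base, k, c):
        return [x + c * y for x, y in zip(base, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

state = [0.2, 0.3, 0.5, 0.1, 0.0, -0.1]   # (rho_0, rho_1, rho_2, S_0, S_1, S_2)
H0 = H(state[:n], state[n:])
for _ in range(1000):                      # integrate up to t = 1
    state = rk4_step(state, 1e-3)
H1 = H(state[:n], state[n:])
mass = sum(state[:n])
print(H0, H1, mass)
```

Runge-Kutta is not symplectic, so conservation here holds only up to the (small) integration error over this short time horizon.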

Slide 24

Slide 24 text

General Hamiltonian flow

We mainly consider the second-order ODE in the probability manifold:

$$\ddot{\rho} + \Gamma^W_\rho(\dot{\rho}, \dot{\rho}) = -\mathrm{grad}_W F(\rho).$$

In Hamiltonian formalism, it represents

$$\dot{\rho}_i + \sum_{j \in N(i)} \omega_{ij} (S_j - S_i)\, \theta_{ij}(\rho) = 0,$$
$$\dot{S}_i + \frac{1}{2} \sum_{j \in N(i)} \omega_{ij} (S_i - S_j)^2\, \frac{\partial \theta_{ij}}{\partial \rho_i} = \nabla_{\rho_i} F(\rho).$$

Two important examples of Hamiltonian flows include the Schrödinger equation and the Schrödinger bridge problem.

Slide 25

Slide 25 text

Entropy production on graphs

An important energy in the Hamiltonian flows also arises from a gradient flow, where it is known as the entropy production. The gradient flow of the Shannon entropy

$$\mathcal{S}(\rho) = \sum_{i=1}^n \rho_i \log \rho_i$$

in $(P(G), W)$ is the diffusion process on a graph:

$$\frac{d\rho}{dt} = -\mathrm{grad}_W \mathcal{S}(\rho) = \mathrm{div}_G(\rho \nabla_G \log \rho).$$

The dissipation of entropy defines the Fisher information on a graph:

$$I(\rho) = (\mathrm{grad}_W \mathcal{S}(\rho), \mathrm{grad}_W \mathcal{S}(\rho))_\rho = \frac{1}{2} \sum_{(i,j)\in E} (\log \rho_i - \log \rho_j)^2\, \theta_{ij}(\rho).$$

Many interesting topics have been extracted from this observation, e.g. entropy dissipation, Log-Sobolev inequalities, Ricci curvature, and the Yano formula (Annals of Mathematics, 1952, 7 pages).
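The relation between entropy dissipation and Fisher information can be checked with a single small time step. The sketch assumes a hypothetical unit-weight path graph and the arithmetic-mean $\theta$; each undirected edge is counted once in `fisher`, which matches the factor $\frac{1}{2}$ in front of a sum over both orientations.

```python
import math

edges = [(0, 1), (1, 2)]            # hypothetical path graph, unit weights
nbrs = {0: [1], 1: [0, 2], 2: [1]}
n = 3

def theta(rho, i, j):
    return 0.5 * (rho[i] + rho[j])   # arithmetic-mean choice (an assumption)

def entropy(rho):
    return sum(r * math.log(r) for r in rho)

def fisher(rho):
    # Sum over undirected edges of (log rho_i - log rho_j)^2 * theta_ij.
    return sum((math.log(rho[i]) - math.log(rho[j])) ** 2 * theta(rho, i, j)
               for i, j in edges)

def flow_rhs(rho):
    # d rho_i / dt = div_G(rho grad_G log rho)_i
    #             = sum_{j in N(i)} (log rho_j - log rho_i) * theta_ij
    return [sum((math.log(rho[j]) - math.log(rho[i])) * theta(rho, i, j)
                for j in nbrs[i]) for i in range(n)]

rho = [0.6, 0.3, 0.1]
dt = 1e-6
new = [r + dt * d for r, d in zip(rho, flow_rhs(rho))]   # one explicit Euler step
dissipation = -(entropy(new) - entropy(rho)) / dt        # finite-difference -dS/dt
print(dissipation, fisher(rho))
```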

Slide 26

Slide 26 text

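Example 1: Schrödinger equations

Consider

$$\ddot{\rho} + \Gamma^W_\rho(\dot{\rho}, \dot{\rho}) = -\mathrm{grad}_W F(\rho) \quad \text{with} \quad F(\rho) = -I(\rho).$$

Then its Hamiltonian formulation in $(\rho, S)$ recovers the following quantum hydrodynamics on graphs:

$$\dot{\rho}_i + \sum_{j \in N(i)} \omega_{ij} (S_j - S_i)\, \theta_{ij}(\rho) = 0,$$
$$\dot{S}_i + \frac{1}{2} \sum_{j \in N(i)} \omega_{ij} (S_i - S_j)^2\, \frac{\partial \theta_{ij}(\rho)}{\partial \rho_i} = -\partial_{\rho_i} I(\rho).$$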

Slide 27

Slide 27 text

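Hamiltonian structure Laplacian operator

Interestingly, the Madelung transform³ on graphs,

$$\Psi = \sqrt{\rho}\, e^{\sqrt{-1}\, S},$$

introduces the following Hamilton-structure-keeping Laplacian operator:

$$\sqrt{-1}\, \frac{d\Psi_i}{dt} = \frac{1}{2} \Psi_i \Big\{ \sum_{j \in N(i)} (\log \Psi_i - \log \Psi_j)\, \frac{\theta_{ij}}{|\Psi_i|^2} + \sum_{j \in N(i)} |\log \Psi_i - \log \Psi_j|^2\, \frac{\partial \theta_{ij}}{\partial |\Psi_i|^2} \Big\}.$$

It is a consistent scheme for the Schrödinger equation

$$\sqrt{-1}\, \partial_t \Psi_t = \frac{1}{2} \Delta \Psi_t.$$

³ Nelson, Quantum diffusion, 1985.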
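The change of variables itself is easy to sketch: the pair $(\rho, S)$ maps to a complex vector $\Psi$ with $|\Psi_i|^2 = \rho_i$ and phase $S_i$. The function names below are hypothetical, and the phase is recovered only modulo $2\pi$.

```python
import cmath
import math

def madelung(rho, S):
    """Map (rho, S) to Psi_i = sqrt(rho_i) * exp(sqrt(-1) * S_i)."""
    return [math.sqrt(r) * cmath.exp(1j * s) for r, s in zip(rho, S)]

def inverse_madelung(psi):
    """Recover rho_i = |Psi_i|^2 and S_i = arg(Psi_i), with S_i in (-pi, pi]."""
    return [abs(z) ** 2 for z in psi], [cmath.phase(z) for z in psi]

rho = [0.2, 0.3, 0.5]
S = [0.1, -0.4, 1.2]
psi = madelung(rho, S)
rho_back, S_back = inverse_madelung(psi)
print(rho_back, S_back)
```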

Slide 28

Slide 28 text

Two points Schrödinger equation

Slide 29

Slide 29 text

Example: Ground state

Compute the ground state via

$$\min_{\rho \in P(G)} \frac{h^2}{8} I(\rho) + \mathcal{V}(\rho) + \mathcal{W}(\rho).$$

[Figure: density $\rho$ plotted against $x \in [-1, 1]$ for h = 1, h = 0.1, h = 0.01.]

Figure: The plot of the ground state's density function. The blue, black, and red curves represent h = 1, 0.1, 0.01, respectively.

Slide 30

Slide 30 text

Example 2: Schrödinger bridge problem

Consider

$$\ddot{\rho} + \Gamma^W_\rho(\dot{\rho}, \dot{\rho}) = -\mathrm{grad}_W F(\rho) \quad \text{with} \quad F(\rho) = I(\rho).$$

Then its Hamiltonian formulation in $(\rho, S)$ recovers the Schrödinger bridge on graphs. It has many other formulations connecting with mean field games and diffusion processes. Its Lagrangian formulation can be viewed as a regularized version of optimal transport.

Slide 31

Slide 31 text

Discussion

We establish Hamiltonian flows on the probability simplex over finite graphs. The Fisher information is added as a diffusion perturbation to the proposed Hamiltonian flows.

Slide 32

Slide 32 text

Main references

Edward Nelson, Derivation of the Schrödinger Equation from Newtonian Mechanics, 1966.
Shui-Nee Chow, Wuchen Li and Haomin Zhou, Entropy dissipation of Fokker-Planck equations on finite graphs, 2017.
Shui-Nee Chow, Wuchen Li and Haomin Zhou, A Schrödinger equation on finite graphs via optimal transport, 2017.
Shui-Nee Chow, Wuchen Li, Chenchen Mou and Haomin Zhou, A Schrödinger bridge problem on finite graphs via optimal transport, 2018.
Wuchen Li, Geometry of probability simplex via optimal transport, 2018.