Slide 1

Population games via discrete optimal transport

Wuchen Li, UCLA

April 28, 2017

Slide 2

Games

A game consists of: players; strategies; payoffs.

Example: Rock-Paper-Scissors
Players: 2;
Strategies: S_1 = S_2 = {Rock, Paper, Scissors};
Payoffs: F_1, F_2 : S_1 × S_2 → {+1, 0, −1}.

Slide 3

Example: Prisoner’s Dilemma

Slide 4

Finite players’ games

Finite players’ games model the strategic interactions among N players.
Player v receives a payoff depending on all players’ strategies: F_v : S_1 × · · · × S_N → R.
Each player faces his/her own payoff problem:
  max_{x_v ∈ S_v} F_v(x_1, · · · , x_v, · · · , x_N), v ∈ {1, · · · , N}.
People study a particular status in games, named a Nash equilibrium (NE), meaning that no player has an incentive to change his/her current strategy unilaterally. A strategy profile (x*_1, · · · , x*_N) is a NE if
  F_v(· · · , x*_{v−1}, x*_v, x*_{v+1}, · · · ) ≥ F_v(· · · , x*_{v−1}, x_v, x*_{v+1}, · · · )
for every player v and every x_v ∈ S_v.

Slide 5

Stag hunt (Population game)

Players: infinitely many;
Strategy set: S = {C, D};
Players form a population state (ρ_C, ρ_D) with ρ_C + ρ_D = 1;
Payoffs: F(ρ) = (F_C(ρ), F_D(ρ))^T = Aρ, where
  A = [ 3 0 ; 2 2 ],
meaning a deer is worth 6 (split between the two hunters) and a rabbit is worth 2.
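The payoff map F(ρ) = Aρ above can be sketched in a few lines; this is a minimal illustration (not the authors' code), using the stag hunt matrix from the slide:

```python
# Minimal sketch: payoffs F(rho) = A rho for the stag hunt,
# with A = [[3, 0], [2, 2]] as on the slide.

A = [[3.0, 0.0],
     [2.0, 2.0]]

def payoff(rho):
    """Return (F_C(rho), F_D(rho)) = A @ rho for a population state rho."""
    return [sum(A[i][j] * rho[j] for j in range(2)) for i in range(2)]

# All-stag population: cooperators earn 3, defectors would earn 2.
print(payoff([1.0, 0.0]))   # [3.0, 2.0]
# At rho_C = 2/3 the two strategies earn approximately the same payoff 2.
print(payoff([2 / 3, 1 / 3]))
```

The state ρ_C = 2/3 where both rows of Aρ agree is the mixed equilibrium of this game.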

Slide 6

Population game

Population games model the strategic interactions in large populations of small, anonymous agents. They arise as a limit of finite players’ games.
Strategy set: S = {1, · · · , n};
Players (the simplex): P(S) = {(ρ_i)_{i=1}^n ∈ R^n : Σ_{i=1}^n ρ_i = 1, ρ_i ≥ 0};
Payoff function of strategy i: F_i : P(S) → R. E.g. F(ρ) = (F_i(ρ))_{i=1}^n = Aρ, where A ∈ R^{n×n}.
Applications: social networks, biological species, viruses, trading, cancer, congestion, and many more.
We plan to design new dynamics to model the evolution of a game, and to study their asymptotic properties.

Slide 7

Nash equilibrium and Potential games

Nash equilibrium (NE): players have no unilateral incentive to deviate from their current strategies. ρ* = (ρ*_i)_{i=1}^n is a Nash equilibrium if ρ*_i > 0 implies F_i(ρ*) ≥ F_j(ρ*) for all j ∈ S.
A particular type of game, named potential games, is widely considered: there exists a potential 𝓕 : P(S) → R such that
  ∂𝓕(ρ)/∂ρ_i = F_i(ρ).
E.g. if F(ρ) = Aρ with A a symmetric matrix, consider 𝓕(ρ) = (1/2) ρ^T Aρ.
In potential games, by the KKT conditions, the NEs are the critical points of
  max_ρ { 𝓕(ρ) : ρ ∈ P(S) }.
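The defining identity ∂𝓕/∂ρ_i = F_i can be checked numerically; this sketch uses a symmetric matrix from a later slide and a central finite difference (an illustration, not the paper's code):

```python
# Sketch: for symmetric A, verify numerically that the potential
# V(rho) = 0.5 * rho^T A rho satisfies dV/drho_i = F_i(rho) = (A rho)_i.

A = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0],
     [0.0, 1.0, 1.0]]  # symmetric payoff matrix (used again in a later example)

def V(rho):
    return 0.5 * sum(rho[i] * A[i][j] * rho[j] for i in range(3) for j in range(3))

def F(rho):
    return [sum(A[i][j] * rho[j] for j in range(3)) for i in range(3)]

rho, h = [0.5, 0.3, 0.2], 1e-6
for i in range(3):
    rp = rho[:]; rp[i] += h
    rm = rho[:]; rm[i] -= h
    grad_i = (V(rp) - V(rm)) / (2 * h)   # central difference in coordinate i
    assert abs(grad_i - F(rho)[i]) < 1e-6
```

Symmetry of A is what makes ∇(½ρᵀAρ) = Aρ; for a non-symmetric A the gradient is ½(A + Aᵀ)ρ instead.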

Slide 8

Evolutionary dynamics

In the literature, people have designed many dynamics, named mean or evolutionary dynamics, to model games. Typical examples are BNN (Brown-von Neumann-Nash 1950), best response dynamics (Gilboa-Matsui 1991), Logit (Fudenberg-Levine 1998), Smith dynamics (Smith 1983), and more.
One of the most widely used is the Replicator dynamics (Taylor and Jonker 1978):
  dρ_i/dt = ρ_i (F_i(ρ) − ¯F(ρ)), where ¯F(ρ) = Σ_{j∈S} ρ_j F_j(ρ).
In potential games, the Replicator dynamics is a gradient flow on the probability set P(S) w.r.t. a modified Euclidean metric (Akin 1980).
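The Replicator dynamics above can be simulated with forward Euler; a minimal sketch (not the authors' code) for the stag hunt payoff:

```python
# Sketch: forward-Euler simulation of the replicator dynamics
# d rho_i/dt = rho_i (F_i - Fbar) for the stag hunt A = [[3, 0], [2, 2]].

A = [[3.0, 0.0], [2.0, 2.0]]

def replicator_step(rho, dt):
    F = [sum(A[i][j] * rho[j] for j in range(2)) for i in range(2)]
    Fbar = sum(rho[i] * F[i] for i in range(2))       # mean payoff
    return [rho[i] + dt * rho[i] * (F[i] - Fbar) for i in range(2)]

rho = [0.9, 0.1]            # most hunters already cooperate
for _ in range(20000):      # integrate to t = 20 with dt = 1e-3
    rho = replicator_step(rho, 1e-3)

assert abs(sum(rho) - 1.0) < 1e-9   # the simplex is invariant
assert rho[0] > 0.99                # flows to the all-stag equilibrium (1, 0)
```

Starting above the mixed equilibrium ρ_C = 2/3, cooperation takes over; starting below it, the flow would instead select the all-hare state.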

Slide 9

Our Goal and Methodology

Design a new dynamics for evolutionary games with the following properties:
(i) the evolution uses only local information;
(ii) it is locally the best choice (a gradient flow in potential games);
(iii) it can include noise perturbations.
Mathematics: optimization; dynamical systems; optimal transport; Riemannian geometry; optimal control; partial differential equations; graph theory; entropy; Fisher information, etc.

Slide 10

Main results: New model via optimal transport

We introduce a new dynamics model for population games:
  dρ_i/dt = Σ_{j∈N(i)} ρ_j [F_i(ρ) − F_j(ρ) + β log(ρ_j/ρ_i)]_+ − Σ_{j∈N(i)} ρ_i [F_j(ρ) − F_i(ρ) + β log(ρ_i/ρ_j)]_+ ,
where β is a nonnegative parameter modeling the risk-taking behavior of players.
The proposed model connects deeply with Brownian motion:
Mean field stochastic processes (Ito, Einstein);
Optimal transport and entropy (Villani, Gangbo, Carlen, Otto, Brenier, Benamou, etc.);
Fisher information (see Frieden’s book).
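A minimal numerical sketch of the proposed dynamics (an assumed implementation, not the authors' code), on the two-strategy stag hunt. Since ∂F_C/∂ρ_D = ∂F_D/∂ρ_C = 0, this is a potential game with potential V(ρ) = 1.5 ρ_C² + 2 ρ_D, and the noisy potential V(ρ) − β Σ ρ log ρ should increase along the flow:

```python
import math

# Sketch: forward-Euler integration of the proposed dynamics on the
# two-strategy graph {C, D} with stag hunt payoffs A = [[3, 0], [2, 2]].

A = [[3.0, 0.0], [2.0, 2.0]]
beta = 0.5

def rhs(rho):
    F = [sum(A[i][j] * rho[j] for j in range(2)) for i in range(2)]
    bF = [F[i] - beta * math.log(rho[i]) for i in range(2)]   # noisy payoffs
    # net flow into C on the single edge (C, D):
    flow = rho[1] * max(bF[0] - bF[1], 0.0) - rho[0] * max(bF[1] - bF[0], 0.0)
    return [flow, -flow]

def noisy_potential(rho):
    V = 1.5 * rho[0] ** 2 + 2.0 * rho[1]   # potential of the stag hunt game
    return V - beta * sum(r * math.log(r) for r in rho)

rho, dt = [0.5, 0.5], 1e-3
E0 = noisy_potential(rho)
for _ in range(10000):                      # integrate to t = 10
    d = rhs(rho)
    rho = [rho[i] + dt * d[i] for i in range(2)]

assert abs(sum(rho) - 1.0) < 1e-8   # total mass is conserved
assert noisy_potential(rho) > E0    # Lyapunov: the noisy potential increases
```

Note the [·]_+ structure: mass on an edge always moves toward the strategy with the larger noisy payoff F_i − β log ρ_i.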

Slide 11

Motivation: Mean field games

Consider S = R^d as the analog of {1, · · · , n}. Our model:
  dX_t = ∇_{X_t} F(X_t, ρ) dt + √(2β) dW_t, X_t ∈ R^d,
where W_t is the standard Brownian motion (the noise level) and ρ(t, ·) is the law of X_t. Then ρ(t, x) satisfies the mean field equation
  ∂ρ/∂t + ∇_x · (ρ ∇_x F(x, ρ)) = β Δ_x ρ.
Modeling: individual players change their pure strategies in the direction that maximizes their own payoff functions most rapidly, while the Laplacian represents uncertainties.
Optimal transport: the PDE is the gradient flow equation (gradient descent) in potential games, i.e. there exists a potential 𝓕 : P(R^d) → R such that
  F(x, ρ) = δ𝓕(ρ)/δρ(x).
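The SDE above can be sampled by Euler-Maruyama; this sketch uses an assumed one-dimensional payoff F(x) = −x²/2 (so ∇F = −x, an Ornstein-Uhlenbeck process, not from the slides), whose stationary Gibbs density ∝ exp(F(x)/β) is a Gaussian with variance β:

```python
import random, math

# Sketch: Euler-Maruyama for dX = grad F(X) dt + sqrt(2 beta) dW with the
# assumed payoff F(x) = -x^2/2, i.e. dX = -X dt + sqrt(2 beta) dW.
# The stationary density is ~ exp(F(x)/beta): Gaussian with variance beta.

random.seed(0)
beta, dt, T, n_paths = 0.25, 0.01, 5.0, 5000
sigma = math.sqrt(2 * beta * dt)

xs = [0.0] * n_paths
for _ in range(int(T / dt)):
    xs = [x - x * dt + sigma * random.gauss(0.0, 1.0) for x in xs]

var = sum(x * x for x in xs) / n_paths
assert abs(var - beta) < 0.05   # empirical variance is close to beta
```

The balance between the payoff-ascent drift and the β-scaled noise is exactly what the Gibbs measures on later slides encode in the discrete setting.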

Slide 12

Our derivation

In this talk, we derive a similar mean field equation on a discrete strategy set, and show that it is a gradient flow. In order to attain this goal, we first introduce the theory of optimal transport.

Slide 13

Optimal transport

The problem was originally introduced by Monge in 1781 and relaxed by Kantorovich in the 1940s (the first example of linear programming). It induces a particular metric on the probability set, named the optimal transport distance, the Wasserstein metric, or the Earth Mover’s distance.

Slide 14

Probability Manifold

In this talk, we use an important reformulation, originated by Benamou-Brenier (2000):
  W(ρ^0, ρ^1)^2 := inf_v ∫_0^1 E v(t, X_t)^2 dt,
where E is the expectation operator and the infimum runs over all vector fields v(t, x) with
  dX_t/dt = v(t, X_t), X_0 ∼ ρ^0, X_1 ∼ ρ^1.
Under this metric, the probability set enjoys a Riemannian geometry structure.

Slide 15

Gradient flow via Optimal transport

The gradient flow of the free energy
  𝓕(ρ) = −(1/2) ∫∫_{R^d×R^d} A(x, y) ρ(x) ρ(y) dx dy + β ∫_{R^d} ρ(x) log ρ(x) dx
(interaction potential energy plus Boltzmann-Shannon entropy) w.r.t. the optimal transport metric is
  ∂ρ/∂t = ∇ · ( ρ ∇_x δ𝓕(ρ)/δρ(x) ),
which is exactly the mean field equation (Ambrosio, Gigli and Savaré).
We quote a sentence from Villani’s book (2008): the density of the gradient flow is the gradient flow in the density space.

Slide 16

Discrete optimal transport and gradient flow?

Question: can we derive a similar gradient flow for potential games on discrete strategy sets?
Answer: yes, we can. The gradient flow depends on the metric of the probability set, so we need to build a discrete optimal transport metric.

Slide 17

Basic setting

Graph with finitely many vertices: G = (S, E), S = {1, · · · , n}, E the edge set.
Noisy potential:
  ¯F(ρ) = (1/2) Σ_{i=1}^n Σ_{j=1}^n A_ij ρ_i ρ_j − β Σ_{i=1}^n ρ_i log ρ_i
(interaction potential energy plus Boltzmann-Shannon entropy), where A is a given symmetric matrix and β > 0 is a given constant.

Slide 18

Definition: Optimal transport distance on a graph

The metric for any ρ^0, ρ^1 ∈ P_o(S) is
  W(ρ^0, ρ^1)^2 := inf_v { ∫_0^1 (v, v)_ρ dt : dρ/dt + div_G(ρ v) = 0, ρ(0) = ρ^0, ρ(1) = ρ^1 },
where
  (v, v)_ρ = (1/2) Σ_{(i,j)∈E} v_ij^2 g_ij(ρ), div_G(ρ v) = −( Σ_{j∈N(i)} v_ij g_ij(ρ) )_{i=1}^n,
and g_ij is given by an upwind scheme: for j ∈ N(i),
  g_ij(ρ) = ρ_i if ∂𝓕(ρ)/∂ρ_i > ∂𝓕(ρ)/∂ρ_j;
  g_ij(ρ) = ρ_j if ∂𝓕(ρ)/∂ρ_i < ∂𝓕(ρ)/∂ρ_j;
  g_ij(ρ) = (ρ_i + ρ_j)/2 if ∂𝓕(ρ)/∂ρ_i = ∂𝓕(ρ)/∂ρ_j.
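The upwind weight can be sketched as a small helper (an assumed illustration, not the paper's code). Writing bF_i = F_i(ρ) − β log ρ_i for the noisy payoff, mass on an edge flows from the lower-bF vertex to the higher-bF one, and the upwind scheme weights the edge by the density of the source vertex (and by the average in the tie case):

```python
import math

# Sketch: upwind weight g on an edge (i, j), stag hunt payoffs, beta = 0.5.

A = [[3.0, 0.0], [2.0, 2.0]]
beta = 0.5

def bF(rho, i):
    """Noisy payoff F_i(rho) - beta * log(rho_i)."""
    return sum(A[i][j] * rho[j] for j in range(len(rho))) - beta * math.log(rho[i])

def g(rho, i, j):
    bi, bj = bF(rho, i), bF(rho, j)
    if bi > bj:
        return rho[j]                  # mass flows j -> i: source density rho_j
    if bi < bj:
        return rho[i]                  # mass flows i -> j: source density rho_i
    return 0.5 * (rho[i] + rho[j])     # tie: average density

rho = [0.3, 0.7]
assert g(rho, 0, 1) == g(rho, 1, 0)    # the weight is symmetric on the edge
print(g(rho, 0, 1))                    # prints 0.3 (mass flows from C to D)
```

This density-dependent weight is what makes the metric, and hence the resulting gradient flow, nonlinear in ρ.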

Slide 19

Gradient flow in Riemannian manifold

The gradient flow in abstract form:
  dρ/dt = grad_{P_o(S)} ¯F(ρ),
where the gradient is defined by:
  Tangency: grad_{P_o(S)} ¯F(ρ) ∈ T_ρ P_o(S);
  Duality: (grad_{P_o(S)} ¯F(ρ), σ)_ρ = diff ¯F(ρ) · σ for any σ ∈ T_ρ P_o(S),
with diff ¯F(ρ) = ( ∂¯F(ρ)/∂ρ_i )_{i=1}^n.

Slide 20

Main result: Gradient flow derivation

Theorem. Given a potential game with a strategy graph G = (S, E) and a payoff matrix A, the dynamics
  dρ_i/dt = Σ_{j∈N(i)} ρ_j [F_i(ρ) − F_j(ρ) + β log(ρ_j/ρ_i)]_+ − Σ_{j∈N(i)} ρ_i [F_j(ρ) − F_i(ρ) + β log(ρ_i/ρ_j)]_+   (1)
is the gradient flow of the free energy
  𝓕(ρ) = −(1/2) ρ^T Aρ + β Σ_{i=1}^n ρ_i log ρ_i
on P_o(S) with respect to the discrete optimal transport distance W.

Slide 21

Main result: Asymptotic behavior of the gradient flow

Theorem. For any initial condition ρ^0 ∈ P_o(S), (1) has a unique solution ρ(t) : [0, ∞) → P_o(S).
(i) The free energy is a Lyapunov function of (1);
(ii) if lim_{t→∞} ρ(t) exists, call it ρ^∞, then ρ^∞ is one of the possible Gibbs measures, i.e.
  ρ_i^∞ = (1/K) e^{F_i(ρ^∞)/β}, K = Σ_{i=1}^n e^{F_i(ρ^∞)/β}, for all i ∈ S.
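A Gibbs measure can be found by fixed-point iteration and checked to be a stationary point of (1); a small sketch (assumed numerics, not from the paper) on the stag hunt:

```python
import math

# Sketch: find a Gibbs measure rho_i ∝ exp(F_i(rho)/beta) for the stag hunt
# by fixed-point iteration, then verify it makes every bracket in (1) vanish.

A = [[3.0, 0.0], [2.0, 2.0]]
beta = 1.0

def F(rho):
    return [sum(A[i][j] * rho[j] for j in range(2)) for i in range(2)]

rho = [0.5, 0.5]
for _ in range(500):                       # fixed-point iteration for Gibbs
    w = [math.exp(Fi / beta) for Fi in F(rho)]
    K = sum(w)
    rho = [wi / K for wi in w]

# At a Gibbs measure, F_i - beta*log(rho_i) is the same constant beta*log(K)
# for every i, so the drift in (1) vanishes.
Fr = F(rho)
drift = Fr[0] - Fr[1] + beta * math.log(rho[1] / rho[0])
assert abs(drift) < 1e-8
```

The iteration here converges because the map is a contraction for this payoff and β; in general different initializations may select different Gibbs measures, consistent with "one of the possible Gibbs measures" in the theorem.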

Slide 22

Main result: Convergence of the gradient flow

Theorem (Entropy dissipation). If the Gibbs measure ρ^∞ is a strict maximizer of ¯F(ρ), then there exists a constant C > 0 such that
  ¯F(ρ^∞) − ¯F(ρ(t)) ≤ e^{−Ct} ( ¯F(ρ^∞) − ¯F(ρ^0) ).
The exponential convergence is naturally expected because (1) is a gradient flow on the Riemannian manifold (P_o(S), W). Its proof is based on the relation between entropy, Fisher information, and the optimal transport metric on a graph.

Slide 23

Modeling: Nonlinear Markov process

Let ρ_i(t) = Pr(X^β(t) = i). Then ρ(t) governs the following discrete-state Markov process X^β(t):
  P(X^β(t + h) = j | X^β(t) = i) =
    ( ¯F_j(ρ) − ¯F_i(ρ) )_+ h + o(h), if j ∈ N(i);
    1 − Σ_{j∈N(i)} ( ¯F_j(ρ) − ¯F_i(ρ) )_+ h + o(h), if j = i;
    0, otherwise,
where lim_{h→0} o(h)/h = 0 and ¯F_i(ρ) = F_i(ρ) − β log ρ_i.
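The master equation of this jump process, with rates r(i → j) = (¯F_j − ¯F_i)_+, reproduces the right-hand side of (1); a small consistency check (assumed code, stag hunt payoffs, complete graph on {C, D}):

```python
import math

# Sketch: the Markov master equation with rates r(i->j) = (barF_j - barF_i)_+
# equals the right-hand side of the dynamics (1).

A = [[3.0, 0.0], [2.0, 2.0]]
beta = 0.5
N = [[1], [0]]   # neighbors on the two-strategy graph

def barF(rho):
    F = [sum(A[i][j] * rho[j] for j in range(2)) for i in range(2)]
    return [F[i] - beta * math.log(rho[i]) for i in range(2)]

def rhs_master(rho):
    bF = barF(rho)
    rate = lambda i, j: max(bF[j] - bF[i], 0.0)
    return [sum(rho[j] * rate(j, i) - rho[i] * rate(i, j) for j in N[i])
            for i in range(2)]

def rhs_eq1(rho):
    F = [sum(A[i][j] * rho[j] for j in range(2)) for i in range(2)]
    return [sum(rho[j] * max(F[i] - F[j] + beta * math.log(rho[j] / rho[i]), 0.0)
                - rho[i] * max(F[j] - F[i] + beta * math.log(rho[i] / rho[j]), 0.0)
                for j in N[i]) for i in range(2)]

rho = [0.3, 0.7]
for a, b in zip(rhs_master(rho), rhs_eq1(rho)):
    assert abs(a - b) < 1e-12
```

The identity is just ¯F_i − ¯F_j = F_i − F_j + β log(ρ_j/ρ_i), so (1) is literally the probability evolution of this process.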

Slide 24

Example: Stag Hunt

Strategy set: {C, D}; players: ρ = (ρ_C, ρ_D)^T; payoff: F(ρ) = Aρ with
  A = [ 3 0 ; 2 2 ].

Slide 25

Example: Stag Hunt

We draw the vector field of the Fokker-Planck equation. Different noise levels lead to different NEs.
[Figure: vector fields for β = 5, β = 0.5, β = 0.1 and β = 0.]

Slide 26

Example: Potential game I

Strategy set: {1, 2, 3}; players: ρ = (ρ_1, ρ_2, ρ_3)^T; payoff: F(ρ) = Aρ with the symmetric matrix
  A = [ 1 0 0 ; 0 1 1 ; 0 1 1 ].
[Figure: vector fields on the simplex for β = 0 and β = 0.1.]

Slide 27

Example: Potential game II

Strategy set: {1, 2, 3}; players: ρ = (ρ_1, ρ_2, ρ_3)^T; payoff: F(ρ) = Aρ with the symmetric matrix
  A = [ 1/2 0 0 ; 0 1 1 ; 0 1 1 ].
[Figure: vector fields on the simplex for β = 0 and β = 0.1.]

Slide 28

Example: Rock-Scissors-Paper

Strategy set: {r, s, p}; players: ρ = (ρ_r, ρ_s, ρ_p)^T; payoff: F(ρ) = Aρ with payoff matrix
  A = [ 0 −1 1 ; 1 0 −1 ; −1 1 0 ].
[Figure: vector fields on the simplex for β = 0 and β = 0.1.]

Slide 29

Example: Bad Rock-Scissors-Paper

Payoff: F(ρ) = Aρ with payoff matrix
  A = [ 0 −2 1 ; 1 0 −2 ; −2 1 0 ].
We demonstrate a Hopf bifurcation: if β is large, there is a unique equilibrium around (1/3, 1/3, 1/3); if β is small, a limit cycle exists.
[Figure: vector fields on the simplex for β = 0.5, β = 0.1 and β = 0.]
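The two regimes of the Hopf bifurcation can be seen by integrating (1) from a small perturbation of the center; a sketch (assumed numerics, not the authors' code):

```python
import math

# Sketch: integrate the dynamics (1) on the complete graph for the bad
# Rock-Scissors-Paper payoff, starting near the uniform state. For large beta
# the perturbation is damped back toward (1/3, 1/3, 1/3); for small beta it
# grows and spirals toward the limit cycle.

A = [[0.0, -2.0, 1.0],
     [1.0, 0.0, -2.0],
     [-2.0, 1.0, 0.0]]

def step(rho, beta, dt):
    F = [sum(A[i][j] * rho[j] for j in range(3)) for i in range(3)]
    bF = [F[i] - beta * math.log(rho[i]) for i in range(3)]
    d = [sum(rho[j] * max(bF[i] - bF[j], 0.0) - rho[i] * max(bF[j] - bF[i], 0.0)
             for j in range(3) if j != i) for i in range(3)]
    return [rho[i] + dt * d[i] for i in range(3)]

def run(beta, T=12.0, dt=1e-3):
    rho = [0.36, 0.32, 0.32]        # small perturbation of the center
    for _ in range(int(T / dt)):
        rho = step(rho, beta, dt)
    return rho

def dist_to_center(rho):
    return max(abs(r - 1 / 3) for r in rho)

assert dist_to_center(run(0.5)) < 0.01   # beta = 0.5: damped to equilibrium
assert dist_to_center(run(0.1)) > 0.03   # beta = 0.1: perturbation grows
```

Linearizing at the uniform state gives eigenvalues with real part 1/2 − 3β on the simplex, so the stability indeed switches as β crosses 1/6, between the β = 0.5 and β = 0.1 panels.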

Slide 30

Conclusion

The new model has several desirable properties:
It is the gradient flow of the noisy potential in the probability space endowed with the optimal transport metric.
It is the probability evolution equation of a Markov process, which models the players’ myopia, greediness and irrationality.
For potential games, its asymptotic properties are obtained via the relations among the optimal transport metric, entropy and Fisher information.

Slide 31

B. Frieden. Science from Fisher Information: A Unification, 2004.
Wuchen Li, Penghang Yin and Stanley Osher. Computations of optimal transport distance with Fisher information regularization, 2017.
Shui-Nee Chow, Wuchen Li, Jun Lu and Haomin Zhou. Population games and discrete optimal transport, 2017.
Shui-Nee Chow, Wuchen Li and Haomin Zhou. Optimal transport on finite graphs: Entropy dissipation, 2016.
Wuchen Li. A study of stochastic differential equation and Fokker-Planck equation with applications, PhD thesis, 2016.
D. Monderer and L. Shapley. Potential games, 1996.
Cédric Villani. Optimal transport: Old and new, 2008.

Slide 32

Thanks!