$30 off During Our Annual Pro Sale. View Details »

Mean-Field Games for Scalable Computation and Diverse Applications

Wuchen Li
September 08, 2022

Mean-Field Games for Scalable Computation and Diverse Applications

Mean field games (MFGs) study strategic decision-making in large populations where individual players interact via specific mean-field quantities. They have recently gained enormous popularity as powerful research tools with vast applications. For example, the Nash equilibrium of MFGs forms a pair of PDEs, which connects and extends variational optimal transport problems. This talk will present recent progress in this direction, focusing on computational MFG and engineering applications in robotics path planning, pandemics control, and Bayesian/AI sampling algorithms. This is based on joint work with the MURI team led by Stanley Osher (UCLA).

Wuchen Li

September 08, 2022
Tweet

More Decks by Wuchen Li

Other Decks in Education

Transcript

  1. Mean-Field Games (MFGs) for Scalable Computation and Diverse Applications Wuchen

    Li University of South Carolina 1 Applied math seminar, Duke university, Sep 6 Joint work with Stanley Osher (UCLA)’s AFOSR MURI team
  2. 2 Alex Lin Levon Nurbekyan Samy Wu Fung Stanley Osher

    Siting Liu Wonjun Lee
  3. 3

  4. AI UAV COVID 19 pandemic control 4 Applications

  5. 5 Mean fi eld Control/Game

  6. Variational formulation 6

  7. 7

  8. Mean fi eld game system A typical time-dependent MFG system

    (derived from (9)) has the form 8 > > > > > < > > > > > : t 2 + H (x, Dx (x, t), ⇢(x, t), t) = 0 ⇢t 2 ⇢ divx (⇢DpH (x, Dx (x, t), ⇢(x, t), t)) = 0 ⇢ 0, R Td ⇢(x, t)dx = 1, t 2 [0, T] ⇢(x, T) = ⇢T (x), (x, 0) = 0(x, ⇢(x, 0)), x 2 Td . (10) Given: H : Td ⇥ Rd ⇥ X ⇥ R ! R is a periodic in space and convex in momentum Hamiltonian, where X = R+ or X = L 1(Td ; R+) or X = R+ ⇥ L 1(Td ; R+). H = L ⇤ is a di↵usion parameter 0 : Td ⇥ X ! R, ⇢T : Td ! R given initial-terminal conditions Unknowns: , ⇢ : Td ⇥ [0, T] ! R has the form 8
  9. 9 Literatures Liu, Guo, Lu, etc. Liu, Guo, Lu.

  10. 10

  11. Minimal flux problem Denote m(s, x) = ⇢(s, x)u(s, x).

    The variational problem forms inf ⇢,u Z t 0 { Z Td L(x, m(s, x) ⇢(s, x) )⇢(s, x)dx F(⇢(s, ·)}ds + G(⇢(0, ·)) , where the infimum is taken among all flux function m(s, x) and density ⇢(s, x): @⇢ @s + r · m = 0 , 0  s  t , ⇢(t, ·) = ⇢(·) . 12 11
  12. Finite volume discretization To mimic the minimal flux problem, we

    consider the discrete flux function div(m)|i = 1 x d X v=1 (mi+ 1 2 ev mi 1 2 ev ) , and the cost functional L(m, ⇢) = 8 > > < > > : P i+ ev 2 2E L ✓ m i+ 1 2 ev g i+ 1 2 ev ◆ gi+ 1 2 ev if gi+ ev 2 > 0 ; 0 if gi+ ev 2 = 0 and mi+ ev 2 = 0 ; +1 Otherwise . where gi+ 1 2 ev := 1 2 (⇢i + ⇢i+ev ) is the discrete probability on the edge i + ev 2 2 E. The time interval [0, 1] is divided into N interval, t = 1 N . 13 12
  13. Computational Mean field games Consider the discrete optimal control system:

    ˜ U(t, ⇢) := inf m,⇢ N X n=1 L(m n , ⇢ n ) N X n=1 F(⇢ n ) + G(⇢ 0 ) where the minimizer is taken among {⇢}n i , {m}n i+ ev 2 , such that ( ⇢ n+1 i ⇢n i + t · div(m)|i = 0 , ⇢N i = ⇢i . 14 13
  14. Primal-Dual structure Let H be the Legendre transform of L.

    sup inf m,⇢ ⇢ X n L(m n , ⇢ n ) t X n tF({⇢}n i ) + G({⇢}0 i ) + X n X i n i ⇢ n+1 i ⇢ n i + t · div(m)|i = sup inf ⇢ ⇢ inf m X n 0 @L(m n , ⇢ n ) + X i+ ev 2 2E 1 x ( n i n i+ev )mi+ 1 2 ev 1 A t X n tF({⇢}n i ) + G({⇢}0 i ) + X n X i n i ⇢ n+1 i ⇢ n i = sup inf ⇢ ⇢ X n X i+ ev 2 2E H ✓ 1 x ( n i n i+ev ) ◆ gi+ 1 2 ev t X n tF({⇢}n i ) + G({⇢}0 i ) + X n X i n i ⇢ n+1 i ⇢ n i 15 14
  15. 15 and Mean fi eld Control Generalized Primal-Dual algorithms

  16. Computational Mean-field Games on Manifolds Jiajia Yu (RPI) 1 Rongjie

    Lai (RPI) 1 Wuchen Li (U of SC) Stanley Osher (UCLA) June, 5th, 2022 1J. Yu and R. Lai’s work are supported in part by an NSF Career Award DMS-1752934 and NSF DMS-2134168 16
  17. MFG on Manifolds I Conventional mean-field games deal with Nash

    Equilibrium on flat Euclidean domains. I But Euclidean domain is not enough for real-life applications, for example modeling evolution on the Earth or on data-sets or on the brain surface. I We generalize mean-field games to manifolds, and show that with operators on manifolds, the Nash Equilibrium satisfies 8 > < > : @t (x, t) + H(x, rM (x, t)) = F(x, ⇢(·, t)), @t⇢(x, t) rM · (⇢(x, t)@qH(x, rM (x, t)) = 0, (x, 1) = FT (x, ⇢(·, 1)), ⇢(·, 0) = ⇢0. Figure source: NASA(top), MNIST, Wikipedia(center), [Lai et al’13](bottom). 2 17
  18. Variational Formulations I The PDE system is very complicated in

    its coordinate form 8 > > > > < > > > > : @t X (⇠ 1 , ⇠ 2 , t) + HX (X, g 1 ⇠ r⇠ X ) = F(X, ⇢(·, t)), @t⇢X (⇠ 1 , ⇠ 2 , t) 1 p det(g⇠ ) 2 X i=1 @ @⇠i 2 X j=1 p det(g⇠ )⇢X (g 1 ⇠ )ij@qj HX ! = 0, X (⇠ 1 , ⇠ 2 , 1) = FT (X(⇠ 1 , ⇠ 2), ⇢(·, 1)), ⇢X (⇠ 1 , ⇠ 2 , 0) = ⇢0(X(⇠ 1 , ⇠ 2)). I Comparing to the PDE formulation, when F(⇢) ⇢ (x) = F(x, ⇢), FT (⇢) ⇢ (x) = FT (x, ⇢), the equivalent variational formulation is easier to handle min ⇢,m Z 1 0 Z M ⇢(x, t)L ✓ x, m(x, t) ⇢(x, t) ◆ dMxdt + Z 1 0 F(⇢(·, t))dt + FT (⇢(·, 1)), s.t. @t⇢(x, t) + rM · m(x, t) = 0, ⇢(·, 0) = ⇢0. A local coordinate representation of a manifold. 3 18
  19. Space Discretization I With the triangular mesh approximation of manifold

    and linear functions on the manifold, we discretize the problem to min P,M e Y(P, M) := Z 1 0 s X j=1 ATj P(Tj, t) kM(Tj, t)k2 2 dt + Z 1 0 e F(P(·, t))dt + e FT (P(·, 1)) s.t. A(P, M) := ✓ d dt P(Vi, t) + (r f M · M)(Vi, t) ◆ Vi2V,t2[0,1] = 0, P(·, 0) = P0. A kitten triangular mesh. A linear function on a mesh. The gradient on a triangle. 4 19
  20. Algorithm I We applied projected gradient descent to solve the

    variational problem. 8 > > > < > > > : (P, M)(k+1/2) := ( b P, c M)(k) ⌘ (k)rP,M e Y( b P, c M)(k) , (P, M)(k+1) := proj{A(P,M)=0} (P, M)(k+1/2) , ( b P, c M) := ⇣ 1 + ! (k) ⌘ (P, M)(k+1) ! (k)(P, M)(k) . I The projection operator is exactly (P, M)(k+1) = (Id A⇤(AA⇤) 1A)(P, M)k+1/2 . I We pre-compute (AA⇤) 1 to save total computational cost. 5 20
  21. Numerical Results: Avoiding Obstacles I Recall: the objective function is

    Z 1 0 Z M km(x, t)k2 g(x) 2⇢(x, t) dMxdt + Z 1 0 F(⇢(·, t))dt + FT (⇢(·, 1)) I The interaction cost F(⇢(·, t)) = R M ⇢(x, t)b(x, t)dMx encourages the density to avoid obstacles. I Our model and algorithm can handle spherical or genus-2 manifolds very well. US-map-based mesh. “8”-shape 6 21
  22. Numerical Results: Concentrated Density I Recall: the objective function is

    Z 1 0 Z M km(x, t)k2 g(x) 2⇢(x, t) dMxdt + Z 1 0 F(⇢(·, t))dt + FT (⇢(·, 1)) I The interaction cost F(⇢(·, t)) = R M p ⇢(x, t) + ✏dMx encourages the density to concentrate. F(⇢(·, t)) = 0. F(⇢(·, t)) = R M p ⇢(x, t) + ✏dMx. 7 22
  23. Numerical Results: Smooth Density I Recall: the objective function is

    Z 1 0 Z M km(x, t)k2 g(x) 2⇢(x, t) dMxdt + Z 1 0 F(⇢(·, t))dt + FT (⇢(·, 1)) I The interaction cost F(⇢(·, t)) = 1 2 R M krM⇢(x, t)k2 g(x) dMx encourages the density to be smooth. F(⇢(·, t)) = 0. F(⇢(·, t)) = 1 2 R M krM⇢(x, t)k2 g(x) dMx. 8 23
  24. Controlling propagation of epidemics: Mean-field SIR games Stanley Osher Joint

    work with Wonjun Lee, Siting Liu, Hamidou Tembine and Wuchen Li 24
  25. Classic Epidemic Model The classical Epidemic model is the SIR

    model (Kermack and McKendrick, 1927) 8 > > > > > < > > > > > : dS dt = SI dI dt = SI I dR dt = I where S, I,R : [0, T] ! [0, 1] represent the density of the susceptible population, infected population, and recovered population, respectively, given time t. The nonnegative constants and represent the rates of susceptible becoming infected and infected becoming recovered. 5 25
  26. Spatial SIR To model the spatial e↵ect of virus spreading

    ,the spatial SIR model is considered: 8 > > > > > > > < > > > > > > > : @ t ⇢ S (t, x) + ⇢ S (t, x) Z ⌦ K(x, y)⇢ I (t, y)dy ⌘2 S 2 ⇢ S (t, x) = 0 @ t ⇢ I (t, x) ⇢ I (x) Z ⌦ K(x, y)⇢ S (t, y)dy + ⇢ I (t, x) ⌘2 I 2 ⇢ I (t, x) = 0 @ t ⇢ R (t, x) ⇢ I (t, x) ⌘2 R 2 ⇢ R (t, x) = 0 Here ⌦ is a given spatial domain and K(x, y) is a symmetric positive definite kernel modeling the physical distancing. E.g. R Kd⇢ I is the exposure to infectious agents. 6 26
  27. vI 27

  28. Mean-field game SIR systems 8 > > > > >

    > > > > > > > > > > > > > > > > > > > > < > > > > > > > > > > > > > > > > > > > > > > > > > : @ t S ↵ S 2 |r S| 2 + ⌘2 S 2 S + c(⇢ S + ⇢ I + ⇢ R ) + (K ⇤ ( I ⇢ I ) S K ⇤ ⇢ I ) = 0 @ t I ↵ I 2 |r I| 2 + ⌘2 I 2 I + c(⇢ S + ⇢ I + ⇢ R ) + ( I K ⇤ ⇢ S K ⇤ ( S ⇢ S )) + ⇢( R I ) = 0 @ t R ↵ R 2 |r R| 2 + ⌘2 R 2 R + c(⇢ S + ⇢ I + ⇢ R ) = 0 @ t ⇢ S 1 ↵ S r · (⇢ Sr S ) + ⇢ S K ⇤ ⇢ I ⌘2 S 2 ⇢ S = 0 @ t ⇢ I 1 ↵ I r · (⇢r I ) ⇢ I K ⇤ ⇢ S + ⇢ I ⌘2 I 2 ⇢ I = 0 @ t ⇢ R 1 ↵ R r · (⇢ Rr R ) ⇢ I ⌘2 R 2 ⇢ R = 0. 12 I
  29. Variational formulation Denote m i = ⇢ i v i

    . Define the Lagrangian functional for Mean field game SIR problem by L((⇢ i , m i , i ) i=S,I,R ) =P(⇢ i , m i ) i=S,I,R Z T 0 Z ⌦ X i=S,I,R i ✓ @ t ⇢ i + r · m i ⌘2 i 2 ⇢ i ◆ dxdt + Z T 0 Z ⌦ I ⇢ I K ⇤ ⇢ S S ⇢ S K ⇤ ⇢ I + ⇢ I ( R I )dxdt. Using this Lagrangian functional, we convert the minimization problem into a saddle problem. inf (⇢i,mi)i=S,I,R sup ( i)i=S,I,R L((⇢ i , m i , i ) i=S,I,R ). 16 29
  30. Algorithm Algorithm: PDHG for mean field game SIR system Input:

    ⇢ i (0, ·) (i = S, I, R) Output: ⇢ i , m i , i (i = S, I, R) for x 2 ⌦, t 2 [0, T] While relative error > tolerance ⇢(k+1) i = argmin ⇢ L(⇢, m(k) i , (k) i ) + 1 2⌧i k⇢ ⇢(k) i k2 L2 m(k+1) i = argmin m L(⇢(k+1), m, (k) i ) + 1 2⌧i km m(k) i k2 L2 (k+ 1 2 ) i = argmax L(⇢(k+1), m(k+1) i , ) 1 2 i k (k) i k2 H2 (k+1) i = 2 (k+ 1 2 ) i (k) i end 17 30
  31. Examples I 19 Small recovery rate 31

  32. Example II 20 Large recovery rate 32

  33. Example III 21 Small recovery rate 33

  34. Discussions Importance of spatial SIR variational problems. I Consider more

    status of populations, going beyond S, I, R. I Construct discrete spatial domain model, including airport, train station, hospital, school etc. I Propose inverse mean field SIR problems. Learning parameters in the model by daily life data. I Combine mean field game SIR models with AI and machine learning algorithms, including APAC, Neural variational ODE, Neural Fokker-Planck equations, etc. 22 34
  35. APAC-Net: Alternating the Population and Agent Control Neural Networks Alex

    Tong Lin, Samy Wu Fung, Wuchen Li, Levon Nurbekyan, Stanely Osher 2020. 35 Stanley Osher
  36. Mean Field Games I A mean field game seeks to

    model the behavior of a very large number of small interacting agents that each seek to optimize their own value function. 2 36
  37. Variational Mean-Field Games I Namely for some mean-field games, the

    equilibrium solution can be found by minimizing an overall “energy” (e.g. multiply the value function for a single agent by ⇢): A(⇢, v) = min ⇢,v Z T 0 Z ⌦ ⇢(x, t)L(v) + F(x, ⇢) dx dt + Z ⌦ (x)⇢(x, T) dx where x = x(t), v = v(t) above, and where the optimization has the constraint, @t⇢ ⌫ ⇢ + div(⇢v) = 0. 3 37
  38. It’s Just Sampling We can raise the constraints into the

    objective, perform integration by parts, and push the minimization of v inside to obtain, = max ' min ⇢ Z T 0 Z ⌦ F(x, ⇢(x, t)) dx dt + Z ⌦ ⇣ (x) '(x, T) ⌘ ⇢(x, T) dx + Z ⌦ '(x, 0)⇢(x, 0) dx + Z T 0 Z ⌦ ✓ @t'(x, t) + '(x, t) L⇤( r'(x, t)) ◆ ⇢(x, t) dx dt which means we end up with a sampling problem. Then the idea is to turn ⇢ and ' into neural networks and train as in GANs (Generative Adversarial Networks). 4 38 With Primal-Dual formulations of Mean fi eld control
  39. GAN Training I Example GAN loss function: min G max

    D E x⇠real [D(x)] E z⇠N [D(G(z))] s.t.krDk  1 = min G max D E x⇠real [D(x)] E y⇠G(z),z⇠N [D(y)] s.t.krDk  1 I Training has a discriminator and a generator. The generator produces samples (analogous to our ⇢), and the discriminator evaluates the quality of those samples (analogous to our '). 5 39
  40. Math and GANs Recall the solution to the mean-field game

    turned into the min-max problem: max ' min ⇢ Z ⌦ ⇣ (x) '(x, T) ⌘ ⇢(x, T) dx + Z ⌦ '(x, 0)⇢(x, 0) dx + Z T 0 Z ⌦ F(x, ⇢(x, t)) dx dt + Z T 0 Z ⌦ ✓ @t'(x, t) + '(x, t) L⇤( r'(x, t)) ◆ ⇢(x, t) dx dt And supposing a certain form for F, then this can be expressed in expectation form, max ' min ⇢ E y⇠⇢(x,T ) [ (y) '(y, T)] + E y⇠⇢(x,0) ['(y, 0)] + Z T 0 E y⇠⇢(x,t) [F(y)] dt + Z T 0 E y⇠⇢(x,t) [@t'(y, t) + '(y, t) L ⇤( r'(y, t))] dt 6 40
  41. Variational Mean-Field Games I Namely for some mean-field games, the

    equilibrium solution can be found by minimizing an overall “energy” (e.g. multiply the value function for a single agent by ⇢): A(⇢, v) = min ⇢,v Z T 0 Z ⌦ ⇢(x, t)L(v) + F(x, ⇢) dx dt + Z ⌦ g(x)⇢(x, T) dx where x = x(t), v = v(t) above, and where the optimization has the constraint, @t⇢ ⌫ ⇢ + div(⇢v) = 0. 2 41
  42. Variational Mean-Field Games / APAC-Net Preview After some elevating the

    constraints into the objective, integrating by parts, and calculating the Legendre transform, we get = max ' min ⇢ Z T 0 Z ⌦ F(x, ⇢(x, t)) dx dt + Z ⌦ ⇣ g(x) '(x, T) ⌘ ⇢(x, T) dx + Z ⌦ '(x, 0)⇢(x, 0) dx + Z T 0 Z ⌦ ✓ @t'(x, t) + '(x, t) L⇤( r'(x, t)) ◆ ⇢(x, t) dx dt which means we end up with a sampling problem – this is a preview of APAC-Net. This is in the spirit of Feynman-Kac. Then the idea is to turn ⇢ and ' into neural networks and train as in GANs (Generative Adversarial Networks). 3 42
  43. Numerical Results - Obstacles & Congestion I H(p) = kpk2

    I A 20 dimensional obstacle problem where we have an obstacle in (x1, x2) and in (x3, x4) and in (x5, x6), etc. Congestion penalty is active in the bottlenecks. Figure: A screencapture of a video that will be played 6 43
  44. Numerical Results: A realistic example, the Quadcopter The dynamics of

    a quadcopter are: 8 > > > > > > < > > > > > > : ¨ x = u m (sin( ) sin( ) + cos( ) cos( ) sin(✓)) ¨ y = u m ( cos( ) sin( ) + cos( ) sin(✓) sin( )) ¨ z = u m cos(✓) cos( ) g ¨ = ˜ ⌧ ¨ ✓ = ˜ ⌧✓ ¨ = ˜ ⌧ where u is the thrust, g is the gravitational acceleration (9.81m/s2), and x, y, z are the spatial coordinates, , ✓, are the angular coordinates, and ˜ ⌧ , ˜ ⌧✓ , ˜ ⌧ . Turns 12-dimensional when you transfer to first-order system. 7 44
  45. Numerical Results: A realistic example, the Quadcopter 8 45

  46. Numerical Results: A realistic example, the Quadcopter Movie: Figure: A

    screencapture of a video that will be played 9 46
  47. Numerical Results: Drone fleet avoiding obstacle 7 47

  48. Numerical Results: Drone fleet avoiding obstacle Movie: Figure: Screen capture

    of drones avoiding obstacles that will be played. 8 48
  49. Numerical Results: Drone fleet chasing a target Movie: Figure: Screen

    capture of drones chasing a target that will be played. 10 49
  50. Numerical Results: Drone fleet chasing a target 9 50

  51. 51 Publications

  52. 52

  53. Thank You! 53