Slide 22
Slide 22 text
• Expected immediate reward: R[x](d, π) = Σ_a π[a | x] ζ[x, a](d, π)
• State stochastic matrix: P[x+ | x](d, π) = Σ_a π[a | x] ρ[x+ | x, a](d, π)
• Infinite-horizon value function: V[x](d, π) = R[x](d, π) + α Σ_{x+} P[x+ | x](d, π) V[x+](d, π)
• Single-stage deviation rewards: Q[x, a](d, π) = ζ[x, a](d, π) + α Σ_{x+} ρ[x+ | x, a](d, π) V[x+](d, π)
• Best response per state: B[x](d, π) := set of randomizations over actions a that maximize Q[x, a](d, π) (numerical sketch below)
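As a rough numerical sketch of these definitions (not part of the slides), the snippet below assumes ζ[x, a] and ρ[x+ | x, a] have already been evaluated at a fixed (d, π) and are given as NumPy arrays; all names, shapes, and the uniform policy are illustrative assumptions.

```python
# Minimal sketch of R, P, V, and Q from this slide, assuming zeta[x, a] and
# rho[x_next, x, a] are already evaluated at a fixed (d, pi). Shapes and the
# random placeholder data are hypothetical.
import numpy as np

n_x, n_a = 3, 2                      # individual states and actions (illustrative sizes)
alpha = 0.9                          # future discount factor

rng = np.random.default_rng(0)
zeta = rng.random((n_x, n_a))        # zeta[x, a]: immediate reward at (d, pi)
rho = rng.random((n_x, n_x, n_a))    # rho[x_next, x, a]: transition probs at (d, pi)
rho /= rho.sum(axis=0, keepdims=True)  # normalize over next state

pi = np.full((n_a, n_x), 1.0 / n_a)  # pi[a | x]: uniform policy, for illustration only

# Expected immediate reward: R[x] = sum_a pi[a|x] zeta[x, a]
R = np.einsum('ax,xa->x', pi, zeta)

# State stochastic matrix: P[x+ | x] = sum_a pi[a|x] rho[x+ | x, a]
P = np.einsum('ax,yxa->yx', pi, rho)

# Infinite-horizon value: V = R + alpha * P^T V, i.e. solve (I - alpha P^T) V = R
V = np.linalg.solve(np.eye(n_x) - alpha * P.T, R)

# Single-stage deviation rewards: Q[x, a] = zeta[x, a] + alpha sum_{x+} rho[x+|x,a] V[x+]
Q = zeta + alpha * np.einsum('yxa,y->xa', rho, V)
```

Solving the linear system gives V directly for the fixed (d, π); value iteration would converge to the same fixed point under the discount α < 1.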
Best response in dynamic population games
Defined per state w.r.t. single-stage deviation rewards
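Continuing the sketch above, one way to represent B[x](d, π) is the set of actions attaining max_a Q[x, a]; any randomization supported on those actions is a per-state best response. The tolerance and the uniform representative below are assumptions of the sketch.

```python
# Best response per state: actions attaining the maximum of Q[x, .].
tol = 1e-12
best_actions = [np.flatnonzero(Q[x] >= Q[x].max() - tol) for x in range(n_x)]

# One representative element of B[x](d, pi): uniform randomization over maximizers.
pi_br = np.zeros((n_a, n_x))
for x, acts in enumerate(best_actions):
    pi_br[acts, x] = 1.0 / len(acts)
```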
Notation
[·] for discrete quantities
(·) for continuous quantities
x - individual state
a - individual action
d - state distribution
π - policy
ζ - immediate reward
ρ - state transition probabilities
α - future discount factor
