Karma Games

Florian Dörfler
October 16, 2024

Transcript

  1. A self-contained Karma Economy for the dynamic allocation of shared resources

    Ezzat Elokda, Andrea Censi, Saverio Bolognani, Florian Dörfler, and Emilio Frazzoli, with Carlo Cenedese, Kenan Zhang, and John Lygeros
  2. Menu

    • Motivation
    • A "simple" resource sharing problem
    • Dynamic population game model
    • Application to traffic congestion management
    • Conclusions
  3. Engineers play an active role in managing shared resources ⇒ must be mindful of social implications of our designs

    Energy, Transportation, Internet
  4. Fairness, efficiency, scalability: three conflicting objectives

    Scalability, Efficiency*, Fairness*
    *exact definition to follow
  5. Fairness, efficiency, scalability: three conflicting objectives

    Scalability, Efficiency, Fairness*
    *maximum fairness possible if encoded in cost function
  6. Fairness, efficiency, scalability: three conflicting objectives

    Scalability, Efficiency*, Fairness
    *willingness to pay ≠ need
  7. Menu

    • Motivation
    • A "simple" resource sharing problem
    • Dynamic population game model
    • Application to traffic congestion management
    • Conclusions
  8. Two AVs meet at an unsignalled intersection. Who goes first?

    • Each agent has a private urgency u
    • The daily urgency follows an exogenous Markov chain φ
  9. Reputation works in a small community

    • Just tell each other the urgency
    • In a small community, agents are inclined to be truthful because freeloaders will be punished.
    • How can reputation work if you never see the same car again?
  10. Tossing a coin is inefficient

    • Efficiency := -𝔼[sum of costs]
      Cost of going first = 0
      Cost of going last = urgency
    • Max efficiency: highest urgency goes first
    • Coin toss is agnostic to private needs
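The gap between the coin toss and the efficient rule can be made concrete with a small enumeration. This is a minimal sketch, assuming two agents whose urgencies are drawn independently from the same discrete distribution. The function name `expected_costs` and the example urgency values are hypothetical and not taken from the slides.

```python
from itertools import product


def expected_costs(urgency_probs):
    """Expected sum of costs per encounter for two allocation rules.

    urgency_probs maps urgency value -> probability (assumed i.i.d. agents).
    The agent who goes first pays 0, the agent who goes last pays its urgency.
    Returns (coin_toss_cost, highest_urgency_first_cost).
    """
    coin = 0.0
    best = 0.0
    for (u1, p1), (u2, p2) in product(urgency_probs.items(), repeat=2):
        p = p1 * p2
        # Coin toss: each agent goes last with probability 1/2.
        coin += p * 0.5 * (u1 + u2)
        # Highest urgency goes first: the lower urgency is the one incurred.
        best += p * min(u1, u2)
    return coin, best


# Illustrative urgency distribution (hypothetical values).
coin_cost, best_cost = expected_costs({1: 0.5, 5: 0.5})
print(coin_cost, best_cost)
```

Efficiency as defined on the slide is the negative of these expected costs, so the rule with the smaller expected cost is the more efficient one.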
  11. Tossing a coin is not always fair

    • Fairness def. 1: "equal opportunity/access"
      Prob[R goes first] = Prob[B goes first]
    • Fairness def. 2: "equal outcome"
      𝔼[cost(R)] = 𝔼[cost(B)]
    • Fairness def. 3: "reparation"
      𝔼[future cost | past cost] ∝ -past cost
    • Coin toss satisfies equal opportunity and equal outcome (if players are homogeneous), but not reparation
  12. Monetary solutions are unfair [1]

    • Simple reason: not everyone has the same money
    • Would be fair if: bank account = social credit

    [1] Carlino et al., "Auction-based autonomous intersection management", ITSC (2013)
  13. Monetary solutions are inefficient

    • … if efficiency is defined based on the true private urgency
    • Willingness to pay ≠ urgency: it depends on the balance in the bank [1]

    [1] Börjesson et al., "On the income elasticity of the value of travel time", Transportation Research Part A (2011)
  14. Karma mechanism (pairwise interactions)

    1. Each agent bids karma to access the shared resource
    2. Who bids more gets the resource, and pays the bid to the other

    Karma balances reflect the access of shared resources
    There are many variations: pay to peer, pay to society; pay full, pay difference. Very similar results.
    [Figure: karma balances today and tomorrow]
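The two-step rule above can be sketched for the "pay full bid to peer" variation. This is an illustrative sketch only: the function name `karma_interaction` and the fair-coin tie-break are assumptions, since the slides do not specify how equal bids are resolved.

```python
import random


def karma_interaction(karma_a, bid_a, karma_b, bid_b, rng=random):
    """One pairwise interaction under a pay-bid-to-peer rule.

    The higher bidder gets the resource and pays its full bid to the other
    agent. Ties are broken by a fair coin (an assumption of this sketch).
    Returns (winner, new_karma_a, new_karma_b).
    """
    if not (0 <= bid_a <= karma_a and 0 <= bid_b <= karma_b):
        raise ValueError("bids must satisfy 0 <= b <= k")
    if bid_a > bid_b or (bid_a == bid_b and rng.random() < 0.5):
        return "a", karma_a - bid_a, karma_b + bid_a
    return "b", karma_a + bid_b, karma_b - bid_b


# Agent a bids 3 of its 5 karma, agent b bids 1 of its 4 karma.
print(karma_interaction(5, 3, 4, 1))
```

Because the payment moves from one agent to the other, the total karma in the system stays constant, which is what makes the economy self-contained.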
  15. Placing karma bids is not that simple

    Unlike money, karma has no value a priori
    • Players face a dynamic optimization affected by:
      Urgency process φ
      Future awareness α ∈ [0,1): discounts future rewards
      Bids of others
    [Figure: Money world (bid 'too much', bid 'too little', bid 'right amount' to go first) vs. Karma world (go first now vs. go first later)]
  16. Menu

    • Motivation
    • A "simple" resource sharing problem
    • Dynamic population game model
    • Application to traffic congestion management
    • Conclusions
  17. The karma game is a dynamic population game

    • Individual time-varying state: x = [urgency u, karma k]
    • Individual action: a = [karma bid b ≤ k]
    • Social state: (state distribution d, policy π)
      d[u, k] - distribution of individual states
      π[b | u, k] - map from individual state to probabilities of individual actions
    • Individual Markov Decision Processes, but coupled through (d, π)
      Immediate reward: ζ[u, b](d, π)
      Karma transition probabilities: κ[k+ | k, b](d, π)

    Notation:
      [·] for discrete quantities
      (·) for continuous quantities
      [a | b] probability of a given b
    [Figure: example policy]
  18. Best response in dynamic population games

    Defined per state w.r.t. single-stage deviation rewards
    • Expected immediate reward: R[x](d, π) = Σ_a π[a | x] ζ[x, a](d, π)
    • State stochastic matrix: P[x+ | x](d, π) = Σ_a π[a | x] ρ[x+ | x, a](d, π)
    • Infinite-horizon value function: V[x](d, π) = R[x](d, π) + α Σ_{x+} P[x+ | x](d, π) V[x+](d, π)
    • Single-stage deviation rewards: Q[x, a](d, π) = ζ[x, a](d, π) + α Σ_{x+} ρ[x+ | x, a](d, π) V[x+](d, π)
    • Best response per state: B[x](d, π) := set of randomizations of a that maximize Q[x, a](d, π)

    Notation:
      [·] for discrete quantities
      (·) for continuous quantities
      x - individual state
      a - individual action
      d - state distribution
      π - policy
      ζ - immediate reward
      ρ - state transition probabilities
      α - future discount factor
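The quantities R, P, V, Q and B on this slide can be computed numerically once the social state (d, π) is frozen, so that ζ and ρ become fixed arrays. The sketch below makes that assumption and evaluates the policy by fixed-point iteration. The function names, the nested-list layout `zeta[x][a]`, `rho[x][a][y]`, `pi[x][a]`, and the toy numbers are all hypothetical.

```python
def evaluate_policy(zeta, rho, pi, alpha, iters=2000):
    """Compute V[x] and Q[x][a] for a frozen social state (d, pi).

    zeta[x][a]   : immediate reward
    rho[x][a][y] : probability of moving from state x to y under action a
    pi[x][a]     : probability of action a in state x
    alpha        : future discount factor in [0, 1)
    """
    n_x, n_a = len(zeta), len(zeta[0])
    # Expected immediate reward R[x] = sum_a pi[a|x] zeta[x,a]
    R = [sum(pi[x][a] * zeta[x][a] for a in range(n_a)) for x in range(n_x)]
    # State transition P[x][y] = sum_a pi[a|x] rho[y|x,a]
    P = [[sum(pi[x][a] * rho[x][a][y] for a in range(n_a))
          for y in range(n_x)] for x in range(n_x)]
    # Iterate V <- R + alpha P V, a contraction because alpha < 1.
    V = [0.0] * n_x
    for _ in range(iters):
        V = [R[x] + alpha * sum(P[x][y] * V[y] for y in range(n_x))
             for x in range(n_x)]
    # Single-stage deviation rewards Q[x][a].
    Q = [[zeta[x][a] + alpha * sum(rho[x][a][y] * V[y] for y in range(n_x))
          for a in range(n_a)] for x in range(n_x)]
    return V, Q


def pure_best_responses(Q, tol=1e-9):
    """Actions maximizing Q in each state; B[x] is every mix over them."""
    return [[a for a, q in enumerate(row) if q >= max(row) - tol] for row in Q]


# Toy example with 2 states and 2 actions (hypothetical numbers).
zeta = [[0.0, -1.0], [-2.0, 0.0]]
rho = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 0.0]]]
pi = [[0.5, 0.5], [0.5, 0.5]]
V, Q = evaluate_policy(zeta, rho, pi, alpha=0.9)
print(pure_best_responses(Q))
```

In the actual game ζ and ρ depend on (d, π), so this evaluation would have to be repeated each time the social state changes.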
  19. Nash equilibria in dynamic population games

    The game played every day is different. A best response fixed point is not enough.
    • Stationary Nash Equilibrium (SNE): social state (d*, π*) where
      the state distribution is stationary: d* = P(d*, π*) d*
      the policy is a best response at all states: π*[· | x] ∈ B[x](d*, π*)

    Assumption 1: Continuity (required in general DPGs)
      ζ(d, π) and ρ(d, π) are continuous.
    Assumption 2: Karma preservation (specific to karma DPGs)
      Karma is preserved in expectation, i.e., 𝔼[k+] = 𝔼[k].
    Theorem 1 (Existence of SNE): Let Assumptions 1 and 2 hold. Then, an SNE is guaranteed to exist.

    Notation:
      [·] for discrete quantities
      (·) for continuous quantities
      d - state distribution
      π - policy
      P - state stochastic matrix
      B - best response per state
      ζ - immediate reward
      ρ - state transition probabilities
  20. Dynamic vs. static population game

    Theorem 2 (Reduction to population game): For every dynamic population game (DPG), there exists a static population game with augmented population p' whose NE coincide with the SNE of the DPG.

    SNE in dynamic population game (x - individual state):
      d* = P(d*, π*) d*
      π*[· | x] ∈ B[x](d*, π*)
    NE in static population game (p - individual (static) population):
      π*[· | p] ∈ B[p](π*)

    Trick: define an augmented population p'
    • Mixed strategy: π[· | p'] := d
    • Payoff: F[p', ·](d, π) := P(d, π) d - d
    Intuition: population p' "plays the dynamics"
  21. Corollary: evolutionary dynamics for SNE computation

    Remark: Projection dynamic for p' = continuous-time state dynamics ḋ = P(d, π) d - d

    Sandholm, "Population games and evolutionary dynamics", MIT Press (2010)
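To illustrate the state dynamics ḋ = P(d, π) d - d, the sketch below integrates them with a forward-Euler scheme. It simplifies heavily: the policy is frozen and P is a fixed column-stochastic matrix that does not depend on d, whereas in the karma game P changes with the social state. The function name, step size, and example matrix are hypothetical.

```python
def integrate_state_dynamics(P, d0, step=0.1, n_steps=2000):
    """Forward-Euler integration of d' = P d - d.

    P[y][x] is the probability of moving from state x to state y, so each
    column sums to 1 and (P d)[y] = sum_x P[y][x] d[x].
    """
    d = list(d0)
    n = len(d)
    for _ in range(n_steps):
        Pd = [sum(P[y][x] * d[x] for x in range(n)) for y in range(n)]
        d = [d[y] + step * (Pd[y] - d[y]) for y in range(n)]
    return d


# Two-state example (hypothetical numbers); columns of P sum to 1.
P = [[0.9, 0.5],
     [0.1, 0.5]]
d_star = integrate_state_dynamics(P, [0.5, 0.5])
print(d_star)
```

The rest point of this flow satisfies d = P d, which is the stationarity condition in the SNE definition. Computing a full SNE would additionally require the policy to evolve toward a best response.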
  22. Here is a karma Stationary Nash Equilibrium

    It consists of an equilibrium policy π* and a stationary state distribution d*
    • Urgency process φ[u+ | u]
    • Future awareness α = 0.98
    [Figure: π* and d*]
  23. And here are the efficiency and fairness of karma

    Performance shown for α-sweep and karma schemes PBP & PBS
    • Results from agent-based simulations with 200 agents and 1000 days
    • Efficiency = -AVG[costs]
    • Fairness = -STD[# of times went first]
      Captures equal opportunity and reparation
    *Optimal efficiency of MONEY under assumption that money is accurate measure of urgency

    Legend:
      PBP - Karma: Pay Bid to Peer
      PBS - Karma: Pay Bid to Society
      COIN - Baseline coin toss
      TURN - Simple turn-taking
      MONEY - Truthful monetary mechanism*
  24. And here are the efficiency and fairness of karma

    Near-optimal efficiency and fairness for high α in both karma schemes
    • Results from agent-based simulations with 200 agents and 1000 days
    • Efficiency = -AVG[costs]
    • Fairness = -STD[# of times went first]
      Captures equal opportunity and reparation
    *Optimal efficiency of MONEY under assumption that money is accurate measure of urgency

    Legend:
      PBP - Karma: Pay Bid to Peer
      PBS - Karma: Pay Bid to Society
      COIN - Baseline coin toss
      TURN - Simple turn-taking
      MONEY - Truthful monetary mechanism*
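The two metrics can be computed from simulation logs as sketched below, together with a toy version of the COIN baseline. This is a rough illustration, not the simulation behind the plots: the random pairing, the i.i.d. urgencies (the slides use a Markov chain φ), the population standard deviation, and all names and default values are assumptions.

```python
import random
from statistics import fmean, pstdev


def efficiency(costs):
    """Efficiency = -AVG[costs]."""
    return -fmean(costs)


def fairness(first_counts):
    """Fairness = -STD[# of times went first] (population std assumed)."""
    return -pstdev(first_counts)


def simulate_coin(n_agents=200, n_days=1000, urgencies=(1, 10), seed=0):
    """Toy COIN baseline: agents are paired at random each day and a fair
    coin decides who goes first. The agent going last pays its urgency,
    drawn i.i.d. here as a simplification.
    Returns (average daily cost per agent, times each agent went first)."""
    rng = random.Random(seed)
    costs = [0.0] * n_agents
    first_counts = [0] * n_agents
    agents = list(range(n_agents))
    for _ in range(n_days):
        rng.shuffle(agents)
        for i in range(0, n_agents - 1, 2):
            a, b = agents[i], agents[i + 1]
            first, last = (a, b) if rng.random() < 0.5 else (b, a)
            first_counts[first] += 1
            costs[last] += rng.choice(urgencies)
    return [c / n_days for c in costs], first_counts


avg_costs, first_counts = simulate_coin(n_agents=20, n_days=100)
print(efficiency(avg_costs), fairness(first_counts))
```

Both metrics are negated so that larger values are better, matching the definitions on the slide.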
  25. Menu

    • Motivation
    • A "simple" resource sharing problem
    • Dynamic population game model
    • Application to traffic congestion management
    • Conclusions
  26. Let's turn to a persistent societal problem

    For decades, experts have been seeking non-monetary solutions to traffic congestion. Nothing has been quite satisfactory.
    • HOV lanes [1, 2]
      Limited controllability
    • License-plate rationing [3, 4]
      Inefficient: can't travel on Wednesday instead of Monday
    • Mobility credits [5, 6]
      Essentially monetary: credits are tradable for money [5] or used to pay for tolls [6]

    [1] Dahlgren, "High occupancy vehicle lanes: Not always more effective than general purpose lanes" (1998)
    [2] Wang et al., "Optimal capacity allocation for high occupancy vehicle (HOV) lane in morning commute" (2019)
    [3] Wang et al., "Traffic rationing and short-term and long-term equilibrium" (2010)
    [4] Han et al., "Efficiency of the plate-number-based traffic rationing in general networks" (2010)
    [5] Verhoef et al., "Tradeable permits: their potential in the regulation of road transport externalities" (1997)
    [6] Kalmanje et al., "Credit-based congestion pricing: travel, land value, and welfare impacts" (2004)
  27. Karma for traffic management

    CARMA: fair and efficient bottleneck congestion management with karma
    • Commuters departing at same discrete time window bid karma
    • Regulated fast lane filled until free-flow capacity by highest bidders
    • All other traffic goes to unregulated slow lane that can get congested
    • PBS scheme: fast lane commuters pay bid to society (to be uniformly redistributed at end of day)

    Ezzat Elokda, Carlo Cenedese, Kenan Zhang, Andrea Censi, John Lygeros, Emilio Frazzoli, Florian Dörfler, TRB Annual Meeting (2023)
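The allocation and redistribution steps listed above can be sketched for a single departure time window. This is a simplified illustration under stated assumptions: the function names are hypothetical, ties go to the lower index, karma is treated as a real number so that the uniform share divides evenly, and queuing in the slow lane is not modeled.

```python
def allocate_fast_lane(bids, capacity):
    """Indices of the commuters admitted to the fast lane: the highest
    bidders, up to the free-flow capacity. Ties go to the lower index
    (an assumption of this sketch)."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    return set(order[:capacity])


def end_of_day_karma(karma, bids, fast):
    """Pay-bid-to-society: fast-lane commuters pay their bid, and the
    collected karma is redistributed uniformly to all commuters."""
    paid = sum(bids[i] for i in fast)
    share = paid / len(karma)
    return [k - (bids[i] if i in fast else 0) + share
            for i, k in enumerate(karma)]


# Four commuters in one time window, fast-lane capacity of two.
karma = [10, 10, 10, 10]
bids = [3, 1, 4, 0]
fast = allocate_fast_lane(bids, capacity=2)
print(sorted(fast), end_of_day_karma(karma, bids, fast))
```

The redistribution returns exactly what was collected, so the total karma in the population is unchanged at the end of the day.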
  28. The bottleneck model in the classical vs. karma world

    Classical model: [equation not captured in transcript; its terms are labeled Value of Time (VOT), queuing delay, schedule early delay, schedule late delay, monetary toll]

    Notation:
      [·] for discrete quantities
      (·) for continuous quantities
      u - urgency / Value of Time
      t - departure time
      t* - desired arrival time
      tq - queuing delay
      β, γ - delay sensitivities
  29. The bottleneck model in the classical vs. karma world

    There is no monetary term in the karma cost function!
    Classical model: [equation not captured in transcript; its terms are labeled Value of Time (VOT), queuing delay, schedule early delay, schedule late delay, monetary toll]
    Karma model: [equation not captured in transcript]

    Notation:
      [·] for discrete quantities
      (·) for continuous quantities
      u - urgency / Value of Time
      t - departure time
      t* - desired arrival time
      tq - queuing delay
      β, γ - delay sensitivities
      d - state distribution
      π - policy
      (d, π) - social state
  30. Homogeneous commuters

    CARMA is as efficient as TOLL: high VOT commuters enter fast lane

    Legend:
      CARMA - Karma solution
      TOLL - Optimal monetary toll
      NOM - No policy intervention
      ul - Low urgency/VOT
      uh - High urgency/VOT
  31. Low-income vs high-income groups

    - TOLL: low income commuters never enter fast lane
    - CARMA: equal opportunity of entering fast lane

    Legend:
      CARMA - Karma solution
      TOLL - Optimal monetary toll
      NOM - No policy intervention
      τ1 - Low income group
      τ2 - High income group
  32. Time-dependent karma redistribution

    CARMA more efficient than TOLL: less congestion in slow lane

    Legend:
      CARMA - Karma solution
      TOLL - Optimal monetary toll
      NOM - No policy intervention
      ul - Low urgency/VOT
      uh - High urgency/VOT
  33. Menu

    • Motivation
    • A "simple" resource sharing problem
    • Dynamic population game model
    • Application to traffic congestion management
    • Conclusions
  34. Karma systems for socially responsible resource sharing

    • Stop using money as a design tool when allocating shared resources!
    • Karma mechanisms provide an alternative.
    • Many open questions for karma:
      How to learn the Stationary Nash Equilibrium?
      Do different systems need different karma accounts?
      Can we compose karma systems and preserve fairness?
    • We also realized that fairness is high dimensional (equal access/opportunity, reparation, …)
      What is the right term to add to our cost functions?
  35. Conclusions

    • Karma creates economies of favors: updating the homo sapiens reciprocity advantage to large scale, automation-mediated interactions
    • Karma is fair because it is closed and regulated
      You cannot buy Karma with other wealth
      Karma is exchanged based on strict rules
    • Karma achieves high efficiency and fairness for a population of self-interested players
      … playing an equilibrium of a dynamic population game (no coordination)
      … yet players act as if they were altruistic and consider the reputation of other agents.

    Legend:
      PBP - Karma: Pay Bid to Peer
      PBS - Karma: Pay Bid to Society
      COIN - Baseline coin toss
      TURN - Simple turn-taking
      MONEY - Monetary market
  36. Andrea Censi, Saverio Bolognani, Florian Dörfler, and Emilio Frazzoli, with Carlo Cenedese, Kenan Zhang, and John Lygeros