Decisions: Week 4

William Lowe, Data Science Lab, Hertie School

The role of utilities in making decisions
How to think about your future
How to think about other people’s future
How people actually think about the future
Making decisions in groups
Case study
Trouble with group decision making

Representation Theorem (Von Neumann & Morgenstern): if you have preferences over outcomes that are
→ complete
→ reflexive
→ transitive
then they can be represented numerically with a utility function $U(\cdot)$.

Outcomes of actions are uncertain. Represent them as $P(O_i \mid A = a)$
→ a ‘lottery’ in the decision theory terminology

Pick the value of $A$ with the highest expected value

$EU(a) = \sum_i U(O_i) \, P(O_i \mid A = a)$
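A minimal sketch of the expected utility rule, assuming a toy problem with two actions and three outcomes. The action names, outcome names, utilities and probabilities are all invented for illustration and do not come from the lecture.

```python
# Hypothetical utilities U(O_i) over three outcomes (illustrative numbers only).
utility = {"good": 10.0, "neutral": 0.0, "bad": -20.0}

# Hypothetical lotteries P(O_i | A = a): one distribution over outcomes per action.
lotteries = {
    "act": {"good": 0.7, "neutral": 0.1, "bad": 0.2},
    "wait": {"good": 0.2, "neutral": 0.7, "bad": 0.1},
}


def expected_utility(lottery, utility):
    """EU(a) = sum_i U(O_i) * P(O_i | A = a)."""
    return sum(utility[outcome] * prob for outcome, prob in lottery.items())


# Expected utility of each action, then the action that maximizes it.
eu = {action: expected_utility(lottery, utility) for action, lottery in lotteries.items()}
best_action = max(eu, key=eu.get)
print(eu, best_action)
```

With these made-up numbers the riskier action still comes out ahead, because its higher chance of the good outcome outweighs its higher chance of the bad one.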

It’s not clear that $P(O_i \mid A = a)$ is quite what we want here. Strictly it might be better to work with $P(O_{A=a})$, a.k.a. $P(O_i \mid do(a))$: the distribution of $O$ given that we do, or set, $A$ to $a$, not just see $A = a$ (Pearl).

An expected utility formulated this way gives us causal decision theory
→ Probably that was the one you had in mind to start with! (I hope)

For an example of where we might get different answers depending on this choice, see any discussion of “Newcomb’s Problem”.
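To see how seeing and doing can come apart, here is a small sketch under a strong assumption: a single hidden common cause Z drives both the action A and the outcome O, and the action itself has no causal effect. All probabilities are hypothetical, and the interventional quantity is computed with the standard back-door adjustment over Z, which is only valid because Z is assumed to be the sole confounder.

```python
# Hypothetical model: Z is a common cause of the action A and the outcome O.
p_z = {0: 0.5, 1: 0.5}            # P(Z = z)
p_a1_given_z = {0: 0.2, 1: 0.8}   # P(A = 1 | Z = z)
# P(O = 1 | A = a, Z = z): by construction it depends on Z only, not on A.
p_o1_given_az = {(0, 0): 0.3, (0, 1): 0.9, (1, 0): 0.3, (1, 1): 0.9}


def p_a_given_z(a, z):
    """P(A = a | Z = z) for binary A."""
    return p_a1_given_z[z] if a == 1 else 1.0 - p_a1_given_z[z]


def p_o1_seeing(a):
    """P(O = 1 | A = a): condition on having observed the action."""
    p_a = sum(p_a_given_z(a, z) * p_z[z] for z in p_z)
    joint = sum(p_o1_given_az[(a, z)] * p_a_given_z(a, z) * p_z[z] for z in p_z)
    return joint / p_a


def p_o1_doing(a):
    """P(O = 1 | do(A = a)) via back-door adjustment: sum_z P(O=1 | a, z) P(z)."""
    return sum(p_o1_given_az[(a, z)] * p_z[z] for z in p_z)


for a in (0, 1):
    print(a, p_o1_seeing(a), p_o1_doing(a))
```

Observing A = 1 is good news about O here, but setting A = 1 changes nothing, so an expected utility built on the conditional and one built on the intervention would rank the actions differently.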

Many (all) policy decision problems have future consequences
→ How to weigh them relative to the present consequences?

Simple theory:
→ A reward $r$, $\tau$ time steps into the future, is worth $U(r) D(\tau)$
→ where $D$ is the discount function
→ in simple cases $D(\tau) = \exp(-\lambda \tau)$
→ where $\lambda$ is a discount rate

This works rather like a (negative) interest rate, $\delta^\tau$ with $0 \le \delta \le 1$
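A short sketch of exponential discounting, assuming a hypothetical discount rate of 0.05 per time step. It checks numerically that $\exp(-\lambda\tau)$ is the same thing as a per-step discount factor $\delta = \exp(-\lambda)$ raised to the power $\tau$.

```python
import math


def discount_exponential(tau, lam):
    """D(tau) = exp(-lam * tau)."""
    return math.exp(-lam * tau)


def present_value(reward_utility, tau, lam):
    """Value today of a reward with utility U(r) arriving tau steps ahead."""
    return reward_utility * discount_exponential(tau, lam)


lam = 0.05                # hypothetical discount rate
delta = math.exp(-lam)    # equivalent per-step discount factor, between 0 and 1

for tau in (0, 1, 10, 50):
    print(tau, present_value(100.0, tau, lam), 100.0 * delta ** tau)
```

The two printed columns agree, which is why the rate and the factor can be used interchangeably.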

Case: climate change
→ How much are future (not yet existing) lives (and their quality) worth relative to current ones?

Two (separable?) considerations:
→ The effect of an action on the existing population (or its quality of life) t years in the future
→ The effect on populations that would exist (or the quality of their lives) t years in the future, some of whom would not exist (or whose life quality would be different) depending on the action

How to think about this problem?

(Note: however you feel about discount rates, policy behaviour embodies an implicit rate)

As far as we are aware no animals discount exponentially
→ Why?

Irrationality? Maybe. Maybe not. Let’s see what they do instead.

Human discounting is usually not exponential but hyperbolic and time sensitive. Something more like $1 / (1 + k\tau)$, with unintuitive and apparently irrational consequences
→ preference reversal!

For an interpretation in terms of ‘weakness of the will’, see Ainslie.
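Preference reversal is easy to reproduce numerically. The sketch below assumes a hyperbolic discount function $1/(1 + k\tau)$ with $k = 1$, an exponential comparison with rate 0.2, and a made-up choice between a smaller reward now and a larger reward three steps later. All of these numbers are illustrative assumptions.

```python
import math


def hyperbolic(tau, k=1.0):
    """Hyperbolic discount function 1 / (1 + k * tau)."""
    return 1.0 / (1.0 + k * tau)


def exponential(tau, lam=0.2):
    """Exponential discount function exp(-lam * tau)."""
    return math.exp(-lam * tau)


# Hypothetical choice: SMALL arrives after `delay`, LARGE arrives GAP steps later.
SMALL, LARGE, GAP = 10.0, 15.0, 3.0


def prefers_large(discount, delay):
    """True if the discounted larger-later reward beats the smaller-sooner one."""
    return LARGE * discount(delay + GAP) > SMALL * discount(delay)


for delay in (0, 10):
    print(delay, prefers_large(hyperbolic, delay), prefers_large(exponential, delay))
```

The hyperbolic chooser takes the small reward when it is immediate but prefers to wait for the large one when both are pushed ten steps into the future. The exponential chooser makes the same choice at both horizons, because only the gap between the two rewards matters to it.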

One way to understand what may be happening is to return to the last lecture on representing uncertainty.

Sozou argues
→ Exponential discounting implies a belief in a known and constant hazard rate $\lambda$
→ hazard rate: the probability that the future good fails to appear as expected at $t + \tau$ despite being there at $t$
→ So consider priors over $\lambda$

And now posteriors over $\lambda$.

The exponential prior generates discounting behaviour that is hyperbolic in exactly the functional form we saw before.

What does the exponential prior mean? And what does this have to do with marshmallows?
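The step from an exponential prior to hyperbolic discounting can be written out in a few lines. This is a sketch assuming the prior over the hazard rate is exponential with mean $k$, a parameterization chosen here so that the result matches the $1/(1 + k\tau)$ form. The effective discount function is the exponential survival probability averaged over the prior.

```latex
\begin{aligned}
p(\lambda) &= \frac{1}{k}\exp\!\left(-\frac{\lambda}{k}\right), \qquad \lambda \ge 0 \\
D(\tau) &= \int_0^\infty e^{-\lambda\tau}\, p(\lambda)\, d\lambda
         = \frac{1}{k}\int_0^\infty \exp\!\left(-\lambda\left(\tau + \tfrac{1}{k}\right)\right) d\lambda \\
        &= \frac{1}{k}\cdot\frac{1}{\tau + 1/k} = \frac{1}{1 + k\tau}
\end{aligned}
```

So someone who discounts exponentially for every given hazard rate, but is uncertain which rate applies, behaves like a hyperbolic discounter.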

Marshmallows: Americans eat these things. Nobody really knows why.

Mischel and Ebbesen gave children the following situation:
→ Favourite snack on a chair
→ Told they could eat it
→ But if they waited for some minutes then they could have a second one

Results
→ A small minority ate the snack immediately
→ Some delayed long enough to get the second
→ Age predicted ability to delay

Follow up: “preschool children who delayed gratification longer [...] were described [...] as adolescents who were significantly more competent”

Later follow up: SAT was a predictor

Self control with snack predicts SAT and other positive outcomes!

or

Reliable parental environment makes it rational to expect rewards to come and makes self-control a better idea

Please don’t make graphs like this (it was a while ago, but still)

We need to weight the future somehow
→ There are mathematical problems with putting positive weight on the infinite future
→ Rational choices for the discount rate seem...under-constrained
→ For personal utilities the problem can seem straightforward
→ For utilities that range over others (real or potential) things get...difficult
→ Unfortunately these are the important policy cases
→ Temporal discounting involves questions of policy, psychology, and fairness

Now, about all those people we were making decisions about. Maybe they should be involved in the process?

What is the ‘will’ of the people? (government, company, organization)

Operationally: How to aggregate individual preferences?

Turns out this is not so straightforward. Consider the problem of choosing a pope (Maltzman et al.)

For centuries the pope was chosen by
→ God
→ a supermajority vote among the cardinals
→ All of the above

John Paul II changed the rules (Apostolic Constitution, ‘Universi Dominici Gregis’)
→ a simple majority and the possibility of a runoff vote

Cardinal Ratzinger . . . Benedict XVI

Pope selection had been historically difficult and controversial
→ One conclave produced two outcomes (the pope and the anti-pope)
→ a cardinal grabbed the papal coat and tried to run off with it, afterwards reigning as Victor IV
→ Deadlock in Viterbo prompted the locals to remove the roof and put the cardinals on bread and water

This hasn’t happened recently, but it could...

Pairwise majority contests among three candidates M, B and R:

contest   wins
M vs B    M
M vs R    R
B vs M    M
B vs R    B
R vs B    B
R vs M    R

Here there is no Condorcet winner
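A small sketch of how such a cycle can arise. The three ballots below are a hypothetical profile, not the lecture's data, chosen so that M beats B, B beats R and R beats M in pairwise majority votes.

```python
# Hypothetical ballots (most preferred first) for candidates M, B and R.
ballots = [
    ("M", "B", "R"),
    ("B", "R", "M"),
    ("R", "M", "B"),
]
candidates = ("M", "B", "R")


def pairwise_winner(x, y, ballots):
    """Return whichever of x, y a strict majority ranks higher, or None on a tie."""
    x_votes = sum(ballot.index(x) < ballot.index(y) for ballot in ballots)
    y_votes = len(ballots) - x_votes
    if x_votes == y_votes:
        return None
    return x if x_votes > y_votes else y


def condorcet_winner(candidates, ballots):
    """Return the candidate who wins every pairwise contest, or None if nobody does."""
    for c in candidates:
        if all(pairwise_winner(c, other, ballots) == c for other in candidates if other != c):
            return c
    return None


for x, y in (("M", "B"), ("M", "R"), ("B", "R")):
    print(x, "vs", y, "->", pairwise_winner(x, y, ballots))
print("Condorcet winner:", condorcet_winner(candidates, ballots))
```

Each voter's ranking is perfectly transitive, yet the majority relation goes round in a circle.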

Complete and transitive preferences over ≥ 3 options can lead to cycles in aggregate preferences, i.e. majority preferences are cyclical.

Specific example:
→ There need be no overall winner using a majority voting rule

Generalized by Ken Arrow into his Impossibility Theorem (Arrow). Any aggregation method for preferences that has all the following properties
→ Non-dictatorial: The wishes of multiple cardinals should be taken into consideration
→ Unrestricted Domain: Voting must account for all individual preferences / cover all the options
→ Pareto Optimal: Unanimous individual preferences must be respected: If every cardinal prefers cardinal A over cardinal B, cardinal A should be the pope
→ Independent of Irrelevant Alternatives: If a candidate is removed or drops out, then the others’ order should not change
can have cycles.

(You will see lots of slightly different versions of these, depending on the proof)

John Paul II adjusted the rules to
→ Remove ‘unrestricted domain’
→ Allow a bare majority after a number of rounds of voting

In principle, changing the number of deliberation cycles before a supermajority was no longer required does not matter.

Anecdotally, with some cardinals abstaining
→ In one round Ratzinger gets votes
→ In a later round Ratzinger gets votes (but no supermajority)
→ In a still later round there is a clear winner

Interpretation:
→ Cycles can be real
→ Cycles can represent the lack of a unique behavioural prediction

Note:
→ We are not guaranteed cycles. That depends on a lot of things, including the preferences of the individuals who are being aggregated

Most theorists get around this result by assuming more structure on preferences
→ i.e. structured utility functions

The most natural form of structured utility function is spatial.

How and how fast does utility decrease from the ideal point $x^*$?
→ Linear: $-|x - x^*|$
→ Quadratic: $-(x - x^*)^2$
→ Gaussian: $\exp\left(-(x - x^*)^2 / 2\sigma^2\right)$

Interpretations:
→ Gaussian implies an ‘alienation from indifference’ and is probably empirically more accurate (Carroll et al.)

Black showed that if preferences were
→ single dimensional
→ single peaked
then majority choices are well behaved (no cycles).

Special case:
→ Majority vote converges on the median voter
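The median voter result can be checked on a toy example. This sketch assumes five hypothetical ideal points on a line and a linear (distance-based) utility, which is single peaked. It then confirms that the median ideal point beats every alternative on a grid in a pairwise majority vote.

```python
import statistics

# Hypothetical ideal points of five voters on a single policy dimension.
ideal_points = [0.1, 0.3, 0.4, 0.7, 0.9]


def utility(x, ideal):
    """Single-peaked utility: falls off linearly with distance from the ideal point."""
    return -abs(x - ideal)


def majority_prefers(x, y, ideal_points):
    """True if strictly more voters prefer x to y than prefer y to x."""
    for_x = sum(utility(x, i) > utility(y, i) for i in ideal_points)
    for_y = sum(utility(y, i) > utility(x, i) for i in ideal_points)
    return for_x > for_y


median = statistics.median(ideal_points)

# Challengers: a grid of alternative positions, excluding the median itself.
challengers = [k / 20 for k in range(21) if abs(k / 20 - median) > 1e-9]
median_always_wins = all(majority_prefers(median, y, ideal_points) for y in challengers)
print(median, median_always_wins)
```

Any move away from the median loses the median voter plus everyone on the other side, which is always a majority when the number of voters is odd.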

We can have preferences on several dimensions at once.

In this (radially symmetrical) case, majority rule also converges on the median voter
→ yay!

But not so fast. It doesn’t work without the radial structure (Plott).

McKelvey showed that in general multi-dimensional settings there need be ‘no stable equilibrium’ at all, i.e. cycling

McKelvey’s ‘chaos theorem’ shows that by careful choice of comparisons, a smart agenda controller can offer a series of votes that move naive spatial voters to support anything she wants.

These seem to be rather dumb voters who have not taken a social choice course. Can we design a system where it’s best to be as honest as these folk?

Not usually.

Gibbard-Satterthwaite theorem (informally): For a group of individuals choosing among ≥ 3 alternatives (with no constraints on preferences and a non-dictatorial aggregation function) there is always some set of preferences it’s worth lying about (Gibbard; Satterthwaite)

There’s just no avoiding strategy...

Riker noted that it was often possible to add a dimension in order to manipulate the outcome. More generally ‘the heresthetic’ is a type of political strategy, emphasizing
→ agenda control
→ strategic voting
→ manipulations of dimensions
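One concrete instance of a preference worth lying about, sketched with a Borda count. The choice of Borda, the four candidates, the three ballots and the alphabetical tie-break are all assumptions made for illustration. The theorem itself is far more general than this one rule.

```python
def borda_winner(ballots):
    """Borda count: with n candidates, rank r (0 = top) earns n - 1 - r points."""
    scores = {}
    for ballot in ballots:
        n = len(ballot)
        for rank, candidate in enumerate(ballot):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - rank)
    best = max(scores.values())
    # Alphabetical tie-break: an arbitrary assumption, not needed for this example.
    return sorted(c for c, s in scores.items() if s == best)[0]


# Hypothetical sincere preferences; voter 3 truly prefers B to A.
voter_1 = ("A", "B", "C", "D")
voter_2 = ("A", "B", "C", "D")
voter_3_sincere = ("B", "A", "C", "D")
# Voter 3 'buries' A at the bottom of the reported ranking.
voter_3_strategic = ("B", "C", "D", "A")

honest_outcome = borda_winner([voter_1, voter_2, voter_3_sincere])
strategic_outcome = borda_winner([voter_1, voter_2, voter_3_strategic])
print(honest_outcome, strategic_outcome)
```

Reporting honestly, voter 3 gets A. By misreporting, voter 3 gets B, which they sincerely prefer, so honesty is not their best reply under this rule.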

Being a rational group and making rational group decisions may be harder than you think.

Lots of possibilities for making it easier
→ Restrict the choice domain: dictators, committees, departments, runoffs
→ Induce structured utilities, e.g. elite leadership, psychological heuristic

Shot: The concept of rational group preferences may just be a mistake (or an impossibility)?

Chaser: You are often, in a psychological sense, a group decision maker and may suffer the same problems making decisions

(Utility elicitation may be as hard as probability elicitation)

References

Ainslie, G. ‘Breakdown of will’. Cambridge University Press.
Arrow, K. J. ‘Social choice and individual values’. John Wiley & Sons.
Black, D. ‘On the rationale of group decision-making’. Journal of Political Economy.
Carroll, R., Lewis, J. B., Lo, J., Poole, K. T. & Rosenthal, H. ‘The structure of utility in spatial models of voting’. American Journal of Political Science.
Gibbard, A. ‘Manipulation of voting schemes: A general result’. Econometrica.
Maltzman, F., Schwartzberg, M. & Sigelman, L. ‘Vox populi, vox dei, vox sagittae’. PS - Political Science & Politics.
McKelvey, R. D. ‘Intransitivities in multidimensional voting models and some implications for agenda control’. Journal of Economic Theory.
Mischel, W. & Ebbesen, E. B. ‘Attention in delay of gratification’. Journal of Personality and Social Psychology.
Plott, C. R. ‘A notion of equilibrium and its possibility under majority rule’. The American Economic Review.
Riker, W. H. ‘The art of political manipulation’. Yale University Press.
Satterthwaite, M. A. ‘Strategy-proofness and Arrow’s conditions: Existence and correspondence theorems for voting procedures and social welfare functions’. Journal of Economic Theory.
Von Neumann, J. & Morgenstern, O. ‘Theory of games and economic behavior’. Princeton University Press.
