Will Lowe
April 13, 2022

# Data Science and Decision Making 2022: Week 7

## Transcript

1. ### DATA SCIENCE AND DECISION MAKING

What is this thing called bias?

William Lowe, Hertie School Data Science Lab, 2022-04-13
2. ### PLAN

→ Three kinds of fairness
→ What can be fair? ...and with respect to what?

→ Traditional classifier performance measures

→ The intuitive but ineffective strategy
→ Recidivism prediction, and a problem
→ Counterfactual fairness

→ Case studies: administrative data science
Spot the problem
6. ### TERMINOLOGY: BIAS

An estimator $\hat{\delta}$ for $\delta$ is biased if, for fixed $N$,

$$E_N[\hat{\delta}] - \delta \neq 0$$

even though it might be consistent, i.e. $E_N[\hat{\delta}] \to \delta$ as $N \to \infty$.

Bias in asymptotically unbiased estimators like Maximum Likelihood reduces with larger $N$.

Importantly, performance, e.g. mean squared error

$$\mathrm{MSE} = E_N[(\hat{\delta} - \delta)^2] = E_N[(\hat{\delta} - E_N[\hat{\delta}])^2] + (E_N[\hat{\delta}] - \delta)^2$$

can often be improved by adding bias (it decreases estimator variance, the first term, by more than it adds to the squared bias, the second).

An empirical estimand does not identify a causal estimand when

$$E[Y \mid X] - E[Y(X)] \neq 0$$

This is a population / mechanism property, so increasing $N$ will not help.

The kinds of bias we run into in policy contexts may involve any combination of
→ statistical bias, e.g. stereotypes
→ causal bias, e.g. institutional prejudice
→ accurate but ‘undesirable’ inferences
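
The claim that adding bias can lower mean squared error is easy to check by hand. The sketch below is an illustration only: it assumes a shrinkage estimator $c\bar{X}$ of a mean $\delta$ from $N$ independent draws with variance $\sigma^2$, and uses the decomposition MSE = variance + bias². The function name `shrinkage_mse` and all numbers are hypothetical, not from the lecture.

```python
def shrinkage_mse(c: float, delta: float, sigma2: float, n: int) -> dict:
    """Exact bias, variance and MSE of the estimator c * sample_mean.

    Assumes n independent draws with mean delta and variance sigma2.
    """
    bias = (c - 1.0) * delta            # E[c * mean] - delta
    variance = c ** 2 * sigma2 / n      # Var[c * mean]
    return {"bias": bias, "variance": variance, "mse": variance + bias ** 2}


# c = 1 is the unbiased sample mean; c = 0.8 shrinks it towards zero.
unbiased = shrinkage_mse(c=1.0, delta=1.0, sigma2=4.0, n=10)
shrunk = shrinkage_mse(c=0.8, delta=1.0, sigma2=4.0, n=10)

print("unbiased:", unbiased)
print("shrunk:  ", shrunk)
```

With these made-up values the shrunk estimator is biased but has the smaller MSE, because the drop in variance outweighs the squared bias. With a larger $N$ or a larger $\delta$ the comparison can flip.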

8. ### THREE KINDS OF FAIRNESS

Fairness motivated mathematics from the start.

See al-Khwārizmī’s hit textbook from the 9th century CE
→ The compendious book on calculation by completion and balancing

A chapter is devoted to establishing ‘fair division’ in inheritance problems.

al-Jabr (algebra) is all about maintaining equalities.

A mechanism, e.g. an allocation, is fair with respect to characteristic $A$ if

$$Y(a) = Y(a')$$

(Pictured: Muhammad ibn Mūsā al-Khwārizmī and al-Jabr)

11. ### THREE KINDS OF FAIRNESS

“The world is cruel, and the only morality in a cruel world is chance”

“It’s not about what I want, it’s about what’s fair!”

Harvey Dent

Fairness ≈ equal probabilities

A mechanism is fair with respect to characteristic $A$ if

$$P(Y \mid A = a) = P(Y \mid A = a')$$
12. ### THREE KINDS OF FAIRNESS

Fairness ≈ causal inefficacy

$$P(Y(A=a)) = P(Y(A=a'))$$
13. ### DIFFERING IN THE COUNTERFACTUALS

The effectiveness of multilateral UN operations in civil wars (Doyle & Sambanis). Reexamined by King and Zeng, with a response (Sambanis & Doyle).

17. ### WHAT CAN BE FAIR?

→ Mechanisms, rules, procedures, decision procedures
→ Allocations, enforcement, outcomes, decisions

People and organisations have rules and make decisions
→ Decisions are made according to, mostly according to, or despite the rules
→ Rules may be internally inconsistent and require balancing or weighting (looking at you, lawyers)

We won’t have much to say about implementation issues here... [gestures in the direction of all Hertie]
19. ### WHAT CAN BE FAIR

It is often argued that these issues are made worse by the presence of ‘algorithmic’ or ML decision-making tools
→ All explicit decision-making processes are algorithms
→ Removing the human element helps theorizing (even if it hinders other things)
→ Some of the best work on fairness currently happens in Computer Science departments
→ In the field of algorithmic fairness (Barocas et al.)

(Pictured: automated decision making systems, German style; fax machine not shown)
23. ### FAIRNESS WITH RESPECT TO WHAT?

→ Variables, e.g. gender, race, etc.
→ Measurable on the individual level
→ Often aggregated to groups

Broadly we can define fairness for
→ individuals
→ groups

→ Outcomes, e.g. $\hat{Y}_i$
→ Implementation, e.g. conditioning variables, internal rules

→ $Y$: the outcome, e.g. loan-worthiness, recidivism
→ $\hat{Y}$: a prediction of $Y$, e.g. probability (or amount) of eventual loan repayment, whether caught committing another crime
→ $X$: a non-protected characteristic, e.g. criminal record
→ $A$: a protected characteristic
→ $U$: non-protected but unobserved characteristics that might also predict $Y$

$\hat{Y}$ is a function of $X$, $A$, or both. Often thresholded at $\tau$ to make a decision.
27. ### TRADITIONAL PERFORMANCE MEASURES

Classifiers make probabilistic predictions

$$\hat{P} \rightarrow E[Y \mid X, \ldots] = P(Y = 1 \mid X, \ldots)$$

which are converted to decisions by comparing to a threshold $\tau$

$$\hat{Y} = \begin{cases} 1 & \text{if } \hat{P} > \tau \\ 0 & \text{otherwise} \end{cases}$$

$$P(Y = 1 \mid \hat{Y} = 1) \quad \text{(Precision)}$$

$$P(\hat{Y} = 1 \mid Y = 1) \quad \text{(Recall)}$$

Precision and recall are related by Bayes theorem, obvs.

$$P(\hat{Y} = 1 \mid Y = 1) = \frac{P(Y = 1 \mid \hat{Y} = 1)\, P(\hat{Y} = 1)}{P(Y = 1)}$$

sometimes this can be surprisingly useful (King & Lowe)

A probability estimator is calibrated when

$$P(Y = 1 \mid \hat{P} = p) = p$$

so $Y = 1$ in $100p$% of cases where $\hat{P} = p$ (for all $p$).

Calibrated classifiers don’t have to be any good, they have to know how good they are.
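
To make these quantities concrete, here is a minimal sketch that computes precision and recall from confusion-matrix counts, checks the Bayes relation between them, and tabulates calibration for a toy set of scores. The helper names (`rates`, `calibration_table`) and all counts are assumptions made up for illustration.

```python
def rates(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall and the two marginals from confusion-matrix counts."""
    n = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp),    # P(Y=1 | Yhat=1)
        "recall": tp / (tp + fn),       # P(Yhat=1 | Y=1)
        "p_pred_pos": (tp + fp) / n,    # P(Yhat=1)
        "prevalence": (tp + fn) / n,    # P(Y=1)
    }


def calibration_table(scores, outcomes) -> dict:
    """Observed rate of Y=1 for each distinct predicted probability."""
    buckets = {}
    for p, y in zip(scores, outcomes):
        buckets.setdefault(p, []).append(y)
    return {p: sum(ys) / len(ys) for p, ys in buckets.items()}


r = rates(tp=30, fp=10, fn=20, tn=40)
# Bayes: recall = precision * P(Yhat=1) / P(Y=1)
recall_via_bayes = r["precision"] * r["p_pred_pos"] / r["prevalence"]
print(r["recall"], recall_via_bayes)

# Toy scores: five cases scored 0.2 and five scored 0.8.
scores = [0.2] * 5 + [0.8] * 5
outcomes = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
table = calibration_table(scores, outcomes)
print(table)
```

In the toy data the observed rate of $Y = 1$ matches the score in each bucket, so the scores are calibrated even though individual predictions are often wrong.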

29. ### STATISTICAL VS TASTE-BASED DISCRIMINATION

→ ‘taste-based’: prefer $A = a$ to $A = a'$
→ ‘statistical’: prefer $U = u$ to $U = u'$ using $A$ as a predictor

Agan and Starr
→ fictitious job applications
→ employers in NJ and NYC
→ varying race and felony convictions
→ measured the callback rates

[Figure: callback rates before legislation (with and without the box)]

[Figure: callback rates for employers that removed the box to comply with legislation]
33. ### DESIDERATA

$$P(Y = 1 \mid \hat{P} = p, A = 0) = P(Y = 1 \mid \hat{P} = p, A = 1)$$

No requirement that $P(Y = 1 \mid \hat{P} = p) = p$
→ Equal (mis)calibration across levels of $A$

Motivation:
→ $\hat{P}$ should mean the same thing across groups

Relatedly, we could also ask for equal precision

$$P(Y = 1 \mid \hat{Y} = 1, A = 0) = P(Y = 1 \mid \hat{Y} = 1, A = 1)$$

Equal false positive rate

$$P(\hat{Y} = 1 \mid Y = 0, A = 0) = P(\hat{Y} = 1 \mid Y = 0, A = 1)$$

Equal false negative rate

$$P(\hat{Y} = 0 \mid Y = 1, A = 0) = P(\hat{Y} = 0 \mid Y = 1, A = 1)$$

Motivation:
→ Classification errors should not vary across groups

→ Equal calibration is (basically) about precision
→ Equal error rates are (basically) about recall
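
These desiderata can be checked directly from one confusion matrix per group. The sketch below uses hypothetical counts for two levels of $A$; the function name `group_rates` and the numbers are assumptions chosen so that precision and the false negative rate match across groups while prevalence does not.

```python
def group_rates(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision and both error rates for one group."""
    return {
        "precision": tp / (tp + fp),   # P(Y=1 | Yhat=1, A)
        "fpr": fp / (fp + tn),         # P(Yhat=1 | Y=0, A)
        "fnr": fn / (fn + tp),         # P(Yhat=0 | Y=1, A)
    }


# Hypothetical counts; prevalence is 50% for A=0 and 20% for A=1.
counts = {
    0: dict(tp=40, fp=10, fn=10, tn=40),
    1: dict(tp=16, fp=4, fn=4, tn=76),
}

by_group = {a: group_rates(**c) for a, c in counts.items()}
gaps = {k: by_group[1][k] - by_group[0][k] for k in by_group[0]}

for a, rates_a in by_group.items():
    print(a, rates_a)
print("gaps (A=1 minus A=0):", gaps)
```

Here the precision and false negative rate gaps are zero but the false positive rates differ, which is the kind of pattern at issue in the recidivism case study.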
34. ### CASE STUDY

ProPublica (Larson et al.) noted that a commercial recidivism prediction tool, COMPAS, had quite different error rates by race.
35. ### CONTROVERSY

The company involved responded: sure, but the classifier is well-calibrated.
38. ### A FUNDAMENTAL PROBLEM

When recidivism prevalence differs, i.e.

$$P(Y = 1 \mid A = a) \neq P(Y = 1 \mid A = a')$$

then provably we cannot have equal calibration and both error rates equal (Chouldechova; Kleinberg et al.)

“if an instrument satisfies predictive parity – that is, if the PPV [i.e. precision] is the same across groups – but the prevalence differs between groups, the instrument cannot achieve equal false positive and false negative rates [i.e. recall] across those groups.” (Chouldechova)

This should not surprise us
→ Recall and precision are related through Bayes theorem
→ Re-weighting depends on prevalence $P(Y = 1)$!

Statistical reconstructions of intuitive notions of fairness seem incomplete and/or inconsistent

→ Nihilism: fairness is incoherent, so choose your poison
→ Optimism: there is a coherent notion of fairness but we haven’t got it yet
→ Causal inference! (Kusner, Loftus, et al.)
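
The incompatibility can be seen with a few lines of arithmetic. From the definitions of the confusion-matrix quantities, prevalence $p$, PPV, FNR and FPR are tied together by $\mathrm{FPR} = \frac{p}{1-p}\cdot\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\cdot(1-\mathrm{FNR})$. The sketch below plugs in hypothetical numbers; `implied_fpr` is an illustrative helper, not something from the lecture or the cited papers.

```python
def implied_fpr(prevalence: float, ppv: float, fnr: float) -> float:
    """False positive rate forced by prevalence, PPV (precision) and FNR.

    Derivation: TP = p * (1 - FNR), FP = (1 - p) * FPR and
    PPV = TP / (TP + FP); solve for FPR.
    """
    return prevalence / (1 - prevalence) * (1 - ppv) / ppv * (1 - fnr)


# Same precision and same false negative rate in both groups,
# but different (hypothetical) prevalence.
fpr_a = implied_fpr(prevalence=0.5, ppv=0.8, fnr=0.2)
fpr_b = implied_fpr(prevalence=0.2, ppv=0.8, fnr=0.2)

print("FPR, prevalence 0.5:", fpr_a)
print("FPR, prevalence 0.2:", fpr_b)
```

Once PPV and FNR are held equal, the false positive rate is determined by prevalence alone, so unequal prevalence forces unequal false positive rates.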
41. ### COUNTERFACTUAL FAIRNESS

$$P(\hat{Y}_i(A=a) \mid A_i = a) = P(\hat{Y}_i(A=a') \mid A_i = a)$$

$\hat{Y}_i$ is fair if it would not have been different had $A_i$ taken a different value.

This has a number of interesting properties:
→ individual-oriented: $A_i = a$ (not group-oriented: $A = a$)
→ exactly half outcome-oriented: $Y(A_i = a)$
→ Sometimes it is necessary for a prediction to condition on $A$ to be fair with respect to it

(With apologies to Simon Munzert who loves this meme more than any of us)
43. ### COUNTERFACTUAL FAIRNESS

[Figure: causal graph over A, X, Y, U]

From Kusner, Loftus, et al.
→ Race (A)
→ Car choice (X)
→ Speeding tendency (U)
→ Accidents (Y)

No causal effect of $A$ on $Y$, but some association.

Consider predicting accidents using one of

$$\hat{Y} = \beta_0 + X\beta_X \quad \text{(Model 1)}$$

$$\hat{Y} = \beta_0 + X\beta_X + A\beta_A \quad \text{(Model 2)}$$

Using Model 1 is counterfactually unfair
→ Holding $U$ constant, but changing $A$, changes $X$ which changes $\hat{Y}$

Using Model 2 is counterfactually fair
→ Holding $U$ constant, but changing $A$, still changes $X$ but this doesn’t change $\hat{Y}$ because X is controlled in it
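
The two models can be compared on a tiny made-up population. The sketch below assumes linear structural equations $X = \alpha A + \beta U$ and $Y = \gamma U$ with $A$ and $U$ independent and binary. These equations, the coefficient values and the helper names are all assumptions for illustration; the slide does not specify them. It fits both regressions by least squares and then flips $A$ while holding $U$ fixed.

```python
from itertools import product

ALPHA, BETA, GAMMA = 1.0, 2.0, 3.0  # assumed structural coefficients


def car(a, u):
    """X: car choice, caused by A and U (assumed linear)."""
    return ALPHA * a + BETA * u


def accidents(u):
    """Y: accidents, caused by U only (no effect of A)."""
    return GAMMA * u


def mean(v):
    return sum(v) / len(v)


def cov(v, w):
    mv, mw = mean(v), mean(w)
    return mean([(i - mv) * (j - mw) for i, j in zip(v, w)])


# Four equally likely (A, U) combinations, so A and U are independent.
population = list(product([0, 1], repeat=2))
A = [a for a, _ in population]
X = [car(a, u) for a, u in population]
Y = [accidents(u) for _, u in population]

# Model 1: least squares of Y on X.
b1_x = cov(X, Y) / cov(X, X)
b1_0 = mean(Y) - b1_x * mean(X)

# Model 2: least squares of Y on X and A (two-predictor normal equations).
det = cov(X, X) * cov(A, A) - cov(X, A) ** 2
b2_x = (cov(X, Y) * cov(A, A) - cov(A, Y) * cov(X, A)) / det
b2_a = (cov(A, Y) * cov(X, X) - cov(X, Y) * cov(X, A)) / det
b2_0 = mean(Y) - b2_x * mean(X) - b2_a * mean(A)


def model1(a, u):
    return b1_0 + b1_x * car(a, u)


def model2(a, u):
    return b2_0 + b2_x * car(a, u) + b2_a * a


# Counterfactual: hold U fixed, switch A from 0 to 1.
u_fixed = 1
change_model1 = model1(1, u_fixed) - model1(0, u_fixed)
change_model2 = model2(1, u_fixed) - model2(0, u_fixed)
print("Model 1 prediction change:", change_model1)
print("Model 2 prediction change:", change_model2)
```

Under these assumptions Model 1's prediction moves when $A$ is flipped, while in Model 2 the coefficient on $A$ exactly offsets the induced change in $X$, so the prediction stays put.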
46. ### COUNTERFACTUAL FAIRNESS

We are now (imho) at the state of the art:
→ Group-based fairness
→ Individual-based fairness
→ Counterfactual individual-based fairness

Maybe this approach is general. I hope so, but I’m biased...

According to Kusner et al. a sufficient (but not necessary) condition for fairness is
→ “Conditioning on non-children of A will always be fair”

Does not hold for some other definitions...

→ Counterfactually defined fairness is not entirely new
→ We met it with mediation analysis

In the previous lecture we thought about proving discrimination by establishing a Natural Direct Effect (NDE)
47. ### OPEN QUESTIONS

“The central question in any employment-discrimination case is whether the employer would have taken the same action had the employee been of a different race (age, sex, religion, national origin, etc.) and everything else had remained the same.” (Carson v. Bethlehem Steel Corp.)

Everything else? Lots of things mediate the effects of Gender!
→ ‘downstream’ effects, e.g. job type

You’ve probably met this kind of problem before, e.g. Bickel et al.

[Figure: causal graph over T, M, Y]

→ Gender (T)
→ Job type (M)
→ Outcome (Y)

Kusner et al. aren’t clear on the solution...
51. ### MODERN MEDIATION ANALYSIS

[Figure: causal graph over T, M, Y]

$Y(T, M)$ a.k.a. $Y(T, M(T))$

This is $Y$ as a function of
→ $M$, which is a function of $T$
→ $T$, via all routes that do not affect $M$

$$E[Y(1, M(1)) - Y(0, M(0))] \quad \text{(ATE)}$$

$$E[Y(1, m) - Y(0, m)] \quad \text{(CDE(M))}$$

$$E[Y(1, M(0)) - Y(0, M(0))] \quad \text{(NDE)}$$

$$E[Y(0, M(1)) - Y(0, M(0))] \quad \text{(NIE)}$$
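
A small sketch may help keep the four estimands apart. It assumes a noiseless linear structural model, $M(t) = a\,t$ and $Y(t, m) = c\,t + b\,m$, so every unit is identical and the expectations are trivial. The functions `m_of` and `y_of` and the coefficient values are hypothetical.

```python
def m_of(t, a=2.0):
    """M(t): the mediator's value under treatment t (assumed linear)."""
    return a * t


def y_of(t, m, c=1.0, b=0.5):
    """Y(t, m): the outcome under treatment t and mediator value m."""
    return c * t + b * m


# Total effect: treatment changes, and the mediator follows it.
ate = y_of(1, m_of(1)) - y_of(0, m_of(0))

# Controlled direct effect: mediator pinned to a chosen value m.
cde = {m: y_of(1, m) - y_of(0, m) for m in (0.0, 2.0)}

# Natural direct effect: mediator held at its untreated value M(0).
nde = y_of(1, m_of(0)) - y_of(0, m_of(0))

# Natural indirect effect: treatment held at 0, mediator moved to M(1).
nie = y_of(0, m_of(1)) - y_of(0, m_of(0))

print("ATE:", ate, "CDE:", cde, "NDE:", nde, "NIE:", nie)
```

In this linear case the controlled direct effect is the same at every $m$ and equals the natural direct effect, and the total effect splits as NDE plus NIE. Neither property is guaranteed once treatment and mediator interact.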
53. ### ACT NATURAL

What’s the difference between controlled and natural?

CDE(M)
→ Most policy targets
→ Experimentally identifiable (randomize T and M)

NDE (and NIE)
→ Strictly counterfactual, e.g. $Y_i(1, M_i(0))$ requires we think of ‘splitting’ subject $i$
→ Not experimentally identifiable (Robins & Greenland)

“Whereas the controlled direct effect is of interest when policy options exert control over values of variables (e.g., raising the level of a substance in patients’ blood to a prespecified concentration), ... the natural direct effect is of interest when policy options enhance or weaken mechanisms or processes (e.g., freezing a substance at its current level of concentration [for each patient], but preventing it from responding to a given stimulus).” (Pearl)
55. ### POLICY IMPLICATIONS

To prove discrimination plaintiffs usually need to show a positive direct effect of, e.g. gender (T) on outcome (Y)

“The central question in any employment-discrimination case is whether the employer would have taken the same action had the employee been of a different race (age, sex, religion, national origin, etc.) and everything else had remained the same.” (Carson v. Bethlehem Steel Corp.)

But which one?

[Figure: causal graph over T, M, Y]

In a linear system: $\mathrm{NDE} = \mathrm{CDE}(M) = \beta$

But if an employer prefers
→ T=men for the M=high paying jobs
→ T=women for M=low paying jobs

‘the’ direct effect depends on M!
58. ### NATURAL EFFECTS IDENTIFIED

In the absence of confounders, identification for

NDE: an average CDE(m), weighted by the probability of each $m$ value in the untreated population

$$\mathrm{NDE} = \sum_m \underbrace{E[Y(1, m) - Y(0, m)]}_{\mathrm{CDE}(m)} \, P(M = m \mid T = 0)$$

NIE: an average of $Y$’s responses to the $M$, weighted by $M$’s responsiveness to treatment

$$\mathrm{NIE} = \sum_m E[Y(0, m)] \underbrace{[P(M = m \mid T = 1) - P(M = m \mid T = 0)]}_{\text{Treatment's effect on } M}$$

All estimable using (several) regression models (Pearl)

[Figure: causal graph over T, M, Y]
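
The two formulas are simple weighted sums, so they can be evaluated from a table of conditional means and the mediator's distribution under each treatment. The sketch below uses a binary mediator and made-up numbers with a treatment-mediator interaction. `EY`, `PM1` and `p_m` are illustrative names, and the no-confounding assumption is what lets $E[Y \mid T=t, M=m]$ stand in for $E[Y(t, m)]$.

```python
# Hypothetical E[Y(t, m)], keyed by (t, m); m = 1 is the high paying job.
EY = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.5, (1, 1): 3.0}

# Hypothetical P(M = 1 | T = t).
PM1 = {0: 0.25, 1: 0.75}


def p_m(m: int, t: int) -> float:
    """P(M = m | T = t) for a binary mediator."""
    return PM1[t] if m == 1 else 1.0 - PM1[t]


# Controlled direct effect at each mediator value.
cde = {m: EY[1, m] - EY[0, m] for m in (0, 1)}

# NDE: CDE(m) averaged over the untreated mediator distribution.
nde = sum(cde[m] * p_m(m, 0) for m in (0, 1))

# NIE: untreated outcome means, weighted by the shift in the
# mediator distribution that treatment causes.
nie = sum(EY[0, m] * (p_m(m, 1) - p_m(m, 0)) for m in (0, 1))

print("CDE:", cde)
print("NDE:", nde)
print("NIE:", nie)
```

With these numbers the controlled direct effect is negative in one job type and positive in the other, so there is no single ‘the’ direct effect. The NDE is one particular weighted average of the two.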

60. ### CAN DATA BE BIASED?

Semantic spaces constructed by word associations show predictable ‘biases’ that reflect real-world regularities, e.g.
→ Searches for CEO returning pictures of men
→ Machine translation regendering: “O bir doktor” → “He is a doctor”

Some CS researchers work on techniques to remove them (e.g. Bolukbasi et al.)

Other fields use them for research (e.g. Kozlowski et al.)

(Caliskan et al.)

62. ### SELECTION PROBLEMS

The role of suspect race in stop-question-frisk (Knox et al.). See also Bronner [link]
→ Race (R)
→ Stopped by police (S)
→ Use of force (F)

[Figure: causal graph over R, S, F, Q]

→ Questioning (Q) which generates a report a.k.a. our data set
63. ### POST TREATMENT BIAS: POLICING

→ Behaviour (B) causes stops (S) and the use of force (F), conditional on S
→ Unmeasured animosity (A) causes stops and the use of force (F), conditional on S

[Figure: causal graph over R, S, Q, F, A, B]

→ An important estimand that’s harder than expected to define and estimate
→ Direct experimentation would be unethical
→ Multiple paths to an undesirable outcome (mediation)
→ Not everything can be measured (IV)
→ Data available only on part of the process (conditioning)
→ Sample selection built into the data generation process
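
Why conditioning on stops distorts comparisons can be shown with exact probabilities rather than a simulation. The sketch below is a deliberately stripped-down assumption: race (R) and behaviour (B) are independent, both raise the chance of a stop, and force depends on behaviour only, so the true effect of R on force is zero. It leaves out animosity and any direct effect of race, and every probability is made up.

```python
P_B = {0: 0.8, 1: 0.2}  # assumed P(B = b), the same for both values of R


def p_stop(r: int, b: int) -> float:
    """Assumed P(S = 1 | R = r, B = b): both R and B raise stop rates."""
    return 0.1 + 0.4 * r + 0.4 * b


def p_force(b: int) -> float:
    """Assumed P(F = 1 | S = 1, B = b): no R term, so no effect of R."""
    return 0.05 + 0.5 * b


def force_rate_among_stopped(r: int) -> float:
    """P(F = 1 | S = 1, R = r), which is all that stop records reveal."""
    num = sum(P_B[b] * p_stop(r, b) * p_force(b) for b in (0, 1))
    den = sum(P_B[b] * p_stop(r, b) for b in (0, 1))
    return num / den


naive_gap = force_rate_among_stopped(1) - force_rate_among_stopped(0)
print("force rate among stopped, R=0:", force_rate_among_stopped(0))
print("force rate among stopped, R=1:", force_rate_among_stopped(1))
print("naive gap:", naive_gap)
```

Although R has no effect on force in this setup, the naive gap among stopped people is negative: the R=1 group is stopped more readily, so its stopped members have, on average, less of the behaviour that leads to force. A comparison restricted to recorded stops can therefore understate any real effect.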
66. ### POST TREATMENT BIAS: HEALTH STATISTICS

→ Infection (R) causes hospitalization (S), but so does smoking (B) and diet (A)
→ All of these cause intensive care (F)

[Figure: causal graph over R, S, Q, F, A, B] (Griffith et al.)

→ An important estimand that’s harder than expected to define and estimate
→ Direct experimentation would be unethical
→ Multiple paths to an undesirable outcome (mediation)
→ Not everything can be measured (IV)
→ Data available only on part of the process (conditioning)
→ Sample selection built into the data generation process
68. ### REFERENCES

Agan, A., & Starr, S. Ban the box, criminal records, and racial discrimination: A field experiment. The Quarterly Journal of Economics.

Barocas, S., Hardt, M., & Narayanan, A. Fairness and machine learning. fairmlbook.org.

Bickel, P. J., Hammel, E. A., & O’Connell, J. W. Sex bias in graduate admissions: Data from Berkeley. Science.

Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings.

Caliskan, A., Bryson, J. J., & Narayanan, A. Semantics derived automatically from language corpora contain human-like biases. Science.

Carson v. Bethlehem Steel Corp.

Chouldechova, A. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments.

Doyle, M. W., & Sambanis, N. International peacebuilding: A theoretical and quantitative analysis. American Political Science Review.

Griffith, G., Morris, T. T., Tudball, M., Herbert, A., Mancano, G., Pike, L., Sharp, G. C., Palmer, T. M., Davey Smith, G., Tilling, K., Zuccolo, L., Davies, N. M., & Hemani, G. Collider bias undermines our understanding of COVID-19 disease risk and severity (preprint).

King, G., & Lowe, W. An automated information extraction tool for international conflict data with performance as good as human coders: A rare events evaluation design. International Organization.

King, G., & Zeng, L. When can history be our guide? The pitfalls of counterfactual inference. International Studies Quarterly.

Kleinberg, J., Mullainathan, S., & Raghavan, M. Inherent trade-offs in the fair determination of risk scores.

Knox, D., Lowe, W., & Mummolo, J. Administrative records mask racially biased policing. American Political Science Review.

Kozlowski, A. C., Taddy, M., & Evans, J. A. The geometry of culture: Analyzing meaning through word embeddings. American Sociological Review.

Kusner, M. J., Loftus, J., Russell, C., & Silva, R. Counterfactual fairness. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems. Curran Associates, Inc.

Kusner, M. J., Loftus, J. R., Russell, C., & Silva, R. Counterfactual fairness.

Larson, J., Mattu, S., Kirchner, L., & Angwin, J. How we analyzed the COMPAS recidivism algorithm. ProPublica.

Pearl, J. The causal mediation formula—a guide to the assessment of pathways and mechanisms. Prevention Science.

Pearl, J. The deductive approach to causal inference. Journal of Causal Inference.

Robins, J. M., & Greenland, S. Identifiability and exchangeability for direct and indirect effects. Epidemiology.

Sambanis, N., & Doyle, M. W. No easy choices: Estimating the effects of United Nations peacekeeping (Response to King and Zeng). International Studies Quarterly.