Slide 1

Supervised Learning Without Discrimination
NIPS 2016
Moritz Hardt (Google), Eric Price (UT Austin) and Nati Srebro (TTIC)

Slide 2

Data-Driven Automation
• Loan approval
• Healthcare premiums
• Car and home insurance premiums
• Hiring
• School admission
• Policing strategies
• Criminal sentencing
• Differential service offerings
• Marketing
Yet for moral, legal and societal reasons, these systems must protect against racial, gender, and other discrimination

Slide 3

Non-Discrimination in Supervised Learning
• Formal setup:
  • Available features X (e.g. credit history, payment history, rent and house purchase history, number of dependents, driving record, employment record, education, etc.)
  • Protected attribute A (e.g. race)
  • Prediction target Y (e.g. loan defaulting, non-appearance, recidivism)
• Learn a predictor Ŷ(X) or Ŷ(X, A) for Y
• Learn based on a training set (x_i, a_i, y_i), i = 1, …, n (a minimal sketch of this setup follows below)
  …can mostly assume the population distribution of (X, A, Y) is known
• What does it mean for Ŷ to be non-discriminatory?
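
A minimal sketch of this setup, assuming scikit-learn and entirely synthetic data; all variable names are illustrative, not from the slides: features X, protected attribute A, target Y, and a learned predictor Ŷ(X).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
a = rng.integers(0, 2, size=n)                 # protected attribute A (two groups)
x = rng.normal(size=(n, 3)) + a[:, None]       # available features X, correlated with A
y = (x.sum(axis=1) + rng.normal(size=n) > 1.5).astype(int)  # prediction target Y

# Learn a predictor Y_hat(X) from the training set {(x_i, a_i, y_i)}, i = 1..n
model = LogisticRegression().fit(x, y)
y_hat = model.predict(x)
```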

Slide 4

Blindness
• Blindness: Ŷ(X) is not a function of A (only of the other features)
• Problem: A can be predicted from X, and Ŷ may then depend on this prediction, intentionally or inadvertently (see the sketch below)
• Also: accuracy disparity for different groups (different values of A)
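
To illustrate why blindness fails, the hedged sketch below (synthetic data, illustrative names) fits a classifier that recovers A from the non-protected features X alone; any predictor built on such X can therefore depend on A indirectly.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
a = rng.integers(0, 2, size=n)                  # protected attribute A
x = rng.normal(size=(n, 3)) + 2 * a[:, None]    # "non-protected" features, correlated with A

# A "blind" model never sees A, but A is easy to recover from X,
# so a predictor built on X can still depend on A indirectly.
proxy = LogisticRegression().fit(x, a)
print("accuracy of predicting A from X:", proxy.score(x, a))   # well above chance (0.5)
```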

Slide 5

Demographic Parity
P(Ŷ = 1 | A = a) = P(Ŷ = 1 | A = a′), i.e. Ŷ ⊥ A (a minimal check of this criterion is sketched below)

Too strict:
• What if the true Y correlates with A?
• Doesn't allow the perfect predictor Ŷ = Y
  • e.g. give loans exactly to those who won't default

But also too weak:
• Doesn't protect against accuracy disparity
  • e.g. give loans to qualified A = 0 people and to random A = 1 people
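
A minimal check of demographic parity for a binary predictor, assuming a two-valued protected attribute; the toy arrays and the helper name demographic_parity_gap are illustrative, not from the paper.

```python
import numpy as np

def demographic_parity_gap(y_hat, a):
    """|P(Y_hat = 1 | A = 0) - P(Y_hat = 1 | A = 1)| for binary predictions and groups."""
    rates = [y_hat[a == g].mean() for g in (0, 1)]
    return abs(rates[0] - rates[1])

y_hat = np.array([1, 0, 1, 1, 0, 1, 0, 0])
a     = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_gap(y_hat, a))   # 0.5: group 0 gets positives at rate 0.75, group 1 at 0.25
```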

Slide 6

Suggested Notion: Equalized Odds
Ŷ ⊥ A | Y
The prediction Ŷ does not provide any additional information about A beyond what the truth Y already tells us:
P(Ŷ = 1 | Y = y, A = a) = P(Ŷ = 1 | Y = y, A = a′)
• The perfect predictor, Ŷ = Y, always satisfies equalized odds
• Protects against accuracy disparity in Ŷ (a per-group check is sketched below)
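
An equalized-odds check compares P(Ŷ = 1 | Y = y, A = a) across groups separately for y = 1 (true positive rate) and y = 0 (false positive rate). The sketch below, with illustrative names and toy arrays, computes the two per-group gaps.

```python
import numpy as np

def equalized_odds_gaps(y_hat, y, a):
    """Per-outcome gaps |P(Y_hat=1 | Y=y, A=0) - P(Y_hat=1 | Y=y, A=1)| for y in {0, 1}."""
    gaps = {}
    for y_val, name in [(1, "tpr_gap"), (0, "fpr_gap")]:
        rates = [y_hat[(y == y_val) & (a == g)].mean() for g in (0, 1)]
        gaps[name] = abs(rates[0] - rates[1])
    return gaps

y     = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_hat = np.array([1, 0, 0, 1, 1, 1, 0, 0])
a     = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(equalized_odds_gaps(y_hat, y, a))   # both gaps 0 would mean equalized odds; here both are 0.5
```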

Slide 7

“Equality of Opportunity in Supervised Learning”
arXiv:1610.02413, NIPS 2016
Moritz Hardt, Eric Price and Nati Srebro
• Efficiently and optimally correct discriminatory predictors to satisfy equalized odds (a simplified sketch follows below)
• Interpretation in terms of ROC curves
• Incentive structure:
  • Shifts the “cost of uncertainty” from the protected group to the decision maker
  • Incentivizes collecting features directly related to the target Y (not via A)
  • Incentivizes data collection also on minority groups
• Inherent limitations of oblivious tests (treating the predictor as a black box)
  • Non-identifiability of different scenarios
• Why equalized odds and not calibration parity (Northpointe’s “target population errors”)
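
The correction referred to in the first bullet can be read as a small optimization over a derived predictor Ỹ that depends only on (Ŷ, A). The sketch below is a simplified, hedged rendering of that idea for a binary predictor and two groups, not the authors' exact construction: it picks mixing probabilities via a linear program (scipy.optimize.linprog) that equalizes TPR and FPR while minimizing errors. The function name and variable layout are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def derive_equalized_odds(y_hat, y, a):
    """Mixing probabilities p[g, s] = P(Y_tilde = 1 | A = g, Y_hat = s) that equalize
    TPR and FPR across the two groups while minimizing the number of errors."""
    stats = {}
    for g in (0, 1):
        yg, hg = y[a == g], y_hat[a == g]
        stats[g] = dict(tpr=hg[yg == 1].mean(), fpr=hg[yg == 0].mean(),
                        n_pos=int((yg == 1).sum()), n_neg=int((yg == 0).sum()))

    # New rates are linear in p: TPR'_g = p[g,0](1 - tpr_g) + p[g,1] tpr_g (FPR' analogous).
    def coeffs(g, rate):
        r = stats[g][rate]
        c = np.zeros(4)
        c[2 * g], c[2 * g + 1] = 1 - r, r
        return c

    # Objective: expected errors = sum_g n_pos_g (1 - TPR'_g) + n_neg_g FPR'_g (constants dropped).
    obj = sum(-stats[g]["n_pos"] * coeffs(g, "tpr") + stats[g]["n_neg"] * coeffs(g, "fpr")
              for g in (0, 1))
    # Constraints: TPR'_0 = TPR'_1 and FPR'_0 = FPR'_1 (equalized odds).
    A_eq = np.vstack([coeffs(0, "tpr") - coeffs(1, "tpr"),
                      coeffs(0, "fpr") - coeffs(1, "fpr")])
    res = linprog(obj, A_eq=A_eq, b_eq=[0.0, 0.0], bounds=[(0, 1)] * 4, method="highs")
    return res.x.reshape(2, 2)   # row = group A, column = original prediction Y_hat
```

Because the derived predictor only looks at (Ŷ, A), this kind of correction is oblivious in the sense used above: it treats the original predictor as a black box. For score-valued predictors the paper works with ROC curves rather than a single binary output.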