Hardt's Method

[Hardt+ 16]

Given an unfair predicted class, Ŷ, and a sensitive feature, S, a fair class, Y∘, is predicted by maximizing accuracy under an equalized odds condition

✽ The true class, Y, cannot be used by this predictor

true positive ratio (TPR): Pr[Y∘=1 ∣ S=s, Y=1]

false positive ratio (FPR): Pr[Y∘=1 ∣ S=s, Y=0]
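As an illustration of the two ratios above, here is a minimal sketch that estimates them per sensitive group from labeled data. The function name `group_rates` and its arguments are hypothetical, and the sketch assumes binary 0/1 labels and predictions.

```python
def group_rates(y_true, y_pred, s):
    """Return {group: (TPR, FPR)} for binary labels and predictions.

    y_true: true classes Y (0 or 1)
    y_pred: predicted classes (0 or 1)
    s:      sensitive feature values S
    """
    rates = {}
    for g in sorted(set(s)):
        tp = fn = fp = tn = 0
        for y, p, sv in zip(y_true, y_pred, s):
            if sv != g:
                continue
            if y == 1 and p == 1:
                tp += 1
            elif y == 1 and p == 0:
                fn += 1
            elif y == 0 and p == 1:
                fp += 1
            else:
                tn += 1
        # A ratio is undefined (NaN) when the group has no such true class.
        tpr = tp / (tp + fn) if (tp + fn) else float("nan")
        fpr = fp / (fp + tn) if (fp + tn) else float("nan")
        rates[g] = (tpr, fpr)
    return rates


rates = group_rates(
    y_true=[1, 1, 0, 0, 1, 1, 0, 0],
    y_pred=[1, 0, 0, 0, 1, 1, 1, 0],
    s=[0, 0, 0, 0, 1, 1, 1, 1],
)
```

Equalized odds holds when both ratios agree across all groups; in this toy data they differ, so the predictor violates the condition.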

[Figure: plot over FPR and TPR. Labels: "feasible region for S=0"; "feasible region for S=1"; "perfectly accurate point"; "FPR & TPR can be matched, satisfying equalized odds"; "the most accurate point satisfying an equalized odds condition"]

Annotated settings of the fair predictor (all given for S=1):

{ Pr[Y∘=1 ∣ Ŷ=1,S=1] = 1.0, Pr[Y∘=1 ∣ Ŷ=0,S=1] = 0.0 }

{ Pr[Y∘=1 ∣ Ŷ=1,S=1] = 1.0, Pr[Y∘=1 ∣ Ŷ=0,S=1] = 1.0 }

{ Pr[Y∘=1 ∣ Ŷ=1,S=1] = 0.0, Pr[Y∘=1 ∣ Ŷ=0,S=1] = 1.0 }

{ Pr[Y∘=1 ∣ Ŷ=1,S=1] = 0.0, Pr[Y∘=1 ∣ Ŷ=0,S=1] = 0.0 }
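The four probability pairs above can be read as the extreme settings of a randomized predictor that depends only on Ŷ and S. The sketch below, under that assumption, maps a pair (p1, p0) = (Pr[Y∘=1 ∣ Ŷ=1,S=s], Pr[Y∘=1 ∣ Ŷ=0,S=s]) to the resulting (FPR, TPR), then searches for the most accurate point that every group can reach. Hardt et al. pose this search as a linear program; the coarse grid search here is a simpler stand-in, and all function names are hypothetical.

```python
def derived_rates(p1, p0, tpr, fpr):
    """(FPR, TPR) of the fair class when Pr[Y_fair=1 | Yhat=1] = p1 and
    Pr[Y_fair=1 | Yhat=0] = p0, given the (tpr, fpr) of Yhat in one group."""
    return (p1 * fpr + p0 * (1 - fpr), p1 * tpr + p0 * (1 - tpr))


def mixing_probs(target_fpr, target_tpr, tpr, fpr, eps=1e-9):
    """Solve for (p1, p0) reaching the target point; None if infeasible.
    Assumes tpr != fpr, i.e. Yhat is informative in this group."""
    d = (target_tpr - target_fpr) / (tpr - fpr)  # d = p1 - p0
    p0 = target_fpr - d * fpr
    p1 = p0 + d
    if -eps <= p0 <= 1 + eps and -eps <= p1 <= 1 + eps:
        # Clip tiny floating-point overshoots back into [0, 1].
        return (min(max(p1, 0.0), 1.0), min(max(p0, 0.0), 1.0))
    return None


def best_equalized_odds_point(rates, pos_rate, steps=100):
    """Grid search for the most accurate (FPR, TPR) feasible for all groups.

    rates:    {group: (tpr, fpr)} of the unfair predictor Yhat
    pos_rate: Pr[Y=1] over the whole population
    Returns (accuracy, (fpr, tpr), {group: (p1, p0)}) or None.
    """
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            f, t = i / steps, j / steps
            probs = {g: mixing_probs(f, t, tpr, fpr)
                     for g, (tpr, fpr) in rates.items()}
            if any(p is None for p in probs.values()):
                continue  # some group cannot reach this point
            # With equal rates in every group, accuracy depends only on Pr[Y=1].
            acc = pos_rate * t + (1 - pos_rate) * (1 - f)
            if best is None or acc > best[0]:
                best = (acc, (f, t), probs)
    return best


# Illustrative numbers only: group 0 has (TPR, FPR) = (0.8, 0.2), group 1 has (0.6, 0.4).
acc, point, probs = best_equalized_odds_point({0: (0.8, 0.2), 1: (0.6, 0.4)}, pos_rate=0.5)
```

In this toy setting the weaker group keeps its prediction unchanged (p1 = 1, p0 = 0), while the stronger group is randomized down to the same FPR and TPR, so the fair predictor never consults the true class Y at prediction time.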