[CHI'24] Fair Machine Guidance to Enhance Fair Decision Making in Biased People

CHI'24 presentation slides
Paper URL: https://dl.acm.org/doi/10.1145/3613904.3642627

mei28

May 14, 2024

Transcript

  1. Fair Machine Guidance to Enhance Fair Decision Making in Biased People

     Mingzhe Yang (The University of Tokyo), Hiromi Arai (RIKEN AIP), Naomi Yamashita (Kyoto University), Yukino Baba (The University of Tokyo). CHI 2024.
  2. Fair personnel evaluation is challenging for humans

     • People judge others unfairly based on their race or gender [1]
     • A survey of lectures aimed at addressing these biases found mixed results [2]: attendees started to understand the importance of gender fairness, but there was no change in the number of women selected as mentors
     [1] M. Bertrand and S. Mullainathan. Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review 94 (2004).
     [2] Edward H. Chang, Katherine L. Milkman, Dena M. Gromet, Robert W. Rebele, Cade Massey, Angela L. Duckworth, and Adam M. Grant. The Mixed Effects of Online Diversity Training. Proceedings of the National Academy of Sciences 116, 16 (2019), 7778–7783.
  3. Research Question: How does fair machine guidance impact human evaluation processes?

     Fair machine guidance (FMG): AI guides humans toward fair evaluations
     1. Fairness-aware ML trains a model that estimates the evaluations the user would make if they evaluated fairly (a fair model)
     2. The system guides the user to be closer to that fair model
  4. Overview of fair machine guidance

     (1) Collect evaluations from humans: participants label profiles (age, gender, race, workclass, education, marital status, occupation, working time, native country, ...) as HIGH or LOW INCOME
     (2) Train models on these evaluations: standard ML yields an unfair model that simulates the participant's judgments; fairness-aware ML [3] yields a fair model (a training sketch follows below)
     (3) Provide teaching materials on how to make fair decisions, framed as advice from a "Fair AI" that simulates what the user's judgment would look like if it were fair:
       (A) Your judgment tendency — e.g., "In previous questions, you predicted that 20% of Whites and 19% of non-Whites would have a HIGH INCOME. The closer the two values are, the fairer your decisions are. Be fair in your decisions regarding race: determine the people with high income such that the ratio is the same for White and non-White people." — plus an example of an appropriate response (a profile the user labeled LOW INCOME that, to be fair, should have been labeled HIGH INCOME)
       (B) Your criteria vs. fair criteria — the left column shows the user's decision criteria as estimated from their answers by AI (blue information pushes toward HIGH INCOME, red toward LOW INCOME); the right column shows the fair decision criteria estimated by the Fair AI, which the user should follow to make fairer decisions
     [3] Agarwal, Alekh, et al. A Reductions Approach to Fair Classification. International Conference on Machine Learning (ICML), PMLR, 2018.
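The fairness-aware ML step cites Agarwal et al.'s reductions approach [3]. As a rough, non-authoritative illustration of step (2), here is a minimal Python sketch that trains both models from collected labels using fairlearn, an open-source implementation of that reduction; the toy data, feature subset, and base learner are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of step (2): train an "unfair" model (standard ML) and a
# "fair" model (reductions approach [3], via fairlearn) on human labels.
# Toy data and feature names are illustrative, not the paper's setup.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

# X: profile attributes; y: the participant's HIGH(1)/LOW(0) income answers;
# race: sensitive attribute used only in the fairness constraint.
df = pd.DataFrame({
    "age":            [50, 21, 47, 31, 38, 29],
    "hours_per_week": [50, 30, 42, 45, 40, 35],
    "is_white":       [0,  1,  0,  0,  1,  1],
    "high_income":    [1,  0,  1,  0,  1,  0],
})
X, y, race = df[["age", "hours_per_week"]], df["high_income"], df["is_white"]

# "Unfair model": plain logistic regression simulating the participant.
unfair_model = LogisticRegression().fit(X, y)

# "Fair model": same labels, but constrained so HIGH-INCOME prediction
# rates match across racial groups (demographic parity).
fair_model = ExponentiatedGradient(LogisticRegression(),
                                   constraints=DemographicParity())
fair_model.fit(X, y, sensitive_features=race)
print(fair_model.predict(X))  # guidance contrasts these with the user's answers
```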
  5. Teaching materials highlight the fair criteria and the user's criteria

     The "your criteria vs. fair criteria" figure shows the same profile under both columns: attributes against which the user's evaluation was biased are flagged on the user's side, and the fair side shows which attributes to focus on to make the evaluation fair (a sketch of how such highlighting could be derived follows below)
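The blue/red coloring implies per-attribute decision criteria. Assuming those criteria are the signed weights of a linear, logistic-regression-style model — the slides do not state the model family, so this is a guess — the readout could look like the following; all names and data are placeholders.

```python
# Sketch: deriving "blue pushes toward HIGH INCOME / red pushes toward LOW
# INCOME" from a linear model's weights. Illustrative assumption that the
# displayed criteria are logistic-regression coefficients.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["age", "years_of_education", "hours_per_week"]
X = np.array([[50, 15, 50], [21, 10, 30], [47, 14, 42], [31, 12, 45]])
y = np.array([1, 0, 1, 0])  # the participant's HIGH/LOW income answers

user_model = LogisticRegression().fit(X, y)  # simulates the participant
for name, w in zip(feature_names, user_model.coef_[0]):
    color = "blue (-> HIGH INCOME)" if w > 0 else "red (-> LOW INCOME)"
    print(f"{name:20s} weight={w:+.3f}  {color}")
# The same readout on a linear fair model would give the "fair criteria"
# column; differences between the two columns are what the material flags.
```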
  6. Two personal assessment tasks

     (1) Income (racial fairness) — Q. "Is the person's income high or low?" (profiles list age, gender, race, workclass, education, marital status, occupation, working time, native country)
     (2) Credit (gender fairness) — Q. "Is the person's credit risk high or low?" (profiles list age, gender, race, occupation, housing, saving/checking accounts, credit amount, duration in months, purpose)
     (*) We asked participants to be as fair as possible in their decisions
  7. Two experimental conditions

     Both conditions present an unfairness score (e.g., "you predicted that 20% of Whites and 19% of non-Whites would have a HIGH INCOME; the closer the two values are, the fairer your decisions are") and an example of an appropriate response; only FMG adds the highlighted criteria
     • Baseline — bias feedback (BF): unfairness score + example
     • Ours — fair machine guidance (FMG): unfairness score + example + highlighted "your criteria vs. fair criteria" (the score itself is sketched below)
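From the feedback wording, the unfairness score appears to be the gap between group-wise HIGH-INCOME rates, i.e., a demographic-parity difference; here is a minimal sketch under that assumption, with illustrative function and variable names.

```python
# Sketch of the unfairness score implied by the feedback text: the absolute
# gap between HIGH-INCOME prediction rates for the two groups (a demographic
# parity difference). Names are illustrative, not from the paper.
import numpy as np

def unfairness_score(preds, is_white):
    preds = np.asarray(preds, dtype=float)      # 1 = HIGH INCOME, 0 = LOW
    is_white = np.asarray(is_white, dtype=bool)
    return abs(preds[is_white].mean() - preds[~is_white].mean())

# E.g., rating 2 of 3 White profiles HIGH but 1 of 3 non-White profiles HIGH:
print(unfairness_score([1, 0, 1, 0, 0, 1], [1, 1, 1, 0, 0, 0]))  # ~0.333
```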
  8. Experiments with biased people

     • Pre-test (N=459) → screening (N=99): only participants with a high unfairness score proceeded to the treatment phase
     • Treatment, repeated 5 times (mini-test + feedback): bias feedback — Income: N=37, Credit: N=13; fair machine guidance — Income: N=39, Credit: N=10
     • Post-test & surveys: measured the improvement in unfairness and the impact on evaluation processes
  9. Overview of our findings

     1. Improvement of fairness: many participants in FMG reduced their unfairness
     2. Motivation to correct bias: fair machine guidance provided opportunities to reconsider fairness
     3. Adjustment of evaluation criteria: fair machine guidance encouraged participants to adjust their evaluation criteria
     4. Even participants who did not trust or follow the AI showed changes in their evaluations
  10. Finding 1: FMG reduced participants' unfairness

      • Many participants became fairer under both FMG and BF
      • But the processes leading to their evaluations differed
      Improved: pre-test unfairness > post-test unfairness; Worsened: pre-test unfairness < post-test unfairness
  11. Finding 2: FMG motivates people to reconsider fairness

      FMG gave participants more opportunities to reconsider fairness than BF did
      Q. Did these tasks make you reconsider the fairness of your own decisions and the fairness required by society?
  12. Finding 2: FMG motivates people to reconsider fairness

      FMG motivated participants to revise their own fairness:
      "I did not intend to apportion income by gender; however, I was reminded that this was the basis of my thinking and felt that I had to revise it" — P3, Income, FMG
      BF did not motivate participants to reconsider fairness, owing to their confidence in their own sense of fairness:
      "I am fair because I made a comprehensive decision. The AI guidance appeared to provide superfluous information for decision making" — P45, Credit, BF
  13. Finding 3: FMG prompts people to change their criteria

      Participants in fair machine guidance changed their evaluation criteria
      (*) In both the pre- and post-test, we asked participants to report the attributes they relied on in their evaluations
  14. Finding 3: FMG prompts people to change their criteria

      Seeing the fair criteria made participants realize the value of diverse perspectives:
      "I felt that it is important to evaluate people from a range of perspectives, rather than based on a single piece of information, such as gender, age, or race" — P21, Income, FMG
  15. Finding 4: Some people rejected the AI, yet gained insight

      Some participants who distrusted and rejected the AI's guidance still gained new insights:
      "I was not persuaded by the hints presented (and did not follow them). I felt that (for me) there was a tendency to judge one's ability to pay based on their occupation." — P15, Income, FMG
  16. Takeaways

      • We investigated how fair AI can guide humans to fair evaluations
      • Fair machine guidance encouraged participants to reconsider fairness and to adjust their criteria
      • We emphasize that AI systems aimed at reducing biases need to stimulate critical engagement and self-reflection among users