Slide 1

Slide 1 text

CEGIS Workshop @ EVVC 2024 Toward Diversity-Aware Human-AI Decision Making Yukino Baba (UTokyo) [email protected] @yukino

Slide 2

Slide 2 text

Social biases in LLMs 2
● Stereotyping
Q: “Two Muslims walked into a …” A: “Texas cartoon contest and opened fire” [Abid+ 2021]
● Exclusionary norms
Q: What is a family? A: A family is a man and a woman who get married and have children. [Weidinger+ 2021]

Slide 3

Slide 3 text

Biases in values of LLMs: Toy example 3
Q: What should parents do if their child's grades are below average?
A: As a parent, supporting your child's education plays a crucial role. If your child receives below-average grades, you can try the following approaches:
1. Deepen the dialogue: It's important to have open conversations to understand why your child is struggling with their studies. They might be facing pressure, anxiety, incomprehensible material, or other personal issues.
2. Check the learning environment: Ensure that your child has a suitable study environment. It should be quiet, uninterrupted, and equipped with all the necessary materials.
3. Consider individual tutoring: Tutoring or a private tutor might help your child catch up in specific subjects. Additional guidance from their school teachers is also an option.

Slide 4

Slide 4 text

Biases in values of LLMs: Toy example 4
4. Sleep, exercise, and nutrition: These basic health factors significantly affect academic success. Make sure your child gets enough sleep, engages in regular physical activity, and eats a healthy diet.
5. Teach study skills: Efficient note-taking, time management, and test preparation skills are essential tools for academic success.
Lastly, it is important to show support and love for your child. Success is not achieved overnight, and it is crucial for your child to recognize their progress and gain confidence.

Slide 5

Slide 5 text

Biases in values of LLMs: Human responses 5
Q: What should parents do if their child's grades are below average?
I believe it's okay not to focus solely on grades. I think the child likely excels in areas other than studies. Instead of forcing them to attend tutoring or pushing them to study, it's important to identify and nurture their strengths. I would calmly verify with the child if they feel that with effort, they could achieve above-average grades. If it seems unlikely, then I would encourage them to develop in other areas, such as drawing or playing musical instruments. I believe everyone has something they are good at or show interest in, and this can range from academics to sports and beyond. We should provide children with opportunities to observe and experience things within their capabilities.

Slide 6

Slide 6 text

Biases in values of LLMs: Survey comparisons 6
● LLMs answered questions from a U.S. public opinion survey
● Responses were compared with those from human participants
■ Categorized by political views, education level, and income
Example question: “How much, if at all, do you think the ease with which people can legally obtain guns contributes to gun violence in the country today?” A. A great deal / B. A fair amount / C. Not too much / D. Not at all / E. Refuse
Santurkar et al. Whose Opinions Do Language Models Reflect? ICML 2023
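To make the comparison concrete, here is a minimal sketch of how an LLM's answer distribution on one question could be compared with each demographic group's distribution. The paper aggregates over many questions and its exact similarity metric may differ; all distributions below are made up for illustration.

```python
# Minimal sketch (not the paper's exact metric): compare an LLM's answer distribution
# on one survey question with each demographic group's distribution and report the
# closest group. All distributions below are hypothetical.
import numpy as np
from scipy.stats import wasserstein_distance

options = np.array([1, 2, 3, 4])  # ordinal codes for "A great deal" ... "Not at all"

llm_dist = np.array([0.15, 0.55, 0.20, 0.10])             # hypothetical LLM answer probabilities
group_dists = {
    "liberal":      np.array([0.60, 0.25, 0.10, 0.05]),   # hypothetical human distributions
    "conservative": np.array([0.10, 0.20, 0.35, 0.35]),
    "moderate":     np.array([0.30, 0.35, 0.20, 0.15]),
}

# Smaller 1-D Wasserstein distance = more similar opinion distribution.
scores = {g: wasserstein_distance(options, options, llm_dist, d)
          for g, d in group_dists.items()}
closest = min(scores, key=scores.get)
print(scores, "-> most similar group:", closest)
```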

Slide 7

Slide 7 text

Biases in values of LLMs: Survey comparisons 7
[Figure: per-topic comparison for GPT-3 and InstructGPT across three panels (political view, education, income); color indicates the demographic group most similar to the LLM]
Opinions of InstructGPT align closely with those of liberal, highly educated, and high-income individuals
Santurkar et al. Whose Opinions Do Language Models Reflect? ICML 2023

Slide 8

Slide 8 text

Impact of LLM biases on human decision-making 8
● LLMs are highly skilled at human communication, making it easy for people to be influenced by them
● Given the biases in LLMs, we should use LLMs and AI to support the human decision-making process, not to override it
[Figure: for the question “What should we do if our son’s grades are below average?”, a group overridden by AI all answer “Private tutor!”, while a group supported by AI keeps diverse answers such as “Sports!”]

Slide 9

Slide 9 text

Outline 9
● AI methods to support diversity-aware human decision making
■ 1. CrowDEA
■ 2. Illumidea
■ 3. Fair machine guidance

Slide 10

Slide 10 text

CrowDEA: Multi-view Idea Prioritization
Y. Baba, J. Li, H. Kashima: CrowDEA: Multi-View Idea Prioritization with Crowds (HCOMP 2020)

Slide 11

Slide 11 text

Pitfalls of simple voting: Loss of diversity 11
[Figure: candidates (AI laboratory characters) → voting → shortlist from top-voted candidates → discussion → final selection]

Slide 12

Slide 12 text

CrowDEA: Embracing diverse viewpoints in voting 12
[Figure: candidates (AI laboratory characters) → voting → shortlist created by CrowDEA → discussion → final selection]

Slide 13

Slide 13 text

CrowDEA creates a multi-view priority map 13
Input: pairwise comparisons between items (e.g., item A ≻ item B)
Output: a priority map with axes for viewpoint and goodness, highlighting promising candidates

Slide 14

Slide 14 text

Optimization goal 1: Consistency with voting preferences 14
Item parameters: an embedding $x_i \in \mathbb{R}_+^d$ and a best viewpoint $v_i \in \mathbb{R}_+^d$ for each item; evaluator parameter: a viewpoint $w_k \in \mathbb{R}_+^d$ for each evaluator.
If evaluator $k$ prefers item $i$ over item $j$ ($i \succ_k j$), the preference score of item $i$ for evaluator $k$ should be greater than that of item $j$:
$$w_k^\top x_i > w_k^\top x_j$$
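As a rough illustration of this goal, the sketch below assumes a hinge-style surrogate loss over the observed comparisons; CrowDEA's actual objective and optimization details may differ.

```python
# Minimal sketch (assumed hinge-style surrogate, not necessarily CrowDEA's exact loss):
# encourage w_k . x_i > w_k . x_j for every observed comparison "evaluator k prefers i over j".
import numpy as np

def pairwise_ranking_loss(X, W, comparisons, margin=1.0):
    """X: (n_items, d) item embeddings, W: (n_evaluators, d) evaluator viewpoints,
    comparisons: list of (k, i, j) meaning evaluator k preferred item i over item j."""
    loss = 0.0
    for k, i, j in comparisons:
        s_i, s_j = W[k] @ X[i], W[k] @ X[j]
        loss += max(0.0, margin - (s_i - s_j))  # zero when the preferred item scores higher by a margin
    return loss / len(comparisons)

# Toy usage with random non-negative parameters (the model constrains them to R^d_+)
rng = np.random.default_rng(0)
X, W = rng.random((5, 3)), rng.random((2, 3))
print(pairwise_ranking_loss(X, W, [(0, 1, 2), (1, 3, 0)]))
```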

Slide 15

Slide 15 text

Optimization goal 2: Incorporating minority preferences 15
(Parameters as before: item embedding $x_i \in \mathbb{R}_+^d$, item best viewpoint $v_i \in \mathbb{R}_+^d$, evaluator viewpoint $w_k \in \mathbb{R}_+^d$.)
From its best viewpoint, each item is the most valuable among all items:
$$v_i^\top x_i > v_i^\top x_j, \quad \forall j \neq i$$
The best viewpoint is like assigning an imaginary evaluator who always rates the item higher than any other.
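Analogously, here is a sketch of a penalty that pushes every item to "win" under its own best viewpoint; again, this is an assumed surrogate, not necessarily CrowDEA's exact formulation.

```python
# Minimal sketch (assumed penalty): penalize violations of v_i . x_i > v_i . x_j for all
# j != i, so each item is rated highest from its own best viewpoint.
import numpy as np

def best_viewpoint_loss(X, V, margin=1.0):
    """X: (n_items, d) item embeddings, V: (n_items, d) best viewpoints (one per item)."""
    scores = V @ X.T                     # scores[i, j] = v_i . x_j
    own = np.diag(scores)[:, None]       # v_i . x_i
    violations = np.maximum(0.0, margin - (own - scores))
    np.fill_diagonal(violations, 0.0)    # skip the j == i terms
    return violations.sum() / (X.shape[0] * (X.shape[0] - 1))

rng = np.random.default_rng(1)
X, V = rng.random((5, 3)), rng.random((5, 3))
print(best_viewpoint_loss(X, V))
```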

Slide 16

Slide 16 text

Example: Olympic emblems 16
[Figure: priority map spanning modern to traditional designs; shortlist from top-voted candidates vs. shortlist created by CrowDEA]

Slide 17

Slide 17 text

Illumidea: LLM-powered idea categorization tool https://illumidea.ai/

Slide 18

Slide 18 text

Illumidea automatically categorizes diverse ideas 18

Slide 19

Slide 19 text

Users create a form to collect everyone’s ideas 19

Slide 20

Slide 20 text

Ideas are categorized into groups and subgroups by multi-step prompts. Group titles and subgroup headlines are also generated.
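The deck does not show the actual prompts, but a hypothetical two-step pipeline along these lines could look as follows; `call_llm`, the prompt wording, and the JSON schema are all placeholders, not Illumidea's implementation.

```python
# Hypothetical sketch of a multi-step categorization pipeline; `call_llm` is a
# placeholder for any chat-completion API that returns the model's text response.
import json

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def categorize_ideas(ideas: list[str]) -> dict:
    numbered = "\n".join(f"{i}: {idea}" for i, idea in enumerate(ideas))
    # Step 1: group ideas and give each group a title.
    groups = json.loads(call_llm(
        "Cluster these ideas into groups. Return JSON like "
        '[{"title": "...", "idea_ids": [0, 2]}]\n' + numbered))
    # Step 2: within each group, split the ideas into subgroups with short headlines.
    for g in groups:
        members = "\n".join(f"{i}: {ideas[i]}" for i in g["idea_ids"])
        g["subgroups"] = json.loads(call_llm(
            "Split these ideas into subgroups. Return JSON like "
            '[{"headline": "...", "idea_ids": [...]}]\n' + members))
    return {"groups": groups}
```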

Slide 21

Slide 21 text

🚀 https://illumidea.ai/ 21

Slide 22

Slide 22 text

Case study in high school (group w/o AI) 22
Topic: “There are people who do not participate in group work”
● This student group was asked to decide their next action through discussion alone (without using AI)
● The opinions of those who actively worked were prioritized, and the discussion focused on “how to make less active members work”; the perspectives of the less active members were overlooked
● The final conclusion was: “Assign each task to one person. If someone still doesn’t participate, we give up on them”

Slide 23

Slide 23 text

Case study in high school (group w/ AI) 23
● This group discussed the topic after being presented with diverse and important ideas identified by AI
● The perspectives of less active members were considered, leading to the insight that “there are cases where people are assigned tasks they can’t manage but feel unable to voice their complaints”
● The conclusion was: “Create an environment where everyone can freely express their thoughts within the group, and ensure that both the speaker and the listener understand that comments are directed toward the issue, not the person”

Slide 24

Slide 24 text

Fair Machine Guidance to Enhance Fair Decision Making in Biased People
M. Yang, H. Arai, N. Yamashita, Y. Baba: Fair Machine Teaches Fair Decisions to Biased People (CHI 2024)

Slide 25

Slide 25 text

Fair decision-making is challenging for humans 25
● People can sometimes judge others unfairly based on race or gender
● Case study: resumes with identical skills but different names
■ Resumes with white-sounding names receive more callbacks than those with African-American-sounding names
M. Bertrand and S. Mullainathan. Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review 94(4), 2004.

Slide 26

Slide 26 text

Fair machine guidance: AI guides users toward fair evaluations 26
● Fair machine guidance (FMG) uses fairness-aware ML to guide humans in making fair evaluations
■ Fairness-aware ML adjusts models to ensure fairness
[Figure: the user acts as the student and the fair model as the teacher; fairness-aware ML simulates the user’s fair evaluation, and the fair model guides the user towards making fairer evaluations]
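One simple way to quantify fairness in the sense used on the next slide (equal acceptance rates across groups, e.g., "20% of Whites vs. 19% of non-Whites predicted HIGH INCOME") is the demographic parity gap. This is a hedged illustration; the paper's fairness-aware ML formulation may use a different criterion.

```python
# Minimal sketch of a demographic parity gap: the absolute difference in acceptance
# rates between two groups. Smaller means fairer decisions under this criterion.
import numpy as np

def demographic_parity_gap(predictions, sensitive):
    """predictions: 0/1 decisions; sensitive: 0/1 group membership (e.g., non-White/White)."""
    predictions, sensitive = np.asarray(predictions), np.asarray(sensitive)
    rate1 = predictions[sensitive == 1].mean()
    rate0 = predictions[sensitive == 0].mean()
    return abs(rate1 - rate0)

print(demographic_parity_gap([1, 0, 1, 0, 0], [1, 1, 0, 0, 0]))  # 0.5 vs ~0.33 -> ~0.17
```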

Slide 27

Slide 27 text

Process of fair machine guidance 27
1. Collect the user’s (unfair) evaluations: the user labels example profiles (age, gender, race, workclass, education, #years of education, marital status, relationship, occupation, working time, native country) as Accept or Reject
2. Apply standard ML and fairness-aware ML to the collected evaluations, producing an unfair model of the user’s judgments and a fair model
3. Create teaching materials for the user:
■ “We will offer advice to help you make fairer judgments. This advice is provided by ‘Fair AI,’ which simulates what your judgment would look like if it were fair.”
■ Your judgment tendency: “In previous questions, you predicted that 20% of Whites and 19% of non-Whites would have a HIGH INCOME. The closer the two values are, the fairer your decisions are. Be fair in your decisions regarding race: determine the people with high income such that the ratio is the same for White and non-White people.”
■ Example of an appropriate response: “You predicted that the person below would have a LOW INCOME. To be fair, you should have predicted a HIGH INCOME.” (followed by the person’s profile)
■ Your criteria vs. fair criteria: a side-by-side figure of the user’s decision criteria and the fair criteria, explained on the next slide
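As a hedged illustration of step 2, the sketch below fits a model of the user's judgments on synthetic data and then derives a "fair" counterpart by post-processing with group-specific thresholds that equalize acceptance rates; the paper's actual fairness-aware ML method may differ.

```python
# Hedged sketch of step 2: fit a model of the user's judgments, then derive a "fair"
# version by post-processing (group-specific thresholds). Data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
sensitive = rng.integers(0, 2, n)                  # e.g., 0 = non-White, 1 = White
features = np.column_stack([rng.normal(size=n), sensitive])
user_labels = (features[:, 0] + 0.8 * sensitive + rng.normal(scale=0.5, size=n) > 0).astype(int)

user_model = LogisticRegression().fit(features, user_labels)   # "unfair" model of the user
scores = user_model.predict_proba(features)[:, 1]

target_rate = user_labels.mean()                   # overall acceptance rate to preserve
fair_pred = np.zeros(n, dtype=int)
for g in (0, 1):                                   # per-group threshold -> equal acceptance rates
    mask = sensitive == g
    thr = np.quantile(scores[mask], 1 - target_rate)
    fair_pred[mask] = (scores[mask] >= thr).astype(int)

print("acceptance rate by group (user):", [user_model.predict(features)[sensitive == g].mean() for g in (0, 1)])
print("acceptance rate by group (fair):", [fair_pred[sensitive == g].mean() for g in (0, 1)])
```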

Slide 28

Slide 28 text

Example of teaching material 28
[Figure: “Your criteria vs. fair criteria” — an example profile (Age: 50, Gender: Male, Race: Asian, Workclass: Self-employed, Education: Professional school, #years of education: 15, Marital status: Married, Relationship: Husband, Occupation: Professional specialty, Working time: 50h/week, Native country: Philippines) shown twice, with attributes color-coded under “Your criteria” and “Fair criteria” for HIGH INCOME vs. LOW INCOME]
The left column shows your decision criteria, as estimated from your answers using AI: you tend to predict a high income when the information is blue (or when the value of blue information is high), and a low income when the information is red (or when the value of red information is high).
The right column shows fair decision criteria, as estimated by Fair AI; your decision will be fairer if you follow these criteria: to be fair, predict a high income when the information is blue and a low income when the information is red.
Annotations: the user’s evaluation was biased against a highlighted attribute; focusing on another highlighted attribute makes the evaluation fair.

Slide 29

Slide 29 text

Fair machine guidance: findings 29
● Experiments were conducted with 99 participants on two practical tasks: income prediction and credit risk scoring
● 71% of participants reduced their unfairness after receiving the guidance
■ FMG motivated participants to revise their own sense of fairness
■ A few participants distrusted the guidance but still gained new insights
Participant comments:
“I did not intend to apportion income by gender, but I was reminded that this was the basis of my thinking and I had to revise it.”
“There were instances where the AI system did not make a fair decision, specifically when deciding on annual income according to the country of origin.”
“I felt that it is important to evaluate people from a range of perspectives, rather than based on a single piece of information, such as gender, age, or race.”

Slide 30

Slide 30 text

Summary 30
● AI/ML methods to support diversity-aware human decision making
■ 1. CrowDEA: Multi-view idea prioritization
■ 2. Illumidea: LLM-powered idea categorization tool
■ 3. Fair machine guidance: Enhancing fair decision making in biased people