
Open Data Science Conference 2023 -- Invited Talk


Generative AI (GAI) systems such as ChatGPT have revolutionised the way we interact with AI systems. These models can provide precise and detailed answers to our information needs, expressed in the form of brief text-based prompts. However, some of the responses generated by GAI systems can contain harmful social biases such as gender or racial biases. Detecting and mitigating such biased responses is an important step towards establishing user trust in GAI. In this talk, I will describe the latest developments in methodologies that can be used to detect social biases in texts generated by GAI systems. First, I will describe methods that can detect social biases expressed not only in English but in other languages as well, with minimal human intervention, which is particularly important when scaling social bias evaluation to many languages. Second, I will describe methods that can mitigate the identified social biases in large-scale language models. Experiments show that although some of the social biases can be identified and mitigated with high accuracy, the existing techniques are not perfect and indirect associations remain in generative NLP models. Finally, I will describe ongoing work in the NLP community to address these shortcomings and to develop not only accurate but also trustworthy AI systems for the future.

Danushka Bollegala

June 14, 2023

Transcript

  1. Towards Socially Unbiased Generative Artificial Intelligence
    Professor Danushka Bollegala
    University of Liverpool


  2. Is AI Socially Biased?
    Images created by DALL-E (OpenAI)
    https://www.vice.com/en/article/wxdawn/the-ai-that-draws-what-you-type-is-very-racist-shocking-no-one


  3. Is AI Socially Biased?
    Images generated by DALL-E (OpenAI)


  4. Is AI Socially Biased?
    Images generated by Stable Diffusion (Stability.AI)
    “Janitor”
    “Assertive Firefighter”
    https://techpolicy.press/researchers-find-stable-diffusion-amplifies-stereotypes/


  5. Is AI Socially Biased?
    Tweets by Steven Piantadosi (shorturl.at/morB7)
    Dec 4, 2022


  6. Machine Learning 101
    [Diagram labels: Raw Data, Human Labels, Training Data, Algorithm, Model, Inference]


  7. Types of biases
    Baeza-Yates 2018


  8. The Wisdom of Crowd a Few
    • 50% of all articles in Wikipedia (at the start) were written by 0.04% (ca. 2000) of its editors.

    • Only 4% of all active users at Amazon write product reviews.

    • 50% of all websites are in English, whereas only 5% of the world’s population are native English speakers (13% if we add non-native speakers).

    • Only 7% of Facebook users produce 50% of the posts.

    • The most popular 0.05% of people attract (are followed by) 50% of Twitter users.

    • Zipf’s Least Effort Principle — many people do only a little while a few people do a lot.


  9. Gender Bias
    Accumulated fraction of women’s biographies in Wikipedia
    Baeza-Yates 2018


  10. But it is not OK for AI to be biased!
    • Legal argument

    • Title VII of the Civil Rights Act of 1964 in the US

    • Prohibits employment discrimination due to race, religion, gender and ethnicity

    • EU Charter of Fundamental Rights, Title III (Equality), Article 21 Non-discrimination

    • 1. Any discrimination based on any ground such as sex, race, colour, ethnic or social origin, genetic features, language, religion or belief, political or any other opinion, membership of a national minority, property, birth, disability, age or sexual orientation shall be prohibited.

    • 2. Within the scope of application of the Treaties and without prejudice to any of their specific provisions, any discrimination on grounds of nationality shall be prohibited.

    • Commercial argument

    • Your customers will lose trust in your AI-based service

    • Moral argument

    • Come on! Why should we let humans be discriminated against by AI?


  11. Can we measure gender bias in LLMs?
    Example: “She is a nurse” (likelihood 0.9) vs. “He is a nurse” (likelihood 0.4).
    If an LLM assigns a higher likelihood score to one sentence than to the other, it is
    considered to be preferring one gender over the other, hence gender biased.

    For “She/He is a nurse”, the token-level likelihoods of each variant are averaged,
    e.g. She: (0.9 + 0.8 + 0.6 + 0.7) / 4 vs. He: (0.8 + 0.9 + 0.5 + 0.6) / 4, and the
    difference between the two averages indicates which gender is preferred.

    All Unmasked Likelihood (AUL) score = percentage of sentence pairs where the male
    version has a higher likelihood than the female version (a minimal code sketch
    follows below).
    Kaneko et al. [AAAI 2022]
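
    The likelihood comparison sketched on this slide can be approximated with an off-the-shelf
    masked language model. The following is a minimal sketch, not the exact AUL implementation
    from Kaneko et al. (AAAI 2022): each sentence is scored by the average log-probability a
    BERT-style MLM assigns to its tokens without masking them, and the two gendered variants
    are compared. The model name and sentences are only illustrative.

    ```python
    # Minimal sketch of the "all unmasked" likelihood comparison (illustrative,
    # not the official AUL implementation from Kaneko et al., AAAI 2022).
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def sentence_score(sentence: str) -> float:
        """Average log-probability the MLM assigns to the (unmasked) tokens."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits                 # (1, seq_len, vocab)
        log_probs = torch.log_softmax(logits, dim=-1)
        token_ids = inputs["input_ids"][0]
        inner = range(1, len(token_ids) - 1)                # skip [CLS] and [SEP]
        scores = [log_probs[0, i, token_ids[i]].item() for i in inner]
        return sum(scores) / len(scores)

    female = sentence_score("She is a nurse.")
    male = sentence_score("He is a nurse.")
    print(f"She: {female:.3f}  He: {male:.3f}")
    print("Model prefers the", "male" if male > female else "female", "variant.")
    ```

    Repeating this comparison over many sentence pairs and reporting the percentage where the
    male variant scores higher gives an AUL-style bias score.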


  12. Multi-lingual Bias Evaluation
    • Social biases are not limited to English. However, compared to the bias evaluation
    datasets annotated for English, the datasets available for other languages are limited.

    • Annotating datasets eliciting social biases for each language is costly and time
    consuming, and it might even be difficult to recruit annotators for this purpose.

    • We proposed a multilingual bias evaluation measure that uses existing parallel
    translation data (see the sketch below).
    Kaneko et al. [NAACL 2022]
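
    As an illustration of the general idea only (not the measure proposed in Kaneko et al.,
    NAACL 2022), one could take gendered sentence pairs obtained for another language via
    parallel translation data and run the same likelihood comparison there with a multilingual
    MLM. The model choice and the sentence pairs below are hypothetical.

    ```python
    # Illustration only: apply the earlier likelihood comparison to translated
    # sentence pairs, using a multilingual MLM. NOT the measure from the paper.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    mlm = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")
    mlm.eval()

    def avg_log_likelihood(sentence: str) -> float:
        """Average log-probability the MLM assigns to the unmasked tokens."""
        enc = tok(sentence, return_tensors="pt")
        with torch.no_grad():
            log_probs = torch.log_softmax(mlm(**enc).logits, dim=-1)
        ids = enc["input_ids"][0]
        inner = range(1, len(ids) - 1)              # skip [CLS] and [SEP]
        return sum(log_probs[0, i, ids[i]].item() for i in inner) / len(ids[1:-1])

    # (female_variant, male_variant) pairs in the target language (here German),
    # e.g. obtained by translating English evaluation sentences via a parallel corpus.
    pairs = [
        ("Sie ist Krankenpflegerin.", "Er ist Krankenpfleger."),
        ("Sie ist Ingenieurin.", "Er ist Ingenieur."),
    ]

    male_preferred = sum(avg_log_likelihood(m) > avg_log_likelihood(f) for f, m in pairs)
    print(f"Male variant preferred in {100.0 * male_preferred / len(pairs):.1f}% of pairs")
    ```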


  13. Which bias evaluation measure?
    • Different intrinsic bias evaluation measures have been proposed, which use different
    evaluation datasets and criteria

    • CrowS-Pairs [Nangia et al.], StereoSet [Nadeem et al.], AUL/AULA [Kaneko et al.],
    Template-based Scores (TBS) [Kurita et al.]

    • How can we know which intrinsic bias evaluation measure is most appropriate?
    Kaneko et al. [EACL 2023]


  14. Bias controlling in LLMs
    [Figure: Average output probabilities for “[MASK] is a/an [Occupation]” produced by the
    bias-controlled BERT and ALBERT PLMs fine-tuned with different r on the news dataset;
    panels (a) r = 1.0, (b) r = 0.7, (c) r = 0.5.]
    Example prompt: [MASK] doesn’t have time for the family due to work obligations.


  15. Gender-biases in LLMs
    Pearson correlation between the biased PLM order and each bias score.
    “news” and “book” denote the corpus used for biasing. HA is the AUC value
    of the method that uses human annotation.

                      BERT                 ALBERT
    Measure    news   book    HA     news   book    HA
    TBS        0.14   0.09    -      0.25   0.14    -
    SSS        0.22   0.22    0.45   0.31   0.22    0.53
    CPS        0.30   0.27    0.57   0.37   0.22    0.48
    AUL        0.37   0.32    0.68   0.55   0.36    0.56
    AULA       0.42   0.34    0.71   0.60   0.42    0.57

    • For BERT, the proposed method induces the same order among the measures
    (i.e. AULA > AUL > CPS > SSS) as HA does, on both news and book.

    • For ALBERT, only the rankings of SSS and CPS differ between the proposed method and HA.

    • These results show that the proposed method and the existing method that uses human
    annotations rank the intrinsic gender bias evaluation measures in almost the same order
    (a toy sketch of this correlation follows below).
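
    The comparison in the table boils down to measuring how well each bias measure’s scores
    track a known ordering of increasingly biased PLMs. A hypothetical sketch with made-up
    numbers (not the evaluation code from Kaneko et al., EACL 2023):

    ```python
    # Hypothetical sketch: correlate the ordering a bias measure induces over a
    # set of increasingly biased PLMs with the ordering given by human annotation.
    # All numbers are made up for illustration.
    from scipy.stats import pearsonr

    # Bias scores that one measure (say, AULA) assigns to five PLMs that were
    # deliberately biased to increasing degrees (hypothetical values).
    aula_scores = [0.51, 0.55, 0.58, 0.63, 0.70]

    # Reference degree of bias for the same five PLMs, e.g. obtained from human
    # annotation (hypothetical values).
    human_scores = [0.10, 0.20, 0.35, 0.50, 0.65]

    r, p = pearsonr(aula_scores, human_scores)
    print(f"Pearson r = {r:.2f} (p = {p:.3f})")
    # A measure whose scores track the known bias ordering more closely obtains a
    # higher correlation, so measures can be ranked by r (as in the table above).
    ```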


  16. Debiasing isn’t enough!
    • Intrinsic bias scores:

    • Evaluate social biases in LLMs in their own right, independently
    of any downstream task

    • Extrinsic bias scores:

    • Evaluate social biases in LLMs when they are applied to solve a
    specific downstream task such as Natural Language Inference (NLI),
    Semantic Textual Similarity (STS) or predicting occupations
    from biographies (BiasBios).

    • The correlation between intrinsic and extrinsic bias scores is weak
    Kaneko et al. [COLING 2022]


  17. Intrinsic vs. Extrinsic Bias Scores
    [Figure: Differences between the bias scores of original vs. debiased MLMs
    (bert-bu, bert-lu, bert-bc, bert-lc, roberta-b, roberta-l, albert-b) under the
    AT, CDA and DO debiasing methods; panels (a) SSS, (b) CPS, (c) AULA, (d) BiasBios,
    (e) STS-bias, (f) NLI-bias.]
    Negative values indicate that the debiased MLM has a lower bias than its original
    (non-debiased) version.


  18. We are not done… (yet)
    • Debiased LLMs, when fine-tuned for downstream tasks, can
    sometimes relearn the social biases! [Kaneko et al. COLING 2022]

    • Sometimes when we combine (meta-embed) debiased embeddings,
    they become biased again! [Kaneko et al. EMNLP 2022]

    • When we debias LLMs, we lose performance on downstream tasks

    • Evaluating social biases across languages and cultures is hard (and
    no annotated datasets are available)

    • Methods for automatically evaluating multilingual biases
    [Kaneko et al. EACL 2023]


  19. AI learns from its mistakes (unlike humans)


  20. Questions
    Danushka Bollegala

    https://danushka.net

    [email protected]

    @Bollegala

    Thank You