
What does Fairness in Information Access Mean and Can We Achieve It?

wing.nus
March 13, 2023

Abstract: Bias in data and a lack of transparency and fairness in algorithms are not new problems, but with increasing scale, complexity, and adoption, most AI systems now suffer from these issues at an unprecedented level. Information access systems are not spared, since almost all large-scale information access today is mediated by algorithms. These algorithms are optimized not only for relevance, which is subjective to begin with, but also for measures of engagement and impressions. They pick up signals of what may be 'good' from individuals and perpetuate them through learning methods that are opaque and hard to debug. Considering 'fairness' and introducing more transparency can help, but they can also backfire or create other issues. We also need to understand how and why users of these systems engage with content. In this talk, I will share some of our attempts to bring fairness to ranking systems and then discuss why the solutions are not that simple.

Speaker Bio: Dr. Chirag Shah is a Professor in the Information School, an Adjunct Professor in the Paul G. Allen School of Computer Science & Engineering, and an Adjunct Professor in Human Centered Design & Engineering (HCDE) at the University of Washington (UW). He is the Founding Director of the InfoSeeking Lab and the Founding Co-Director of RAISE, a Center for Responsible AI. His research revolves around intelligent systems. He received his PhD in Information Science from the University of North Carolina (UNC) at Chapel Hill.


Transcript

  1. What does Fairness in Information Access Mean and Can We Achieve It?
    Chirag Shah
    @chirag_shah


  2. We live in a biased world
    • We are biased.
    • Any dataset can be biased.
    • Any model can be biased.
    • No data is a perfect representation of the world; tradeoffs are made during data collection, storage, and analysis.


  3. Fairness = Lack of Bias?
    • Bias is not always bad.
    • Three definitions of fairness (the first two are sketched in code after this list):
    • Statistical parity
    • Disparate impact
    • Disparate treatment
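
    To make the first two notions concrete, here is a minimal sketch (not from the talk; all names and data are hypothetical) that checks them on a ranked list whose documents fall into topical clusters. Disparate treatment concerns the decision process itself, whether the protected attribute is used at all, so it cannot be checked from the output ranking alone.

```python
# Hypothetical sketch (not from the talk): checking two fairness notions
# on a ranked list whose documents belong to topical clusters.
from collections import Counter

def cluster_shares(ranking, cluster_of):
    """Fraction of the ranked list occupied by each cluster."""
    counts = Counter(cluster_of[doc] for doc in ranking)
    return {c: counts[c] / len(ranking) for c in counts}

def statistical_parity(ranking, cluster_of, tol=0.05):
    """Every cluster gets (roughly) an equal share of the list."""
    shares = cluster_shares(ranking, cluster_of)
    target = 1 / len(shares)
    return all(abs(s - target) <= tol for s in shares.values())

def disparate_impact(ranking, cluster_of, population_share, tol=0.05):
    """Each cluster's share of the list tracks its share of the
    underlying collection (e.g. 70%/30%) rather than being equal."""
    shares = cluster_shares(ranking, cluster_of)
    return all(abs(shares.get(c, 0.0) - p) <= tol
               for c, p in population_share.items())

# A top-10 list where 7 documents come from cluster "A", 3 from "B".
cluster_of = {f"d{i}": ("A" if i < 7 else "B") for i in range(10)}
top10 = [f"d{i}" for i in range(10)]
print(statistical_parity(top10, cluster_of))                      # False
print(disparate_impact(top10, cluster_of, {"A": 0.7, "B": 0.3}))  # True
```

    The same list can satisfy one definition and fail the other, which is why the choice of definition matters in the re-ranking work that follows.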


  4. Addressing fairness through diversity
    ● Took a sliver of search data (queries, top results).
    ● Clustered the results and quantified the amount of topical bias.
    ● Designed new algorithms to re-rank those results to produce a fairer ranking.
    ● Two forms of fairness:
    ○ Statistical parity
    ○ Disparate impact
    Ruoyuan Gao (Amazon)
    Gao, R. & Shah, C. (2020). Toward Creating a Fairer Ranking in Search Engine Results. Information Processing & Management (IP&M), 57(1).
    Gao, R. & Shah, C. (2019). How Fair Can We Go: Detecting the Boundaries of Fairness Optimization in Information Retrieval. In Proceedings of the ACM International Conference on Theory of Information Retrieval (ICTIR), pp. 229-236. October 2-5, 2019. Santa Clara, CA, USA.
    Gao, R., Ge, Y., & Shah, C. (2022). FAIR: Fairness-Aware Information Retrieval Evaluation. Journal of the Association for Information Science and Technology (JASIST).


  5. Datasets
    ● Google
    ○ From Google Trends (June 23-June 29, 2019)
    ○ 100 queries
    ○ Top 100 results per query
    ● New York Times
    ○ 1.8M articles published by NYT
    ○ 50 queries
    ○ Top 100 results per query
    ● Clustering with two subtopics per query (a sketch of quantifying the resulting topical bias follows)
    Example queries: madden shooting, hurricane lane update, jacksonville shooting video, shanann watts, holy fire update, fortnite galaxy skin, new deadly spider, stolen plane

  6. Creating a Fair Top-10 List from Top-100


  7. Creating a Fair Top-10 List from Top-100


  8. Statistical parity
    (Figure: the two clusters receive equal shares of the top-10.)


  9. Disparate impact
    (Figure: the two clusters receive 70% / 30% shares of the top-10, matching their shares in the collection.)
    Problem: we are disregarding relevance.


  10. Top-Top with Statistical Parity
    (Figure: equal shares, drawn from the top of each cluster.)


  11. Top-Top with Disparate Impact
    (Figure: 70% / 30% shares, drawn from the top of each cluster.)
    Problem: subsequent documents may not have as much novelty.


  12. Page-wise with Statistical Parity
    (Figure: each result page filled with equal shares from the two clusters.)


  13. Page-wise with Disparate Impact
    (Figure: each result page filled with 70% / 30% shares from the two clusters; a sketch of page-wise filling follows.)
    Problem: we are not getting enough diversity by sampling from the tops.
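
    A minimal sketch of the page-wise idea, under my assumption (not spelled out on the slides) that each 10-result page is filled from per-cluster rankings in a target proportion: 50/50 for statistical parity, 70/30 for disparate impact.

```python
# Hypothetical sketch of the page-wise strategy: fill each result page
# from per-cluster rankings in a target proportion. Targets of 0.5/0.5
# give statistical parity; 0.7/0.3 gives disparate impact.
def page_wise_page(cluster_rankings, proportions, start, page_size=10):
    """Build one page; `start` is how many results earlier pages used."""
    page = []
    for c, p in proportions.items():
        quota = round(p * page_size)
        offset = round(p * start)  # this cluster's contribution so far
        page.extend(cluster_rankings[c][offset:offset + quota])
    return page

clusters = {"A": [f"a{i}" for i in range(50)], "B": [f"b{i}" for i in range(50)]}
print(page_wise_page(clusters, {"A": 0.5, "B": 0.5}, start=0))   # 5 + 5
print(page_wise_page(clusters, {"A": 0.7, "B": 0.3}, start=0))   # 7 + 3
print(page_wise_page(clusters, {"A": 0.7, "B": 0.3}, start=10))  # next page
```

    Because every page samples from the top of each cluster, later pages keep repeating near-duplicates of the head, which is the diversity problem the slide points out.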


  14. ε-greedy
    ● Explore the results with probability ε, exploit with probability 1-ε.
    ● ε = 0.0 → no exploration
    ● ε = 1.0 → full exploration (randomness)
    ● Non-fair (naïve) ε-greedy:
    ○ with probability ε, randomly select from the entire rank-list (100)
    ○ with probability 1-ε, pick from the top
    ● Fair ε-greedy:
    ○ with probability ε, randomly select a cluster, then pick the top result from that cluster
    ○ with probability 1-ε, pick the "fair" cluster, then pick the top result from that cluster
    ● Works with either notion: statistical parity | disparate impact (see the sketch below)
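
    The fair ε-greedy procedure above can be sketched as follows. One detail the slide leaves open is how the "fair" cluster is chosen; this sketch assumes it is the cluster currently furthest below its target share (targets of 0.5/0.5 encode statistical parity; collection proportions such as 0.7/0.3 encode disparate impact).

```python
# Sketch of fair epsilon-greedy as described on the slide. The rule for
# picking the "fair" cluster is an assumption: the cluster currently
# furthest below its target share of the output list.
import random

def fair_epsilon_greedy(cluster_rankings, targets, k=10, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    cursors = {c: 0 for c in cluster_rankings}  # next unused doc per cluster
    counts = {c: 0 for c in cluster_rankings}   # docs emitted per cluster
    out = []
    while len(out) < k:
        available = [c for c in cluster_rankings
                     if cursors[c] < len(cluster_rankings[c])]
        if not available:
            break
        if rng.random() < epsilon:
            # explore: random cluster, then its top remaining result
            c = rng.choice(available)
        else:
            # exploit: the most "owed" cluster, then its top remaining result
            c = max(available,
                    key=lambda cl: targets[cl] - counts[cl] / max(len(out), 1))
        out.append(cluster_rankings[c][cursors[c]])
        cursors[c] += 1
        counts[c] += 1
    return out

clusters = {"A": [f"a{i}" for i in range(50)], "B": [f"b{i}" for i in range(50)]}
print(fair_epsilon_greedy(clusters, {"A": 0.5, "B": 0.5}))  # statistical parity
print(fair_epsilon_greedy(clusters, {"A": 0.7, "B": 0.3}))  # disparate impact
```

    Setting epsilon to 0.0 or 1.0 reproduces the two extremes on the slide: pure exploitation of the fair cluster versus fully random cluster choice.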


  15. Implications: text search


  16. Measuring impacts on users
    http://fate.infoseeking.org/googleornot.php


  17. (Image-only slide; no transcript text.)

  18. Translating systems to experiences
    • Most people can't tell the difference between the original Google results and those re-ranked with ε = 0.3.
    • But they can when ε > 0.5.
    • Lesson: we can carefully introduce diversity into search results, reducing bias while maintaining user satisfaction.


  19. Implications: image search


  20. Query “CEO United States”


  21. Query “CEO UK”


  22. Feng, Y. & Shah, C. (2022). Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search. AAAI Conference on Artificial Intelligence. February 22-March 1, 2022. Vancouver, Canada.


  23. Reducing bias using the "fair-greedy" approach
    Feng, Y. & Shah, C. (2022). Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search. AAAI Conference on Artificial Intelligence. February 22-March 1, 2022. Vancouver, Canada.
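
    The paper's algorithm is not reproduced on the slide, so the following is only one plausible reading of a "fair-greedy" re-ranker (all names, the scoring rule, and the weighting are my assumptions): greedily pick the next image that best trades off relevance against the running deviation from a target gender proportion.

```python
# One plausible reading of a "fair-greedy" image re-ranker (not the
# paper's exact algorithm): at each step pick the image that maximizes
# relevance minus a penalty for moving the running gender mix away from
# a target proportion.
def fair_greedy(images, relevance, gender_of, target_female=0.5, k=10, lam=1.0):
    chosen, pool = [], list(images)
    while pool and len(chosen) < k:
        def score(img):
            females = sum(1 for i in chosen if gender_of[i] == "female")
            females += gender_of[img] == "female"
            skew = abs(females / (len(chosen) + 1) - target_female)
            return relevance[img] - lam * skew
        best = max(pool, key=score)
        chosen.append(best)
        pool.remove(best)
    return chosen

# Toy example: male-labeled images dominate the relevance scores.
images = [f"img{i}" for i in range(6)]
relevance = {img: 1.0 - 0.1 * i for i, img in enumerate(images)}
gender_of = {img: ("male" if i < 4 else "female") for i, img in enumerate(images)}
print(fair_greedy(images, relevance, gender_of, k=4))
```

    The weight lam controls the relevance-fairness trade-off, which is exactly the kind of multi-objective tension the next slide calls hard and not always well-defined.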


  24. But…
    • [Technical] Multi-objective optimization (fairness in a marketplace) is hard and not always well-defined.
    • [Business] Re-ranking brings additional costs.
    • [Social] Our notions of what's biased, what's fair, and what's good keep changing.


  25. Summary
    • Large-scale information access systems suffer from problems of bias, unfairness, and opaqueness – some due to technical issues, some due to business objectives, and some due to social issues.
    • We could audit these systems and create education, awareness, and advocacy around them.
    • Ideally, we need a multifaceted approach, similar to how smoking was curbed.


  26. Thank you!
