Algorithmic Bias in Machine Learning

Algorithmic Bias in Machine Learning
Jill Cates, Data Scientist at BioSymetrics
November 16, 2019, PyCon Canada, Toronto
Image source: Wired

Machine Learning Trends
[Figure: PyCon U.S. and PyCon Canada, including PyCon Canada 2019]

Machine Learning Trends: Decision-Making, Facial Recognition, Recommendations, Financial Investing, Real Estate, Political Campaigns

Algorithmic Bias
An unfortunate by-product of machine learning
“systematic and repeatable errors in a computer system that create unfair outcomes, such as privileging one arbitrary group of users over others”
“Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.” - Cathy O’Neil (Weapons of Math Destruction)

Algorithmic Bias
An unfortunate by-product of machine learning
• Online Advertising
• Legal System
• HR Recruitment
• Facial Recognition
• Healthcare
• Credit Scores

Algorithmic Bias
An unfortunate by-product of machine learning
Future Implications: MIT Media Lab's Moral Machine: http://moralmachine.mit.edu/
Source: MIT Technology Review

Building a Shoe Classifier

Building a Shoe Classifier Data Collection

Building a Shoe Classifier: Model Training
[Diagram: Machine Learning Algorithm, ???]
Biased data = Biased model

Building a President Classifier: Model Training
[Diagram: Machine Learning Algorithm, ??? (Zuzana Čaputová, president of Slovakia)]
Biased data = Biased model
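One way to make "biased data = biased model" visible is to break accuracy down by subgroup instead of reporting a single number. The sketch below is only an illustration: the groups, labels, and predictions are hypothetical toy values, not results from any real classifier.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel lists of labels, predictions, and group names."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Hypothetical evaluation set: every item really is a shoe (label 1).
groups = ["sneaker"] * 9 + ["high heel"]
y_true = [1] * 10
# Hypothetical predictions from a model trained mostly on sneakers.
y_pred = [1] * 9 + [0]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = accuracy_by_group(y_true, y_pred, groups)
print(overall)    # 0.9
print(per_group)  # {'sneaker': 1.0, 'high heel': 0.0}
```

The overall score looks fine, while the under-represented group is misclassified every time.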

Clinical Decision-Making
• Clinical diagnostic tools and early detection of disease
• When to admit or discharge a patient from the hospital
• Automated triaging of patients
• Assessing patient risk

• Recent paper published in Science (Oct 25, 2019)
• Assessed a U.S. healthcare decision-making algorithm
• Looked at 50,000 records from an academic hospital
• Found that white patients were given higher risk scores than equally sick black patients, and so received more care

Clinical Decision-Making
• MIMIC-III is a widely used healthcare dataset in machine learning research
• Developed and de-identified by the MIT Laboratory for Computational Physiology
• Contains electronic medical record data of 50,000 hospital admissions for 40,000 critical care patients
• Collected at Beth Israel Deaconess Medical Centre between 2001 and 2012
[Figure: some papers that use MIMIC-III]

MIMIC-III Dataset Demographics
• 70% of patients are white
• 47% of patients are insured by Medicare
• 35% of patients are Catholic
• 41% of patients are married
(mean age of adult patients is 62.5 years old)
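A demographic breakdown like this is cheap to compute before training on any dataset. The following is a minimal sketch using only the standard library; the records and field names (`ethnicity`, `insurance`) are hypothetical stand-ins and are not MIMIC-III data or its actual schema.

```python
from collections import Counter


def category_shares(records, field):
    """Return {value: share of records} for one categorical field."""
    counts = Counter(r[field] for r in records)
    n = len(records)
    return {value: count / n for value, count in counts.items()}


# Hypothetical toy records (assumption: not drawn from MIMIC-III).
patients = [
    {"ethnicity": "white", "insurance": "Medicare"},
    {"ethnicity": "white", "insurance": "Private"},
    {"ethnicity": "white", "insurance": "Medicare"},
    {"ethnicity": "black", "insurance": "Medicaid"},
]

shares = category_shares(patients, "ethnicity")
print(shares)  # {'white': 0.75, 'black': 0.25}
```

Comparing these shares with the population the model will serve shows which groups are under-represented.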

Language Translation
Translating gender-neutral languages
• Gender-neutral languages: Malay, Farsi (Persian), Hungarian, Armenian, Tagalog, etc.
• Google Translate determines which gender should be assigned to which role
• Trained on examples of translations from the web

Language Translation
Word Embeddings
• Words are represented as vectors
• Similar words have similar representations
Word2Vec
• Generates word embeddings (gensim)
• Reveals semantic relations (associations) between words
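"Similar words have similar representations" is usually measured with cosine similarity between word vectors. Here is a minimal sketch with hand-made 3-dimensional vectors; the words and numbers are hypothetical and stand in for vectors that a trained Word2Vec model would provide.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumption: not taken from a trained model).
embeddings = {
    "sneaker": [0.9, 0.1, 0.0],
    "boot": [0.8, 0.2, 0.1],
    "senator": [0.0, 0.1, 0.9],
}

sim_shoes = cosine_similarity(embeddings["sneaker"], embeddings["boot"])
sim_other = cosine_similarity(embeddings["sneaker"], embeddings["senator"])
print(sim_shoes > sim_other)  # True
```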

Language Translation
• Gender bias is captured by a direction in the word embedding
• Gender-neutral words are distinct from gender-definition words in the word embedding
[Figure axes: number of generated analogies, number of stereotypic analogies]
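The idea of a gender "direction" can be sketched by taking the difference between two gender-definition word vectors and projecting other words onto it. This is a simplified illustration under stated assumptions: the 2-dimensional vectors are hypothetical, a single he/she pair defines the direction, and the function name `gender_projection` is made up for this example.

```python
import math


def subtract(u, v):
    return [a - b for a, b in zip(u, v)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def gender_projection(word_vec, he_vec, she_vec):
    """Signed length of word_vec along the unit (he - she) direction."""
    direction = normalize(subtract(he_vec, she_vec))
    return dot(word_vec, direction)


# Hypothetical toy vectors: first axis loosely encodes gender, second everything else.
he = [1.0, 0.2]
she = [-1.0, 0.2]
words = {
    "word_a": [0.4, 0.9],   # leans toward "he"
    "word_b": [-0.5, 0.8],  # leans toward "she"
    "word_c": [0.0, 1.0],   # no gender component
}

for name, vec in words.items():
    print(name, gender_projection(vec, he, she))
```

A projection near zero suggests the word carries little gender association in this toy space; a large positive or negative value suggests it leans toward one end of the direction.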

Mitigating the Risk of Bias
FairTest by Columbia University
• Contains metrics to test for “unwarranted associations between an algorithm's outputs and certain user subpopulations identified by protected features”
• Identifies subpopulations with disproportionately high error rates, assesses offensive labeling, and detects uneven rates of algorithmic error
Some included metrics: Normalized Mutual Information, Normalized Conditional Mutual Information, Binary Ratio, Binary Difference
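To give a feel for the simplest of these metrics, the sketch below computes a difference and a ratio of positive-outcome rates between two groups. This is not FairTest's API, and FairTest's exact definitions may differ; the function names and the toy data are assumptions made for illustration.

```python
def positive_rate(outcomes, groups, group):
    """Share of positive (1) outcomes among members of one group."""
    selected = [o for o, g in zip(outcomes, groups) if g == group]
    return sum(selected) / len(selected)


def binary_difference(outcomes, groups, group_a, group_b):
    """Difference in positive-outcome rates (0 means parity)."""
    return positive_rate(outcomes, groups, group_a) - positive_rate(outcomes, groups, group_b)


def binary_ratio(outcomes, groups, group_a, group_b):
    """Ratio of positive-outcome rates (1 means parity)."""
    return positive_rate(outcomes, groups, group_a) / positive_rate(outcomes, groups, group_b)


# Hypothetical binary outcomes (1 = favourable) for two groups of four people.
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a"] * 4 + ["b"] * 4

print(binary_difference(outcomes, groups, "a", "b"))  # 0.5
print(binary_ratio(outcomes, groups, "a", "b"))       # 3.0
```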

Mitigating the Risk of Bias
AIF360 (AI Fairness 360) by IBM
• Fairness metrics to test for biases (e.g., Generalized Entropy Index evaluates inequality in a dataset)
• Bias mitigation algorithms (e.g., Adversarial Debiasing, Learning Fair Representations, Disparate Impact Remover, etc.)
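As a rough illustration of the Generalized Entropy Index, here is a from-scratch sketch that does not use the AIF360 library. It assumes the commonly used "benefit" definition b_i = y_pred_i - y_true_i + 1 and the formula GE(alpha) = 1 / (n * alpha * (alpha - 1)) * sum((b_i / mean(b))^alpha - 1) for alpha other than 0 or 1; treat both as assumptions and check the AIF360 documentation before relying on them.

```python
def generalized_entropy_index(y_true, y_pred, alpha=2):
    """Inequality of per-individual benefits; 0 means everyone benefits equally."""
    if alpha in (0, 1):
        raise ValueError("this sketch only covers alpha other than 0 and 1")
    # Assumed benefit: 1 for a correct prediction, 2 for a false positive, 0 for a false negative.
    benefits = [p - t + 1 for t, p in zip(y_true, y_pred)]
    n = len(benefits)
    mean = sum(benefits) / n
    if mean == 0:
        raise ValueError("mean benefit is zero; index is undefined")
    total = sum((b / mean) ** alpha - 1 for b in benefits)
    return total / (n * alpha * (alpha - 1))


# Hypothetical labels and predictions.
y_true = [1, 0, 1, 0]
perfect = [1, 0, 1, 0]
uneven = [1, 1, 0, 0]

print(generalized_entropy_index(y_true, perfect))  # 0.0
print(generalized_entropy_index(y_true, uneven))   # 0.25
```

Higher values mean the prediction errors distribute benefit less evenly across individuals.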

Mitigating the Risk of Bias
Explaining Models with LIME
• Deep learning models are difficult to interpret (feature importances can’t be extracted directly)
• LIME provides explanations of a black-box model’s predictions
• Can be used to interpret image and text classifiers

Achieving Algorithmic Fairness
• Conference on Fairness, Accountability, and Transparency (FAT)
• Algorithmic Accountability

Thank you!
Jill Cates, Data Scientist at BioSymetrics
github: @topspinj / twitter: @jillacates
