Slide 1

Algorithmic Bias in Machine Learning
Jill Cates, Data Scientist at BioSymetrics
November 16, 2019, PyCon Canada, Toronto
Image source: Wired

Slide 2

Machine Learning Trends

Slide 3

Machine Learning Trends: PyCon U.S. and PyCon Canada

Slide 4

Machine Learning Trends: PyCon U.S. and PyCon Canada 2019

Slide 5

Machine Learning Trends: Decision-Making, Facial Recognition, Recommendations, Financial Investing, Real Estate, Political Campaigns

Slide 6

Algorithmic Bias: an unfortunate by-product of machine learning
“systematic and repeatable errors in a computer system that create unfair outcomes, such as privileging one arbitrary group of users over others”
“Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.” - Cathy O’Neil (Weapons of Math Destruction)

Slide 7

Algorithmic Bias: an unfortunate by-product of machine learning
Examples: Online Advertising, Legal System, HR Recruitment, Facial Recognition, Healthcare, Credit Scores

Slide 8

Algorithmic Bias: an unfortunate by-product of machine learning
Future Implications: MIT Media Lab's Moral Machine: http://moralmachine.mit.edu/
Source: MIT Technology Review

Slide 9

Building a Shoe Classifier

Slide 10

Building a Shoe Classifier: Data Collection

Slide 11

Building a Shoe Classifier: Model Training
(Diagram: machine learning algorithm with an unlabeled “???” example)
Biased data = biased model

Slide 12

Building a President Classifier: Model Training
(Diagram: machine learning algorithm with an unlabeled “???” example; photo of Zuzana Čaputová, president of Slovakia)
Biased data = biased model
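The point generalizes beyond images: a model trained on data where one group dominates tends to look accurate overall while failing on the under-represented group. A minimal scikit-learn sketch of this "biased data = biased model" effect, with synthetic numeric features standing in for the shoe/president images (all numbers here are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy training set: 95% of examples come from one group (class 0),
# mimicking data collected with only the "usual" examples in mind.
X_majority = rng.normal(loc=0.0, scale=1.0, size=(950, 2))
X_minority = rng.normal(loc=1.0, scale=1.0, size=(50, 2))
X = np.vstack([X_majority, X_minority])
y = np.array([0] * 950 + [1] * 50)

model = LogisticRegression().fit(X, y)

# Overall accuracy looks high, but accuracy on the under-represented
# group is much lower: the model has barely learned it.
print("overall accuracy:", model.score(X, y))
print("under-represented group accuracy:", model.score(X_minority, np.ones(50)))
```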

Slide 13

Clinical Decision-Making
• Recent paper published in Science (Oct 25, 2019)
• Assessed a U.S. healthcare decision-making algorithm
• Looked at 50,000 records from an academic hospital
• Found that, for the same level of health, white patients were given higher risk scores and received more care than black patients
Examples of algorithmic clinical decision-making:
• Clinical diagnostic tools and early detection of disease
• When to admit or discharge a patient from the hospital
• Automated triaging of patients
• Assessing patient risk

Slide 14

Clinical Decision-Making
• MIMIC-III is a widely used healthcare dataset in machine learning research
• Developed and de-identified by the MIT Laboratory for Computational Physiology
• Contains electronic medical record data of 50,000 hospital admissions for 40,000 critical care patients
• Collected at Beth Israel Deaconess Medical Center between 2001 and 2012
Some papers that use MIMIC-III:

Slide 15

MIMIC-III Dataset Demographics
• 70% of patients are white
• 47% of patients are insured by Medicare
• 35% of patients are Catholic
• 41% of patients are married
• Mean age of adult patients is 62.5 years
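This kind of demographic audit is easy to run on any training dataset before modeling. A minimal pandas sketch; the file and column names ("admissions.csv", "ethnicity", "insurance", "age") are made up for illustration rather than MIMIC-III's actual schema:

```python
import pandas as pd

# Hypothetical admissions table with one row per patient.
admissions = pd.read_csv("admissions.csv")

# Share of each demographic group in the training data.
# Large imbalances (e.g. 70% of one ethnicity) signal representation-bias risk.
print(admissions["ethnicity"].value_counts(normalize=True))
print(admissions["insurance"].value_counts(normalize=True))

# Check the whole age distribution, not just the mean.
print(admissions["age"].describe())
```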

Slide 16

Language Translation: translating gender-neutral languages
• Gender-neutral languages: Malay, Farsi (Persian), Hungarian, Armenian, Tagalog, etc.
• Google Translate determines which gender should be assigned to which role
• Trained on examples of translations from the web

Slide 17

Language Translation: Word Embeddings
• Words are represented as vectors
• Similar words have similar representations
Word2Vec (gensim)
• Generates word embeddings
• Reveals semantic relations (associations) between words
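A rough sketch of what this looks like in code, assuming gensim 4.x and a tiny toy corpus (real embeddings are trained on millions of sentences, so the numbers below are noise; the point is the API):

```python
from gensim.models import Word2Vec

# Toy corpus: each document is a list of tokens.
sentences = [
    ["the", "man", "is", "a", "doctor"],
    ["the", "woman", "is", "a", "nurse"],
    ["the", "man", "is", "a", "king"],
    ["the", "woman", "is", "a", "queen"],
    ["the", "doctor", "examined", "the", "patient"],
    ["the", "nurse", "examined", "the", "patient"],
]

# Train a small Word2Vec model (vector_size is the embedding dimension).
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=200, seed=42)

# Each word is now a dense vector.
print(model.wv["doctor"].shape)  # (50,)

# Words used in similar contexts end up with similar vectors.
print(model.wv.similarity("doctor", "nurse"))

# Analogy-style queries ("king" - "man" + "woman") are how researchers probe
# the associations, including gender stereotypes, that the corpus encodes.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```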

Slide 18

Language Translation
• Gender bias is captured by a direction in the word embedding
• Gender-neutral words are distinct from gender-definition words in the word embedding
(Chart: number of generated analogies vs. number of stereotypic analogies)
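A rough illustration of the "gender direction" idea, using pretrained GloVe vectors loaded through gensim's downloader. The single she/he pair and the occupation words below are simplifications chosen for this sketch, not the exact construction behind the slide (the original work builds the direction from many pairs):

```python
import gensim.downloader as api
import numpy as np

# Small pretrained embeddings (downloaded on first use).
wv = api.load("glove-wiki-gigaword-50")

# Crude gender direction: the difference between two gender-definition words.
gender_direction = wv["she"] - wv["he"]
gender_direction /= np.linalg.norm(gender_direction)

def gender_projection(word):
    """Signed projection of a word vector onto the gender direction.
    Positive leans toward 'she', negative toward 'he'."""
    v = wv[word] / np.linalg.norm(wv[word])
    return float(np.dot(v, gender_direction))

# Occupation words are gender-neutral by definition, yet their embeddings
# often lean one way: that lean is the bias the slide refers to.
for word in ["nurse", "engineer", "homemaker", "programmer"]:
    print(word, round(gender_projection(word), 3))
```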

Slide 19

Mitigating the Risk of Bias: FairTest by Columbia University
• Contains metrics to test for “unwarranted associations between an algorithm's outputs and certain user subpopulations identified by protected features”
• Identifies subpopulations with disproportionately high error rates, assesses offensive labeling, and detects uneven rates of algorithmic error
Some included metrics: Normalized Mutual Information, Normalized Conditional Mutual Information, Binary Ratio, Binary Difference
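FairTest has its own API; the snippet below simply computes two of the named metrics directly with scikit-learn and numpy on toy data, to show what they measure:

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

# Toy data: a protected feature (0/1) and a binary model output for 10 users.
protected = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
predicted = np.array([1, 1, 1, 1, 0, 1, 0, 0, 0, 0])

# Normalized mutual information: 0 means the output carries no information
# about the protected group, 1 means it determines the group completely.
print("NMI:", normalized_mutual_info_score(protected, predicted))

# Binary difference and binary ratio compare positive-outcome rates between
# the two groups (a difference of 0 / ratio of 1 would be parity).
rate_g0 = predicted[protected == 0].mean()
rate_g1 = predicted[protected == 1].mean()
print("binary difference:", rate_g0 - rate_g1)
print("binary ratio:", rate_g1 / rate_g0)
```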

Slide 20

Mitigating the Risk of Bias: AIF360 (AI Fairness 360) by IBM
• Fairness metrics to test for biases (e.g., Generalized Entropy Index evaluates inequality in a dataset)
• Bias mitigation algorithms (e.g., Adversarial Debiasing, Learning Fair Representations, Disparate Impact Remover, etc.)
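A minimal sketch of the AIF360 workflow on a toy DataFrame. The column names and group definitions are made up for illustration, and Reweighing is used only because it is one of the simplest of AIF360's mitigation algorithms (the slide's in-processing methods, such as Adversarial Debiasing, modify the model itself instead):

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Toy data: "sex" is the protected attribute (1 = privileged), "label" the outcome.
df = pd.DataFrame({
    "sex":   [1, 1, 1, 1, 0, 0, 0, 0],
    "score": [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.5],
    "label": [1, 1, 0, 1, 1, 0, 0, 0],
})
dataset = BinaryLabelDataset(df=df, label_names=["label"],
                             protected_attribute_names=["sex"])

privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

# Measure bias in the data: gap in favorable-outcome rates between groups.
metric = BinaryLabelDatasetMetric(dataset, unprivileged_groups=unprivileged,
                                  privileged_groups=privileged)
print("mean difference:", metric.mean_difference())
print("disparate impact:", metric.disparate_impact())

# Pre-processing mitigation: reweight examples so the groups are balanced
# before a model is trained on the repaired dataset.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
repaired = rw.fit_transform(dataset)
```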

Slide 21

Mitigating the Risk of Bias: Explaining Models with Lime
• Deep learning models are difficult to interpret (can’t extract feature importances)
• Lime provides explanations of a black-box model’s predictions
• Can be used to interpret image and text classifiers
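A small sketch of explaining a text classifier with LIME. The scikit-learn pipeline and the 20 newsgroups categories are stand-ins chosen for the example (the data is downloaded on first use); any model exposing a predict_proba-style function can be explained the same way:

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Train a simple black-box text classifier on two newsgroup categories.
categories = ["sci.med", "rec.autos"]
train = fetch_20newsgroups(subset="train", categories=categories)
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipeline.fit(train.data, train.target)

# LIME perturbs the input text and fits a local, interpretable model around
# one prediction, returning the words that most influenced it.
explainer = LimeTextExplainer(class_names=categories)
explanation = explainer.explain_instance(
    train.data[0],              # the document to explain
    pipeline.predict_proba,     # black-box prediction function
    num_features=6,             # top contributing words to show
)
print(explanation.as_list())
```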

Slide 22

Achieving Algorithmic Fairness
• Conference on Fairness, Accountability, and Transparency (FAT)
• Algorithmic Accountability

Slide 23

Jill Cates, Data Scientist at BioSymetrics
GitHub: @topspinj / Twitter: @jillacates
[email protected]
Thank you!