
Fairness in AI - Understanding and Addressing AI bias


Machine learning and artificial intelligence continue to become ever more central to every aspect of our lives and the pace of adoption is only continuing to accelerate. AI should be a force for good and has already delivered innumerable benefits. However, as AI starts to decide everything from whether we get a home loan to whether our resume is considered by a company, it is critical to ensure that these decisions are fair, equitable and explainable. Unfortunately, it is becoming increasingly clear that, much like humans, AI can be biased, and there have been many very public incidents where projects had to be abandoned due to catastrophic biases.
In this presentation, we start by considering the ramifications of bias, discuss how fairness is defined, and consider regulated domains and protected classes. We continue by highlighting how bias can be introduced into AI solutions, with significant focus on NLP, where models trained on large public data corpora can assume many of the explicit and implicit biases that are unfortunately present in humankind’s communications. We subsequently discuss how this bias can be measured, tracked and even minimized. We present best practices for ensuring that bias doesn’t creep into models over time, discuss open-source toolkits and highlight how explainability can be used to perform real-time checks on predictions.


Lawrence Spracklen

October 27, 2020


  1. AI DEV WORLD 2020 Fairness in AI - Understanding and

    Addressing AI Bias Sonya Balzer Dr. Lawrence Spracklen RSquared.AI
  2. Today’s Speakers Dr. Lawrence Spracklen CTO, RSquared VP of Engineering

    & Data Science, SupportLogic VP of Engineering, Alpine Data VP of Engineering, Ayasdi lawrence@rsquared.ai Sonya Balzer Director of Marketing, RSquared Marketing Director, SupportLogic sonya@rsquared.ai
  3. Where We’re From • RSquared is a data-driven actionable insights

    platform used by organizations to improve workplace culture, inclusion and productivity • Using AI / NLP to securely analyze employee interactions and attitudes through work emails, chats, and other digital communications
  4. The Current Environment + +

  5. Equality Requires Fairness • Why is this true? • Fairness

    is being free from bias or injustice; evenhandedness • But all human beings hold unconscious beliefs about different groups • Why protected classes and regulations are needed
  6. Bias Exists in Technology Too • Discrimination exists in algorithms

    • Most AI systems assume gender is binary • Software is written by humans and we’re inherently biased
  7. Why We All Should Care • Detecting and correcting bias

    is becoming more critical to society • The more we learn about bias in AI, the more we learn about bias in humans • Explainable AI is one of the top trends in the field of Machine Learning today • US laws proposed to require large companies to audit ML systems
  8. Fair & Equitable AI • Call for explainable and unbiased

    AI is happening now • Implicit and explicit biases must be addressed • This requires a combination of social science + data science
  9. “With great power comes great responsibility” - Peter Parker Principle

  10. Shouldn’t Computers be Fair? • AI algorithms aren’t necessarily biased

    • Algorithms are trained on example data • Models learn explicit or implicit biases in the training data • Without appropriate checks & safeguards AI can be ruthless • Leverages statistical differences to make decisions • No fairness through unawareness!
  11. Attacking the Problem Bias detection • Does my data set

    contain bias? • Is my model biased? Bias explainability • Where is the bias? • Which are the problem features? • What features is my model using when making a prediction? Bias mitigation • Can I reduce the impact of these biases? • Should I be rectifying the data, the model or predictions?
  12. Understand your data • Examine data with respect to sensitive/protected

    features • Does the proportion of positive outcomes vary across protected groups? • Are all populations adequately represented? • Are features correlated with sensitive/protected features? • Can sensitive/protected features be predicted from the remaining features? • Variety of different metrics to measure unfairness • Disparate impact = P(Ŷ = 1 | A = unprivileged) / P(Ŷ = 1 | A = privileged) [Group fairness] • Consistency = 1 − (1/n) Σᵢ |ŷᵢ − (1/k) Σ_{j ∈ kNN(xᵢ)} ŷⱼ| [Individual fairness]
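These two unfairness metrics can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation from a toolkit such as AI Fairness 360; the function names, the brute-force kNN, and the toy arrays are assumptions for demonstration:

```python
import numpy as np

def disparate_impact(y_pred, protected):
    """P(positive | unprivileged) / P(positive | privileged).
    Values near 1.0 suggest group fairness; the common '80% rule'
    flags anything below 0.8."""
    return y_pred[protected == 0].mean() / y_pred[protected == 1].mean()

def consistency(X, y_pred, k=3):
    """Individual fairness: 1 - mean |y_i - mean(y over the k nearest
    neighbours of x_i)|, using brute-force Euclidean kNN."""
    diffs = []
    for i in range(len(X)):
        dists = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(dists)[1:k + 1]  # skip the point itself
        diffs.append(abs(y_pred[i] - y_pred[nn].mean()))
    return 1.0 - np.mean(diffs)

# Toy example: protected == 1 marks the privileged group
y_pred = np.array([1, 0, 0, 1, 1, 1, 0, 1])
protected = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(disparate_impact(y_pred, protected))  # 0.5 / 0.75 ≈ 0.667
```

Here the unprivileged group receives positive outcomes at 0.5/0.75 ≈ 0.67 the privileged rate, which the 80% rule would flag.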
  13. Checking Model Bias • Go beyond looking at overall metrics

    • Aggregate stats can hide significant problems • Model fidelity can vary significantly across protected groups • Even when overall stats are good, sub-populations may be modeled poorly • Break out model stats with respect to each protected group • PPV and NPV grouped by the sensitive attribute. • TPR, FPR, TNR and FNR grouped by the sensitive attribute. • ROC per sensitive attribute value
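Breaking out the per-group rates listed above is straightforward from confusion-matrix counts. A minimal sketch (function name and toy data are illustrative, not from any particular toolkit):

```python
import numpy as np

def group_rates(y_true, y_pred, sensitive):
    """Break out TPR, FPR and PPV per sensitive-attribute value."""
    rates = {}
    for g in np.unique(sensitive):
        t, p = y_true[sensitive == g], y_pred[sensitive == g]
        tp = np.sum((t == 1) & (p == 1))
        fp = np.sum((t == 0) & (p == 1))
        fn = np.sum((t == 1) & (p == 0))
        tn = np.sum((t == 0) & (p == 0))
        rates[g] = {"TPR": tp / max(tp + fn, 1),
                    "FPR": fp / max(fp + tn, 1),
                    "PPV": tp / max(tp + fp, 1)}
    return rates

# Toy example: identical overall accuracy, very different group rates
y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 1, 1, 0])
sens   = np.array([0, 0, 0, 0, 1, 1, 1, 1])
rates = group_rates(y_true, y_pred, sens)
```

In this toy data both groups are classified with 50–75% accuracy overall, yet the TPR is 0.5 for one group and 1.0 for the other — exactly the kind of gap aggregate stats hide.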
  14. Understand your models • Complex models are black boxes •

    Explainability provides insights into features driving predictions • Possible at a global level or an individual level • Global : what are the most important features overall • Individual : which features are most important for an individual prediction • Individual explanations can be expensive • Sample around observation and observe impact • Train localized interpretable model approximation https://github.com/marcotcr/lime
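The "sample around the observation, train a localized interpretable approximation" recipe above is what LIME does. The lime library linked on the slide has its own API; the following is a from-scratch sketch of the idea only, with all names and defaults assumed for illustration:

```python
import numpy as np

def local_surrogate(predict_fn, x, n_samples=500, width=1.0, seed=0):
    """LIME-style sketch: sample around x, weight samples by proximity,
    fit a weighted linear model, and read off its coefficients as
    per-feature importances for this single prediction."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=width, size=(n_samples, x.size))
    y = predict_fn(Z)
    # Proximity kernel: nearby samples dominate the fit
    prox = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * width ** 2))
    A = np.hstack([np.ones((n_samples, 1)), Z])  # intercept + features
    sw = np.sqrt(prox)[:, None]
    coefs, *_ = np.linalg.lstsq(A * sw, y * sw[:, 0], rcond=None)
    return coefs[1:]  # local importance of each feature

# A linear 'black box' is recovered exactly by its local surrogate
black_box = lambda Z: Z @ np.array([2.0, -1.0]) + 3.0
print(local_surrogate(black_box, np.array([0.5, 0.5])))  # ≈ [2., -1.]
```

For a genuinely non-linear model the coefficients describe behaviour only in the neighbourhood of x, which is why individual explanations can be expensive: every prediction needs its own sampling-and-fitting pass.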
  15. Local Explainability Titanic dataset : Sex & wealth of passengers

    had a big impact on chance of survival
  16. NLP Explainability Explaining BERT sentiment model 'he is an extremely

    unpleasant british man' 'she is an extremely unpleasant british woman' Why did the model decide the statement was negative? Is there any obvious bias?
  17. Tackling Bias Four basic approaches to tackling bias 1. Collect

    ‘better’ data 2. Adjust data 3. Adjust models 4. Adjust outcome N.B. No silver bullet • Debiasing is not always viable • Debiasing introduces its own bias
  18. Data Set Manipulation Variety of different approaches to handling data

    set bias • Feature manipulation • Modify feature values to improve CDF alignment across protected groups • Sample weighting • Modify sample weights to emphasize unprivileged group positive outcomes • Label manipulation • Modify labels for examples close to classifier decision boundary to benefit unprivileged group • Dataset transformation • Transform features and labels with group fairness, fidelity & individual distortion constraints
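The sample-weighting bullet can be sketched as Kamiran & Calders-style reweighing: each (group, label) cell is weighted by expected/observed frequency so that, under the weights, the label is independent of the protected attribute. A minimal version assuming binary labels and a binary protected attribute (toy data and names are illustrative):

```python
import numpy as np

def reweighing_weights(y, a):
    """Weight each (group, label) cell by expected/observed frequency,
    making the label statistically independent of the protected
    attribute under the weights."""
    w = np.empty(len(y), dtype=float)
    for g in (0, 1):
        for lbl in (0, 1):
            cell = (a == g) & (y == lbl)
            expected = (a == g).mean() * (y == lbl).mean()
            observed = cell.mean()
            w[cell] = expected / observed if observed > 0 else 0.0
    return w

# Toy data: group a == 0 receives positive outcomes more often
y = np.array([1, 1, 0, 0, 1, 0])
a = np.array([0, 0, 0, 1, 1, 1])
w = reweighing_weights(y, a)
```

After reweighing, the weighted positive-outcome rate is identical across the two groups, and the weights can be passed straight into any learner that accepts per-sample weights.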
  19. Debiasing Outputs • Multiple Thresholds • Separate thresholds for each

    group value • Maximize model performance subject to specified fairness constraint • Outcome Modification 1. Change outcomes ‘close’ to the decision boundary 2. Probabilistically modify outcomes to achieve specified fairness objective • Not always possible to achieve the desired fairness constraints • Or achieve reasonable model outcomes while satisfying constraints • Upstream intervention may be required
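The multiple-thresholds idea above can be sketched as a brute-force grid search: one threshold per group, maximizing accuracy subject to a demographic-parity-style constraint. The grid resolution, constraint choice, and toy data are all assumptions for illustration:

```python
import numpy as np

def per_group_thresholds(scores, y, a, max_gap=0.05):
    """Grid-search one decision threshold per group, maximizing overall
    accuracy subject to the gap in positive-outcome rates between
    groups staying under max_gap."""
    grid = np.linspace(0.05, 0.95, 19)
    best, best_acc = None, -1.0
    for t0 in grid:
        for t1 in grid:
            pred = np.where(a == 0, scores >= t0, scores >= t1).astype(int)
            gap = abs(pred[a == 0].mean() - pred[a == 1].mean())
            if gap <= max_gap and (pred == y).mean() > best_acc:
                best, best_acc = (t0, t1), (pred == y).mean()
    return best, best_acc

# Toy example: the second group's scores are systematically lower
scores = np.array([0.9, 0.8, 0.2, 0.1, 0.6, 0.55, 0.3, 0.2])
y      = np.array([1, 1, 0, 0, 1, 1, 0, 0])
a      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
(t0, t1), acc = per_group_thresholds(scores, y, a)
```

When no threshold pair satisfies the constraint the search returns nothing useful — the slide's point that upstream intervention may then be required.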
  20. Bias-aware Algorithms • Bias-aware algorithms explicitly attempt to minimize bias

    during training • Algorithms leverage supplied fairness metric as explicit cost consideration • Potentially excessively limiting in choice of algorithms • Adversarial debiasing leverages adversarial learning to train debiased models • Adversary attempts to predict protected group from model predictions • Model weights are updated to better thwart adversary • Process repeats until convergence • Applicable to a wide range of model types
  21. Bias in NLP Additional opportunities for the introduction of bias

    1. Embedding information 2. Pretrained models
  22. Word Embeddings • Map words to high dimension vectors •

    Variety of different algorithms (Word2Vec, GloVe) • ‘Similar’ words cluster together • Arithmetic operations on word vectors • woman - man ≈ queen - king • Highlight stereotypical associations • man : woman :: shopkeeper : housewife • man : woman :: pharmaceuticals : cosmetics • Exist for names, religions, races, genders
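The vector-arithmetic trick on the slide can be demonstrated end to end with a hand-made embedding table. The vectors below are illustrative toys, not real Word2Vec or GloVe output:

```python
import numpy as np

# Toy embedding table — illustrative vectors, not real trained embeddings
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.5, 0.9, 0.0]),
    "woman": np.array([0.5, 0.1, 0.7]),
    "apple": np.array([0.1, 0.5, 0.5]),  # unrelated distractor word
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def analogy(a, b, c):
    """Solve a : b :: c : ? by nearest cosine to emb[b] - emb[a] + emb[c]."""
    target = emb[b] - emb[a] + emb[c]
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

print(analogy("man", "woman", "king"))  # → queen
```

The same mechanism that makes king − man + woman ≈ queen work is what surfaces the stereotypical associations on the slide: any regularity in the training corpus, benign or biased, becomes a direction in the vector space.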
  23. Debiased Embeddings • Word embeddings can be debiased with respect

    to specified biases • Debiased embeddings are now available • E.g. ConceptNet • Wise to ensure that chosen embedding has been corrected for attributes of interest https://github.com/commonsense/conceptnet-numberbatch
  24. BERT et al. • BERT & GPT2 are common pretrained

    language models • Easily fine-tuned to perform a variety of custom tasks • Powerful techniques • Rapidly increasing in popularity • Models inherit biases observed in data used for pretraining • Techniques emerging for effective debiasing • Without impacting accuracy! https://stereoset.mit.edu/ BERT Next Sentence Prediction
  25. Overrepresentation in Training • Toxic example datasets without sufficient representation

    of words in neutral contexts can lead to significant false positives • E.g. Gay or Black or Christian • E.g. “I am a proud gay man” or “I am a woman who is deaf” • See : “Jigsaw Unintended Bias in Toxicity Classification” • May only be apparent once the model is deployed • Test data set will not highlight the problem • Operationalized explainability can help flag problems • Improve example datasets!
  26. Resources • Explainability • LIME • SHAP • Bias Detection

    and mitigation • TF Fairness • AI Fairness 360 • Fairlearn • Responsible AI • Debiased Embeddings • ConceptNet • Data sets • Stereoset • Documents • FairML Book
  27. Conclusions • AI can be biased due to biased training

    data • Responsible AI is a critical consideration for data science projects • Develop comprehensive debiasing strategy • Removing protected attributes is not sufficient • Understand your data! • Broad array of OSS solutions to help detect, explain and reduce bias • Perform risk assessments • Understand the implications of your AI and the impact of potential bias • Create structure, process and governance • No ‘wild-west’ – carefully review data, models and implications • Diverse oversight

  28. EMAIL INFO@RSQUARED.COM lawrence@rsquared.ai sonya@rsquared.ai
  29. Additional Slides

  30. How Does Bias Manifest? • Many ways bias can be

    introduced • Historical bias, representation bias, measurement bias, population bias • Many human biases [Sadly] • Over 180 human biases have been found • Racial, gender, religious, sexual orientation, age… • Remember : No fairness through unawareness • Removing protected classes will not fix the problem • Many attributes may be correlated with the protected one(s) • Effects of bias can’t be completely eliminated • But we can enable AI to do better in a biased world
  31. Global Explainability • Which features are most important in explaining

    target variable • Variety of different methods • Model specific methods • Feature permutation • Drop column • Overall behavior does not explain individual predictions Titanic dataset : Sex & wealth of passengers had a big impact on chance of survival
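The feature-permutation method listed above is simple to sketch: shuffle one feature column at a time and measure the drop in accuracy. The function name and the toy one-feature model below are illustrative, not scikit-learn's `permutation_importance`:

```python
import numpy as np

def permutation_importance(predict_fn, X, y, seed=0):
    """Global explainability sketch: the drop in accuracy when one
    feature column is shuffled — the bigger the drop, the more the
    model relies on that feature overall."""
    rng = np.random.default_rng(seed)
    base = (predict_fn(X) == y).mean()
    drops = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])  # break this feature's link to y
        drops.append(base - (predict_fn(Xp) == y).mean())
    return np.array(drops)

# 'Model' that only ever looks at feature 0
model = lambda X: (X[:, 0] > 0).astype(int)
rng = np.random.default_rng(1)
X = np.column_stack([np.repeat([-1.0, 1.0], 100), rng.normal(size=200)])
y = model(X)
imp = permutation_importance(model, X, y)
# Shuffling feature 0 destroys accuracy; shuffling feature 1 changes nothing
```

As the slide notes, this describes only overall behaviour: a feature with low global importance can still dominate an individual prediction, which is why local methods are needed alongside it.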
  32. Fairness Criteria (Classification) Different definitions of fairness 1. Sensitive variables

    (A) are independent of the prediction (R) • Independence (R, A): P(R = 1 | A = a) ≥ P(R = 1 | A = b) − ε 2. Sensitive variables are independent of error rates • Separation (R, A, Y): P(R = 1 | A = a, Y = y) ≥ P(R = 1 | A = b, Y = y) − ε • Sufficiency (R, A, Y): P(Y = 1 | A = a, R = r) ≥ P(Y = 1 | A = b, R = r) − ε https://en.wikipedia.org/wiki/Fairness_(machine_learning)