
Florian Pfisterer - Fairness in automated decision making


Decisions derived from automated systems, e.g. machine learning models, increasingly affect our lives. Ensuring that those systems behave fairly and, for example, do not discriminate against minorities is an important endeavour. In this talk, I give a brief introduction to the field of algorithmic fairness. This includes harms that might arise from the use of biased ML models, some intuition for how (un)fairness can be measured, and approaches to mitigating biases in such systems.

MunichDataGeeks

April 26, 2023

Transcript

  1. Fairness in Automated Decision Making
    Florian Pfisterer @ Datageeks Munich
    30.03.2022


  2. About Me
    ● PhD in Statistics at LMU Munich
    ○ Automated Machine Learning
    ○ Fairness and Ethics in AI
    ● Currently: Figuring out what's next
    ● First Datageeks Meetup: April 2015


  3. Outline
    ● What is automated decision making (ADM)?
    ● What makes a decision making system unfair?
    ○ Which types of harms occur?
    ○ What are sources of bias?
    ○ How can we detect unfair systems?
    ● How can we prevent unfair systems?


  4. Automated decision making (ADM)
    Data → Model → Decision
    A model can be any system that produces a decision based on data.
    Examples: a set of business rules, logistic regression, deep neural networks
    Example input: Name: Joe, Age: 33, Income: 50000, Job: Mgr.

    ADMs automate decisions in many domains, e.g. credit checks, fraud detection,
    setting insurance premiums, hiring decisions, ...
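    To make the definition concrete, here is a minimal, purely illustrative sketch in Python:
    the "model" is just a hand-written business rule, and the applicant fields, names and
    threshold are hypothetical, mirroring the example row above.

    # A minimal, hypothetical ADM: any callable that maps applicant data to a decision.
    from dataclasses import dataclass

    @dataclass
    class Applicant:
        name: str
        age: int
        income: float
        job: str

    def credit_decision(applicant: Applicant) -> bool:
        """A 'model' can be as simple as a business rule; the threshold is made up."""
        return applicant.age >= 18 and applicant.income >= 40_000

    joe = Applicant(name="Joe", age=33, income=50_000, job="Mgr.")
    print(credit_decision(joe))  # True -> favourable decision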


  5. Fairness in ADM - Why should we care?


  6. Hiring
    www.pnas.org/content/117/23/12592
    Healthcare
    https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
    http://web.br.de/interaktiv/ki-bewerbung/en/
    Work
    https://algorithmwatch.org/en/austrias-employment-agency-ams-rolls-out-discriminatory-algorithm/


  7. Harms - Individuals
    Allocation: Extending or withholding opportunities, resources or information
    Stereotyping: The system reinforces stereotypes
    Quality-of-service: The system does not work equally well for all groups
    Representation: The system over- or under-represents certain groups
    Denigration: The system is actively offensive or derogatory
    Procedural: The system makes decisions that violate social norms
    Only affected individuals experience these harms!
    Weerts H., An Introduction to Algorithmic Fairness, 2022


  8. Types of Biases - Where do harms come from?


  9. Types of Bias - Historical Bias
    Data reflects how things were in the past -
    We might not want to perpetuate all of it!
    ● Texts often reflect historical inequalities
    ● Poor areas have higher police presence
    → more arrests / reoffenders
    Models can pick up biases and perpetuate
    them into the future!


  10. Types of Bias - Representation Bias
    Data is often not representative of the whole
    population we care about
    ● Collecting data from underrepresented groups
    is often neglected as it is expensive
    ● Data often does not exist:
    Women were often not included in studies
    → Gender medicine
    http://gendershades.org/


  11. Types of Bias - Other
    Measurement Bias
    Differences in how a given variable is measured across sub-populations
    ● e.g. data quality differs between hospitals
    Model Bias
    Biases introduced during modeling, e.g. due to under-specified models
    ● e.g. models only learn the prediction mechanism of the majority class
    Feedback Loops
    Model decisions shape the data collected in the future
    ● Can, e.g., lead to representation bias if sub-populations are systematically excluded


  12. Example - Loan Application Process
    [Diagram: Loan Applicant → Credit Risk Scoring → Denied, or → Repayment Process;
    repayment outcomes flow into Historical Data, which feeds the Credit Risk Scoring]
    What kinds of biases might occur here?


  13. Example - Loan Application Process
    [Same diagram: Loan Applicant → Credit Risk Scoring → Denied, or → Repayment Process → Historical Data]
    Historical Bias: Some groups might not have been able to pay back loans in the past, but this has changed.
    Feedback Loops: We do not learn anything about rejected applicants! This extends indefinitely into the future.
    Model Bias: The model might not pick up important differences between, e.g., genders.
    Representation Bias: Insufficient data about some groups leads to higher uncertainty.


  14. Diagnosing Bias


  15. Approaches to measuring fairness
    Two perspectives:
    Individual Fairness: "Similar people should be treated similarly"
    Group Fairness: "On average, different groups of people should be treated equally."


  16. Approaches to measuring fairness
    Two perspectives:
    Individual Fairness: "Similar people should be treated similarly"
    ○ What are similar people? What is similar treatment?
    Group Fairness: "On average, different groups of people should be treated equally."
    ○ Are the groups comparable?


  17. Approaches to measuring fairness
    We introduce two concepts:
    Goal: Fairness for legally protected groups (age, sex, disability, ethnic origin, ...)
    ● Sensitive attribute A:
    A describes which group an individual belongs to, either 0 or 1.
    ● Decision D:
    An ADM produces a decision D, which can be 👎 or 👍.
    Decisions should fit the true outcome Y (also 👎 or 👍).
    Example - Credit risk assessment
    An individual with characteristics X = x, A = 1 (female).
    She receives the decision D = 👎 but would have paid back a loan (Y = 👍).
    Our system made an error!


  18. Auditing models for potential harms
    Statistical Parity (also known as Demographic Parity)
    "Decisions should be independent of the sensitive attribute"
    P(D=👍|A=1) = P(D=👍|A=0)
    Equality of Opportunity
    "The chance to deservedly obtain a favourable outcome is independent of the sensitive attribute"
    P(D=👍|A=1, Y=👍) = P(D=👍|A=0, Y=👍)
    Illustration: Angus Maguire, Interaction Institute for Social Change
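    As a concrete illustration, both criteria can be estimated from observed decisions. The sketch
    below uses plain NumPy on made-up arrays, with A the sensitive attribute, D the decision and
    Y the true outcome (1 = favourable); libraries such as Fairlearn, aif360 or mlr3fairness
    (see the resources slide) provide these and many more metrics out of the box.

    import numpy as np

    def positive_rate(d, a, group, y=None):
        """P(D=1 | A=group), or P(D=1 | A=group, Y=1) if y is given."""
        mask = (a == group)
        if y is not None:
            mask &= (y == 1)
        return d[mask].mean()

    # Hypothetical decisions, sensitive attribute and true outcomes (1 = favourable)
    d = np.array([1, 0, 1, 1, 0, 1])
    a = np.array([0, 0, 0, 1, 1, 1])
    y = np.array([1, 1, 0, 1, 0, 1])

    # Statistical parity: compare P(D=1|A=1) with P(D=1|A=0)
    print(positive_rate(d, a, 1), positive_rate(d, a, 0))

    # Equality of opportunity: compare P(D=1|A=1,Y=1) with P(D=1|A=0,Y=1)
    print(positive_rate(d, a, 1, y), positive_rate(d, a, 0, y))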


  19. Auditing models for potential harms
    1. Bias-preserving fairness metrics
    Define fairness based on errors (or lack thereof) between the true outcome Y and the decision D.
    Examples:
    Equality of opportunity: True positive rates across groups should be equal!
    Accuracy equality: Accuracy across groups should be equal!
    2. Bias-transforming fairness metrics
    Define fairness based only on the decision D.
    Examples:
    Statistical parity: Positive rates should be equal across groups
    Conditional statistical parity: Positive rates should be equal given some condition
    "Acceptance at fire departments should be equal given a minimum height requirement"
    Choosing a fairness definition requires an ethical judgement!


  20. Auditing models for potential harms
    Example:
    A     D   Y
    poor  👎  👍
    rich  👍  👍
    rich  👍  👎
    poor  👍  👍
    ● True positive rate parity (D = 👍 given Y = 👍): rich: 1/1, poor: 1/2
    ● Statistical parity (D = 👍): rich: 2/2, poor: 1/2
    Metric: Absolute difference between groups, |ɸ(A=0) - ɸ(A=1)|
    ● Fairness needs to be evaluated on a representative dataset
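    The same computation in code, using the table above with 👍 encoded as 1 and 👎 as 0; this
    simply re-derives the per-group rates and gaps.

    import pandas as pd

    df = pd.DataFrame({
        "A": ["poor", "rich", "rich", "poor"],
        "D": [0, 1, 1, 1],   # decision, 1 = 👍
        "Y": [1, 1, 0, 1],   # true outcome, 1 = 👍
    })

    # Statistical parity: P(D=1 | A) per group -> rich: 2/2, poor: 1/2
    sp = df.groupby("A")["D"].mean()

    # True positive rate: P(D=1 | A, Y=1) per group -> rich: 1/1, poor: 1/2
    tpr = df[df["Y"] == 1].groupby("A")["D"].mean()

    print("statistical parity gap:", abs(sp["rich"] - sp["poor"]))    # 0.5
    print("true positive rate gap:", abs(tpr["rich"] - tpr["poor"]))  # 0.5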


  21. Auditing models for potential harms
    Fairness metrics reduce many important considerations into a single number.
    They cannot guarantee that a system is fair.


  22. Dealing with Biases


  23. No fairness through unawareness!
    Naive idea: Remove the protected attribute!
    Without removal: the model directly uses race as a feature.
    With removal: the model still picks up information about race through the proxy variable ZIP code.
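    A small hypothetical illustration of the proxy problem: even after dropping the protected
    attribute, a strongly associated feature such as ZIP code lets a model reconstruct it. The
    ZIP codes and values below are invented.

    import pandas as pd

    df = pd.DataFrame({
        "zip_code": ["80331", "80331", "80331", "81739", "81739", "81739"],
        "group":    ["a", "a", "a", "b", "b", "b"],   # protected attribute
        "approved": [1, 1, 0, 0, 0, 1],
    })

    # ZIP code (almost) perfectly encodes group membership ...
    print(pd.crosstab(df["zip_code"], df["group"]))

    # ... so approval rates by ZIP mirror approval rates by group,
    # even if 'group' is never shown to the model.
    print(df.groupby("zip_code")["approved"].mean())
    print(df.groupby("group")["approved"].mean())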


  24. Remedies - Algorithmic Solutions
    Several 'technical' fixes have been proposed:
    ● Preprocessing: Change the data so that the resulting models are fairer
    Example: Balance distributions across different groups
    ● Fair models: Learn models that take a fairness criterion into account
    Example: Linear model with fairness constraints
    ● Postprocessing: Adapt decisions to satisfy fairness metrics (see the sketch below)
    Example: Accept more people from the disadvantaged group
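    As a sketch of the post-processing idea, the snippet below picks group-specific score
    thresholds so that roughly the same share of each group receives a favourable decision.
    The scores and groups are made up, and dedicated implementations (e.g. Fairlearn's
    ThresholdOptimizer) handle this far more carefully.

    import numpy as np

    def group_thresholds(scores, groups, target_rate):
        """Per-group threshold so that roughly `target_rate` of each group is accepted."""
        return {g: np.quantile(scores[groups == g], 1 - target_rate)
                for g in np.unique(groups)}

    # Hypothetical risk scores and group memberships
    scores = np.array([0.9, 0.7, 0.4, 0.8, 0.3, 0.2])
    groups = np.array(["a", "a", "a", "b", "b", "b"])

    thr = group_thresholds(scores, groups, target_rate=0.5)
    decisions = np.array([s >= thr[g] for s, g in zip(scores, groups)])
    print(thr, decisions)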


  25. Remedies - Algorithmic Solutions
    Technical solutions can only hope to 'fix' the symptoms but do not address root causes!


  26. Remedies - Recourse
    ● Problem formulation: Deciding which problems to prioritize for ADM systems is important.
    Example: Detecting welfare fraud vs. identifying underserved cases
    ● Accountability & Recourse
    ○ Automated systems will make errors - developers need to ensure that humans responsible for addressing errors exist and that they can resolve such errors.
    ○ Affected individuals need access to an explanation of how the decision was made and what steps can be taken to address unfavourable decisions.
    ● Documentation: Errors often result from using data and models beyond their intended purpose - Data Sheets and Model Cards help to document intended use and important caveats.


  27. Summary
    Ensuring fair decisions can be difficult:
    ● Organizational support: To succeed, fairness needs to be embedded at the product development and engineering level. This requires creating awareness in the engineering teams that develop and maintain ADM systems.
    ● Diverse perspectives: Harms can occur in many different forms. Considering diverse perspectives and involving stakeholders during system development is essential.
    ● Fairness metrics can be a useful tool to diagnose bias, but to understand what they mean, they need to be grounded in real-world quantities.


  28. Thank you
    Get in contact:
    Illustrations:
    @drawdespiteitall
    www.linkedin.com/in/pfistfl/
    [email protected]


  29. Resources
    Books & Articles
    ● Fairness and Machine Learning - Limitations and Opportunities (Barocas et al., 2019)
    ● An Introduction to Algorithmic Fairness (Weerts, 2021)
    with online notes: https://hildeweerts.github.io/responsiblemachinelearning/
    ● Algorithmic Fairness: Choices, Assumptions, and Definitions (Mitchell et al., 2021)
    ● Why Fairness Cannot Be Automated: Bridging the Gap Between EU Non-Discrimination
    Law and AI (Wachter, 2021)
    Software
    ● Fairlearn (Python) https://fairlearn.org/
    ● aif360 (Python, R) https://aif360.mybluemix.net/
    ● fairmodels (R) https://fairmodels.drwhy.ai/
    ● mlr3fairness (R) https://github.com/mlr-org/mlr3fairness


  30. Harms - ADM provider
    Legal
    Our systems should not discriminate, e.g.
    Article 21 of the EU Charter of Fundamental Rights
    Public Image
    Increased scrutiny on ADM-based products by media and
    consumer advocacy groups
    Ethical
    We want our decision making to reflect our ethical values
    Regulatory
    Demonstrate non-discrimination to regulatory bodies


  31. Remedies - Documentation
    Data Sheets and Model Cards
    Harm often stems from using datasets or models beyond their intended use; better documentation can help prevent this!
    ● Dataset docs: Information on the dataset, how it was collected, etc.
    Example: Datasheets for Datasets (Gebru et al., 2018)
    ● Model docs: Information on the data used, the intended use and the target demographic.
    Example: Model Cards for Model Reporting (Mitchell et al., 2019)
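    As a rough, hypothetical illustration of what such documentation captures (the real templates
    in the two papers above are far more detailed, and all field names and values here are
    illustrative placeholders):

    # Heavily abbreviated, made-up model card record.
    model_card = {
        "model": "credit-risk-scoring-v1",
        "intended_use": "Pre-screening of consumer loan applications",
        "out_of_scope": ["employment screening", "insurance pricing"],
        "training_data": "Internal loan applications; see the accompanying datasheet",
        "evaluation_by_group": "Report error rates and fairness metrics per sensitive group",
        "caveats": "Applicants without prior credit history are under-represented",
    }
    print(model_card["intended_use"])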


  32. Discussion
    Can simple business rules be unfair?
    Does fairness matter for less consequential decisions?
    Reconciling fairness and the financial bottom line?


  33. Fairness in European Law
    Disclaimer: I only know a few things about law and fairness - this might be wrong!
    1. The current legal situation in the EU is unclear; it is likely that this will be shaped by case law in the different member states.
    2. Error-based metrics are widely used to assess fairness. Whether they are sufficient depends on whether we can assume that the data is 'fair'.
    3. Wachter et al. (2021) argue that EU law might require a form of the conditional statistical parity introduced earlier in this talk.


  34. Algorithmic Fairness
    ADMs should behave fairly and treat people fairly:
    without unjust treatment on the grounds of sensitive characteristics.
    Sensitive characteristics: e.g., legally protected groups (age, sex, disability, ethnic origin, race, ...)


  35. Auditing models for potential harms
    1. Students who might succeed are admitted equally often between students from poor and rich households.
    Students are selected according to their ability (measured via grades).
    Argument: Resources should be made available to those with the highest chance of success.
    2. Students from poor and rich households are admitted equally often.
    This does not take ability into account.
    Argument: Poor students have less access to tutoring - it is not fair to base admission on grades only. The construct 'grades' does not adequately measure ability.
    Choosing a fairness definition requires an ethical judgement!


  36. Auditing models for potential harms
    1. Students with good grades are admitted equally often between students from poor and rich households.
    This defines fairness based on errors (or lack thereof) between a true outcome Y (= will succeed) and the decision D.
    Examples:
    Equality of opportunity: True positive rates across groups should be equal!
    Accuracy equality: Accuracy across groups should be equal!
    2. Students from poor and rich households are admitted equally often.
    This defines fairness based only on the decision D.
    Examples:
    Statistical parity: Positive rates should be equal across groups
    Conditional statistical parity: Positive rates should be equal given some condition
    "Acceptance at fire departments should be equal given a minimum height requirement"
    Choosing a fairness definition requires an ethical judgement!


  37. A practical example
