
Open Data Science Conference 2023 -- Invited Talk


Generative AI (GAI) systems such as ChatGPT have revolutionised the way we interact with AI systems. These models can provide precise and detailed answers to our information needs, expressed in the form of brief text-based prompts. However, some of the responses generated by GAI systems can contain harmful social biases such as gender or racial biases. Detecting and mitigating such biased responses is an important step towards establishing user trust in GAI. In this talk, I will describe the latest developments in methodologies that can be used to detect social biases in texts generated by GAI systems. First, I will describe methods that can detect social biases expressed not only in English but in other languages as well, with minimal human intervention, which is particularly important when scaling social bias evaluation to many languages. Second, I will describe methods that can mitigate the identified social biases in large-scale language models. Experiments show that although some of the social biases can be identified and mitigated with high accuracy, the existing techniques are not perfect and indirect associations remain in generative NLP models. Finally, I will describe ongoing work in the NLP community to address these shortcomings and to develop not only accurate but also trustworthy AI systems for the future.

Danushka Bollegala

June 14, 2023

Transcript

  1. Towards Socially Unbiased Generative Artificial Intelligence
    Professor Danushka Bollegala
    University of Liverpool


  2. Is AI Socially Biased?
    Images created by DALL-E (OpenAI)
    https://www.vice.com/en/article/wxdawn/the-ai-that-draws-what-you-type-is-very-racist-shocking-no-one


  3. Is AI Socially Biased?
    Images generated by DALL-E (OpenAI)


  4. Is AI Socially Biased?
    Images generated by Stable Diffusion (Stability.AI)
    “Janitor”
    “Assertive Firefighter”
    https://techpolicy.press/researchers-find-stable-diffusion-amplifies-stereotypes/


  5. Is AI Socially Biased?
    Tweets by Steven Piantadosi (shorturl.at/morB7)
    Dec 4, 2022


  6. Machine Learning 101
    [Diagram labels: Raw Data, Human Labels, Training Data, Algorithm, Model, Inference]


  7. Types of biases
    Baeza-Yates 2018


  8. The Wisdom of Crowd a Few
    • 50% of all articles in Wikipedia (at the start) were written by 0.04% (ca. 2000) of its editors.

    • Only 4% of all active users at Amazon write product reviews.

    • 50% of all websites are in English, whereas only 5% of the world’s population are native English speakers (13% if we add non-native speakers).

    • Only 7% of Facebook users produce 50% of the posts.

    • The most popular 0.05% of people attract (are followed by) 50% of Twitter users.

    • Zipf’s Least Effort Principle — many people do only a little while a few people do a lot.


  9. Gender Bias
    Accumulated fraction of women’s biographies in Wikipedia
    Baeza-Yates 2018


  10. But it is not OK for AI to be biased!
    • Legal argument

    • Title VII of the Civil Rights Act of 1964 in the US

    • Prohibits employment discrimination due to race, religion, gender and ethnicity

    • EU Charter of Fundamental Rights, Title III (Equality), Article 21 Non-discrimination

    • 1. Any discrimination based on any ground such as sex, race, colour, ethnic or social origin, genetic features, language, religion or belief, political or any other opinion, membership of a national minority, property, birth, disability, age or sexual orientation shall be prohibited.

    • 2. Within the scope of application of the Treaties and without prejudice to any of their specific provisions, any discrimination on grounds of nationality shall be prohibited.

    • Commercial argument

    • Your customers will lose trust in your AI-based service

    • Moral argument

    • Come on! Why should we let humans be discriminated against by AI?


  11. Can we measure gender bias in LLMs?
    Example: “She is a nurse” (likelihood 0.9) vs. “He is a nurse” (likelihood 0.4).
    If an LLM assigns a higher likelihood score to one sentence than to the other, it is
    considered to be preferring one gender over the other, hence gender biased.

    For “She/He is a nurse”, the token-level likelihoods of each variant are averaged,
    e.g. She: (0.9 + 0.8 + 0.6 + 0.7) / 4 vs. He: (0.8 + 0.9 + 0.5 + 0.6) / 4, and the
    difference between the two averages indicates which gender is preferred.

    All Unmasked Likelihood (AUL) score = percentage of sentence pairs where the male
    version has a higher likelihood than the female version (a minimal code sketch
    follows below).
    Kaneko et al. [AAAI 2022]
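
    The likelihood comparison sketched on this slide can be approximated with an off-the-shelf
    masked language model. The following is a minimal sketch, not the exact AUL implementation
    from Kaneko et al. (AAAI 2022): each sentence is scored by the average log-probability a
    BERT-style MLM assigns to its tokens without masking them, and the two gendered variants
    are compared. The model name and sentences are only illustrative.

    ```python
    # Minimal sketch of the "all unmasked" likelihood comparison (illustrative,
    # not the official AUL implementation from Kaneko et al., AAAI 2022).
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def sentence_score(sentence: str) -> float:
        """Average log-probability the MLM assigns to the (unmasked) tokens."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits                 # (1, seq_len, vocab)
        log_probs = torch.log_softmax(logits, dim=-1)
        token_ids = inputs["input_ids"][0]
        inner = range(1, len(token_ids) - 1)                # skip [CLS] and [SEP]
        scores = [log_probs[0, i, token_ids[i]].item() for i in inner]
        return sum(scores) / len(scores)

    female = sentence_score("She is a nurse.")
    male = sentence_score("He is a nurse.")
    print(f"She: {female:.3f}  He: {male:.3f}")
    print("Model prefers the", "male" if male > female else "female", "variant.")
    ```

    Repeating this comparison over many sentence pairs and reporting the percentage where the
    male variant scores higher gives an AUL-style bias score.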


  12. Multi-lingual Bias Evaluation
    • Social biases are not limited to English. However, compared to the bias evaluation
    datasets annotated for English, the datasets available for other languages are limited.

    • Annotating datasets eliciting social biases for each language is costly and time
    consuming, and it might even be difficult to recruit annotators for this purpose.

    • We proposed a multilingual bias evaluation measure that uses existing parallel
    translation data (see the sketch below).
    Kaneko et al. [NAACL 2022]
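
    As an illustration of the general idea only (not the measure proposed in Kaneko et al.,
    NAACL 2022), one could take gendered sentence pairs obtained for another language via
    parallel translation data and run the same likelihood comparison there with a multilingual
    MLM. The model choice and the sentence pairs below are hypothetical.

    ```python
    # Illustration only: apply the earlier likelihood comparison to translated
    # sentence pairs, using a multilingual MLM. NOT the measure from the paper.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    mlm = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")
    mlm.eval()

    def avg_log_likelihood(sentence: str) -> float:
        """Average log-probability the MLM assigns to the unmasked tokens."""
        enc = tok(sentence, return_tensors="pt")
        with torch.no_grad():
            log_probs = torch.log_softmax(mlm(**enc).logits, dim=-1)
        ids = enc["input_ids"][0]
        inner = range(1, len(ids) - 1)              # skip [CLS] and [SEP]
        return sum(log_probs[0, i, ids[i]].item() for i in inner) / len(ids[1:-1])

    # (female_variant, male_variant) pairs in the target language (here German),
    # e.g. obtained by translating English evaluation sentences via a parallel corpus.
    pairs = [
        ("Sie ist Krankenpflegerin.", "Er ist Krankenpfleger."),
        ("Sie ist Ingenieurin.", "Er ist Ingenieur."),
    ]

    male_preferred = sum(avg_log_likelihood(m) > avg_log_likelihood(f) for f, m in pairs)
    print(f"Male variant preferred in {100.0 * male_preferred / len(pairs):.1f}% of pairs")
    ```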


  13. Which bias evaluation measure?
    • Different intrinsic bias evaluation measures have been proposed, which use different
    evaluation datasets and criteria

    • CrowS-Pairs [Nangia et al.], StereoSet [Nadeem et al.], AUL/AULA [Kaneko et al.],
    Template-based Scores (TBS) [Kurita et al.]

    • How can we know which intrinsic bias evaluation measure is most appropriate?
    Kaneko et al. [EACL 2023]


  14. Bias controlling in LLMs
    [Figure: Average output probabilities for “[MASK] is a/an [Occupation]” produced by the
    bias-controlled BERT and ALBERT PLMs fine-tuned with different r on the news dataset;
    panels (a) r = 1.0, (b) r = 0.7, (c) r = 0.5.]
    Example prompt: [MASK] doesn’t have time for the family due to work obligations.


  15. Gender-biases in LLMs
    Pearson correlation between the biased PLM order and each bias score.
    “news” and “book” denote the corpus used for biasing. HA is the AUC value
    of the method that uses human annotation.

                      BERT                 ALBERT
    Measure    news   book    HA     news   book    HA
    TBS        0.14   0.09    -      0.25   0.14    -
    SSS        0.22   0.22    0.45   0.31   0.22    0.53
    CPS        0.30   0.27    0.57   0.37   0.22    0.48
    AUL        0.37   0.32    0.68   0.55   0.36    0.56
    AULA       0.42   0.34    0.71   0.60   0.42    0.57

    • For BERT, the proposed method induces the same order among the measures
    (i.e. AULA > AUL > CPS > SSS) as HA does, on both news and book.

    • For ALBERT, only the rankings of SSS and CPS differ between the proposed method and HA.

    • These results show that the proposed method and the existing method that uses human
    annotations rank the intrinsic gender bias evaluation measures in almost the same order
    (a toy sketch of this correlation follows below).
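
    The comparison in the table boils down to measuring how well each bias measure’s scores
    track a known ordering of increasingly biased PLMs. A hypothetical sketch with made-up
    numbers (not the evaluation code from Kaneko et al., EACL 2023):

    ```python
    # Hypothetical sketch: correlate the ordering a bias measure induces over a
    # set of increasingly biased PLMs with the ordering given by human annotation.
    # All numbers are made up for illustration.
    from scipy.stats import pearsonr

    # Bias scores that one measure (say, AULA) assigns to five PLMs that were
    # deliberately biased to increasing degrees (hypothetical values).
    aula_scores = [0.51, 0.55, 0.58, 0.63, 0.70]

    # Reference degree of bias for the same five PLMs, e.g. obtained from human
    # annotation (hypothetical values).
    human_scores = [0.10, 0.20, 0.35, 0.50, 0.65]

    r, p = pearsonr(aula_scores, human_scores)
    print(f"Pearson r = {r:.2f} (p = {p:.3f})")
    # A measure whose scores track the known bias ordering more closely obtains a
    # higher correlation, so measures can be ranked by r (as in the table above).
    ```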


  16. Debiasing isn’t enough!
    • Intrinsic bias scores:

    • Evaluate social biases in LLMs in their own right, independently
    of any downstream task

    • Extrinsic bias scores:

    • Evaluate social biases in LLMs when they are applied to solve a
    specific downstream task such as Natural Language Inference (NLI),
    Semantic Textual Similarity (STS) or predicting occupations
    from biographies (BiasBios).

    • The correlation between intrinsic and extrinsic bias scores is weak
    Kaneko et al. [COLING 2022]


  17. Intrinsic vs. Extrinsic Bias Scores
    [Figure: Differences between the bias scores of original vs. debiased MLMs
    (bert-bu, bert-lu, bert-bc, bert-lc, roberta-b, roberta-l, albert-b) under the
    AT, CDA and DO debiasing methods; panels (a) SSS, (b) CPS, (c) AULA, (d) BiasBios,
    (e) STS-bias, (f) NLI-bias.]
    Negative values indicate that the debiased MLM has a lower bias than its original
    (non-debiased) version.


  18. We are not done… (yet)
    • Debiased LLMs, when fine-tuned for downstream tasks, can
    sometimes relearn the social biases! [Kaneko et al. COLING 2022]

    • Sometimes when we combine (meta-embed) debiased embeddings,
    they become biased again! [Kaneko et al. EMNLP 2022]

    • When we debias LLMs, we lose performance on downstream tasks

    • Evaluating social biases across languages and cultures is hard (and
    no annotated datasets are available)

    • Methods for automatically evaluating multilingual biases
    [Kaneko et al. EACL 2023]


  19. AI learns from its mistakes (unlike humans)


  20. Questions
    Danushka Bollegala

    https://danushka.net

    [email protected]

    @Bollegala

    Thank You