Skills Matter - Digital Discrimination: Cognitive Bias in Machine Learning

Maureen McElaney

June 04, 2020

Transcript

  1. Skills Matter
    Digital Discrimination:
    Cognitive Bias in Machine
    Learning
    Maureen McElaney, Developer Advocate
    Center for Open Source Data and AI Technologies (CODAIT)
    June 4, 2020

  2. codait.org
    Center for Open Source Data and AI Technologies
    CODAIT
    Open Source @ IBM

  3. Digital Discrimination:
    Cognitive Bias in Machine
    Learning
    Tweet at us! @ibmcodait
    codait.org

  4. Agenda
    ● Examples of Bias in Machine
    Learning.
    ● Solutions to combat unwanted bias.
    ● Tools to combat unwanted bias.
    ● Resources and how to get involved.

  5. A cognitive bias is a systematic pattern of
    deviation from norm or rationality in
    judgment.
    People make decisions given their limited
    resources.
    Wilke A. and Mata R. (2012) “Cognitive Bias”, Clarkson University
    @ibmcodait

  6. Examples of bias in machine
    learning.
    @ibmcodait

  7. @ibmcodait
    NorthPointe’s
    COMPAS
    Algorithm
    Image Credit: #WOCinTech

  8. May 2016 - Northpointe’s COMPAS Algorithm
    http://www.equivant.com/solutions/inmate-classification
    Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  9. May 2016 - Northpointe’s COMPAS Algorithm
    http://www.equivant.com/solutions/inmate-classification
    Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  10. May 2016 - Northpointe’s COMPAS Algorithm
    http://www.equivant.com/solutions/inmate-classification
    Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  11. May 2016 - Northpointe’s COMPAS Algorithm
    http://www.equivant.com/solutions/inmate-classification
    Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  12. Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
    Black Defendants’ Risk Scores

  13. Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
    White Defendants’ Risk Scores

  14. @ibmcodait
    BLACK VS. WHITE DEFENDANTS
    ○ Falsely labeled Black defendants as likely to commit future crimes at twice the rate of white defendants.
    ○ Mislabeled white defendants as low risk more often than Black defendants.
    ○ Rated Black defendants as 77% more likely to commit future violent crime.

  15. (Image-only slide.)

  16. @ibmcodait
    Gender
    Shades Project
    February 2018
    Image Credit: #WOCinTech

  17. http://gendershades.org/

  18. “If we fail to make
    ethical and inclusive
    artificial intelligence
    we risk losing gains
    made in civil rights
    and gender equity
    under the guise of
    machine neutrality.”
    - Joy Buolamwini
    @jovialjoy

  19. http://www.aies-conference.com/wp-content/uploads/2019/01/AIES-19_paper_223.pdf

  20. @ibmcodait
    https://www.youtube.com/watch?v=Af2VmR-iGkY

  21. Agenda
    ● Examples of Bias in Machine
    Learning.
    ● Solutions to combat unwanted
    bias.
    ● Tools to combat unwanted bias.
    ● Resources and how to get involved.

  22. Solutions?
    What can we do
    to combat bias
    in AI?
    @ibmcodait

  23. @ibmcodait
    EDUCATION IS
    KEY
    Image Credit: #WOCinTech

  24. https://www.nytimes.com/2018/02/12/business/computer-science-ethics-courses.html

  25. Questions posed to students in these courses:
    ● Is the technology fair?
    ● How do you make sure that the data is not biased?
    ● Should machines be judging humans?
    @ibmcodait

  26. https://twitter.com/Neurosarda/status/1084198368526680064

  27. FIX THE
    PIPELINE?
    @ibmcodait Image Credit: #WOCinTech

  28. “Cognitive bias in
    machine learning is
    human bias on
    steroids.”
    - Rediet Abebe
    @red_abebe
    @ibmcodait

  29. https://twitter.com/MatthewBParksSr/status/1133435312921874432

  30. January 2019 - New Search Feature on...
    https://www.pinterest.com
    Source: https://www.engadget.com/2019/01/24/pinterest-skin-tone-search-diversity/

  31. “By combining the
    latest in machine
    learning and inclusive
    product development,
    we're able to directly
    respond to Pinner
    feedback and build a
    more useful product.”
    - Candice Morgan
    @Candice_MMorgan
    @ibmcodait

  32. @ibmcodait
    National and
    Industry
    Standards
    Image Credit: #WOCinTech

  33. EU Ethics Guidelines for Trustworthy
    Artificial Intelligence
    According to the Guidelines, trustworthy AI should be:
    (1) lawful - respecting all applicable laws and
    regulations
    (2) ethical - respecting ethical principles and values
    (3) robust - both from a technical perspective and in terms of its social environment
    Source: https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai

  34. #1 - Human agency and oversight.
    #2 - Technical robustness and safety.
    #3 - Privacy and data governance.
    #4 - Transparency.
    #5 - Diversity, non-discrimination and fairness.
    #6 - Societal and environmental well-being.
    #7 - Accountability.
    @Mo_Mack

  35. https://wiki.lfai.foundation/display/DL/Trusted+AI+Committee

  36. Agenda
    ● Examples of Bias in Machine
    Learning.
    ● Solutions to combat unwanted bias.
    ● Tools to combat unwanted bias.
    ● Resources and how to get involved.

  37. @ibmcodait
    TOOLS TO
    COMBAT BIAS
    Image Credit: #WOCinTech

  38. Trusted AI Lifecycle through Open Source
    Pillars of trust, woven into the lifecycle of an AI application:
    ● FAIRNESS - Is it fair?
      AI Fairness 360 ↳ (AIF360)
      github.com/IBM/AIF360
      aif360.mybluemix.net
    ● EXPLAINABILITY - Is it easy to understand?
      AI Explainability 360 ↳ (AIX360)
      github.com/IBM/AIX360
      aix360.mybluemix.net
    ● ROBUSTNESS - Did anyone tamper with it?
      Adversarial Robustness 360 ↳ (ART360)
      github.com/IBM/adversarial-robustness-toolbox
      art-demo.mybluemix.net
    ● LINEAGE - Is it accountable? (In the works!)
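
As a concrete taste of the robustness pillar, here is a minimal sketch (not from the deck) of attacking a trained classifier with ART's Fast Gradient Method. It assumes a recent ART 1.x release and scikit-learn; the dataset, model, and eps value are illustrative choices.

```python
# Illustrative robustness check with the Adversarial Robustness Toolbox (ART).
# Assumes a recent ART 1.x; dataset, model, and eps are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import SklearnClassifier

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap the trained model so ART can compute loss gradients against it.
classifier = SklearnClassifier(model=model)

# Craft adversarial inputs with the Fast Gradient Method.
attack = FastGradientMethod(classifier, eps=0.2)
X_adv = attack.generate(x=X)

print(f"accuracy on clean inputs:       {model.score(X, y):.2f}")
print(f"accuracy on adversarial inputs: {model.score(X_adv, y):.2f}")
```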

  39. Tool #1:
    AI Fairness
    360 Toolkit
    Open Source Library
    @ibmcodait

  40. http://aif360.mybluemix.net/
    @ibmcodait

  41. http://aif360.mybluemix.net/
    @ibmcodait

  42. @ibmcodait
    Machine Learning Pipeline
    ● Pre-processing: modifying the training data.
    ● In-processing: modifying the learning algorithm.
    ● Post-processing: modifying the predictions (or outcomes).
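
To make the pre-processing stage concrete, here is a minimal sketch using AIF360's Reweighing algorithm, following the toolkit's own tutorial. It assumes AIF360 is installed and that the German credit data files have been downloaded to the location the toolkit expects; using 'age' as the protected attribute is the tutorial's choice.

```python
# A minimal sketch of pre-processing bias mitigation with AIF360.
from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

dataset = GermanDataset()          # requires the german.data files to be in place
privileged = [{"age": 1}]          # age >= 25 is the privileged group
unprivileged = [{"age": 0}]

# Measure bias before mitigation: difference in favorable-outcome rates
# (unprivileged minus privileged; a negative value indicates disparity).
metric = BinaryLabelDatasetMetric(
    dataset, unprivileged_groups=unprivileged, privileged_groups=privileged
)
print("mean difference before:", metric.mean_difference())

# Pre-processing mitigation: reweigh training examples to remove the disparity.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
dataset_transf = rw.fit_transform(dataset)

metric_transf = BinaryLabelDatasetMetric(
    dataset_transf, unprivileged_groups=unprivileged, privileged_groups=privileged
)
print("mean difference after:", metric_transf.mean_difference())
```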

  43. http://aif360.mybluemix.net/
    Demos
    @ibmcodait

  44. https://github.com/IBM/AIF360
    AI Fairness 360 Toolkit Public Repo
    @ibmcodait

  45. Tool #2:
    AI Explainability
    360 Toolkit
    Open Source Library
    @ibmcodait

  46. AI Explainability 360 ↳ (AIX360)
    The AIX360 toolkit is an open-source library to help explain AI and machine learning models and their predictions. It includes three classes of algorithms: local post-hoc, global post-hoc, and directly interpretable explainers for models that use image, text, and structured/tabular data. The AI Explainability 360 Python package includes a comprehensive set of explainers, at both the global and local level.
    Toolbox: Local post-hoc | Global post-hoc | Directly interpretable
    https://github.com/IBM/AIX360
    http://aix360.mybluemix.net

  47. Tackling different ways to explain
    Selected 2018 explainability innovations from IBM Research:
    ● GLOBAL, POST-HOC: Improving Simple Models with Confidence Profiles (NeurIPS 2018)
    ● LOCAL, POST-HOC: Explanations Based on the Missing: Towards Contrastive Explanations with Pertinent Negatives (NeurIPS 2018)
    ● GLOBAL, DIRECTLY INTERPRETABLE: Boolean Decision Rules via Column Generation (NeurIPS 2018); Variational Inference of Disentangled Latent Concepts from Unlabeled Observations (ICLR 2018)
    ● INTERACTIVE MODEL VISUALIZATION: Seq2Seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models (IEEE VAST 2018)
    ● LOCAL, DIRECTLY INTERPRETABLE: TED: Teaching AI to Explain its Decisions (AIES 2019)

  48. Three dimensions of explainability
    One explanation does not fit all: there are many ways to explain things.
    ● Directly interpretable: The oldest AI formats, such as decision rule sets, decision trees, and decision tables, are simple enough for people to understand. Supervised learning of these models is directly interpretable.
      vs. post-hoc interpretation: Start with a black-box model and probe into it with a companion model to create interpretations. The black-box model continues to provide the actual prediction, while the interpretations improve human interactions.
    ● Global (model-level): Show the entire predictive model to the user to help them understand it (e.g., a small decision tree, whether obtained directly or in a post-hoc manner).
      vs. local (instance-level): Only show the explanations associated with individual predictions (i.e., what was it about the features of this particular person that caused her loan to be denied?).
    ● Static: The interpretation is simply presented to the user.
      vs. interactive (visual analytics): The user can interact with the interpretation.

  49. Supported explainability algorithms
    Data explanation:
    • ProtoDash (Gurumoorthy et al., 2019)
    • Disentangled Inferred Prior VAE (Kumar et al., 2018)
    Local post-hoc explanation:
    • ProtoDash (Gurumoorthy et al., 2019)
    • Contrastive Explanations Method (Dhurandhar et al., 2018)
    • Contrastive Explanations Method with Monotonic Attribute Functions (Luss et al., 2019)
    • LIME (Ribeiro et al., 2016; GitHub)
    • SHAP (Lundberg et al., 2017; GitHub)
    Local direct explanation:
    • Teaching AI to Explain its Decisions (Hind et al., 2019)
    Global direct explanation:
    • Boolean Decision Rules via Column Generation (Light Edition) (Dash et al., 2018)
    • Generalized Linear Rule Models (Wei et al., 2019)
    Global post-hoc explanation:
    • ProfWeight (Dhurandhar et al., 2018)
    Supported explainability metrics:
    • Faithfulness (Alvarez-Melis and Jaakkola, 2018)
    • Monotonicity (Luss et al., 2019)

  50. Explain how AI arrived at a prediction
    • Uses contrastive techniques to explain model behavior in the vicinity of the target data point.
    • Identifies the feature weights of the most and least important features.
    • Displays factors that influence a prediction in simple terms.
    • Explains a prediction in terms of the top-K features that played a key role in it. E.g., the loan was rejected because: (1) Credit score = average, (2) Loan amount > $2M, and (3) Area = Downtown.
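
As a sketch of what such a top-K local explanation looks like in code, here is a hedged example using LIME, one of the local post-hoc explainers listed on the previous slide. The dataset, model, and K = 5 are illustrative, not from the talk.

```python
# A minimal sketch of a local post-hoc, top-K feature explanation with LIME.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    discretize_continuous=True,
)

# Explain a single prediction in terms of its top-5 features.
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=5)
for feature, weight in exp.as_list():
    print(f"{feature}: {weight:+.3f}")
```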

  51. Prediction: Partially Granted
    (Charts: the input data point plotted against binned “Number of Married Years” and “Salary” ranges, ordered most to least frequent, with the pertinent positive (PP) and pertinent negative (PN) values marked between the “Partially Approved” and “Approved” regions.)
    Contrastive Explanation:
    • PP: If Number of married years was 7 and salary was in the range $190-210K, then the outcome would have changed to Loan=Approved.
    • PN: Even if Number of married years was 3 and salary was in the range $110-130K, the outcome would have been Loan=Partially Granted.
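
To make the PP/PN idea concrete, here is a toy brute-force sketch. This is not the AIX360 Contrastive Explanations Method; the loan model, features, and search grids are invented purely for illustration.

```python
# Toy illustration of contrastive (PP/PN-style) statements via brute force.
# NOT the AIX360 CEM algorithm; all names and thresholds here are made up.
import itertools

def predict(married_years, salary_k):
    """Stand-in loan model: approve only long marriages with high salaries."""
    if married_years >= 7 and salary_k >= 190:
        return "Approved"
    return "Partially Granted"

point = {"married_years": 3, "salary_k": 120}
base = predict(**point)

# PP-style search: nearest feature values whose prediction differs from today's.
grid = itertools.product(range(1, 10), range(70, 231, 20))
flips = [(m, s) for m, s in grid if predict(m, s) != base]
m, s = min(
    flips,
    key=lambda c: abs(c[0] - point["married_years"])
    + abs(c[1] - point["salary_k"]) / 20,
)
print(f"PP: if married_years were {m} and salary ~${s}K, "
      f"the outcome would change to {predict(m, s)}")

# PN-style statement: the observed values do not flip the outcome.
print(f"PN: even with married_years = {point['married_years']} and "
      f"salary ~${point['salary_k']}K, the outcome stays {base}")
```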

  52. https://github.com/IBM/AIX360
    AI Explainability 360 Toolkit
    Public Repo
    @ibmcodait

  53. Tool #3:
    Model Asset
    eXchange
    Open Source Pre-Trained
    Deep Learning Models
    @ibmcodait

  54. @ibmcodait
    Step 1: Find a model
    ...that does what you need
    ...that is free to use
    ...that is performant enough

  55. @ibmcodait
    Step 2: Get the code
    Is there a good implementation available?
    ...that does what you need
    ...that is free to use
    ...that is performant enough

  56. @ibmcodait
    Step 3: Verify the
    model
    ○ Does it do what you need?
    ○ Is it free to use (license)?
    ○ Is it performant enough?
    ○ Accuracy?

  57. @ibmcodait
    Step 4: Train the model

  58. @ibmcodait
    Step 4: Train the model

  59. @ibmcodait
    Step 5: Deploy your
    model
    ○ Adjust inference code (or write from
    scratch)
    ○ Package inference code, model code, and
    pre-trained weights together
    ○ Deploy your package

  60. @ibmcodait
    Step 6: Consume your model
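
Consuming a deployed MAX model usually means calling its REST endpoint. A minimal sketch, assuming a model container (for example the MAX Object Detector) is already running locally on port 5000:

```python
# Hedged sketch: querying a deployed MAX model's REST API.
# Assumes a MAX model container is already running locally, e.g. via:
#   docker run -it -p 5000:5000 quay.io/codait/max-object-detector
import requests

with open("test.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:5000/model/predict",
        files={"image": ("test.jpg", f, "image/jpeg")},
    )

# Each MAX model returns JSON; the exact shape of the predictions varies per model.
print(response.json())
```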

  61. @ibmcodait
    Model Asset eXchange
    The Model Asset eXchange (MAX) is a one-stop shop for developers and data scientists to find and use free, open-source deep learning models.
    ibm.biz/model-exchange

  62. @ibmcodait
    ibm.biz/model-exchange

  63. http://ibm.biz/model-exchange
    Model Asset eXchange (MAX)
    @ibmcodait

  64. Tool #4:
    Data Asset
    eXchange
    Open Source Data Sets
    @ibmcodait

  65. http://ibm.biz/data-exchange
    Data Asset eXchange (DAX)
    @ibmcodait

  66. http://ibm.biz/codait-trusted-ai
    IBM CODAIT Trusted AI Work
    @ibmcodait

  67. Agenda
    ● Examples of Bias in Machine
    Learning.
    ● Solutions to combat unwanted bias.
    ● Tools to combat unwanted bias.
    ● Resources and how to get involved.

  68. https://www.ajlunited.org/fight

  69. https://www.patreon.com/poetofcode

  70. Photo by rawpixel on Unsplash
    No matter what, it is our responsibility to build systems that are fair.

  71. Thank you!
    Resources from this talk:
    My team’s work:
    codait.org
    twitter.com/ibmcodait
    http://ibm.biz/codait-trusted-ai
    Criminal Recidivism Scoring
    http://www.equivant.com/solutions/inmate-classification
    https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
    Gender Shades/Algorithmic Justice League
    http://gendershades.org/
    http://www.aies-conference.com/wp-content/uploads/2019/01/AIES-19_paper_223.pdf
    https://www.youtube.com/watch?v=Af2VmR-iGkY
    https://www.ajlunited.org/fight
    Anil Dash on the Biases of Tech on the Ezra Klein Podcast
    https://www.vox.com/ezra-klein-show-podcast
    https://www.youtube.com/watch?v=-lupS5SkSk0
    Ethics in Computer Science
    https://www.nytimes.com/2018/02/12/business/computer-science-ethics-courses.html
    https://twitter.com/Neurosarda/status/1084198368526680064
    Pinterest Skin Tone Search
    https://www.engadget.com/2019/01/24/pinterest-skin-tone-search-diversity/
    Industry/Government Definitions of Trustworthy AI
    https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai
    https://wiki.lfai.foundation/display/DL/Trusted+AI+Committee
    Any questions for us?
    @ibmcodait
    Learn more: codait.org
