An AI with an Agenda: How Our Biases Leak Into Machine Learning (NDC Minnesota 2019)

In the glorious AI-assisted future, all decisions are objective and perfect, and there’s no such thing as cognitive bias. That’s why we created AI and machine learning, right? Because humans can make mistakes, and computers are perfect. Well, there’s some bad news: humans build those AIs and machine learning models, and as a result humanity’s biases and missteps can subtly work their way into our systems.

All hope isn’t lost, though! In this talk you’ll learn how science and statistics have already solved some of these problems and how a robust awareness of cognitive biases can help with many of the rest. Come learn what else we can do to protect ourselves from these old mistakes, because we owe it to the people who’ll rely on our algorithms to deliver the best possible intelligence!


Arthur Doler

May 08, 2019

Transcript

  1. AN AI WITH AN AGENDA: How Our Biases Leak Into Machine Learning. Arthur Doler, @arthurdoler, arthurdoler@gmail.com. Slides/Handout: bit.ly/art-ai-with-agenda
  2. LET’S ALL PLAY A GAME

  3. “THE NURSE SAID”

  4. “THE SOFTWARE ENGINEER SAID”

  5.–12. (image-only slides)
  13. REAL CONSEQUENCES

  14. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  15. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  16. http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/ Aylin Caliskan-Islam, Joanna J. Bryson, and Arvind Narayanan, 2016
  17. http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/
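
The linked post shows how an off-the-shelf sentiment classifier absorbs racial bias from its word embeddings, scoring common African-American names as more negative than common European-American names. Below is a minimal sketch of that audit, with hypothetical stand-ins (random vectors in place of GloVe embeddings, a random projection in place of the trained classifier) so it runs self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in the linked post the vectors are GloVe embeddings
# and the scorer is a logistic-regression sentiment model trained on a
# word-polarity lexicon. Random vectors here just make the sketch runnable.
embeddings = {name: rng.normal(size=50)
              for name in ["emily", "matthew", "shaniqua", "darnell"]}
sentiment_direction = rng.normal(size=50)

def sentiment(word):
    # Project the word's vector onto the learned sentiment direction.
    return float(embeddings[word] @ sentiment_direction)

def mean_sentiment(names):
    return np.mean([sentiment(n) for n in names])

# A systematic gap between groups of names flags bias the model absorbed
# from its training text, which is exactly the effect the post demonstrates.
gap = mean_sentiment(["emily", "matthew"]) - mean_sentiment(["shaniqua", "darnell"])
print(f"sentiment gap between name groups: {gap:+.3f}")
```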

  18. (image-only slide)
  19. SIX CLASSES OF PROBLEM WITH AI/ML

  20.–22. (image-only slides)
  23. Class I – Phantoms of False Correlation; Class II – Specter of Biased Sample Data; Class III – Shade of Overly-Simplistic Maximization; Class V – The Simulation Surprise; Class VI – Apparition of Fairness; Class VII – The Feedback Devil
  24.–29. (image-only slides)
  30. http://www.tylervigen.com/spurious-correlations - Data sources: Centers for Disease Control & Prevention and Internet Movie Database

  31. http://www.tylervigen.com/spurious-correlations - Data sources: National Vital Statistics Reports and U.S. Department of Agriculture

  32. http://www.tylervigen.com/spurious-correlations - Data sources: National Spelling Bee and Centers for Disease Control & Prevention
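
These slides show tylervigen.com’s famously absurd pairings. A small sketch (not from the talk) of why such phantoms are inevitable: test enough unrelated series against each other and near-perfect correlations appear by chance alone:

```python
import numpy as np

rng = np.random.default_rng(42)

# 200 independent random walks, 20 yearly observations each, mimicking the
# short annual series on tylervigen.com. None are actually related.
n_series, n_years = 200, 20
series = rng.normal(size=(n_series, n_years)).cumsum(axis=1)

# Search every pair for a high correlation, the way an unconstrained
# data-dredging process effectively does.
best = max(
    ((abs(np.corrcoef(series[i], series[j])[0, 1]), i, j)
     for i in range(n_series) for j in range(i + 1, n_series)),
    key=lambda t: t[0])

print(f"best |r| among {n_series*(n_series-1)//2} unrelated pairs: {best[0]:.2f}")
# Typically prints something above 0.9: check enough pairs and a phantom
# correlation is guaranteed to turn up.
```
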
  33. (image-only slide)
  34. KNOW WHAT QUESTION YOU’RE ASKING UP FRONT

  35. USE CONDITIONAL PROBABILITY OVER CORRELATION

  36. https://versionone.vc/correlation-probability/
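
A toy sketch of the distinction the linked article draws, using made-up data and hypothetical column names: the correlation coefficient compresses the relationship into one opaque number, while conditional probabilities answer the actual question directly:

```python
import pandas as pd

# Made-up illustration: does a trait change the probability of the outcome,
# or does it merely co-vary with it?
df = pd.DataFrame({
    "used_free_trial": [1, 1, 1, 0, 0, 0, 1, 0, 1, 0],
    "converted":       [1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
})

# Correlation gives one opaque number...
print("correlation:", round(df["used_free_trial"].corr(df["converted"]), 2))

# ...while conditional probabilities state the finding in decision terms.
p_trial = df.loc[df["used_free_trial"] == 1, "converted"].mean()
p_no_trial = df.loc[df["used_free_trial"] == 0, "converted"].mean()
print(f"P(converted | trial)    = {p_trial:.2f}")
print(f"P(converted | no trial) = {p_no_trial:.2f}")
```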

  37.–38. (image-only slides)
  39. MORTGAGE LENDING ANALYSIS

  40.–43. (image-only slides)
  44. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G

  45. Twitter - @quantoidasaurus (Used with permission)

  46. YOUR SAMPLE MIGHT NOT BE REPRESENTATIVE

  47. YOUR DATA MIGHT NOT BE REPRESENTATIVE

  48. (image-only slide)
  49. MODELS REPRESENT WHAT WAS; THEY DON’T TELL YOU WHAT SHOULD BE
  50. FIND A BETTER DATA SET! CONCEPTNET.IO

  51. BUILD A BETTER DATA SET!

  52. (image-only slide)
  53. BEWARE SHADOW COLUMNS
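
“Shadow columns” are innocuous-looking features that still encode an excised sensitive attribute; the Reuters article above describes Amazon’s model penalizing phrases like “women’s chess club captain” even with gender removed. One way to hunt for them, sketched here with synthetic data: check how well the features you kept can reconstruct the attribute you dropped:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def shadow_column_check(X, protected, cv=5):
    """If the retained features can reconstruct the excised protected
    attribute, the model can still discriminate through those proxies.
    X: feature matrix with the protected column already removed.
    protected: the removed column's values."""
    auc = cross_val_score(LogisticRegression(max_iter=1000), X, protected,
                          scoring="roc_auc", cv=cv).mean()
    return auc  # ~0.5 means no leakage; near 1.0 means strong proxies remain

# Hypothetical demo: one column secretly encodes the protected attribute.
rng = np.random.default_rng(1)
gender = rng.integers(0, 2, 500)
X = np.c_[gender + rng.normal(0, 0.3, 500),   # strong proxy column
          rng.normal(size=500)]               # unrelated column
print(f"AUC reconstructing the excised attribute: {shadow_column_check(X, gender):.2f}")
```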

  54. MAKE SURE YOUR SAMPLE SET IS REPRESENTATIVE
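
Two routine checks that follow from this slide, sketched with hypothetical numbers: compare your sample’s group mix against the population it is supposed to mirror, and stratify any splits you control so each group keeps its share:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed population mix vs. what was actually collected.
population = pd.Series(["A"] * 700 + ["B"] * 300)
sample = pd.Series(["A"] * 90 + ["B"] * 10)

print(pd.DataFrame({"population": population.value_counts(normalize=True),
                    "sample": sample.value_counts(normalize=True)}))

# When you control the split yourself, stratify to preserve proportions.
df = pd.DataFrame({"group": population, "y": 0})
train, test = train_test_split(df, test_size=0.2, stratify=df["group"],
                               random_state=0)
print(train["group"].value_counts(normalize=True))  # keeps the 70/30 mix
```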

  55.–56. (image-only slides)
  57. IBM’S AI FAIRNESS TOOLKIT

  58. https://aif360.mybluemix.net AI FAIRNESS TOOLKIT

  59. https://aif360.mybluemix.net

  60.–62. (image-only slides)
  63. https://aif360.mybluemix.net

  64. https://aif360.mybluemix.net
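
A minimal sketch of the AIF360 workflow these slides demo, assuming the library’s API as of 2019 (`pip install aif360`; `AdultDataset` additionally requires downloading the UCI Adult data files per the library’s instructions):

```python
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

data = AdultDataset()

# Measure bias before mitigation: mean_difference is the gap in favorable
# outcome rates between groups (0 would be parity).
metric = BinaryLabelDatasetMetric(data, unprivileged_groups=unprivileged,
                                  privileged_groups=privileged)
print("mean difference before:", metric.mean_difference())

# One of the toolkit's pre-processing mitigations: reweigh training examples
# so outcome rates balance across groups before a model ever sees them.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
data_transf = rw.fit_transform(data)

metric_after = BinaryLabelDatasetMetric(data_transf,
                                        unprivileged_groups=unprivileged,
                                        privileged_groups=privileged)
print("mean difference after:", metric_after.mean_difference())
```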

  65. https://pair-code.github.io/what-if-tool

  66. HAVE A GOOD PROCESS

  67. KEEP IN MIND YOU NEED TO KNOW WHO CAN BE AFFECTED IN ORDER TO UN-BIAS
  68. (image-only slide)
  69. PRICING ALGORITHMS

  70. Calvano, Calzolari, Denicolò and Pastorello (2018)

  71.–72. (image-only slides)
  73. Calvano, Calzolari, Denicolò and Pastorello (2018)

  74. WHAT IF AMAZON BUILT A SALARY TOOL INSTEAD?

  75. THE BRATWURST PROBLEM

  76. HUMANS ARE RARELY SINGLE-MINDED

  77.–78. (image-only slides)
  79. https://www.alexirpan.com/2018/02/14/rl-hard.html; Gu, Lillicrap, Sutskever, & Levine, 2016

  80. (image-only slide)
  81. MODELS REPRESENT WHAT WAS; THEY DON’T TELL YOU WHAT SHOULD BE
  82. DON’T TRUST ALGORITHMS TO MAKE SUBTLE OR LARGE MULTI-VARIABLE JUDGEMENTS

  83. (image-only slide)
  84. MORE COMPLEX ALGORITHMS THAT INCLUDE OUTSIDE INFLUENCE

  85. (image-only slide)
  86. Lehman, Clune, & Misevic, 2018

  87. Cheney, MacCurdy, Clune, & Lipson, 2013

  88. (image-only slide)
  89. BE READY

  90. DON’T CONFUSE THE MAP WITH THE TERRITORY

  91. VERIFY AND CHECK SOLUTIONS DERIVED FROM SIMULATION

  92.–93. (image-only slides)
  94. BUT WHAT HAPPENS WITH DIALECTAL LANGUAGE? Blodgett, Green, and O’Connor, 2016
  95. MANY AI/ML TOOLS ARE TRAINED TO MINIMIZE AVERAGE LOSS

  96. REPRESENTATION DISPARITY Hashimoto, Srivastava, Namkoong, and Liang, 2018

  97. (image-only slide)
  98. CONSIDER PREDICTIVE ACCURACY AS A RESOURCE TO BE ALLOCATED Hashimoto, Srivastava, Namkoong, and Liang, 2018
  99. DISTRIBUTIONALLY ROBUST OPTIMIZATION Hashimoto, Srivastava, Namkoong, and Liang, 2018
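
A toy numpy sketch (not from the paper) of the representation-disparity problem and the DRO remedy: minimizing average loss lets a 90% majority group dictate the fit, while minimizing the worst group’s loss protects the minority. Hashimoto et al.’s actual method uses a chi-squared DRO objective that needs no group labels at all; the grid search below is just a stand-in to show the objective’s effect:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a majority group (90%) and a minority group (10%) whose optimal
# prediction differs. We fit a single scalar prediction theta.
y_major = rng.normal(loc=0.0, size=900)
y_minor = rng.normal(loc=3.0, size=100)

def group_losses(theta):
    return np.array([np.mean((y_major - theta) ** 2),
                     np.mean((y_minor - theta) ** 2)])

# Average-loss fit: the closed-form minimizer is the overall mean.
theta_avg = np.concatenate([y_major, y_minor]).mean()

# Worst-group (DRO-style) fit: minimize the max per-group loss by grid search.
grid = np.linspace(-1, 4, 501)
theta_dro = grid[np.argmin([group_losses(t).max() for t in grid])]

print("avg-loss fit   :", round(theta_avg, 2), group_losses(theta_avg).round(2))
print("worst-group fit:", round(theta_dro, 2), group_losses(theta_dro).round(2))
# The average-loss fit leaves the minority group with a far larger loss;
# the worst-group fit spreads predictive accuracy more evenly.
```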

  100. (image-only slide)
  101. LET’S BUILD A PRODUCT WITH OUR TWITTER NLP

  102. WHAT HAPPENS TO PEOPLE WHO USE DIALECT?

  103. PREDICTIVE POLICING

  104. Image via Reddit, by user u/jakeroot

  105.–110. Ensign, Friedler, Neville, Scheidegger, & Venkatasubramanian, 2017 (slide series)
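
A simplified simulation of the runaway feedback loop Ensign et al. model (their formulation uses Pólya urns; this sketch is a cruder stand-in): two districts with identical true incident rates, where patrols go to whichever district’s records look worst, and patrols only record crime where they are:

```python
import numpy as np

rng = np.random.default_rng(7)

true_rate = [0.3, 0.3]           # both districts are actually identical
observed = np.array([1.0, 1.0])  # small initial recorded counts

for day in range(1000):
    d = int(np.argmax(observed))             # police the "hot" district
    observed[d] += rng.random() < true_rate[d]  # only record where we patrol

print("recorded incidents:", observed)
# Runaway feedback: whichever district gets an early lead absorbs every
# patrol, and its records grow without bound while the other's stay flat.
# Ensign et al.'s fix is to discount or discard incidents discovered only
# because the algorithm sent the patrol there (the next slide's advice).
```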

  111. (image-only slide)
  112. IGNORE OR ADJUST FOR ALGORITHM-SUGGESTED RESULTS

  113. LOOK TO CONTROL ENGINEERING

  114. By Arturo Urquizo - http://commons.wikimedia.org/wiki/File:PID.svg, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=17633925
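
A minimal textbook PID controller matching the diagram on this slide. Applying it to an ML pipeline (for instance, damping a district’s patrol share toward a setpoint instead of letting the feedback loop run open) is an assumed illustration of the talk’s suggestion, not something shown on the slide:

```python
class PID:
    """Textbook PID: output = Kp*e + Ki*integral(e) + Kd*de/dt."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt=1.0):
        error = setpoint - measured
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

# Hypothetical use: steer a quantity toward a 50% target.
pid = PID(kp=0.5, ki=0.1, kd=0.05)
level = 0.9
for _ in range(20):
    level += pid.update(setpoint=0.5, measured=level)
print(f"settled near: {level:.2f}")
```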

  115. (image-only slide)
  116. CLASS I - PHANTOMS OF FALSE CORRELATION: Know what question you’re asking; trust conditional probability over straight correlation
  117. CLASS II - SPECTER OF BIASED SAMPLE DATA: Recognize data is biased even at rest; make sure your sample set is crafted properly; excise problematic predictors, but beware their shadow columns; build a learning system that can incorporate false positives and false negatives as you find them; try using adversarial techniques to detect bias
  118. CLASS III - SHADE OF OVERLY-SIMPLISTIC MAXIMIZATION: Remember models tell you what was, not what should be; try combining dependent columns and predicting that; try complex algorithms that allow more flexible reinforcement
  119. CLASS V – THE SIMULATION SURPRISE: Don’t confuse the map with the territory; always reality-check solutions from simulations
  120. CLASS VI - APPARITION OF FAIRNESS: Consider predictive accuracy as a resource to be allocated; possibly seek external auditing of results, or at least another team
  121. CLASS VII - THE FEEDBACK DEVIL: Ignore or adjust for algorithm-suggested results; look to control engineering for potential answers
  122.–123. (image-only slides)
  124. MODELS REPRESENT WHAT WAS; THEY DON’T TELL YOU WHAT SHOULD BE
  125. (image-only slide)
  126. OR GET TRAINING

  127. Bootcamps, Coursera, Udemy, Actual Universities

  128. (image-only slide)
  129. AI Now Institute, Georgetown Law Center on Privacy and Technology, Knight Foundation’s AI ethics initiative, fast.ai
  130. ABIDE BY ETHICS GUIDELINES

  131. Privacy/Consent, Transparency of Use, Transparency of Algorithms, Ownership

  132. https://www.accenture.com/_acnmedia/PDF-24/Accenture-Universal-Principles-Data-Ethics.pdf

  133. Arthur Doler, @arthurdoler, arthurdoler@gmail.com. Slides/Handout: bit.ly/art-ai-with-agenda