An AI with an Agenda: How Our Biases Leak Into Machine Learning (Codestock 2019)

In the glorious AI-assisted future, all decisions are objective and perfect, and there’s no such thing as cognitive bias. That’s why we created AI and machine learning, right? Because humans can make mistakes, but computers are perfect. Well, there’s some bad news: humans build those AIs and machine learning models, and as a result humanity’s biases and missteps can subtly work their way into our AIs and models.

Not all hope is lost, though! In this talk you’ll learn how science and statistics have already solved some of these problems, and how a robust awareness of cognitive biases can help with many of the rest. Come learn what else we can do to protect ourselves from these old mistakes, because we owe it to the people who’ll rely on our algorithms to deliver the best possible intelligence!

Arthur Doler

April 13, 2019

Transcript

  1. Arthur Doler @arthurdoler arthurdoler@gmail.com Slides: Handout: AN AI WITH AN AGENDA: How Our Biases Leak Into Machine Learning
  2. None
  3. LET’S ALL PLAY A GAME

  4. “THE NURSE SAID”

  5. “THE SOFTWARE ENGINEER SAID”

  6. None
  7. None
  8. None
  9. None
  10. None
  11. None
  12. None
  13. None
  14. REAL CONSEQUENCES

  15. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  16. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

  17. http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/ Aylin Caliskan-Islam, Joanna J. Bryson, and Arvind Narayanan, 2016
  18. http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/

  19. None
  20. SIX CLASSES OF PROBLEM WITH AI/ML

  21. None
  22. None
  23. None
  24. Class I – Phantoms of False Correlation; Class II – Specter of Biased Sample Data; Class III – Shade of Overly-Simplistic Maximization; Class V – The Simulation Surprise; Class VI – Apparition of Fairness; Class VII – The Feedback Devil
  25. None
  26. None
  27. None
  28. None
  29. None
  30. http://www.tylervigen.com/spurious-correlations - Data sources: Centers for Disease Control & Prevention and Internet Movie Database
  31. http://www.tylervigen.com/spurious-correlations - Data sources: National Vital Statistics Reports and U.S. Department of Agriculture
  32. http://www.tylervigen.com/spurious-correlations - Data sources: National Spelling Bee and Centers for Disease Control & Prevention
  33. None
  34. KNOW WHAT QUESTION YOU’RE ASKING UP FRONT

  35. USE CONDITIONAL PROBABILITY OVER CORRELATION

  36. https://versionone.vc/correlation-probability/
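
The advice on slide 35 is concrete enough to illustrate. Below is a minimal sketch (not from the talk, with hypothetical column names): a correlation coefficient compresses the relationship into one easily over-read number, while conditional probabilities answer the question you actually asked.

```python
# Sketch: correlation vs. conditional probability on a tiny toy data set.
# Column names ("used_feature_x", "converted") are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "used_feature_x": [1, 1, 0, 0, 1, 0, 1, 0, 1, 1],
    "converted":      [1, 1, 0, 0, 1, 0, 0, 0, 1, 0],
})

# Straight correlation: a single number that invites over-interpretation.
print("correlation:", df["used_feature_x"].corr(df["converted"]))

# Conditional probabilities: state the question you care about explicitly.
p_given_x = df.loc[df["used_feature_x"] == 1, "converted"].mean()
p_given_not_x = df.loc[df["used_feature_x"] == 0, "converted"].mean()
print("P(converted | used x)     =", p_given_x)
print("P(converted | didn't use) =", p_given_not_x)
```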

  37. None
  38. None
  39. MORTGAGE LENDING ANALYSIS

  40. None
  41. None
  42. None
  43. None
  44. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G

  45. Twitter - @quantoidasaurus (Used with permission)

  46. YOUR SAMPLE MIGHT NOT BE REPRESENTATIVE

  47. YOUR DATA MIGHT NOT BE REPRESENTATIVE

  48. None
  49. MODELS REPRESENT WHAT WAS. THEY DON’T TELL YOU WHAT SHOULD BE.
  50. FIND A BETTER DATA SET! CONCEPTNET.IO

  51. BUILD A BETTER DATA SET!

  52. None
  53. BEWARE SHADOW COLUMNS
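
A “shadow column” is a feature you kept (a zip code, say) that quietly reconstructs a protected attribute you deliberately dropped. One rough way to check for them, sketched here on synthetic data and assuming scikit-learn is available, is to see how well the remaining columns predict the dropped attribute.

```python
# Sketch: after dropping a protected column, test whether the columns you
# kept can reconstruct it. High accuracy means shadow columns remain.
# All data and column meanings here are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
protected = rng.integers(0, 2, n)                 # the column you removed
zip_code = protected * 2 + rng.integers(0, 2, n)  # a proxy that shadows it
income = rng.normal(50 + 10 * protected, 5, n)    # another partial proxy

X = np.column_stack([zip_code, income])           # the features you kept
scores = cross_val_score(LogisticRegression(), X, protected, cv=5)
print("protected attribute recoverable with accuracy:", scores.mean())
# Anything well above the base rate is a red flag.
```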

  54. RESTRUCTURE YOUR DATA SET PROBABILISTICALLY

  55. DATA AUGMENTATION
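
Slides 54 and 55 name two remedies. Here is a hedged sketch of the first, restructuring a data set probabilistically, using pandas on hypothetical data: resample so each group appears at the rate you intend rather than the rate history handed you. Data augmentation extends the same idea by synthesizing new examples for the thin groups instead of repeating existing ones.

```python
# Sketch: reweighted resampling so group shares match a target distribution.
# Group labels and target shares are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "group": ["a"] * 90 + ["b"] * 10,   # group "b" is under-represented
    "label": [0, 1] * 45 + [0, 1] * 5,
})

target_share = {"a": 0.5, "b": 0.5}     # the distribution you want
weights = df["group"].map(
    lambda g: target_share[g] / (df["group"] == g).mean()
)
balanced = df.sample(n=len(df), replace=True, weights=weights, random_state=0)

print(balanced["group"].value_counts(normalize=True))  # ~50/50 now
```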

  56. None
  57. None
  58. MAKE SURE YOUR SAMPLE SET IS REPRESENTATIVE
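
A representativeness check is mostly bookkeeping. This sketch, with hypothetical group shares, compares sample proportions against known population figures and then uses a stratified split so whatever balance you do have doesn't drift further; it assumes pandas and scikit-learn.

```python
# Sketch: compare sample shares to external population figures, then split
# with stratification. The "demographic" column and shares are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split

population_share = {"a": 0.60, "b": 0.30, "c": 0.10}  # known external figures
df = pd.DataFrame({
    "demographic": ["a"] * 70 + ["b"] * 25 + ["c"] * 5,
    "label": [0, 1] * 50,
})

sample_share = df["demographic"].value_counts(normalize=True)
for group, expected in population_share.items():
    print(group, "expected", expected, "got", round(sample_share.get(group, 0.0), 2))

# Stratifying preserves the group proportions across train and test.
train, test = train_test_split(
    df, test_size=0.2, stratify=df["demographic"], random_state=0
)
```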

  59. None
  60. None
  61. HAVE A GOOD PROCESS

  62. USE ADVERSARIAL ALGORITHMS TO DETECT BIAS IN YOUR MODELS
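
The adversarial idea on slide 62 can be shown in miniature: train a second model to predict the protected attribute from your main model's scores. If the adversary beats chance, the scores are leaking that attribute, directly or through shadow columns. The synthetic data, scikit-learn, and a linear adversary below are all simplifying assumptions.

```python
# Sketch: an "adversary" tries to recover the protected attribute from the
# main model's output scores. Data is synthetic and hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
protected = rng.integers(0, 2, n)
X = np.column_stack([rng.normal(protected, 1.0, n), rng.normal(0, 1.0, n)])
y = (X[:, 0] + rng.normal(0, 0.5, n) > 0.5).astype(int)

model = LogisticRegression().fit(X, y)
scores = model.predict_proba(X)[:, [1]]      # the main model's outputs

adversary = LogisticRegression().fit(scores, protected)
print("adversary accuracy:", adversary.score(scores, protected))
# ~0.5 would mean no leakage; well above that means the model learned
# the protected attribute.
```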

  63. KEEP IN MIND YOU NEED TO KNOW WHO CAN BE AFFECTED IN ORDER TO UN-BIAS
  64. None
  65. PRICING ALGORITHMS

  66. Calvano, Calzolari, Denicolò and Pastorello (2018)

  67. None
  68. None
  69. Calvano, Calzolari, Denicolò and Pastorello (2018)

  70. WHAT IF AMAZON BUILT A SALARY TOOL INSTEAD?

  71. THE BRATWURST PROBLEM

  72. HUMANS ARE RARELY SINGLE-MINDED

  73. None
  74. None
  75. https://www.alexirpan.com/2018/02/14/rl-hard.html; Gu, Lillicrap, Sutskever, & Levine, 2016

  76. None
  77. MODELS REPRESENT WHAT WAS. THEY DON’T TELL YOU WHAT SHOULD BE.
  78. DON’T TRUST ALGORITHMS TO MAKE SUBTLE OR LARGE MULTI-VARIABLE JUDGEMENTS

  79. None
  80. MORE COMPLEX ALGORITHMS THAT INCLUDE OUTSIDE INFLUENCE

  81. None
  82. Lehman, Clune, & Misevic, 2018

  83. Cheney, MacCurdy, Clune, Lipson, 2013

  84. None
  85. BE READY

  86. DON’T CONFUSE THE MAP WITH THE TERRITORY

  87. VERIFY AND CHECK SOLUTIONS DERIVED FROM SIMULATION

  88. None
  89. None
  90. BUT WHAT HAPPENS WITH DIALECTAL LANGUAGE? Blodgett, Green, and O’Connor, 2016
  91. MANY AI/ML TOOLS ARE TRAINED TO MINIMIZE AVERAGE LOSS

  92. REPRESENTATION DISPARITY Hashimoto, Srivastava, Namkoong, and Liang, 2018

  93. None
  94. CONSIDER PREDICTIVE ACCURACY AS A RESOURCE TO BE ALLOCATED Hashimoto, Srivastava, Namkoong, and Liang, 2018
  95. DISTRIBUTIONALLY ROBUST OPTIMIZATION Hashimoto, Srivastava, Namkoong, and Liang, 2018
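
Slides 91 through 95 hang together: minimizing average loss lets a small group's errors vanish into the mean, which is the representation disparity Hashimoto et al. describe, and distributionally robust optimization targets the worst-off group instead. The numbers below are synthetic; the point is only the gap between the two summaries.

```python
# Sketch: average loss vs. worst-group loss on synthetic per-example losses.
# A DRO-style objective optimizes the latter rather than the former.
import numpy as np

rng = np.random.default_rng(2)
groups = np.array([0] * 900 + [1] * 100)        # minority group is 10%
losses = np.where(groups == 0,
                  rng.normal(0.2, 0.05, 1000),  # majority: well served
                  rng.normal(0.9, 0.05, 1000))  # minority: poorly served

average_loss = losses.mean()
worst_group_loss = max(losses[groups == g].mean() for g in (0, 1))
print("average loss:    ", round(average_loss, 3))      # looks fine
print("worst-group loss:", round(worst_group_loss, 3))  # reveals the gap
```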

  96. None
  97. LET’S BUILD A PRODUCT WITH OUR TWITTER NLP

  98. WHAT HAPPENS TO PEOPLE WHO USE DIALECT?

  99. PREDICTIVE POLICING

  100. Image via Reddit, by user u/jakeroot

  101. Ensign, Friedler, Neville, Scheidegger, & Venkatasubramanian, 2017

  102. Ensign, Friedler, Neville, Scheidegger, & Venkatasubramanian, 2017

  103. Ensign, Friedler, Neville, Scheidegger, & Venkatasubramanian, 2017

  104. Ensign, Friedler, Neville, Scheidegger, & Venkatasubramanian, 2017

  105. Ensign, Friedler, Neville, Scheidegger, & Venkatasubramanian, 2017

  106. Ensign, Friedler, Neville, Scheidegger, & Venkatasubramanian, 2017

  107. None
  108. IGNORE OR ADJUST FOR ALGORITHM-SUGGESTED RESULTS
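
One remedy Ensign et al. discuss, shown here in a heavily simplified sketch, is to discount incidents you only observed because the algorithm sent you looking. Their urn-based correction is more careful than this toy keep/drop rule; the helper and the rates below are hypothetical.

```python
# Sketch: keep a discovered incident for retraining with probability that
# shrinks as the algorithm targets that district more, so the feedback loop
# doesn't compound itself. A crude stand-in for Ensign et al.'s correction.
import random

random.seed(0)

def keep_for_training(district, patrol_rate):
    """patrol_rate: fraction of patrols the algorithm sends to district.
    Hypothetical helper; the paper's urn model is more careful than this."""
    return random.random() < (1.0 - patrol_rate)

patrol_rate = {"downtown": 0.8, "suburb": 0.2}
discoveries = ["downtown"] * 8 + ["suburb"] * 2   # what patrols observed
kept = [d for d in discoveries if keep_for_training(d, patrol_rate[d])]
print(kept)  # heavily patrolled districts no longer dominate the feedback
```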

  109. LOOK TO CONTROL ENGINEERING

  110. By Arturo Urquizo - http://commons.wikimedia.org/wiki/File:PID.svg, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=17633925
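
Slide 110's diagram is the classic PID controller, control engineering's standard way to apply feedback without letting it run away: correct for the present error (P), the accumulated error (I), and the error's trend (D). A textbook sketch, with arbitrary illustrative gains:

```python
# Minimal textbook PID controller. Gains are arbitrary illustration values.
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        self.integral += error * dt                      # accumulated error
        derivative = (error - self.prev_error) / dt      # error trend
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy loop: steer a value toward a setpoint of 1.0 without overshooting wildly.
pid, value = PID(kp=0.6, ki=0.1, kd=0.05), 0.0
for _ in range(20):
    value += pid.update(setpoint=1.0, measured=value, dt=1.0)
print(round(value, 3))
```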

  111. None
  112. CLASS I – PHANTOMS OF FALSE CORRELATION: Know what question you’re asking; trust conditional probability over straight correlation
  113. CLASS II – SPECTER OF BIASED SAMPLE DATA: Recognize data is biased even at rest; make sure your sample set is crafted properly; excise problematic predictors, but beware their shadow columns; build a learning system that can incorporate false positives and false negatives as you find them; try using adversarial techniques to detect bias
  114. CLASS III – SHADE OF OVERLY-SIMPLISTIC MAXIMIZATION: Remember models tell you what was, not what should be; try combining dependent columns and predicting that; try complex algorithms that allow more flexible reinforcement
  115. CLASS V – THE SIMULATION SURPRISE: Don’t confuse the map with the territory; always reality-check solutions from simulations
  116. CLASS VI – APPARITION OF FAIRNESS: Consider predictive accuracy as a resource to be allocated; possibly seek external auditing of results, or at least another team
  117. CLASS VII – THE FEEDBACK DEVIL: Ignore or adjust for algorithm-suggested results; look to control engineering for potential answers
  118. None
  119. None
  120. None
  121. MODELS REPRESENT WHAT WAS. THEY DON’T TELL YOU WHAT SHOULD BE.
  122. None
  123. OR GET TRAINING

  124. Bootcamps, Coursera, Udemy, actual universities

  125. None
  126. AI Now Institute; Georgetown Law Center on Privacy and Technology; Knight Foundation’s AI ethics initiative; fast.ai
  127. ABIDE BY ETHICS GUIDELINES

  128. Privacy / Consent; Transparency of Use; Transparency of Algorithms; Ownership

  129. https://www.accenture.com/_acnmedia/PDF-24/Accenture-Universal-Principles-Data-Ethics.pdf

  130. Slides: Arthur Doler @arthurdoler arthurdoler@gmail.com Handout: