Gerrit Gruben: Limits of Data Science and other ethical considerations

Faked statistics and bogus research on data have always been a classic and interesting topic. In the age of big data, otherwise rare phenomena such as Simpson's paradox show up more often. Our methods also have limits, both theoretical (think "black swan") and human (think biases). I want to touch on several topics to raise your awareness and sharpen your critical thinking as an ethical data scientist. Since everyone in machine learning has created a faulty experimental design at least once, this presentation is also of high practical value. I will show concrete examples where model evaluation was botched to the detriment of human beings.

MunichDataGeeks

January 31, 2018

Transcript

  1. Ethics for Data Scientists: The Limits of ML
    Munich DataGeeks, Gerrit Gruben, 31 January 2018
  2. about.me
    • Freelance DS; previously worked as DS/SWE.
    • Training people in a 3-month boot camp to be DS → datascienceretreat.com
    • Org. of Kaggle Berlin meetup
    • ML PhD dropout @ Potsdam
    • Degrees in Math. & CS, going for Laws (sic!)
  3. Goals

  4. 4

  5. Main points
    • No data positivism in ML
      • Inductive bias is always there.
      • The IID assumption is idealistic.
    • Can't predict everything
    • ML systems are prone to manipulation (fragility)
  6. Limits & Biases

  7. Benevolent or evil?

  8. “Absence of Evidence is not Evidence of Absence” --- Data Scientist’s Proverbs
  9. 9

  10. Source: http://www.gpmfirst.com/books/exploiting-future-uncertainty/risk-concepts

  11. 11

  12. “I beseech you, in the bowels of Christ, think it possible that you may be mistaken” --- Oliver Cromwell
    Dennis Lindley: avoid prior probabilities of 0 and 1.
  13. Problem of Induction
    • More general than the black swan problem.
    • ML models have an inductive bias.
    “The process of inferring a general law or principle from the observation of particular instances.” --- Oxford's Dictionary (direct opposite of deduction)
  14. “When you have two competing theories that make exactly the same predictions, the simpler one is the better.” --- Ockham’s Razor
  15. Technical Things What goes wrong often…

  16. Multiple Testing
    Rerunning tests until one "hits" the significance level by chance.
    Solution: Bayesian methods, a correction (e.g. Bonferroni), or a different experimental design.
    Data Snooping: http://bit.ly/2iWoFrV
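To make the multiple-testing slide concrete, here is a minimal simulation sketch that is not taken from the talk. It assumes every null hypothesis is true, so each p-value is uniform on [0, 1]; the helper `false_positive_rate`, the 20 tests per experiment, and the level 0.05 are illustrative choices.

```python
import random


def false_positive_rate(n_tests, alpha, n_experiments=20_000, seed=0):
    """Share of experiments with at least one p-value below alpha,
    although no real effect exists (p-values drawn uniformly)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        p_values = [rng.random() for _ in range(n_tests)]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_experiments


alpha = 0.05
n_tests = 20

# Running 20 tests and reporting whichever "hits".
uncorrected = false_positive_rate(n_tests, alpha)

# Bonferroni: each single test must pass the stricter level alpha / n_tests.
bonferroni = false_positive_rate(n_tests, alpha / n_tests)

# Closed form for independent tests: 1 - (1 - alpha)^n_tests.
theoretical = 1 - (1 - alpha) ** n_tests

print(f"uncorrected: {uncorrected:.3f} (theory {theoretical:.3f})")
print(f"Bonferroni:  {bonferroni:.3f} (target <= {alpha})")
```

Without a correction, well over half of these effect-free experiments report something "significant"; with the Bonferroni level the family-wise rate stays near the intended 5%.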
  17. Statistical Power

  18. Simpson's Paradox
    Let's try it at: https://vudlab.com/simpsons/
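A small worked example may help alongside the interactive demo. The counts below are not from the slides; they are the figures commonly quoted for the kidney-stone treatment example and serve only as an illustration. The names `outcomes`, `rate`, and `pooled_rate` are made up for this sketch.

```python
# (successes, patients) per treatment and stone size; illustrative counts,
# not part of the talk.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    return successes / patients


def pooled_rate(strata):
    """Success rate after lumping all strata of one treatment together."""
    successes = sum(s for s, _ in strata.values())
    patients = sum(n for _, n in strata.values())
    return successes / patients


for size in ("small", "large"):
    a = rate(*outcomes["A"][size])
    b = rate(*outcomes["B"][size])
    print(f"{size:>5} stones: A={a:.3f}  B={b:.3f}")  # A is ahead in both strata

pooled_a = pooled_rate(outcomes["A"])
pooled_b = pooled_rate(outcomes["B"])
print(f"  all stones: A={pooled_a:.3f}  B={pooled_b:.3f}")  # B is ahead overall
```

The reversal comes from the uneven group sizes: treatment A was applied mostly to the harder large-stone cases, which drags its pooled rate down.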

  19. Frequentist vs Bayesian

  20. "P-hacking"

  21. "P-hacking" II
    “When a measure becomes a target, it ceases to be a good measure” --- Goodhart's law
  22. selection ≠ evaluation

  23. Paper: http://bit.ly/2gBIR1M
    I prefer to call it “over-selection”. In “Learning with Kernels”, Smola & Schölkopf name ex. 5.10 “overfitting on the test set”.
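The "selection ≠ evaluation" point can be illustrated with a short simulation sketch, which is an assumption-laden toy and not from the talk. All 200 "models" are pure coin-flip guessers, so their true accuracy is 0.5; the sizes `N_TEST`, `N_FRESH`, and `N_MODELS` are arbitrary.

```python
import random

rng = random.Random(0)
N_TEST, N_FRESH, N_MODELS = 100, 10_000, 200


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def coin_flips(n):
    return [rng.randint(0, 1) for _ in range(n)]


y_test = coin_flips(N_TEST)

# Every "model" is a pure guesser, so its true accuracy is 0.5.
test_scores = [accuracy(coin_flips(N_TEST), y_test) for _ in range(N_MODELS)]

# Selection: keep the model that looks best on the test set.
selected_score = max(test_scores)

# Evaluation: the selected guesser on data that played no role in selection.
y_fresh = coin_flips(N_FRESH)
fresh_score = accuracy(coin_flips(N_FRESH), y_fresh)

print(f"test accuracy of the selected model: {selected_score:.2f}")
print(f"accuracy on fresh data:              {fresh_score:.3f}")
```

The score used for picking the winner is optimistically biased even though no model has any skill, which is why the data used for selection cannot double as the final evaluation set.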
  24. Empirical Risk Minimization

  25. Empirical Loss

  26. Empirical Risk Minimization II
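As a reminder of the textbook definitions behind these three slides, here is a sketch in assumed notation that may differ from the speaker's: a sample $(x_i, y_i)_{i=1}^n$ drawn IID from a distribution $P$, a loss function $\ell$, and a hypothesis class $\mathcal{F}$.

```latex
% True risk: expected loss under the (unknown) data distribution P
R(f) = \mathbb{E}_{(x, y) \sim P}\bigl[\ell(f(x), y)\bigr]

% Empirical risk (empirical loss): average loss on the n observed examples
\hat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr)

% Empirical risk minimization: choose the hypothesis with the lowest empirical risk
\hat{f}_n = \operatorname*{arg\,min}_{f \in \mathcal{F}} \hat{R}_n(f)
```

The choice of $\mathcal{F}$ is where the inductive bias enters, and the step from $\hat{R}_n$ to $R$ relies on the IID assumption.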

  27. Bias / Variance

  28. • 28

  29. Source: http://bit.ly/2vDfoLp

  30. Source: University of Potsdam

  31. Source: University of Potsdam

  32. Nested CV
    From Quora: http://bit.ly/2wvz2aZ
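A minimal sketch of nested cross-validation in plain Python, under assumptions that are not from the talk: a 1-D k-nearest-neighbour regressor as the model, k as the only hyperparameter, and synthetic noisy sine data. All function names are hypothetical.

```python
import math
import random


def knn_predict(train, x, k):
    """Mean target of the k training points nearest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, test, k):
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in test) / len(test)


def folds(data, n_folds):
    """Yield (train, held_out) pairs; fold i holds out every n_folds-th point."""
    for i in range(n_folds):
        held_out = data[i::n_folds]
        train = [p for j, p in enumerate(data) if j % n_folds != i]
        yield train, held_out


def select_k(data, candidates, n_folds=3):
    """Inner loop: pick the k with the lowest mean CV error, using `data` only."""
    def cv_error(k):
        return sum(mse(tr, va, k) for tr, va in folds(data, n_folds)) / n_folds
    return min(candidates, key=cv_error)


def nested_cv(data, candidates, outer_folds=5):
    """Outer loop: score the whole 'select k, then predict' procedure."""
    scores = []
    for train, test in folds(data, outer_folds):
        k = select_k(train, candidates)  # the outer test fold is never seen here
        scores.append(mse(train, test, k))
    return sum(scores) / len(scores)


rng = random.Random(0)
xs = [rng.uniform(0, 6) for _ in range(120)]
data = [(x, math.sin(x) + rng.gauss(0, 0.3)) for x in xs]  # synthetic data

candidates = [1, 3, 5, 9, 15]
estimate = nested_cv(data, candidates)
print(f"nested-CV estimate of the MSE: {estimate:.3f}")
```

The inner loop does the selection and the outer loop does the evaluation, so the reported error is not computed on data that influenced the choice of k.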

  33. Messing up your experiments
    • The data split strategy is part of the experiment.
    • Mainly care for:
      • Class distribution
      • Problem-domain-relevant issues such as time
    “Validation and Test sets should model nature and nature is not accommodating.” --- Data Scientist’s Proverbs
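The two concerns on this slide, class distribution and time, can be sketched as two split helpers. This is an illustrative stdlib-only sketch, not the speaker's code; `stratified_split`, `time_split`, and the toy labels and timestamps are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx) with roughly the same class mix in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def time_split(timestamps, test_fraction):
    """Train on the earliest records and test on the latest ones."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    cut = len(order) - round(len(order) * test_fraction)
    return order[:cut], order[cut:]


# Toy imbalanced labels: 10% positives.
labels = [1] * 10 + [0] * 90
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
positives_in_test = sum(labels[i] for i in test_idx)
print(f"test size {len(test_idx)}, positives in test {positives_in_test}")

# Toy timestamps in arbitrary order.
timestamps = [5, 1, 4, 2, 3, 9, 7, 8, 6, 0]
past_idx, future_idx = time_split(timestamps, test_fraction=0.3)
print("train until t =", max(timestamps[i] for i in past_idx))
print("test from   t =", min(timestamps[i] for i in future_idx))
```

A purely random split of time-dependent data would let the model train on records that come after the ones it is tested on, which rarely matches how it will be used.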
  34. “Model evaluation, model selection…” by Sebastian Raschka: http://bit.ly/2p6PGY0
    “Approximate Statistical Tests For Comparing Supervised Class. Learning Algorithms” (Dietterich 98): http://bit.ly/2wyItF6
  35. Gallery of Fails

  36. Courier/Terrorist detection in Pakistan
    Source: http://bit.ly/1KY4SQE

  37. Feedback loops abused
    Tay.ai was a chat bot deployed on Twitter by Microsoft for just a day. Trolls "subverted" the bot, "teaching" it to be politically incorrect through focused exposure to extreme content.
  38. Moral Machine: http://moralmachine.mit.edu

  39. Smaller tips for ML
    • Always model uncertainty.
    • Read this
    • Don’t mock values of a non-existent predictive model.
  40. Books

  41. Other Links
    • https://www.ma.utexas.edu/users/mks/statmistakes/StatisticsMistakes.html
    • Quantopian Lecture Series: p-Hacking and Multiple Comparisons Bias https://www.youtube.com/watch?v=YiDfbYtgUPc
    • David Hume: A Treatise of Human Nature: http://www.davidhume.org/texts/thn.html
  42. Thanks! Questions? Github: github.com/uberwach