Gerrit Gruben: Limits of Data Science and other ethical considerations

Faking statistics or doing bogus research on data has always been a classic and interesting topic. In the big data age, we observe otherwise rare phenomena such as Simpson's paradox more often. There are also limits to our methods, both theoretical (think "black swan") and human (think biases). I want to touch on several topics to raise your awareness and sharpen your critical thinking as an ethical data scientist. As everyone in machine learning has created a faulty experimental design at least once, this presentation is also of high practical value. I will show you concrete examples where the model evaluation has been screwed up to the disadvantage of human beings.



January 31, 2018


  1. Ethics for Data Scientists: The Limits of ML. Munich DataGeeks, Gerrit Gruben, 31 January 2018
  2. •Freelance DS, previously worked as DS/SWE. •Training people in a 3-month boot camp to be DS → •Organizer of the Kaggle Berlin meetup •ML PhD dropout @ Potsdam •Degrees in Math. & CS, going for Laws (sic!)
  3. Goals

  4. 4

  5. Main points •No data positivism in ML • Inductive bias is always there • IID assumption is idealistic •Can't predict everything •ML systems are prone to manipulation (fragility)
  6. Limits & Biases

  7. Benevolent or evil?

  8. "Absence of Evidence is not Evidence of Absence" --- Data Scientist’s Proverbs
  9. 9

  10. Source:

  11. 11

  12. "I beseech you, in the bowels of Christ, think it possible that you may be mistaken" --- Oliver Cromwell. Dennis Lindley: avoid prior probabilities of 0 and 1.
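Lindley's advice (often called Cromwell's rule) can be illustrated with a minimal sketch of Bayes' rule for a binary hypothesis. The function names and the likelihood values 0.9 and 0.1 are illustrative assumptions, not taken from the slides. The point is that a prior of exactly 0 or 1 never moves, however much evidence arrives.

```python
def posterior(prior: float, p_data_given_h: float, p_data_given_not_h: float) -> float:
    """Bayes' rule for a binary hypothesis H after one observation."""
    evidence = prior * p_data_given_h + (1.0 - prior) * p_data_given_not_h
    return prior * p_data_given_h / evidence


def update_many(prior: float, n: int, p_h: float = 0.9, p_not_h: float = 0.1) -> float:
    """Apply the same observation n times (hypothetical likelihoods)."""
    belief = prior
    for _ in range(n):
        belief = posterior(belief, p_h, p_not_h)
    return belief


stuck_at_zero = update_many(0.0, 50)   # dogmatic prior: evidence for H is ignored
open_minded = update_many(0.01, 50)    # small but non-zero prior: evidence wins
```

With a prior of 0.01 the belief in H ends up very close to 1 after 50 supporting observations, while the prior of 0 stays at 0.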
  13. Problem of Induction •More general than the black swan problem. •ML models have an inductive bias. "The process of inferring a general law or principle from the observation of particular instances." --- Oxford's Dictionary (direct opposite of deduction)
  14. "When you have two competing theories that make exactly the same predictions, the simpler one is the better." --- Ockham’s Razor
  15. Technical Things: What goes wrong often…

  16. Multiple Testing: Repeating tests until one "hits" the significance level by chance. Solution: a Bayesian approach, a correction (e.g. Bonferroni correction), or a different experimental design. Data Snooping:
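A minimal sketch of why repeated testing "hits" significance by chance, and what the Bonferroni correction does about it. It assumes independent tests whose null hypotheses are all true, so each p-value is uniform on [0, 1]. The choice of 20 tests and the function names are illustrative assumptions.

```python
import random


def familywise_error_rate(n_tests: int, alpha: float) -> float:
    """Probability of at least one false positive among independent true-null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_any_hit(n_tests: int, alpha: float, n_runs: int, seed: int = 0) -> float:
    """Fraction of simulated runs in which at least one null p-value is below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_runs):
        # Under a true null hypothesis a p-value is uniform on [0, 1].
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_runs


n_tests, alpha = 20, 0.05
uncorrected = familywise_error_rate(n_tests, alpha)            # about 0.64
bonferroni = familywise_error_rate(n_tests, alpha / n_tests)   # stays below alpha
simulated = simulate_any_hit(n_tests, alpha, n_runs=20_000)
```

Testing each hypothesis at `alpha / n_tests` keeps the chance of any false positive below `alpha`, at the cost of statistical power.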
  17. Statistical Power

  18. Simpson's Paradox Let's try at:
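As a small worked example of the paradox, the sketch below uses the widely cited kidney stone treatment counts (Charig et al., 1986); these numbers are not taken from the slides. Treatment A has the higher success rate within each stone size, yet treatment B looks better once the groups are pooled, because A was given more often in the harder (large stone) cases.

```python
# (successes, patients) per treatment and stone size
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(treatment: str, group: str) -> float:
    """Success rate of a treatment within one stone-size group."""
    successes, patients = data[treatment][group]
    return successes / patients


def pooled_rate(treatment: str) -> float:
    """Success rate of a treatment when the groups are lumped together."""
    successes = sum(s for s, _ in data[treatment].values())
    patients = sum(n for _, n in data[treatment].values())
    return successes / patients


for group in ("small", "large"):
    print(group, round(rate("A", group), 3), round(rate("B", group), 3))
print("pooled", round(pooled_rate("A"), 3), round(pooled_rate("B"), 3))
```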

  19. Frequentist vs Bayesian

  20. "P-hacking"

  21. "P-hacking" II "When a measure becomes a target, it ceases to be a good measure" --- Goodhart's law
  22. selection ≠ evaluation

  23. Paper: I prefer to call it “over-selection”. In “Learning with Kernels” by Schölkopf & Smola, exercise 5.10 is named “overfitting on the test set”.
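A minimal simulation of over-selection, assuming labels that are pure noise so that no model can truly beat 50% accuracy. The 200 "models" are hypothetical random guessers. Picking the best of them on the test set produces an optimistic score that disappears on fresh data.

```python
import random


def label(x: int) -> int:
    """Pure-noise label: there is nothing to learn."""
    return int(random.Random(f"label-{x}").random() < 0.5)


def predict(model_id: int, x: int) -> int:
    """A 'model' that guesses deterministically but carries no signal."""
    return int(random.Random(f"model-{model_id}-{x}").random() < 0.5)


def accuracy(model_id: int, xs: range) -> float:
    return sum(predict(model_id, x) == label(x) for x in xs) / len(xs)


test_set = range(0, 50)        # small test set, reused for selection
fresh_set = range(50, 5050)    # data that played no role in the selection

best_model = max(range(200), key=lambda m: accuracy(m, test_set))
selected_score = accuracy(best_model, test_set)   # optimistic: chosen on this data
honest_score = accuracy(best_model, fresh_set)    # close to chance
```

The gap between `selected_score` and `honest_score` is the reason selection and evaluation need separate data.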
  24. Empirical Risk Minimization

  25. Empirical Loss

  26. Empirical Risk Minimization II
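As a reference point, the standard textbook formulation of empirical risk minimization is sketched below. The notation (loss $\ell$, hypothesis class $\mathcal{F}$, sample size $n$, data distribution $P$) is an assumption and may differ from the one used on the slides.

```latex
% True (expected) risk of a hypothesis f under the data distribution P
R(f) = \mathbb{E}_{(x, y) \sim P}\bigl[\ell(f(x), y)\bigr]

% Empirical risk (empirical loss) on a sample (x_1, y_1), \dots, (x_n, y_n)
\hat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i), y_i)

% Empirical risk minimization over the hypothesis class F
\hat{f} = \operatorname*{arg\,min}_{f \in \mathcal{F}} \hat{R}_n(f)
```

The empirical risk is an unbiased estimate of the true risk only for an $f$ chosen independently of the sample, which is why a model selected by minimizing it needs separate data for evaluation.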

  27. Bias / Variance

  28. • 28

  29. Source:

  30. Source: University of Potsdam

  31. Source: University of Potsdam

  32. Nested CV. From Quora:
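A minimal pure-Python sketch of nested cross-validation, assuming a 1-D k-nearest-neighbour classifier and hypothetical toy data. The inner loop selects the hyperparameter `k`; the outer loop scores the whole "select, then fit" procedure on folds that the selection never saw. All function names are illustrative.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k nearest 1-D training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > len(nearest))


def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def folds(data, n_folds):
    """Yield (train, held_out) pairs; data is assumed to be shuffled already."""
    for i in range(n_folds):
        held_out = data[i::n_folds]
        train = [p for j, p in enumerate(data) if j % n_folds != i]
        yield train, held_out


def select_k(data, candidates, n_folds=3):
    """Inner loop: pick the k with the best cross-validated accuracy."""
    def cv_score(k):
        scores = [accuracy(tr, va, k) for tr, va in folds(data, n_folds)]
        return sum(scores) / len(scores)
    return max(candidates, key=cv_score)


def nested_cv(data, candidates, n_outer=5):
    """Outer loop: evaluate the selection procedure on unseen folds."""
    scores = []
    for train, test in folds(data, n_outer):
        k = select_k(train, candidates)   # selection sees only the outer training part
        scores.append(accuracy(train, test, k))
    return sum(scores) / len(scores)


# Hypothetical toy data: label is 1 when x > 0, with 10% label noise.
rng = random.Random(0)
data = []
for _ in range(150):
    x = rng.uniform(-1, 1)
    y = int(x > 0)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

estimate = nested_cv(data, candidates=[1, 5, 15])
```

Because of the 10% label noise, no classifier can be expected to score much above 0.9 here; the nested estimate reflects that instead of the inflated inner-loop score.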

  33. Messing up your experiments •The data split strategy is part of the experiment. •Mainly care for: • Class distribution • Problem-domain-relevant issues such as time. "Validation and test sets should model nature and nature is not accommodating." --- Data Scientist’s Proverbs
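The two concerns named above can be sketched as two split helpers: a stratified split that preserves the class distribution, and a time-based split that trains on the past and tests on the most recent data. The function names, the 20% test fraction, and the toy rows are illustrative assumptions.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, test_fraction=0.2, seed=0):
    """Split so that each class keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    train, test = [], []
    for members in by_class.values():
        rng.shuffle(members)
        cut = round(len(members) * test_fraction)
        test.extend(members[:cut])
        train.extend(members[cut:])
    return train, test


def time_split(rows, time_of, test_fraction=0.2):
    """Train on the past, test on the most recent part; no shuffling across time."""
    ordered = sorted(rows, key=time_of)
    cut = len(ordered) - round(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical rows: (timestamp, label) with a rare positive class (10%).
rows = [(t, int(t % 10 == 0)) for t in range(1000)]
train, test = stratified_split(rows, label_of=lambda r: r[1])
past, future = time_split(rows, time_of=lambda r: r[0])
```

A purely random split of time-dependent data would let the model train on observations that lie after some of its test points, which makes the evaluation look better than it would be in deployment.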
  34. “Model evaluation, model selection…” by Sebastian Raschka: “Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms” (Dietterich 98):
  35. Gallery of Fails

  36. Courier/Terrorist detection in Pakistan. Source:

  37. Feedback loops abused: Tay was a chatbot deployed on Twitter by Microsoft for just a day. Trolls started to "subvert" the bot by "teaching" it to be politically incorrect through focused exposure to extreme content.
  38. Moral Machine

  39. Smaller tips for ML •Always model uncertainty. •Read this •Don’t mock values of a non-existent predictive model.
  40. Books

  41. Other Links • akes.html •Quantopian Lecture Series: p-Hacking and Multiple Comparisons Bias •David Hume: A Treatise of Human Nature:
  42. Thanks! Questions? Github: