Slide 1

Slide 1 text

Ethics for Data Scientists
The Limits of ML
Munich DataGeeks
Gerrit Gruben
31 January 2018

Slide 2

Slide 2 text

• Freelance DS; previously worked as DS/SWE.
• Training people in a 3-month boot camp to be DS →
• Org. of Kaggle Berlin meetup
• ML PhD dropout @ Potsdam
• Degrees in Math. & CS, going for Laws (sic!)

Slide 3

Slide 3 text

Goals

Slide 4

Slide 4 text


Slide 5

Slide 5 text

Main points
• No data positivism in ML
  • Inductive bias is always there.
  • IID assumption is idealistic.
• Can't predict everything
• ML systems are prone to manipulation (fragility)

Slide 6

Slide 6 text

Limits & Biases

Slide 7

Slide 7 text

Benevolent or evil?

Slide 8

Slide 8 text

"Absence of evidence is not evidence of absence" --- Data Scientist's Proverbs

Slide 9

Slide 9 text


Slide 10

Slide 10 text

Source:

Slide 11

Slide 11 text


Slide 12

Slide 12 text

"I beseech you, in the bowels of Christ, think it possible that you may be mistaken" --- Oliver Cromwell

Dennis Lindley: avoid prior probabilities of 0 and 1.
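Why priors of exactly 0 or 1 are a problem can be seen from Bayes' rule: no amount of evidence moves them. The following is a minimal sketch, assuming a binary hypothesis and made-up likelihoods (0.9 and 0.1); the function names are illustrative and not from the talk.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis after a single observation."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


def update_repeatedly(prior, n_observations,
                      likelihood_if_true=0.9, likelihood_if_false=0.1):
    """Apply the same observation n times, feeding each posterior back in as prior."""
    belief = prior
    for _ in range(n_observations):
        belief = posterior(belief, likelihood_if_true, likelihood_if_false)
    return belief


# 20 observations that each favour the hypothesis 9:1 (assumed numbers).
dogmatic = update_repeatedly(0.0, 20)      # prior of exactly 0 never moves
open_minded = update_repeatedly(1e-6, 20)  # a tiny but non-zero prior is overturned

# 20 observations that each speak 9:1 against the hypothesis.
stubborn = update_repeatedly(1.0, 20, likelihood_if_true=0.1,
                             likelihood_if_false=0.9)  # prior of exactly 1 never moves

print(dogmatic, open_minded, stubborn)
```

A prior of one in a million is revised to near certainty, while the priors of 0 and 1 stay exactly where they started.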

Slide 13

Slide 13 text

Problem of Induction
• More general than the black swan problem.
• ML models have an inductive bias.

Induction: "The process of inferring a general law or principle from the observation of particular instances." --- Oxford's Dictionary (direct opposite of deduction)

Slide 14

Slide 14 text

"When you have two competing theories that make exactly the same predictions, the simpler one is the better." --- Ockham's Razor

Slide 15

Slide 15 text

Technical Things
What often goes wrong…

Slide 16

Slide 16 text

Multiple Testing
Repeating tests until one of them "hits" the significance level by chance.
Solution: Bayesian methods, a correction (e.g. the Bonferroni correction), or a different experimental design.
Data Snooping:
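The effect of repeated testing, and of the Bonferroni correction, can be sketched with a small simulation. It assumes that every null hypothesis is true, so each p-value is uniform on [0, 1]; the numbers (20 tests, significance level 0.05, 5000 simulated experiments) are illustrative assumptions.

```python
import random

rng = random.Random(42)
alpha, n_tests, n_experiments = 0.05, 20, 5000


def any_rejection(p_values, threshold):
    """True if at least one test falls below the threshold (a false alarm here)."""
    return any(p < threshold for p in p_values)


uncorrected = corrected = 0
for _ in range(n_experiments):
    # All nulls are true, so p-values are draws from Uniform(0, 1).
    p_values = [rng.random() for _ in range(n_tests)]
    uncorrected += any_rejection(p_values, alpha)
    corrected += any_rejection(p_values, alpha / n_tests)  # Bonferroni threshold

fwer_uncorrected = uncorrected / n_experiments
fwer_bonferroni = corrected / n_experiments
print(f"family-wise error rate, uncorrected: {fwer_uncorrected:.3f}")
print(f"family-wise error rate, Bonferroni:  {fwer_bonferroni:.3f}")
```

In theory the uncorrected rate is 1 - 0.95^20, roughly 0.64, while Bonferroni keeps it at or below 0.05.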

Slide 17

Slide 17 text

Statistical Power

Slide 18

Slide 18 text

Simpson's Paradox
Let's try it at:
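Simpson's paradox in a few lines: a treatment can be better in every subgroup and still look worse in the aggregate. The counts below are the frequently cited kidney-stone treatment example, used here as an assumption for illustration; they are not taken from the slides.

```python
# (successes, patients) per treatment and stone size.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def group_rate(treatment, group):
    successes, total = counts[treatment][group]
    return successes / total


def overall_rate(treatment):
    successes = sum(s for s, _ in counts[treatment].values())
    total = sum(n for _, n in counts[treatment].values())
    return successes / total


for group in ("small", "large"):
    print(group, round(group_rate("A", group), 3), round(group_rate("B", group), 3))
print("overall", round(overall_rate("A"), 3), round(overall_rate("B"), 3))
```

Treatment A wins within both groups, yet B wins overall, because A was given mostly to the harder (large-stone) cases.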

Slide 19

Slide 19 text

Frequentist vs Bayesian

Slide 20

Slide 20 text


Slide 21

Slide 21 text

"P-hacking" II
"When a measure becomes a target, it ceases to be a good measure" --- Goodhart's law

Slide 22

Slide 22 text

selection ≠ evaluation

Slide 23

Slide 23 text

Paper:
Prefer to call it "over-selection".
In "Learning with Kernels" by Schölkopf & Smola, exercise 5.10 is named "overfitting on the test set".
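A minimal sketch of why selection is not evaluation, under a deliberately extreme assumption: all 200 candidate "models" are pure coin flippers. Picking the best one on the test set yields a flattering score that disappears on fresh data. The setup and all names are hypothetical, not taken from the slides.

```python
import random

rng = random.Random(0)


def coin_flips(n):
    return [rng.randint(0, 1) for _ in range(n)]


def accuracy(predictions, truth):
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


n_models, n_test, n_fresh = 200, 50, 10_000
y_test = coin_flips(n_test)
y_fresh = coin_flips(n_fresh)

# Every "model" is pure noise: its predictions are independent coin flips.
test_scores = [accuracy(coin_flips(n_test), y_test) for _ in range(n_models)]

# Selection step: keep the model with the best test-set score.
selected_score = max(test_scores)

# Evaluation step: the selected model is still a coin flipper on unseen data.
fresh_score = accuracy(coin_flips(n_fresh), y_fresh)

print(f"score used for selection: {selected_score:.2f}")
print(f"score on fresh data:      {fresh_score:.2f}")
```

The gap between the two numbers is the "over-selection" effect: the score that chose the model can no longer be trusted as an estimate of its performance.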

Slide 24

Slide 24 text

Empirical Risk Minimization
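For reference, a standard textbook formulation of empirical risk minimization. The notation (loss \ell, hypothesis class \mathcal{F}, data distribution P, sample size n) is an assumption and may differ from what the slide showed.

```latex
% True risk: expected loss under the unknown data distribution P
R(f) = \mathbb{E}_{(x,y) \sim P}\bigl[\ell(f(x), y)\bigr]

% Empirical risk: average loss over the n training samples
\hat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr)

% ERM: choose the hypothesis that minimizes the empirical risk
\hat{f} = \operatorname*{arg\,min}_{f \in \mathcal{F}} \hat{R}_n(f)
```

The empirical risk approximates the true risk only to the extent that the samples are IID draws from P, which is the idealization noted under "Main points".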

Slide 25

Slide 25 text

Empirical Loss

Slide 26

Slide 26 text

Empirical Risk Minimization II

Slide 27

Slide 27 text

Bias / Variance
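For reference, the standard bias/variance decomposition of the expected squared error at a point x. It assumes y = f(x) + \varepsilon with zero-mean noise of variance \sigma^2 and a model \hat{f}_D fitted on a random training set D; this notation is an assumption, not taken from the slide.

```latex
\mathbb{E}_{D,\varepsilon}\bigl[(y - \hat{f}_D(x))^2\bigr]
  = \underbrace{\bigl(f(x) - \mathbb{E}_D[\hat{f}_D(x)]\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\bigl[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\bigr]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```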

Slide 28

Slide 28 text


Slide 29

Slide 29 text

Source:

Slide 30

Slide 30 text

Source: University of Potsdam

Slide 31

Slide 31 text

Source: University of Potsdam

Slide 32

Slide 32 text

Nested CV
From Quora:
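A minimal sketch of nested cross-validation, written independently of the linked source: the inner loop selects a hyperparameter, the outer loop evaluates the whole selection procedure on data it never saw. The toy data, the 1-D k-nearest-neighbour classifier, and all names are assumptions made for illustration.

```python
import random

rng = random.Random(0)

# Hypothetical toy data: (x, label), class 1 is shifted to larger x.
data = [(rng.gauss(label * 2.0, 1.0), label) for label in [0, 1] * 60]
rng.shuffle(data)


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return int(2 * votes > k)


def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def folds(samples, n_folds):
    """Yield (train, held_out) pairs; every sample is held out exactly once."""
    for i in range(n_folds):
        held_out = samples[i::n_folds]
        train = [s for j, s in enumerate(samples) if j % n_folds != i]
        yield train, held_out


def select_k(samples, candidates, n_folds=3):
    """Inner loop: choose k by cross-validation on the given samples only."""
    def cv_score(k):
        return sum(accuracy(tr, va, k) for tr, va in folds(samples, n_folds)) / n_folds
    return max(candidates, key=cv_score)


candidates = [1, 3, 5, 9, 15]
outer_scores = []
for outer_train, outer_test in folds(data, 5):
    best_k = select_k(outer_train, candidates)  # selection never sees outer_test
    outer_scores.append(accuracy(outer_train, outer_test, best_k))

nested_cv_estimate = sum(outer_scores) / len(outer_scores)
print(f"nested CV accuracy estimate: {nested_cv_estimate:.3f}")
```

Because k is chosen again inside every outer fold, the outer score evaluates the selection procedure as a whole rather than one lucky hyperparameter value.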

Slide 33

Slide 33 text

Messing up your experiments
• Data split strategy is part of the experiment.
• Mainly care for:
  • Class distribution
  • Problem-domain-relevant issues such as time

"Validation and test sets should model nature and nature is not accommodating." --- Data Scientist's Proverbs
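The two concerns named above, class distribution and time, correspond to two common split strategies. The sketch below uses hypothetical helper functions (`stratified_split`, `time_split`) and made-up records with a 10% minority class; it is one possible way to implement these splits, not the method used in the talk.

```python
import random
from collections import defaultdict


def stratified_split(samples, label_of, test_fraction, seed=0):
    """Split so that each class keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample in samples:
        by_class[label_of(sample)].append(sample)
    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy, so the caller's data is not reordered
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def time_split(samples, time_of, test_fraction):
    """Train on the past, test on the most recent samples (no shuffling)."""
    ordered = sorted(samples, key=time_of)
    cut = len(ordered) - round(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical records: (timestamp, label), every tenth record is positive.
records = [(t, int(t % 10 == 0)) for t in range(200)]

train, test = stratified_split(records, label_of=lambda r: r[1], test_fraction=0.25)
past, future = time_split(records, time_of=lambda r: r[0], test_fraction=0.25)

print(len(train), len(test), sum(label for _, label in test))
print(max(t for t, _ in past), min(t for t, _ in future))
```

A plain random split could leave the test set with too few positives, or let the model train on records that are newer than the ones it is tested on.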

Slide 34

Slide 34 text

"Model evaluation, model selection…" by Sebastian Raschka:
"Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms" (Dietterich 98):

Slide 35

Slide 35 text

Gallery of Fails

Slide 36

Slide 36 text

Courier/Terrorist detection in Pakistan
Source:

Slide 37

Slide 37 text

Feedback loops abused
Microsoft deployed a chat bot on Twitter for just a day. Trolls "subverted" the bot by "teaching" it to be politically incorrect through focused exposure to extreme content.

Slide 38

Slide 38 text

Moral Machine

Slide 39

Slide 39 text

Smaller tips for ML
• Always model uncertainty.
• Read this
• Don't mock values of a non-existent predictive model.

Slide 40

Slide 40 text


Slide 41

Slide 41 text

Other Links
• akes.html
• Quantopian Lecture Series: p-Hacking and Multiple Comparison bias
• David Hume: A Treatise of Human Nature:

Slide 42

Slide 42 text

Thanks! Questions?
GitHub: