Paul Smaldino
November 16, 2017

# The Natural Selection of Bad Science

Presentation given to graduate students at UC Merced in the NSF training program (NRT) on Intelligent Adaptive Systems, Nov 16, 2017.

## Transcript

1. ### The Natural Selection of Bad Science

Paul E. Smaldino, Assistant Professor, Cognitive & Information Sciences and Quantitative & Systems Biology, University of California, Merced
2. ### Counterpoint

• Oncology: 47/53 ‘landmark’ studies did not replicate (Begley & Ellis 2012, Nature)
• Neuroscience: errors in popular statistical methods imply a false positive rate of up to 70% (Eklund et al. 2016, PNAS)
• Psychology: 61/100 studies in top journals failed to replicate (p < .05) (Open Science Collaboration 2015, Science)
• Most fields? (Baker 2016, Nature)
3. ### False facts are highly injurious to the progress of science,

for they often endure long; but false views, if supported by some evidence, do little harm, for every one takes a salutary pleasure in proving their falseness; and when this is done, one path towards error is closed and the road to truth is often at the same time opened.

–Darwin 1871, The Descent of Man

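5. ### How do we find facts?

[Figure: 2. Investigation. The real truth of a hypothesis (true, T, or false, F) sets the probability of each result. A true hypothesis yields a positive result (+) with probability 1 – β and a negative result (–) with probability β; a false hypothesis yields a positive result with probability α and a negative result with probability 1 – α. Key: True (T), False (F), Unknown; Positive (+), Negative (–), General case (+ or –).]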
6. ### Assume

• Probability of false positive finding is 5%: Pr(+|F) = 0.05
• Probability of true positive finding is 50%: Pr(+|T) = 0.50

Your test yields a positive result. What is the probability this finding indicates a true hypothesis?

Most common answer: 0.95 (Eddy 1982; Gigerenzer & Hoffrage 1995)
7. ### Assume

• Probability of false positive finding is 5%: Pr(+|F) = 0.05
• Probability of true positive finding is 50%: Pr(+|T) = 0.50

Your test yields a positive result. What is the probability this finding indicates a true hypothesis?

$$\Pr(T \mid +) = \frac{\Pr(+ \mid T)\,\Pr(T)}{\Pr(+ \mid T)\,\Pr(T) + \Pr(+ \mid F)\,\Pr(F)}$$
8. ### Assume

• Probability of false positive finding is 5%: Pr(+|F) = 0.05
• Probability of true positive finding is 50%: Pr(+|T) = 0.50

Your test yields a positive result. What is the probability this finding indicates a true hypothesis?

$$\Pr(T \mid +) = \frac{\Pr(+ \mid T)\,\Pr(T)}{\Pr(+ \mid T)\,\Pr(T) + \Pr(+ \mid F)\,\Pr(F)}$$

Need base rate: Pr(T), with Pr(F) = 1 – Pr(T).
9. ### The Signal and the Noise

[Figure: true and false hypotheses.]

Base rate: proportion of hypotheses which are true
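
The question posed on the preceding slides can be checked numerically once a base rate is supplied. The sketch below applies Bayes' rule with the slide values Pr(+|T) = 0.50 and Pr(+|F) = 0.05. The function name `prob_true_given_positive` and the three base rates tried are illustrative assumptions, not part of the talk.

```python
def prob_true_given_positive(power, false_pos_rate, base_rate):
    """Pr(T|+) by Bayes' rule.

    power          = Pr(+|T)
    false_pos_rate = Pr(+|F)
    base_rate      = Pr(T); Pr(F) is taken as 1 - Pr(T)
    """
    true_pos_mass = power * base_rate
    false_pos_mass = false_pos_rate * (1.0 - base_rate)
    return true_pos_mass / (true_pos_mass + false_pos_mass)


if __name__ == "__main__":
    # Base rates are hypothetical; the slides leave Pr(T) unspecified.
    for b in (0.01, 0.1, 0.5):
        p = prob_true_given_positive(0.50, 0.05, b)
        print(f"base rate {b:.2f} -> Pr(T|+) = {p:.3f}")
```

With a base rate of 0.1, for example, Pr(T|+) = 0.05 / 0.095, or about 0.53, well below the commonly given answer of 0.95.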

13. ### Hypothesis Selection, Investigation, Communication

• 1. Hypothesis Selection: A previously tested hypothesis is selected for replication with probability r; otherwise (probability 1 – r) a novel (untested) hypothesis is selected. Novel hypotheses are true with probability b.
• 2. Investigation: The real truth of the hypothesis sets the probability of each result. A true hypothesis (T) yields a positive result (+) with probability 1 – β and a negative result (–) with probability β; a false hypothesis (F) yields a positive result with probability α and a negative result with probability 1 – α.
• 3. Communication: Experimental results are communicated to the scientific community with a probability that depends upon both the experimental result (+, –) and whether the hypothesis was novel (N) or a replication (R): C_N– for novel negative results, C_R+ for positive replications, C_R– for negative replications. Communicated results join the set of tested hypotheses. Uncommunicated results go to the file drawer, and uncommunicated replications revert to their prior status.

Key: interior = true epistemic state; exterior = experimental evidence. True (T), False (F), Unknown; Positive (+), Negative (–), General case (+ or –).

McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
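
The three stages can be sketched as a small Monte Carlo simulation. This is a minimal illustration under stated assumptions, not the analytical model from the cited paper. Each hypothesis carries a tally of communicated positive minus negative results, novel positive results are assumed to be communicated always, and all default parameter values and names (`simulate`, `c_n_neg`, `c_r_pos`, `c_r_neg`) are hypothetical.

```python
import random
from collections import defaultdict


def simulate(steps=100_000, b=0.1, r=0.2, power=0.8, alpha=0.05,
             c_n_neg=0.0, c_r_pos=1.0, c_r_neg=1.0, seed=1):
    """Run the three-stage process; return {tally: proportion of hypotheses true}."""
    rng = random.Random(seed)
    tested = []  # each entry: [is_true, tally of communicated (+) minus (-)]
    for _ in range(steps):
        # 1. Hypothesis selection: replicate with probability r, else go novel.
        replication = bool(tested) and rng.random() < r
        if replication:
            hyp = rng.choice(tested)
        else:
            hyp = [rng.random() < b, 0]
        # 2. Investigation: power (1 - beta) if true, alpha if false.
        positive = rng.random() < (power if hyp[0] else alpha)
        # 3. Communication: probability depends on result and novelty.
        if replication:
            p_comm = c_r_pos if positive else c_r_neg
        else:
            # Assumption: novel positive results are always communicated.
            p_comm = 1.0 if positive else c_n_neg
        if rng.random() < p_comm:
            hyp[1] += 1 if positive else -1
            if not replication:
                tested.append(hyp)
        # Otherwise the result stays in the file drawer and nothing changes.
    counts = defaultdict(lambda: [0, 0])  # tally -> [number true, number total]
    for is_true, tally in tested:
        counts[tally][0] += is_true
        counts[tally][1] += 1
    return {t: n_true / n for t, (n_true, n) in sorted(counts.items())}


if __name__ == "__main__":
    for tally, prop in simulate().items():
        print(f"net positive findings {tally:+d}: proportion true = {prop:.3f}")
```

Varying `b` and `alpha` in this sketch is one way to explore how strongly the base rate and the false-positive rate shape the proportion of true hypotheses at a given tally.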
18. ### Hypothesis Selection, Investigation, Communication: Recursions and Solutions

[Figure: the three-stage model diagram (1. Hypothesis Selection, 2. Investigation, 3. Communication), shown together with the model's recursions and their solutions. The equations are not reproduced in this transcript.]

McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
19. ### Proportion true hypotheses at different numbers of net positive findings

[Figure: proportion true (0 to 1) plotted against each model parameter in panels (a) to (g): base rate, replication rate, power, false-positive rate, communicate neg. rep., communicate pos. rep., communicate neg. new. Curves are labeled by number of net positive findings; the lower panels compare a pessimistic and an optimistic scenario.]

McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
20. ### Proportion true hypotheses at different numbers of net positive findings

Base rate and false-positive rate are the most important factors in avoiding false facts.

[Figure: proportion true (0 to 1) plotted against each model parameter in panels (a) to (g): base rate, replication rate, power, false-positive rate, communicate neg. rep., communicate pos. rep., communicate neg. new. Curves are labeled by number of net positive findings; the lower panels compare a pessimistic and an optimistic scenario.]

McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
21. ### False facts more common when…

• Studies are underpowered ➡ False positives and ambiguous results
• Negative results aren’t published ➡ Lower information content of literature
• Misunderstanding of statistical techniques ➡ False positives and ambiguous results
• Surprising, easily understood results easiest to publish ➡ Lowering base rate

Why Isn’t Science Better?
22. ### Incentives not aligned with best practices

(Schillebeeckx et al. 2013, Nature Biotech.)

“Part of the problem is that no-one is incentivised to be right. Instead, scientists are incentivised to be productive and innovative.” –Richard Horton, The Lancet, April 2015
23. ### Successful scientists are publishing more

More papers, more co-authorship:

• Numbers of papers at hiring for CNRS evolutionary biologists (Brischoux & Angelier 2015, Scientometrics)
• % single-authored papers in botany, zoology, ecology, genetics (Nabout et al. 2015, Scientometrics)
• Fraction of multi-author publications in physics (Wardil & Hauert 2015, Phys Rev E)
24. ### Successful scientists are publishing more

[Figure: publications per year and average impact factor] (van Dijk et al. 2014, Curr Biol)
25. ### When a measure becomes a target, it ceases to be a good measure.

Donald T. Campbell, 1916–1996

“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” –Campbell 1976
26. ### Publishing more can lead directly to some kinds of success

(Franzoni et al. 2011, Science)

Average amount paid to first author in China in 2016 (in USD) (Quan et al. 2017, Aslib J Inform Manag):

• PLOS ONE: \$984
• PNAS: \$3,513
• Nature, Science: \$43,783

Similar incentives in many other countries:

• India
• Malaysia
• Korea
• Turkey
• Venezuela
• Chile

28. ### Such a system can (and does) incentivize cheaters…

…but does not require cheating to be damaging.
29. ### An evolutionary model of science

• Population of N labs
• Each lab has characteristic methodological power, Pr(+|T)
• Increasing power also increases false positives, unless effort is exerted
• Effort increases the time between results
• Novel negative results tough to publish
• Labs that publish more are more likely to have their methods “reproduced” in new labs
• Two phases: (1) Science, (2) Evolution

Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
30. ### Phase 1: Science

Hypothesis Selection

• New hypothesis tackled with probability inversely proportionate to effort
• Novel hypotheses true at rate b = 0.1
• [Figure: probability of new study (0.5 to 1) vs. effort (0 to 100)]

Investigation

• Always yields a positive (+) or negative (–) result
• Power: W = Pr(+|T)
• False positive rate is a function of power and effort
• [Figure: false positive rate (0 to 1) vs. power (0 to 1), with curves for e = 1, e = 10, and e = 75 and an arrow marking increased effort]

Communication

• Novel positive results can always be published
• Novel negative results unlikely to be published
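
The transcript omits the equation itself, so the sketch below uses functional forms that are assumptions. They are chosen to be consistent with the plotted curves (e = 1, 10, 75) and with the statement that raising power raises false positives unless effort is exerted, and they should be checked against Smaldino & McElreath (2016). The names `false_positive_rate`, `prob_new_study`, and the constant `eta = 0.2` are illustrative.

```python
import math


def false_positive_rate(power, effort):
    """Assumed form: alpha = W / (1 + (1 - W) * e).

    Rises with power W and falls as effort e increases.
    """
    return power / (1.0 + (1.0 - power) * effort)


def prob_new_study(effort, eta=0.2):
    """Assumed form: h(e) = 1 - eta * log10(e), for effort e >= 1.

    More effort means a lower chance of starting a new study each time step.
    """
    return 1.0 - eta * math.log10(effort)


if __name__ == "__main__":
    for e in (1, 10, 75):
        a = false_positive_rate(0.8, e)
        h = prob_new_study(e)
        print(f"e = {e:>2}: false positive rate = {a:.3f}, Pr(new study) = {h:.3f}")
```

Under these assumed forms, a lab with power 0.8 and effort 75 has a false positive rate of about 0.05, while the same power at effort 1 gives about 0.67.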
31. ### Phase 2: Evolution

1. From a randomly selected group of size d, the oldest “dies.”
2. From another randomly selected group of size d, the lab with the highest accumulated payoff “reproduces,” transmitting its methods to its “offspring” with mutation.

Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
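
The two evolution steps can be sketched as a single death-birth update. This is an illustration under assumptions rather than the authors' implementation. The `Lab` record, drawing the second group from the survivors, Gaussian mutation on effort only, and the floor of 1 on effort are all hypothetical choices.

```python
import random
from dataclasses import dataclass


@dataclass
class Lab:
    effort: float
    age: int = 0
    payoff: float = 0.0


def evolve(labs, d, rng, mutation_sd=1.0):
    """One death-birth step; returns a new list of the same length."""
    # 1. Death: the oldest lab in a random group of size d is removed.
    group = rng.sample(range(len(labs)), min(d, len(labs)))
    dead = max(group, key=lambda i: labs[i].age)
    survivors = [lab for i, lab in enumerate(labs) if i != dead]
    # 2. Birth: the highest-payoff lab in another random group reproduces.
    #    Assumption: this group is drawn from the survivors.
    group = rng.sample(survivors, min(d, len(survivors)))
    parent = max(group, key=lambda lab: lab.payoff)
    # Assumption: the offspring inherits effort with Gaussian mutation, floored at 1.
    child_effort = max(1.0, rng.gauss(parent.effort, mutation_sd))
    return survivors + [Lab(effort=child_effort)]


if __name__ == "__main__":
    rng = random.Random(0)
    population = [Lab(effort=75.0, age=rng.randrange(50), payoff=rng.random())
                  for _ in range(100)]
    population = evolve(population, d=10, rng=rng)
    print(len(population), population[-1])
```

Because selection acts only on accumulated payoff, nothing in this update rewards accuracy directly. Methods spread if they help a lab publish more.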
32. ### The natural selection of bad science

[Figure panels: power evolves, constant effort; effort evolves, constant power]

Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.

35. ### Statistical power to detect small effects in the social + behavioral sciences

[Figure: statistical power (0 to 1) by year, 1955 to 2015; R² = 0.00097; mean power = 0.24] (Smaldino & McElreath 2016)

Szucs & Ioannidis 2017: 100,000+ statistical tests from ~10,000 papers in psych, cog. neuro, and medical journals (2011-2014)
36. ### Replication to the Rescue?

Adding replication to the model:

- Labs replicate tests of previously published hypotheses at rate r
- All replications are publishable, worth 50% prestige of novel finding (0.5 points)
- Successful replication boosts original authors’ prestige (+0.1 points)
- Failed replication severely damages original authors’ prestige (–100 points)
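
The payoff rules above can be written as a small bookkeeping function. The point values come from the slide; the dictionary of payoffs keyed by lab name, the function name `score_replication`, and the unit value of a novel finding are assumptions made for illustration.

```python
NOVEL_PAYOFF = 1.0                       # assumed unit value of a novel finding
REPLICATION_PAYOFF = 0.5 * NOVEL_PAYOFF  # replications worth 50% of a novel finding
SUCCESS_BONUS = 0.1                      # original authors, successful replication
FAILURE_PENALTY = -100.0                 # original authors, failed replication


def score_replication(payoffs, replicator, original, replicated_ok):
    """Update a dict of lab -> payoff after one published replication."""
    payoffs[replicator] = payoffs.get(replicator, 0.0) + REPLICATION_PAYOFF
    change = SUCCESS_BONUS if replicated_ok else FAILURE_PENALTY
    payoffs[original] = payoffs.get(original, 0.0) + change
    return payoffs


if __name__ == "__main__":
    # Hypothetical labs "A" (original authors) and "B" (replicators).
    print(score_replication({"A": 3.0}, "B", "A", replicated_ok=False))
```

The asymmetry is deliberate in the slide's setup: a single failed replication outweighs many publications.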
37. ### Replication ≠ Salvation

Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
38. ### Take homes

• Incentives to boost quantitative metrics can lead to the degradation of research methods
• Requires no fraud or ill intent, only that successful individuals transmit their methods
• Changing individual behavior is not enough; improving science requires institutional change
• This is unlikely to be easy or happen quickly. But some promising changes are already happening