
The Natural Selection of Bad Science

Paul Smaldino
November 16, 2017


Presentation given to graduate students at UC Merced in the NSF training program (NRT) on Intelligent Adaptive Systems, Nov 16, 2017.


Transcript

  1. The Natural Selection of Bad Science. Paul E. Smaldino, Assistant Professor, Cognitive & Information Sciences, Quantitative & Systems Biology, University of California, Merced
  2. Counterpoint:

    • Oncology: 47/53 ‘landmark’ studies did not replicate (Begley & Ellis 2012, Nature)
    • Neuroscience: Errors in popular statistical methods imply a false positive rate of up to 70% (Eklund et al. 2016, PNAS)
    • Psychology: 61/100 studies in top journals failed to replicate (p < .05) (Open Science Collaboration 2015, Science)
    • Most fields? (Baker 2016, Nature)
  3. “False facts are highly injurious to the progress of science, for they often endure long; but false views, if supported by some evidence, do little harm, for every one takes a salutary pleasure in proving their falseness; and when this is done, one path towards error is closed and the road to truth is often at the same time opened.” –Darwin 1871, The Descent of Man
  4. How do we find facts? [Diagram: 2. Investigation. Given the real truth of the hypothesis, a true hypothesis (T) yields a positive result (+) with probability 1 – β and a negative result (–) with probability β; a false hypothesis (F) yields a positive result with probability α and a negative result with probability 1 – α.]
  5. Assume: Probability of false positive finding is 5%, Pr(+|F) = 0.05. Probability of true positive finding is 50%, Pr(+|T) = 0.50. Your test yields a positive result. What is the probability this finding indicates a true hypothesis? Most common answer: 0.95 (Eddy 1982; Gigerenzer & Hoffrage 1995)
  6. Assume: Probability of false positive finding is 5%, Pr(+|F) = 0.05. Probability of true positive finding is 50%, Pr(+|T) = 0.50. Your test yields a positive result. What is the probability this finding indicates a true hypothesis?

    $$\Pr(T \mid +) = \frac{\Pr(+ \mid T)\,\Pr(T)}{\Pr(+ \mid T)\,\Pr(T) + \Pr(+ \mid F)\,\Pr(F)}$$
  7. Need base rate: with the same assumptions and question as slide 6, the formula cannot be evaluated without Pr(T), the base rate of true hypotheses (Pr(F) = 1 – Pr(T)).
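The calculation on slides 5–7 can be sketched in a few lines of Python. This is an illustrative sketch, not something taken from the deck: the function name `posterior_true` and the three base rates tried are assumptions chosen for the example (0.1 happens to match the base rate b = 0.1 used on slide 24).

```python
def posterior_true(power, false_pos, base_rate):
    """Pr(T|+) by Bayes' rule.

    power     = Pr(+|T), probability of a true positive
    false_pos = Pr(+|F), probability of a false positive
    base_rate = Pr(T), prior probability that a hypothesis is true
    """
    true_pos_mass = power * base_rate
    false_pos_mass = false_pos * (1.0 - base_rate)
    return true_pos_mass / (true_pos_mass + false_pos_mass)


# Slide values: Pr(+|T) = 0.50, Pr(+|F) = 0.05; the base rates are illustrative.
for b in (0.01, 0.1, 0.5):
    print(f"base rate {b:.2f} -> Pr(T|+) = {posterior_true(0.50, 0.05, b):.3f}")
# base rate 0.01 -> Pr(T|+) = 0.092
# base rate 0.10 -> Pr(T|+) = 0.526
# base rate 0.50 -> Pr(T|+) = 0.909
```

Under these numbers the intuitive answer of 0.95 is reached only when the base rate is well above one half, which is why the base rate has to be known before the question can be answered.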
  9. 1. Hypothesis Selection: A previously tested hypothesis is selected for replication with probability r, otherwise a novel (untested) hypothesis is selected. Novel hypotheses are true with probability b.

    2. Investigation: Given the real truth of the hypothesis, a true hypothesis (T) yields a positive result (+) with probability 1 – β and a negative result (–) with probability β; a false hypothesis (F) yields a positive result with probability α and a negative result with probability 1 – α.

    3. Communication: Experimental results are communicated to the scientific community with a probability that depends upon both the experimental result (+, –) and whether the hypothesis was novel (N) or a replication (R). Communicated results join the set of tested hypotheses. Uncommunicated replications revert to their prior status. A novel negative result is communicated with probability C_N–, a positive replication with probability C_R+, and a negative replication with probability C_R–; results that are not communicated go to the file drawer.

    Key: Interior = true epistemic state (True (T), False (F), Unknown, General case). Exterior = experimental evidence (Positive (+), Negative (–), General case (+ or –)).

    McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
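The three steps on this slide can be sketched as a small Monte Carlo simulation. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the uniform random choice among previously tested hypotheses, the rule that novel positive results are always communicated, and every parameter value in the usage lines are assumptions made for the example.

```python
import random
from collections import defaultdict


def simulate(steps, r, b, power, alpha, c_neg_new, c_pos_rep, c_neg_rep, seed=0):
    """Return tested hypotheses as [is_true, tally] pairs, where tally is
    communicated positive results minus communicated negative results."""
    rng = random.Random(seed)
    tested = []  # hypotheses with at least one communicated result
    for _ in range(steps):
        # 1. Hypothesis selection: replicate with probability r, else go novel.
        replicate = bool(tested) and rng.random() < r
        if replicate:
            hyp = rng.choice(tested)  # ASSUMPTION: uniform choice among tested
        else:
            hyp = [rng.random() < b, 0]  # novel hypothesis, true with probability b

        # 2. Investigation: power = 1 - beta for true, alpha for false hypotheses.
        positive = rng.random() < (power if hyp[0] else alpha)

        # 3. Communication: probability depends on result and novelty.
        if replicate:
            p_comm = c_pos_rep if positive else c_neg_rep
        else:
            # ASSUMPTION: novel positive results are always communicated.
            p_comm = 1.0 if positive else c_neg_new
        if rng.random() < p_comm:
            hyp[1] += 1 if positive else -1
            if not replicate:
                tested.append(hyp)
        # Otherwise the result goes to the file drawer and nothing changes.
    return tested


def proportion_true_by_tally(tested):
    """Map each tally value to the share of hypotheses at that tally that are true."""
    counts = defaultdict(lambda: [0, 0])  # tally -> [number true, number total]
    for is_true, tally in tested:
        counts[tally][0] += int(is_true)
        counts[tally][1] += 1
    return {t: n_true / n for t, (n_true, n) in sorted(counts.items())}


# Illustrative parameter values only; none are taken from the slides.
tested = simulate(steps=50_000, r=0.2, b=0.1, power=0.8, alpha=0.05,
                  c_neg_new=0.1, c_pos_rep=0.5, c_neg_rep=0.5)
for tally, share in proportion_true_by_tally(tested).items():
    print(tally, round(share, 3))
```

Grouping hypotheses by their tally of net positive findings mirrors the quantity reported in the talk, the proportion of true hypotheses at different numbers of net positive findings.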
  14. Recursions and solutions for the model on slide 9 [equations not preserved in the transcript]. McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
  15. Proportion true hypotheses at different numbers of net positive findings [Figure: panels (a)–(g) plot proportion true against base rate, replication rate, power, false-positive rate, communicate neg. rep., communicate pos. rep., and communicate neg. new, with curves labeled 1, 3, and 5, under an optimistic and a pessimistic scenario.] McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
  16. Base rate and false-positive rate are the most important factors in avoiding false facts. [Figure: proportion true hypotheses at different numbers of net positive findings, as on slide 15.] McElreath R & Smaldino PE (2015) Replication, communication, and the population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
  17. Why Isn’t Science Better? False facts more common when…

    • Studies are underpowered ➡ False positives and ambiguous results
    • Negative results aren’t published ➡ Lower information content of literature
    • Misunderstanding of statistical techniques ➡ False positives and ambiguous results
    • Surprising, easily understood results easiest to publish ➡ Lowering base rate
  18. Incentives not aligned with best practices (Schillebeeckx et al. 2013, Nature Biotech.). “Part of the problem is that no-one is incentivised to be right. Instead, scientists are incentivised to be productive and innovative.” –Richard Horton, The Lancet, April 2015
  19. Successful scientists are publishing more

    • Numbers of papers at hiring for CNRS evolutionary biologists (Brischoux & Angelier 2015, Scientometrics)
    • More papers, more co-authorship: % single-authored papers in botany, zoology, ecology, and genetics (Nabout et al. 2015, Scientometrics); fraction of multi-author pubs in physics (Wardil & Hauert 2015, Phys Rev E)
  20. Successful scientists are publishing more [Figure: publications per year and average impact factor] (van Dijk et al. 2014, Curr Biol)
  21. When a measure becomes a target, it ceases to be a good measure.

    “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” –Campbell 1976 (Donald T. Campbell, 1916–1996)
  22. Publishing more can lead directly to some kinds of success (Franzoni et al. 2011, Science)

    Average amount paid to first author in China in 2016, in USD (Quan et al. 2017, Aslib J Inform Manag):
    • PLOS ONE: $984
    • PNAS: $3,513
    • Nature, Science: $43,783

    Similar incentives in many other countries: India, Malaysia, Korea, Turkey, Venezuela, Chile
  23. An evolutionary model of science

    • Population of N labs
    • Each lab has characteristic methodological power, Pr(+|T)
    • Increasing power also increases false positives, unless effort is exerted
    • Effort increases the time between results
    • Novel negative results tough to publish
    • Labs that publish more are more likely to have their methods “reproduced” in new labs
    • Two phases: (1) Science, (2) Evolution

    Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
  24. Phase 1: Science

    Hypothesis Selection
    • New hypothesis tackled with probability inversely proportionate to effort [Plot: probability of new study vs. effort]
    • Novel hypotheses true at rate b = 0.1

    Investigation
    • Always yields a positive (+) or negative (–) result
    • Power: W = Pr(+|T)
    • False positive rate is a function of power and effort [Plot: false positive rate vs. power for e = 1, e = 10, and e = 75, with an arrow marking increased effort]

    Communication
    • Novel positive results can always be published
    • Novel negative results unlikely to be published
  25. Phase 2: Evolution

    1. From a randomly selected group of size d, the oldest “dies.”
    2. From another randomly selected group of size d, the lab with the highest accumulated payoff “reproduces,” transmitting its methods to its “offspring” with mutation

    Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
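The two evolution steps on this slide can be sketched as a single death-birth update. This is a minimal sketch under stated assumptions rather than the published model code: the `Lab` fields, the function name `evolve_step`, Gaussian mutation applied to effort only, the effort bounds, and drawing both groups from the full current population are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Lab:
    power: float          # W = Pr(+|T)
    effort: float
    age: int = 0
    payoff: float = 0.0   # accumulated payoff from publications


def evolve_step(labs, d, rng, mutation_sd=1.0, min_effort=1.0, max_effort=100.0):
    """Apply one death-birth event and return the new population list."""
    # 1. Death: from a random group of size d, the oldest lab dies.
    death_group = rng.sample(range(len(labs)), d)
    dying = max(death_group, key=lambda i: labs[i].age)

    # 2. Birth: from another random group of size d, the lab with the highest
    #    accumulated payoff reproduces.
    # ASSUMPTION: the group is drawn from the full population, so the lab
    # that dies in this step can still be sampled as the parent.
    birth_group = rng.sample(range(len(labs)), d)
    parent = labs[max(birth_group, key=lambda i: labs[i].payoff)]

    # ASSUMPTION: only effort mutates, by Gaussian noise clamped to a range.
    child_effort = parent.effort + rng.gauss(0.0, mutation_sd)
    child_effort = min(max_effort, max(min_effort, child_effort))
    child = Lab(power=parent.power, effort=child_effort)

    new_labs = list(labs)
    new_labs[dying] = child  # the offspring takes the dead lab's slot
    return new_labs


# Hypothetical population: ten labs with random ages and payoffs.
rng = random.Random(42)
labs = [Lab(power=0.8, effort=75.0, age=rng.randrange(50), payoff=rng.random() * 10)
        for _ in range(10)]
labs = evolve_step(labs, d=4, rng=rng)
print(len(labs), min(lab.age for lab in labs))
```

Because the offspring replaces the dead lab, the population size stays fixed, and methods spread only through the labs that accumulate the highest payoff.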
  26. The natural selection of bad science [Figure panels: Power evolves, constant effort; Effort evolves, constant power] Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
  27. Statistical power to detect small effects in the social + behavioral sciences: mean power = 0.24, R² = 0.00097 (Smaldino & McElreath 2016) [Plot: statistical power (0 to 1) by year, 1955 to 2015]

    Szucs & Ioannidis 2017: 100,000+ statistical tests from ~10,000 papers in psych, cog. neuro, and medical journals (2011–2014)
  28. Replication to the Rescue? Adding replication to the model

    - Labs replicate tests of previously published hypotheses at rate r
    - All replications are publishable, worth 50% prestige of novel finding (0.5 points)
    - Successful replication boosts original authors’ prestige (+0.1 points)
    - Failed replication severely damages original authors’ prestige (–100 points)
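The payoff rules on this slide amount to a small piece of arithmetic, sketched below. The function name `apply_replication` and its argument names are hypothetical; the point values come from the slide, and the 1-point value of a novel finding is only implied by "50% ... (0.5 points)".

```python
REPLICATION_PAYOFF = 0.5   # a published replication is worth half a novel finding
SUCCESS_BONUS = 0.1        # original authors gain when their result replicates
FAILURE_PENALTY = -100.0   # original authors lose heavily when it does not


def apply_replication(replicator_payoff, original_payoff, confirmed):
    """Return updated (replicator_payoff, original_payoff) after one replication.

    confirmed is True when the replication agrees with the original finding.
    """
    replicator_payoff += REPLICATION_PAYOFF
    original_payoff += SUCCESS_BONUS if confirmed else FAILURE_PENALTY
    return replicator_payoff, original_payoff


# A lab with one novel finding (1 point) is replicated by a lab starting at 0.
print(apply_replication(0.0, 1.0, confirmed=True))   # (0.5, 1.1)
print(apply_replication(0.0, 1.0, confirmed=False))  # (0.5, -99.0)
```

With these values, one failed replication cancels the prestige gained from 1,000 successful ones (100 / 0.1).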
  29. Replication ≠ Salvation. Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
  30. Take homes

    • Incentives to boost quantitative metrics can lead to the degradation of research methods
    • Requires no fraud or ill intent, only that successful individuals transmit their methods
    • Changing individual behavior not enough; improving science requires institutional change
    • This is unlikely to be easy or happen quickly. But some promising changes are already happening
  31. Science is still awesome. “The first principle is that you must not fool yourself, and you are the easiest person to fool.”