The Natural Selection of Bad Science

Paul Smaldino
November 16, 2017

Presentation given to graduate students at UC Merced in the NSF training program (NRT) on Intelligent Adaptive Systems, Nov 16, 2017.

Transcript

  1. The Natural Selection of Bad Science
    Paul E. Smaldino
    Assistant Professor, Cognitive & Information Sciences
    Quantitative & Systems Biology
    University of California, Merced

  3. Counterpoint:
    Oncology: 47/53 ‘landmark’ studies did not replicate (Begley & Ellis 2012, Nature)
    Neuroscience: Errors in popular statistical methods imply false positive rate of up to 70% (Eklund et al. 2016, PNAS)
    Psychology: 61/100 studies in top journals failed to replicate (p < .05) (Open Science Collaboration 2015, Science)
    Most fields? (Baker 2016, Nature)

  4. False facts are highly injurious to
    the progress of science, for they
    often endure long; but false
    views, if supported by some
    evidence, do little harm, for every
    one takes a salutary pleasure in
    proving their falseness; and when
    this is done, one path towards
    error is closed and the road to
    truth is often at the same time
    opened.
    –Darwin 1871 The Descent of Man

  5. Science as Signal Detection for Facts

  6. How do we find facts?
    2. Investigation
    Probability of result (rows) given the real truth of the hypothesis (columns):
                      T        F
      positive (+)    1 – β    α
      negative (–)    β        1 – α
    Diagram legend, epistemic state: True (T), False (F), Unknown, General case
    Diagram legend, result: Positive (+), Negative (–), General case (+ or –)

  7. Assume
    Probability of false positive finding is 5%. Pr(+|F) = 0.05
    Probability of true positive finding is 50%. Pr(+|T) = 0.50
    Your test yields a positive result. What is the probability this
    finding indicates a true hypothesis?
    Most common answer: 0.95
    (Eddy 1982; Gigerenzer & Hoffrage 1995)

  8. Pr(T|+) = Pr(+|T) Pr(T) / [Pr(+|T) Pr(T) + Pr(+|F) Pr(F)]
    Assume
    Probability of false positive finding is 5%. Pr(+|F) = 0.05
    Probability of true positive finding is 50%. Pr(+|T) = 0.50
    Your test yields a positive result. What is the probability this
    finding indicates a true hypothesis?

  9. Pr(T|+) = Pr(+|T) Pr(T) / [Pr(+|T) Pr(T) + Pr(+|F) Pr(F)]
    Need base rate
    Assume
    Probability of false positive finding is 5%. Pr(+|F) = 0.05
    Probability of true positive finding is 50%. Pr(+|T) = 0.50
    Your test yields a positive result. What is the probability this
    finding indicates a true hypothesis?
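
The dependence on the base rate can be made concrete with a short calculation. The sketch below applies the formula above with the slide's values Pr(+|T) = 0.50 and Pr(+|F) = 0.05. The function name and the three base rates are illustrative assumptions, not values from the talk.

```python
def prob_true_given_positive(base_rate, power=0.50, false_positive_rate=0.05):
    """Bayes' rule: Pr(T|+) from Pr(T), Pr(+|T), and Pr(+|F)."""
    true_positives = power * base_rate                          # Pr(+|T) Pr(T)
    false_positives = false_positive_rate * (1.0 - base_rate)   # Pr(+|F) Pr(F)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Base rates chosen for illustration only (assumption).
    for base_rate in (0.5, 0.1, 0.01):
        ppv = prob_true_given_positive(base_rate)
        print(f"Pr(T) = {base_rate}: Pr(T|+) = {ppv:.3f}")
```

With a base rate of 0.1 the result is roughly 0.53, well below the commonly given answer of 0.95. The answer approaches 0.95 only if most tested hypotheses are true to begin with.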

  10. The Signal and the Noise
    true false
    base rate: proportion of hypotheses which are true

  11. The Signal and the Noise
    Positive
    Negative
    Pr(true|+) = 0.5

  12. The Signal and the Noise
    Positive
    Negative

  14. 1. Hypothesis Selection
    A previously tested hypothesis is selected for replication with
    probability r, otherwise a novel (untested) hypothesis is selected
    (probability 1 – r). Novel hypotheses are true with probability b.
    2. Investigation
    Probability of result (rows) given the real truth of the hypothesis (columns):
                      T        F
      positive (+)    1 – β    α
      negative (–)    β        1 – α
    3. Communication
    Experimental results are communicated to the scientific community with a
    probability that depends upon both the experimental result (+, –) and
    whether the hypothesis was novel (N) or a replication (R). Communicated
    results join the set of tested hypotheses. Uncommunicated replications
    revert to their prior status.
    New result communicated: C_N– (novel negative), C_R+ (replication positive), C_R– (replication negative)
    New result not communicated: 1 – C_N–, 1 – C_R+, 1 – C_R–
    Diagram labels: Novel hypotheses, Tested hypotheses, positive results, negative results, novel, replic., File drawer
    KEY
    Interior = true epistemic state: True (T), False (F), Unknown, General case
    Exterior = experimental evidence: Positive (+), Negative (–), General case (+ or –)
    McElreath R & Smaldino PE (2015) Replication, communication, and the
    population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.
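
The three stages above can be read as a simple loop. The following is a minimal simulation sketch of that loop, not the analysis in the cited paper, which (per a later slide) works with recursions and their solutions. Function and parameter names are hypothetical. Two assumptions are not stated on the slide: novel positive results are always communicated, and the tally of a hypothesis is its communicated positive results minus its communicated negative results.

```python
import random


def simulate(steps, r, b, power, alpha, c_n_neg, c_r_pos, c_r_neg, seed=0):
    """Run the select / investigate / communicate loop.

    Returns a list of (is_true, tally) pairs, one per tested hypothesis.
    """
    rng = random.Random(seed)
    tested = []  # each entry is [is_true, tally]

    for _ in range(steps):
        # 1. Hypothesis selection: replicate with probability r, else go novel.
        replicate = bool(tested) and rng.random() < r
        if replicate:
            hypothesis = rng.choice(tested)
            is_true = hypothesis[0]
        else:
            is_true = rng.random() < b  # novel hypotheses are true with prob. b

        # 2. Investigation: Pr(+|T) = power = 1 - beta, Pr(+|F) = alpha.
        positive = rng.random() < (power if is_true else alpha)

        # 3. Communication: probability depends on result and novelty.
        if replicate:
            p_communicate = c_r_pos if positive else c_r_neg
        else:
            # Assumption: novel positives are always communicated.
            p_communicate = 1.0 if positive else c_n_neg
        if rng.random() >= p_communicate:
            continue  # not communicated, so the literature is unchanged

        change = 1 if positive else -1
        if replicate:
            hypothesis[1] += change
        else:
            tested.append([is_true, change])

    return [(h[0], h[1]) for h in tested]


def proportion_true(results, tally):
    """Share of hypotheses with the given tally that are actually true."""
    matching = [is_true for is_true, t in results if t == tally]
    return sum(matching) / len(matching) if matching else None


if __name__ == "__main__":
    # Parameter values are illustrative assumptions.
    results = simulate(20000, r=0.2, b=0.1, power=0.8, alpha=0.05,
                       c_n_neg=0.0, c_r_pos=1.0, c_r_neg=1.0)
    for tally in (1, 2, 3):
        print(tally, proportion_true(results, tally))
```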

  19. 1. Hypothesis Selection, 2. Investigation, 3. Communication
    [Model diagram and text repeated from slide 14]
    Recursions:
    Solutions:
    McElreath R & Smaldino PE (2015) Replication, communication, and the
    population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.

  20. Proportion true hypotheses at different numbers of net positive findings
    [Figure: proportion true (y-axis) plotted against each model parameter, with curves labeled by number of net positive findings, for optimistic and pessimistic scenarios. Panels: (a) base rate, (b) replication rate, (c) power, (d) false-positive rate, (e) communicate neg. rep., (f) communicate pos. rep., (g) communicate neg. new.]
    McElreath R & Smaldino PE (2015) Replication, communication, and the
    population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.

  21. Proportion true hypotheses at different numbers of net positive findings
    [Figure: proportion true (y-axis) plotted against each model parameter. Panels: (a) base rate, (b) replication rate, (c) power, (d) false-positive rate, (e) communicate neg. rep., (f) communicate pos. rep., (g) communicate neg. new.]
    Base rate and false-positive rate are the
    most important factors in avoiding false facts
    McElreath R & Smaldino PE (2015) Replication, communication, and the
    population dynamics of scientific discovery. PLOS ONE 10(8):e0136088.

  22. Why Isn’t Science Better?
    False facts more common when…
    • Studies are underpowered
    ➡ False positives and ambiguous results
    • Negative results aren’t published
    ➡ Lower information content of literature
    • Misunderstanding of statistical techniques
    ➡ False positives and ambiguous results
    • Surprising, easily understood results easiest to publish
    ➡ Lowering base rate

  23. Incentives not aligned with best practices
    (Schillebeeckx et al. 2013, Nature Biotech.)
    “Part of the problem is that no-one is
    incentivised to be right. Instead,
    scientists are incentivised to be
    productive and innovative.”
    –Richard Horton
    The Lancet
    April 2015

  24. Successful scientists are publishing more
    Numbers of papers at hiring for CNRS evolutionary biologists (Brischoux & Angelier 2015, Scientometrics)
    More papers, more co-authorship
    % single-authored papers in botany, zoology, ecology, genetics (Nabout et al. 2015, Scientometrics)
    Fraction multi-author pubs in physics (Wardil & Hauert 2015, Phys Rev E)

  25. Successful scientists are publishing more
    [Figure axes: publications per year; average impact factor] (van Dijk et al. 2014, Curr Biol)

  26. When a measure becomes a target,
    it ceases to be a good measure.
    Donald T. Campbell
    1916–1996
    The more any quantitative social
    indicator is used for social decision-
    making, the more subject it will be to
    corruption pressures and the more
    apt it will be to distort and corrupt
    the social processes it is intended to
    monitor.
    –Campbell 1976

  27. Publishing more can lead directly to some kinds of success
    (Franzoni et al. 2011, Science)
    Average amount paid to first author in China in 2016 (in USD)
    (Quan et al. 2017, Aslib J Inform Manag)
    PLOS ONE: $984
    PNAS: $3,513
    Nature, Science: $43,783
    Similar incentives in many other countries
    • India
    • Malaysia
    • Korea
    • Turkey
    • Venezuela
    • Chile

  28. (Vinkers et al. 2015, BMJ)
    Relative frequency in PubMed abstracts, 1975-2014

  29. Such a system can (and does) incentivize cheaters…
    …but does not require cheating to be damaging.

  30. An evolutionary model of science
    • Population of N labs
    • Each lab has characteristic methodological power, Pr(+|T)
    • Increasing power also increases false positives, unless effort is
    exerted
    • Effort increases the time between results
    • Novel negative results tough to publish
    • Labs that publish more are more likely to have their methods
    “reproduced” in new labs
    • Two phases: (1) Science, (2) Evolution
    Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.

  31. Phase 1: Science
    Hypothesis Selection
    • New hypothesis tackled with probability inversely proportionate to effort
    • Novel hypotheses true at rate b = 0.1
    [Plot: probability of new study (0.5 to 1) vs. effort (0 to 100)]
    Investigation
    • Always yields a positive (+) or negative (–) result
    • Power: W = Pr(+|T)
    • False positive rate is a function of power and effort:
    [Plot: false positive rate (0 to 1) vs. power (0 to 1), with curves for e = 1, e = 10, e = 75; arrow labeled "increased effort"]
    Communication
    • Novel positive results can always be published
    • Novel negative results unlikely to be published
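
The transcript preserves these two relationships only as plot residue. The sketch below uses one pair of functional forms that is consistent with the bullets: false positives rise with power and fall with effort, and the chance of starting a new study falls with effort. Both forms and the constant eta = 0.2 are assumptions for illustration. See Smaldino & McElreath (2016) for the forms actually used in the model.

```python
import math


def false_positive_rate(power, effort):
    """Assumed form: alpha = W / (1 + (1 - W) * e), with W = power, e = effort."""
    return power / (1.0 + (1.0 - power) * effort)


def prob_new_study(effort, eta=0.2):
    """Assumed form: h(e) = 1 - eta * log10(e), defined here for effort e >= 1."""
    return 1.0 - eta * math.log10(effort)


if __name__ == "__main__":
    # Effort levels taken from the plot legend (e = 1, 10, 75).
    for effort in (1, 10, 75):
        alpha = false_positive_rate(0.8, effort)
        h = prob_new_study(effort)
        print(f"e = {effort}: alpha at W = 0.8 is {alpha:.3f}, Pr(new study) = {h:.3f}")
```

Under these assumed forms, a lab can hold power fixed and still lower its false positive rate by raising effort, at the cost of producing results less often.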

  32. Phase 2: Evolution
    1. From a randomly selected group of size d, the oldest “dies.”
    2. From another randomly selected group of size d, the lab with the
    highest accumulated payoff “reproduces,” transmitting its methods to
    its “offspring” with mutation
    Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.
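
The two steps of the evolution phase can be sketched as a single death-birth update. This is an illustration under stated assumptions, not the paper's implementation. The `Lab` fields, the Gaussian mutation applied to both traits, the mutation size, and drawing the second group from the survivors are all choices made here for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Lab:
    power: float
    effort: float
    age: int = 0
    payoff: float = 0.0


def evolve(labs, d, rng, mutation_sd=0.01):
    """One death-birth step; requires len(labs) > d. Returns a new list."""
    # 1. From a random group of size d, the oldest lab "dies".
    dying = max(rng.sample(range(len(labs)), d), key=lambda i: labs[i].age)
    survivors = labs[:dying] + labs[dying + 1:]

    # 2. From another random group of size d, the lab with the highest
    #    accumulated payoff "reproduces".
    parent = max(rng.sample(survivors, d), key=lambda lab: lab.payoff)

    # The offspring copies the parent's methods with a small mutation
    # (Gaussian noise is an assumption), starting at age 0 with payoff 0.
    offspring = Lab(
        power=min(1.0, max(0.0, parent.power + rng.gauss(0.0, mutation_sd))),
        effort=max(1.0, parent.effort + rng.gauss(0.0, mutation_sd)),
    )
    return survivors + [offspring]


if __name__ == "__main__":
    rng = random.Random(0)
    labs = [Lab(power=0.8, effort=75.0, age=i, payoff=float(i)) for i in range(10)]
    labs = evolve(labs, d=5, rng=rng)
    print(len(labs), labs[-1])
```

Nothing in this update looks at whether a lab's findings are true. Selection acts only on accumulated payoff, so methods spread whenever they raise output.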

  33. The natural selection of bad science
    [Figure panels: Power evolves, constant effort; Effort evolves, constant power]
    Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.

  34. 2016
    1967

  35. 1962
    1989
    1990

  36. Statistical power to detect small effects in
    the social + behavioral sciences
    [Plot: statistical power (0 to 1) vs. year (1955 to 2015); R² = 0.00097; mean power = 0.24]
    (Smaldino & McElreath 2016)
    Szucs & Ioannidis 2017
    100,000+ statistical tests from
    ~10,000 papers in psych, cog.
    neuro, and medical journals
    (2011-2014)

  37. Replication to the Rescue?
    Adding replication to the model
    - Labs replicate tests of previously
    published hypotheses at rate r
    - All replications are publishable, worth
    50% prestige of novel finding (0.5 points)
    - Successful replication boosts original
    authors’ prestige (+0.1 points)
    - Failed replication severely damages
    original authors’ prestige (–100 points)
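
The payoff scheme above is lopsided, which a short calculation makes visible. The sketch below encodes the point values from the slide. The constant and function names are hypothetical, and the value of 1.0 for a novel finding is an assumption inferred from "50% prestige of novel finding (0.5 points)".

```python
NOVEL_FINDING = 1.0        # assumption: implied by replications being worth 50%
REPLICATION = 0.5          # payoff to the lab that publishes a replication
ORIGINAL_CONFIRMED = 0.1   # change for original authors after a successful replication
ORIGINAL_REFUTED = -100.0  # change for original authors after a failed replication


def replication_payoffs(replication_succeeded):
    """Return (payoff to replicating lab, payoff change for original authors)."""
    if replication_succeeded:
        return REPLICATION, ORIGINAL_CONFIRMED
    return REPLICATION, ORIGINAL_REFUTED


def expected_change_for_original(p_success):
    """Expected payoff change for original authors, given Pr(replication succeeds)."""
    return p_success * ORIGINAL_CONFIRMED + (1.0 - p_success) * ORIGINAL_REFUTED


if __name__ == "__main__":
    for p_success in (0.5, 0.9, 0.999):
        print(p_success, expected_change_for_original(p_success))
```

Under these numbers the original authors break even only when a replication succeeds with probability 100/100.1, about 0.999. The punishment for a failed replication is therefore very strong.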

  38. Replication ≠ Salvation
    Smaldino PE & McElreath R (2016) The natural selection of bad science. Royal Society Open Science 3: 160384.

  39. Take homes
    • Incentives to boost quantitative metrics can lead to
    the degradation of research methods
    • Requires no fraud or ill intent, only that successful
    individuals transmit their methods
    • Changing individual behavior not enough —
    improving science requires institutional change
    • This is unlikely to be easy or happen quickly. But
    some promising changes are already happening

  40. http://www.ascb.org/dora/
    https://cos.io/
    http://bulliedintobadscience.org/

  41. Science is still awesome.
    “The first principle is that you
    must not fool yourself—and you
    are the easiest person to fool.”
