
Intro to the Meta-Analytic Method

mllewis
December 07, 2020


  1. Intro to the Meta-Analytic
    Method
    7 December 2020
    Graduate Research Methods
    Molly Lewis
    Carnegie Mellon University


  2. (image-only slide)

  3. What are the long term consequences of getting
    covid-19?


  4. Do infants prefer IDS to ADS?
    Cooper & Aslin (1990)
    Dependent measure: looking time to checkerboard
    Independent variable: ADS vs. IDS played in pairs of trials within
    subjects
    Source: Moll & Tomasello, 2010

  5. ManyBabies (2020)
    • Multi-lab effort to replicate
    IDS preference
    • Each lab conducted their
    own replication of Cooper &
    Aslin (1990), with
    standardization of the
    paradigm across labs
    • 68 labs, 2773 babies!


  6. Many estimates of the size of an effect
    across many repeated experiments.
    Preference for IDS


  7. How do we summarize this pattern?
    “The Madison Lab replicated the
    finding that infants prefer infant
    directed speech, while the other five
    labs did not.”
    That throws out a lot of
    information!!


  8. Summarizing literatures is a more general
    challenge in psychology
    Psychological literatures
    are almost always
    conflicting
    Qualitative literature
    reviews are:
    • not very precise
    • difficult when there
    are many studies
    (Tsuji, et al. 2014)


  9. Meta-analysis
    A quantitative approach to summarizing results across studies
    Meta-analytic
    effect size estimate


  10. History of meta-analysis
    • Mid-1970s: many studies had
    accumulated that were important to
    social decision policies
    • e.g. do students learn more when class
    sizes are smaller?
    • Research findings were conflicting,
    implications unclear -> difficult to get
    funding
    • Glass (1976): Research findings were
    not as conflicting as they appeared
    • Meta-analysis reveals cumulative
    patterns
    • The first “big data”
    (Gurevitch, et al. 2018)


  11. Why do a meta-analysis?
    1. Summarize what has been done in a literature
    2. Theory development – compare strength of
    different effects and moderating factors
    3. Evaluate bias in literature (e.g. publication bias)
    4. Estimate an effect size so you can determine a
    sample size (N = ?)
    (Kuhl, 2004)

  12. How are meta-analyses presented?
    1. Stand-alone publications
    (~ literature review)
    2. As part of an empirical
    paper (meta-analysis +
    new experiments)
    3. Within-paper meta-
    analysis (“mini-meta-
    analysis”)
    (Lewis & Frank, 2016)


  13. Plan for today
    An example meta-analysis:
    mutual exclusivity (Lewis, et al., 2020)
    Metalab
    Conducting your own meta-
    analysis


  14. An example meta-analysis: Mutual exclusivity
    (Lewis, et al., 2020)
    (Markman & Wachtel, 1988)


  15. Open questions
    • How big is the mutual exclusivity effect?
    • Does this effect have evidential value?
    • How robust is the effect to methodological variability?
    • What does the developmental trajectory of the effect look like?
    • What leads to developmental change?


  16. Conducting a MA of the mutual exclusivity
    literature
    An example paper from this literature:


  17. Extracting effect sizes from a paper
    Where’s the data?
    [Four panels from Bion, et al. (2013): proportion of trials fixating
    the novel object (0.00–1.00), annotated with chance, the mean (.65),
    the SD (.13), and d]
    For 24 mo, mean proportion of trials fixating on novel
    object = .65 (SD = .13)
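The extraction step boils down to simple arithmetic: the one-sample effect size is the distance of the reported mean from chance, in standard-deviation units. A minimal sketch of that computation (in Python for illustration; the talk's own tooling is R):

```python
# Values reported on the slide (Bion, et al., 2013, 24-month-olds):
# mean proportion of trials fixating the novel object = .65, SD = .13,
# chance = .50.
mean_prop, sd, chance = 0.65, 0.13, 0.50

# One-sample Cohen's d: distance from chance in SD units.
d = (mean_prop - chance) / sd
print(round(d, 2))  # ≈ 1.15
```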

  18. Grand effect size — (mini) mutual exclusivity meta-analysis
    (“Forest plot”)
    Pool effect sizes across studies, weighting by sample size
    [Forest plot: effect size estimates (−1.00 to 3.00) per study, with
    the grand effect size estimate at the bottom]
    First author | Year | Age (m.) | N
    8. spiegel | 2011 | 30 | 72
    7. markman | 1988 | 45 | 10
    6. grassman | 2010 | 48 | 12
    5. grassman | 2010 | 24 | 12
    4. byers | 2009 | 17 | 16
    3. bion | 2013 | 30 | 20
    2. bion | 2013 | 24 | 25
    1. bion | 2013 | 18 | 22

  19. Moderators
    = anything you think might influence the effect size
    • Age
    • Vocabulary size
    • Population type
    • Design type
    • (stimuli type)


  20. Results
    • Coded 146 effect sizes from 48 papers
    • Aggregate effect size was d = 1.27 [0.99, 1.55]


  21. Moderators


  22. Lewis et al. Summary
    • Effect is robust and large
    • Evidence for developmental change, and some evidence it is
    related to experience
    • Difficult to make causal claims about the source of this
    developmental change – that’s the goal of the subsequent
    experiments in the paper
    • How does this effect compare to other effects in language
    acquisition/cognitive development??


  23. Plan for today
    An example meta-analysis:
    mutual exclusivity (Lewis, et al., 2020)
    Metalab


  24. • Open-source aggregation of meta-analyses of 29 different
    phenomena in cognitive development (focus on language
    acquisition)
    • Interactive visualizations
    • In the process of developing an R package to access data
    (metalabR)
    • http://metalab.stanford.edu/
    (Lewis et al., 2016;
    Bergmann, et al., 2018)


  25. [Effect size (d) by age (years) for 12 phenomena: Concept−label
    advantage, Online word recognition, Gaze following, Pointing and
    vocabulary, Statistical sound learning, Word segmentation, Mutual
    exclusivity, Sound symbolism, IDS preference, Phonotactic learning,
    Vowel discrimination (native), Vowel discrimination (non−native)]
    (Lewis et al., 2016)

  26. Theories of language development
    “Stages” hypothesis
    “Interactive” hypothesis
    • Infants learn phonetic contrasts when
    supported by word context (Feldman, et al.,
    2013)
    • Infants learn word mappings when
    supported by prosody (Shukla, White, & Aslin,
    2011)
    Linguistic Hierarchy
    (Lewis et al., 2016)


  27. Theories of language acquisition
    [Method-residualized effect size by age (years) under four panels:
    Stages, Interactive, Ad hoc, Observed; phenomena abbreviated WS,
    GF, IDS, LA, ME, WR, PV, SS, SSL, VD-N, VD-NN]
    (Lewis et al., 2016)

  28. Evidence for continuous development across the language
    hierarchy
    (Lewis et al., 2016)


  29. Plan for today
    An example meta-analysis:
    mutual exclusivity (Lewis, et al., 2020)
    Metalab
    Conducting your own meta-
    analysis


  30. Steps for conducting a meta-analysis
    1. Identify phenomenon of interest
    2. Literature search
    3. Code data reported in papers
    4. Calculate study-level effect sizes
    5. Pool effect sizes across studies, weighting by
    sample size


  31. Steps for conducting a meta-analysis
    1. Identify phenomenon of interest


  32. Identifying phenomenon
    • Tradeoff between breadth and specificity
    • Too broad -> comparing apples and oranges
    • Too narrow -> doesn’t answer question you care about, not many
    studies
    • Can be defined by a paradigm (as in mutual exclusivity)
    • Can start with seminal study
    • How many studies do you need??
    • Answer: at least two
    • Aggregated evidence is more precise than individual studies
    • Within-paper meta-analyses sometimes contain only a few (~5 studies;
    Lewis & Frank, 2016)


  33. Steps for conducting a meta-analysis
    1. Identify phenomenon of interest
    2. Literature search


  34. Define inclusion criteria
    • What studies are you going to include in your MA?
    • Every MA is unique
    • These might change later on as you get to know your topic more
    • Criteria
    • Document type (e.g., All literature, journal papers, theses, proceedings
    papers)
    • Participants (e.g., adults vs. children)
    • Method (e.g., eye-tracking vs. pointing)
    • Stimuli (e.g., objects vs. pictures)
    • Reasons for exclusion:
    • not relevant
    • not empirical (no data)
    • doesn't satisfy inclusion criteria X


  35. Define search protocol
    • Database search
    • Google scholar
    • PubMed
    • …
    • Scanning references
    • Recent paper: Who does it cite?
    • Seminal paper: Who cites it?
    • Expert list
    • Direct request
    • Review paper (can be biased)


  36. Enter results into spreadsheet
    • Read title and abstract
    • Make inclusion/exclusion decisions
    • Process should be reproducible
    • [template]


  37. The PRISMA statement
    • Standardized diagram for
    reporting paper selection
    process for meta-analytic
    review
    • Describes 4 stages:
    Identification, Screening,
    Eligibility, Included

  38. Steps for conducting a meta-analysis
    1. Identify phenomenon of interest
    2. Literature search
    3. Code data reported in papers


  39. Code data reported in papers


  40. Moderators
    • = anything you think
    might influence the
    effect size (continuous
    or discrete)
    • Can be of theoretical or
    methodological interest
    • Specific to each MA
    (e.g., age, design, etc.)
    • Make a codebook for
    how you will enter each
    moderator


  41. Steps for conducting a meta-analysis
    1. Identify phenomenon of interest
    2. Literature search
    3. Code data reported in papers
    4. Calculate study-level effect sizes


  42. Code data reported in papers


  43. Standardized measure of the size of an effect
    Encodes magnitude and direction of effect
    Cohen’s d:
    Cohen’s d
    diff. between means
    standard dev.
    Effect Size =
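The formula above can be written out directly. A minimal two-sample sketch (Python for illustration; the group data are made up):

```python
import statistics

def cohens_d(group1, group2):
    """Two-sample Cohen's d: difference between means divided by the
    pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Toy data (hypothetical looking times, in seconds)
ids = [10.2, 11.1, 9.8, 12.0, 10.5]
ads = [9.1, 9.9, 8.7, 10.3, 9.4]
print(round(cohens_d(ids, ads), 2))  # → 1.64
```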


  44. Interpreting Cohen’s d
    Size | Description | Cohen’s intuition | Psychological example
    .2 | “small” | Diff. between the heights of 15 yo and 16 yo girls
    in the US | Bouba-kiki effect in kids (~.15; Lammertink, et al., 2016)
    .5 | “medium” | Diff. between the heights of 14 yo and 18 yo girls |
    Cognitive behavioral therapy on anxiety (~.4; Belleville, et al.,
    2004); sex difference in implicit math attitudes (~.5; Klein, et al., 2013)
    .8 | “large” | Diff. between the heights of 13 yo and 18 yo girls |
    Syntactic priming (~.9; Mahowald, et al., under review); mutual
    exclusivity (~1.0; Lewis & Frank, 2020)
    (Cohen, 1969)
    Explore Cohen’s d: https://rpsychologist.com/d3/cohend/

  45. Interpreting Cohen’s d
    [Histograms of absolute Cohen’s d for Cognitive (JEP:LMC), Social
    (JPSP), and Psych. Science; panel means .69, .19, and .78]
    Estimating the Replicability of Psychological Science (OSF, Science, 2015)
    N = 97
    Relatively “large” effects reported in cognitive psychology

  46. Effect size variance and CI
    [Two effect size distributions (−1 to 3), n1 = 24, n2 = 24]
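The width of those confidence intervals comes from the sampling variance of d, which shrinks as group sizes grow. A sketch using the standard large-sample approximation (Python for illustration; the d value is hypothetical):

```python
def d_variance(d, n1, n2):
    """Approximate sampling variance of Cohen's d
    (standard large-sample formula)."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

# With the slide's group sizes (n1 = n2 = 24) and a hypothetical d = 1.0:
d, n1, n2 = 1.0, 24, 24
se = d_variance(d, n1, n2) ** 0.5
ci = (d - 1.96 * se, d + 1.96 * se)
print(round(se, 3), [round(x, 2) for x in ci])  # 0.306 [0.4, 1.6]
```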

  47. Effect size measures
    • Cohen’s d is just one (prototypical) effect size measure
    • The appropriate measure depends on aspects of the design
    (e.g., within- vs. between-subjects) and the types of variables
    (e.g., qualitative vs. quantitative)
    • In principle, you can compute an effect size for any statistical
    test you conduct:
    • the difference between groups (t-test, d)
    • the relationship between variables (correlation, r)
    • the amount of variance accounted for by a factor (ANOVA,
    regression, f)
    • …
    • Can convert between ES metrics
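Those conversions are closed-form. For example, the standard r ↔ d conversion (Python for illustration; the d_to_r direction assumes equal group sizes):

```python
import math

def r_to_d(r):
    """Convert a correlation r to Cohen's d (standard conversion)."""
    return 2 * r / math.sqrt(1 - r ** 2)

def d_to_r(d):
    """Inverse conversion, assuming equal group sizes."""
    return d / math.sqrt(d ** 2 + 4)

print(round(r_to_d(0.5), 2))  # → 1.15
```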


  48. Steps for conducting a meta-analysis
    1. Identify phenomenon of interest
    2. Literature search
    3. Code data reported in papers
    4. Calculate study-level effect sizes
    5. Pool effect sizes across studies, weighting by
    sample size


  49. How do you pool effect sizes?
    [Forest plot repeated from slide 18: effect size estimates for the
    eight mutual exclusivity studies (first author, year, age in
    months, N), with the grand effect size estimate at the bottom]

  50. How do you pool effect sizes?
    • The goal of a meta-analysis is to estimate the true population
    effect size
    • Treat each study as a sample effect size from a population of
    studies
    • Aggregate using quantitative methods (e.g. averaging)
    • Get point estimate of the true effect size with measure of
    certainty
    More precise estimate of effect size than from single study.


  51. Use samples to estimate population.
    Sample: all the studies we did run (i.e. the ones in the literature)
    Population: all the studies we could have run
    [Histograms of effect size (x-axis) by count of studies (y-axis)
    for the Sample and the Population; Cohen’s d = .7]

  52. Methods for pooling
    Analogous to logic in single study:
    • In a study, sample participants and pool to get estimate of effect in study
    (unweighted mean)
    • In meta-analysis, sample studies to get estimate of grand effect (weighted
    mean)
    Just as for models across participants, two models for pooling:
    • Fixed effect: one true population effect
    • Random effects: studies are a random sample from a distribution of
    population effects; the model estimates its mean
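In both models the pooled estimate is an inverse-variance weighted mean; what differs is the variance used in the weights. A fixed-effect sketch (Python for illustration; the effect sizes and variances below are made up, and a random-effects model would add an estimated between-study variance τ² to each study's variance before weighting):

```python
def fixed_effect_pool(effects, variances):
    """Fixed-effect pooling: inverse-variance weighted mean of the
    study-level effect sizes, plus the standard error of the pooled
    estimate."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Toy effect sizes and their sampling variances
effects = [0.8, 1.2, 0.5]
variances = [0.04, 0.09, 0.16]
est, se = fixed_effect_pool(effects, variances)
print(round(est, 3), round(se, 3))  # 0.861 0.154
```

Note the precise studies dominate: the pooled estimate sits closest to the effect with the smallest variance.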


  53. Fixed Effect Model
    Slide adapted from http://compare2what.blogspot.com/ (Lohse, 2013)

  54. Fixed Effect Model
    Slide adapted from http://compare2what.blogspot.com/ (Lohse, 2013)

  55. Random Effect Model
    Random-effects models are recommended!
    Slide adapted from http://compare2what.blogspot.com/ (Lohse, 2013)

  56. Structure of the data for fitting a meta-
    analytic model
    N = 50 effect sizes
    Effect size Variance of
    effect size


  57. Fitting the meta-analytic model
    Grand meta-analytic
    effect size
    Grand meta-analytic
    effect size confidence interval
    Is the grand effect size
    significantly different from zero?
    metafor package in R (Viechtbauer, 2010)

  58. Forest Plot
    • Point = study
    • Size of square = weight
    • Ranges = individual
    confidence intervals
    (uncertainty)
    • Diamond = weighted
    mean
    • Dashed line = ES of 0
    • If the diamond overlaps the
    dashed line, the overall
    effect size does not differ
    from zero
    [Forest plot repeated from slide 18]

  59. What a forest plot tells you
    1. What is the overall effect size for phenomenon X?
    • Because this estimate reflects data from many more participants than a single
    study, it should be more accurate than the effect size from a single study.
    • How big is this effect relative to other effects in psychology?
    2. Does the effect significantly differ from zero?
    • If it does not, this suggests there may be no effect (even though individual
    studies may show an effect).
    3. How much variability is there?
    • Are the effects of individual studies roughly the same, or is there a lot of
    variability?
    • If there’s a lot of variability, this suggests there might be an important
    moderator


  60. Analyzing moderators
    • Does the effect size vary by different features of the
    experiment?
    • Two kinds of moderators: Categorical and Continuous
    (Left fig. from Gurevitch et al, 2018)
    Mutual exclusivity MA


  61. Fitting meta-analytic models with
    moderators
    ma_model_age <- rma(d_calc ~ mean_age, vi = d_var_calc, data = ma_data)


  62. Assessing publication bias
    How many “missing” studies would have to exist
    in order for the overall effect size to be zero?
    Fail-Safe-N (Orwin, 1983)
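Orwin's version of this calculation asks how many unseen null-result studies it would take to dilute the observed mean effect down to some negligible criterion. A sketch (Python for illustration; the criterion and the numbers plugged in below are made up):

```python
def orwin_fail_safe_n(k, mean_d, criterion_d):
    """Orwin's fail-safe N: number of missing studies averaging d = 0
    needed to pull the mean effect of k studies down to criterion_d."""
    return k * (mean_d - criterion_d) / criterion_d

# E.g., 48 studies with mean d = 1.27 and a "negligible" criterion of d = .1:
print(round(orwin_fail_safe_n(48, 1.27, 0.1)))  # → 562
```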


  63. Fail-Safe-N
    [Fail-Safe-N (0–9,000) by phenomenon: Pointing and vocabulary, Gaze
    following, Online word recognition, Concept−label advantage, Sound
    symbolism, Mutual exclusivity, Word segmentation, Statistical sound
    learning, Vowel discrimination (non−native), Vowel discrimination
    (native), Phonotactic learning, IDS preference; figure labels
    M = 3,634 and M = 3,228]
    (Lewis et al., 2016)

  64. Detecting bias through meta-analysis
    Some expected variability in effect size due to sample size – need to
    distinguish this from bias
    [Sampling distributions of d around chance for Study 1, Study 2,
    and Study 3, at N = 100 and N = 12]

  65. Funnel Plots
    • Scatter plot
    • Red points are each an effect size
    • X–axis = magnitude of effect size
    • Y–axis = measure of how precise the
    study is (number of participants, SE)
    • Black vertical dashed line is an effect
    size of zero
    • Red dashed line is meta-analytic effect
    size
    • Triangle corresponds to a 95%
    confidence interval around the mean
    (ignore black circle points for now)
    Fig from Gurevitch, 2018
    N = large
    N = small


  66. Funnel Plots
    Studies that are more precise (i.e.
    larger sample sizes) should have
    less variance around the true
    population effect size.
    Fig from Gurevitch, 2018
    N = large
    N = small


  67. Funnel Plots and Publication Bias
    If all results are published, then studies will
    deviate from mean in either direction (i.e.
    be symmetrical)
    If a field of research systematically ignores
    a certain direction, then this plot can be
    asymmetrical.
    If researchers are not publishing studies
    that have non-significant ES, we should
    expect a gap in the lower left hand corner
    Fig from Gurevitch, 2018
    N = large
    N = small


  68. Romantic Priming
    (Sundie et al., 2011; Study 2)
    How much do you want to purchase an
    expensive-looking wallet?
    Evolutionary psychologists have argued that male risk-taking and conspicuous consumption are
    costly sexual signals intended to attract potential mates (Shanks et al. 2015)


  69. Meta-analysis of the “Romantic Priming
    Effect”
    Where are all those
    studies? Very
    asymmetrical
    Suggests publication
    bias!
    What should be done
    next?
    N = 48 effect sizes


  70. Large-scale, pre-registered replications
    Shanks, et al. 2015: 14 replications
    MA funnel plot (for comparison)
    Suggests there is no effect!


  71. Assessing analytical bias
    “P-curve”: Distribution of p-values of a test statistic across a literature
    [Three p-curves (proportion of p-values at .01–.05): Baseline (null
    is true), Observed – evidential value, Observed – p-hacked]
    (Simonsohn, Nelson, & Simmons, 2014; Simonsohn et al., 2014; Simonsohn, Simmons, & Nelson, 2015)
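The flat baseline follows from a basic fact: when the null is true, p-values are uniformly distributed, so the significant ones (p < .05) spread evenly across the bins. A quick simulation sketch (Python for illustration):

```python
import random

random.seed(0)
# Under a true null, p ~ Uniform(0, 1); keep only the "significant" ones.
ps = [random.random() for _ in range(100_000)]
sig = [p for p in ps if p < 0.05]

# Proportion of significant p-values in each .01-wide bin up to .05:
bins = [0] * 5
for p in sig:
    bins[min(int(p / 0.01), 4)] += 1
print([round(b / len(sig), 2) for b in bins])  # each ≈ 0.20
```

Right skew (an excess of very small p-values) signals evidential value; left skew (a pile-up just under .05) is the p-hacked signature.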

  72. P-curves
    [Observed vs. baseline (null) p-curves for 12 phenomena:
    Concept−label advantage, Online word recognition, Gaze following,
    Pointing and vocabulary, Statistical sound learning, Word
    segmentation, Mutual exclusivity, Sound symbolism, IDS preference,
    Phonotactic learning, Vowel discrimination (native), Vowel
    discrimination (non−native)]
    (Lewis et al., 2016)

  73. Steps for conducting a meta-analysis
    1. Identify phenomenon of interest
    2. Literature search
    3. Code data reported in papers
    4. Calculate study-level effect sizes
    5. Pool effect sizes across studies, weighting by
    sample size


  74. Helpful R packages for doing meta-
    analysis
    • metafor (Viechtbauer, 2010) – the main workhorse for doing meta-
    analyses in R (modeling + plotting)
    • compute.es (Del Re, 2012), esc (Lüdecke, 2018) – for computing a
    variety of effect sizes and converting between them
    • pwr (Champely, 2020) – for estimating study power
    • PRISMAstatement (Wasey, 2019) – for making PRISMA plots


  75. Wrap-up
    • Meta-analysis is a powerful statistical tool for synthesizing
    existing evidence
    • Assess the evidential value of a literature, the strength of an
    effect, and moderating influences
    • Can be used both within a paper and across papers
    • Great way to start a new project
    • Reproducibility is important - there are lots of great tools in R
    for doing MAs
    • If you’re thinking of doing an MA, I’d be happy to chat with you
    about it!
