Intro to the Meta-Analytic Method

mllewis
December 07, 2020

Transcript

  1. Intro to the Meta-Analytic Method 7 December 2020 Graduate Research

    Methods Molly Lewis Carnegie Mellon University
  2. Do infants prefer IDS to ADS? Cooper & Aslin (1990). Dependent measure: looking time to checkerboard. Independent variable: ADS vs. IDS, played in pairs of trials within subjects. (Source: Moll & Tomasello, 2010)
  3. ManyBabies (2020) • Multi-lab effort to replicate IDS preference •

    Each lab conducted their own replication of Cooper & Aslin (1990), with standardization of the paradigm across labs • 68 labs, 2773 babies!
  4. Many estimates of the size of an effect across many

    repeated experiments. Preference for IDS
  5. How do we summarize this pattern? ”The Madison Lab replicated

    the finding that infants prefer infant directed speech, while the other five labs did not.” That throws out a lot of information!!
  6. Summarizing literatures is a more general challenge in psychology Psychological

    literatures are almost always conflicting Qualitative literature reviews are: • not very precise • difficult when there are many studies (Tsuji, et al. 2014)
  7. History of meta-analysis • Mid-1970s: many studies had accumulated that were relevant to social policy decisions • e.g., do students learn more when class sizes are smaller? • Research findings were conflicting and their implications unclear -> difficult to get funding • Glass (1976): research findings were not as conflicting as they appeared • Using meta-analysis reveals cumulative patterns • The first “big data” (Gurevitch, et al. 2018)
  8. Why do a meta-analysis? 1. Summarize what has been done in a literature 2. Theory development – compare the strength of different effects and moderating factors 3. Evaluate bias in a literature (e.g., publication bias) 4. Estimate an effect size so you can determine the sample size (N = ?) for a new study (Kuhl, 2004)
  9. How are meta-analyses presented? 1. Stand-alone publications (~ literature review) 2. As part of an empirical paper (meta-analysis + new experiments) 3. Within-paper meta-analysis (“mini meta-analysis”) (Lewis & Frank, 2016)
  10. Plan for today An example meta-analysis: mutual exclusivity (Lewis, et al., 2020) Metalab Conducting your own meta-analysis
  11. Open questions • How big is the mutual exclusivity effect? • Does this effect have evidential value? • How robust is the effect to methodological variability? • What does the developmental trajectory of the effect look like? • What leads to developmental change?
  12. Extracting effect sizes from a paper Bion, et al. (2013): for 24-month-olds, the mean proportion of trials fixating on the novel object was .65 (SD = .13). [Figure: “Where’s the dofa?” task; proportion of trials fixating the novel object, relative to chance, with the .65 and .13 values used to compute d]
  13. Grand effect size Pool effect sizes across studies, weighting by sample size. [“Forest plot” of a (mini) mutual exclusivity meta-analysis: eight study-level effect sizes (Bion 2013 x3; Byers 2009; Grassman 2010 x2; Markman 1988; Spiegel 2011), each listed with first author, year, age in months, and N, plus the grand effect size estimate]
  14. Moderators = anything you think might influence the effect size

    • Age • Vocabulary size • Population type • Design type • (stimuli type)
  15. Results • Coded 146 effect sizes from 48 papers •

    Aggregate effect size was d = 1.27 [0.99, 1.55]
  16. Lewis et al. Summary • Effect is robust and large • Evidence for developmental change, and some evidence it is related to experience • Difficult to make causal claims about the source of this developmental change – that’s the goal of the subsequent experiments in the paper • How does this effect compare to other effects in language acquisition and cognitive development?
  17. • Open-source aggregation of meta-analyses of 29 different phenomena in cognitive development (with a focus on language acquisition) • Interactive visualizations • An R package for accessing the data (metalabR) is in development • http://metalab.stanford.edu/ (Lewis et al., 2016; Bergmann, et al., 2018)
  18. [MetaLab panel plot: effect size (d) as a function of age (years) for 12 phenomena: concept-label advantage, online word recognition, gaze following, pointing and vocabulary, statistical sound learning, word segmentation, mutual exclusivity, sound symbolism, IDS preference, phonotactic learning, vowel discrimination (native), and vowel discrimination (non-native)] (Lewis et al., 2016)
  19. Theories of language development “Stages” hypothesis “Interactive” hypothesis • Infants

    learn phonetic contrasts when supported by word context (Feldman, et al., 2013) • Infants learn word mappings when supported by prosody (Shukla, White, & Aslin, 2011) Linguistic Hierarchy (Lewis et al., 2016)
  20. Theories of language acquisition [Panels comparing the effect size patterns predicted by the Stages, Interactive, and Ad hoc hypotheses with the observed data: method-residualized effect size by age (years) for WS, GF, IDS, LA, ME, WR, PV, SS, SSL, VD-N, and VD-NN] (Lewis et al., 2016)
  21. Plan for today An example meta-analysis: mutual exclusivity (Lewis, et al., 2020) Metalab Conducting your own meta-analysis
  22. Steps for conducting a meta-analysis 1. Identify phenomenon of interest

    2. Literature search 3. Code data reported in papers 4. Calculate study-level effect sizes 5. Pool effect sizes across studies, weighting by sample size
  23. Identifying the phenomenon • Tradeoff between breadth and specificity • Too broad -> comparing apples and oranges • Too narrow -> doesn’t answer the question you care about, and there aren’t many studies • Can be defined by a paradigm (as in mutual exclusivity) • Can start with a seminal study • How many studies do you need? • Answer: at least two • Aggregated evidence is more precise than individual studies • Within-paper meta-analyses sometimes contain only a few studies (~5; Lewis & Frank, 2016)
  24. Define inclusion criteria • What studies are you going to

    include in your MA? • Every MA is unique • These might change later on as you get to know your topic more • Criteria • Document type (e.g., All literature, journal papers, theses, proceedings papers) • Participants (e.g., adults vs. children) • Method (e.g., eye-tracking vs. pointing) • Stimuli (e.g., objects vs. pictures) • Reasons for exclusion: • not relevant • not empirical (no data) • doesn't satisfy inclusion criteria X
  25. Define search protocol • Database search • Google Scholar • PubMed • … • Scanning references • Recent paper: Who does it cite? • Seminal paper: Who cites it? • Expert list • Direct request • Review paper (can be biased)
  26. Enter results into a spreadsheet • Read title and abstract • Make the inclusion/exclusion decision • Process should be reproducible (see the sketch below) • [template]
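To keep the screening step reproducible, the inclusion/exclusion decisions can live in a plain spreadsheet that is then read and summarized in R. A minimal sketch, assuming a hypothetical file search_results.csv with made-up column names (include, exclude_reason):

    # Read the screening spreadsheet (file name and column names are hypothetical)
    screening <- read.csv("search_results.csv", stringsAsFactors = FALSE)
    table(screening$include)            # how many papers were kept vs. dropped
    table(screening$exclude_reason)     # documented reasons for exclusion
    included <- subset(screening, include == "yes")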
  27. The PRISMA statement • Standardized diagram for reporting the paper selection process for a meta-analytic review • Describes 4 stages: Identification, Screening, Eligibility, Included
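The PRISMAstatement package listed at the end of the deck can draw this diagram directly from the counts at each stage. A sketch with invented counts; the argument names are as I recall them from the package documentation, so treat them as an assumption and check ?prisma:

    library(PRISMAstatement)
    # All counts below are made up for illustration
    prisma(found = 350, found_other = 12, no_dupes = 300,
           screened = 300, screen_exclusions = 200,
           full_text = 100, full_text_exclusions = 60,
           qualitative = 40, quantitative = 40)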
  28. Steps for conducting a meta-analysis 1. Identify phenomenon of interest

    2. Literature search 3. Code data reported in papers
  29. Moderators • = anything you think might influence the effect

    size (continuous or discrete) • Can be of theoretical or methodological interest • Specific to each MA (e.g., age, design, etc.) • Make a codebook for how you will enter each moderator
  30. Steps for conducting a meta-analysis 1. Identify phenomenon of interest

    2. Literature search 3. Code data reported in papers 4. Calculate study-level effect sizes
  31. Effect size: a standardized measure of the size of an effect, encoding both its magnitude and direction. Cohen’s d = (difference between group means) / (pooled standard deviation). (See the sketch below.)
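As a concrete illustration of the formula on this slide, here is a minimal base-R sketch. The numbers are hypothetical (loosely modeled on the Bion et al. values from slide 12), and the pooled-SD version shown is the standard two-group formula:

    # Cohen's d = (difference between means) / (pooled standard deviation)
    m1 <- 0.65; sd1 <- 0.13; n1 <- 24   # hypothetical "novel object" group
    m2 <- 0.50; sd2 <- 0.15; n2 <- 24   # hypothetical comparison group
    sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
    d <- (m1 - m2) / sd_pooled
    d
    # For a single group compared against chance (as in slide 12), the analogous
    # calculation is (mean - 0.5) / sd, e.g. (0.65 - 0.5) / 0.13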
  32. Interpreting Cohen’s d (Cohen, 1969) • .2 (“small”): Cohen’s intuition: the difference between the heights of 15- and 16-year-old girls in the US; psychological example: the bouba-kiki effect in kids (~.15; Lammertink, et al. 2016) • .5 (“medium”): the difference between the heights of 14- and 18-year-old girls; cognitive behavioral therapy on anxiety (~.4; Belleville, et al., 2004); sex difference in implicit math attitudes (~.5; Klein, et al., 2013) • .8 (“large”): the difference between the heights of 13- and 18-year-old girls; syntactic priming (~.9; Mahowald, et al., under review); mutual exclusivity (~1.0; Lewis & Frank, 2020) • Explore Cohen’s d: https://rpsychologist.com/d3/cohend/
  33. Interpreting Cohen’s d Relatively “large” effects are reported in cognitive psychology. [Histograms of absolute Cohen’s d from Estimating the Replicability of Psychological Science (OSF, Science, 2015; N = 97), by area: Cognitive (JEP:LMC), Social (JPSP), and Psych. Science, with panel means of .69, .19, and .78]
  34. Effect size variance and CI [Two effect size distributions with confidence intervals, n1 = 24, n2 = 24]
  35. Effect size measures • Cohen’s d is just one (prototypical) measure • The appropriate effect size measure depends on aspects of the design (e.g., within vs. between subjects) and the types of variables (e.g., qualitative vs. quantitative) • In principle, you can compute an effect size for any statistical test you conduct • the difference between groups (t-test, d) • the relationship between variables (correlation, r) • the amount of variance accounted for by a factor (ANOVA, regression, f) • … • Can convert between ES metrics (see the sketch below)
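Converting between effect size metrics can be done with packages such as compute.es or esc (listed at the end of the deck), or by hand with the standard formulas. A small sketch of the d <-> r conversion, which assumes roughly equal group sizes:

    d_to_r <- function(d) d / sqrt(d^2 + 4)       # standardized mean difference -> correlation
    r_to_d <- function(r) 2 * r / sqrt(1 - r^2)   # correlation -> standardized mean difference
    d_to_r(1.27)   # e.g., the aggregate mutual exclusivity estimate, purely as an illustration
    r_to_d(0.30)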
  36. Steps for conducting a meta-analysis 1. Identify phenomenon of interest

    2. Literature search 3. Code data reported in papers 4. Calculate study-level effect sizes 5. Pool effect sizes across studies, weighting by sample size
  37. How do you pool effect sizes? [Forest plot of the (mini) mutual exclusivity meta-analysis from slide 13: eight study-level effect sizes with first author, year, age in months, and N, plus the grand effect size estimate]
  38. How do you pool effect sizes? • The goal of a meta-analysis is to estimate the true population effect size • Treat each study as a sample effect size from a population of studies • Aggregate using quantitative methods (e.g., averaging) • Get a point estimate of the true effect size, with a measure of certainty • This gives a more precise estimate of the effect size than any single study.
  39. Population vs. sample: the studies that were actually run (i.e., the ones in the literature) are a sample from the population of all the studies that could have been run. Use the sample to estimate the population. [Histograms of effect sizes: counts of studies in the sample vs. the population, with a population Cohen’s d = .7]
  40. Methods for pooling Analogous to the logic within a single study: • In a study, you sample participants and pool them to estimate the effect in that study (unweighted mean) • In a meta-analysis, you sample studies to estimate the grand effect (weighted mean, illustrated below) Just as for models across participants, there are two models for pooling: • Fixed effect: one true population effect • Random effect: a random sample from many population effects; estimates their mean
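To make the weighted-mean idea concrete, here is a minimal sketch of fixed-effect (inverse-variance) pooling with made-up effect sizes; a random-effects model additionally estimates between-study variance, which the metafor code on the next slides handles for you:

    yi <- c(1.1, 0.8, 1.5, 0.9)       # study-level effect sizes (hypothetical)
    vi <- c(0.10, 0.05, 0.20, 0.08)   # their sampling variances (hypothetical)
    wi <- 1 / vi                      # inverse-variance weights (larger studies weigh more)
    est <- sum(wi * yi) / sum(wi)     # pooled (weighted mean) effect size
    se  <- sqrt(1 / sum(wi))          # standard error of the pooled estimate
    c(estimate = est, ci_lower = est - 1.96 * se, ci_upper = est + 1.96 * se)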
  41. Random Effect Model Random effect models recommended! Slide adapted from http://compare2what.blogspot.com/ (Lohse, 2013)
  42. Structure of the data for fitting a meta-analytic model: one row per effect size (here, N = 50 effect sizes), with a column for the effect size and a column for the variance of the effect size (see the sketch below).
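One way to build that two-column (effect size, variance) structure is metafor's escalc(), starting from the means, SDs, and Ns coded from each paper. A sketch with invented summary statistics:

    library(metafor)
    dat <- data.frame(
      study = c("A", "B", "C"),                                   # hypothetical studies
      m1i = c(0.65, 0.70, 0.60), sd1i = c(0.13, 0.20, 0.15), n1i = c(24, 30, 18),
      m2i = c(0.50, 0.52, 0.51), sd2i = c(0.12, 0.18, 0.14), n2i = c(24, 30, 18))
    # escalc() adds yi (a standardized mean difference) and vi (its sampling variance)
    dat <- escalc(measure = "SMD", m1i = m1i, sd1i = sd1i, n1i = n1i,
                  m2i = m2i, sd2i = sd2i, n2i = n2i, data = dat)
    dat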
  43. Fitting the meta-analytic model The model output gives the grand meta-analytic effect size, its confidence interval, and a test of whether the grand effect size is significantly different from zero. metafor package in R (Viechtbauer, 2010); see the sketch below.
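A minimal sketch of fitting the random-effects model with metafor, assuming a data frame with one row per effect size (yi) and its sampling variance (vi); the values here are invented:

    library(metafor)
    dat <- data.frame(yi = c(1.1, 0.8, 1.5, 0.9),
                      vi = c(0.10, 0.05, 0.20, 0.08))
    res <- rma(yi, vi, data = dat, method = "REML")  # random-effects model (recommended)
    summary(res)  # grand effect size, its confidence interval, and a test against zero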
  44. Forest Plot • Point = study • Size of square = weight • Ranges = individual confidence intervals (uncertainty) • Diamond = weighted mean • Dashed line = ES of 0 • If the diamond overlaps the dashed line, the overall effect size does not differ from zero (see the sketch below) [Forest plot of the mutual exclusivity mini meta-analysis from slide 13: eight study-level effect sizes with first author, year, age in months, and N, plus the grand effect size]
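metafor draws this plot directly from the fitted model object. Continuing the invented example from the previous slide:

    library(metafor)
    dat <- data.frame(yi = c(1.1, 0.8, 1.5, 0.9), vi = c(0.10, 0.05, 0.20, 0.08))  # hypothetical
    res <- rma(yi, vi, data = dat)
    forest(res)  # squares = study estimates (sized by weight), lines = CIs, diamond = pooled effect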
  45. What a forest plot tells you 1. What is the overall effect size for phenomenon X? • Because this estimate reflects data from many more participants than a single study, it should be more accurate than the effect size from any single study. • How big is this effect relative to other effects in psychology? 2. Does the effect significantly differ from zero? • If it does not, this suggests there may be no effect (even though individual studies may show an effect). 3. How much variability is there? • Are the effects of individual studies roughly the same, or is there a lot of variability? • If there’s a lot of variability, this suggests there might be an important moderator
  46. Analyzing moderators • Does the effect size vary by different

    features of the experiment? • Two kinds of moderators: Categorical and Continuous (Left fig. from Gurevitch et al, 2018) Mutual exclusivity MA
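In metafor, moderators go into the same model via the mods argument. A sketch with an invented continuous moderator (age) and an invented categorical one (method):

    library(metafor)
    dat <- data.frame(yi = c(1.1, 0.8, 1.5, 0.9, 1.3),
                      vi = c(0.10, 0.05, 0.20, 0.08, 0.12),
                      age = c(18, 24, 30, 36, 48),                      # months, hypothetical
                      method = c("looking", "pointing", "looking",
                                 "pointing", "looking"))                # hypothetical
    rma(yi, vi, mods = ~ age, data = dat)              # meta-regression on a continuous moderator
    rma(yi, vi, mods = ~ factor(method), data = dat)   # categorical moderator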
  47. Assessing publication bias How many “missing” studies would have to

    exist in order for the overall effect size to be zero? Fail-Safe-N (Orwin, 1983)
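metafor's fsn() computes fail-safe N; as far as I recall it supports the Orwin (1983) variant via a type argument, where target is the smallest effect size of interest (the value below is an assumption, as are the data):

    library(metafor)
    dat <- data.frame(yi = c(1.1, 0.8, 1.5, 0.9), vi = c(0.10, 0.05, 0.20, 0.08))  # hypothetical
    fsn(yi, vi, data = dat)                                # Rosenthal-style fail-safe N (default)
    fsn(yi, vi, data = dat, type = "Orwin", target = 0.2)  # Orwin variant; target value is assumed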
  48. Fail-Safe-N [Bar plot of Fail-Safe-N (roughly 0 to 9,000) by phenomenon: pointing and vocabulary, gaze following, online word recognition, concept-label advantage, sound symbolism, mutual exclusivity, word segmentation, statistical sound learning, vowel discrimination (non-native), vowel discrimination (native), phonotactic learning, and IDS preference; mean values of 3,634 and 3,228 are marked] (Lewis et al., 2016)
  49. Detecting bias through meta-analysis Some variability in effect size is expected due to sample size; this needs to be distinguished from bias. [Simulated effect size estimates for Studies 1–3 relative to chance, comparing N = 100 with N = 12]
  50. Funnel Plots • Scatter plot • Red points are each

    an effect size • X–axis = magnitude of effect size • Y–axis = measure of how precise the study is (number of participants, SE) • Black vertical dashed line is an effect size of zero • Red dashed line is meta-analytic effect size • Triangle corresponds to a 95% confidence interval around the mean (ignore black circle points for now) Fig from Gurevitch, 2018 N = large N = small
  51. Funnel Plots Studies that are more precise (i.e. larger sample

    sizes) should have less variance around the true population effect size. Fig from Gurevitch, 2018 N = large N = small
  52. Funnel Plots and Publication Bias If all results are published,

    then studies will deviate from mean in either direction (i.e. be symmetrical) If a field of research systematically ignores a certain direction, then this plot can be asymmetrical. If researchers are not publishing studies that have non-significant ES, we should expect a gap in the lower left hand corner Fig from Gurevitch, 2018 N = large N = small
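metafor can draw the funnel plot and run a regression test for asymmetry from the same fitted model. A sketch with invented data:

    library(metafor)
    dat <- data.frame(yi = c(1.1, 0.8, 1.5, 0.9, 0.4),
                      vi = c(0.10, 0.05, 0.20, 0.08, 0.30))   # hypothetical
    res <- rma(yi, vi, data = dat)
    funnel(res)    # effect size vs. precision; asymmetry hints at publication bias
    regtest(res)   # Egger-type regression test for funnel plot asymmetry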
  53. Romantic Priming (Sundie et al., 2011; Study 2) How much

    do you want to purchase an expensive-looking wallet? Evolutionary psychologists have argued that male risk-taking and conspicuous consumption are costly sexual signals intended to attract potential mates (Shanks et al. 2015)
  54. Meta-analysis of the “Romantic Priming Effect” Where are all those

    studies? Very asymmetrical Suggests publication bias! What should be done next? N = 48 effect sizes
  55. Large-scale, pre-registered replications Shanks, et al. 2015: 14 replications MA

    funnel plot (for comparison) Suggests there is no effect!
  56. Assessing analytical bias “P-curve”: the distribution of p-values of a test statistic across a literature. [Panels showing the proportion of p-values between .01 and .05 for: Baseline (null is true), Observed with evidential value, and Observed when p-hacked] (Simonsohn, Nelson, & Simmons, 2014; Simonsohn et al., 2014; Simonsohn, Simmons, & Nelson, 2015)
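A minimal sketch of the p-curve idea only, not the full Simonsohn et al. inference procedure: collect the significant p-values from a literature and look at how they are distributed between .01 and .05 (the p-values below are invented):

    p_values <- c(0.003, 0.010, 0.021, 0.038, 0.049, 0.120, 0.300)  # hypothetical published p-values
    sig <- p_values[p_values < 0.05]                  # the p-curve only uses significant results
    table(cut(sig, breaks = seq(0, 0.05, by = 0.01)))
    # A right-skewed curve (many very small p's) suggests evidential value;
    # a pile-up just below .05 is a warning sign of p-hacking.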
  57. P-curves [Observed vs. baseline (null) p-curves for each MetaLab phenomenon: concept-label advantage, online word recognition, gaze following, pointing and vocabulary, statistical sound learning, word segmentation, mutual exclusivity, sound symbolism, IDS preference, phonotactic learning, vowel discrimination (native), and vowel discrimination (non-native)] (Lewis et al., 2016)
  58. Steps for conducting a meta-analysis 1. Identify phenomenon of interest

    2. Literature search 3. Code data reported in papers 4. Calculate study-level effect sizes 5. Pool effect sizes across studies, weighting by sample size
  59. Helpful R packages for doing meta-analysis • metafor (Viechtbauer, 2010) – the main workhorse for doing meta-analyses in R (modeling + plotting) • compute.es (Del Re, 2012), esc (Lüdecke, 2018) – for computing a variety of effect sizes and converting between them • pwr (Champely, 2020) – for estimating study power (see the sketch below) • PRISMAstatement (Wasey, 2019) – for making PRISMA plots
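As one example of the pwr package in action (tying back to reason 4 on slide 8), a meta-analytic effect size can be plugged in to plan the sample for a new study; the design assumptions here (two-sample t-test, 80% power) are mine, not from the slides:

    library(pwr)
    # d = 1.27 is the aggregate mutual exclusivity estimate from slide 15
    pwr.t.test(d = 1.27, power = 0.80, sig.level = 0.05,
               type = "two.sample")   # returns the required n per group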
  60. Wrap-up • Meta-analysis is a powerful statistical tool for synthesizing

    existing evidence • Assess the evidential value of a literature, the strength of an effect, and moderating influences • Can be used both within a paper and across papers • Great way to start a new project • Reproducibility is important - there are lots of great tools in R for doing MAs • If you’re thinking of doing an MA, I’d be happy to chat with you about it!