Slide 1

Intro to the Meta-Analytic Method 7 December 2020 Graduate Research Methods Molly Lewis Carnegie Mellon University

Slide 2

Slide 3

What are the long-term consequences of getting COVID-19?

Slide 4

Do infants prefer IDS to ADS? Cooper & Aslin (1990)
• Dependent measure: looking time to a checkerboard
• Independent variable: ADS vs. IDS, played in pairs of trials within subjects
(Source: Moll & Tomasello, 2010)

Slide 5

ManyBabies (2020)
• Multi-lab effort to replicate the IDS preference
• Each lab conducted its own replication of Cooper & Aslin (1990), with the paradigm standardized across labs
• 68 labs, 2,773 babies!

Slide 6

Many estimates of the size of an effect across many repeated experiments. Preference for IDS

Slide 7

How do we summarize this pattern? “The Madison Lab replicated the finding that infants prefer infant-directed speech, while the other five labs did not.” That throws out a lot of information!

Slide 8

Summarizing literatures is a more general challenge in psychology. Psychological literatures are almost always conflicting, and qualitative literature reviews are:
• not very precise
• difficult when there are many studies
(Tsuji, et al. 2014)

Slide 9

Meta-analysis A quantitative approach to summarizing results across studies Meta-analytic effect size estimate

Slide 10

History of meta-analysis
• Mid-1970s: many studies had accumulated that were important to social policy decisions
  • e.g., do students learn more when class sizes are smaller?
• Research findings were conflicting and their implications unclear -> difficult to get funding
• Glass (1976): research findings were not as conflicting as they appeared
  • Using meta-analysis reveals cumulative patterns
• The first “big data” (Gurevitch, et al. 2018)

Slide 11

Why do a meta-analysis?
1. Summarize what has been done in a literature
2. Theory development: compare the strength of different effects and moderating factors
3. Evaluate bias in a literature (e.g., publication bias)
4. Estimate an effect size so you can determine a sample size (N = ?)
(Kuhl, 2004)

Slide 12

How are meta-analyses presented?
1. Stand-alone publications (~ literature review)
2. As part of an empirical paper (meta-analysis + new experiments)
3. Within-paper meta-analysis (“mini-meta-analysis”) (Lewis & Frank, 2016)

Slide 13

Plan for today
• An example meta-analysis: mutual exclusivity (Lewis, et al., 2020)
• Metalab
• Conducting your own meta-analysis

Slide 14

An example meta-analysis: Mutual exclusivity (Lewis, et al., 2020) (Markman & Wachtel, 1988)

Slide 15

Open questions
• How big is the mutual exclusivity effect?
• Does this effect have evidential value?
• How robust is the effect to methodological variability?
• What does the developmental trajectory of the effect look like?
• What leads to developmental change?

Slide 16

Conducting an MA of the mutual exclusivity literature. An example paper from this literature:

Slide 17

Extracting effect sizes from a paper: “Where’s the dofa?” (Bion, et al. 2013). For 24-month-olds, the mean proportion of trials fixating the novel object was .65 (SD = .13), against a chance level of .50. [Figure: panels of proportion of trials fixating the novel object (0–1), with chance marked and the gap between .65 and chance labeled d]
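Those two reported numbers, plus the chance level, are all it takes to turn a result into a study-level effect size. A minimal sketch (in Python rather than the R used later in the deck; the function name is mine, for illustration):

```python
def cohens_d_vs_chance(mean, sd, chance=0.5):
    """One-sample Cohen's d: how far the observed mean is from chance,
    in standard deviation units."""
    return (mean - chance) / sd

# Bion et al. (2013), 24-month-olds: M = .65, SD = .13, chance = .50
d = cohens_d_vs_chance(0.65, 0.13)
print(round(d, 2))  # -> 1.15
```

So a .15 advantage over chance, measured against an SD of .13, is a large effect of roughly d = 1.15.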

Slide 18

Pool effect sizes across studies, weighting by sample size: the “forest plot.” [Forest plot of the (mini) mutual exclusivity meta-analysis: eight effect sizes (three from Bion 2013; Byers 2009; two from Grassman 2010; Markman 1988; Spiegel 2011), each listed with first author, year, age in months, and N, with the grand effect size estimate at the bottom]

Slide 19

Moderators = anything you think might influence the effect size
• Age
• Vocabulary size
• Population type
• Design type
• (Stimuli type)

Slide 20

Results
• Coded 146 effect sizes from 48 papers
• Aggregate effect size was d = 1.27 [0.99, 1.55]

Slide 21

Moderators

Slide 22

Lewis et al. summary
• The effect is robust and large
• Evidence for developmental change, and some evidence it is related to experience
• Difficult to make causal claims about the source of this developmental change; that is the goal of the subsequent experiments in the paper
• How does this effect compare to other effects in language acquisition / cognitive development?

Slide 23

Plan for today
• An example meta-analysis: mutual exclusivity (Lewis, et al., 2020)
• Metalab

Slide 24

Metalab
• Open-source aggregation of meta-analyses of 29 different phenomena in cognitive development (with a focus on language acquisition)
• Interactive visualizations
• An R package for accessing the data (metalabR) is in development
• http://metalab.stanford.edu/
(Lewis et al., 2016; Bergmann, et al., 2018)

Slide 25

[Figure: meta-analytic effect size (d) as a function of age (years) for 12 phenomena — concept-label advantage, online word recognition, gaze following, pointing and vocabulary, statistical sound learning, word segmentation, mutual exclusivity, sound symbolism, IDS preference, phonotactic learning, vowel discrimination (native), and vowel discrimination (non-native)] (Lewis et al., 2016)

Slide 26

Theories of language development: the “Stages” hypothesis vs. the “Interactive” hypothesis
• Infants learn phonetic contrasts when supported by word context (Feldman, et al., 2013)
• Infants learn word mappings when supported by prosody (Shukla, White, & Aslin, 2011)
[Figure: the linguistic hierarchy] (Lewis et al., 2016)

Slide 27

Theories of language acquisition. [Figure: method-residualized effect size as a function of age (years) under the Stages, Interactive, Ad hoc, and Observed hypotheses, for the phenomena WS, GF, IDS, LA, ME, WR, PV, SS, SSL, VD-N, and VD-NN] (Lewis et al., 2016)

Slide 28

Evidence for continuous development across the language hierarchy (Lewis et al., 2016)

Slide 29

Plan for today
• An example meta-analysis: mutual exclusivity (Lewis, et al., 2020)
• Metalab
• Conducting your own meta-analysis

Slide 30

Steps for conducting a meta-analysis
1. Identify phenomenon of interest
2. Literature search
3. Code data reported in papers
4. Calculate study-level effect sizes
5. Pool effect sizes across studies, weighting by sample size

Slide 31

Steps for conducting a meta-analysis
1. Identify phenomenon of interest

Slide 32

Identifying the phenomenon
• Tradeoff between breadth and specificity
  • Too broad -> comparing apples and oranges
  • Too narrow -> doesn’t answer the question you care about, and there aren’t many studies
• Can be defined by a paradigm (as in mutual exclusivity)
• Can start with a seminal study
• How many studies do you need? Answer: at least two
  • Aggregated evidence is more precise than individual studies
  • Within-paper meta-analyses sometimes contain only a few (~5) studies (Lewis & Frank, 2016)

Slide 33

Steps for conducting a meta-analysis
1. Identify phenomenon of interest
2. Literature search

Slide 34

Define inclusion criteria
• What studies are you going to include in your MA?
• Every MA is unique
• These criteria might change later on as you get to know your topic better
• Criteria:
  • Document type (e.g., all literature, journal papers, theses, proceedings papers)
  • Participants (e.g., adults vs. children)
  • Method (e.g., eye-tracking vs. pointing)
  • Stimuli (e.g., objects vs. pictures)
• Reasons for exclusion:
  • not relevant
  • not empirical (no data)
  • doesn’t satisfy inclusion criterion X

Slide 35

Define a search protocol
• Database search: Google Scholar, PubMed, …
• Scanning references
  • Recent paper: who does it cite?
  • Seminal paper: who cites it?
• Expert list
• Direct request
• Review paper (can be biased)

Slide 36

Enter results into a spreadsheet
• Read title and abstract
• Make an inclusion/exclusion decision
• The process should be reproducible
• [template]

Slide 37

The PRISMA statement
• Standardized diagram for reporting the paper selection process for a meta-analytic review
• Describes 4 stages: Identification, Screening, Eligibility, Included

Slide 38

Steps for conducting a meta-analysis
1. Identify phenomenon of interest
2. Literature search
3. Code data reported in papers

Slide 39

Code data reported in papers

Slide 40

Moderators
• = anything you think might influence the effect size (continuous or discrete)
• Can be of theoretical or methodological interest
• Specific to each MA (e.g., age, design, etc.)
• Make a codebook for how you will enter each moderator

Slide 41

Steps for conducting a meta-analysis
1. Identify phenomenon of interest
2. Literature search
3. Code data reported in papers
4. Calculate study-level effect sizes

Slide 42

Code data reported in papers

Slide 43

Effect size: a standardized measure of the size of an effect that encodes both its magnitude and direction. Cohen’s d = (difference between means) / (standard deviation).
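For two independent groups, the denominator is usually the pooled standard deviation. A sketch of that computation (Python; the function name is mine, illustrative only):

```python
import math

def cohens_d(m1, m2, sd1, sd2, n1, n2):
    """Two-group Cohen's d: difference between means divided by the
    pooled standard deviation."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# With equal SDs the pooled SD is just that SD: (0.8 - 0.5) / 0.6 = 0.5
print(cohens_d(0.8, 0.5, 0.6, 0.6, 20, 20))  # -> 0.5
```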

Slide 44

Interpreting Cohen’s d (Cohen, 1969)
• d = .2 (“small”). Cohen’s intuition: the difference between the heights of 15- and 16-year-old girls in the US. Psychological example: the bouba-kiki effect in kids (~.15; Lammertink, et al. 2016)
• d = .5 (“medium”). Intuition: the difference between the heights of 14- and 18-year-old girls. Examples: cognitive behavioral therapy on anxiety (~.4; Belleville, et al., 2004); the sex difference in implicit math attitudes (~.5; Klein, et al., 2013)
• d = .8 (“large”). Intuition: the difference between the heights of 13- and 18-year-old girls. Examples: syntactic priming (~.9; Mahowald, et al., under review); mutual exclusivity (~1.0; Lewis & Frank, 2020)
Explore Cohen’s d: https://rpsychologist.com/d3/cohend/

Slide 45

Interpreting Cohen’s d. [Figure: distributions of absolute Cohen’s d for the N = 97 studies in Estimating the Replicability of Psychological Science (OSF, Science, 2015), by journal: Cognitive (JEP:LMC), Social (JPSP), and Psych. Science, with panel means M = .69, M = .19, and M = .78.] Relatively “large” effects are reported in cognitive psychology.

Slide 46

Effect size variance and CI (example: n1 = 24, n2 = 24). [Figure: effect size estimates with confidence intervals on a −1 to 3 scale]
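Each effect size carries a sampling variance that shrinks with sample size; that variance gives the confidence interval and, later, the meta-analytic weights. A sketch using the standard large-sample approximation for the variance of d (Python; function names are mine):

```python
import math

def d_variance(d, n1, n2):
    """Approximate sampling variance of Cohen's d."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))

def d_ci(d, n1, n2, z=1.96):
    """95% confidence interval for d."""
    se = math.sqrt(d_variance(d, n1, n2))
    return d - z * se, d + z * se

# The slide's n1 = n2 = 24: even a d of 0.5 has a CI that crosses zero
lo, hi = d_ci(0.5, 24, 24)
```

This is why a single medium-sized study at n = 24 per group can be inconclusive on its own.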

Slide 47

Effect size measures
• Cohen’s d is just one (the prototype)
• The appropriate effect size measure depends on aspects of the design (e.g., within- vs. between-subjects) and the types of variables (e.g., qualitative vs. quantitative)
• In principle, you can compute an effect size for any statistical test you conduct:
  • the difference between groups (t-test, d)
  • the relationship between variables (correlation, r)
  • the amount of variance accounted for by a factor (ANOVA, regression, f)
  • …
• You can convert between ES metrics
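The standard d ↔ r conversions (assuming equal group sizes) fit in a couple of lines; R packages like compute.es and esc cover many more metrics. A sketch:

```python
import math

def r_to_d(r):
    """Convert a correlation r to Cohen's d (equal group sizes assumed)."""
    return 2 * r / math.sqrt(1 - r**2)

def d_to_r(d):
    """Convert Cohen's d back to r (equal group sizes assumed)."""
    return d / math.sqrt(d**2 + 4)

# The two conversions are inverses of each other
assert abs(d_to_r(r_to_d(0.3)) - 0.3) < 1e-12
```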

Slide 48

Steps for conducting a meta-analysis
1. Identify phenomenon of interest
2. Literature search
3. Code data reported in papers
4. Calculate study-level effect sizes
5. Pool effect sizes across studies, weighting by sample size

Slide 49

How do you pool effect sizes? [The mutual exclusivity forest plot again: eight effect sizes (Bion 2013 ×3; Byers 2009; Grassman 2010 ×2; Markman 1988; Spiegel 2011) with first author, year, age in months, and N, plus the grand effect size estimate]

Slide 50

How do you pool effect sizes?
• The goal of a meta-analysis is to estimate the true population effect size
• Treat each study as a sample effect size from a population of studies
• Aggregate using quantitative methods (e.g., averaging)
• Get a point estimate of the true effect size, with a measure of certainty
-> A more precise estimate of the effect size than from a single study

Slide 51

Use samples to estimate the population. [Figure: histogram of the sample — all the studies we did run (i.e., the ones in the literature) — alongside the population — all the studies we could have run; x-axis: effect size (prop. right), y-axis: count of studies; Cohen’s d = .7]

Slide 52

Methods for pooling. Analogous to the logic in a single study:
• In a study, sample participants and pool to get an estimate of the effect in that study (unweighted mean)
• In a meta-analysis, sample studies to get an estimate of the grand effect (weighted mean)
Just as for models across participants, there are two models for pooling:
• Fixed effect: one true population effect
• Random effects: a random sample from many population effects; estimates their mean
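Under the fixed-effect model, the weighted mean uses inverse-variance weights, so precise studies count more. A minimal sketch with toy numbers (Python; function name is mine):

```python
import math

def fixed_effect_pool(effects, variances):
    """Fixed-effect pooling: inverse-variance weighted mean and its SE."""
    weights = [1 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return pooled, se

# The more precise study (variance .04) pulls the estimate toward itself
pooled, se = fixed_effect_pool([0.2, 0.6], [0.04, 0.16])
print(round(pooled, 2))  # -> 0.28
```

An unweighted average would give 0.4; weighting by precision lands at 0.28, much closer to the tighter study.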

Slide 53

Fixed Effect Model. Slide adapted from http://compare2what.blogspot.com/ (Lohse, 2013)

Slide 54

Fixed Effect Model. Slide adapted from http://compare2what.blogspot.com/ (Lohse, 2013)

Slide 55

Random Effects Model. Random-effects models are recommended! Slide adapted from http://compare2what.blogspot.com/ (Lohse, 2013)
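A random-effects model adds a between-study variance component (tau²) to each study's weight. Metafor's rma() does this properly (REML by default); as an illustration only, here is a bare-bones sketch of the classic DerSimonian–Laird estimator in Python:

```python
import math

def random_effects_pool(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * di for wi, di in zip(w, effects)) / sum(w)
    q = sum(wi * (di - fixed) ** 2 for wi, di in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)            # between-study variance
    w_star = [1 / (v + tau2) for v in variances]  # widened weights
    pooled = sum(wi * di for wi, di in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2

# Perfectly homogeneous studies -> tau^2 = 0 and the fixed-effect answer
pooled, se, tau2 = random_effects_pool([0.5, 0.5, 0.5], [0.1, 0.1, 0.1])
```

When the studies disagree more than sampling error allows, tau² grows, the weights even out, and the confidence interval widens accordingly.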

Slide 56

Structure of the data for fitting a meta-analytic model: N = 50 effect sizes. [Table: one row per effect size, with columns for the effect size and the variance of the effect size]

Slide 57

Fitting the meta-analytic model with the metafor package in R (Viechtbauer, 2010). The output gives:
• the grand meta-analytic effect size
• its confidence interval
• whether the grand effect size is significantly different from zero

Slide 58

Forest plot
• Point = study
• Size of square = weight
• Ranges = individual confidence intervals (uncertainty)
• Diamond = weighted mean
• Dashed line = ES of 0
• If the diamond overlaps the dashed line, the overall effect size does not differ from zero
[The mutual exclusivity forest plot again, with first author, year, age in months, and N for each of the eight effect sizes]

Slide 59

What a forest plot tells you
1. What is the overall effect size for phenomenon X?
  • Because this estimate reflects data from many more participants than a single study, it should be more accurate than the effect size from a single study.
  • How big is this effect relative to other effects in psychology?
2. Does the effect significantly differ from zero?
  • If it does not, this suggests there may be no effect (even though individual studies may show an effect).
3. How much variability is there?
  • Are the effects of individual studies roughly the same, or is there a lot of variability?
  • If there’s a lot of variability, this suggests there might be an important moderator.

Slide 60

Analyzing moderators
• Does the effect size vary by different features of the experiment?
• Two kinds of moderators: categorical and continuous
[Left fig. from Gurevitch et al, 2018; right: the mutual exclusivity MA]

Slide 61

Fitting meta-analytic models with moderators ma_model_age <- rma(d_calc ~ mean_age, vi = d_var_calc, data = ma_data)

Slide 62

Assessing publication bias How many “missing” studies would have to exist in order for the overall effect size to be zero? Fail-Safe-N (Orwin, 1983)
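Orwin's version of the fail-safe N has a closed form: how many unpublished d = 0 studies would be needed to dilute the observed mean effect down to some negligible criterion value. A sketch (the example numbers and the criterion are mine):

```python
def orwin_failsafe_n(k, mean_d, criterion_d):
    """Orwin (1983): number of missing d = 0 studies needed to pull the
    mean effect size of k observed studies down to criterion_d."""
    return k * (mean_d - criterion_d) / criterion_d

# 10 studies averaging d = 0.6; dragging the mean down to 0.2 would take:
print(round(orwin_failsafe_n(k=10, mean_d=0.6, criterion_d=0.2)))  # -> 20
```

A large fail-safe N (many multiples of the observed literature) suggests the effect is unlikely to be an artifact of the file drawer.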

Slide 63

Fail-Safe-N. [Figure: Fail-Safe-N (0–9,000) by phenomenon — pointing and vocabulary, gaze following, online word recognition, concept-label advantage, sound symbolism, mutual exclusivity, word segmentation, statistical sound learning, vowel discrimination (non-native), vowel discrimination (native), phonotactic learning, IDS preference — with means M = 3,634 and M = 3,228 marked] (Lewis et al., 2016)

Slide 64

Detecting bias through meta-analysis. Some variability in effect size is expected due to sample size; we need to distinguish this from bias. [Figure: effect size estimates for three studies at N = 100 vs. N = 12, relative to chance]

Slide 65

Funnel plots (fig. from Gurevitch, 2018)
• A scatter plot: each red point is an effect size
• X-axis = magnitude of the effect size
• Y-axis = a measure of how precise the study is (number of participants, SE): large N at the top, small N at the bottom
• The black vertical dashed line is an effect size of zero
• The red dashed line is the meta-analytic effect size
• The triangle corresponds to a 95% confidence interval around the mean
• (Ignore the black circle points for now)

Slide 66

Funnel plots. Studies that are more precise (i.e., larger sample sizes) should have less variance around the true population effect size. (Fig. from Gurevitch, 2018)

Slide 67

Funnel plots and publication bias (fig. from Gurevitch, 2018)
• If all results are published, studies will deviate from the mean in either direction (i.e., the plot will be symmetrical)
• If a field of research systematically ignores a certain direction, the plot can be asymmetrical
• If researchers are not publishing studies with non-significant effect sizes, we should expect a gap in the lower left-hand corner
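The "gap in the corner" is easy to reproduce by simulation: generate a literature with a small true effect and small samples, then "publish" only the significant positive results. The published mean then badly overestimates the truth. A toy sketch (all numbers are made up):

```python
import math
import random
from statistics import mean

random.seed(1)
true_d, n = 0.2, 20                          # small true effect, n = 20 per group
se = math.sqrt(2 / n + true_d**2 / (4 * n))  # approximate SE of d at this n
studies = [random.gauss(true_d, se) for _ in range(5000)]

cutoff = 1.96 * se                           # rough significance threshold
published = [d for d in studies if d > cutoff]

# The full literature averages ~0.2; the "published" subset is far larger
print(round(mean(studies), 2), round(mean(published), 2))
```

The censored studies are exactly the small-N, small-effect ones, which is why the missing mass sits in the lower-left of the funnel.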

Slide 68

Romantic Priming (Sundie et al., 2011; Study 2) How much do you want to purchase an expensive-looking wallet? Evolutionary psychologists have argued that male risk-taking and conspicuous consumption are costly sexual signals intended to attract potential mates (Shanks et al. 2015)

Slide 69

Meta-analysis of the “Romantic Priming Effect” (N = 48 effect sizes)
• The funnel plot is very asymmetrical: where are all those studies?
• Suggests publication bias!
• What should be done next?

Slide 70

Large-scale, pre-registered replications. Shanks, et al. 2015: 14 replications (the MA funnel plot shown for comparison). Suggests there is no effect!

Slide 71

Assessing analytical bias. “P-curve”: the distribution of p-values of a test statistic across a literature (Simonsohn, Nelson, & Simmons, 2014; Simonsohn et al., 2014; Simonsohn, Simmons, & Nelson, 2015). [Figure: three example p-curves over p = .01–.05, showing the proportion of p-values under a baseline where the null is true, an observed literature with evidential value, and an observed literature that is p-hacked]
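The intuition behind the p-curve panels can be simulated: under the null, significant p-values are uniform on (0, .05); with a real effect, they pile up near zero (right skew). A toy sketch using two-sided p-values from a z statistic (all numbers are illustrative):

```python
import math
import random

random.seed(2)

def significant_pvalues(true_z, n_sims=20000):
    """Simulate z-tests whose statistic averages true_z; keep p < .05."""
    ps = []
    for _ in range(n_sims):
        z = random.gauss(true_z, 1)
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
        if p < 0.05:
            ps.append(p)
    return ps

def prop_below_01(ps):
    """Share of significant p-values that fall below .01."""
    return sum(p < 0.01 for p in ps) / len(ps)

null_ps = significant_pvalues(0.0)    # flat curve: ~20% of sig. ps below .01
effect_ps = significant_pvalues(3.0)  # right-skewed: most sig. ps below .01
```

Comparing those two proportions is, in spirit, what the p-curve test of evidential value formalizes.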

Slide 72

P-curves: observed vs. baseline (null). [Figure: p-curves (p = .01–.05) for concept-label advantage, online word recognition, gaze following, pointing and vocabulary, statistical sound learning, word segmentation, mutual exclusivity, sound symbolism, IDS preference, phonotactic learning, and native and non-native vowel discrimination] (Lewis et al., 2016)

Slide 73

Steps for conducting a meta-analysis
1. Identify phenomenon of interest
2. Literature search
3. Code data reported in papers
4. Calculate study-level effect sizes
5. Pool effect sizes across studies, weighting by sample size

Slide 74

Helpful R packages for doing meta-analysis
• metafor (Viechtbauer, 2010): the main workhorse for doing meta-analyses in R (modeling + plotting)
• compute.es (Del Re, 2012) and esc (Lüdecke, 2018): for computing a variety of effect sizes and converting between them
• pwr (Champely, 2020): for estimating study power
• PRISMAstatement (Wasey, 2019): for making PRISMA plots

Slide 75

Wrap-up
• Meta-analysis is a powerful statistical tool for synthesizing existing evidence
• Assess the evidential value of a literature, the strength of an effect, and moderating influences
• Can be used both within a paper and across papers
• A great way to start a new project
• Reproducibility is important, and there are lots of great tools in R for doing MAs
• If you’re thinking of doing an MA, I’d be happy to chat with you about it!