Metalab

mllewis
June 01, 2016

Meta-analyses in language acquisition research

Transcript

  1. 1 June 2016. A Quantitative Synthesis of Language Development Using Meta-Analysis. Molly Lewis, in collaboration with Mika Braginsky, Christina Bergmann, Sho Tsuji, Page Piccinini, Alex Cristia, and Michael C. Frank
  2. Our goal: Build predictive, explanatory theories. Limited data. [Figure: speech input ("aaaaa", "oooooo"), the word “dog”, and /pragmatics/]
  3. Ideally, what would our data look like? Veridical description of

    behavior. e.g., If we think kids of a certain age can discriminate vowels, they actually can. High fidelity. e.g., How good are kids at discriminating vowels? How does this skill change across development? How does the difficulty of this skill compare to other skills?
  4. None
  5. Sources of bias: Low power (failure to detect real effect) (e.g., Button, et al., 2013); Publication bias (“file drawer problem”) (e.g., Rosenthal, 1979); Analytical flexibility (“p-hacking”) (e.g., Simmons, Nelson, & Simonsohn, 2011). Particularly problematic for language development research (small Ns and effects).
  6. Kuhl (2004) Current descriptions of behavior

  7. Limited fidelity of current descriptions: 1) Categorical description; 2) Lack of variability; 3) Cross-domain comparisons difficult. [Figure: success over time]
  8. Meta-analysis as a solution. Effect size as unit of analysis: Quantitative, scale-free measure of “success”. [Figure: proportion of trials fixating novel object (0 to 1), with the observed mean, chance level, and effect size d marked]
  9. Meta-analysis as a solution. To combine studies: treat each study as a sample effect size from a population of studies; aggregate using quantitative methods (e.g., averaging); get a point estimate of the true effect size with a measure of certainty. More precise estimate of effect size than from a single study.
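The aggregation step on this slide can be sketched as an inverse-variance weighted average. This is a minimal illustration, not the talk's analysis code: the function name is hypothetical, and the effect sizes and sampling variances are made-up values chosen only to show how pooling shrinks the uncertainty.

```python
import math

def pool_fixed_effect(effect_sizes, variances):
    """Inverse-variance weighted mean effect size and its standard error."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effect_sizes)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error

# Hypothetical effect sizes (d) and sampling variances, for illustration only.
effect_sizes = [0.4, 0.6, 0.8]
variances = [0.04, 0.02, 0.04]

pooled, se = pool_fixed_effect(effect_sizes, variances)
ci_low, ci_high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled d = {pooled:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

The pooled standard error is smaller than that of any single study in the list, which is the sense in which the combined estimate is more precise.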
  10. Meta-analysis supports theory building Veracity: Method for identifying signatures of

    bias and improving replicability Fidelity: Method for obtaining quantitative descriptions, and comparing across phenomena
  11. Aggregates meta-analyses across phenomena in language development Publicly available [metalab.stanford.edu]

    Summary visualizations of effect sizes, and power calculator Estimate effect sizes for particular phenomena, age, and method
  12. Outline I. The MetaLab dataset II. Assess bias in the

    language development literature III. Toward a theoretical synthesis
  13. Conducting a meta-analysis 1. Select phenomenon of interest 2. Select

    papers via sampling strategy 3. Code statistics reported in papers 4. Calculate effect sizes 5. Pool effect sizes across studies, weighting by sample size
  14. Example: Mutual exclusivity meta-analysis. “Where’s the dofa?” Bion, et al. (2013): For 24 mo, mean proportion of trials fixating on novel object = .65 (SD = .13). [Figure: proportion of trials fixating novel object (0 to 1), with the mean (.65), SD (.13), chance level, and effect size d marked]
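The effect size for this example can be worked out directly from the reported mean and SD. A minimal sketch, assuming chance is 0.5 (the slide marks a chance level but the transcript does not give its value) and assuming d is computed as the distance of the mean from chance in SD units; the function name is hypothetical.

```python
def cohens_d_vs_chance(mean, sd, chance):
    """Standardized distance of an observed mean from chance performance."""
    return (mean - chance) / sd

# Mean and SD are from the slide; chance = 0.5 is an assumption.
d = cohens_d_vs_chance(mean=0.65, sd=0.13, chance=0.5)
print(round(d, 2))  # 1.15
```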
  15. Example: Mutual exclusivity meta-analysis. Pool effect sizes across studies, weighting by sample size, to get a grand effect size estimate. [Forest plot: effect size estimate (axis from -1 to 3) per study, with the grand effect size. Studies (first author, year, age in months, N): 8. spiegel, 2011, 30, 72; 7. markman, 1988, 45, 10; 6. grassman, 2010, 48, 12; 5. grassman, 2010, 24, 12; 4. byers, 2009, 17, 16; 3. bion, 2013, 30, 20; 2. bion, 2013, 24, 25; 1. bion, 2013, 18, 22]
  16. Phenomena in MetaLab Prosody Communication Sounds Words

  17. Overall effect sizes. Random effect models using metafor R package (Viechtbauer, 2010). [Figure: effect size (0 to 3) by phenomenon: Pointing and vocabulary, Gaze following, Online word recognition, Concept-label advantage, Mutual exclusivity, Word segmentation, Statistical sound learning, Vowel discrimination (non-native), Vowel discrimination (native), Phonotactic learning, IDS preference]
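The talk fits its random-effects models with metafor in R; the sketch below is not that code. It is a plain-Python illustration of one common random-effects estimator (DerSimonian-Laird), swapped in here because it has a closed form, whereas metafor's default estimator is REML. The function name and the input values are hypothetical.

```python
import math

def random_effects_dl(effect_sizes, variances):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2."""
    k = len(effect_sizes)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * d for wi, d in zip(w, effect_sizes)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effect_sizes))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * d for wi, d in zip(w_star, effect_sizes)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical inputs, for illustration only.
pooled_re, se_re, tau2 = random_effects_dl([0.2, 0.6, 1.0], [0.04, 0.04, 0.04])
print(f"d = {pooled_re:.2f}, SE = {se_re:.3f}, tau^2 = {tau2:.2f}")
```

When the studies disagree more than sampling error alone would predict, tau^2 is positive and the standard error widens relative to a fixed-effect average.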
  18. [Figure: effect size (d) by age (years), one panel per phenomenon: IDS preference, Phonotactic learning, Vowel discrimination (native), Vowel discrimination (non-native), Statistical sound learning, Word segmentation, Mutual exclusivity, Concept-label advantage, Online word recognition, Gaze following, Pointing and vocabulary]
  19. Outline I. The MetaLab dataset II. Assessing bias in the

    language development literature III. Towards a theoretical synthesis
  20. Detecting bias through meta-analysis. Some expected variability in effect size due to sample size; need to distinguish this from bias. [Figure: Studies 1 to 3 plotted against chance with effect size d, comparing N = 100 and N = 12]
  21. Assessing publication bias. How many “missing” studies would have to exist in order for the overall effect size to be zero? Fail-Safe-N (Orwin, 1983)
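Orwin's fail-safe N can be sketched in a few lines. Note that Orwin's formulation asks how many missing studies would shrink the mean effect to a small criterion value rather than to exactly zero; the criterion used in the talk is not stated in the transcript, so the value below is an assumption, as are the function name and the example inputs.

```python
def orwin_fail_safe_n(k, mean_effect, criterion, missing_mean=0.0):
    """Number of missing studies averaging `missing_mean` needed to pull
    the mean effect of k observed studies down to `criterion`."""
    if criterion <= missing_mean:
        raise ValueError("criterion must exceed the mean of the missing studies")
    return k * (mean_effect - criterion) / (criterion - missing_mean)

# Hypothetical inputs: 10 studies with mean d = 0.5, criterion d = 0.1.
n_missing = orwin_fail_safe_n(k=10, mean_effect=0.5, criterion=0.1)
print(round(n_missing))  # 40
```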
  22. Fail-Safe-N: M = 3914. [Figure: Fail-Safe-N (0 to 10000) by phenomenon: Pointing and vocabulary, Gaze following, Online word recognition, Concept-label advantage, Mutual exclusivity, Word segmentation, Vowel discrimination (non-native), Vowel discrimination (native), IDS preference]
  23. Assessing analytical bias. “P-curve”: Distribution of p-values of a test statistic across a literature (Simonsohn, Nelson, & Simmons, 2014; Simonsohn et al., 2014; Simonsohn, Simmons, & Nelson, 2015). [Figure: proportion of p-values by p-value (.01 to .05), three panels: Baseline (null is true); Observed, evidential value; Observed, p-hacked]
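The descriptive part of a p-curve is a histogram of the significant p-values. The sketch below is a simplified illustration only: the binning rule, function name, and p-values are assumptions, and it omits the formal tests that a full p-curve analysis applies. Under a true null the significant p-values are uniform, so each of five bins would hold about 0.2; a pile-up near zero (right skew) is the pattern associated with evidential value.

```python
def p_curve(p_values, alpha=0.05, n_bins=5):
    """Proportion of significant p-values falling in each equal-width bin."""
    significant = [p for p in p_values if 0 < p < alpha]
    counts = [0] * n_bins
    width = alpha / n_bins
    for p in significant:
        index = min(int(p / width), n_bins - 1)
        counts[index] += 1
    total = len(significant)
    return [c / total for c in counts] if total else [0.0] * n_bins

# Hypothetical p-values; the last one is non-significant and is dropped.
proportions = p_curve([0.001, 0.002, 0.004, 0.012, 0.035, 0.2])
baseline = [1 / 5] * 5  # expected proportions if the null is true
print(proportions)
```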
  24. P-curves. [Figure: proportion of p-values by p-value (.01 to .05), Baseline (Null) vs. Observed, one panel per phenomenon: Phonotactic learning, Vowel discrimination (native), Vowel discrimination (non-native), Word segmentation, Mutual exclusivity, Concept-label advantage]
  25. Meta-analysis helps maximize power for prospective studies. Power (1-β) = probability of rejecting a false null hypothesis. [Figure: Null (H0) and H1 distributions with critical value, α, β, and 1-β]
  26. Improving power: Increase N; Increase d (Krzywinski & Altman, 2003). [Figure: Null (H0) and H1 distributions]
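How power responds to N and d can be sketched with a normal approximation. This is an illustration under stated assumptions, not the MetaLab power calculator: it assumes a two-sided, one-sample (or paired) test and replaces the t distribution with the normal, so it slightly overstates power at small N. The function name is hypothetical.

```python
from statistics import NormalDist

def approximate_power(d, n, alpha=0.05):
    """Approximate power of a two-sided one-sample test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * n ** 0.5  # expected test statistic under H1
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Power rises with either a larger sample or a larger effect size.
for d, n in [(0.5, 16), (0.5, 32), (1.0, 16)]:
    print(f"d = {d}, N = {n}: power = {approximate_power(d, n):.2f}")
```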
  27. Increasing d through method choice. Method choices: Conditioned head-turn; Forced-choice by pointing; High-amplitude sucking; Head-turn preference procedure; Central fixation; Looking-while-listening; Anticipatory eye movements. Grouped as “behavior” or “eye-tracking”.
  28. Increasing d through method choice. [Figure: residualized effect size (-0.25 to 0.75) by response mode (behavior vs. eye-tracking) for IDS preference, Gaze following, Word recognition, Mutual exclusivity, Sound category learning, Vowel discrimination (native), Vowel discrimination (non-native), Word segmentation, Concept-label advantage]
  29. Discussion. Fail-Safe-N suggests that in most cases there would have to be a large number of missing studies for effects to be 0. P-curves do not suggest any evidence of p-hacking. In sum: Literature is veridical and thus should form the basis for theory-building. Use effect sizes to plan sample sizes prospectively, in order to increase power.
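Planning a sample size from a meta-analytic effect size can be sketched as follows. This is a rough normal-approximation sketch for a two-sided, one-sample (or paired) design; an exact t-based calculation would ask for a few more participants. The function name and the example effect sizes are assumptions.

```python
import math
from statistics import NormalDist

def required_n(d, power=0.80, alpha=0.05):
    """Smallest N reaching the target power under a normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / d) ** 2)

# Hypothetical effect sizes: a smaller expected effect needs a larger sample.
print(required_n(0.5))  # 32
print(required_n(1.0))  # 8
```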
  30. Outline I. The MetaLab dataset II. Assessing bias in the

    language development literature III. Toward a theoretical synthesis
  31. Theories of language development Stages hypothesis Continuous, synergistic hypothesis –

    Infants learn phonetic contrasts when supported by word context (Feldman, et al., 2013) – Infants learn word mappings when supported by prosody (Shukla, White, & Aslin, 2011) Linguistic Hierarchy
  32. Theories of language acquisition: Hypothesis space. [Figure: schematic curves of effect size by age under the “Stages” Hypothesis and the “Synergistic” Hypothesis]
  33. [Figure: effect size (d) by age (years), point size by n (25, 50, 75), for Gaze following, Infant directed speech preference, Label advantage in concept learning, Mutual exclusivity, Online word recognition, Phonotactic learning, Statistical sound category learning, Vowel discrimination (native), Vowel discrimination (non-native), Word segmentation]
  34. Evidence for continuous development across the language hierarchy. [Figure: effect size (d) by age (years), panels: prosody, sounds, words, communication; datasets: Gaze following, Infant directed speech preference, Label advantage in concept learning, Mutual exclusivity, Online word recognition, Phonotactic learning, Statistical sound category learning, Vowel discrimination (native), Vowel discrimination (non-native), Word segmentation]
  35. Evidence for continuous development across the language hierarchy. [Figure: effect size (d) by age (years), panels: prosody, sounds, words, communication; response mode: behavior, EEG, eye-tracking, NIRS, other; datasets: Gaze following, Infant directed speech preference, Label advantage in concept learning, Mutual exclusivity, Online word recognition, Phonotactic learning, Statistical sound category learning, Vowel discrimination (native), Vowel discrimination (non-native), Word segmentation]
  36. Limitations: Publication bias; Magnitude of effect may be related to method; Non-representativeness of participant populations; Limited number of similar studies in some domains.
  37. Conclusion Veracity: – Some bias, but evidential value – Use

    ES to calculate power, reduce bias prospectively Fidelity: – Comparisons across phenomena – Build synthetic, precise theories
  38. Toward a quantitative synthesis. [Figure: effect size (d) by age (years), point size by n (25, 50, 75), for Gaze following, Infant directed speech preference, Label advantage in concept learning, Mutual exclusivity, Online word recognition, Phonotactic learning, Statistical sound category learning, Vowel discrimination (native), Vowel discrimination (non-native), Word segmentation]
  39. Thanks! metalab.stanford.edu Kyle MacDonald, Bria Long (Harvard University)