Word tree reconciliation. Adopting biological methods and metaphors in historical linguistics

Talk held at the 27th GeSuS-Jahrestagung "Sprach(en)forschung: Disziplinen und Interdisziplinarität" (University Library Warsaw, 2019/05/30).

Schweikhard

May 30, 2019

Transcript

  1. Word tree reconciliation:
    Adopting biological methods and metaphors in historical linguistics
    N. E. Schweikhard
    Max Planck Institute for the Science of Human History
    Department of Linguistic and Cultural Evolution
    CALC Project
    May 30th, 2019

  2. Table of Contents
    1 Models of Language Change
    2 Tree Reconciliation in Phylogenetics
    3 Word Tree Reconciliation
    4 Derivation Trees
    5 Project Goals

  3. The tree model in historical linguistics
    describes divergence and change through time
    excludes horizontal relations
    large-scale: whole languages
    August Schleicher’s Indo-European tree from 1853

  4. Every word has its own history
    Wave model and neighbor-net
    more inclusive: cover both common ancestry and contact
    but sometimes do not distinguish between the two kinds of relation
    Dimensions of lexical change
    sound change
    borrowing
    word formation
    semantic shift
    [Figure: Borrowing across Indo-European languages (Boc et al. 2010)]

  5. Modeling word formation
    Computational approaches
    borrowing included
    but typically not word formation
    neither is regular in the way sound changes are
    → difficult to detect by algorithms
    Word formation
    unpredictable but follows tendencies
    patterns can show us the most likely history
    involves considering large amounts of data
    there is software for this in evolutionary biology
    → adopting tree reconciliation into linguistics

  6. Tree reconciliation

  7. Tree reconciliation
    [Figure: gene tree with leaves a, b, c, d drawn inside a species tree with leaves A, B, C, D]
    Projecting a gene tree onto a species tree (adapted from Nakhleh 2013)
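
The projection named in the caption above can be illustrated with a minimal sketch of one common reconciliation scheme, lowest-common-ancestor (LCA) mapping. This is not necessarily the method shown on the slide: the tree topologies, the names `SPECIES_TREE`, `GENE_TO_SPECIES`, `leaves`, `lca`, and `duplications`, and the nested-tuple tree encoding are all assumptions made for illustration.

```python
# Toy trees as nested tuples; leaves are strings. Topologies are hypothetical.
SPECIES_TREE = (("A", "B"), ("C", "D"))
GENE_TO_SPECIES = {"a": "A", "b": "B", "c": "C", "d": "D"}


def leaves(tree):
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def lca(species_tree, targets):
    """Return the lowest species-tree node whose leaves include all targets."""
    node = species_tree
    while not isinstance(node, str):
        narrower = [child for child in node if targets <= leaves(child)]
        if not narrower:
            break
        node = narrower[0]
    return node


def duplications(gene_tree, species_tree, gene_to_species):
    """List gene-tree nodes that LCA mapping classifies as duplications."""
    found = []

    def visit(gene_node):
        # A gene leaf maps to the species it was sampled from.
        if isinstance(gene_node, str):
            return frozenset([gene_to_species[gene_node]])
        child_maps = [visit(child) for child in gene_node]
        # An inner gene node maps to the LCA of its children's species nodes.
        mapped = leaves(lca(species_tree, frozenset().union(*child_maps)))
        # Mapping to the same species node as a child signals a duplication.
        if mapped in child_maps:
            found.append(gene_node)
        return mapped

    visit(gene_tree)
    return found


congruent = (("a", "b"), ("c", "d"))
conflicting = (("a", "c"), ("b", "d"))
print(duplications(congruent, SPECIES_TREE, GENE_TO_SPECIES))
print(duplications(conflicting, SPECIES_TREE, GENE_TO_SPECIES))
```

In this toy setup the congruent gene tree needs no duplication, while the conflicting one is explained by a single duplication at its root. Transfers and losses are left out to keep the sketch short.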

  8. Differences & correspondences: biology & linguistics
    Processes (biology vs. linguistics):
    random mutation vs. regular sound change
    horizontal gene transfer vs. borrowing
    gene duplication vs. word formation
    Objects (biology vs. linguistics):
    species vs. language
    genome vs. lexicon
    gene vs. word

  9. Incomplete lineage sorting in linguistics
    [Figure: German Sonne, Swedish sol, and Latin sol placed on a language tree]
    (List et al. 2016, Jacques and List forthcoming, graphic adapted from Nakhleh 2013)

  10. Word trees
    proposed by Gray et al. 2007
    first applied by, e.g.,
    Boc et al. 2010
    Willems et al. 2016
    big data
    focus on borrowing
    yet based only on surface similarity
    no regular sound change,
    no word formation
    → A more exhaustive model is needed

  11. Derivation trees: Cognates meaning ‘life’
    [Figure: tree relating Greek aiōn, Avestan āiiū, and Vedic ā́yu]
    Reconstructing the history of *h₂ai̯-u̯-on- and *h₂oi̯-u-

  14. Derivation trees
    [Figure: derivation tree linking the reconstructed forms *h₂oi̯-u-, *h₂ai̯-u-on-, *h₂i̯-u-h₃on-, *h₂i̯-u-h₃n-on-, and *dl̩h₁gʰ-ó-h₂oi̯-u- to OHG ēwo, Greek aiōn, Old Avestan āiiū and darəgāiiū, Vedic ā́yu, dīrghā́yu, and yúvan, and Latin iūnō]
    A family tree of *h₂ei-u- (based on Wodtko et al. 2008 and Mallory/Adams 2006)

  15. Human- and machine-readable
    ID | LANGUAGE | CONCEPT | FORM | MORPHEMES | COGNATES | ROOTS
    1 | Old High German | eternity | ēwo | ēw o | 1 2 | 1 2
    2 | Ancient Greek | life | aiōn | ai ōn | 1 2 | 1 2
    3 | Old Avestan | life | āiiū | āiiū | 3 | 1
    4 | Old Avestan | long-living | darəgāiiū | darəg a āiiū | 4 5 3 | 3 4 1
    5 | Vedic | life | áyu | áyu | 3 | 1
    6 | Vedic | long-living | dīrghā́yu | dīrgh á ā́yu | 4 5 3 | 3 4 1
    7 | Vedic | young | yúvan | yúv an | 6 7 | 1 5
    8 | Latin | (deity name) | iūnō | iū n ō | 6 8 2 | 1 5 2
    9 | Indo-European | life | *h₂ai̯-u-on- | h₂ai̯u on | 3 2 | 1 2
    10 | Indo-European | life | *h₂oi̯-u- | h₂oi̯u | 1 | 1
    11 | Indo-European | long-living | *dl̩h₁gʰ-ó-h₂oi̯-u- | dl̩h₁gʰ ó h₂oi̯u | 4 5 1 | 3 4 1
    12 | Indo-European | young | *h₂i̯-u-h₃on- | h₂i̯u h₃on | 6 7 | 1 5
    13 | Indo-European | the young one | *h₂i̯-u-h₃n-on- | h₂i̯u h₃n on | 6 8 2 | 1 5 2
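
As a rough illustration of why the wordlist above is machine-readable, the following sketch aligns each word's morphemes with its cognate and root IDs and then groups words by root. The row values are copied from the slide; the names `WORDLIST`, `morpheme_records`, and `words_by_root` and the tuple layout are assumptions made for this example, not the project's actual tooling.

```python
from collections import defaultdict

# (ID, LANGUAGE, CONCEPT, FORM, MORPHEMES, COGNATES, ROOTS), as on the slide.
WORDLIST = [
    (1, "Old High German", "eternity", "ēwo", "ēw o", "1 2", "1 2"),
    (2, "Ancient Greek", "life", "aiōn", "ai ōn", "1 2", "1 2"),
    (3, "Old Avestan", "life", "āiiū", "āiiū", "3", "1"),
    (4, "Old Avestan", "long-living", "darəgāiiū", "darəg a āiiū", "4 5 3", "3 4 1"),
    (5, "Vedic", "life", "áyu", "áyu", "3", "1"),
    (6, "Vedic", "long-living", "dīrghā́yu", "dīrgh á ā́yu", "4 5 3", "3 4 1"),
    (7, "Vedic", "young", "yúvan", "yúv an", "6 7", "1 5"),
    (8, "Latin", "(deity name)", "iūnō", "iū n ō", "6 8 2", "1 5 2"),
    (9, "Indo-European", "life", "*h₂ai̯-u-on-", "h₂ai̯u on", "3 2", "1 2"),
    (10, "Indo-European", "life", "*h₂oi̯-u-", "h₂oi̯u", "1", "1"),
    (11, "Indo-European", "long-living", "*dl̩h₁gʰ-ó-h₂oi̯-u-", "dl̩h₁gʰ ó h₂oi̯u", "4 5 1", "3 4 1"),
    (12, "Indo-European", "young", "*h₂i̯-u-h₃on-", "h₂i̯u h₃on", "6 7", "1 5"),
    (13, "Indo-European", "the young one", "*h₂i̯-u-h₃n-on-", "h₂i̯u h₃n on", "6 8 2", "1 5 2"),
]


def morpheme_records(row):
    """Pair each morpheme of a row with its cognate ID and root ID."""
    _, _, _, form, morphemes, cognates, roots = row
    parts = morphemes.split()
    cognate_ids = [int(value) for value in cognates.split()]
    root_ids = [int(value) for value in roots.split()]
    # The three columns are space-separated and must have equal length.
    if not len(parts) == len(cognate_ids) == len(root_ids):
        raise ValueError(f"misaligned annotation for {form!r}")
    return list(zip(parts, cognate_ids, root_ids))


def words_by_root(rows):
    """Map each root ID to the IDs of the words that contain it."""
    index = defaultdict(list)
    for row in rows:
        for _, _, root_id in morpheme_records(row):
            if row[0] not in index[root_id]:
                index[root_id].append(row[0])
    return dict(index)


print(words_by_root(WORDLIST)[5])
```

The alignment check is the main design point: because MORPHEMES, COGNATES, and ROOTS are parallel lists, a length mismatch flags an annotation error before any analysis runs.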
    Source | Source-ID | Target | Target-ID | Change
    *h₂ai̯-u-on- | 1 | aiōn | 2 | sound change
    *h₂oi̯-u- | 3 | *h₂ai̯-u-on- | 1 | e-grade, on-suffix
    *h₂oi̯-u- | 3 | *dl̩h₁gʰ-ó-h₂oi̯-u- | 4 | compound with *dl̩h₁gʰ-ó-
    *dl̩h₁gʰ-ó-h₂oi̯- | 7 | dīrghā́yu | 8 | sound change
    ... | ... | ... | ... | ...
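
One way such an edge list could be used is to trace the derivation history of a single form back to its oldest listed source. The sketch below takes only the first three rows of the table above and keys nodes by form rather than by ID. The names `EDGES` and `history` are hypothetical, and the code assumes that every form has at most one source, which a real dataset may not guarantee.

```python
# (source, target, change) triples, copied from the first three table rows.
EDGES = [
    ("*h₂ai̯-u-on-", "aiōn", "sound change"),
    ("*h₂oi̯-u-", "*h₂ai̯-u-on-", "e-grade, on-suffix"),
    ("*h₂oi̯-u-", "*dl̩h₁gʰ-ó-h₂oi̯-u-", "compound with *dl̩h₁gʰ-ó-"),
]


def history(form, edges):
    """Return the chain of (source, change, target) steps leading to form.

    Assumes each form has at most one source (a tree, not a network).
    """
    parent = {target: (source, change) for source, target, change in edges}
    steps = []
    seen = {form}
    while form in parent:
        source, change = parent[form]
        if source in seen:
            raise ValueError("cycle in derivation edges")
        steps.append((source, change, form))
        seen.add(source)
        form = source
    # Steps were collected from youngest to oldest; report oldest first.
    return list(reversed(steps))


for source, change, target in history("aiōn", EDGES):
    print(f"{source} > {target} ({change})")
```

Keeping the change type on every edge is what would later allow counting word formation patterns or sound changes across the whole database.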

  18. Database
    pilot study
    about 100 Indo-European roots
    based on established findings, e.g. NIL (Wodtko et al. 2008):
    focus on productive word classes
    reliable etymologies
    manageable number of possible reconstructions per attested word

  19. Application possibilities
    Digitizing etymological relations allows for quantitative studies on
    sound correspondences and change
    directions of semantic shift
    word formation patterns through time
    interrelations between meaning and morphological productivity
    and supports us in
    keeping etymological reconstructions consistent
    making etymological data accessible

  20. Thank you for your attention!
    Contact: [email protected]
    http://calc.digling.org/
    CALC members:
    Dr. Johann-Mattis List (Group leader)
    Dr. Yunfan Lai (Post-Doc)
    Dr. Tiago Tresoldi (Post-Doc)
    Mei-Shin Wu (Doctoral student)
    Nathanael E. Schweikhard (Doctoral student)
    Associated:
    Martin J. Kümmel