
Beyond Cognacy

Talk presented at the workshop "Towards a Global Language Phylogeny", 17-19 September, Max Planck Institute for History and the Sciences, Jena.

Johann-Mattis List

September 17, 2014

Transcript

  1. Beyond Cognacy: Current Chances and Future Challenges of Automatic Cognate Detection in Historical Linguistics

    Johann-Mattis List, Forschungszentrum Deutscher Sprachatlas, Philipps-University Marburg, 2014-09-17
  2. Cognate Detection

    [Word cloud: words for "word" in many languages: word, Wort, слово, cuvînt, palabra, mot, adottszó, slovo, verbum, focal, 词, parola, λόγος, शब्द, ord]
  3. Cognate Detection / Traditional Approaches / The Comparative Method

    proof of relationship
    identification of cognates
    identification of sound correspondences
    reconstruction of proto-forms
    internal classification
  5. Cognate Detection / Traditional Approaches / Cognate Detection

    Cognate list and alignment:
      German   dünn   d  ʏ   n      English  thin    θ  ɪ   n
      German   Ding   d  ɪ   ŋ      English  thing   θ  ɪ   ŋ
      German   dumm   d  ʊ   m      English  dumb    d  ʌ   m
      German   Dorn   d  ɔɐ  n      English  thorn   θ  ɔː  n

    Correspondence list:
      GER  ENG  Frequ.
      d    θ    3 x
      d    d    1 x
      n    n    2 x
      m    m    1 x
      ŋ    ŋ    1 x
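The correspondence list on this slide can be derived mechanically from the aligned word pairs. The following is a minimal sketch, assuming that each alignment is stored as two equally long segment lists and that only consonant columns are counted, as on the slide; the names `alignments`, `VOWELS`, and `count_correspondences` are illustrative and not taken from the talk.

```python
from collections import Counter

# Aligned German/English cognate pairs from the slide, one segment per column.
alignments = [
    (["d", "ʏ", "n"], ["θ", "ɪ", "n"]),    # dünn / thin
    (["d", "ɪ", "ŋ"], ["θ", "ɪ", "ŋ"]),    # Ding / thing
    (["d", "ʊ", "m"], ["d", "ʌ", "m"]),    # dumm / dumb
    (["d", "ɔɐ", "n"], ["θ", "ɔː", "n"]),  # Dorn / thorn
]

# Assumption: vowel columns are skipped, since the slide lists consonants only.
VOWELS = {"ʏ", "ɪ", "ʊ", "ʌ", "ɔɐ", "ɔː"}


def count_correspondences(pairs):
    """Count how often each (German, English) segment pair shares a column."""
    counts = Counter()
    for german, english in pairs:
        for g, e in zip(german, english):
            if g in VOWELS or e in VOWELS:
                continue
            counts[(g, e)] += 1
    return counts


counts = count_correspondences(alignments)
for (g, e), freq in counts.most_common():
    print(f"{g}  {e}  {freq} x")
```

With the four pairs above, the counts agree with the frequencies in the final version of the slide.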
  11. Cognate Detection / Automatic Approaches / Narrowing down the Task

    Traditional Workflow
    [Diagram with three labelled components: DICTIONARIES; WORDLISTS (HAND [hænd], FOOT [fʊt], EARTH [ɜːrθ], TREE [triː], BARK [bɑːrk]); HISTORICAL SCENARIOS (*dent-: dente, dɑ̃, dɛnte; *tanθ: tuːθ, t͡saːn)]
  14. Cognate Detection / Automatic Approaches / Narrowing down the Task

    Technical Workflow
    [Pipeline diagram; every stage is illustrated with the sample entries HAND [hænd], FOOT [fʊt], EARTH [ɜːrθ], TREE [triː], BARK [bɑːrk]]
    RAW DATA → (Semantic Tagging) → WORDLIST DATA → (Tokenization) → TOKENS, MORPHEMES → (Cognate Detection) → COGNATE SETS → (Alignment Analysis) → SOUND CORRESPONDENCES → (Linguistic Reconstruction) → PROTO-FORMS
  16. Cognate Detection / Automatic Approaches / Narrowing down the Task

    Technical Workflow
    INPUT: Multilingual wordlist
      → semantically tagged
      → phonetically transcribed
      → tokenized into phonemes
    OUTPUT: Multilingual wordlist
      → identified cognate entries assigned to clusters
      → identified cognate entries multiply aligned
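The tokenization requirement on the input can be illustrated with a small sketch. It assumes a deliberately naive rule set: combining diacritics and a few modifier letters attach to the preceding segment, and a tie bar also pulls in the following character. The function name `tokenize` and the `MODIFIERS` set are assumptions made for illustration; this is not the tokenizer used in LingPy.

```python
import unicodedata

# Assumption: these modifier letters always belong to the preceding segment.
MODIFIERS = {"\u02d0", "\u02b0", "\u02b2", "\u02b7"}  # ː ʰ ʲ ʷ
TIE_BARS = {"\u0361", "\u035c"}  # combining tie bars above and below


def tokenize(ipa):
    """Split an IPA string into segments with a few naive attachment rules."""
    tokens = []
    join_next = False  # True right after a tie bar
    for ch in ipa:
        attaches = unicodedata.combining(ch) > 0 or ch in MODIFIERS
        if tokens and (attaches or join_next):
            tokens[-1] += ch
        else:
            tokens.append(ch)
        join_next = ch in TIE_BARS
    return tokens


# Forms taken from the workflow slides: *tanθ, tuːθ, t͡saːn
for form in ["tan\u03b8", "tu\u02d0\u03b8", "t\u0361sa\u02d0n"]:
    print(form, "->", " ".join(tokenize(form)))
```

Real transcriptions need a richer inventory (tones, pre-aspiration, diphthongs), so the rule set here should be read as a starting point only.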
  17. Cognate Detection / Automatic Approaches / Algorithms

    Basic Procedure for Multilingual Cognate Detection
    WORDLIST DATA → (pairwise comparison) → PAIRWISE DISTANCES BETWEEN WORDS → (cognate clustering) → COGNATE SETS
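The pairwise comparison step can be sketched with the normalized edit distance (NED), one of the baseline measures named later in the deck. This is a minimal illustration, assuming words are given as lists of segments and that the edit distance is normalized by the length of the longer word; the function names are illustrative, and the scores produced here are not the ones shown on the slides, which come from a different method.

```python
def edit_distance(a, b):
    """Levenshtein distance between two segment sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(a, b):
    """Edit distance divided by the length of the longer sequence."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def pairwise_distances(words):
    """Map each unordered pair of taxa to the distance between their words."""
    names = list(words)
    return {
        (p, q): normalized_edit_distance(words[p], words[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Simplified forms as they appear in the clustering example of the deck.
words = {"German": list("frau"), "Dutch": list("frou"), "Swedish": list("kvina")}
distances = pairwise_distances(words)
print(distances)
```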
  19. Cognate Detection / Automatic Approaches / Algorithms

    Cognate Clustering

    Analysis table (input):
      ID   Taxa       Word    Gloss  GlossID  IPA
      ...  ...        ...     ...    ...      ...
      21   German     Frau    woman  20       frau
      22   Dutch      vrouw   woman  20       vrɑu
      23   English    woman   woman  20       wʊmən
      24   Danish     kvinde  woman  20       kvenə
      25   Swedish    kvinna  woman  20       kviːna
      26   Norwegian  kvine   woman  20       kʋinə
      ...  ...        ...     ...    ...      ...

    Pairwise distances:
                                 Swedish  English  Danish  Norwegian  Dutch  German
                                 kvinna   woman    kvinde  kvine      vrouw  Frau
      Swedish    kvinna  kvina   0.00     0.69     0.07    0.12       0.71   0.78
      English    woman   wumin   0.69     0.00     0.66    0.57       0.68   0.87
      Danish     kvinde  kveni   0.07     0.66     0.00    0.08       0.67   0.71
      Norwegian  kvine   kwini   0.12     0.57     0.08    0.00       0.75   0.74
      Dutch      vrouw   frou    0.71     0.68     0.67    0.75       0.00   0.17
      German     Frau    frau    0.78     0.87     0.71    0.74       0.17   0.00

    [Cluster tree with the leaves: German Frau (frau), Dutch vrouw (vrou), English woman (wumin), Danish kvinde (kveni), Swedish kvinna (kvina), Norwegian kvine (kwini)]

    Analysis table (output, with cognate IDs):
      ID   Taxa       Word    Gloss  GlossID  IPA     CogID
      ...  ...        ...     ...    ...      ...     ...
      21   German     Frau    woman  20       frau    1
      22   Dutch      vrouw   woman  20       vrɑu    1
      23   English    woman   woman  20       wʊmən   2
      24   Danish     kvinde  woman  20       kvenə   3
      25   Swedish    kvinna  woman  20       kviːna  3
      26   Norwegian  kvine   woman  20       kʋinə   3
      ...  ...        ...     ...    ...      ...     ...
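How a distance matrix becomes a set of cognate clusters can be sketched with a flat agglomerative procedure. The sketch below uses the distances for "woman" from the slide and assumes average linkage with a cut-off of 0.5; both the linkage criterion and the threshold value are assumptions for illustration, not parameters stated in the talk.

```python
from itertools import combinations

taxa = ["Swedish", "English", "Danish", "Norwegian", "Dutch", "German"]
# Pairwise distances for "woman" as shown on the slide.
dist = [
    [0.00, 0.69, 0.07, 0.12, 0.71, 0.78],
    [0.69, 0.00, 0.66, 0.57, 0.68, 0.87],
    [0.07, 0.66, 0.00, 0.08, 0.67, 0.71],
    [0.12, 0.57, 0.08, 0.00, 0.75, 0.74],
    [0.71, 0.68, 0.67, 0.75, 0.00, 0.17],
    [0.78, 0.87, 0.71, 0.74, 0.17, 0.00],
]


def flat_cluster(dist, threshold):
    """Merge the closest clusters (average linkage) until none is below threshold."""
    clusters = [[i] for i in range(len(dist))]

    def average(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average(clusters[pair[0]], clusters[pair[1]]),
        )
        if average(clusters[x], clusters[y]) >= threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return clusters


clusters = flat_cluster(dist, threshold=0.5)  # 0.5 is an assumed cut-off
groups = sorted(sorted(taxa[i] for i in c) for c in clusters)
print(groups)
```

Under these assumptions the three groups correspond to the three cognate IDs in the output table: Dutch and German, English on its own, and the three Scandinavian languages.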
  24. Cognate Detection / Automatic Approaches / Algorithms

    LexStat Algorithm (List 2014)
    [Flowchart with the labels: INPUT; TOKENIZATION; PREPROCESSING; CORRESPONDENCE DETECTION USING PHONETIC ALIGNMENT (LOOP: ATTESTED DISTRIBUTION, EXPECTED DISTRIBUTION, LOG-ODDS); DISTANCE CALCULATION; COGNATE CLUSTERING; OUTPUT]
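The flowchart names an attested distribution, an expected distribution, and log-odds. One common way to combine such distributions is the logarithm of their ratio, sketched below. This is an assumption about the general idea only: the add-one smoothing, the base-2 logarithm, the function name `log_odds`, and the counts are all hypothetical and do not reproduce the exact LexStat scoring.

```python
import math


def log_odds(attested, expected, pseudo=1.0):
    """Score each segment pair by log2 of smoothed attested over expected counts.

    `pseudo` is an assumed add-one style smoothing constant that avoids
    division by zero for pairs missing from one of the distributions.
    """
    pairs = set(attested) | set(expected)
    return {
        pair: math.log2((attested.get(pair, 0) + pseudo)
                        / (expected.get(pair, 0) + pseudo))
        for pair in pairs
    }


# Hypothetical counts for illustration only.
attested = {("d", "θ"): 3, ("d", "d"): 1}
expected = {("d", "θ"): 1, ("d", "d"): 1, ("d", "n"): 3}
scores = log_odds(attested, expected)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

Pairs that occur more often than expected receive positive scores, and pairs that occur less often receive negative ones.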
  25. Cognate Detection / Problems / Applicability

    Method                         Multilingual?  No additional requirements?  Freely available?
    Mackay & Kondrak 2005          ✗              ✓                            ✗
    Bergsma & Kondrak 2007         ✓              ✓                            ✗
    Turchin et al. 2010            ✓              ✓                            ✓
    Berg-Kirkpatrick & Klein 2011  ✗              ✓                            ✗
    Hauer & Kondrak 2011           ✓              ✓                            ✗
    Steiner et al. 2011            ✓              ✓                            ✗
    List 2012 & 2014               ✓              ✓                            ✓
    Beinborn et al. 2013           ✗              ?                            ✗
    Bouchard-Côté et al. 2013      ✓              ✗                            ✗
    Rama 2013                      ✗              ✓                            ✗
    Ciobanu & Dinu 2014            ✗              ✓                            ✗
    …                              …              …                            …
  29. Cognate Detection / Problems / Transparency

    Results are often only reported as evaluation scores.
    Examples of individual cognate judgments are rare.
    Supplementary data
    – is often lacking, or
    – is not given in a human-readable form.
    → The results show a great lack of transparency.
  33. Cognate Detection / Problems / Comparability

    Test sets (benchmarks) vary greatly.
    Often, only subsets of Dyen et al. (1992) are used.
    → It is difficult to compare the performance of the methods.
  35. Cognate Detection / Problems / Accuracy

    Evaluation criteria are not very intuitive and vary greatly.
    It is difficult to communicate the results to traditional linguists.
    → Many linguists regard automatic cognate detection as
    – “impossible per se”, or
    – as useful as “rolling a dice”.
  38. Chances / Applicability

    [Logos: PyPi, GitHub, SourceForge, GoogleCode, CPAN, CTAN, JSAN, PEAR, LaunchPad]
    It has never been easier to publish and maintain code.
  39. Chances / Applicability / LingPy

    What is LingPy?
    Python library for automatic tasks in historical linguistics
    project homepage: http://lingpy.org
    code base: https://github.com/lingpy/lingpy
    supports Python2 and Python3
    works on Mac, Linux, and (basically also) Windows
    current release: 2.3
  40. Chances / Applicability / LingPy

    What does LingPy offer?
    tokenization of phonetic sequences
    phonetic alignment analyses (List 2012a)
    automatic cognate detection (Turchin 2010, List 2012b)
    automatic borrowing detection (List et al. 2014)
    basic routines for the evaluation of automatic methods
    plotting routines for interactive visualizations
  41. Chances / Transparency / Interactive Presentation of Results

    Alignments offer a unique perspective on the results of cognate detection analyses.
    JavaScript and HTML5 offer unique ways for interactive data visualization.
    We are currently developing JavaScript tools that
    – visualize phonetic alignments of cognate sets, and
    – even allow the data to be edited online.
  42. Chances / Comparability / Benchmark Databases for Historical Linguistics

    First benchmark databases have been compiled and published:
    Benchmark Database of Phonetic Alignments (BDPA, List & Prokić 2014, http://alignments.lingpy.org)
    Benchmark Database for Cognate Detection (BDCD, presented in List 2014, http://sequencecomparison.github.io)
    Benchmark Database for Linguistic Reconstruction (BDLR, in preparation)
  43. Chances / Comparability / Benchmark Databases for Historical Linguistics

    All data is given in phonetic transcriptions (IPA), tokenized into phonemic units, freely available for download, and can be directly used in LingPy.
  44. Chances / Accuracy / Performance of Cognate Detection Algorithms

    [Bar chart: B-Cubed F-Scores on the BDCD Benchmark (List 2014). Methods: Turchin, NED, SCA, LexStat. Test sets: Bai (Tibeto-Burman), Indo-European, Japanese and Ryukyu, Ob-Ugrian, Austronesian, Sinitic (Chinese Dialects). Score axis from 60 to 95. Values highlighted on the slides: 75%, 93%, 92%, 81%, 89%, 81%.]
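The B-Cubed F-score named in the chart compares a proposed clustering with a gold standard item by item. The following is a minimal sketch of the standard definition, assuming both clusterings are given as parallel lists of cluster labels; the function name `b_cubed` and the toy labels are illustrative and are not benchmark data.

```python
def b_cubed(gold, test):
    """Return B-Cubed precision, recall, and F-score for two label lists."""
    if len(gold) != len(test) or not gold:
        raise ValueError("gold and test must be non-empty and equally long")
    n = len(gold)
    precision = recall = 0.0
    for i in range(n):
        same_test = {j for j in range(n) if test[j] == test[i]}
        same_gold = {j for j in range(n) if gold[j] == gold[i]}
        overlap = len(same_test & same_gold)
        precision += overlap / len(same_test)  # purity of item i's test cluster
        recall += overlap / len(same_gold)     # coverage of item i's gold cluster
    precision /= n
    recall /= n
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example: the test clustering wrongly merges the third word into cluster 1.
gold = [1, 1, 2, 3, 3, 3]
test = [1, 1, 1, 3, 3, 3]
precision, recall, f_score = b_cubed(gold, test)
print(round(precision, 3), round(recall, 3), round(f_score, 3))
```

Because the scores are averaged over items rather than clusters, a single wrongly merged word lowers precision without affecting recall, which makes individual errors easier to explain than with a single aggregate score.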
  47. Challenges / Within Cognacy

    We need to enhance our
    lexical databases (amount and quality of data),
    cognate detection algorithms (accessibility and performance), and
    ways to present the results (interactive visualizations).
  50. Challenges / Beyond Cognacy

                "MOON"
    German      m  oː  n  t  -
    English     m  uː  n  -  -
    Danish      m  ɔː  n  -  ə
    Swedish     m  oː  n  -  e

                "MOON"            "SHINE"           "LIGHT"
    Fúzhōu      ŋ  u  o  ʔ  ⁵     -  -  -  -  -     -  -  -  -  -
    Měixiàn     ŋ  i  a  t  ⁵     -  -  -  -  -     k  u  o  ŋ  ⁴⁴
    Guǎngzhōu   j  -  y  t  ²     l  -  œ  ŋ  ²²    -  -  -  -  -
    Běijīng     -  y  ɛ  -  ⁵¹    l  i  ɑ  ŋ  -     -  -  -  -  -
  53. Challenges / Beyond Cognacy

    [Tree diagram over Fúzhōu, Měixiàn, Guǎngzhōu, and Běijīng, with branch events labelled INNOVATION (five times), BORROWING, and LOSS]
  54. Challenges / Beyond Cognacy / Lexical Change

    Three Dimensions of Lexical Change (Gévaudan 2007)
    [Diagram with three axes: SEMANTIC CHANGE, MORPHOLOGICAL CHANGE, STRATIC CHANGE]
  55. Challenges / Beyond Cognacy / Lexical Change

                                                       Continuity
    Relation                        Biolog. Term   Stratic  Morphological  Semantic
    traditional notion of cognacy   -              +        +/-            +/-
    cognacy à la Swadesh            -              +        +/-            +
    automatic cognate detection     -              +/-      +/-            +
    direct cognate relation         orthology      +        +              +
    oblique cognate relation        paralogy       +        -              +
    etymological relation           homology       +/-      +/-            +/-
    oblique etymological relation   xenology       -        +/-            +/-
  56. Challenges / Beyond Cognacy / Inferring Lexical Change Scenarios

    In order to go beyond cognacy, we need methods for
    borrowing detection (stratic aspect),
    partial cognate inference (morphological aspect), and
    cross-semantic cognate inference (semantic aspect).
    Following the lead of evolutionary biology, these methods should be combined under a unified framework of tree reconciliation (Page & Cotton 2002) in historical linguistics.
  61. Challenges / Beyond Cognacy / Tree Reconciliation

    General Workflow for the Inference of Lexical Change Scenarios
    [Diagram with the labels: PHYLOGENETIC RECONSTRUCTION; COGNATE (=HOMOLOG) DETECTION; COGNATE TREE RECONCILIATION]
  62. Conclusion

    Automatic cognate detection is still in its infancy, yet the child is constantly growing.
    Enhancing the applicability, transparency, comparability, and accuracy of cognate detection methods is a goal that can be achieved in the near future.
    The greatest challenge arises from the complexity of lexical change processes.
    More realistic approaches that go beyond cognacy should be able to handle variation along the stratic, the morphological, and the semantic dimension of lexical change.
    Evolutionary biology offers frameworks that could be employed to achieve these goals, yet it is not entirely clear whether and how this is possible.