Computer-Assisted Language Comparison
Johann-Mattis List
Research Group “Computer-Assisted Language Comparison”
Department of Linguistic and Cultural Evolution
Max-Planck Institute for the Science of Human History
Jena, Germany
2017-10-21

Comparative Linguistics
"All languages change, as long as they exist." (August Schleicher 1863)
[Figure: sound-by-sound comparison of one word across four language stages]
Indo-European  p  ə   t  eː  r
Germanic       f  a   d  eː  r
Old English    f  æ   d  e   r
English        f  ɑː  ð  ə   r
[Figure labels: walkman, iPod; Germanic, German, English; nodes labeled L₁, L₂, L₃]
Examples: Cross-Linguistic Colexifications
Polysemy: a word has two or more meanings that are historically related.
Homophony: two words that do not share a common etymological history have an identical pronunciation.
Colexification (coined by François 2008): one word form denotes several meanings.
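The definition of colexification can be sketched in a few lines of Python. This is only a minimal illustration, assuming a wordlist of (language, concept, form) triples; the language names, concept labels, and forms below are invented placeholders, and the function name `colexifications` is hypothetical. Because the test is plain identity of form, it does not distinguish polysemy from homophony.

```python
from collections import defaultdict
from itertools import combinations

# Toy wordlist of (language, concept, form) triples.
# All languages and forms are invented placeholders, not real data.
WORDLIST = [
    ("LangA", "ARM", "sela"),
    ("LangA", "HAND", "sela"),
    ("LangA", "TREE", "motu"),
    ("LangB", "ARM", "bino"),
    ("LangB", "HAND", "bino"),
    ("LangB", "TREE", "kefa"),
    ("LangB", "WOOD", "kefa"),
    ("LangC", "ARM", "dulo"),
    ("LangC", "HAND", "rasi"),
]


def colexifications(wordlist):
    """Map each unordered concept pair to the set of languages colexifying it."""
    # Collect the concepts expressed by each form within each language.
    by_form = defaultdict(set)
    for language, concept, form in wordlist:
        by_form[(language, form)].add(concept)

    # Every pair of concepts sharing one form in one language is a colexification.
    pairs = defaultdict(set)
    for (language, _form), concepts in by_form.items():
        for a, b in combinations(sorted(concepts), 2):
            pairs[(a, b)].add(language)
    return dict(pairs)


result = colexifications(WORDLIST)
for pair, languages in sorted(result.items()):
    print(pair, len(languages), sorted(languages))
```

Counting the languages per concept pair gives a rough picture of how widespread a given meaning association is in the sample.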
Database of Cross-Linguistic Colexifications
CLICS (List et al. 2014, http://clics.lingpy.org) is an online database of synchronic lexical associations (“colexifications”) in currently 221 language varieties of the world. Large databases offering lexical information on the world’s languages are already readily available for research in different online sources. However, the information on tendencies of meaning associations they enshrine is not easily extractable from these sources themselves.
We are currently substantially revising the data in CLICS and hope to release a much larger and consistently enhanced version in the first half of next year.
In addition to the original CLICS database, we are also testing algorithms that measure compoundhood across languages.
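The slides do not say how compoundhood is measured. Purely as an illustration of what such a measure could look like, the following sketch flags a word as a compound candidate when it splits into two other words of the same lexicon, and reports the share of such words. The function names and the score itself are assumptions made for this example, not the algorithms under test.

```python
def compound_candidates(lexicon):
    """Return {word: (left, right)} for words that split into two other lexicon words."""
    forms = set(lexicon)
    found = {}
    for word in sorted(forms):
        # Try every split point; keep the first split where both halves are words.
        for i in range(1, len(word)):
            left, right = word[:i], word[i:]
            if left in forms and right in forms:
                found[word] = (left, right)
                break
    return found


def compoundhood(lexicon):
    """Hypothetical score: share of lexicon words that are compound candidates."""
    forms = set(lexicon)
    if not forms:
        return 0.0
    return len(compound_candidates(forms)) / len(forms)


# Small English toy lexicon, chosen only to make the behavior visible.
ENGLISH_TOY = ["fire", "place", "fireplace", "water", "fall", "waterfall", "tree"]

candidates = compound_candidates(ENGLISH_TOY)
score = compoundhood(ENGLISH_TOY)
print(candidates)
print(round(score, 3))
```

A string-based test like this ignores sound changes at morpheme boundaries and accidental matches, so it can only serve as a first approximation.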
Examples: Rhyme Analysis
Rhyme analysis is crucial for Old Chinese phonology.
It emerged when scholars of the Suí 隋 (581–618) and Táng 唐 (618–907) dynasties realized that old poems, especially those in the Book of Odes (Shījīng 詩經, ca. 1050–600 BCE), had many inconsistencies in the rhyming of words.
Later scholars from the Míng 明 (1368–1644) and Qīng 清 (1644–1911) dynasties realized that the inconsistencies in the rhyme patterns reflect the effects of language change.
The Shījīng Browser
In order to make it more convenient for readers to investigate the data underlying this paper in full detail, an interactive web-based application was created. This freely available Shījīng Browser (http://digling.org/shijing/) lists all potential rhyme words in tabular form along with additional information, including the pīnyīn transliteration, the Middle Chinese reading, the reconstruction by Baxter and Sagart (ibid.), the reading by Pān (2000), the GSR index (Karlgren 1957), and the number of the poem, stanza, and section.
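One row of such a table could be modeled as follows. This is a sketch under assumptions: the class name `RhymeWord`, the field names, and the helper `group_by_stanza` are hypothetical and do not reflect the browser's actual data format, and all values are placeholders rather than real readings.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RhymeWord:
    """One potential rhyme word with the columns listed for the browser (names assumed)."""
    word: str            # the rhyme word itself
    pinyin: str          # pīnyīn transliteration
    middle_chinese: str  # Middle Chinese reading
    baxter_sagart: str   # reconstruction by Baxter and Sagart
    pan: str             # reading by Pān (2000)
    gsr_index: str       # GSR index (Karlgren 1957)
    poem: int
    stanza: int
    section: int


def group_by_stanza(rows):
    """Group rhyme words by (poem, stanza) so each stanza can be inspected together."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row.poem, row.stanza)].append(row)
    return dict(groups)


# Placeholder rows: every value is a dummy string, not an actual reading.
ROWS = [
    RhymeWord("W1", "py-1", "mc-1", "oc-1", "pan-1", "gsr-1", 1, 1, 1),
    RhymeWord("W2", "py-2", "mc-2", "oc-2", "pan-2", "gsr-2", 1, 1, 2),
    RhymeWord("W3", "py-3", "mc-3", "oc-3", "pan-3", "gsr-3", 1, 2, 1),
]

groups = group_by_stanza(ROWS)
for (poem, stanza), words in sorted(groups.items()):
    print(poem, stanza, [w.word for w in words])
```

Grouping by poem and stanza puts the words that might rhyme with each other side by side, which is the natural unit for checking rhyme patterns.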
Outlook
In linguistics, as in science in general, we need to know what we are capable of and what we are not. If we keep on comparing languages manually, ignoring all the recent technical improvements, we will necessarily fail. On the other hand, if we blindly trust algorithms instead of experts' knowledge and intuition, we will also fail. We need integrated frameworks for historical language comparison in which the best of the two worlds is combined!