Slide 1

Slide 1 text

Introduction Two Models of Language Evolution Testing the Models of Language Evolution Conclusion Do Roots Really Grow Trees? Quantitative Root-Based Approaches in Historical Linguistics Hans Geisler, Johann-Mattis List August 26, 2010 1 / 33

Slide 2

Slide 2 text

Structure of the Talk

- Introduction
  - Comparison and Reconstruction
  - The Root Concept in Historical Linguistics
  - Lexicostatistics vs. Root-Based Approaches
- Two Models of Language Evolution
  - The Separation Base Method
  - Etymostatistics
  - Phylogenetic Reconstruction
  - Comparison of the Models
- Testing the Models of Language Evolution
  - Simulations of the Evolutionary Models
  - Testing the Models on Real Data
- Conclusion
  - Model-Internal Problems
  - Models and Reality

Slide 3

Slide 3 text

Introduction

- Comparison and Reconstruction
- The Root Concept in Historical Linguistics
- Lexicostatistics vs. Root-Based Approaches

Slide 4

Slide 4 text

Comparison and Reconstruction

Goal of Comparison
One major goal of comparison in historical linguistics is to reconstruct how genetically related languages evolved from a common ancestor language.

Characters of Comparison
The characters of comparison differ between approaches in historical linguistics. The leading question in character selection is always whether a specific sample of characters is meaningful for phylogenetic reconstruction.

Slide 5

Slide 5 text

The Root Concept in Historical Linguistics

[Figure: reflexes of the root d(e)h3 from Indo-European through Latin to Romance. Labels attached to the root in the figure: tis, tom, si, no, si, m.]

Indo-European: d(e)h3
Latin: datum “given”, dōnāre “present”, dōnum “gift”, dare “to give”, dōs “dowry”
Romance: date “date” (French), douna “give” (Provençal), don “gift” (Spanish), dar “give” (Portuguese), dote “dowry” (Italian)

Slide 6

Slide 6 text

Lexicostatistics vs. Root-Based Approaches

| | Lexicostatistics | Root-Based Approaches |
|---|---|---|
| Evolutionary model | replacement of words denoting basic concepts in semantic meaning slots | gain and loss of roots |
| Comparanda | words denoting the same basic concepts | words which can be traced back to a single root (“word families”) |
| Method of comparison | comparative method | comparative method |
| Characters | basic concepts | roots (proto-forms) |

Slide 7

Slide 7 text

Lexicostatistics vs. Root-Based Approaches

| Concept | Italian | Romanian | Spanish | French | Latin |
|---|---|---|---|---|---|
| BIRD | - | pasăre | pássaro | - | passer |
| | uccello | - | ave | oiseau | avis |

Table: The Lexicostatistical Analysis for the Concept BIRD

| Root | Meaning | Italian | Romanian | Spanish | French |
|---|---|---|---|---|---|
| passer | “sparrow” | passero | pasăre | pássaro | passereau |
| avis | “bird” | uccello | - | ave | oiseau |

Table: Root-Based Analysis for Latin passer “sparrow” and avis “bird”

Slide 8

Slide 8 text

Lexicostatistics vs. Root-Based Approaches

Apparent Advantages of Root-Based Approaches
- Root-based approaches do not depend on the basic vocabulary assumption.
- The dataset is not restricted to the realm of basic vocabulary.
- Using roots (proto-forms) as the primary characters of comparison comes closer to the framework of the comparative method.

Slide 9

Slide 9 text

Two Models of Language Evolution

- The Separation Base Method (Holm 2000 & 2008)
- Etymostatistics (Starostin 2000[1989])
- Phylogenetic Reconstruction
- Comparison of the Models

Slide 10

Slide 10 text

Evolutionary Model of the Separation Base Method

[Figure: tree in which L1234 splits into L12 and L34, L12 into L1 and L2, and L34 into L3 and L4. Legend: roots inherited from the common ancestor language; roots lost after the split from the ancestor language.]

Slide 11

Slide 11 text

Evolutionary Model of the Separation Base Method

[Figure: the languages L1, L2, L3, and L4.]

Slide 12

Slide 12 text

Datasets for the Separation Base Method

| Language | Value | Coding |
|---|---|---|
| Proto | *h2ent- | 1 |
| Hittite | hant- | 1 |
| Old Indian | ánti | 1 |
| Avestan | - | 0 |
| Armenian | - | 0 |
| Greek | antí | 1 |
| Slavic | - | 0 |
| Baltic | ãnt-i | 1 |
| Germanic | *anθ-ia | 1 |
| Latin | ante | 1 |
| Celtic | *antono | 1 |
| Albanian | - | 0 |
| Tokharian | ānt | 1 |

Table: Coding of data according to the Separation Base Method
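The coding in the table can be sketched in a few lines of Python. This is only an illustration: the function name `code_root` and the use of `None` for a missing reflex are assumptions made here, not part of the method's published description. The forms themselves are taken from the table.

```python
# Reflexes of the root from the table; None marks a language without a reflex.
reflexes = {
    "Hittite": "hant-",
    "Old Indian": "ánti",
    "Avestan": None,
    "Armenian": None,
    "Greek": "antí",
    "Slavic": None,
    "Baltic": "ãnt-i",
    "Germanic": "*anθ-ia",
    "Latin": "ante",
    "Celtic": "*antono",
    "Albanian": None,
    "Tokharian": "ānt",
}


def code_root(forms):
    """Hypothetical helper: 1 if a language has a reflex of the root, else 0."""
    return {language: int(form is not None) for language, form in forms.items()}


coding = code_root(reflexes)
retained = sum(coding.values())
print(f"{retained} of {len(coding)} languages retain the root")
```

Repeating this for every reconstructable root would yield one binary vector per language, which is the shape of input the later reconstruction steps work on.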

Slide 13

Slide 13 text

Evolutionary Model of Etymostatistics

[Figure: tree in which L1234 splits into L12 and L34, L12 into L1 and L2, and L34 into L3 and L4. Legend: roots inherited from the common ancestor language; innovations at different stages of language evolution.]

Slide 14

Slide 14 text

Evolutionary Model of Etymostatistics

[Figure: the languages L1, L2, L3, and L4.]

Slide 15

Slide 15 text

Datasets for Etymostatistics

1. Take whatever text you like for a given language and select from it all non-borrowed lexical roots.
2. Exclude all prefixes, suffixes, and proper names, and count each root only once.
3. For each root in this set, check with the help of etymological dictionaries whether it has a reflex in the other genetically related languages you want to investigate.
4. Compute the similarity of the text language to the other languages by calculating the percentage of roots reflected in the other languages.
5. Repeat the procedure for the other languages you want to investigate by changing the text language and selecting different texts for the investigation.
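Steps 2 to 4 of the procedure can be sketched as follows. The function `root_similarity`, the root labels `r1` to `r4`, and the language labels `B` and `C` are hypothetical placeholders; in a real study the reflex table would be compiled by hand from etymological dictionaries.

```python
def root_similarity(text_roots, reflex_table, languages):
    """Percentage of the text's roots that have a reflex in each language.

    text_roots   -- roots extracted from the text (duplicates allowed)
    reflex_table -- maps each root to the set of languages with a reflex
    languages    -- the languages to compare the text language against
    """
    roots = set(text_roots)  # step 2: count each root only once
    similarity = {}
    for language in languages:
        # step 3: look up whether the root has a reflex in this language
        shared = sum(1 for root in roots
                     if language in reflex_table.get(root, set()))
        # step 4: percentage of roots reflected in this language
        similarity[language] = 100.0 * shared / len(roots)
    return similarity


# Toy data (hypothetical): four distinct roots, one of them occurring twice.
text_roots = ["r1", "r2", "r2", "r3", "r4"]
reflex_table = {"r1": {"B", "C"}, "r2": {"B"}, "r3": set(), "r4": {"B", "C"}}
scores = root_similarity(text_roots, reflex_table, ["B", "C"])
print(scores)  # {'B': 75.0, 'C': 50.0}
```

Step 5 would correspond to calling the function again with a text, and a reflex table, for each of the other languages. The resulting percentages need not be symmetric, because each direction is based on a different sample of roots.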

Slide 16

Slide 16 text

Datasets for Etymostatistics

“Das kräftige Wirtschaftswachstum [...] [hat] die Stimmung der Verbraucher [...] weiter aufgehellt.” (Spiegel ONLINE, 2010/08/26) [1]

| Word | Meaning | “Lemma” | Root | Reflex | Coding |
|---|---|---|---|---|---|
| Das | “that” | das | *þat | that | 1 |
| kräftige | “strong” | Kraft | *kraftiz | craft | 1 |
| Wirtschaftswachstum | “economic growth” | Wirt | *werđuz | - | 0 |
| hat | “has” | haben | *xaƀēnan | to have | 1 |
| [die] | = das | | | | |
| Stimmung | “mood” | Stimme | *stemnō | - | 0 |
| [der] | = das | | | | |
| Verbraucher | “consumer” | Brauch | *brūkanan | to brook | 1 |
| weiter | “further” | weit | *wīđaz | wide | 1 |
| aufgehellt | “brightened” | “hell” | OHG hellan | - | 0 |

[1] Translation: “The strong economic growth has further brightened the mood of the consumers.”

Slide 17

Slide 17 text

Phylogenetic Reconstruction

Distance-Based Methods
Convert the binary data into distances and analyze them with common clustering algorithms (e.g. Neighbor-Joining, cf. Saitou & Nei 1987; UPGMA, cf. Sokal & Michener 1958).

Character-Based Methods
Take the data in binary form and analyze it with algorithms that explain the distribution of characters according to certain evolutionary models (e.g. probabilistic models, cf. Ronquist 2003; parsimony models, cf. Camin & Sokal 1965).
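To make the distance-based route concrete, here is a minimal sketch that converts binary root vectors into uncorrected distances and clusters them by average linkage (UPGMA). The functions `hamming_distance` and `upgma` and the toy vectors for L1 to L4 are written for this illustration only; they are deliberately naive, and an actual analysis would more likely rely on an established phylogenetics package.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Proportion of characters in which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(data):
    """Naive average-linkage clustering; returns the tree as nested tuples."""
    leaf_dist = {frozenset((a, b)): hamming_distance(data[a], data[b])
                 for a, b in combinations(data, 2)}
    # Each cluster is a pair (tree, leaves): tree is a name or a nested tuple.
    clusters = [(name, [name]) for name in data]

    def average(c1, c2):
        # UPGMA distance: mean over all leaf pairs between the two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(leaf_dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: average(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Toy data (hypothetical): presence (1) or absence (0) of six roots.
data = {
    "L1": [1, 1, 1, 0, 0, 0],
    "L2": [1, 1, 0, 0, 0, 0],
    "L3": [0, 0, 0, 1, 1, 1],
    "L4": [0, 0, 1, 1, 1, 1],
}
tree = upgma(data)
print(tree)  # (('L1', 'L2'), ('L3', 'L4'))
```

Recomputing cluster distances from the leaf distances at every step is slow but keeps the averaging rule visible. Character-based methods work on the binary matrix directly and are not covered by this sketch.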

Slide 18

Slide 18 text

Comparison of the Models

| | Separation Base Method | Etymostatistics |
|---|---|---|
| Evolutionary model | Root loss | Root loss and gain |
| Data | Complete etymological dictionaries listing all reconstructable roots of a proto-language | Random samples of roots extracted from texts or word lists |
| Reconstruction | Quasi-distances based on the assumption that the root reflexes in the descendant languages are hypergeometrically distributed | Uncorrected distances (percentages of common character states) |
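The hypergeometric assumption can be illustrated with a short calculation. Suppose a base of `n_base` roots, of which one language retains `k_a` and another retains `k_b`, each losing roots independently and at random; the number of roots retained by both then follows a hypergeometric distribution with expectation `k_a * k_b / n_base`. Inverting this expectation gives an estimate of the base from the observed number of shared roots. The function names below are hypothetical, and whether this inversion matches the exact quasi-distance used by the Separation Base Method is an assumption of this sketch.

```python
from math import comb


def hypergeom_pmf(shared, n_base, k_a, k_b):
    """Probability that exactly `shared` roots are retained by both languages."""
    return (comb(k_a, shared) * comb(n_base - k_a, k_b - shared)
            / comb(n_base, k_b))


def expected_shared(n_base, k_a, k_b):
    """Expected number of shared roots under independent random loss."""
    return k_a * k_b / n_base


def estimate_base(k_a, k_b, shared):
    """Base size obtained by inverting the expectation (illustrative only)."""
    return k_a * k_b / shared


# Toy numbers (hypothetical): a base of 1000 roots, 500 and 400 retained.
print(expected_shared(1000, 500, 400))  # 200.0
print(estimate_base(500, 400, 200))     # 1000.0
```

Under a pure loss model, two languages that share more roots than independent loss from the oldest base would predict point to a smaller, and therefore later, common base.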

Slide 19

Slide 19 text

Testing the Methods

- Simulations of the Evolutionary Models
- Testing the Models on Real Data

Slide 20

Slide 20 text

Simulations of the Evolutionary Models

+++ short description of the programs +++

Slide 21

Slide 21 text

Simulations of the Evolutionary Models

Python Program for the Simulation of the Models
- The program starts with one language L.
- The language goes through several generations of change.
- A generation of change is characterized by a possible split of the language into two descendant languages and a random amount of root loss (Separation Base Method) or root loss and root gain (Etymostatistics).
- The result is a certain number of descendant languages in the last generation of change and a specific distribution of roots among these languages.
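The description above can be turned into a minimal sketch. This is not the program used for the talk: the function `simulate`, its parameters (`split_prob`, `max_loss`, `max_gain`, `seed`), and their default values are all assumptions chosen for illustration. Setting `max_gain` to zero gives the loss-only model of the Separation Base Method; a positive value adds the root gain of Etymostatistics.

```python
import random


def simulate(n_roots=1000, generations=4, split_prob=0.5,
             max_loss=0.2, max_gain=0.0, seed=0):
    """Evolve one language 'L' and return {language name: set of root ids}."""
    rng = random.Random(seed)
    next_root = n_roots                      # ids for newly gained roots
    languages = {"L": set(range(n_roots))}
    for _ in range(generations):
        descendants = {}
        for name, roots in languages.items():
            prefix = name if "_" in name else name + "_"
            # A possible split into two descendant languages.
            suffixes = "01" if rng.random() < split_prob else "0"
            for suffix in suffixes:
                kept = set(roots)
                # Random amount of root loss.
                n_loss = rng.randint(0, int(max_loss * len(kept)))
                kept -= set(rng.sample(sorted(kept), n_loss))
                # Random amount of root gain (zero in the loss-only model).
                n_gain = rng.randint(0, int(max_gain * n_roots))
                kept |= set(range(next_root, next_root + n_gain))
                next_root += n_gain
                descendants[prefix + suffix] = kept
        languages = descendants
    return languages


loss_only = simulate(seed=1)                 # Separation Base Method model
loss_and_gain = simulate(max_gain=0.05, seed=1)  # Etymostatistics model
for name, roots in sorted(loss_only.items()):
    print(name, len(roots))
```

Because the true tree of every simulated run is known from the language names, the output of a reconstruction method can be compared directly against it.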

Slide 22

Slide 22 text

Simulations of the Evolutionary Models

[Figure: tree of the simulated languages L_0000, L_0001, L_0010, L_0011, L_1000, L_1001, L_1010, and L_1011, with axis ticks at 200, 400, 600, 800, and 1000.]

Slide 23

Slide 23 text

Simulations of the Evolutionary Models

[Figure: tree of the simulated languages, each labelled with its number of roots.]

| Generation | Language | Roots |
|---|---|---|
| 0 | L | 1000 |
| 1 | L_0 | 977 |
| 1 | L_1 | 884 |
| 2 | L_00 | 818 |
| 2 | L_10 | 745 |
| 3 | L_000 | 665 |
| 3 | L_001 | 682 |
| 3 | L_100 | 714 |
| 3 | L_101 | 567 |
| 4 | L_0000 | 516 |
| 4 | L_0001 | 521 |
| 4 | L_0010 | 434 |
| 4 | L_0011 | 615 |
| 4 | L_1000 | 330 |
| 4 | L_1001 | 708 |
| 4 | L_1010 | 501 |
| 4 | L_1011 | 387 |

Slide 24

Slide 24 text

Simulations of the Evolutionary Models

Slide 25

Slide 25 text

Testing the Separation Base Method

+++ description of the test +++

Slide 26

Slide 26 text

Testing the Separation Base Method

+++ graphic/tree +++

Slide 27

Slide 27 text

Testing the Separation Base Method

+++ graphic/lexstat/stefenelli +++

Slide 28

Slide 28 text

Testing the Separation Base Method

+++ summary of the results +++

Slide 29

Slide 29 text

Testing Etymostatistics

+++ description of the test +++

Slide 30

Slide 30 text

Testing Etymostatistics

+++ graphic/results +++

Slide 31

Slide 31 text

Testing Etymostatistics

+++ summary of the results +++

Slide 32

Slide 32 text

Conclusion

- Model-Internal Problems
- Models and Reality

Slide 33

Slide 33 text

Model-Internal Problems

+++ Information loss in the models +++
+++ more rigid testing of the appropriate method for reconstruction +++

Slide 34

Slide 34 text

Models and Reality

+++ split as the key assumption +++
+++ evolution is not always tree-like +++
+++ datasets are problematic +++