Slide 1

Slide 1 text

A Sound-Change-Based Phylogeny of the Tukanoan Language Family Using Ordered Multistate Models for Phylogenetic Reconstruction
Thiago Chacon and Johann-Mattis List

Slide 2

Slide 2 text

Introduction

Slide 3

Slide 3 text

Goals of the Study
The goal of this study is to present a phylogenetic model that can infer trees from sound changes.
Other models used for inferring trees from phonological data are:
● Hruschka et al. (2015)
● Wheeler and Whiteley (2015)
Our model differs from these approaches in the following important ways:
● We use neither raw sequences (Wheeler and Whiteley 2015) nor aligned sequences (Hruschka et al. 2015). Instead, we use data on sound change patterns.
● In this, we follow Barbançon et al. (2013:165): “linguistic phylogeny estimation — and studies of phylogeny estimation methods in linguistics — need to be informed by linguistic scholarship”

Slide 4

Slide 4 text

The Tukano Language Family
- 29 languages
- 15 with reasonable documentation
- Northwest Amazon

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

The Tukano Language Family
Disputed internal classification (ET = Eastern Tukanoan, WT = Western Tukanoan, CT = Central Tukanoan)
- Methods
  - Heuristics (Mason 1950)
  - Lexicostatistics (Waltz and Wheeler 1972, Ramirez 1997)
  - Sound innovations (Malone 1987, Chacon 2014)
- Major branches:
  - 2: ET and WT (Mason 1950, Chacon 2014)
  - 3: ET, WT and CT (Waltz and Wheeler 1972, Malone 1987, Barnes 1999)
- Internal classification of major branches:
  - 3 minor branches in each branch (Chacon 2014)

Slide 7

Slide 7 text

Classification by Chacon (2014), based on shared innovations in sound-change processes

Slide 8

Slide 8 text

Genetic grouping and subgrouping by the traditional comparative method
1. Word-lists
2. Cognate sets
3. Sound correspondence sets…
4. Evidence for genetic relationship: systematic form-meaning relations
5. Reconstruction of proto-forms
6. Analysis of complementary distribution
7. Refining reconstruction
8. Proto-form to reflexes
9. Proposal of intermediate proto-forms...
10. Subgrouping: identification of shared sound changes followed by interpretation of “remarkable” changes favouring certain subgroups.

Slide 9

Slide 9 text

Genetic grouping and subgrouping with the traditional comparative method
Cognate sets
Kor   Sek   Sio   Mai   Tan   Bas   Des   Kub   Tuk    Kar   Wan    P-T     Gloss
pia   p’ia  p’ia  ʔbia  bia   bia   bia   bia   bia    bia   bia    *p’ia   CHILI
jeha  jeha  jiha  jiha  -     jiba  jeba  jeba  je’pa  jepa  ja’pa  *jip’a  LAND/GROUND

Slide 10

Slide 10 text

Genetic grouping and subgrouping with the traditional comparative method
Sound correspondence sets and proto-forms
Kor  Sek  Sio  Mai  Tan  Bas  Des  Kub  Tuk  Kar  Wan  Context  P-T
p    p’   p’   ʔb   b    b    b    b    b    b    b    #_       *p’
h    h    h    h    h    b    b    b    ʔp   p    ʔp   V_       *p’

Slide 11

Slide 11 text

Genetic grouping and subgrouping with the traditional comparative method
A unique change identifies the ET subgroup: *p’ > b / #_
Another unique change identifies an ET subgroup: *p’ > b / V_

Slide 12

Slide 12 text

Changes for PT *p /#_

Slide 13

Slide 13 text

Changes for PT *p /V_V

Slide 14

Slide 14 text

Genetic grouping and subgrouping with the traditional comparative method
● Homoplasy: Not all sound changes occur only once. They can occur multiple times, and some are so frequent that they do not provide any evidence for subgrouping (think of centum-satem languages). But scholars often disagree about what counts as an innovation and what does not.
● What counts as “remarkable”: Scholars have not come up with a rigorous procedure that would justify why some innovations should be more important for subgrouping than others.
● Circularity: Since reconstruction usually involves a certain amount of subgrouping (at least in the researchers’ heads), one runs the risk of making circular arguments for a certain subgroup due to a certain reconstruction.
● Risk of cherry picking: In the end, all scholars run the risk of selecting innovations just according to their original hypotheses.

Slide 15

Slide 15 text

The model we propose here is a first step towards making the subgrouping enterprise more objective:
- “Remarkable” is replaced by a definite principle of overall uniqueness: subgroups are established by the greatest number of comparatively unique changes shared by a set of languages
- Homoplasy is captured as the most frequent changes recurring over a set of languages that also share comparatively unique changes
- No personal bias towards cherry picking
Circularity between reconstruction and subgrouping is to some degree still present, but...
- “Phonetic drifts” give a phonetic measure of the likelihood of a change
- The algorithm evaluates other changes that were not the focus of the analyst when establishing subgroupings

Slide 16

Slide 16 text

Materials

Slide 17

Slide 17 text

Language Data
Data preparation
i. phonemic representation
ii. identifying cognate sets: 150 cognates
iii. extracting sound correspondences: 33 sets
iv. reconstruction of proto-sounds: 18 consonants and 42 reflexes
v. preparation of “phonetic drifts” for creating networks of sound transitions

Slide 18

Slide 18 text

Phonetic drifts
“Phonetic drifts” represent sound changes as internally organized in different stages within a pool of potential articulatory variations. From the internal organization of drifts, it is possible to infer speciation events from a proto-sound to a set of reflexes:
*k > k, t∫, s, x
Proto-sound  Reflex  Language
*k           k       L1
*k           t∫      L2
*k           s       L3
*k           x       L4

Slide 19

Slide 19 text

Phonetic drifts
This is only possible due to the very nature of sound changes, which are
- Regular: or at least overwhelmingly so
- Gradual: following more or less discrete steps from T1 to T2… Tn
- Phonetically blind: overwhelmingly following from universal principles of phonetics
- Directional: A > B, but not B > A

Slide 20

Slide 20 text

Phonetic drifts
The principles underlying the drifts are the following:
(1) Teleology: from proto-sound to reflexes
(2) Intermediate stages: *A > C → **A > *B > C
(3) Directionality
(4) Context dependency
(5) Competing pathways of change: *A > B > D or *A > C > D

Slide 21

Slide 21 text

Phonetic drift
PT: *k
Reflexes: t∫, s, h
k > t∫ > ∫ > s > h
k > t∫ > ∫ > h
k > kx > x > h
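A drift like the one for *k can be written down as a small directed graph. The sketch below is only an illustration: the list-of-pathways encoding and the names `PATHWAYS`, `pathways_to_edges` and `UNATTESTED` are assumptions made here, not the data format used by the authors.

```python
# Hypothetical encoding of the *k drift: one list per competing pathway.
PATHWAYS = [
    ["k", "t∫", "∫", "s", "h"],
    ["k", "t∫", "∫", "h"],
    ["k", "kx", "x", "h"],
]
PROTO = "k"
ATTESTED = {"t∫", "s", "h"}  # reflexes listed on the slide


def pathways_to_edges(pathways):
    """Collect every directed single-step transition found in any pathway."""
    edges = set()
    for path in pathways:
        for source, target in zip(path, path[1:]):
            edges.add((source, target))
    return edges


EDGES = pathways_to_edges(PATHWAYS)

# Sounds that appear only as intermediate stages and are not attested reflexes.
ALL_SOUNDS = {sound for path in PATHWAYS for sound in path}
UNATTESTED = ALL_SOUNDS - ATTESTED - {PROTO}

for source, target in sorted(EDGES):
    print(f"{source} > {target}")
```

Because each edge is stored as an ordered pair, the reverse of a step (for example h > s) is simply absent from the graph, which is one way of expressing directionality.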

Slide 22

Slide 22 text

Methods

Slide 23

Slide 23 text

Weighted Directed Transitions for Character States
Our model assumes multiple character states with state transitions which are
● directed, and
● weighted.
We further allow for unattested character states, which we include explicitly in the model.

Slide 24

Slide 24 text

Weighted Directed Transitions for Character States

Slide 25

Slide 25 text

Weighted Directed Transitions for Character States

Slide 26

Slide 26 text

Weighted Directed Transitions for Character States

Slide 27

Slide 27 text

Weighted Directed Transitions for Character States

Slide 28

Slide 28 text

From Proto-Forms and Reflexes to Characters
● Phonetic drifts were converted into a sound network, treating the proto-form as just another sound.
● This pattern is then converted into a transition matrix, by
  ○ converting it into a graph first
  ○ and then calculating character transition weights by computing the shortest path length between the characters
  ○ if no shortest path can be found, the transition is given a high penalty
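The conversion from a sound network to a transition matrix can be sketched as follows. This is a minimal illustration under stated assumptions: every step in the network counts as 1, the penalty value of 100 is invented (the slides only say "a high penalty"), and the function names are hypothetical rather than taken from the LingPy plugin.

```python
from collections import deque

PENALTY = 100  # assumed value for "no path found"

# Directed edges of the *k drift network shown earlier in the slides.
EDGES = [("k", "t∫"), ("t∫", "∫"), ("∫", "s"), ("s", "h"),
         ("∫", "h"), ("k", "kx"), ("kx", "x"), ("x", "h")]


def shortest_path_lengths(edges, source):
    """Breadth-first search; returns the number of steps from source to each reachable sound."""
    successors = {}
    for a, b in edges:
        successors.setdefault(a, []).append(b)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def transition_matrix(edges, penalty=PENALTY):
    """Weight of a > b is the shortest directed path length, or the penalty if b is unreachable."""
    states = sorted({sound for edge in edges for sound in edge})
    matrix = {}
    for a in states:
        reachable = shortest_path_lengths(edges, a)
        for b in states:
            matrix[a, b] = reachable.get(b, penalty)
    return states, matrix


states, cost = transition_matrix(EDGES)
print(cost["k", "h"], cost["h", "k"])
```

In this toy network, k > h costs three steps along either pathway, while h > k has no path and receives the penalty.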

Slide 29

Slide 29 text

From Proto-Forms and Reflexes to Characters

Slide 30

Slide 30 text

Tree Search Heuristics
Since it is not feasible to search through the whole tree space when dealing with more than 10 languages, we need to search heuristically. Our strategy follows this schema:
● start from a random tree, and
  ○ create more trees by swapping nodes in the tree
  ○ retain the best trees (in terms of parsimony scores) and create more trees from them
  ○ create more random trees to avoid getting stuck in a local optimum
● stop the tree search and return the best trees once the researcher has had enough
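The search schema can be sketched roughly as below. Several simplifications are assumptions made for this illustration: trees are nested tuples, "swapping nodes" is reduced to exchanging two leaves, the stopping rule is a fixed number of steps, and an unweighted Fitch count stands in for the scoring function. The data and the names `search`, `swap_leaves` and `fitch_score` are invented.

```python
import random


def random_tree(leaves, rng):
    """Join randomly chosen subtrees until one binary tree remains."""
    nodes = list(leaves)
    while len(nodes) > 1:
        a = nodes.pop(rng.randrange(len(nodes)))
        b = nodes.pop(rng.randrange(len(nodes)))
        nodes.append((a, b))
    return nodes[0]


def leaves_of(tree):
    if isinstance(tree, str):
        return [tree]
    return leaves_of(tree[0]) + leaves_of(tree[1])


def swap_leaves(tree, rng):
    """Return a copy of the tree with two randomly chosen leaves exchanged."""
    a, b = rng.sample(leaves_of(tree), 2)

    def relabel(node):
        if isinstance(node, str):
            return b if node == a else a if node == b else node
        return (relabel(node[0]), relabel(node[1]))

    return relabel(tree)


def fitch_score(tree, data):
    """Unweighted parsimony score summed over all characters (stand-in scorer)."""
    total = 0
    for i in range(len(next(iter(data.values())))):
        def visit(node):
            nonlocal total
            if isinstance(node, str):
                return {data[node][i]}
            left, right = visit(node[0]), visit(node[1])
            if left & right:
                return left & right
            total += 1
            return left | right
        visit(tree)
    return total


def search(data, score, steps=200, keep=5, n_random=2, seed=0):
    rng = random.Random(seed)
    leaves = sorted(data)
    pool = [random_tree(leaves, rng)]
    for _ in range(steps):
        candidates = list(pool)
        candidates += [swap_leaves(tree, rng) for tree in pool]
        candidates += [random_tree(leaves, rng) for _ in range(n_random)]
        unique = list(dict.fromkeys(candidates))  # drop exact duplicates
        unique.sort(key=lambda tree: score(tree, data))
        pool = unique[:keep]  # retain the best trees
    return pool[0], score(pool[0], data)


# Invented toy data: five languages, two characters, one symbol per state.
DATA = {"L1": "aa", "L2": "aa", "L3": "ab", "L4": "bb", "L5": "bb"}
best_tree, best_score = search(DATA, fitch_score)
print(best_tree, best_score)
```

Passing the scorer in as an argument keeps the search loop independent of the parsimony variant, so a weighted scorer could be substituted without touching the search itself.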

Slide 31

Slide 31 text

Tree Search Heuristics
This plot illustrates how the model searches the tree space for the first 6000 trees visited. Note the nearly constant number of badly scoring trees, reflecting the constant number of random trees which are generated to make sure that the model does not get stuck in a local optimum.

Slide 32

Slide 32 text

Implementation, Analysis, and Evaluation
● Code is implemented in Python, as a LingPy plugin (later to be included regularly in LingPy).
● We analyzed three different models (searching 500 000 trees for each), in order to check for the effects of directionality and weights in networks:
  ○ FITCH: a simple parsimony that penalizes every transition with 1
  ○ SANKOFF: a weighted parsimony model that penalizes transitions by calculating the shortest path in the sound transition network, but with the sound transition network being treated as an undirected network
  ○ WDT (weighted directed transitions): the model which we described before.
● We evaluate the performance of each model by comparing the reconstruction quality (ancestral state reconstruction for the best trees), and the trees themselves (qualitative evaluation).
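To make the difference between the three models concrete, the sketch below scores one invented character on one invented tree under each cost scheme. It uses the standard Sankoff dynamic programme for weighted parsimony on a fixed tree. The drift network is the *k example from the slides, while the tree, the tip states, the penalty of 100 and all function names are assumptions made for this illustration.

```python
from itertools import product

INF = float("inf")
PENALTY = 100  # assumed value; the slides only say "a high penalty"

EDGES = [("k", "t∫"), ("t∫", "∫"), ("∫", "s"), ("s", "h"),
         ("∫", "h"), ("k", "kx"), ("kx", "x"), ("x", "h")]
STATES = sorted({sound for edge in EDGES for sound in edge})


def unit_costs(states):
    """FITCH-style costs: every change costs 1."""
    return {(a, b): int(a != b) for a, b in product(states, repeat=2)}


def path_costs(states, edges, directed, penalty=PENALTY):
    """Shortest path lengths (Floyd-Warshall); unreachable pairs get the penalty."""
    cost = {(a, b): (0 if a == b else INF) for a, b in product(states, repeat=2)}
    for a, b in edges:
        cost[a, b] = 1
        if not directed:
            cost[b, a] = 1
    # product() varies its first element slowest, so k is the outer loop.
    for k, i, j in product(states, repeat=3):
        if cost[i, k] + cost[k, j] < cost[i, j]:
            cost[i, j] = cost[i, k] + cost[k, j]
    return {pair: (penalty if c == INF else c) for pair, c in cost.items()}


def sankoff(tree, tip_states, states, cost):
    """Minimum total cost of one character on a fixed tree (root state left free)."""
    def visit(node):
        if isinstance(node, str):
            return {s: (0 if s == tip_states[node] else INF) for s in states}
        children = [visit(child) for child in node]
        return {s: sum(min(cost[s, t] + child[t] for t in states) for child in children)
                for s in states}
    return min(visit(tree).values())


COSTS = {
    "FITCH": unit_costs(STATES),
    "SANKOFF": path_costs(STATES, EDGES, directed=False),
    "WDT": path_costs(STATES, EDGES, directed=True),
}
TREE = (("A", "B"), "C")                   # invented tree
TIPS = {"A": "h", "B": "t∫", "C": "h"}     # invented reflexes
SCORES = {name: sankoff(TREE, TIPS, STATES, cost) for name, cost in COSTS.items()}
print(SCORES)
```

For this toy case the three schemes give 1, 2 and 4. Under the directed costs, h cannot turn back into t∫, so the cheapest scenario reconstructs t∫ at the root and derives h twice independently.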

Slide 33

Slide 33 text

Results

Slide 34

Slide 34 text

General Results: Numbers
Model    Parsimony Score  Most Parsimonious Trees  Homoplasy  Reconstruction Success
FITCH    107              716                      0.67       35%
SANKOFF  148              1019                     0.82       33%
WDT      182              18                       1.9        90%

Slide 35

Slide 35 text

General Results: Networks of Sound Transitions
With the result of a given analysis (a tree or a set of trees), we can calculate for each sound transition how frequently it occurs in the data for the given tree. This is interesting both with respect to questions regarding sound change frequencies and with respect to the quality of our analysis, since we would assume that those changes which occur most frequently are also those changes which are generally frequent and lead to high degrees of homoplasy in parsimony analyses.
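Counting transition frequencies on a tree amounts to comparing the state of each node with the state of its parent, for every character. The sketch below assumes the ancestral states have already been reconstructed. The tree, the states and the function name `count_transitions` are invented for illustration.

```python
from collections import Counter


def count_transitions(parent_of, states_by_character):
    """Count how often each source > target change occurs along the branches."""
    counts = Counter()
    for node_states in states_by_character:
        for child, parent in parent_of.items():
            source, target = node_states[parent], node_states[child]
            if source != target:
                counts[source, target] += 1
    return counts


# Invented tree ((A, B), C), stored as a child -> parent map.
PARENT_OF = {"n1": "root", "C": "root", "A": "n1", "B": "n1"}

# Invented reconstructions: one dict of node states per character.
STATES = [
    {"root": "t∫", "n1": "t∫", "A": "h", "B": "t∫", "C": "h"},
    {"root": "s", "n1": "s", "A": "h", "B": "s", "C": "s"},
]

counts = count_transitions(PARENT_OF, STATES)
for (source, target), n in counts.most_common():
    print(f"{source} > {target}: {n}")
```

Transitions that recur on several branches, like t∫ > h in this toy case, are the ones that would show up as heavy edges in a network of attested sound transitions.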

Slide 36

Slide 36 text

General Results: Networks of Sound Transitions
Full network of attested sound transitions for the WDT analysis.

Slide 37

Slide 37 text

General Results: Networks of Sound Transitions Sub-network of attested sound transitions for the WDT analysis (dental cluster).

Slide 38

Slide 38 text

General Results: Networks of Sound Transitions Sub-network of attested sound transitions for the WDT analysis (labial cluster).

Slide 39

Slide 39 text

Specific Results: Networks of Sound Transitions Sub-network of attested sound transitions for the WDT analysis (velar cluster).

Slide 40

Slide 40 text

General Results: Networks of Sound Transitions Sub-network of attested sound transitions for the WDT analysis (affricate cluster).

Slide 41

Slide 41 text

General Results: Visualizing the Findings
● An interactive web application was created to allow for a quick inspection of the results.
● It shows individual evolutionary scenarios inferred for each of the models and the corresponding consensus tree.
● It also shows a summary of each model with all changes inferred for each node and a detailed listing of those sounds that have changed by then according to the given model.
● The application can be launched via: http://digling.org/tukano/

Slide 42

Slide 42 text

Specific Results
Overall performance of tree topology
WDT (excellent) > FITCH > SANKOFF (unacceptable)

Slide 43

Slide 43 text

General Results: SANKOFF Trees
SANKOFF: Majority Rule Consensus

Slide 44

Slide 44 text

Specific Results
SANKOFF
- Overall failure in identifying major and intermediate subgroups
- Only surface similarities seem to have influenced the tree
- Slightly better performance on shallower subgroups
- Perfect match with manual classification regarding the Western-ET subgroup

Slide 45

Slide 45 text

Specific Results: FITCH Trees
FITCH: Majority Rule Consensus

Slide 46

Slide 46 text

Specific Results
FITCH
- A little better, but still an unacceptable classification
- The ET vs. WT split was not fully captured
- ET languages BAS and MAK were wrongly classified as an outgroup
- Good tree for WT and E-ET languages

Slide 47

Slide 47 text

General Results: WDT Trees
WDT: Majority Rule Consensus

Slide 48

Slide 48 text

Specific Results
WDT
- Excellent tree!
- WT vs. ET split was neatly captured!
- WT internal classification matches perfectly with manual classification
- 3 ET subgroups!
- Kub and Tan as an ET outgroup, confirming alternative expectations!
- Very consistent subgrouping in Western-ET and Eastern-ET

Slide 49

Slide 49 text

WDT vs. Chacon 2014: analysis of sound changes
Sound change    Chacon 2014  WDT
*j > t∫         unique       unique
*t’ > d > r
*h > Ø          parallel     parallel
*t > d          unique       parallel
*C’ > v’C       retention    unique + parallel
*p’ > p vs. b   overlooked   unique

Slide 50

Slide 50 text

Examples of analysis of phonological patterns
Relative Chronology and Chain Shifts
(1a) *h > Ø
(1b) *s > h
Mergers and horizontal diffusion
*t∫, *ts, *s

Slide 51

Slide 51 text

Outlook

Slide 52

Slide 52 text

Further experiments:
○ general networks instead of individual networks for each character (“local” vs. “global” networks)
○ constraining vs. expanding intermediate stages in phonetic transitions
○ weighting sound transitions, e.g.:
  ■ assimilation (increment of +1) vs. dissimilation (increment of +5)
  ■ articulatorily biased change (+1) vs. acoustically biased change (+2)
○ other language families:
  ■ widely known, e.g. Romance vs. poorly known, e.g. Arawak
  ■ shallow vs. deeper time depth
  ■ few (around 10) vs. many languages (40+)
Research and database on the typology of sound changes
Directionality seems to be the key. Can we get directionality into probabilistic models?

Slide 53

Slide 53 text

Thank You for Listening!