Slide 1

Slide 1 text

Vowel Purity and Rhyme Evidence in Old Chinese Reconstruction
Johann-Mattis List (CRLAO)

Slide 2

Slide 2 text

Introduction

Slide 3

Slide 3 text

Rhyme Evidence in Old Chinese Reconstruction
● the morpheme-syllabic character of the Chinese writing system gives us few clues about the exact pronunciation of Chinese during its oldest stages
● rhyme patterns in Old Chinese poetry, such as the Shījīng 詩經 (1050-600 BC), are therefore important for Old Chinese reconstruction
● rhyme pattern analysis has a long tradition in traditional Chinese scholarship, but it has never been thoroughly systematized

Slide 4

Slide 4 text

Traditional Rhyme Pattern Analysis
● traditional rhyme pattern analysis (sīguàn shéngqiān fǎ 絲貫繩牽法) follows a greedy approach
  ○ start from words that can be shown to rhyme with each other in one poem
  ○ greedily merge words into clusters by looking for words that recur across poems
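The greedy linking procedure can be sketched as a union-find over rhyme words. This is only an illustrative reading of the method, not the traditional scholars' actual workflow: the function name `link_rhyme_groups` and the placeholder words "A" to "F" are assumptions made for the example.

```python
def link_rhyme_groups(stanzas):
    """Greedily merge rhyme words into clusters.

    `stanzas` is a list of rhyme sequences (lists of words that rhyme
    with each other in one poem). Any word shared by two sequences
    merges their clusters, so the result is the set of connected
    components of the rhyme data.
    """
    parent = {}

    def find(word):
        # Return the representative of the cluster that contains `word`.
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path halving
            word = parent[word]
        return word

    for rhyme_words in stanzas:
        if not rhyme_words:
            continue
        first = find(rhyme_words[0])
        for other in rhyme_words[1:]:
            parent[find(other)] = first  # merge the two clusters

    clusters = {}
    for word in list(parent):
        clusters.setdefault(find(word), set()).add(word)
    return sorted((sorted(c) for c in clusters.values()), key=len, reverse=True)


# Hypothetical placeholder words, not real Shījīng data.
toy_stanzas = [["A", "B"], ["B", "C"], ["D", "E"], ["C", "F"]]
print(link_rhyme_groups(toy_stanzas))
# A single additional cross-link merges everything into one cluster.
print(link_rhyme_groups(toy_stanzas + [["F", "D"]]))
```

The second call illustrates why a greedy procedure of this kind tends to lump categories: one irregular rhyme is enough to fuse two otherwise separate groups.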

Slide 5

Slide 5 text

Traditional Rhyme Pattern Analysis

Slide 6

Slide 6 text

Traditional Rhyme Pattern Analysis

Slide 7

Slide 7 text

Traditional Rhyme Pattern Analysis

Slide 8

Slide 8 text

Traditional Rhyme Pattern Analysis

Slide 9

Slide 9 text

Traditional Rhyme Pattern Analysis

Slide 10

Slide 10 text

Traditional Rhyme Pattern Analysis

Slide 11

Slide 11 text

Traditional Rhyme Pattern Analysis
● unfortunately, the traditional rhyme analysis favors lumping rhyme categories over splitting them
● no more than about 30 categories were identified by the traditional rhyme analysis up to the middle of the 20th century
● only later did Baxter’s (1992) hypothesis-testing approach to quantitative rhyme data make it possible to postulate a larger number of distinct categories (52)

Slide 12

Slide 12 text

Vowel Purity and Rhyme Evidence

Slide 13

Slide 13 text

Vowel Purity and Rhyme Evidence
● Ho (2016) criticises Baxter and Sagart’s (2014) reconstruction of Old Chinese by pointing to many rhymes in which words with different main vowels rhyme
● the underlying principle, which holds that Old Chinese poetry strictly avoided rhyming words with different vowels, could be called the principle of “vowel purity” in rhymes

Slide 14

Slide 14 text

Vowel Purity and Rhyme Evidence
● Ho’s (2016) argument rests on two fundamental assumptions:
  a. Baxter and Sagart’s (2014) Old Chinese reconstruction is in strong conflict with the principle of vowel purity
  b. vowel purity was a key principle in Old Chinese rhyming

Slide 15

Slide 15 text

Vowel Purity and Rhyme Evidence
● assumption b. is very difficult to check, and we find many counter-examples both in Chinese rhyme traditions and in a cross-linguistic comparison of rhyme traditions
● assumption a. can be easily checked, but unlike Ho (2016), we need to check it quantitatively and comparatively for different OC reconstruction systems

Slide 16

Slide 16 text

Vowel Purity and Rhyme Evidence
● Ho’s (2016) argument against Baxter and Sagart (2014):
  ○ lacks any concrete examples
  ○ confuses conflicts between traditional rhyme categories and the rhyme categories by Baxter and Sagart (2014) with actual conflicts with vowel purity

Slide 17

Slide 17 text

Vowel Purity and Rhyme Evidence
“It is a firm linguistic fact that rhyming should be based on identity of vowels. Interpretation should, of course, be based upon facts. Facts precede and matter more than interpretation.” (Ho 2016: 183)

Slide 18

Slide 18 text

Vowel Purity and Rhyme Evidence
→ yes, let’s work with pure facts!

Slide 19

Slide 19 text

Evaluating Vowel Purity in Reconstruction

Slide 20

Slide 20 text

Materials: Rhyme Network
The Shījīng Browser
● rhyme data from the Shījīng following Baxter (1992)
● digitized and converted to a machine-readable format in List (under review)
● data available online in the form of a Shījīng Browser (http://digling.org/shijing/)

Slide 21

Slide 21 text

Materials: Rhyme Network DEMO of http://digling.org/shijing/

Slide 22

Slide 22 text

Materials: Rhyme Network
The Shījīng Rhyme Network
● a network of all words rhyming in the Shījīng (List under review)
● rhyme words are nodes in the network (1996 nodes in total)
● links between nodes reflect instances in which two words rhyme in the Shījīng according to Baxter’s (1992) analysis
● an automatic analysis of the patterns, in which community-detection algorithms were used to search for potential rhyme groups, is available at http://digling.org/shijing/infomap.html
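How such a network could be assembled from annotated rhyme sequences is sketched below. The talk does not specify how links are weighted, so linking every pair of words in a rhyme sequence and counting repeated co-occurrences is an assumption, as are the name `build_rhyme_network` and the placeholder words.

```python
from collections import Counter
from itertools import combinations


def build_rhyme_network(rhyme_sets):
    """Build an undirected rhyme network.

    `rhyme_sets` is a list of rhyme sequences, each a list of words
    annotated as rhyming with each other. Returns the set of nodes and
    a Counter that maps each word pair (sorted tuple) to the number of
    sequences in which the two words rhyme.
    """
    nodes = set()
    edges = Counter()
    for words in rhyme_sets:
        unique = sorted(set(words))  # ignore a word repeated in one sequence
        nodes.update(unique)
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return nodes, edges


# Hypothetical placeholder words, not real Shījīng data.
nodes, edges = build_rhyme_network([["w1", "w2", "w3"], ["w3", "w2"]])
print(len(nodes), "nodes")
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```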

Slide 23

Slide 23 text

Materials: Rhyme Network DEMO of http://digling.org/shijing/infomap.html

Slide 24

Slide 24 text

Materials: Reconstruction Systems
● Baxter and Sagart (2014): available online for download
● Karlgren (1954): provided by Eastling (http://www.eastling.org)
● Wáng Lì (1980): provided by Eastling
● Pān Wúyùn (2000): provided by Eastling
● Zhèngzhāng (2003): provided by Eastling
● Starostin (1989): provided by Tower of Babel (http://starling.rinet.ru)
● Li (1971): provided by Eastling

Slide 25

Slide 25 text

Materials: Reconstruction Systems
● not all datasets are complete, since not all sources give reconstructions for all characters in the Shījīng
● 1213 character readings occur in all seven datasets
● two analyses are carried out:
  ○ “complete coverage”: analysis for the character readings which occur in all seven reconstruction systems
  ○ “partial coverage”: analysis for all character readings available for a given reconstruction system
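The two coverage settings amount to a set intersection versus per-system key sets. The following is a minimal sketch under that reading; the system names, character keys, and vowel values are hypothetical placeholders and do not reproduce any of the seven datasets.

```python
def coverage_sets(systems):
    """Split the data into the two coverage settings.

    `systems` maps a reconstruction system to a dict from character
    reading to reconstructed main vowel. Returns the readings shared by
    all systems ("complete coverage") and, per system, all readings it
    provides ("partial coverage"). Expects at least one system.
    """
    partial = {name: set(readings) for name, readings in systems.items()}
    complete = set.intersection(*partial.values())
    return complete, partial


# Hypothetical toy data: keys c1..c4 stand in for character readings.
toy_systems = {
    "SystemA": {"c1": "a", "c2": "e", "c3": "i"},
    "SystemB": {"c1": "a", "c3": "o"},
    "SystemC": {"c1": "ə", "c2": "e", "c3": "i", "c4": "u"},
}
complete, partial = coverage_sets(toy_systems)
print(sorted(complete))
print({name: len(readings) for name, readings in partial.items()})
```

Restricting every system to the shared readings keeps the comparison on equal footing, at the price of discarding part of the network.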

Slide 26

Slide 26 text

Materials: Reconstruction Systems

Slide 27

Slide 27 text

Methods: Testing Vowel Purity in Rhyme Networks
The Problem
● List (under review) uses community detection algorithms to determine possible rhyme categories in the Shījīng and to compare these with Old Chinese
● vowel purity, however, does not exclusively determine which words rhyme with each other, since we know that other aspects, like the coda, also contribute to rhyming
● on the other hand, vowel purity should restrict certain rhymes

Slide 28

Slide 28 text

Methods: Testing Vowel Purity in Rhyme Networks
The Problem
We are looking for a measure that reflects the tendency towards vowel purity in a network model. But vowel purity coincides with rhyme categories only to a certain extent. We are thus not looking for a measure that tells us something about the quality of the communities we detect, but for a measure that tells us to what degree the topology of our network conflicts with the characteristics of its nodes.

Slide 29

Slide 29 text

Methods: Testing Vowel Purity in Rhyme Networks

Slide 30

Slide 30 text

Methods: Testing Vowel Purity in Rhyme Networks

Slide 31

Slide 31 text

Methods: Testing Vowel Purity in Rhyme Networks

Slide 32

Slide 32 text

Methods: Testing Vowel Purity in Rhyme Networks

Slide 33

Slide 33 text

Methods: Testing Vowel Purity in Rhyme Networks

Slide 34

Slide 34 text

Methods: Testing Vowel Purity in Rhyme Networks
Conductance as a measure of cluster purity?
● The conductance of a group of nodes in a network estimates the degree of its isolation or fragmentation (Leskovec et al. 2008).
● List (under review) uses conductance to compare the purity of six-vowel systems (Baxter and Sagart 2014) with that of Middle Chinese vowel systems in the Shījīng rhyme network.
● But conductance has shortcomings when comparing different clusters across graphs: when conductance scores are averaged over all clusters (vowels), systems with fewer vowels are favored.
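The bias can be made concrete with a small sketch. It uses the common definition of conductance as the number of edges leaving a group divided by the smaller of the two edge volumes (lower values mean a more self-contained group). Whether List (under review) uses exactly this variant is an assumption; the function names and the toy graph are hypothetical.

```python
def conductance(edges, group):
    """Conductance of `group` in an unweighted, undirected graph.

    cut(S) / min(vol(S), vol(rest)), where vol counts edge ends.
    Returns 0.0 when either side has no edge ends at all.
    """
    group = set(group)
    cut = vol_in = vol_out = 0
    for a, b in edges:
        for node in (a, b):
            if node in group:
                vol_in += 1
            else:
                vol_out += 1
        if (a in group) != (b in group):
            cut += 1  # edge crosses the group boundary
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


def mean_conductance(edges, label_of):
    """Average conductance over the groups defined by node labels (vowels)."""
    groups = {}
    for node, label in label_of.items():
        groups.setdefault(label, set()).add(node)
    scores = [conductance(edges, members) for members in groups.values()]
    return sum(scores) / len(scores)


# Toy graph: two triangles joined by a single edge.
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_vowels = {1: "a", 2: "a", 3: "a", 4: "e", 5: "e", 6: "e"}
one_vowel = {node: "a" for node in range(1, 7)}
print(mean_conductance(toy_edges, two_vowels))  # 1/7 for each triangle
print(mean_conductance(toy_edges, one_vowel))   # 0.0: trivially "pure"
```

A system that assigns a single vowel to every word obtains the best possible average score, which is why averaged conductance rewards coarse vowel inventories.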

Slide 35

Slide 35 text

Methods: Testing Vowel Purity in Rhyme Networks
Modularity as a measure of cluster purity?
● The modularity of a given set of clusters in a network is the fraction of edges that fall within clusters minus the fraction expected if edges were placed at random (Newman 2006).
● Modularity can be positive or negative, with positive values indicating that communities are potentially present (Newman 2006).
● Modularity suffers from a resolution limit when networks become large or nodes do not share many links.
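For an unweighted, undirected graph the definition above can be written as Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside cluster c, and d_c the summed degree of its nodes. The sketch below implements this standard formula on a toy graph; the function name and the data are hypothetical.

```python
def modularity(edges, label_of):
    """Newman modularity of the partition given by node labels.

    Q = sum over clusters of L_c / m - (d_c / (2 * m)) ** 2
    for an unweighted, undirected edge list.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")
    internal = {}  # L_c: edges with both ends in cluster c
    degree = {}    # d_c: edge ends attached to cluster c
    for a, b in edges:
        la, lb = label_of[a], label_of[b]
        degree[la] = degree.get(la, 0) + 1
        degree[lb] = degree.get(lb, 0) + 1
        if la == lb:
            internal[la] = internal.get(la, 0) + 1
    return sum(
        internal.get(cluster, 0) / m - (d / (2 * m)) ** 2
        for cluster, d in degree.items()
    )


# Toy graph: two triangles joined by a single edge.
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_vowels = {1: "a", 2: "a", 3: "a", 4: "e", 5: "e", 6: "e"}
print(modularity(toy_edges, two_vowels))  # 6/7 - 1/2 = 5/14
```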

Slide 36

Slide 36 text

Methods: Testing Vowel Purity in Rhyme Networks
Assortativity
● Assortativity (Newman 2003) tests whether nodes sharing connections in a graph are also similar with regard to other characteristics.
● Adapted to our rhyme network, this means that we test whether words that rhyme in the Book of Odes also share the same vowel.
● This seems to be exactly what we are looking for: a measure of the degree to which a given reconstruction system reflects the assumption that words with identical vowels tend to rhyme.
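A minimal sketch of the attribute assortativity coefficient for categorical node labels follows, using the form r = (Σ_i e_ii − Σ_i a_i²) / (1 − Σ_i a_i²) for undirected graphs, where e_ii is the fraction of edge ends on same-label edges and a_i the fraction of edge ends attached to label i. The exact implementation used in the talk is not shown in the slides, so the function name, the unweighted treatment of edges, and the toy data are assumptions.

```python
def vowel_assortativity(edges, vowel_of):
    """Assortativity coefficient for a categorical node attribute.

    `edges` is an unweighted, undirected edge list; `vowel_of` maps each
    node to its main vowel. Returns 1.0 if every edge joins two nodes
    with the same vowel and values near 0 for random mixing.
    """
    if not edges:
        raise ValueError("assortativity is undefined without edges")
    ends = 2 * len(edges)
    same_ends = 0
    end_count = {}
    for a, b in edges:
        va, vb = vowel_of[a], vowel_of[b]
        end_count[va] = end_count.get(va, 0) + 1
        end_count[vb] = end_count.get(vb, 0) + 1
        if va == vb:
            same_ends += 2  # both ends of the edge carry the same vowel
    observed = same_ends / ends
    expected = sum((count / ends) ** 2 for count in end_count.values())
    if 1 - expected < 1e-12:
        raise ValueError("assortativity is undefined when all nodes share one vowel")
    return (observed - expected) / (1 - expected)


# Toy graph: two triangles, with and without a mixed-vowel rhyme (3, 4).
pure_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6)]
vowels = {1: "a", 2: "a", 3: "a", 4: "e", 5: "e", 6: "e"}
print(vowel_assortativity(pure_edges, vowels))             # fully pure rhymes
print(vowel_assortativity(pure_edges + [(3, 4)], vowels))  # one impure rhyme
```

Because the expected term is subtracted and the score is normalised by its maximum, a single-vowel system cannot obtain a trivially perfect score here; the coefficient is undefined for it, in contrast to the averaged conductance above.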

Slide 37

Slide 37 text

Methods: Testing Vowel Purity in Rhyme Networks Assortativity high assortativity

Slide 38

Slide 38 text

Methods: Testing Vowel Purity in Rhyme Networks Assortativity low assortativity

Slide 39

Slide 39 text

Results

Slide 40

Slide 40 text

Results: General Remarks

Slide 41

Slide 41 text

Results: General Remarks

Slide 42

Slide 42 text

Results: General Remarks

Slide 43

Slide 43 text

Results: General Remarks

Slide 44

Slide 44 text

Results: General Remarks

Slide 45

Slide 45 text

Results: General Remarks Coding Rhymes by Vowel Quality: ■ a ■ e ■ i ■ o ■ u ■ ə

Slide 46

Slide 46 text

Results: General Remarks Coding Rhymes by Vowel Quality: ■ a ■ e ■ i ■ o ■ u ■ ə

Slide 47

Slide 47 text

Results: General Remarks Coding Rhymes by Vowel Quality: ■ a ■ e ■ i ■ o ■ u ■ ə

Slide 48

Slide 48 text

Results: Detailed Comparison
[Figure panels: 1213 nodes (complete coverage); 1471-1996 nodes (partial coverage)]

Slide 49

Slide 49 text

Results: Detailed Comparison
[Figure panels: 1213 nodes (complete coverage); 1471-1996 nodes (partial coverage)]

Slide 50

Slide 50 text

Results: Detailed Comparison
[Figure panels: 1213 nodes (complete coverage); 1471-1996 nodes (partial coverage)]
None of the systems shows 100% vowel purity, but Baxter and Sagart (2014) apparently outperform all other reconstructions with regard to vowel purity!

Slide 51

Slide 51 text

Conclusion and Discussion

Slide 52

Slide 52 text

Discussion
● the reconstruction by Baxter and Sagart (2014) corresponds more closely to the criterion of vowel purity than the other systems compared
● the quantitative investigation shows that the criticism by Ho (2016) does not hold
● the rather high assortativity scores reported for almost all reconstruction systems show that vowel purity is a principle that is reflected in all reconstructions (albeit with different rigor)

Slide 53

Slide 53 text

Discussion: Caveats
1. the comparison lacks full coverage for all reconstruction systems, which may have influenced the results (although I expect no major differences)
2. errors both in the rhyme networks and the reconstruction systems might have further influenced the results
3. we should be careful with the idea of vowel purity itself: it is by no means proven that it was a driving principle for those who created the poems in the Shījīng
4. we should ideally compare our study with an alternative rhyme network, like the one by Wáng (1980)

Slide 54

Slide 54 text

Conclusion
Despite potential errors and further limits of the analysis, this study could (hopefully) show that
● thorough quantitative comparison can give us new insights into our problems in Old Chinese reconstruction
● instead of dismissing theories or reconstructions on the basis of spurious or non-existent examples or assumptions, exhaustive evaluations give us a fresh perspective on our problems
● in order to tackle our data problems in the future, collaborative efforts are required and people should try to share all their data as transparently as possible

Slide 55

Slide 55 text

Many thanks to Philippe Lopez (Team Adaptation, Integration, Reticulation, Evolution, UPMC) for providing invaluable help with the network analysis, and to Laurent Sagart (CRLAO) for helpful discussions and for providing the Old Chinese data! Supplementary material with all code and all detailed reconstructions available at: https://gist.github.com/LinguList

Slide 56

Slide 56 text

Thanks for your attention!