Slide 1

Slide 1 text

CLICS²: A computer-assisted framework for the investigation of lexical motivation patterns
Johann-Mattis List
Research Group “Computer-Assisted Language Comparison”
Department of Linguistic and Cultural Evolution
Max-Planck Institute for the Science of Human History, Jena, Germany
2018-06-27

Slide 2

Slide 2 text

From Semantic Maps to Cross-Linguistic Polysemy Networks
From Semantic Maps... to Cross-Linguistic Polysemies

Slide 8

Slide 8 text

From Semantic Maps to Cross-Linguistic Polysemy Networks / Early Accounts
Early Accounts: People and Ideas
- Haspelmath (2003): The geometry of grammatical meaning.
- François (2008): Semantic maps and the typology of colexification.
- Cysouw (2010): Drawing networks from recurrent polysemies.
- Steiner, Stadler, and Cysouw (2011): A pipeline for computational historical linguistics.
- Urban (2011): Asymmetries in overt marking and directionality in semantic change.

Slide 11

Slide 11 text

From Semantic Maps to Cross-Linguistic Polysemy Networks / Early Accounts
Early Accounts: Data
- The Intercontinental Dictionary Series (IDS, Key and Comrie 2016) offers 1310 concepts translated into about 360 languages; an earlier version offered ca. 200 languages.
- The World Loanword Database (WOLD, Haspelmath and Tadmor 2009) offers 1430 concepts translated into 41 languages (with some overlap with IDS).

Slide 13

Slide 13 text

From Semantic Maps to Cross-Linguistic Polysemy Networks / Early Accounts
Early Accounts: Techniques
- Steiner, Stadler, and Cysouw (2011) propose modeling similarities between concepts with a matrix, built from parts of the IDS data, that shows how often individual languages colexify certain concepts.
- Cysouw (2010) shows how to use polysemy data to draw networks.
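The idea of a colexification matrix can be illustrated with a minimal sketch. It assumes a toy wordlist of (language, concept, form) triples; the language names and forms are invented for illustration, and the function is not taken from any of the cited papers. Two concepts count as colexified in a language when that language expresses both with the same form.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy wordlist: (language, concept, form) triples.
WORDLIST = [
    ("lang_a", "HAND", "ruka"),
    ("lang_a", "ARM", "ruka"),
    ("lang_a", "TREE", "derevo"),
    ("lang_b", "HAND", "te"),
    ("lang_b", "ARM", "te"),
    ("lang_b", "TREE", "ki"),
    ("lang_b", "WOOD", "ki"),
    ("lang_c", "TREE", "arbre"),
    ("lang_c", "WOOD", "bois"),
]


def colexification_counts(wordlist):
    """Count, per concept pair, the languages that use one form for both."""
    # Collect all concepts expressed by the same form in the same language.
    concepts_by_form = defaultdict(set)
    for language, concept, form in wordlist:
        concepts_by_form[(language, form)].add(concept)

    # Every pair of concepts sharing a form is one colexification.
    languages_by_pair = defaultdict(set)
    for (language, _form), concepts in concepts_by_form.items():
        for pair in combinations(sorted(concepts), 2):
            languages_by_pair[pair].add(language)

    # The sparse "matrix": concept pair -> number of colexifying languages.
    return {pair: len(langs) for pair, langs in languages_by_pair.items()}


counts = colexification_counts(WORDLIST)
for (concept_a, concept_b), n in sorted(counts.items()):
    print(concept_a, concept_b, n)
```

Storing only the non-zero cells keeps the structure small, since most concept pairs are never colexified; the same dictionary can be read directly as a weighted edge list for a network.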

Slide 16

Slide 16 text

From Semantic Maps to Cross-Linguistic Polysemy Networks / Initial Ideas
Initial Ideas
- List, Terhalle, and Urban (2013) build on the ideas of Cysouw (2010) and Steiner, Stadler, and Cysouw (2011), both in using IDS data for polysemy studies and in using network techniques to study the data.
- In contrast to earlier approaches, they use community detection (Girvan and Newman 2002) to analyse the network further and to partition the concepts into communities that make intuitive sense and are reminiscent of naturally derived semantic fields.
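To make the community detection step concrete, here is a small pure-Python sketch of one Girvan-Newman split: the edge with the highest shortest-path betweenness is removed repeatedly until the network falls apart into more components. The toy graph and its concept labels are hypothetical, edge weights are ignored, and this is an illustration of the general technique rather than the code used for CLICS.

```python
from collections import deque

# Hypothetical toy network: two tight clusters joined by one bridging link.
GRAPH = {
    "HAND": {"ARM", "FINGER"},
    "ARM": {"HAND", "FINGER", "TREE"},
    "FINGER": {"HAND", "ARM"},
    "TREE": {"ARM", "WOOD", "FOREST"},
    "WOOD": {"TREE", "FOREST"},
    "FOREST": {"TREE", "WOOD"},
}


def components(graph):
    """Return the connected components as a list of node sets."""
    seen, parts = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        part, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            part.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        parts.append(part)
    return parts


def edge_betweenness(graph):
    """Shortest-path edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((a, b)): 0.0 for a in graph for b in graph[a]}
    for source in graph:
        dist, paths = {source: 0}, {source: 1.0}
        preds = {node: [] for node in graph}
        order, queue = [], deque([source])
        while queue:  # breadth-first search counting shortest paths
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    paths[neighbour] = 0.0
                    queue.append(neighbour)
                if dist[neighbour] == dist[node] + 1:
                    paths[neighbour] += paths[node]
                    preds[neighbour].append(node)
        credit = {node: 0.0 for node in order}
        for node in reversed(order):  # pass path credit back to the source
            for pred in preds[node]:
                share = paths[pred] / paths[node] * (1.0 + credit[node])
                score[frozenset((pred, node))] += share
                credit[pred] += share
    return score


def girvan_newman_split(graph):
    """Remove highest-betweenness edges until the component count grows."""
    graph = {node: set(neighbours) for node, neighbours in graph.items()}
    start = len(components(graph))
    while len(components(graph)) == start and any(graph.values()):
        score = edge_betweenness(graph)
        a, b = max(score, key=score.get)
        graph[a].discard(b)
        graph[b].discard(a)
    return components(graph)


communities = girvan_newman_split(GRAPH)
print([sorted(part) for part in communities])
```

In the toy graph the ARM-TREE link carries every shortest path between the two clusters, so it is removed first and the two clusters come out as separate communities.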

Slide 21

Slide 21 text

From Semantic Maps to Cross-Linguistic Polysemy Networks / Further Ideas
Further Ideas
- Mayer, List, Terhalle, and Urban (2014) present an interactive way to visualize cross-linguistic colexification data.
- List, Mayer, Terhalle, and Urban (2014) publish the database and the web application online under the name CLICS (Database of Cross-Linguistic Colexifications).
- In contrast to earlier attempts, they enlarge the data by merging IDS, WOLD, and additional datasets, for a total of 220 languages.
- They also improve the community detection procedure by using Infomap (Rosvall and Bergstrom 2008), an advanced algorithm based on random walks in complex networks.

Slide 22

Slide 22 text

CLICS 1.0

Slide 27

Slide 27 text

CLICS 1.0 / Data
- IDS (Key and Comrie, 2007 version): of 233 language varieties, 178 are included in CLICS.
- WOLD (Haspelmath and Tadmor 2009): of 41 languages, 33 are included in CLICS.
- Logos Dictionary (Logos Group): of dictionaries for more than 60 different languages, 4 languages were manually extracted and included in CLICS.
- Språkbanken project (University of Gothenburg): of 8 word lists for SEA languages, 6 are included in CLICS.

Slide 31

Slide 31 text

CLICS 1.0 / Methods
Problems
(A) The data cannot be displayed in full; its complexity needs to be reduced.
(B) The data is noisy and needs to be corrected.
Solutions
(A) Show communities instead of all the data, and offer a subgraph view that cuts out the nearest neighbors of one concept to compensate for the data loss in the community view.
(B) Filter by language families and weight the concept links by frequency of occurrence, following Dellert’s (2014) suggestion. This cuts most of the links that result from homophony and keeps the links that are due to polysemy.
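Both solutions can be sketched in a few lines. The sketch below is one simple reading of "filter by families and weight by frequency", not the actual CLICS procedure: each link is weighted by the number of distinct language families attesting it, links below a threshold are dropped, and the subgraph view is reduced to listing the direct neighbors of a concept. The records, the family names, and the threshold of 2 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical colexification records: (concept_a, concept_b, language, family).
RECORDS = [
    ("ARM", "HAND", "lang_a", "fam_1"),
    ("ARM", "HAND", "lang_b", "fam_1"),
    ("ARM", "HAND", "lang_c", "fam_2"),
    ("ARM", "HAND", "lang_d", "fam_3"),
    ("ARM", "POOR", "lang_e", "fam_4"),  # attested once: plausibly homophony
]


def weight_links(records):
    """Weight each concept link by the number of distinct families attesting it."""
    families = defaultdict(set)
    for concept_a, concept_b, _language, family in records:
        families[tuple(sorted((concept_a, concept_b)))].add(family)
    return {pair: len(fams) for pair, fams in families.items()}


def filter_links(weights, min_families=2):
    """Keep only links attested in at least `min_families` families."""
    return {pair: w for pair, w in weights.items() if w >= min_families}


def neighbours(links, concept):
    """Subgraph view: the concepts directly linked to `concept`."""
    return sorted(b if a == concept else a for a, b in links if concept in (a, b))


weights = weight_links(RECORDS)
kept = filter_links(weights, min_families=2)
print(weights)
print(kept)
print(neighbours(kept, "ARM"))
```

Counting families rather than languages keeps closely related languages from inflating a link: the two `fam_1` languages contribute only once to ARM-HAND. An accidental homophony rarely recurs across unrelated families, which is why a family threshold removes it.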

Slide 36

Slide 36 text

CLICS 1.0 / Interface
- The interface is written in JavaScript for the visualizations and PHP for querying the data.
- The interactive component of the network browser was specifically designed for CLICS and builds on the D3 framework by Bostock et al. (2011).
- The underlying network with the inferred communities can be downloaded from the website, and the complete code used to create the website is available at http://github.com/clics/clics.
- The full wordlists underlying the original CLICS database are now also available from Zenodo (published in List 2018, https://zenodo.org/record/1194088).

Slide 37

Slide 37 text

CLICS 1.0 / Interface
CLICS DEMO

Slide 38

Slide 38 text

CLICS²

Slide 44

Slide 44 text

CLICS² / Motivation
Problems in CLICS 1.0
- difficult to curate (error correction, linking data, adding data)
- difficult to collaborate (the CLICS team is separated and everybody is extremely busy with things other than CLICS)
- difficult to communicate (not all users understand how we arrived at the data, and they often think that it is we who messed up the datasets, although we only take the data to produce something new out of it)
- difficult to expand (new datasets cannot be added without a true guiding principle)
- difficult to catch up (we now know much better how to curate datasets, but we did not know this when initially preparing CLICS)

Slide 51

Slide 51 text

CLICS² / Ideas
- use the state of the art of available wordlist data
- separate data from display (CLICS² does not host data, but simply uses it)
- curate data following the recommendations developed for the Cross-Linguistic Data Formats initiative (CLDF, http://cldf.clld.org; Forkel et al. 2017)
- curate the code and the data with the help of a transparent API
- release the data regularly, in release cycles of about one per year (following the practice of Glottolog and other CLLD projects)

Slide 52

Slide 52 text

CLICS² / Excursus
Excursus: The Cross-Linguistic Data Initiative
Cross-Linguistic Data Formats (Forkel et al. 2017)
- aims at increasing the comparability of cross-linguistic data and analyses
- supports methods for standardization via reference catalogues like Glottolog (Hammarström et al. 2018) and Concepticon (List et al. 2017)
- provides software APIs which help to test whether data conforms to standards
- offers working examples for best practice
- is supported by different software frameworks (LingPy, BEASTling, EDICTOR)

Slide 53

Slide 53 text

CLICS² / Excursus
CLDF DEMO

Slide 55

Slide 55 text

CLICS² / Excursus
Excursus: Reference Catalogues
- The advantages of linking one’s data to reference catalogues like Glottolog (Hammarström et al. 2018, http://glottolog.org) are obvious: Glottolog harvests various types of additional information on language varieties all over the world, and this information can be used effortlessly once the data is linked.
- The Concepticon project (http://concepticon.clld.org, List et al. 2016, List et al. 2018) is much less well known among scholars, but it offers the same advantages when dealing with wordlist data that was built by means of a questionnaire of “elicitation glosses”.

Slide 56

Slide 56 text

CLICS² / Excursus
Excursus: Concepticon
Concepticon (List et al. 2016)
- link concept labels (“elicitation glosses”) in published concept lists (questionnaires) to concept sets
- link concept sets to metadata
- define relations between concept sets
- never link one concept in a given list to more than one concept set (guarantees consistency)
- provide an API to check the consistency of the data and to query the data
- provide a web interface to browse through the data
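The consistency rule (one concept in a list links to at most one concept set) is easy to check mechanically. The following sketch assumes the links are available as plain (concept list, gloss ID, concept set) rows; this row layout and the function name are hypothetical and do not reproduce the actual Concepticon API. The IDs follow the pattern used in the Concepticon examples in this talk.

```python
from collections import defaultdict

# Hypothetical link rows: (concept list, gloss ID in that list, concept set).
LINKS = [
    ("Swadesh-1952-200", "Swadesh-1952-200-43", "FAT (ORGANIC SUBSTANCE)"),
    ("Swadesh-1955-100", "Swadesh-1955-100-26", "FAT (ORGANIC SUBSTANCE)"),
    ("Alpher-1999-151", "Alpher-1999-151-27", "FAT (ORGANIC SUBSTANCE)"),
]


def inconsistent_glosses(links):
    """Return every gloss ID that is linked to more than one concept set."""
    concept_sets = defaultdict(set)
    for _concept_list, gloss_id, concept_set in links:
        concept_sets[gloss_id].add(concept_set)
    return {
        gloss_id: sorted(sets)
        for gloss_id, sets in concept_sets.items()
        if len(sets) > 1
    }


# A clean mapping yields no violations.
clean_report = inconsistent_glosses(LINKS)

# Linking the same gloss to a second concept set is flagged.
broken = LINKS + [("Alpher-1999-151", "Alpher-1999-151-27", "GREASE")]
broken_report = inconsistent_glosses(broken)

print(clean_report)
print(broken_report)
```

Many glosses may point to the same concept set, as the three rows above do; only the reverse direction, one gloss pointing to several concept sets, violates the rule.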

Slide 57

Slide 57 text

CLICS² / Excursus
Concepticon ID | Concept in Source | English Gloss | Conceptlist
Alpher-1999-151-27 | fat, grease [english] | | Alpher 1999 151
He-2010-207-145 | 脂肪 [chinese] | fat | He 2010 207
Janhunan-2008-235-96 | fat / grease [english] | | Janhunan 2008 235
Gudschinsky-1956-200-42 | fat-grease [english] | | Gudschinsky 1956 200
Swadesh-1952-200-43 | fat (organic substance) [english] | | Swadesh 1952 200
Swadesh-1955-100-26 | fat (grease) [english] | | Swadesh 1955 100
... | ... | ... | ...
Concept Set: FAT (ORGANIC SUBSTANCE)
Definition: Esters of three fatty acid chains and the alcohol glycerol which form a semi-solid substance at room temperature and occur in animals and plants.
Related concept sets: [shown graphically on the slide]

Slide 58

Slide 58 text

CLICS² / Excursus
Concepticon lookup (languages offered: English, German, Chinese, French, Spanish, Russian, Portuguese; selected language: en; query: "fece")
MATCH | ID | GLOSS | DEFINITION | SIMILARITY
face | 1560 | FACE | The front part of the head, featuring the eyes, nose, and mouth and the surrounding area. | 3
feces | 675 | FAECES (EXCREMENT) | Substance that human and animal bodies release from time to time as a little pile of waste remaining from digestion, after it has been collected in the colon. | 3
fence | 1690 | FENCE | Delimitation for an area. | 3

Slide 59

Slide 59 text

CLICS² / Excursus
CONCEPTICON DEMO

Slide 60

Slide 60 text

CLICS² / Excursus
Excursus: Data in CLDF
# | Dataset | Source | Range | Glosses | Concepticon | Varieties | Glottolog | Families
1 | allenbai | Allen (2007) | Bai (ST) | 500 | 499 | 9 | 9 | 1
2 | bantubvd | Greenhill & Gray (2015) | Bantu | 430 | 415 | 10 | 9 | 1
3 | beidasinitic | Běijīng Dàxué (1964) | Sinitic (ST) | 905 | 700 | 18 | 18 | 1
4 | bowernpny | Bowern & Atkinson (2011) | Pama-Nyungan | 348 | 342 | 171 | 164 | 2
5 | hubercolumbian | Huber & Reed (1992) | Colombian | 374 | 343 | 69 | 65 | 16
6 | ids | Key & Comrie (2016) | World-wide | 1305 | 1305 | 324 | 234 | 61
7 | kraft | Kraft (1981) | Chadic | 434 | 428 | 67 | 60 | 3
8 | northeuralex | Dellert & Jäger (2017) | North-Eurasian | 1016 | 940 | 107 | 105 | 21
9 | robinsonap | Robinson & Holton (2012) | Alor-Pantar | 398 | 386 | 13 | 11 | 1
10 | satterthwaitetb | Satterthwaite-Phillips (2011) | Sino-Tibetan | 423 | 418 | 18 | 15 | 1
11 | sunztb | Sūn (1991) | Sino-Tibetan | 1005 | 906 | 50 | 44 | 1
12 | tls | Nurse and Phillipson (1975) | Tanzanian | 1533 | 808 | 131 | 97 | 1
13 | tryonsolomon | Tryon and Hackman (1983) | Solomon Islands | 324 | 311 | 111 | 96 | 5
14 | wold | Haspelmath & Tadmor (2009) | World-wide | 1460 | 1457 | 41 | 40 | 25
15 | zgraggenmadang | Z’graggen (1980abcd) | Madang | 336 | 306 | 100 | 98 | 1
TOTAL / OVERLAP: 2482, 1266, 1036, 91
Datasets are all released under https://zenodo.org/communities/clics.

Slide 62

Slide 62 text

CLICS² / Excursus
Excursus: Data in CLDF
- Since our datasets are all available in CLDF format, we can easily aggregate them for CLICS².
- Given the problems with concept overlap across the datasets, we offer code examples for computing mutual coverage statistics, which allow users to select the subsets of the data that are optimal for a given analysis.
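A mutual coverage statistic can be sketched as follows. The sketch assumes that the mutual coverage of two languages is the number of concepts attested in both, and that the average is normalized by the number of concepts in the whole selection so that it falls between 0 and 1. This normalization is an assumption made for illustration and may differ from the code examples mentioned above; the toy data is invented.

```python
from itertools import combinations

# Hypothetical toy data: the concepts attested for each language.
COVERAGE = {
    "lang_a": {"HAND", "ARM", "TREE", "WOOD"},
    "lang_b": {"HAND", "ARM", "TREE"},
    "lang_c": {"HAND", "TREE"},
}


def mutual_coverage(concepts_a, concepts_b):
    """Number of concepts attested in both languages."""
    return len(concepts_a & concepts_b)


def average_mutual_coverage(coverage):
    """Mean pairwise mutual coverage, as a fraction of all concepts."""
    all_concepts = set().union(*coverage.values())
    pairs = list(combinations(sorted(coverage), 2))
    if not pairs or not all_concepts:
        return 0.0
    shared = sum(mutual_coverage(coverage[a], coverage[b]) for a, b in pairs)
    return shared / (len(pairs) * len(all_concepts))


average = average_mutual_coverage(COVERAGE)
print(round(average, 3))
```

Dropping a sparsely attested language, or restricting the concept selection, changes the average, so the same function can be used to compare candidate subsets before running a network analysis.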

Slide 63

Slide 63 text

CLICS² / Excursus
Excursus: Data in CLDF
[Figure with two panels (A, B); axes labeled "average mutual coverage" (0.0 to 1.0) and "languages".]

Slide 67

Slide 67 text

CLICS² / Excursus
Excursus: Software API
- With the Python API that we have prepared for CLICS² (https://github.com/clics/clics2), users can run their own network analyses on their own data.
- Since all data for CLICS² is independently shared and curated, users can also take the data we selected for CLICS² and test different parameters of our API.
- We offer examples of how the data we use for CLICS² can be computed with the help of the API, and we plan to make them available in the form of code cookbooks.
- Because CLICS² shifts to the CLLD framework, scholars can also create their own CLICS websites, since the source code for creating the interactive networks is shipped transparently with the data.

Slide 75

Slide 75 text

CLICS² / Features
Features: Summary
- drastic increase in data
- drastic increase in transparency
- drastic increase in replicability
- regular floating releases which feature new data
- strict and clear-cut collaboration guidelines
- new methods (see the demo on the next slide)
- rigid policy towards open data (since we profit heavily from all of our colleagues who publish their data!)

Slide 76

Slide 76 text

CLICS² / Features
Features: Coverage

Slide 79

Slide 79 text

CLICS² / Features
Features: Enhanced Browsing
- Thanks to the CLLD framework, the data is now much easier to browse, and all data is clearly linked to the original datasets.
- Thanks to a standalone app that can be created from our data in pure HTML format, users can still browse CLICS² data with the old look and feel, and can even use the standalone application to deploy their own data in the form of CLICS networks.
- In addition, we are currently experimenting with a new visualization that lets users inspect the CLICS² network in all its complexity, following visualization methods developed for the inspection of galaxies (contributed by Thomas Mayer).

Slide 80

Slide 80 text

CLICS² / Features
Features: Examples
[Figure: a large colexification network whose nodes carry concept labels such as CARRY IN HAND, HAND, ARM, WATER, SEA, TREE, WOOD, SAY, SPEAK.]

Slide 81

Slide 81 text

CLICS² Features Features: Examples [Figure: network excerpt with the concept nodes TONGUE, TELL, ANNOUNCE, TALK, TIP (OF TONGUE), ADMIT, CHAT (WITH SOMEBODY), SAY, WORD, ANSWER, LANGUAGE, VOICE, SOUND OR NOISE, NOISE, PREACH, SPEECH, TONE (MUSIC), EXPLAIN, CONVERSATION, CONVEY (A MESSAGE), SPEAK] 29 / 34

Slide 82

Slide 82 text

CLICS² Features Features: Examples [Figure: network excerpt with the concept nodes TOE, ANKLE, ROUND, RING, FOOT, CIRCLE, WHEEL, LEG, BALL, FOOTPRINT, HEEL] 29 / 34
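
Network excerpts like the one above arise when concepts that share a word form in some language are linked. The following is a minimal sketch of that idea, not the CLICS² implementation: the word list is invented placeholder data, the function names are hypothetical, and plain connected components stand in for whatever clustering the figures on these slides actually use.

```python
from collections import defaultdict
from itertools import combinations

# Toy word list of (language, concept, form) triples.
# All languages and forms are invented placeholders, not real data.
WORDLIST = [
    ("lang1", "FOOT", "pa"),
    ("lang1", "LEG", "pa"),
    ("lang1", "WHEEL", "ru"),
    ("lang2", "FOOT", "ti"),
    ("lang2", "LEG", "ti"),
    ("lang2", "CIRCLE", "ko"),
    ("lang2", "WHEEL", "ko"),
    ("lang3", "CIRCLE", "mo"),
    ("lang3", "RING", "mo"),
    ("lang3", "FOOT", "se"),
]


def colexification_edges(wordlist):
    """Map each concept pair to the number of languages colexifying it."""
    # Collect the concepts expressed by each form within one language.
    by_form = defaultdict(set)
    for language, concept, form in wordlist:
        by_form[(language, form)].add(concept)
    # Every pair of concepts sharing a form is one colexification.
    edges = defaultdict(set)
    for (language, _form), concepts in by_form.items():
        for a, b in combinations(sorted(concepts), 2):
            edges[(a, b)].add(language)
    return {pair: len(langs) for pair, langs in edges.items()}


def components(edges):
    """Group concepts into connected components of the edge set."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


EDGES = colexification_edges(WORDLIST)
CLUSTERS = components(EDGES)
```

In this toy data, FOOT and LEG are linked with weight 2 because two of the placeholder languages use one form for both, while CIRCLE, RING and WHEEL form a second group through weight-1 links. Counting languages rather than raw form matches is one possible weighting; a real analysis would also have to deal with homophony and with related languages inflating the counts.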

Slide 83

Slide 83 text

CLICS² Features Features: Examples 29 / 34

Slide 84

Slide 84 text

CLICS² Features CLICS² DEMO 30 / 34

Slide 89

Slide 89 text

CLICS² Schedule Schedule The CLICS data are currently being released; see https://zenodo.org/communities/clics. CLICS² is deployed online as a beta version (0.1) at http://clics.clld.org and has been published by List, Greenhill, Anderson, Mayer, Tresoldi and Forkel (2018). The official version will be published along with our paper on CLICS² (List et al. forthcoming, Linguistic Typology), approximately by the end of July. The space-ship visualization will be deployed online later this year. 31 / 34

Slide 90

Slide 90 text

Outlook Outlook 32 / 34

Slide 93

Slide 93 text

With CLICS², we provide a new framework for collecting and curating data for the study of cross-linguistic colexification patterns. Future updates are planned, and we expect to extend the data by at least five more large datasets. CLICS² is not perfect, and it does not come with any warranty. However, we hope that the improvements in data transparency will make it much easier for scholars to work with the new cross-linguistic colexification database than with its predecessor. 33 / 34

Slide 95

Slide 95 text

Thanks to our CLICS² team: Simon Greenhill, Cormac Anderson, Thomas Mayer, Tiago Tresoldi, and Robert Forkel. Thank you for your attention! 34 / 34