Open problems in computational diversity linguistics

Talk held at the Research Colloquium of the Department of Comparative Linguistics (University of Zurich, 2019-05-17).

Johann-Mattis List

May 17, 2019

Transcript

  1. Open Problems in Computational Diversity Linguistics
    Johann-Mattis List
    Research Group “Computer-Assisted Language Comparison”
    Department of Linguistic and Cultural Evolution
    Max Planck Institute for the Science of Human History
    Jena, Germany
    2019-05-17

  2. Introduction
    Introduction

  3. Introduction Problems
    Problems (we ignore)
    The Society accepts no communication concerning either the
    origin of language or the creation of a universal language.
    (Statuts de la Société de Linguistique de Paris, 1866: III)

  4. Introduction Problems
    Problems (we did not know about)
    The Proto-Sapiens grammar was so simple that the sporadic
    references in previous paragraphs have essentially described it. The
    prime importance of sound symbolism for the people of nature
    should be noted again before we further detail that the vowel “E”
    was felt as indicating the “yin” element, passivity, femininity etc.
    [...] (Papakitsos and Kenanidis 2018: 8)

  5. Introduction Problems
    Problems (we forgot)
    Based on an analysis of the literature and a large scale
    crowdsourcing experiment, we estimate that an average 20-year-old
    native speaker of American English knows 42,000 lemmas and 4,200
    non-transparent multiword expressions, derived from 11,100 word
    families. (Brysbaert et al. 2016: 1)

  6. Introduction Hilbert Problems
    Hilbert Problems
    23 problems identified by the mathematician David Hilbert in 1900 (Hilbert 1902)
    at least 10 problems have been solved by now
    some 7 problems have solutions accepted by some scientists

  7. Introduction Hilbert Problems
    Hilpert Problems
    Martin Hilpert proposed a list of problems for linguistics in a talk in 2014
    Russell D. Gray further promoted the idea in a series of talks, where he emphasized we should ask more Hilb/pert questions in the field of diversity linguistics

  8. Introduction Problems in CDL
    Problems in Computational Diversity Linguistics
    “small” problems in comparison to big picture questions asked by Hilpert and Gray
    problems identified by myself
    can be solved by some workflow or algorithm
    they help us to advance our research by forcing us to formalize our work and to make clear what data we actually want to use

  9. Open Problems
    Open Problems
    in Computational Diversity Linguistics

  10. Open Problems Background
    Series of Blog Posts
    https://phylonetworks.blogspot.com

  11. Open Problems Background
    Series of Blog Posts
    10 problems in total
    initial basic division into problems of inference, simulation, statistics,
    and typology
    problems will be discussed on a monthly basis throughout 2019
    first three problems were already discussed in February, March, and
    April
    updated division of problems: modeling, inference, analysis (MIA)

  12. Open Problems Background
    Modeling, Inference, and Analysis
    Modeling
    Inference
    Analysis

  13. Open Problems Inference Problems
    Inference Problems
    Inference
    1 automatic morpheme segmentation (blog in February 2019)
    2 automatic sound law induction (blog in March 2019)
    3 automatic borrowing detection (blog in April 2019)
    4 automatic phonological reconstruction (blog planned for May 2019)

  14. Open Problems Inference Problems
    Inference Problems
    As a rule, all inference problems deal with something we want
    to find in linguistic data. Their common objective is to identify
    past and present processes and states of which we – due to our
    models – think that they have occurred or existed once, or still
    occur and exist.

  15. Open Problems Modeling Problems
    Modeling Problems
    Modeling
    5 simulation of lexical change
    6 simulation of sound change
    7 proof of language relatedness

  16. Open Problems Modeling Problems
    Modeling Problems
    The modeling problems deal with our knowledge about processes
    and how we account for the processes in a formal or
    mathematical way. Proof of language relatedness is a specific
    case, maybe not completely fitting into this category, but its
    key objective is to model chance resemblances, which is why it
    is basically also a modeling task and not a task of inference.

  17. Open Problems Analysis Problems
    Analysis Problems
    Analysis
    8 typology of semantic change
    9 typology of semantic promiscuity
    10 typology of sound change

  18. Open Problems Analysis Problems
    Analysis Problems
    The analysis problems deal with the bigger picture of the
    processes, and with the question of whether we can derive
    tendencies, rates, or frequencies from linguistic data. In order
    to achieve this, we need to infer the processes first, which is
    why these problems are listed last in this overview.

  19. Open Problems Analysis Problems
    Analysis Problems: Semantic Promiscuity
    List et al. (2016): Unity and disunity [...]. Biology Direct.

  20. Open Problems Analysis Problems
    Analysis Problems: Semantic Promiscuity
    List (2018): Von Wortfamilien [...]. Von Wörtern und Bäumen.

  21. Open Problems Analysis Problems
    Analysis Problems: Semantic Promiscuity
    In linguistics, there is as yet no proper term for words that
    themselves form the basis of many other words, that is, words that
    are frequently reused in derivations in the lexicon of a language.
    Following biology, where we find similar phenomena in protein
    domains (Basu et al. 2008, List et al. 2016), we could, however,
    speak of promiscuous concepts, to which «schlagen» in German
    should definitely belong. (List 2018: Von Wortfamilien und
    promiskuitiven Wörtern)

  22. Problem Solving
    Problem Solving Strategies

  23. Problem Solving CALC
    Computer-Assisted Language Comparison
    data in linguistics are steadily increasing
    our qualitative methods reach their practical limits
    we need to take computational methods into account
    but computational methods are not very accurate and may yield
    wrong results

  24. Problem Solving CALC
    Computer-Assisted Language Comparison

  35. Problem Solving CALC
    Computer-Assisted Language Comparison
    ERC Starting Grant (2017-2022)
    Host: MPI-SHH (Jena)
    Current team: 2 post-docs, 2 docs, and myself
    Objectives go beyond historical linguistics and Sino-Tibetan (but they are our starting point)
    http://calc.digling.org

  36. Problem Solving Machine Learning and Black Boxes
    Machine Learning and Black Boxes
    many problems of inference and analysis are nowadays addressed by
    employing machine learning methods in the broad sense
    among the most popular techniques are Bayesian inference and neural
    networks
    using different techniques for problem solving is less accepted, and
    often receives surprised reactions, specifically among scholars with
    little training in historical linguistics and linguistic typology

  37. Problem Solving Machine Learning and Black Boxes
    Linguistic Theory and Machine Learning
    For over 20 years, the IEEE has organised a biennial workshop
    covering the latest developments in automatic speech recognition
    and attended by the leading researchers in the field. Many of
    these meetings have been of high significance; for example, it was
    at the 1985 workshop entitled ‘Frontiers of Speech Recognition’
    that Fred Jelinek uttered the now immortal phrase “Every time
    we fire a phonetician/linguist, the performance of our system goes
    up”. By 1995 the series had become known as ‘ASRU’ - the IEEE
    workshop on Automatic Speech Recognition and Understanding.
    (Moore 2005: 1)

  38. Problem Solving Machine Learning and Black Boxes
    Caveats in Applying Naive Machine Learning Techniques
    Problems may have a “normal” solution, so there is no need to bother
    with an approximate one.
    Standard techniques may not be apt for the task at hand, e.g., Bayesian
    inference and neural networks are not good at induction, but
    induction is needed in many tasks.
    If the criteria upon which the results are based are hidden in a black
    box, they lack epistemological interest: as scientists we want to know
    what’s going on, not that a machine can do the same as we can.

  39. Problem Solving Machine Learning and Black Boxes
    Importance of Feature Design
    One of the promises of deep learning is that it vastly simplifies
    the feature-engineering process by allowing the model designer to
    specify a small set of core, basic, or “natural” features, and letting
    the trainable neural network architecture combine them into more
    meaningful higher-level features, or representations. However, one
    still needs to specify a suitable set of core features, and tie them
    to a suitable architecture. (Goldberg 2017: 18)

  40. Problem Solving Computer-Assisted Problem Solving
    Computer-Assisted Problem Solving
    identify the core class of your problem (modeling, inference, analysis)
    look at existing qualitative solutions
    formalize the problem in a way that allows one to test it (specify data
    and techniques for evaluation)
    do not hesitate to define sub-problems, given that qualitative
    solutions are often holistic
    search for inspiration in neighboring disciplines (graph theory,
    computer science, evolutionary biology) by looking for similar
    processes that could be addressed in an analogous or similar way
    accept a qualitative or semi-automatic solution for inference
    processes, but make sure that the results are annotated in a
    machine-readable way
    insist on transparent output (no black boxes) to allow for an
    immediate review of results by experts

  41. Possible Solutions
    Possible Solutions for the Inference Problems

  42. Possible Solutions Morpheme Segmentation
    Automatic Morpheme Segmentation: Task
    Given a list of fewer than 1000 words in phonetic transcription,
    readily segmented into sounds, with concepts mapped to common
    concept lists (e.g., Concepticon), identify the morpheme
    boundaries in the data.

  43. Possible Solutions Morpheme Segmentation
    Automatic Morpheme Segmentation: Current Solutions
    most algorithms build on n-grams (recurring symbol sequences of
    arbitrary length)
    assuming that n-grams representing meaning-building units should be
    distributed more frequently across the lexicon of a language, they
    assemble n-gram statistics from the data
    with Morfessor, a popular family of algorithms is available in the
    form of a stable library (Creutz and Lagus 2005, Virpioja et al. 2013)
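
    The n-gram idea above can be illustrated with a minimal sketch. It is
    not Morfessor and not any published algorithm: it only counts in how
    many words each n-gram of sound segments occurs, on the assumption
    that recurring n-grams are candidates for meaning-building units. The
    function name, the toy words, and the cut-off are all invented for
    illustration.

```python
from collections import Counter


def ngram_counts(words, max_n=3):
    """Count in how many words each n-gram of sound segments occurs."""
    counts = Counter()
    for segments in words:
        seen = set()
        for n in range(1, max_n + 1):
            for i in range(len(segments) - n + 1):
                seen.add(tuple(segments[i:i + n]))
        counts.update(seen)  # each n-gram is counted once per word
    return counts


# Toy data (invented): words already segmented into sounds.
words = [
    ["h", "a", "n", "t"],
    ["h", "a", "n", "t", "ʃ", "u", "x"],
    ["f", "u", "s"],
    ["f", "u", "s", "b", "a", "l"],
]
counts = ngram_counts(words)

# Hypothetical cut-off: n-grams longer than one segment that recur across words.
recurring = {ng for ng, c in counts.items() if len(ng) > 1 and c > 1}
print(sorted(recurring))
```

    Such counts ignore semantics and cross-linguistic evidence entirely,
    which is exactly the limitation discussed on the following slides.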

  44. Possible Solutions Morpheme Segmentation
    Automatic Morpheme Segmentation: Performance

  45. Possible Solutions Morpheme Segmentation
    Automatic Morpheme Segmentation: Difficulty
    morphemes are ambiguous, they are not only based on the form, but
    also on semantics
    even speakers may at times no longer understand the original
    morphology of their language (folk etymology, etc.)
    morphological judgments are thus based on different viewpoints
    (historical perspective involving more than one language, speaker
    intuition, descriptive grammar)

  46. Possible Solutions Morpheme Segmentation
    Automatic Morpheme Segmentation: Qualitative Solutions
    humans take semantics into account (e.g., Spanish hermano
    “brother” vs. hermana “sister”)
    humans know that morphological structure varies across languages
    (compare SEA languages vs. Indo-European languages)
    humans try to infer phonotactic rules
    humans make use of cross-linguistic evidence

  47. Possible Solutions Morpheme Segmentation
    Automatic Morpheme Segmentation: Suggestions
    employ semantic information (make use of new resources such as
    CLICS, Concepticon)
    employ phonotactic information (make use of the prosody models in
    LingPy)
    employ cross-linguistic information (use LingPy’s sequence
    comparison techniques)
    give up the idea of a universal morpheme segmentation algorithm
    (rather proceed from linguistic areas)
    invest time to create datasets for testing and training

  48. Possible Solutions Borrowing Detection
    Automatic Borrowing Detection: Task
    Given word lists of different languages, find out which words
    have been borrowed, and also determine the direction of
    borrowing.

  49. Possible Solutions Borrowing Detection
    Automatic Borrowing Detection: Current Solutions
    search for conflicts in the phylogeny and explain them by invoking
    borrowings (MLN approach, Nelson-Sathi et al. 2011, List et al. 2014)
    similar words among unrelated languages (Mennecier et al. 2016)
    tree reconciliation methods (Willems et al. 2016)
    borrowability statistics (Sergey Yakhontov, as reported by Starostin
    1990, Chén 1996, McMahon et al. 2005)
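
    The second strategy, searching for similar words among unrelated
    languages, might be sketched as follows. This is a generic
    illustration and not the procedure of any of the cited studies: it
    uses plain normalized edit distance, and the threshold of 0.4 as well
    as the toy word forms are assumptions made for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of sound segments."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance divided by the length of the longer sequence."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def borrowing_candidates(list_a, list_b, threshold=0.4):
    """Return concepts whose words are suspiciously similar in both lists."""
    hits = []
    for concept in sorted(set(list_a) & set(list_b)):
        distance = normalized_distance(list_a[concept], list_b[concept])
        if distance <= threshold:  # hypothetical cut-off
            hits.append((concept, distance))
    return hits


# Toy word lists for two languages assumed to be unrelated (invented forms).
lang_a = {"tea": list("tʃaj"), "hand": list("ruka"), "water": list("voda")}
lang_b = {"tea": list("tʃay"), "hand": list("el"), "water": list("su")}
print(borrowing_candidates(lang_a, lang_b))
```

    A hit is only a candidate: chance resemblances and the direction of
    borrowing still have to be checked by other means.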

  50. Possible Solutions Borrowing Detection
    Automatic Borrowing Detection: Performance
    conflicts in the phylogeny tend to overestimate the amount of
    borrowing, since there are multiple reasons for conflicts in
    phylogenies, not only borrowing (Morrison 2011)
    sequence comparison on unrelated languages seems solid, but one
    needs to be careful with chance resemblances based on
    onomatopoetic words etc. (mama, papa, etc., Jakobson 1960, Blasi
    et al. 2016)
    tree reconciliation methods are unrealistic if word trees are derived
    from simple edit distances
    sublist approaches may be useful, but they require extensive accounts
    of known borrowings, which we usually lack

  51. Possible Solutions Borrowing Detection
    Automatic Borrowing Detection: Difficulty
    detecting borrowing presupposes excluding alternative explanations
    (inheritance, natural patterns, chance)
    no unified procedure for the identification of borrowings in the
    classical discipline
    borrowing detection is much more based on multiple types of
    evidence (“consilience”, “cumulative evidence”) than other tasks in
    historical linguistics

  52. Possible Solutions Borrowing Detection
    Automatic Borrowing Detection: Qualitative Solutions
    Basic idea: search for conflicts (List, under review)
    search for phylogenetic conflicts (English mountain, French
    montagne)
    search for trait-related conflicts (German Damm, English dam)
    check for areal proximity (as a pre-condition)
    use borrowability arguments in cases of doubt, or as heuristics

  53. Possible Solutions Borrowing Detection
    Automatic Borrowing Detection: Suggestions
    increase cross-linguistic data in phonetic transcription and consistent
    definition of meanings to allow for search of similar words among
    unrelated languages
    test methods for automatic correspondence pattern recognition and
    search for trait-related conflicts (List 2019)
    work on cross-linguistic datasets of known borrowed words to increase
    our knowledge of borrowability

  54. Possible Solutions Sound Law Induction
    Automatic Sound Law Induction: Task
    Given a list of words in an ancestral language and their
    reflexes in a descendant language, identify the sound laws
    by which the ancestor can be converted into the descendant.
    *p > *pf / #_
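
    To make the notation concrete: a rule such as *p > *pf / #_ states
    that the source sound becomes the target sound in word-initial
    position only. The sketch below applies one such rule to a word that
    is already segmented into sounds. It shows the forward direction
    (applying a known law), not the induction task itself; the function
    name and the toy word are invented for illustration.

```python
def apply_initial_rule(segments, source, target):
    """Apply the rule source > target / #_ (word-initial position only)."""
    if segments and segments[0] == source:
        return [target] + list(segments[1:])
    return list(segments)


# Toy ancestral form (invented): only the initial p is affected.
print(apply_initial_rule(["p", "a", "p"], "p", "pf"))
```

    Induction is the inverse problem: given many such ancestor and
    descendant pairs, recover the rule and its conditioning context.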

  55. Possible Solutions Sound Law Induction
    Automatic Sound Law Induction: Current Solutions
    simulation studies (black boxes, see e.g., Ciobanu and Dinu 2018) for
    word prediction
    manual tools to model sound change when providing sound laws
    (PHONO, Hartmann 2003)
    correspondence-pattern based word prediction (List 2019, Bodt and
    List under review)

  56. Possible Solutions Sound Law Induction
    Automatic Sound Law Induction: Performance
    problem of handling conditioning context (be it long-distance or
    abstract)
    no direct solution to the task at hand

  57. Possible Solutions Sound Law Induction
    Automatic Sound Law Induction: Difficulty
    induction of rules as a problem usually not addressed in machine
    learning solutions
    problem of handling context of arbitrary distance to target sound
    problem of handling “abstract” context (suprasegmentals)
    problem of handling systemic aspects of sound change (where sound
    change is modeled in features)

  58. Possible Solutions Sound Law Induction
    Automatic Sound Law Induction: Suggestions
    Multi-tiered sequence modeling (List 2014, List and Chacon 2015, WIP):
    by modeling all different possible conditioning contexts, we make sure
    that we can find the context that conditions a sound change
    by selecting those which actually do condition a sound change, using
    computational tools, we can identify and propose potential
    environments of varying degrees of abstractness
    we still need, however, to reflect on how to handle systemic aspects of
    sound change
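
    A minimal sketch of the multi-tiered idea, under strong simplifying
    assumptions: ancestor and descendant words are aligned one-to-one,
    only two context tiers (preceding and following sound) are modeled,
    and a tier counts as conditioning if each of its values predicts
    exactly one reflex. The boundary symbols "#" and "$", the function
    names, and the toy data are assumptions of this example, not the
    work-in-progress implementation mentioned on the slide.

```python
from collections import defaultdict


def tiers(segments):
    """Return one dict per segment with its preceding and following sound."""
    padded = ["#"] + list(segments) + ["$"]
    return [
        {"segment": padded[i], "preceding": padded[i - 1], "following": padded[i + 1]}
        for i in range(1, len(padded) - 1)
    ]


def conditioning_tiers(pairs, sound):
    """Find the tiers whose values fully predict the reflex of `sound`."""
    result = {}
    for tier in ("preceding", "following"):
        reflexes = defaultdict(set)
        for ancestor, descendant in pairs:
            for context, reflex in zip(tiers(ancestor), descendant):
                if context["segment"] == sound:
                    reflexes[context[tier]].add(reflex)
        # A tier qualifies only if every context value maps to one reflex.
        if reflexes and all(len(r) == 1 for r in reflexes.values()):
            result[tier] = {k: next(iter(v)) for k, v in reflexes.items()}
    return result


# Toy aligned pairs (invented): p becomes pf word-initially, else stays p.
pairs = [
    (["p", "a", "p"], ["pf", "a", "p"]),
    (["a", "p", "a"], ["a", "p", "a"]),
    (["p", "i", "p"], ["pf", "i", "p"]),
]
print(conditioning_tiers(pairs, "p"))
```

    On this toy data only the preceding tier predicts the reflexes of p,
    which corresponds to the word-initial condition of the rule.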

  59. Possible Solutions Sound Law Induction
    Automatic Sound Law Induction: Suggestions
    [Figure: multi-tiered sequence representation of a word, with additional tiers for the preceding context, the following context, and abstract context (tone, palatal, nasal)]

  62. Possible Solutions Phonological Reconstruction
    Automatic Phonological Reconstruction: Task
    Given a set of alignments of strict cognate morphemes across a
    set of related languages, as well as the typical correspondence
    patterns by which the sounds in the languages correspond to each
    other, try to infer the hypothetical pronunciation of each
    morpheme in the proto-language.

  63. Possible Solutions Phonological Reconstruction
    Automatic Phonological Reconstruction: Current Solutions
    Bouchard-Côté et al. (2013) use a framework that makes use of
    probabilistic string transducers. If the family tree of the languages is
    known, and cognate sets are defined as such, the method produces
    proto-form suggestions.

  64. Possible Solutions Phonological Reconstruction
    Automatic Phonological Reconstruction: Performance
    the method by Bouchard-Côté et al. was only tested on Austronesian and
    is not available, so it cannot be tested further without
    re-implementing it from scratch
    the scores reported are good (error rates between 0.25 and 0.12), but
    Austronesian is not a challenging candidate for reconstruction
    the method cannot reconstruct sounds that are not found in the data,
    although this may well happen in language change
    the evaluation uses edit distance, but differences between
    reconstructions are better compared for structural differences than
    for substantial ones (List 2018)

  65. Possible Solutions Phonological Reconstruction
    Automatic Phonological Reconstruction: Difficulty
    scholars still disagree with respect to the question of how
    reconstruction should be best carried out, i.e., if it should be abstract
    or realistic (so-called abstractionalist-realist debate, Lass 2017,
    Jakobson 1958)
    no measures to account for the predictive quality of a given
    reconstruction system exist
    reconstructing what was not, as in the case of laryngeals in
    Indo-European (Saussure 1879), does not have a counterpart in
    biology (or is simply ignored in biology)

  66. Possible Solutions Phonological Reconstruction
    Automatic Phonological Reconstruction: Qualitative Solutions
    use of sound correspondence patterns (Anttila 1972, Meillet 1903)
    use of external evidence where possible
    use of internal reconstruction where possible

  67. Possible Solutions Phonological Reconstruction
    Automatic Phonological Reconstruction: Suggestions
    we should start from semi-automatic reconstructions (especially, since
    we can now compute sound correspondence patterns from alignment
    data, see List 2019), by having experts inspect correspondence
    patterns, and assigning one proto-form to each of them
    if we manage to implement our system for sound law induction with
    multi-tiered sequence representations, we could evaluate the overall
    plausibility of existing and proposed reconstruction systems
    automatically
    we need more data for testing and training
    we need to work on measures to compare different reconstruction
    systems (along the lines proposed in List 2018, by measuring
    structural differences)
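
    The semi-automatic workflow in the first suggestion could look
    roughly like this: experts assign one proto-sound to each
    correspondence pattern, and a proto-form is then read off every
    alignment by looking up its columns. The sketch assumes gap-free
    alignments and exact pattern matches; the function name, the
    languages A, B, and C, and all forms are invented for illustration.

```python
def reconstruct(alignment, languages, pattern_to_proto):
    """Look up an expert-assigned proto-sound for every alignment column."""
    proto = []
    for column in zip(*(alignment[lang] for lang in languages)):
        # Columns without an assigned proto-sound are flagged for review.
        proto.append(pattern_to_proto.get(column, "?"))
    return proto


languages = ["A", "B", "C"]

# Toy alignment of one cognate set (invented forms).
alignment = {
    "A": ["p", "a", "t"],
    "B": ["f", "a", "t"],
    "C": ["p", "a", "d"],
}

# Expert decisions: one proto-sound per correspondence pattern.
pattern_to_proto = {
    ("p", "f", "p"): "p",
    ("a", "a", "a"): "a",
}
print(reconstruct(alignment, languages, pattern_to_proto))
```

    Marking unassigned patterns with "?" keeps the output transparent, so
    that experts can see immediately where a decision is still missing.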

  68. Possible Solutions General Problems
    Evaluation
    lack of benchmark datasets, especially gold standards, training data, and baselines
    lack of good evaluation measures
    simulation methods may help to produce more evaluation data
    interfaces for data annotation are crucial to produce more high quality data

  69. Possible Solutions General Problems
    Standards
    are needed to make linguistic data comparable
    allow for a better integration of software and data
    can also guarantee that data is available in both human- and machine-readable form
    First attempts: Cross-Linguistic Data Formats initiative
    Forkel et al. (2018), Scientific Data.
    https://cldf.clld.org

  70. Possible Solutions General Problems
    Standards
    Reference Catalogs: Glottolog (languages), Concepticon (concepts), CLTS (sounds)
    Validation Software:
    >>> from pycldf import *
    >>> ds = Dataset('path')
    >>> ds.validate()
    >>> ds.statistics()
    Spreadsheet Formats (CLDF):
    ID   CONCEPT  IPA    COGNACY
    1    hand     hant   1
    2    hand     hænd   1
    3    ruka     ruka   2
    4    rẽnka    rẽnka  2
    ...  ...      ...    ...
    Online Publication (CLLD)
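
    To illustrate why a machine-readable spreadsheet format is useful,
    the sketch below reads the example table from this slide with
    Python's standard csv module and groups the forms by cognate set. It
    deliberately does not use pycldf; the comma-separated layout and the
    function name are assumptions made for this example, while the column
    names and rows are copied from the slide.

```python
import csv
import io
from collections import defaultdict

# The example rows from the slide, written as comma-separated text.
TABLE = """ID,CONCEPT,IPA,COGNACY
1,hand,hant,1
2,hand,hænd,1
3,ruka,ruka,2
4,rẽnka,rẽnka,2
"""


def cognate_sets(text):
    """Group the IPA forms of a wordlist by their cognate set identifier."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row["COGNACY"]].append(row["IPA"])
    return dict(groups)


print(cognate_sets(TABLE))
```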

  72. Possible Solutions General Problems
    Interfaces
    allow for a rapid annotation of data
    guarantee that data is human- and machine-readable
    allow for qualitative and quantitative research at the same time
    First attempts: Etymological Dictionary Editor (EDICTOR)
    List (2017): Proc. of the EACL. System Demonstrations.
    https://edictor.digling.org

  73. Possible Solutions General Problems
    Interfaces
    Etymological DICTionary ediTor (EDICTOR), http://edictor.digling.org, List (2017)
    annotate data, analyze data, edit alignments
    Wordlist columns: ID, DOCULECT, CONCEPT, SEGMENTS (example rows 1-4: the concept "moon" in Běijīng, Guǎngzhōu, Měixiàn, and Fúzhōu)
    Conversion and Segmentation: yuE_5_1liaN_1 → yɛ⁵¹liɑŋ¹ → y ɛ ⁵¹ l i ɑ ŋ ¹
    Highlighting of Unrecognized Phonetic Symbols (examples shown: wOld, N U O ?)
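
    The segmentation step shown on this slide (yɛ⁵¹liɑŋ¹ to
    y ɛ ⁵¹ l i ɑ ŋ ¹) can be approximated with a deliberately naive
    sketch: every character becomes its own segment, except that
    consecutive superscript tone digits are kept together. This is not
    EDICTOR's implementation, and it does not handle diacritics,
    affricates, or other multi-character sounds; the function name is
    invented for illustration.

```python
SUPERSCRIPT_DIGITS = set("⁰¹²³⁴⁵⁶⁷⁸⁹")


def segment(ipa):
    """Split a transcription into sounds, grouping superscript tone digits."""
    tokens = []
    for char in ipa:
        is_tone = char in SUPERSCRIPT_DIGITS
        if is_tone and tokens and tokens[-1][-1] in SUPERSCRIPT_DIGITS:
            tokens[-1] += char  # extend the running tone contour
        else:
            tokens.append(char)
    return tokens


print(" ".join(segment("yɛ⁵¹liɑŋ¹")))
```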

  74. Outlook
    Outlook

  75. Outlook Measuring
    Measuring
    «Measure what is measurable, and make measurable what is not
    so.» (Galileo Galilei [quote apparently falsely attributed to Galilei,
    see Kleinert 2009])

  76. Outlook Towards Big Data
    Towards Big Data
    CLICS: Database of Cross-Linguistic Colexifications
    http://clics.clld.org
    List et al. (2018)
    >1000 languages
    >1500 concepts

  77. Outlook Towards Big Data
    Towards Big Data
    CLICS: Database of Cross-Linguistic Colexifications
    http://clics.clld.org
    List et al. (2018)
    [Figure: CLICS colexification network; nodes are concepts (e.g., HAND, ARM, UPPER ARM, WRIST), linked when languages express them with the same word]
    HOLE
    FURROW
    DITCH
    LAIR
    JUDGMENT
    COURT
    ADJUDICATE
    CONDEMN
    CONVICT
    ACCUSE
    BLAME
    ANNOUNCE
    PREACH
    EXPLAIN
    SAY
    ASK (REQUEST)
    THROW
    BUDGE (ONESELF)
    SHOOT
    EMBERS
    UGLY
    CHOP
    CUT DOWN
    COLD (OF WEATHER)
    FIREWOOD
    GRASP
    LEAD (GUIDE)
    DISTANCE
    LIE DOWN
    CARRY ON HEAD
    PERMIT
    PUSH
    MOLAR TOOTH
    FRONT TOOTH (INCISOR)
    RIDGEPOLE
    BEAK
    COAT
    TOWEL
    HELMET
    SHIRT
    HEADBAND
    HEADGEAR
    RAG
    VEIL
    SOON
    TOGETHER
    IMMEDIATELY
    NEST
    NOW
    BED
    TODAY
    INSTANTLY
    SUDDENLY
    RUG
    WITHOUT
    PONCHO
    BLANKET
    CLOAK
    MAT
    BEFORE
    BOLT (MOVE IN HASTE)
    ROAR (OF SEA)
    FAST
    DASH (OF VEHICLE)
    EARLY
    YESTERDAY
    HURRY
    AT FIRST
    EMPTY
    NO
    DRY
    ZERO
    NOTHING
    NOT
    RESULT IN
    BE BORN
    HAPPEN
    PASS
    SUCCEED
    BECOME
    BRAVE
    CLOTH
    POWERFUL
    DARE
    LOUD
    GRASS-SKIRT
    DRESS
    CLOTHES
    SKIRT
    RIPEN
    SOLID
    PIERCE
    HARD
    BEGET
    ROUGH
    REFUSE
    FRY
    DRESS UP
    DENY
    CALM
    MORNING
    PEACE
    BE SILENT
    QUIET
    SWELL
    TOMORROW
    HEALTHY
    EXPENSIVE
    HAPPY
    ROAST OR FRY
    STRONG BAKE
    PRICE
    BOIL (SOMETHING)
    PUT ON
    COOKED
    SLOW
    FAITHFUL
    RIGHT
    LAST (ENDURE)
    FOR A LONG TIME
    DAWN
    BEAUTIFUL
    GOOD
    COOK (SOMETHING)
    YES
    CORRECT (RIGHT)
    BOIL (OF LIQUID)
    DO
    PUT
    BRIGHT
    CLEAN
    LIGHT (COLOR)
    LAY (VERB)
    SHINE
    SEAT (SOMEBODY)
    INNOCENT
    FORBID
    PREPARE
    CERTAIN
    TRUTH TRUE
    DEAR
    PRECIOUS
    WARM
    HEAT
    CONCEIVE
    SEW
    LOOM
    PLAIT
    LIGHT (IGNITE)
    BURN (SOMETHING) PREVENT
    HOLY
    GOOD-LOOKING
    ARSON
    BEND
    CHANGE (BECOME DIFFERENT)
    BURNING
    TWIST
    DEBT
    CROOKED
    ROLL
    SPIN
    HEAVY
    HOT
    WEAVE
    DIFFICULT
    FEVER
    PLAIT OR BRAID OR WEAVE
    PREGNANT
    OWE
    TWINKLE
    CLEAR
    BEND (SOMETHING)
    MORTAR CRUSHER
    PESTLE
    BITTER
    MILL MONTH SKULL
    MEASURE
    TRY
    COME BACK TIME
    MOON
    COUNT
    JOIN
    SQUEEZE
    PILE UP
    CLOCK
    BUY
    DRAW MILK
    DAY (24 HOURS)
    BETRAY
    GUARD
    PROTECT
    PAY
    KNEE
    KEEP
    SELL
    SUN
    BILL
    HELP
    LIE (MISLEAD)
    TRADE OR BARTER
    DECEIT
    PERJURY
    RESCUE
    CURE
    FOLD
    SIEVE
    PRESERVE
    TRANSLATE
    TURN (SOMETHING)
    TURN
    WRAP
    HERD (SOMETHING)
    WAGES
    DEFEND
    CHANGE
    RETURN HOME
    TIE UP (TETHER)
    TURN AROUND
    HANG
    KNIT
    WEIGH
    HANG UP
    GIVE BACK
    CONNECT
    COVER
    BUTTON
    BUNCH
    KNOT
    SHUT
    BUNDLE
    TIE
    NOOSE
    GILL
    EAR
    EARLOBE
    THINK
    FOLLOW
    JEWEL
    BE ABLE
    OBEY
    SUMMER
    FEEL (TACTUALLY)
    REMEMBER
    SUSPECT
    BELIEVE
    GUESS
    RECOGNIZE (SOMEBODY)
    SOUR
    SWEET
    SUGAR CANE
    BRACKISH
    SUGAR
    TASTY
    CALCULATE
    IMITATE
    CITRUS FRUIT
    TASTE (SOMETHING)
    READ
    COME
    PRECIPICE
    SEE
    STONE OR ROCK
    APPROACH
    TOUCH
    ARRIVE
    YEAR
    MEET
    GRIND
    FRAGRANT
    ROTTEN SMELL (STINK)
    SMELL (PERCEIVE)
    STINKING
    SNIFF
    PUS
    FEEL
    UNDERSTAND
    HEAR
    THINK (BELIEVE)
    LISTEN
    MOVE (AFFECT EMOTIONALLY)
    KNOW (SOMETHING)
    NOTICE (SOMETHING)
    WATCH
    LEARN
    REEF
    STUDY
    LOOK FOR
    LOOK
    NASAL MUCUS (SNOT)
    SPLASH
    PITY
    HIDE (CONCEAL)
    SHELF
    FLY (MOVE THROUGH AIR)
    REGRET
    NOSTRIL
    THIEF
    BOARD
    SINK (DESCEND)
    DECREASE
    CHEEK
    NOSE
    BROKEN
    LOSE
    EMERGE (APPEAR)
    ANXIETY
    BAD LUCK
    GOOD LUCK
    OMEN
    WRONG
    SLAB
    FOREHEAD
    EYE
    BAD
    EVIL
    TABLE
    INJURE
    DANGER
    SURPRISED
    HARVEST
    BERRY
    FEAR (FRIGHT)
    NUT FAULT
    MISTAKE
    BECOME SICK
    SEED
    MISS (A TARGET)
    GUILTY
    SWELLING
    BRUISE
    BLISTER
    BOIL (OF SKIN)
    SCAR
    CHOKE
    ENTER
    ACHE
    SICK
    DISEASE
    PAIN
    DAMAGE (INJURY)
    SEVERE
    GRIEF
    SAUSAGE
    BEAD
    STOMACH
    INTESTINES
    CHAIN
    SPLEEN
    NECKLACE
    WOMB
    LIVER
    BELLY
    MEANING
    GHOST
    POSTCARD
    HEART
    LEGENDARY CREATURE
    SHADE
    DEMON
    BRAIN MEMORY
    FIGHT
    LETTER
    THOUGHT
    MIND
    BOOK
    COLLAR INTENTION
    SPIRIT
    PURSUE
    LONG HAIR
    SPRINGTIME
    HAIR (HEAD)
    THINK (REFLECT)
    DOUBT
    AUTUMN
    ORNAMENT
    HOPE
    ARMY
    QUARREL
    BEAT
    SOLDIER
    KNOCK
    BATTLE
    NOISE
    REST
    NAPE (OF NECK)
    THROAT
    NECK
    IDEA
    IF
    BECAUSE
    SLEEP
    FOREST
    DRIP (FALL IN GLOBULES)
    STICK
    TREE
    WALKING STICK
    PLANT (VEGETATION)
    LIE (REST)
    DRAG
    ASK (INQUIRE)
    DIVIDE
    URGE (SOMEONE)
    STING
    BRANCH
    CAMPFIRE BORROW SEPARATE TOOTH
    MOUTH
    CANDLE
    FALL ASLEEP
    DRIVE (CATTLE)
    MATCH
    DRIVE
    RAFTER
    BEAM
    DOORPOST
    DREAM (SOMETHING)
    POST
    MAST
    TUMBLE (FALL DOWN)
    WALK
    TREE TRUNK
    LAND (DESCEND)
    TEAR (SHRED)
    SAW
    GO OUT
    FALL
    TEAR (OF EYE) GO DOWN (DESCEND)
    BODY
    TREE STUMP
    SHOW
    CARVE
    SPOIL (SOMEBODY OR
    SOMETHING)
    BREAK (CLEAVE)
    PLANT (SOMETHING)
    DESTROY
    WALK (TAKE A WALK)
    CHIN
    BREAK (DESTROY OR GET
    DESTROYED)
    CUT
    PICK
    SPLIT
    LEAVE
    PULL
    CLUB
    WOOD
    MOVE (ONESELF)
    HIRE
    PRAISE
    MIX
    KNEAD
    WIPE
    SNEEZE
    BOAST
    SCRATCH
    CLEAN (SOMETHING)
    HOARFROST
    WORSHIP
    COUGH
    SWEEP
    RUB
    SCRAPE
    CARCASS
    DIE (FROM ACCIDENT)
    DIE
    BATHE
    SWIM
    DEAD
    FLOAT
    LOVE
    STAB
    SAIL
    PEEL
    SPREAD OUT
    CRY
    COMMON COLD (DISEASE)
    FROST
    CORPSE
    SHRIEK
    JUMP
    SHOUT
    DIG
    WINTER
    NAME
    STREAM (FLOW CONTINUOUSLY)
    PLOUGH
    CULTIVATE
    PLAY
    VISIBLE
    SEEM
    STRETCH
    SOW SEEDS
    RETREAT
    INVITE
    MUSIC
    RUN
    COLD
    HOLLOW OUT
    CHARCOAL
    TONGUE
    STOVE
    CONVERSATION
    SKIN
    DIVORCE
    OVEN
    EARWAX
    COOKHOUSE
    TIP (OF TONGUE)
    AIR
    HUNT
    BORE
    CALL BY NAME
    BREATH
    STEP (VERB)
    SONG
    ATTACK
    WASH
    PROUD
    SIN
    DEFENDANT
    CRIME
    CHIME (ACTION) EGG
    TESTICLES
    BARLEY
    FRUIT
    VEGETABLES
    GRAIN
    MAIZE
    RICE
    WHEAT
    RUDDER
    RYE
    PADDLE SWAY
    SWING (MOVEMENT)
    SWING (SOMETHING)
    SHAKE
    ROW
    FREEZE
    JOG (SOMETHING)
    OAT
    SHIVER
    RINSE
    RING (MAKE SOUND)
    MAKE NOISE
    SOUND (OF INSTRUMENT OR
    VOICE)
    TINKLE
    HOE
    SHOVEL
    SPADE
    FLOW
    DANCE
    FLEE
    CALL
    DAMAGE
    SAME FACE
    SIMILAR DISAPPEAR
    ESCAPE
    PRAY GAME
    BURY
    CAPE
    CHAIR
    MOVE
    STEAL
    GROAN
    HOWL
    COLD (CHILL)
    JAW
    DROWN
    SINK (DISAPPEAR IN WATER)
    SET (HEAVENLY BODIES)
    DIVE
    WOUND
    POUND
    TALK
    BREATHE
    PROMISE
    SPEAK
    WIND
    VOICE
    FUR
    PUBIC HAIR
    SOUND OR NOISE
    STRIKE OR BEAT
    BARK
    SCALE
    KILL
    HAMMER
    TONE (MUSIC)
    WOOL
    EXTINGUISH
    MURDER
    HIT
    SPEECH
    CHAT (WITH SOMEBODY)
    WORD
    STORM
    THRESH
    LEATHER
    LIKE
    NEED (NOUN)
    FELT
    SKIN (OF FRUIT)
    PAPER
    OATH
    WANT
    SWEAR
    KICK
    SNAIL
    DEATH
    PULL OFF (SKIN)
    SHELL
    FIREPLACE
    PEN
    HAIR (BODY)
    LANGUAGE
    CONVEY (A MESSAGE)
    TELL
    LEAF (LEAFLIKE OBJECT)
    FEATHER
    POUR
    FLAME
    GO
    SING
    BEESWAX
    HELL
    GATHER
    CARRY
    SEIZE
    CATCH
    TRAP (CATCH)
    WING
    FIRE
    CARRY ON SHOULDER
    CAST
    MOW
    BOSS
    FIND
    FIN
    ADMIT
    TEACH
    LEAF
    SAILCLOTH
    HAIR ANSWER
    SAY
    FOOT
    CIRCLE
    GRAIN
    Largest connected
    component in CLICS²
    Clusters inferred with
    the Infomap Community
    Detection algorithm
    List et al. (u. rev.)
    60 / 62

  78. Outlook Towards Big Data
    Towards Big Data
    CLICS: Database of Cross-Linguistic Colexifications
    http://clics.clld.org
    List et al. (2018)
    [Figure: colexification network; individual concept labels omitted]
    Largest connected component in CLICS².
    Clusters inferred with the Infomap community detection algorithm.
    List et al. (u. rev.)
    Cluster shown separately: TONGUE, TELL, ANNOUNCE, TALK, ADMIT,
    CHAT (WITH SOMEBODY), SAY, WORD, ANSWER, LANGUAGE, VOICE,
    SOUND OR NOISE, NOISE, PREACH, SPEECH, TONE (MUSIC), EXPLAIN,
    CONVERSATION, CONVEY (A MESSAGE), SPEAK
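    The idea behind such a network can be sketched in a few lines of
    Python. This is only a minimal illustration under stated assumptions:
    the wordlist is a hypothetical toy example (not CLICS data), edge
    weights are simply counted per language, and the function names are
    invented for this sketch. It builds a colexification graph and
    extracts its largest connected component; the Infomap clustering
    step itself is not reproduced here, since it would require an
    external library.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical toy wordlist of (language, concept, form); not real CLICS data.
WORDLIST = [
    ("lang_a", "SAY", "ta"),
    ("lang_a", "SPEAK", "ta"),
    ("lang_a", "TREE", "ku"),
    ("lang_a", "WOOD", "ku"),
    ("lang_b", "SPEAK", "mo"),
    ("lang_b", "TALK", "mo"),
    ("lang_b", "WORD", "mo"),
    ("lang_b", "TREE", "si"),
    ("lang_c", "WORD", "ri"),
    ("lang_c", "LANGUAGE", "ri"),
    ("lang_c", "TREE", "pa"),
    ("lang_c", "WOOD", "pa"),
]


def colexification_graph(wordlist):
    """Link two concepts whenever one language uses the same form for both."""
    by_form = defaultdict(set)
    for language, concept, form in wordlist:
        by_form[(language, form)].add(concept)

    # For every concept pair, collect the languages that colexify it.
    languages = defaultdict(set)
    for (language, _form), concepts in by_form.items():
        for pair in combinations(sorted(concepts), 2):
            languages[pair].add(language)

    adjacency = defaultdict(set)
    for a, b in languages:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Edge weight = number of languages attesting the colexification
    # (an assumption of this sketch).
    weights = {pair: len(langs) for pair, langs in languages.items()}
    return adjacency, weights


def largest_component(adjacency):
    """Return the largest connected component, found by breadth-first search."""
    seen, best = set(), set()
    for start in list(adjacency):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


adjacency, weights = colexification_graph(WORDLIST)
component = largest_component(adjacency)
print(sorted(component))  # ['LANGUAGE', 'SAY', 'SPEAK', 'TALK', 'WORD']
```

    In the toy data, TREE and WOOD form a small component of their own,
    while the speech-related concepts are chained together through
    different languages. A community detection method would then be
    applied to the large component to split it into clusters.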
    60 / 62

  79. Outlook New Hypotheses
    New Hypotheses
    We do not need to ask completely stupid questions, but we
    should always work on questioning our key assumptions about
    language, language evolution, and how we study its synchronic
    and diachronic structures. Formulating open problems for our
    field is a first step towards their solution. Searching for open
    problems in our field that may have been overlooked so far is a
    first step towards a deeper understanding of our research and
    our research subject.
    61 / 62

  80. Outlook New Hypotheses
    62 / 62
