Multiple sequence alignments in historical linguistics

Paper presented at the conference "ConSOLE XIX" (Groningen; Student Organization of Linguistics in Europe).

Johann-Mattis List

January 06, 2011

Transcript

  1. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Multiple Sequence Alignment in Historical Linguistics
    A Sound Class Based Approach
    Johann-Mattis List∗
    ∗Institute for Romance Languages and Literature
    Heinrich Heine University Düsseldorf
    2011/01/06

  2. Structure of the Talk
    Introduction
      Sequences
      Alignments
    Automatic Alignment Analyses
      Pairwise Sequence Alignment
      Multiple Sequence Alignment
    Alignments in Historical Linguistics
      Similarity
      Sound Classes
    LingPy
      Main Ideas
      Working Principle
      Scoring
    Performance of the Method
      Usage Example
      TPPSR

  3. Introduction
    - Sequences -
    - Alignments -

  4. Sequences
    Sets
    Sets are unordered collections of unique objects.
    Sets are compared by comparing the objects they contain.
    Sequences
    Sequences are ordered lists of objects that need not be unique.
    Sequences are compared by comparing both their objects
    (segments) and their structure.
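
The distinction can be illustrated with a minimal Python sketch. The helper name `same_sequence` and the second example word are chosen here for illustration only; they are not part of the talk.

```python
# Illustrative sketch: sets ignore order and repetition, sequences do not.

def same_sequence(a, b):
    """Sequences are equal only if they hold the same segments in the same order."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))

word_a = ["t", "a", "k"]
word_b = ["k", "a", "t"]   # hypothetical second word with the same segments

# As sets, the two words are indistinguishable: same objects, no order.
print(set(word_a) == set(word_b))           # True
# As sequences they differ, because the structure (order) is compared too.
print(same_sequence(word_a, word_b))        # False
# Repeated segments collapse in a set but are kept in a sequence.
print(len(set("mama")), len(list("mama")))  # 2 4
```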

  5. Alignments
    Sequence Alignment
    In alignment analyses, the corresponding segments of two or
    more sequences are arranged so that they are set against each
    other. Segments that do not correspond to any other segment
    are marked by gaps (-). In this way, both the structure and
    the segments of two or more sequences can be compared.
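
One possible way to represent such an alignment in code is as rows of equal length, with a gap symbol filling the empty slots. The sketch below uses the two transcriptions from the talk as data; the function names `columns` and `ungap` are assumptions made for this example.

```python
# Illustrative sketch: an alignment as rows of equal length, with "-" for gaps.
GAP = "-"

alignment = [
    ["ʧ", "ɪ", "l", "ɐ", "vʲ", "ɛ", "k"],
    ["ʧ", GAP, GAP, "o", "v", "ɛ", "k"],
]

def columns(rows):
    """Return the alignment column by column; all rows must be equally long."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    return list(zip(*rows))

def ungap(row):
    """Recover the original sequence by dropping the gap symbols."""
    return [segment for segment in row if segment != GAP]

# Each column sets corresponding segments (or a segment and a gap) against each other.
for column in columns(alignment):
    print(" ".join(column))
```

Reading the rows recovers the segments; reading the columns recovers the structure.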

  6. Alignments
    ʧ ɪ l ɐ vʲ ɛ k
    ʧ o v ɛ k

  8. Alignments
    ʧ ɪ l ɐ vʲ ɛ k
    ʧ - - o v ɛ k

  9. Automatic Alignment Analyses
    hjärta   h j - ä r t a -
    herz     h - e - r z - -
    heart    h - e a r t - -
    cordis   c - - o r d i s

  10. Pairwise Sequence Alignment
    Create a matrix which confronts all segments of the two
    sequences, either with each other or with gaps.
    Seek the path through the matrix with the lowest cost
    (or the highest score).
    Calculate the cost (or the score) cumulatively by scoring
    the matching of segments with segments and with gaps
    by means of a specific scoring function.
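
The three steps can be sketched as a small dynamic-programming routine. This is a minimal sketch under an assumed scoring function (match 0, mismatch 1, gap 1), which reproduces the costs shown for the TEST/TEST alignments; the function names are chosen for this example and are not LingPy's API.

```python
# Illustrative sketch: global pairwise alignment with unit costs
# (assumption: match 0, mismatch 1, gap 1).

def alignment_cost(row_a, row_b):
    """Sum the column costs of two already aligned rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return sum(0 if x == y else 1 for x, y in zip(row_a, row_b))

def align(a, b, gap="-"):
    """Return (cost, row_a, row_b) for one lowest-cost alignment of a and b."""
    n, m = len(a), len(b)
    # matrix[i][j] holds the lowest cost of aligning a[:i] with b[:j].
    matrix = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        matrix[i][0] = i
    for j in range(1, m + 1):
        matrix[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            matrix[i][j] = min(
                matrix[i - 1][j - 1] + (a[i - 1] != b[j - 1]),  # segment vs segment
                matrix[i - 1][j] + 1,                           # segment of a vs gap
                matrix[i][j - 1] + 1,                           # gap vs segment of b
            )
    # Trace one lowest-cost path back from the bottom-right cell.
    row_a, row_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and matrix[i][j] == matrix[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and matrix[i][j] == matrix[i - 1][j] + 1:
            row_a.append(a[i - 1])
            row_b.append(gap)
            i -= 1
        else:
            row_a.append(gap)
            row_b.append(b[j - 1])
            j -= 1
    return matrix[n][m], row_a[::-1], row_b[::-1]

cost, row_a, row_b = align("TEST", "TEST")
print(cost)                                                     # 0
print(alignment_cost(list("TEST----"), list("----TEST")))       # 8
```

`alignment_cost` scores a given alignment, while `align` searches the matrix for the cheapest one.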

  11. Pairwise Sequence Alignment
    T E S T - - - -
    - - - - T E S T
    Cost: 8

  12. Pairwise Sequence Alignment
    T E S T - - -
    - - - T E S T
    Cost: 6

  13. Pairwise Sequence Alignment
    T E S - T - - -
    - - - T - E S T
    Cost: 8

  14. Pairwise Sequence Alignment
    T E S T - - -
    - - T - E S T
    Cost: 7

  15. Pairwise Sequence Alignment
    T E - S T - - -
    - - T - - E S T
    Cost: 8

  16. Pairwise Sequence Alignment
    T E S T - - -
    - T - - E S T
    Cost: 7

  17. Pairwise Sequence Alignment
    T - E S T - - -
    - T - - - E S T
    Cost: 8

  18. Pairwise Sequence Alignment
    T E S T - - -
    T - - - E S T
    Cost: 6

  19. Pairwise Sequence Alignment
    T E S T - -
    T - - E S T
    Cost: 5

  20. Pairwise Sequence Alignment
    T E S - T - -
    T - - E - S T
    Cost: 6

  21. Pairwise Sequence Alignment
    T E S T - -
    T - E - S T
    Cost: 5

  22. Pairwise Sequence Alignment
    T E - S T - -
    T - E - - S T
    Cost: 6

  23. Pairwise Sequence Alignment
    T E S T - -
    T E - - S T
    Cost: 4

  24. Pairwise Sequence Alignment
    T E S T -
    T E - S T
    Cost: 3

  25. Pairwise Sequence Alignment
    T E - S T -
    T E S - - T
    Cost: 4

  26. Pairwise Sequence Alignment
    T E S T -
    T E S - T
    Cost: 2

  27. Pairwise Sequence Alignment
    T E S T
    T E S T
    Cost: 0

  28. Multiple Sequence Alignments
    Guide Tree Heuristics
    Since exact multiple alignment is computationally too expensive,
    multiple sequence alignment (MSA) is based on heuristics.
    Heuristics based on guide trees are the most common
    ones used in computational biology.
    Based on pairwise alignment scores, a guide tree is
    reconstructed, and the sequences are added stepwise to
    the MSA along it (Feng & Doolittle 1987).
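
How a guide tree might be derived from pairwise scores can be sketched as follows. Two assumptions are made here that the talk does not specify: the pairwise score is a unit-cost edit distance, and the tree is built by average-linkage clustering. Ties are broken by input order, so the resulting tree need not match the joining order shown in the talk.

```python
# Illustrative sketch: a guide tree from pairwise distances.
# Assumptions: unit-cost edit distance, average-linkage clustering.
from itertools import combinations

def edit_distance(a, b):
    """Unit-cost edit distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j - 1] + (x != y), prev[j] + 1, cur[j - 1] + 1))
        prev = cur
    return prev[-1]

def guide_tree(seqs):
    """Return a nested tuple of indices into seqs describing the join order."""
    # Each cluster is a pair (tree, list of member indices).
    clusters = [(i, [i]) for i in range(len(seqs))]
    dist = {(i, j): edit_distance(seqs[i], seqs[j])
            for i, j in combinations(range(len(seqs)), 2)}

    def average(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1[1] for j in c2[1]]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: average(*pair))
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]

words = [
    ["ʧ", "ɪ", "l", "ɐ", "vʲ", "ɛ", "k"],   # Russian
    ["ʧ", "l", "o", "vʲ", "ɛ", "k"],        # Czech
    ["ʧ", "w", "ɔ", "vʲ", "ɛ", "k"],        # Polish
    ["ʧ", "o", "v", "ɛ", "k"],              # Bulgarian
]
print(guide_tree(words))
```

The sequences would then be added to the MSA in the order given by the nested tuple, innermost pair first.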

  29. Multiple Sequence Alignment
    Language    Word       Meaning
    Russian     čelovek    “human”
    Czech       člověk     “human”
    Polish      człowiek   “human”
    Bulgarian   čovek      “human”

  30. Multiple Sequence Alignment
    Language    Word
    Russian     ʧɪlɐvʲɛk
    Czech       ʧlovʲɛk
    Polish      ʧwɔvʲɛk
    Bulgarian   ʧovɛk

  36. Multiple Sequence Alignment
    Pairwise alignments:
    Russian     ʧ ɪ l ɐ vʲ ɛ k
    Czech       ʧ - l o vʲ ɛ k
    Polish      ʧ w ɔ vʲ ɛ k
    Bulgarian   ʧ - o v ɛ k
    Multiple alignment:
    Russian     ʧ ɪ l ɐ vʲ ɛ k
    Czech       ʧ - l o vʲ ɛ k
    Polish      ʧ - w ɔ vʲ ɛ k
    Bulgarian   ʧ - - o v ɛ k

  37. Multiple Sequence Alignment
    Profiles
    The guide-tree heuristic can be enhanced by the
    application of profiles.
    A profile consists of the relative frequencies of all segments
    of an MSA in each of its positions; a profile thus represents
    an MSA as a sequence of vectors.
    Aligning profiles to profiles, instead of aligning two
    representative sequences of two given MSAs, yields better
    results, since more information can be taken into account.
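
A profile in this sense can be computed in a few lines. The sketch below is an illustration, not LingPy's implementation; it uses the alignment of the four Slavic words from the talk, and the function name `profile` is chosen for this example.

```python
# Illustrative sketch: turn an MSA into a profile, one frequency vector per column.
from collections import Counter

def profile(msa):
    """Return, for each column, a dict mapping segment -> relative frequency."""
    n_rows = len(msa)
    return [
        {segment: count / n_rows for segment, count in Counter(column).items()}
        for column in zip(*msa)
    ]

msa = [
    ["ʧ", "ɪ", "l", "ɐ", "vʲ", "ɛ", "k"],
    ["ʧ", "-", "l", "o", "vʲ", "ɛ", "k"],
    ["ʧ", "-", "w", "ɔ", "vʲ", "ɛ", "k"],
    ["ʧ", "-", "-", "o", "v", "ɛ", "k"],
]

# Gaps are counted like any other symbol, so every vector sums to 1.
for position, vector in enumerate(profile(msa), 1):
    print(position, vector)
```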

  46. Multiple Sequence Alignment
    ʧ ɪ l ɐ vʲ ɛ k
    ʧ - l o vʲ ɛ k
    ʧ - w ɔ vʲ ɛ k
    ʧ - - o v ɛ k
    Profile (relative frequency of each segment per alignment column):
          1     2     3     4     5     6     7
    ʧ     1.0
    ɪ           .25
    l                 .5
    -           .75   .25
    w                 .25
    o                       .5
    ɔ                       .25
    ɐ                       .25
    vʲ                            .75
    v                             .25
    ɛ                                   1.0
    k                                         1.0

  48. Alignments in Historical Linguistics
    *ph₂tēr
    *faθēr
    father

  49. Similarity
    Synchronic Similarity
    Sounds in different languages are judged to be similar if they
    show resemblances in the way they are produced or
    perceived.
    Diachronic Similarity
    Sounds in different languages are judged to be similar if they
    go back to a common ancestor.

  50. Similarity
    Language   Word       Meaning
    Mandarin   ma⁵⁵ma³    “mother”
    German     mama       “mother”
    Russian    tak        “in this way”
    German     tʰaːk      “day”

  51. Similarity
    Language   Word     Meaning
    German     ʦʰaːn    “tooth”
    English    tʊːθ     “tooth”
    Italian    dɛntɛ    “tooth”
    French     dɑ̃       “tooth”

  52. Similarity
    German                   ʦʰ aː n -
    * Proto-Germanic         t a n d
    English                  t ʊː θ -
    ** Proto-Indo-European   d o n t
    Italian                  d ɛ n t ɛ
    * Proto-Romance          d e n t
    French                   d ã - -

  54. Similarity
    German                   ʦʰ aː n -
    * Proto-Germanic         t a n d
    English                  t ʊː - θ
    ** Proto-Indo-European   d o n t
    Italian                  d ɛ n t ə
    * Proto-Romance          d e n t
    French                   d ã - -

  55. Similarity
    German                   ʦʰ aː n -
    * Proto-Germanic         t a n θ
    English                  t ʊː - θ
    ** Proto-Indo-European   d o n t
    Italian                  d ɛ n t ə
    * Proto-Romance          d e n t
    French                   d ã - -

  59. Sound Classes
    Correspondence Classes
    In sound class approaches, sounds are “divided into several
    types and thereby distinguished in such a way that phonetic
    correspondences inside a ‘type’ are more regular than those
    between different ‘types’” (Dolgopolsky 1986: 35).
    Diachronic Similarity
    Similarity is based not on synchronic resemblances of sounds
    but on class membership: two sounds, however dissimilar they
    may be from a synchronic perspective, may still belong to the
    same class. Class membership indicates a high probability
    that the sounds occur in a correspondence relationship in
    genetically related languages.

  60. Sound Classes
    k g p b
    ʧ ʤ f v
    t d ʃ ʒ
    θ ð s z

  63. Sound Classes
    K
    T
    P
    S

  64. LingPy

  65. LingPy
    A Python Library for Sequence Alignment
    LingPy (www.lingulist.de/lingpy) is a suite of open-source
    Python modules for sequence comparison and distance
    analyses in quantitative historical linguistics. The library
    supports both pairwise and multiple alignments of strings
    encoded in IPA or X-SAMPA, using different methods and
    algorithms, such as global (Needleman & Wunsch 1970) and
    local (Smith & Waterman 1981) pairwise alignments, and multiple
    alignments based on guide trees (Feng & Doolittle 1987),
    profiles (Thompson et al. 1994), or iteration (Barton & Sternberg
    1987).

  66. Main Ideas
    Alignment of Sound Class Sequences
    In contrast to previous approaches, which align the
    sequences as they are given in the input, the sound class
    approach first converts the input strings to sound classes
    and then aligns them.
    Transitions Between Sound Classes
    In contrast to previous sound class approaches (cf. e.g. Turchin
    et al. 2010), which do not allow for transitions between sound
    classes, this approach is based on a specific scoring
    function, which defines (diachronic) similarity among different
    sound classes.
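
The difference between the two kinds of scoring can be sketched as follows. The class labels K, T, C and P are taken from the talk, but the numeric scores and the choice of which transitions are allowed are invented for illustration; they are not the values used in LingPy.

```python
# Illustrative sketch: scoring pairs of sound classes with and without transitions.
# All numeric values below are hypothetical.

def binary_score(class_a, class_b):
    """No transitions: two classes either match or they do not."""
    return 1.0 if class_a == class_b else 0.0

# Hypothetical graded scores for selected pairs of different classes.
TRANSITIONS = {
    frozenset(("K", "C")): 0.5,   # e.g. velars and affricates
    frozenset(("T", "C")): 0.5,   # e.g. dentals and affricates
}

def graded_score(class_a, class_b):
    """Transitions allowed: some pairs of different classes score above zero."""
    if class_a == class_b:
        return 1.0
    return TRANSITIONS.get(frozenset((class_a, class_b)), 0.0)

print(binary_score("K", "C"), graded_score("K", "C"))   # 0.0 0.5
```

Using `frozenset` keys keeps the table symmetric, so the score does not depend on the order of the two classes.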

  71. Working Principle
    INPUT
    ʧɪlɐvʲɛk
    ʧovɛk
    TOKENIZATION
    ʧ, ɪ, l, ɐ, vʲ, ɛ, k
    ʧ, o, v, ɛ, k
    CONVERSION
    CILAWEK
    COWEK
    ALIGNMENT
    C I L A W E K
    C - - O W E K
    OUTPUT
    ʧ ɪ l ɐ vʲ ɛ k
    ʧ - - o v ɛ k
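
The steps from INPUT to OUTPUT can be sketched in plain Python. Several assumptions apply: the small class table is read off the conversion example above and is not LingPy's full sound class model, the tokenizer only attaches a few modifier symbols to the preceding segment, and the aligner uses unit costs rather than the scoring function of the talk. All function names are chosen for this example.

```python
# Illustrative sketch: tokenize, convert to sound classes, align, map back.
SOUND_CLASS = {"ʧ": "C", "ɪ": "I", "l": "L", "ɐ": "A", "v": "W",
               "ɛ": "E", "k": "K", "o": "O"}   # assumption: read off the slide
MODIFIERS = {"ʲ", "ʰ", "ː"}                     # attached to the preceding segment

def tokenize(word):
    tokens = []
    for char in word:
        if char in MODIFIERS and tokens:
            tokens[-1] += char
        else:
            tokens.append(char)
    return tokens

def to_classes(tokens):
    # The class is looked up by the base symbol of each token.
    return [SOUND_CLASS[token[0]] for token in tokens]

def align(a, b, gap="-"):
    """Unit-cost global alignment; returns the two aligned rows."""
    n, m = len(a), len(b)
    mat = [[i + j if i == 0 or j == 0 else 0 for j in range(m + 1)]
           for i in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mat[i][j] = min(mat[i - 1][j - 1] + (a[i - 1] != b[j - 1]),
                            mat[i - 1][j] + 1, mat[i][j - 1] + 1)
    row_a, row_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and mat[i][j] == mat[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and mat[i][j] == mat[i - 1][j] + 1:
            row_a.append(a[i - 1]); row_b.append(gap); i -= 1
        else:
            row_a.append(gap); row_b.append(b[j - 1]); j -= 1
    return row_a[::-1], row_b[::-1]

def realign(aligned_classes, tokens, gap="-"):
    """Put the original tokens back into the slots of the aligned class row."""
    remaining = iter(tokens)
    return [gap if cls == gap else next(remaining) for cls in aligned_classes]

words = ["ʧɪlɐvʲɛk", "ʧovɛk"]
tokens = [tokenize(word) for word in words]
classes = [to_classes(t) for t in tokens]
row_a, row_b = align(classes[0], classes[1])
out_a, out_b = realign(row_a, tokens[0]), realign(row_b, tokens[1])
print(" ".join(out_a))   # ʧ ɪ l ɐ vʲ ɛ k
print(" ".join(out_b))   # ʧ - - o v ɛ k
```

Because the alignment is computed on the class strings, the two different vowels and the pair vʲ/v are compared as classes rather than as raw IPA symbols.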

  72. Scoring
    Directionality of Sound Changes
    One crucial characteristic of certain well-known sound changes
    is their directionality: if certain sounds change, the change
    goes in a certain direction, and the reverse change is
    rarely attested.
    Directionality and Sound Correspondences
    While certain sound changes may be directional,
    sound correspondences do not directly reflect this directionality,
    and neither do scoring functions for sequence alignment: these
    are by definition not directional, since the distance
    or similarity between two segments is always the same,
    regardless of which segment we start the comparison from.

  73. Scoring
    Reflecting Directionality in Undirected Networks
    In this approach, the directionality of certain sound changes is
    accounted for by creating a non-metric scoring function.
    In a metric scoring function, the distance between two
    segments A and B depends on their distances to a third
    segment C: according to the triangle inequality, the distance
    from A to B cannot exceed the sum of the distances from A to C
    and from B to C. This does not hold for the probability of
    those sound correspondences which arise as a product of
    directional sound change.
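
Whether a given symmetric scoring table is metric in this sense can be checked mechanically. The sketch below uses invented segment labels and distances; it only illustrates the triangle inequality d(A, B) <= d(A, C) + d(C, B) and is not the scoring function of the talk.

```python
# Illustrative sketch: find violations of the triangle inequality in a
# symmetric distance table. Labels and distances are hypothetical.
from itertools import permutations

def distance(table, a, b):
    return 0 if a == b else table[frozenset((a, b))]

def triangle_violations(table, segments):
    """Return all (a, b, c) with d(a, b) > d(a, c) + d(c, b)."""
    return [
        (a, b, c)
        for a, b, c in permutations(segments, 3)
        if distance(table, a, b) > distance(table, a, c) + distance(table, c, b)
    ]

# A and B are far apart, yet both are close to C: a non-metric table.
table = {
    frozenset(("A", "B")): 10,
    frozenset(("A", "C")): 2,
    frozenset(("B", "C")): 2,
}
print(triangle_violations(table, ["A", "B", "C"]))
# [('A', 'B', 'C'), ('B', 'A', 'C')]
```

An empty result would mean that the table satisfies the triangle inequality for every triple of segments.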

  74. Scoring
    [Figure: network of sound classes (dentals, affricates, fricatives, velars) with scores on the edges: 8, 6, 8, 6, 0, 10, 10]

  81. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    Performance of the Method
    v o l - d e m o r t
    v - l a d i m i r -
    v a l - d e m a r -
    27 / 32

  92. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    Usage Example
    .
    .
    >>> from lingpy.compare.seqcom import Multiple
    >>> mult = Multiple(['ʧwɔvʲɛk', 'ʧovɛk',
    ...                  'ʧlovʲɛk', 'ʧɪlɐvʲɛk'])
    >>> print(', '.join(mult.ipt_seqs))
    ʧwɔvʲɛk, ʧovɛk, ʧlovʲɛk, ʧɪlɐvʲɛk
    >>> mult.prog_align(method='sca', mode='profile')
    ʧ - w ɔ vʲ ɛ k
    ʧ - - o v ɛ k
    ʧ - l o vʲ ɛ k
    ʧ ɪ l ɐ vʲ ɛ k
    >>> mult.show_guide_tree()
                        /-0:ʧwɔvʲɛk
              /--------|
             |          \-1:ʧovɛk
    ---------|
             |          /-3:ʧlovʲɛk
              \--------|
                        \-2:ʧɪlɐvʲɛk
    >>> print(', '.join([seq.cls_str for seq in
    ...                  mult.lingpy_seqs]))
    CWOWEK, COWEK, CLOWEK, CILAWEK
    28 / 32
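
The last step of the session converts each IPA sequence into a string of sound-class symbols. A minimal, self-contained sketch of that conversion follows. The mapping table `SOUND_CLASSES` and the function `to_sound_classes` are hypothetical: the table is read off the output shown on the slide and covers only these four words, so it should not be taken as LingPy's actual sound-class model.

```python
# Hypothetical mini-mapping from IPA segments to sound-class symbols,
# derived from the slide's output; not LingPy's real table.
SOUND_CLASSES = {
    "ʧ": "C", "w": "W", "v": "W", "vʲ": "W", "l": "L", "k": "K",
    "ɔ": "O", "o": "O", "ɛ": "E", "ɪ": "I", "ɐ": "A",
}


def to_sound_classes(segments):
    """Map a list of IPA segments to a sound-class string."""
    return "".join(SOUND_CLASSES[seg] for seg in segments)


# The four words from the slide, already split into segments.
words = [
    ["ʧ", "w", "ɔ", "vʲ", "ɛ", "k"],
    ["ʧ", "o", "v", "ɛ", "k"],
    ["ʧ", "l", "o", "vʲ", "ɛ", "k"],
    ["ʧ", "ɪ", "l", "ɐ", "vʲ", "ɛ", "k"],
]
class_strings = [to_sound_classes(word) for word in words]
print(", ".join(class_strings))
```

Segments that differ phonetically but belong to the same class (here `v`, `vʲ`, and `w`) collapse to one symbol, which is what lets the alignment treat them as close matches.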

  104. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    Usage Example
    .
    .
    >>> mult.flat_cluster(0.3,method='sca')
    [1, 1, 1, 1]
    >>> mult.prog_align(method='sca', mode='profile')  # profile-based alignment
    ʧ - w ɔ vʲ ɛ k
    ʧ - - o v ɛ k
    ʧ - l o vʲ ɛ k
    ʧ ɪ l ɐ vʲ ɛ k
    >>> mult.sum_of_pairs()
    39.666666666666664
    >>> mult.prog_align(method='sca', mode='fd')  # simple guide-tree alignment
    ʧ w - ɔ vʲ ɛ k
    ʧ - - o v ɛ k
    ʧ - l o vʲ ɛ k
    ʧ ɪ l ɐ vʲ ɛ k
    >>> mult.iterate()
    Old SoP score: 37.8333333333
    New SoP score: 39.6666666667
    ʧ - w ɔ vʲ ɛ k
    ʧ - - o v ɛ k
    ʧ - l o vʲ ɛ k
    ʧ ɪ l ɐ vʲ ɛ k
    29 / 32
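
The session compares two alignments of the same words by their sum-of-pairs (SoP) score. The sketch below illustrates the general idea under a toy scoring function. `pair_score` is an assumption (match 1, mismatch 0, segment against gap -1) and is not the sound-class-based scorer used by LingPy, so the numbers do not reproduce the values on the slide; only the ranking of the two alignments carries over.

```python
from itertools import combinations

GAP = "-"


def pair_score(a, b):
    """Toy scorer (assumption): match 1, mismatch 0, gap vs. segment -1."""
    if a == GAP and b == GAP:
        return 0
    if a == GAP or b == GAP:
        return -1
    return 1 if a == b else 0


def sum_of_pairs(alignment):
    """Sum pair_score over all pairs of rows in every column."""
    rows = [row.split() for row in alignment]
    return sum(
        pair_score(a, b)
        for column in zip(*rows)
        for a, b in combinations(column, 2)
    )


# Profile-based alignment from the slide.
profile = ["ʧ - w ɔ vʲ ɛ k", "ʧ - - o v ɛ k",
           "ʧ - l o vʲ ɛ k", "ʧ ɪ l ɐ vʲ ɛ k"]
# Simple guide-tree alignment from the slide.
guide_tree = ["ʧ w - ɔ vʲ ɛ k", "ʧ - - o v ɛ k",
              "ʧ - l o vʲ ɛ k", "ʧ ɪ l ɐ vʲ ɛ k"]

sop_profile = sum_of_pairs(profile)
sop_tree = sum_of_pairs(guide_tree)
print(sop_profile, sop_tree)
```

Even with this crude scorer, the alignment that places `w` in the same column as `l` scores higher than the one that places it opposite `ɪ`, because it avoids two extra gap pairings.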

  105. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    .
    .
    IPA-Encoding of the TPPSR
    The Tableaux phonétiques des patois suisses romands (TPPSR,
    Gauchat et al. 1925) is a collection of phonetic dialect data,
    which was digitized in an earlier research project of the Institute
    for Romance Languages and Literature (Heinrich Heine
    University Düsseldorf). The original data was converted to IPA
    in order to make it suitable for alignment analyses using the
    LingPy library. The dataset consists of 480 charts (480 words
    and phrases) which contain phonetic information for 62 dialect
    points.
    Analysis within LingPy
    The analysis within LingPy is done via a simple terminal-based
    interface which takes text files as input and writes the results
    of the alignment analyses to text files.
    30 / 32

  106. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    tppsr
    69,sont tout près,pressu
    2 sɔ̃.tɔ.priː
    3 i.sɔ̃.tɔ.ʤeː
    5 ei.səɔ̃.tɔ.prei
    8 sɔ̃.pre
    11 sɔ̃.tɔ.pruːʦɔ
    18 sɔ̃.pre
    19 sɔ̃.tɔ.pre
    30 ʃʊn.pre
    31 ʃɔ̃n.tɔ.prei
    34 i.sɔ̃.tɔ.pre
    54 ɛ.sɔ̃.tɔ.prɛ
    55 prɛj
    56 a.sãõ.tɔ.d.koːt
    57 sɔ̃.tɔ.preː
    58 a.sɔ̃.tɔ.preŋ
    31 / 32
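
The input shown above has a simple line-based layout: an item line with comma-separated fields, followed by one line per dialect point. A small sketch of how such a block could be read is given below. The function `parse_tppsr_block` and the field names `gloss` and `etymon` are hypothetical, and treating the dot as a word separator is an assumption based on the transcriptions on the slide; the actual LingPy reader may differ.

```python
def parse_tppsr_block(text):
    """Parse one item: 'id,gloss,etymon' followed by 'taxon transcription' lines.

    Field names are assumptions made for this sketch.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    item_id, gloss, etymon = lines[0].split(",")
    entries = {}
    for line in lines[1:]:
        taxon, transcription = line.split(maxsplit=1)
        # Assumption: dots separate the words of the phrase.
        entries[int(taxon)] = transcription.split(".")
    return {"id": int(item_id), "gloss": gloss,
            "etymon": etymon, "entries": entries}


# Three of the rows from the slide.
sample = """
69,sont tout près,pressu
2 sɔ̃.tɔ.priː
8 sɔ̃.pre
55 prɛj
"""
item = parse_tppsr_block(sample)
print(item["id"], item["gloss"], sorted(item["entries"]))
```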

  110. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    tppsr
    69,sont tout près,pressu
    2 1 - s ɔ̃ - t ɔ p r iː - -
    3 2 i.sɔ̃.tɔ.ʤeː
    5 1 ei s əɔ̃ - t ɔ p r ei - -
    8 1 - s ɔ̃ - - - p r e - -
    11 1 - s ɔ̃ - t ɔ p r uː ʦ ɔ
    18 1 - s ɔ̃ - - - p r e - -
    19 1 - s ɔ̃ - t ɔ p r e - -
    30 1 - ʃ ʊ n - - p r e - -
    31 1 - ʃ ɔ̃ n t ɔ p r ei - -
    34 1 i s ɔ̃ - t ɔ p r e - -
    54 1 ɛ s ɔ̃ - t ɔ p r ɛ - -
    55 1 - - - - - - p r ɛ j -
    56 3 a.sãõ.tɔ.d.koːt
    57 1 - s ɔ̃ - t ɔ p r eː - -
    58 1 a s ɔ̃ - t ɔ p r e ŋ -
    Columns: Taxon-ID, Cluster-ID, aligned segments. Taxa 3 and 56 (clusters 2 and 3) are singletons and remain unaligned.
    31 / 32
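
In the output, each row starts with a taxon ID and a cluster ID; clusters with a single member are left unaligned. The following sketch shows one way to separate singletons from aligned rows in output of this shape. The helper `parse_output_line` is hypothetical and only assumes the whitespace-separated layout visible on the slide.

```python
from collections import Counter


def parse_output_line(line):
    """Split 'taxon cluster tokens...' into (taxon_id, cluster_id, tokens)."""
    taxon, cluster, *tokens = line.split()
    return int(taxon), int(cluster), tokens


# Four of the rows from the slide.
output = """
2 1 - s ɔ̃ - t ɔ p r iː - -
3 2 i.sɔ̃.tɔ.ʤeː
8 1 - s ɔ̃ - - - p r e - -
56 3 a.sãõ.tɔ.d.koːt
"""
rows = [parse_output_line(line) for line in output.strip().splitlines()]

# A cluster with exactly one member is a singleton.
cluster_sizes = Counter(cluster for _, cluster, _ in rows)
singletons = [taxon for taxon, cluster, _ in rows
              if cluster_sizes[cluster] == 1]
aligned = [(taxon, tokens) for taxon, cluster, tokens in rows
           if cluster_sizes[cluster] > 1]
print(singletons)
```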

  111. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    tppsr
    69,sont tout près,pressu
    2 1 - s ɔ̃ - t ɔ p r iː - -
    5 1 ei s əɔ̃ - t ɔ p r ei - -
    8 1 - s ɔ̃ - - - p r e - -
    11 1 - s ɔ̃ - t ɔ p r uː ʦ ɔ
    18 1 - s ɔ̃ - - - p r e - -
    19 1 - s ɔ̃ - t ɔ p r e - -
    30 1 - ʃ ʊ n - - p r e - -
    31 1 - ʃ ɔ̃ n t ɔ p r ei - -
    34 1 i s ɔ̃ - t ɔ p r e - -
    54 1 ɛ s ɔ̃ - t ɔ p r ɛ - -
    55 1 - - - - - - p r ɛ j -
    57 1 - s ɔ̃ - t ɔ p r eː - -
    58 1 a s ɔ̃ - t ɔ p r e ŋ -
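    Columns: Taxon-ID, Cluster-ID, aligned segments (singletons removed).
    Slide callouts mark individual alignment columns as "Boring Site!" or "Interesting Site!".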
    31 / 32
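
Alignment columns (sites) differ in how much they vary across dialect points. A minimal sketch for inspecting this is given below: it lists the distinct segments per column and picks out the columns in which every dialect point has the same segment. Reading "boring" as invariant and "interesting" as variable is an assumption made for this example, and `distinct_segments` is a hypothetical helper.

```python
# Four aligned rows from the slide (taxa 2, 8, 30, 58).
rows = [
    "- s ɔ̃ - t ɔ p r iː - -",
    "- s ɔ̃ - - - p r e - -",
    "- ʃ ʊ n - - p r e - -",
    "a s ɔ̃ - t ɔ p r e ŋ -",
]
columns = list(zip(*(row.split() for row in rows)))


def distinct_segments(column):
    """Distinct non-gap segments in one alignment column."""
    return sorted(set(column) - {"-"})


variation = [distinct_segments(column) for column in columns]

# Columns without gaps in which all rows share one segment.
invariant = [index for index, segments in enumerate(variation)
             if len(segments) == 1 and "-" not in columns[index]]
print(invariant)
```

For this excerpt only the `p` and `r` columns are invariant; columns such as `s`/`ʃ` or `iː`/`e` show the dialectal variation that an alignment makes visible.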

  114. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    tppsr
    66,est étroite,stricta
    1 ɛ.etrɑːt
    2 ɛ.ɛtræːtɛ
    3 ɛ.ɛtreːta
    5 ɛ.ɛtraɛːta
    8 ɛ.ɛtrɑːɛt
    11 l.ɛ.ɛtræːtə
    19 l.ɛ.etrɑːtə
    30 l.ɛθ.ɛθreiti
    31 lʲ.ɛ.ɛhriːti
    34 ɛt.eːtraːto
    55 ɛ.ɛtræit
    56 ɛ.ɛtrɑːət
    57 ɛ.ɛtrɛt
    58 j.ɛ.ɛtreːt
    31 / 32

  115. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    tppsr
    66,est étroite,stricta
    1 1 - ɛ - e t r ɑː t -
    2 1 - ɛ - ɛ t r æː t ɛ
    3 1 - ɛ - ɛ t r eː t a
    5 1 - ɛ - ɛ t r aɛː t a
    8 1 - ɛ - ɛ t r ɑːɛ t -
    11 1 l ɛ - ɛ t r æː t ə
    19 1 l ɛ - e t r ɑː t ə
    30 1 l ɛ θ ɛ θ r ei t i
    31 1 lʲ ɛ - ɛ h r iː t i
    34 1 - ɛ t eː t r aː t o
    55 1 - ɛ - ɛ t r æi t -
    56 1 - ɛ - ɛ t r ɑːə t -
    57 1 - ɛ - ɛ t r ɛ t -
    58 1 j ɛ - ɛ t r eː t -
    31 / 32

  116. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    tppsr
    195,une feuille,folia
    1 ɔ̃na.fɔlʲ
    2 na.folʲɛ
    3 na.fɔlʲ
    5 una.fɔlʲə
    8 ɔ̃na.fɔjə
    11 ɔ̃na.fɔlʲə
    19 na.føðə
    30 folʲe
    31 fɔłe
    34 na.fwolʲ
    55 ɔn.fɔdʲ
    56 ɛn.fuj
    57 ɛn.fuj
    58 ɛn.fœj
    31 / 32

  117. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    TPPSR
    tppsr
    195,une feuille,folia
    1 1 ɔ̃ n a f - ɔ lʲ -
    2 1 - n a f - o lʲ ɛ
    3 1 - n a f - ɔ lʲ -
    5 1 u n a f - ɔ lʲ ə
    8 1 ɔ̃ n a f - ɔ j ə
    11 1 ɔ̃ n a f - ɔ lʲ ə
    19 1 - n a f - ø ð ə
    30 1 - - - f - o lʲ e
    31 1 - - - f - ɔ ł e
    34 1 - n a f w o lʲ -
    55 1 ɔ n - f - ɔ dʲ -
    56 1 ɛ n - f - u j -
    57 1 ɛ n - f - u j -
    58 1 ɛ n - f - œ j -
    31 / 32

  118. Introduction
    Automatic Alignment Analyses
    Alignments in Historical Linguistics
    LingPy
    Performance of the Method
    Usage Example
    TPPSR
    Thank You
    for Listening!
    Special thanks to the German Federal Ministry of Education
    and Research (BMBF) for funding our research project on
    evolution and classification in biology, linguistics, and
    the history of science (EvoClass).
    32 / 32
