
Analyzing Continuous Semantic Shifts with Diachronic Word Similarity Matrices

Hajime Kiyama
January 19, 2025


Oral presentation at COLING 2025




Transcript

  1. Analyzing Continuous Semantic Shifts with Diachronic Word Similarity Matrices

    *Hajime Kiyama1, Taichi Aida1, Mamoru Komachi2, Toshinobu Ogiso3, Hiroya Takamura4, Daichi Mochihashi3,5
    1Tokyo Metropolitan University, 2Hitotsubashi University, 3National Institute for Japanese Language and Linguistics, 4National Institute of Advanced Industrial Science and Technology, 5The Institute of Statistical Mathematics
  2. Overview

    Research Objective
    ◼ Analyze how the meanings of words change over time.
    Our Contributions
    ◼ Analyzing semantic shifts using diachronic word similarity matrices
    ◼ Considering arbitrary time periods and using lightweight word embeddings
    ◼ Grouping words with similar shift patterns in an unsupervised setting
  3. Diachronic Semantic Shift

    ◼ The phenomenon in which the meanings of words change over time
    ◼ Studied as the task of analyzing changes in word embeddings
    ◼ Based on the distributional hypothesis: a word's meaning is determined by its surrounding words
    (Figure quoted from [Hamilton+, 2016])
  4. Semantic Shift Across Two Periods and Multiple Periods

    ◼ Semantic shift across two periods [Cassotti+, 2023][Periti and Tahmasebi, 2024a][Aida and Bollegala, 2024][Periti+, 2024]
    ◼ Which words change?
    ◼ Measures the degree of semantic shift between two time periods
    ◼ Datasets annotated with the degree of semantic change are available
    ◼ Semantic shift across multiple periods [Kulkarni+, 2015][Hu+, 2019][Giulianelli+, 2020]
    ◼ How does a word change?
    ◼ Change-point detection and measuring the proportion of word senses
    ◼ No annotated dataset
    ◼ Does not reveal specific semantic transitions
    ◼ Computationally expensive, limiting the number of target words
    (Figure quoted from [Periti and Tahmasebi, 2024b])
  5. Analyzing Diachronic Semantic Shifts Using Similarity Matrices

    Assumption for input: word embeddings are prepared in a format comparable across time periods. In this study, we use PPMI-SVD joint to obtain fast and lightweight word embeddings, which enables analysis with thousands of target words.
  6. Analyzing Diachronic Semantic Shifts Using Similarity Matrices

    Calculate the similarity between a word's embeddings for every pair of time periods. Semantic shift patterns become visible: changes across arbitrary time periods can be identified at a glance.
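The per-word similarity matrix described above can be sketched in a few lines; the helper function and the toy two-sense vectors below are illustrative, not the paper's actual pipeline:

```python
import numpy as np

def similarity_matrix(vecs):
    """Build a T x T cosine-similarity matrix for one word,
    given its embedding at each of T time periods."""
    V = np.asarray(vecs, dtype=float)               # shape (T, d)
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    V = V / np.clip(norms, 1e-12, None)             # unit-normalize each row
    return V @ V.T                                  # entry (i, j) = cos(v_i, v_j)

# Toy example: a word whose embedding switches between two "senses".
rng = np.random.default_rng(0)
sense_a, sense_b = rng.normal(size=5), rng.normal(size=5)
vecs = [sense_a] * 4 + [sense_b] * 4                # shift after period 3
M = similarity_matrix(vecs)
# M shows two high-similarity blocks on the diagonal, one per sense.
```

Plotting `M` as a heatmap makes the block structure, and hence the timing of the shift, visible at a glance.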
  7. Analyzing Diachronic Semantic Shifts Using Similarity Matrices

    Clustering the resulting similarity matrices allows grouping of words with similar patterns. If a word belongs to the same cluster as a word that underwent semantic shift, it is likely to have changed in meaning as well.
  8. Setup

    ◼ Dataset
    ◼ COHA (Corpus of Historical American English): 1830-2010, 19 periods, each spanning 10 years; 3,231 target words
    ◼ COCA (Corpus of Contemporary American English): 1991-2019, 30 periods, each spanning 1 year; 2,805 target words
    ◼ Word embedding: PPMI-SVD joint [Aida+, 2021]
    ◼ Similarity: cosine similarity
    Target-word criteria: comparison across different time slices; more than 100 target words per period (appearing words); a very large number of words.
  9. Setup (continued)

    Same dataset, embedding, and similarity settings as above. This method is CPU-based and computationally fast.
  10. PPMI-SVD joint [Aida+, 2021]

    ◼ PPMI-SVD [Levy and Goldberg, 2014]
    ◼ PPMI: positive pointwise mutual information
    ◼ SVD: singular value decomposition
    ◼ Context words are shared across periods and compressed simultaneously [Aida+, 2021]
    (Figure quoted from [Aida+, 2021]; M denotes the PPMI matrix)
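A minimal sketch of the joint PPMI-SVD idea, assuming one word-by-context co-occurrence count matrix per period with a shared vocabulary and shared context columns; the exact preprocessing and weighting in [Aida+, 2021] may differ:

```python
import numpy as np

def ppmi(counts):
    """Positive PMI from a word-by-context co-occurrence count matrix."""
    total = counts.sum()
    pw = counts.sum(axis=1, keepdims=True) / total   # P(w)
    pc = counts.sum(axis=0, keepdims=True) / total   # P(c)
    pwc = counts / total                             # P(w, c)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(pwc / (pw * pc))
    return np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

def ppmi_svd_joint(count_matrices, dim=2):
    """Stack per-period PPMI matrices over shared context columns and
    factorize them with a single SVD, so every period's embeddings
    live in one comparable space."""
    M = np.vstack([ppmi(c) for c in count_matrices])  # (T*|V|, |C|)
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    emb = U[:, :dim] * np.sqrt(S[:dim])               # symmetric SVD weighting
    T, V = len(count_matrices), count_matrices[0].shape[0]
    return emb.reshape(T, V, dim)                     # (period, word, dim)
```

Because the context columns are shared before the SVD, cosine similarities between periods are directly comparable, which is exactly what the diachronic similarity matrices require.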
  11. Visualization of Similarity Matrices - COHA

    ◼ Similarity matrix for ‘record’ in COHA
    ◼ The matrix is divided into two regions of high similarity
    ◼ Region 1 (periods 0-8): 1830-1920
    ◼ Region 2 (periods 9-18): 1930-2010
    ◼ ‘record’ is a semantically shifted word; the change occurred around 1930
    ◼ The matrix reveals when the change occurred
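A heatmap like the one on this slide can be reproduced with a synthetic block-structured matrix; the matrix below is made up to mimic the two regions described for ‘record’, not computed from COHA:

```python
import matplotlib
matplotlib.use("Agg")                 # render off-screen
import matplotlib.pyplot as plt
import numpy as np

# Synthetic 19 x 19 similarity matrix with two high-similarity regions,
# mimicking a word that shifts between period 8 and period 9.
M = np.full((19, 19), 0.3)
M[:9, :9] = 0.9                       # region 1: periods 0-8
M[9:, 9:] = 0.9                       # region 2: periods 9-18
np.fill_diagonal(M, 1.0)

fig, ax = plt.subplots()
im = ax.imshow(M, vmin=0.0, vmax=1.0)
ax.set_xlabel("time period")
ax.set_ylabel("time period")
ax.set_title("Similarity matrix (synthetic 'record'-like pattern)")
fig.colorbar(im, ax=ax)
fig.savefig("similarity_matrix.png")
```

The two bright diagonal blocks and the dark off-diagonal blocks are the visual signature of a single change point.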
  12. Visualization of Similarity Matrices - COCA

    ◼ Similarity matrix for ‘president’ in COCA
    ◼ Two regions of high similarity and two spikes
    ◼ Region 1 (periods 0-26): 1991-2016
    ◼ Region 2 (periods 27-29): 2017-2019
    ◼ Spikes (periods 8, 22): 1998, 2012
    ◼ Did ‘president’ change due to societal events? There are periods of extreme change
    ◼ Reveals whether the meaning changes or reverts
    ◼ Changes are identifiable across arbitrary time periods!
  13. Visualization of Similarity Matrices - COCA (continued)

    ◼ Did ‘president’ change due to societal events? Yes! See the detailed analysis in Sec 4.2.2
  14. Setup: Clustering

    ◼ Clustering setup
    ◼ Similarity: cosine
    ◼ Features: upper-triangular components of the similarity matrix
    ◼ Clustering: hierarchical clustering
    ◼ Normalization: standardizing the features
    ◼ How were these parameters chosen? The configuration that performed best in experiments on pseudo-data was adopted (see Sec 5: Experiment on Pseudo Data)
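The recipe above (upper-triangular features, standardization, hierarchical clustering with cosine similarity) can be sketched with SciPy; the toy similarity matrices are illustrative stand-ins for real words:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_words(sim_matrices, n_clusters=2):
    """Group words by their T x T similarity matrices: use the
    upper-triangular entries as features, standardize each feature,
    then apply average-linkage hierarchical clustering with cosine distance."""
    T = sim_matrices[0].shape[0]
    iu = np.triu_indices(T, k=1)                      # upper triangle, no diagonal
    X = np.stack([m[iu] for m in sim_matrices])       # (n_words, T*(T-1)/2)
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Z = linkage(X, method="average", metric="cosine")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# Two words with a block-shift pattern and two stable words.
shifted = np.full((4, 4), 0.2)
shifted[:2, :2] = 0.9
shifted[2:, 2:] = 0.9
stable = np.full((4, 4), 0.9)
labels = cluster_words([shifted, shifted + 0.01, stable, stable - 0.01])
# The two shifted words land in one cluster, the two stable words in another.
```

Standardizing each feature before clustering keeps small but consistent pattern differences from being drowned out by the overall similarity level.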
  15. Concrete Examples of Similarity Matrix Clustering

    ◼ Words with similar local patterns can be grouped
    ◼ Words in the same cluster as those with known semantic changes are likely to have undergone semantic shifts as well
    (Figures: COHA and COCA examples)
  16. Conclusion

    ◼ Analyzing semantic shifts using diachronic word similarity matrices
    ◼ Considering arbitrary time periods and using lightweight word embeddings
    ◼ Grouping words with similar patterns in an unsupervised setting
    Similarity matrices are good for analyzing arbitrary time periods!
  17. Challenge in Semantic Shift Across Multiple Periods 1

    ◼ Change-point detection in adjacent time periods [Kulkarni+, 2015]
    ◼ Measures the degree of semantic change between consecutive time periods
    ◼ A natural extension of semantic change analysis between two periods
    ◼ Does not reveal specific semantic transitions: when there are multiple changes, does the meaning revert to its original state, or does it fully shift to a different sense?
    Considering information across arbitrary time periods is needed. (Figure quoted from [Periti and Tahmasebi, 2024b])
  18. Challenge in Semantic Shift Across Multiple Periods 2

    ◼ Clustering word senses using BERT-based methods [Hu+, 2019][Giulianelli+, 2020]
    ◼ Clusters embeddings of target words to analyze the proportion of word senses
    ◼ Allows analysis of the temporal transitions of word senses
    ◼ Computationally expensive, limiting the number of target words: requires computation for each word multiplied by the number of examples
    A scalable method using lightweight word embeddings is needed. (Figure quoted from [Hu+, 2019])
  19. Analysis of Similarity Matrices

    ◼ Analysis using PPMI differences
    ◼ Changes in similarity are likely caused by changes in co-occurring words
    ◼ Analyze the context words that appear exclusively in a given period
    ◼ M_t: PPMI matrix at period t
    ◼ Top-k context words that appear only in period t1
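The PPMI-difference analysis above can be sketched as follows; the vocabulary and the two single-row PPMI matrices are hypothetical stand-ins for the ‘record’ rows in two periods:

```python
import numpy as np

def exclusive_context_words(ppmi_t1, ppmi_t2, word_idx, vocab, k=5):
    """Top-k context words with positive PPMI for the target word in
    period t1 but zero PPMI in period t2, i.e. contexts exclusive to t1."""
    row1, row2 = ppmi_t1[word_idx], ppmi_t2[word_idx]
    candidates = np.where((row1 > 0) & (row2 == 0))[0]
    top = candidates[np.argsort(row1[candidates])[::-1][:k]]
    return [vocab[i] for i in top]

# Hypothetical PPMI rows for 'record' against four context words.
vocab = ["sound", "play", "event", "preserve"]
p1840 = np.array([[0.0, 0.0, 1.2, 0.8]])   # earlier period: preserving sense
p1940 = np.array([[1.5, 0.9, 0.0, 0.0]])   # later period: playback sense
new_contexts = exclusive_context_words(p1940, p1840, 0, vocab, k=2)
# → ['sound', 'play']: contexts exclusive to the later period
```

Running the function in both directions surfaces the contexts each period gained or lost, which is how the sense transition is read off.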
  20. Analysis of Similarity Matrices - COHA

    ◼ PPMI differences for ‘record’ in COHA
    ◼ The semantic change is clearly observable!
    ◼ 1840: the sense of preserving events or objects
    ◼ 1940: the sense of a medium for playing back sound
  21. Analysis of Similarity Matrices - COCA

    ◼ PPMI differences for ‘president’ in COCA
    ◼ Changes driven by societal factors can be analyzed!
    ◼ Related to presidential elections and documentaries
    ◼ Changes during the Trump administration were also observed
  22. Pseudo Data: Shift Schemas

    ◼ Classification of seven shift schemas [Shoemark+, 2017]
    ◼ C1: acquisition of a new sense
    ◼ C2: sense translation
    ◼ C3: acquisition of a noisy sense
    ◼ D1: increase of a sense
    ◼ D2: sensitive to a specific period
    ◼ D3: periodically sensitive shifts
    ◼ D4: pure noise
  23. Pseudo Data: Setup

    ◼ Dataset: Mainichi Shimbun, year 2010
    ◼ Create pseudo shifts across 20 time periods
    ◼ Generate 20 pseudo-words for each shift schema
    ◼ Embedding method: PPMI-SVD joint
    A total of 140 pseudo-words is generated; a classification task categorizes these 140 words into the seven shift schemas.
  24. Pseudo Data: Classification by Clustering

    ◼ Best classification setting
    ◼ Cosine similarity
    ◼ Upper-triangular features
    ◼ Hierarchical clustering
    ◼ Standardization of features
    (Compared feature settings: Period 0, Adjacent, Upper Tri)
  25. Limitations

    ◼ Dataset: finding the optimal time slice is needed
    ◼ Embedding: are dynamic embeddings better for this analysis?
    ◼ Pseudo schemas: these seven schemas do not necessarily cover all types of semantic shifts
    ◼ Application: how should target words for analysis be selected? Changes in similarity do not always correspond to semantic shifts
  26. References 1

    [Hamilton+, 2016] Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change
    [Cassotti+, 2023] XL-LEXEME: WiC Pretrained Model for Cross-Lingual LEXical sEMantic changE
    [Periti and Tahmasebi, 2024a] A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic Change
    [Periti and Tahmasebi, 2024b] Towards a Complete Solution to Lexical Semantic Change: an Extension to Multiple Time Periods and Diachronic Word Sense Induction
    [Aida and Bollegala, 2024] A Semantic Distance Metric Learning approach for Lexical Semantic Change Detection
  27. References 2

    [Periti+, 2024] Analyzing Semantic Change through Lexical Replacements
    [Kulkarni+, 2015] Statistically Significant Detection of Linguistic Change
    [Hu+, 2019] Diachronic Sense Modeling with Deep Contextualized Word Embeddings: An Ecological View
    [Giulianelli+, 2020] Analysing Lexical Semantic Change with Contextualised Word Representations
  28. References 3

    [Aida+, 2021] A Comprehensive Analysis of PMI-based Models for Measuring Semantic Differences
    [Levy and Goldberg, 2014] Neural Word Embedding as Implicit Matrix Factorization
    [Shoemark+, 2017] Room to Glo: A Systematic Comparison of Semantic Change Detection Approaches with Word Embeddings
    [Nylund+, 2024] Time is Encoded in the Weights of Finetuned Language Models