Lexuss-D
February 05, 2025

Computational Approaches for Diachronic Semantic Change Detection_2024_8

Transcript

  1. Self Introduction
    凌 志栋: Zhidong Ling (Zh), Ryo Shito (Jp)
    • 2nd-year Master's student, Tokyo Metropolitan University, Natural Language Processing Group (will disappear in 2025)
    • Plan to start a PhD at Hitotsubashi University in 2025, with the same supervisor, Prof. Mamoru Komachi
    • My cat is named Dog
  2. Table of Contents
    1. Diachronic Semantic Change Detection (DSCD)
    2. Basic Computational Methods for DSCD
      a. for Detection
      b. for Analysis
    3. Evaluation of DSCD (what I have worked on)
    4. Current Topics of Semantic Change
  3. Diachronic Semantic Change Detection (DSCD)
    • Detect words whose meaning has changed, using diachronic texts
    • Objective: Manual Check → Automatic Detection / Analysis
    [Figure from "Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change" [Hamilton+16]]
  5. Basic Methods of DSCD
    Methods for Detection via word embedding
    • Static + Alignment + Distance
    • Contextualized + Clustering + Distance/Distribution
    • Methods out of the paradigm
    Methods for Analysis
    • Topic Model with great explainability
  6. Word Embeddings Reflect Semantic Change
    Distributional Hypothesis: words with similar distributions have similar meanings
    ◦ Word meaning is determined by context words
    ◦ In embedding space, words with similar/related meanings get closer to each other
    [Figure: embedding space showing Car, Auto, Wagon, Dog, Hound]
    ➔ Police wagons and fire wagons were not moving.
    ➔ We have a family in our church whose teenage son was in an auto accident.
    ➔ They arrived by car.
    ➔ A wily fox will outrun a pack of hounds, but never a bullet.
    ➔ The British are renowned as a nation of dog lovers.
  7. How Do Word Embeddings Reflect Semantic Change
    • For a target word, learn separate embeddings from corpora of different periods
    • Different meanings = longer distance in embedding space
    • The distance (kind of) reflects the semantic change
    [Figure: embedding space showing Horse, Coach_1850s, Drive, Basketball, Football, Coach_2010s]
    ➔ [1851] Louis and his brother generally patronized the top of the coach, but as they drew near Bristol
    ➔ [1851] The coachman said, "Yes, yes;" and Rollo got into the coach.
    ➔ [2010] I am here with legendary icon and basketball coach
  8. Static+Alignment+Distance
    Static Embedding, e.g. Word2Vec (one word, one embedding)
    [Figure: usages of "Coach" @1850s and @2010s are passed through SGNS, CBOW, … to train one vector per word for each corpus]
    Corpus @1850s:
      V_Coach_1850s = [0.1, 0.3, 0.6…]
      V_Drive_1850s = [0.8, 0.1, -0.1…]
      V_Basketball_1850s = [-2.3, 0.5, -0.4…]
    Corpus @2010s:
      V_Coach_2010s = [0.9, 0.4, -0.7…]
      V_Drive_2010s = [0.2, 0.3, -0.1…]
      V_Basketball_2010s = [-1.3, 0.4, -0.4…]
  9. Static+Alignment+Distance
    [Figure: Embedding Space @1850s (Horse_1850s, Coach_1850s, Drive_1850s, Basketball_1850s, Football_1850s) next to Embedding Space @2010s (Horse_2010s, Coach_2010s, Drive_2010s, Basketball_2010s, Football_2010s)]
    The vectors cannot be compared because they are in different embedding spaces
  10. Static+Alignment+Distance
    Alignment, e.g. Orthogonal Procrustes [Hamilton+16]
    [Figure: Embedding Space @1850s is rotated with R(θ) into the same space as Embedding Space @2010s, so that the _1850s and _2010s vectors of Horse, Coach, Drive, Basketball and Football share one space]
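The rotation step on this slide can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the implementation from [Hamilton+16]: the function name `procrustes_rotation` and the toy matrices are hypothetical, and the "2010s" space is simulated as an exactly rotated copy of the "1850s" space so the recovered rotation can be checked.

```python
import numpy as np


def procrustes_rotation(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Return the orthogonal matrix R that minimises ||source @ R - target||_F."""
    # Closed-form solution: take the SVD of the cross-covariance matrix
    # and multiply the two orthogonal factors (singular values are dropped).
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


# Toy data (hypothetical): 5 shared-vocabulary words in a 3-dimensional space.
rng = np.random.default_rng(0)
emb_1850s = rng.normal(size=(5, 3))

# Simulate the "2010s" space as a rotated copy of the "1850s" space.
theta = np.pi / 4
true_rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
emb_2010s = emb_1850s @ true_rotation

rotation = procrustes_rotation(emb_1850s, emb_2010s)
aligned_1850s = emb_1850s @ rotation  # now lives in the 2010s space
```

With real corpora the two spaces are not exact rotations of each other, so the aligned vectors only approximately match, and the residual for a target word is what the distance step then measures.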
  11. Static+Alignment+Distance
    Distance, e.g. Cosine Similarity / Cosine Distance (= 1 - CosSim)
    [Figure: Embedding Space @1850s rotated with R(θ) into the same space as Embedding Space @2010s; the _1850s and _2010s vectors of Horse, Coach, Drive, Basketball and Football are shown together]
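The distance step can be sketched as follows. The vectors are made-up, already-aligned toy values (an assumption for illustration, not numbers from the slides), and `cosine_distance` is a hypothetical helper name.

```python
import numpy as np


def cosine_distance(u, v) -> float:
    """Cosine distance = 1 - cosine similarity."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical aligned vectors: (vector @1850s, vector @2010s) per word.
aligned = {
    "coach": ([1.0, 0.0], [0.0, 1.0]),  # points in a different direction
    "horse": ([1.0, 1.0], [2.0, 2.0]),  # same direction, different length
}

change_scores = {
    word: cosine_distance(old, new) for word, (old, new) in aligned.items()
}
# Rank words from most to least changed.
ranking = sorted(change_scores, key=change_scores.get, reverse=True)
```

Cosine distance ignores vector length, which is why the toy "horse" pair scores as unchanged even though its norm doubled.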
  12. Contextualized+Clustering+Distance/Distribution
    Contextualized embedding, e.g. BERT (one token, one embedding)
    [Figure: usages of "Coach" @1850s and @2010s are encoded with BERT (with or without fine-tuning); clustering (almost) groups the token embeddings by sense, e.g. ~Coach Drive~ and ~Coach Basketball~]
  13. Contextualized+Clustering+Distance/Distribution
    Distribution, e.g. the semantic change (SC) degree = JSD(D_1850s, D_2010s)
    [Figure: Embedding Distribution @1850s and Embedding Distribution @2010s over the sense clusters ~Coach Drive~ and ~Coach Basketball~]
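A minimal sketch of the distribution step, assuming the clustering has already assigned a sense label to every usage. The labels below are invented for illustration, and the helper names `sense_distribution` and `jensen_shannon_divergence` are hypothetical. The divergence uses log base 2, so it lies between 0 (identical distributions) and 1 (disjoint distributions).

```python
import math
from collections import Counter


def sense_distribution(labels, senses):
    """Relative frequency of each sense cluster among the usages of one period."""
    counts = Counter(labels)
    return [counts[sense] / len(labels) for sense in senses]


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# Hypothetical cluster labels for the usages of "coach" in each period.
senses = ["drive", "basketball"]
labels_1850s = ["drive", "drive", "drive", "drive"]
labels_2010s = ["drive", "basketball", "basketball", "basketball"]

d_1850s = sense_distribution(labels_1850s, senses)
d_2010s = sense_distribution(labels_2010s, senses)
sc_degree = jensen_shannon_divergence(d_1850s, d_2010s)
```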
  14. Paradigms for Detection
    • Static emb. - Alignment - Distance: SGNS, CBOW, …
    • Contextualized emb. - Clustering - Distance/Distribution: BERT, XLM-R, …
    [Figure: sense1, sense2, … @ time1 and sense1, sense2, … @ time2]
  15. Methods out of the paradigm
    • Swap and Predict [Aida+2023]
    • Detecting Changes by norm and mean of vectors [Nagata+2023]
  16. Method for Analysis
    Infinite-SCAN [Inoue+2022]
    • A Bayesian model
    • Jointly estimates the number of senses of words and the trend of their changes
    • Outputs the distribution of senses annotated with snippets (words in the context) → Explainable Results for Analysis
    [Figures: Sense distribution of Coach; Sense distribution of Record]
  17. Evaluation of DSCD
    Early stage: a pre-selected list of words that are known to have changed
    • Metric: how many target words appear in the top K of the ranking
    • Cons:
      ◦ Lack of semantically stable words
      ◦ Lack of materials for analysis (no proper usages of the target word to support the judgment)
    Now: a word list manually annotated with degrees of semantic change
    • Metric: Spearman's correlation between predictions and human judgments
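Both evaluation styles can be sketched in plain Python. The words and scores below are invented for illustration, and the helper names (`average_ranks`, `spearman`, `hits_at_k`) are hypothetical. Spearman's correlation is computed as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho = Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def hits_at_k(ranked_words, changed_words, k):
    """Early-stage metric: known-changed words found in the top K."""
    return sum(1 for word in ranked_words[:k] if word in changed_words)


# Hypothetical gold degrees of change and model predictions.
words = ["coach", "snow", "record", "photo"]
gold = [0.9, 0.1, 0.5, 0.3]
predicted = [0.8, 0.2, 0.3, 0.4]

rho = spearman(gold, predicted)
ranked = [w for _, w in sorted(zip(predicted, words), reverse=True)]
top2_hits = hits_at_k(ranked, {"coach", "record"}, k=2)
```

The sketch does not guard against constant inputs, for which the correlation is undefined.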
  18. How to create the degree of Semantic Change
    Diachronic Usage Relatedness (DURel) [Schlechtweg+2018]
    • A Framework for the Annotation of Lexical Semantic Change
    • Annotators manually rate the semantic relatedness of the target word across 2 usages (a usage pair) on a 4-point scale of relatedness; the average of all scores gives the degree of change
    An example of annotation for a usage pair:
    • [Corpus 1] Louis and his brother generally patronized the top of the Coach, but as they drew near Bristol
    • [Corpus 2] I am here with legendary icon and basketball Coach
    → Score: 1
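The averaging step can be sketched as below. The scores are invented for illustration and `mean_relatedness` is a hypothetical helper name. The sketch assumes, in line with the slide's example where an unrelated pair gets a score of 1, that low scores mean low relatedness, so a lower mean indicates more change.

```python
from statistics import mean


def mean_relatedness(scores):
    """Average relatedness over all annotated usage pairs of one target word."""
    if not all(1 <= score <= 4 for score in scores):
        raise ValueError("scores must lie on the 4-point scale (1 to 4)")
    return mean(scores)


# Hypothetical annotations for cross-corpus usage pairs.
coach_scores = [1, 1, 2, 1]  # mostly unrelated usages
snow_scores = [4, 4, 3, 4]   # mostly identical usages

coach_degree = mean_relatedness(coach_scores)
snow_degree = mean_relatedness(snow_scores)
```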
  19. How to create the degree of Semantic Change
    Diachronic Usage Relatedness (DURel) [Schlechtweg+2018]
    Chinese dataset based on DURel [Chen+2022]
    • C1: 1953~1978, C2: 1979~2003; Reform and Opening-up (改革开放)
    [Figure: words on a scale from stable to more changed: 机制 (machine-made → mechanism), 软 (soft sofa? → soft landing), 照片 (photos), 雪 (snow)]
  20. How to create the degree of Semantic Change
    Diachronic Word Usage Graphs (DWUG) [Schlechtweg+2021]
    • An extended version of DURel with a multi-round incremental annotation process
    • Each word has a graph: node = usage, edge = relatedness
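A usage graph of this shape can be sketched with plain dictionaries. Everything here is an illustrative assumption: the usage IDs, the edge scores, the threshold of 2.5 and the helper name `sense_clusters`. To group usages into senses, the sketch swaps in simple connected components over high-relatedness edges; it is not the clustering procedure used to build the published DWUG resource.

```python
from collections import defaultdict


def sense_clusters(usages, edges, threshold=2.5):
    """Group usages linked by edges whose relatedness is at least `threshold`."""
    adjacency = defaultdict(set)
    for (a, b), score in edges.items():
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    clusters, seen = [], set()
    for start in usages:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical graph for "coach": node = usage, edge = annotated relatedness.
usages = ["u1_1850s", "u2_1850s", "u3_2010s", "u4_2010s"]
edges = {
    ("u1_1850s", "u2_1850s"): 4.0,
    ("u3_2010s", "u4_2010s"): 4.0,
    ("u1_1850s", "u3_2010s"): 1.0,
    ("u2_1850s", "u4_2010s"): 1.5,
}
clusters = sense_clusters(usages, edges)
```

In this toy graph the two clusters line up with the two periods, which is the pattern one would expect for a word that gained a new sense.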
  21. How to create the degree of Semantic Change
    Diachronic Word Usage Graphs (DWUG) [Schlechtweg+2021]
    • Datasets published for 4 languages (En, Ge, Sw, La) and for SemEval 2020 Task 1
    • Expensive and EXTREMELY time-consuming (according to the author of the ZH dataset)
    • Access (more languages available now)
  22. Current Topics of Semantic Change (detection & analysis)
    [~2023]
    • Fine Tuning for DSCD
      ◦ MLM with Time Label Masking [Rosin+2022a]
      ◦ Time Aware Self-attention Mechanism [Rosin+2022b]
      ◦ Prompt-based Time Adaptation [Tang+2023]
      ◦ Fine-tuned XLM-R on WiC (Word in Context) [Cassotti+2023] ← SOTA in 2023
    [2024: ACL, EACL]
    • Exploring the type/pattern of semantic change [Cassotti+2024]
    • Detecting semantic change by replacing words [Periti+2024]
    • Semantic distance metric learning approach [Aida+2024]
    • Definition Generation + DSCD [Fedorova+2024]
    • Annotation Tool [Schlechtweg+2024]
  23. Current Topics of Semantic Change (detection & analysis)
    [~2023]
    • Fine Tuning for DSCD
      ◦ MLM with Time Label Masking [Rosin+2022a]
      ◦ Time Aware Self-attention Mechanism [Rosin+2022b]
      ◦ Prompt-based Time Adaptation [Tang+2023]
      ◦ Fine-tuned XLM-R on WiC (Word in Context) [Cassotti+2023]
    [2024: ACL, EACL]
    • Exploring the type/pattern of semantic change [Cassotti+2024]
    • Detecting semantic change by replacing words [Periti+2024]
    • Semantic distance metric learning approach [Aida+2024]
    • Definition Generation + DSCD [Fedorova+2024]
    • Annotation Tool [Schlechtweg+2024]
    Slide annotations: SOTA race on SemEval datasets; explainability in evaluation; semantic change w/ LLM
  24. Round Table Topics in LChange (ACL2024)
    LChange: a workshop for language change
    • Difficult task rather than easy task
    • Focus on real world application
  25. The future (I think) of Semantic Change Detection
    • The issue of the methods: explainability
      ◦ They only output the degree of change, with no direct result of how word senses changed
    • The issues of the evaluation
      ◦ Maybe the ground truth we (I) want is not the degree but the word sense distribution?
      ◦ With the degree of SC we can already tell whether these methods can detect the change or not. The next task might be pattern prediction or word sense description (with generative models) ⇒ No (enough) data for these tasks yet
    • More cross-field topics for application
      ◦ Semantic Change + Social Science (digging up new concepts from (web) corpora)
      ◦ Semantic Change + Lexicography (adding new meanings to dictionaries)
      ◦ Semantic Change + Healthcare/Biomedical NLP maybe?
  26. References
    1. A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains
    2. Diachronic Usage Relatedness (DURel): A Framework for the Annotation of Lexical Semantic Change
    3. DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages
    4. SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection
    5. Analysing Lexical Semantic Change with Contextualised Word Representations
    6. Lexicon of Changes: Towards the Evaluation of Diachronic Semantic Shift in Chinese
    7. Swap and Predict: Predicting the Semantic Changes in Words across Corpora by Context Swapping
    8. Variance Matters: Detecting Semantic Differences without Corpus/Word Alignment
    9. Infinite SCAN: An Infinite Model of Diachronic Semantic Change
    10. Time Masking for Temporal Language Models
    11. Temporal Attention for Language Models
    12. Learning Dynamic Contextualised Word Embeddings via Template-based Temporal Adaptation
    13. XL-LEXEME: WiC Pretrained Model for Cross-Lingual LEXical sEMantic changE
    14. Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types
    15. Analyzing Semantic Change through Lexical Replacements
    16. A Semantic Distance Metric Learning approach for Lexical Semantic Change Detection
    17. Definition generation for lexical semantic change detection
    18. The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change