

WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings

A presentation explaining WhitenedCSE, a sentence embedding method that combines contrastive learning with whitening.

2023-09-28: ACL 2023 reading group @ Nagoya University
http://cr.fvcrc.i.nagoya-u.ac.jp/~sasano/acl2023nagoya/

Hayato Tsukagoshi

September 26, 2023


Transcript

  1. WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings
     Wenjie Zhuo, Yifan Sun, Xiaohan Wang, Linchao Zhu, Yi Yang (ACL 2023)
     https://aclanthology.org/2023.acl-long.677/
     Presenter: Hayato Tsukagoshi, D1, Graduate School of Informatics, Nagoya University, Japan
  2. Introduction: Sentence embeddings
     • Dense vector representations of natural-language sentences
     • The distance between vectors expresses how close the sentences are in meaning
     Example embedding space:
     "A child is heading home."             → [0.1, 0.2, ...]
     "A child is heading home from school." → [0.1, 0.3, ...]
     "A child is in the library."           → [0.9, 0.8, ...]
     "A child is walking in the afternoon." → [0.5, 0.7, ...]
  3. Introduction: Sentence embeddings (same example, annotated)
     • Sentences with similar meanings ("semantically similar") are placed close together in the embedding space
     • Distances between the vectors express the semantic relationships between sentences
     (A tiny numeric illustration follows below.)
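To make this concrete, here is a tiny Python sketch using the toy vectors from the slide (the pairing of sentences to vectors is my assumption from the figure layout; the values are illustrative only):

```python
import numpy as np

# Toy 2-D versions of the slide's example embeddings (illustrative values;
# sentence-to-vector pairing assumed from the figure).
going_home        = np.array([0.1, 0.2])  # "A child is heading home."
home_from_school  = np.array([0.1, 0.3])  # "A child is heading home from school."
in_the_library    = np.array([0.9, 0.8])  # "A child is in the library."

def cos_sim(a, b):
    """Cosine similarity: close to 1.0 for similar directions."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The two "heading home" sentences score higher than the unrelated pair.
print(cos_sim(going_home, home_from_school))  # ~0.99
print(cos_sim(going_home, in_the_library))    # lower
```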
  4. Introduction: Contrastive learning
     • A model produces feature representations for positive and negative examples
     • Training pushes the similarity of positive pairs up (a minimal sketch of such a loss follows below)
     • Hugely popular in computer vision, and now also trending in NLP
     SimCLR
     • Treats two differently augmented views of the same image as a positive pair
     • Strong performance on downstream tasks such as image classification
     • Effective as pre-training for representation learning in CV
     (Figure on the slide cited from blog [16])
     Oord+: Representation Learning with Contrastive Predictive Coding, arXiv '18
     Chen+: A Simple Framework for Contrastive Learning of Visual Representations, ICML '20
     Advancing Self-Supervised and Semi-Supervised Learning with SimCLR, '20
     Chen+: Big Self-Supervised Models are Strong Semi-Supervised Learners, NeurIPS '20
  5. Whitening
     • Transforms the data so that it spreads uniformly (over the hypersphere):
       mean 0, covariance matrix equal to the identity
     Methods used for whitening
     • Principal Component Analysis (PCA): eigendecomposition of the data (covariance) matrix
     • Zero-phase Component Analysis (ZCA): PCA with the rotation undone
     Anisotropy
     • The tendency of data to occupy only a low-dimensional subspace of a high-dimensional space
     • Whitening helps remove anisotropy (isotropically distributed embeddings often perform better)
     The whitening transform \(W\) is derived from \(H = WZ\) (with \(Z\) the centered data) under the constraint \(HH^\top = I\):
       \(WZ(WZ)^\top = W ZZ^\top W^\top = I, \quad ZZ^\top = U \Lambda U^\top\)
       \(\Rightarrow W_{\mathrm{PCA}} = \Lambda^{-1/2} U^\top, \qquad W_{\mathrm{ZCA}} = U \Lambda^{-1/2} U^\top\)
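Following that derivation, a minimal NumPy sketch of PCA/ZCA whitening (not the paper's implementation; rows of X are samples, and `eps` is added for numerical stability):

```python
import numpy as np

def whiten(X, mode="pca", eps=1e-8):
    """Whiten rows of X (n_samples, dim) so the result has identity covariance.

    Mirrors the slide's derivation: eigendecompose the covariance as
    U diag(lam) U^T, then W_PCA = diag(lam^-1/2) U^T and
    W_ZCA = U diag(lam^-1/2) U^T (the rotation undone).
    """
    Z = X - X.mean(axis=0)                     # mean 0
    lam, U = np.linalg.eigh(Z.T @ Z / len(Z))  # covariance eigendecomposition
    W = np.diag((lam + eps) ** -0.5) @ U.T     # W_PCA
    if mode == "zca":
        W = U @ W                              # W_ZCA: rotate back to input axes
    return Z @ W.T                             # whitened data, covariance ≈ I
```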
  6. •SimCSEͰ΋alignment / uniformity͸޲্͍ͯ͠Δ͕… • ରরֶशͩͱෛྫಉ࢜Λ཭ͤͳ͍ Shu ff l ed Group

    Whitening (SGW) •γϟοϑϧͯ͠άϧʔϓ͝ͱʹന৭Խ •ݩͷॱ൪ʹ໭ͯ͠ग़ྗຒΊࠐΈͱ͢Δ Multi-Positive Contrastive Loss •SGWΛෳ਺ճ܁Γฦͯ͠ਖ਼ྫΛਫ૿͠ •ଟ༷ͳਖ਼ྫͰֶशͰ͖ؤ݈ੑ޲্ WhitenedCSE 11
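A minimal PyTorch sketch of SGW as I read the slide (not the authors' code; `num_groups=384` follows the experimental setup mentioned later for BERT, and `num_groups` must divide the hidden size):

```python
import torch

def shuffled_group_whitening(x, num_groups=384, eps=1e-5):
    """Shuffled Group Whitening (SGW) as described on the slide (a sketch).

    Randomly permutes the feature dimensions, ZCA-whitens each group over the
    batch, then restores the original dimension order. Running it twice yields
    two stochastically different positives for the same sentence.
    """
    n, d = x.shape                             # (batch, hidden), e.g. d=768
    perm = torch.randperm(d)
    inv = torch.argsort(perm)                  # to undo the shuffle later
    g = x[:, perm].reshape(n, num_groups, -1).transpose(0, 1)  # (G, batch, d/G)

    g = g - g.mean(dim=1, keepdim=True)        # zero-mean per group
    cov = g.transpose(1, 2) @ g / n            # per-group covariance
    lam, U = torch.linalg.eigh(cov)            # batched eigendecomposition
    w = U @ torch.diag_embed((lam + eps).rsqrt()) @ U.transpose(1, 2)  # ZCA
    g = g @ w                                  # whiten each group

    return g.transpose(0, 1).reshape(n, d)[:, inv]  # back to original order
```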
  7. Experiments
     Tasks: standard benchmarks for sentence embeddings
     • Semantic Textual Similarity (STS)
     • SentEval
     Additional evaluation metrics
     • Uniformity, Alignment
     Training setup
     • 1M sentences randomly sampled from English Wikipedia (no labels)
     • Fine-tuning BERT and RoBERTa
     • Whitening is done per batch, over 384 groups (2 dimensions per group) for BERT
  8. Unsupervised STS tasks
     Evaluation procedure for unsupervised STS:
     (1) Prepare a sentence embedding model with frozen parameters
     (2) Convert each sentence of a pair into a sentence embedding
     (3) Compute the similarity of the sentence-vector pair
         • Cosine similarity is the usual choice
     (4) Compute the (rank) correlation with human judgments
     • A higher correlation coefficient means "better sentence embeddings"
     (A sketch of this pipeline follows below.)
     [Slide figure: sentence A / sentence B → embedding model → sentence similarity → evaluated by correlation with human ratings]
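A minimal sketch of that evaluation loop (`embed_fn`, `pairs`, and `gold_scores` are hypothetical names; SciPy's Spearman correlation stands in for the rank correlation):

```python
import numpy as np
from scipy.stats import spearmanr

def sts_eval(embed_fn, pairs, gold_scores):
    """Unsupervised STS evaluation following the slide's four steps.

    embed_fn: frozen model mapping a list of sentences to (n, dim) vectors.
    pairs: list of (sentence_a, sentence_b); gold_scores: human ratings.
    """
    a = embed_fn([s for s, _ in pairs])        # step 2: embed each sentence
    b = embed_fn([t for _, t in pairs])
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    sims = (a * b).sum(axis=1)                 # step 3: cosine similarity
    return spearmanr(sims, gold_scores).correlation  # step 4: rank correlation
```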
  9. SentEval
     • A toolkit of downstream tasks such as text classification
     • Train a classifier that takes the sentence embeddings as input; judge embedding quality by classification performance
     Conneau+: SentEval: An Evaluation Toolkit for Universal Sentence Representations, LREC '18
     Tasks (what is classified / number of classes / example):
     • MR: movie-review sentiment (pos/neg), 2 classes — "Too slow for a younger crowd, too shallow for an older one."
     • CR: product-review sentiment (pos/neg), 2 classes — "We tried it out christmas night and it worked great."
     • SUBJ: subjectivity of movie reviews/plot summaries, 2 classes — "A movie that doesn't aim too high, but doesn't need to."
     • MPQA: phrase polarity, 2 classes — "would like to tell"
     • SST-2: movie-review sentiment (pos/neg), 2 classes — "Audrey Tautou has a knack for picking roles that magnify her [..]"
     • TREC: question type, 6 classes — "What are the twin cities?"
     • MRPC: whether two sentences are paraphrases, 2 classes — "The procedure is generally performed in the second or third trimester." & "The technique is used during the second and, occasionally, third trimester of pregnancy."
  10. SentEval
      Evaluation procedure:
      (1) Prepare a sentence embedding model with frozen parameters
      (2) Convert each sentence into a sentence embedding
      (3) Train a classifier that takes the embeddings as input
      (4) Judge the quality of the embeddings by the classifier's performance
      • Higher classification performance means "better sentence embeddings"
      • The classifier is usually logistic regression
      (A sketch of this probing setup follows below.)
      [Slide figure: sentences → embedding model → classifier → quality judged by classification performance]
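A minimal sketch of this SentEval-style probe (`embed_fn` is a hypothetical frozen encoder; scikit-learn's logistic regression matches the classifier the slide says is commonly used):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def senteval_probe(embed_fn, sentences, labels):
    """SentEval-style probing: the embedding model stays frozen, only a
    logistic regression classifier on top is trained; its accuracy serves
    as a proxy for embedding quality."""
    X = embed_fn(sentences)                   # step 2: frozen embeddings
    clf = LogisticRegression(max_iter=1000)   # steps 3-4: train & score
    return cross_val_score(clf, X, np.asarray(labels), cv=5).mean()
```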
  11. Compared methods
      • BERT-flow: learns a mapping from BERT's anisotropic sentence embedding space to an isotropic latent space
      • BERT-whitening: linearly transforms the embeddings so the mean is 0 and the covariance matrix is the identity (plus dimensionality reduction)
      • IS-BERT: maximizes the mutual information between a sentence embedding and the embeddings of the n-grams it contains
      • BERT-CT: trains two separate copies of the same model so that their embeddings of the same sentence have a large dot product
      • SimCSE: contrastive learning whose positives are the same sentence under two different dropout masks, or entailment sentence pairs
      • MixCSE: unsupervised contrastive learning that mixes different sentences to build hard negatives
      • ArcCSE: combines margin-based angular minimization for entailment-pair embeddings with a triplet loss that uses augmented sentences as negatives
      • DCLR: Unsup-SimCSE with Gaussian-noise vectors added as negatives and per-instance weighting
      • MoCoSE: contrastive learning with a momentum encoder, analyzing the optimal number of negatives and adding FGSM-based data augmentation
      Li+: On the Sentence Embeddings from Pre-trained Language Models, EMNLP '20
      Su+: Whitening Sentence Representations for Better Semantics and Faster Retrieval, arXiv '21
      Zhang+: An Unsupervised Sentence Embedding Method by Mutual Information Maximization, EMNLP '20
      Carlsson+: Semantic Re-tuning with Contrastive Tension, ICLR '21
      Gao+: SimCSE: Simple Contrastive Learning of Sentence Embeddings, EMNLP '21
      Zhang+: Unsupervised Sentence Representation via Contrastive Learning with Mixing Negatives, AAAI '22
      Zhang+: A Contrastive Framework for Learning Sentence Representations from Pairwise and Triple-wise Perspective in Angular Space, ACL '22
      Zhou+: Debiased Contrastive Learning of Unsupervised Sentence Representations, ACL '22
      Cao+: Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding, ACL Findings '22
  12. Summary
      • Proposes a sentence embedding method that combines contrastive learning with whitening:
        • Split the embedding into groups and whiten each group (SGW)
        • Contrastive learning with multiple positives
      Impressions
      • Frankly, the evaluation looks slightly questionable
      • Performance is considerably lower when whitening is applied without grouping
        • Perhaps because the feature representation changes too drastically?
        • Since each group is small, SGW effectively performs only a mild form of whitening
      • If mild whitening suffices, maybe the explicit whitening operation itself is unnecessary?
        • What about a loss that pushes the covariance matrix toward the identity instead? (a sketch of one such loss follows below)
      How the whitening is done is the part I keep wondering about.
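As one reading of that closing question, a hypothetical regularizer (in the spirit of decorrelation losses such as Barlow Twins or VICReg, not something from the paper) could look like this:

```python
import torch

def covariance_identity_loss(z):
    """Penalize deviation of the batch covariance from the identity matrix.

    A hypothetical loss term illustrating the slide's suggestion: instead of
    explicitly whitening activations, push cov(z) toward I during training.
    """
    z = z - z.mean(dim=0, keepdim=True)        # zero-mean over the batch
    cov = z.T @ z / (z.size(0) - 1)            # (dim, dim) sample covariance
    eye = torch.eye(cov.size(0), device=z.device)
    return ((cov - eye) ** 2).mean()           # Frobenius-style penalty
```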