Paper introduction: MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Taichi Aida
October 14, 2019
Transcript
Paper introduction
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Wei Zhao†, Maxime Peyrard†, Fei Liu‡, Yang Gao†, Christian M. Meyer†, Steffen Eger†
EMNLP 2019
Presenter: Taichi Aida, Natural Language Processing Laboratory, Nagaoka University of Technology
Abstract
• Investigates robust evaluation metrics for text generation tasks
• The combination of contextualized word embeddings and Word Mover's Distance performed best
• The source code is public: https://github.com/AIPHES/emnlp19-moverscore
Related work
• Various evaluation methods (1)
  • Summarization: ROUGE (Lin 2004)
  • Machine translation: BLEU (Papineni 2002), RUSE (Shimanaka 2018)
  • Image captioning: BLEU, CIDEr (Vedantam 2015), SPICE (Anderson 2016)
Slide note: BLEU […]
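Related work
• Various evaluation methods (2)
  • Semantic similarity: BERTScore (Zhang 2019), which appears as a baseline in the experiments
  • Translation: supervised and unsupervised metrics on BERT embeddings (Mathur 2019)
  • Summarization and essay scoring: ELMo + Sentence Mover's Similarity (Clark 2019)
• Methods that use contextualized representations are becoming more common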
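Method
• Investigates a metric (MoverScore) that can evaluate a variety of generation tasks
• Measures the similarity (?) between the generated text and the reference text
  • Contextualized embeddings: BERT, ELMo
  • Semantic distance between the output and the reference: Word Mover's Distance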
Method
• MoverScore variations
  • Granularity: n-gram (n = 1, 2, size-of-sentence)
  • Embedding: word2vec, BERT, ELMo
  • Fine-tuning (BERT): MultiNLI and QANLI (NLI), QQP (paraphrasing)
  • Aggregation (ELMo, BERT): power means, routing mechanism
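Method
• Aggregation (how the layers are combined)
  • Contextualized embeddings: BERT, ELMo
  • For each word, every layer outputs a different vector
  • Power means: take the power mean across layers for p = 1, ±∞, then concatenate the results
  • Routing mechanism: see (Zhang 2018) for details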
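The power-means step above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `power_mean` and `aggregate_layers` and the toy layer values are hypothetical, and the sketch only covers p = 1 and p = ±∞ as listed on the slide (p = +∞ is the element-wise maximum, p = -∞ the element-wise minimum).

```python
import math

def power_mean(values, p):
    """Power mean of a sequence for exponent p (only p = 1, +inf, -inf are used here)."""
    if p == math.inf:
        return max(values)          # limit p -> +inf
    if p == -math.inf:
        return min(values)          # limit p -> -inf
    # General case; for non-integer p this assumes positive values.
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def aggregate_layers(layer_vectors, ps=(1, math.inf, -math.inf)):
    """Concatenate element-wise power means taken across layers for one word."""
    per_dimension = list(zip(*layer_vectors))  # values of each dimension across layers
    aggregated = []
    for p in ps:
        aggregated.extend(power_mean(dim, p) for dim in per_dimension)
    return aggregated

# Hypothetical toy input: 3 layers, 2 dimensions, for a single word.
layers = [[0.2, 0.9], [0.4, 0.1], [0.6, 0.5]]
word_vec = aggregate_layers(layers)  # length = 2 dimensions * 3 values of p = 6
```

The output vector is three times as long as a single layer vector, because one block is produced for each value of p.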
Method
• Semantic distance between the output and the reference
  • Word Mover's Distance (WMD)
  • Sentence Mover's Distance (SMD)
• Each of the combinations above is tested with both WMD and SMD
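To make the idea of WMD concrete, here is a toy sketch under strong simplifying assumptions that are not from the slides: both sentences have the same number of words, every word carries equal weight, and the cost is the Euclidean distance between word vectors. In that special case an optimal transport plan is a one-to-one matching, so a brute-force search over permutations gives the exact value. The general case needs a linear-program or optimal-transport solver. The function names and vectors are hypothetical.

```python
import itertools
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def wmd_uniform_equal_length(xs, ys):
    """Exact WMD when both sentences have n words with weight 1/n each.

    Brute force over all n! matchings, so only usable for very short inputs.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("this toy solver needs equal-length inputs")
    cost = [[euclidean(x, y) for y in ys] for x in xs]
    best = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in itertools.permutations(range(n))
    )
    return best / n  # each matched pair moves mass 1/n

# Hypothetical 2-dimensional word vectors.
hyp = [[0.0, 0.0], [1.0, 0.0]]
ref_same = [[1.0, 0.0], [0.0, 0.0]]   # same words in a different order
ref_other = [[0.0, 1.0], [1.0, 0.0]]  # one word differs

d_same = wmd_uniform_equal_length(hyp, ref_same)    # 0.0
d_other = wmd_uniform_equal_length(hyp, ref_other)  # 0.5
```

Word order does not change the distance, since mass is matched by meaning rather than by position.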
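Experiment
• Tasks
  • Machine translation
  • Summarization
  • Dialogue (task-oriented)
  • Image captioning
• Data: pairs of (reference text, output texts from multiple systems); the system outputs carry human ratings
• What is done with each evaluation metric, including MoverScore
  • Score the system outputs
  • Check the correlation with the human ratings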
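The evaluation protocol above comes down to correlating metric scores with human ratings. The slide does not say which correlation coefficient is used, so the sketch below assumes Pearson's r. The scores are made-up illustrative numbers, not results from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

# Hypothetical data: one metric score and one human rating per system output.
metric_scores = [0.10, 0.40, 0.35, 0.80]
human_ratings = [1.0, 3.0, 2.0, 5.0]

r = pearson(metric_scores, human_ratings)
```

A metric is judged better when this correlation is higher, which is how the Result slides compare MoverScore with the baselines.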
Experiment
• Machine translation
  • Dataset: WMT2017
  • Each output of the participating systems has at least 15 human assessments
  • Baselines: SentBLEU, METEOR++, RUSE, BERTScore (Zhang 2019)
Result
• WMD+BERT+MNLI+PMeans outperforms the baselines
Result
• Is information lost in the sentence representation?
Experiment
• Summarization
  • Dataset: TAC-2008, TAC-2009
  • Responsiveness: content plus grammatical quality
  • Pyramid: how much of the important content in the references is covered
  • Baselines: ROUGE-1, ROUGE-2, S3_best (Peyrard 2017; a supervised metric), BERTScore (Zhang 2019)
Result
• WMD+BERT+MNLI+PMeans outperforms the baselines
Experiment
• Dialogue (task-oriented)
  • Dataset: BAGEL, SFHOTEL
  • Informativeness (Inf): amount of information provided
  • Naturalness (Nat): closeness to a human response
  • Quality (Qual): fluency and grammar
  • Baselines: BLEU, METEOR, BERTScore (Zhang 2019)
Result
• Correlations are low overall, but the proposed method is among the higher ones
Experiment
• Image captioning
  • Dataset: MSCOCO
  • Human ratings M1 to M5 are available
  • This study uses M1 and M2, which concern overall quality
  • Baselines: CIDEr, SPICE, METEOR, LEIC (Cui 2018; a supervised metric), BERTScore (Zhang 2019)
Result
• Falls short of the LEIC baseline, but still shows a high correlation
• M: MultiNLI is used for BERT fine-tuning
• P: power means are used for the aggregation of ELMo / BERT layers
Discussion
• Comparison with BERTScore, which appeared as a baseline in the experiments
  • One-to-one: strong alignment
  • Many-to-one: weak alignment
  • WMD obtains an appropriate distance
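Discussion
• For machine translation, outputs are split into two groups, those with high human ratings (good) and those with low human ratings (bad), and the score distributions are examined
• Metrics compared
  • Baseline: SentBLEU
  • Proposal: MoverScore (WMD+BERT)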
Discussion
• SentBLEU: even outputs with good human ratings mostly fall in the middle of the score range
• MoverScore: cleanly separates the two extremes
Conclusion
• Proposes an unsupervised evaluation metric for generation tasks
• On four generation tasks, the results exceed or approach the baselines
• The source code is public: https://github.com/AIPHES/emnlp19-moverscore