Paper introduction: Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology
Taichi Aida
September 16, 2019
Transcript
Paper introduction: Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology
Yang Xu, Jiasheng Zhang, David Reitter. 1st International Workshop on Computational Approaches to Historical Language Change, ACL 2019
Presented by Taichi Aida, Natural Language Processing Laboratory, Nagaoka University of Technology
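Abstract
• Proposes word embeddings that take subwords into account in order to test a linguistic hypothesis
• In Indo-European languages, the newer a word is, the more weight falls on its subwords; in Chinese, conversely, the weight on subwords decreases and the weight on the word itself increases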
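Motivation
• Linguistic finding: "In Chinese, the dominant word form shifted over time from monosyllabic to disyllabic."
• Hypothesis: "In modern Chinese, the characters (subwords) contained in a word carry a smaller share of its meaning."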
Related Work
• CBOW (predicts the target from its context)
  • Character-enhanced word embedding (CWE)
• Skip-gram (predicts the context from the target)
  • fastText
• These existing methods treat words and characters with equal importance
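The two training objectives named on this slide differ only in the direction of prediction: CBOW predicts the target from its context, while skip-gram predicts the context from the target. A minimal sketch of how training examples could be built for each follows. The function names, the `window` parameter, and the toy sentence are assumptions made for illustration and do not come from the slides.

```python
from typing import List, Tuple


def _context(tokens: List[str], i: int, window: int) -> List[str]:
    """Return up to `window` tokens on each side of position i."""
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW: each example maps the whole context to the target word."""
    examples = []
    for i, target in enumerate(tokens):
        context = _context(tokens, i, window)
        if context:
            examples.append((context, target))
    return examples


def skipgram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram: each example maps the target word to one context word."""
    examples = []
    for i, target in enumerate(tokens):
        examples.extend((target, c) for c in _context(tokens, i, window))
    return examples


if __name__ == "__main__":
    sentence = ["language", "changes", "over", "time"]  # hypothetical toy input
    print(cbow_examples(sentence, window=1))
    print(skipgram_examples(sentence, window=1))
```

CBOW yields one example per position, whereas skip-gram yields one example per (target, context word) pair, so skip-gram produces more but smaller examples from the same text.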
Method
• Dynamic subword-incorporated embedding model (DSE)
  • DSE-CBOW
  • DSE-SG
• Each word is weighted by $h_i^w$ and its subwords by $1 - h_i^w$
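The weighting described on this slide, a weight $h_i^w$ on the word and $1 - h_i^w$ on its subwords, can be illustrated with a small sketch. The slide gives only the two weights. Pooling the subword vectors by averaging, the fallback for words without subwords, and the name `combine` are all assumptions made here for illustration, not the paper's confirmed formulation.

```python
from typing import List


def combine(word_vec: List[float], subword_vecs: List[List[float]], h: float) -> List[float]:
    """Mix a word vector with its subword vectors using weights h and 1 - h."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    if not subword_vecs:
        # Assumption: with no subwords, fall back to the word vector alone.
        return list(word_vec)
    dim = len(word_vec)
    # Assumption: subword vectors are pooled by a simple average.
    sub_mean = [sum(v[d] for v in subword_vecs) / len(subword_vecs) for d in range(dim)]
    return [h * word_vec[d] + (1.0 - h) * sub_mean[d] for d in range(dim)]


if __name__ == "__main__":
    word = [1.0, 0.0]                  # hypothetical word vector
    subs = [[0.0, 1.0], [0.0, 3.0]]    # hypothetical subword vectors
    print(combine(word, subs, h=0.25))  # mostly subword information
    print(combine(word, subs, h=1.0))   # word vector only
```

With `h` close to 1 the representation is dominated by the whole word, and with `h` close to 0 it is dominated by the subwords, which is what lets the learned weight be read as a measure of how much a word's meaning is carried by its parts.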
Experiment
• Datasets
  • Training word embeddings: Wikipedia database dumps
  • Chinese, English, French, German, Italian, Spanish
• Models
  • DSE-CBOW, DSE-SG (proposed methods)
  • CWE, fastText
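Experiment
• Experimental items
  • Correlation between $h_i^w$ and the time a word emerged
    • Time of emergence: when the word first appeared in Google Books Ngram
  • Word semantics tasks
    • Measure the quality of the embeddings
    • Uses Similarity and Analogy tasks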
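The first experimental item, correlating each word's weight with the time the word emerged, can be sketched as below. The slides do not say which correlation coefficient is used, so Pearson's r is an assumption here. The years and weights are made-up toy values for illustration only, not results from the paper.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0.0 or sd_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical toy data: year of first appearance and learned word weight.
    first_year = [1800, 1850, 1900, 1950, 2000]
    word_weight = [0.9, 0.8, 0.7, 0.6, 0.5]
    print(pearson(first_year, word_weight))
```

A negative coefficient means newer words carry less weight on the whole word and more on subwords, and a positive coefficient means the opposite.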
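Result
• Correlation between the word weight $h_i^w$ and time of emergence: Indo-European languages and Chinese show opposite trends
• Indo-European: as time advances, the weight on the word decreases, so subwords are emphasized over the word
• Chinese: as time advances, the weight on the word increases, so the word is emphasized over its subwords (the hypothesis was confirmed)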
Result
• Compared within each group
  • CBOW family (DSE-CBOW, CWE)
  • Skip-gram family (DSE-SG, fastText)
• A performance improvement was confirmed for DSE-SG
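Conclusion
• Proposed word embeddings that take subwords into account in order to test the hypothesis
• In Indo-European languages, the more recently a word was coined, the more of its meaning is weighted toward subwords; in Chinese, conversely, the subword weight decreases and the weight shifts to the word itself (the hypothesis was confirmed)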
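Discussion
• A concrete comparison was carried out for the experiments
• China: did new words enter the language because science and technology developed during modernization?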