Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
文献紹介:Treat the Word As a Whole or Look Inside? ...
Search
Taichi Aida
September 16, 2019
Technology
0
350
文献紹介:Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology
Taichi Aida
September 16, 2019
Tweet
Share
More Decks by Taichi Aida
See All by Taichi Aida
意味を表すベクトル表現を用いたテキスト分析
a1da4
0
74
PhD Defence: Considering Temporal and Contextual Information for Lexical Semantic Change Detection
a1da4
1
250
文献紹介:A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications
a1da4
1
360
YANS2024:目指せ国際会議!「ネットワーキングの極意(国際会議編)」
a1da4
0
280
言語処理学会30周年記念事業留学支援交流会@YANS2024:「学生のための短期留学」
a1da4
1
400
新入生向けチュートリアル:文献のサーベイv2
a1da4
16
11k
文献紹介:Isotropic Representation Can Improve Zero-Shot Cross-Lingual Transfer on Multilingual Language Models
a1da4
0
200
文献紹介:WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings
a1da4
1
320
文献紹介:On the Transformation of Latent Space in Fine-Tuned NLP Models
a1da4
0
120
Other Decks in Technology
See All in Technology
ActiveJobUpdates
igaiga
1
320
Snowflake Industry Days 2025 Nowcast
takumimukaiyama
0
130
Strands Agents × インタリーブ思考 で変わるAIエージェント設計 / Strands Agents x Interleaved Thinking AI Agents
takanorig
5
2.1k
202512_AIoT.pdf
iotcomjpadmin
0
150
さくらのクラウド開発ふりかえり2025
kazeburo
2
1.2k
Amazon Connect アップデート! AIエージェントにMCPツールを設定してみた!
ysuzuki
0
140
SREが取り組むデプロイ高速化 ─ Docker Buildを最適化した話
capytan
0
150
7,000万ユーザーの信頼を守る「TimeTree」のオブザーバビリティ実践 ( Datadog Live Tokyo )
bell033
1
100
M&Aで拡大し続けるGENDAのデータ活用を促すためのDatabricks権限管理 / AEON TECH HUB #22
genda
0
270
Oracle Database@Azure:サービス概要のご紹介
oracle4engineer
PRO
2
200
New Relic 1 年生の振り返りと Cloud Cost Intelligence について #NRUG
play_inc
0
240
NIKKEI Tech Talk #41: セキュア・バイ・デザインからクラウド管理を考える
sekido
PRO
0
220
Featured
See All Featured
Unsuck your backbone
ammeep
671
58k
Primal Persuasion: How to Engage the Brain for Learning That Lasts
tmiket
0
190
The Cult of Friendly URLs
andyhume
79
6.7k
Game over? The fight for quality and originality in the time of robots
wayneb77
1
67
Visualization
eitanlees
150
16k
The SEO Collaboration Effect
kristinabergwall1
0
310
sira's awesome portfolio website redesign presentation
elsirapls
0
91
Rails Girls Zürich Keynote
gr2m
95
14k
Code Reviewing Like a Champion
maltzj
527
40k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.6k
Impact Scores and Hybrid Strategies: The future of link building
tamaranovitovic
0
180
<Decoding/> the Language of Devs - We Love SEO 2024
nikkihalliwell
0
100
Transcript
จݙհʢʣ Treat the Word As a Whole or Look Inside?
Subword Embeddings Model Language Change and Typology Yang Xu, Jiasheng Zhang, David Reitter 1st International Workshop on Computational Approaches to Historical Language Change, ACL2019 Ԭٕज़Պֶେֶ ࣗવݴޠॲཧݚڀࣨɹ ૬ాɹଠҰ
Abstract • ݴޠֶతͳԾઆΛௐΔͨΊʹ subword Λߟྀͨ͠୯ޠࢄදݱΛఏҊ • Indo-European ͷݴޠ৽͍͠୯ޠ΄Ͳ subword ʹର͢ΔॏΈ͕૿͑ɺɹ
ٯʹதࠃޠ subword ʹର͢ΔॏΈ͕ݮΓɺ୯ޠʹର͢ΔॏΈ͕૿͑ͨ !2
Motivation w ݴޠֶతͳ݁ ʮதࠃޠʹ͓͍ͯɺ࣌ؒͱͱʹ༏Ґੑ͕୯ԻઅˠೋԻઅʹҠͬͨʯ w Ծઆ ʮݱͷதࠃޠʹ͓͍ͯɺ୯ޠʹؚ·ΕΔࣈจࣈʢTVCXPSEʣ ҙຯతͳׂ͕গͳ͍ʯ !3
Related Work w $#08ʢDPOUFYU ͔ΒUBSHFU Λ༧ଌʣ w $IBSBDUFSFOIBODFEXPSEFNCFEEJOH $8&
w 4LJQHSBNʢUBSHFU ͔ΒDPOUFYU Λ༧ଌʣ w GBTU5FYU vc ui vc ui !4 ୯ޠͱจࣈΛಉ͡ॏཁͰѻ͏
Method w %ZOBNJDTVCXPSEJODPSQPSBUFEFNCFEEJOHNPEFM %4& w %4&$#08 w %4&4( w
୯ޠʹ୯ޠͷॏΈ ͰɺTVCXPSEʹ ͰॏΈ͚͢Δ hw i 1 − hw i !5
Method !6
Experiment w %BUBTFUT w 5SBJOJOHXPSEFNCFEEJOH8JLJQFEJBEBUBCBTFEVNQT w $IJOFTF &OHMJTI 'SFODI (FSNBO
*UBMJBO 4QBOJTI w .PEFM w %4&$#08 %4&4(ʢఏҊख๏ʣ w $8& GBTU5FYU !7
Experiment w ࣮ݧ߲ ͱ୯ޠͷൃੜ࣌ظͱͷ૬ؔ w ൃੜ࣌ظɿ͋Δޠ͕(PPHMF#PPLT/HSBNʹॳΊͯొͨ͠ ޠͷҙຯλεΫ
w &NCFEEJOHͷੑೳΛଌΔ w 4JNJMBSJUZͱ"OBMPHZΛ༻ hw i !8
Result ୯ޠͷॏΈ ͱൃੜ࣌ظͱͷ૬ؔɿ*OEP&VSPQFBOͱதࠃͰਖ਼ର w hw i !9
Result ୯ޠͷॏΈ ͱൃੜ࣌ظͱͷ૬ؔɿ*OEP&VSPQFBOͱதࠃͰਖ਼ର w hw i !10 ͕࣌ਐΉͱ ୯ޠʹର͢ΔॏΈ͕ݮগ ˣ
୯ޠΑΓ 4VCXPSEΛॏࢹ
Result ୯ޠͷॏΈ ͱൃੜ࣌ظͱͷ૬ؔɿ*OEP&VSPQFBOͱதࠃͰਖ਼ର w hw i !11 ͕࣌ਐΉͱ ୯ޠʹର͢ΔॏΈ͕૿Ճ ˣ
4VCXPSEΑΓ ୯ޠΛॏࢹ ʢԾઆ͕͔֬ΊΒΕͨʣ
Result w ͦΕͧΕͷάϧʔϓͰൺֱ w $#08ܥʢ%4&$#08 $8&ʣ w 4LJQHSBNܥʢ%4&4( GBTUUFYUʣ w
%4&4(Ͱੑೳͷ্Λ֬ೝ !12
Conclusion w ԾઆΛݕূ͢ΔҝʹɺTVCXPSEΛߟྀ͢Δ୯ޠࢄදݱΛఏҊͨ͠ w *OEP&VSPQFBOͷݴޠͰ৽͘͠ੜ·ΕΔ୯ޠ΄ͲTVCXPSEʹҙຯͷ ॏΈ͕ॏࢹ͞ΕɺதࠃޠͰٯʹTVCXPSEͷॏΈ͕ݮΓɺ୯ޠͦͷ ͷʹରͯ͠ॏΈ͕ͭ͘Α͏ʹͳͬͨʢԾઆΛݕূͨ͠ʣ !13
None
Discussion w ࣮ݧʹରͯ͠۩ମతͳൺֱΛߦͬͨ w தࠃɿͷۙԽͰٕज़Պֶ͕ൃలͨ͜͠ͱʹΑΓɺ৽͍͠୯ ޠ͕ೖ͖ͬͯͨʁ !15