Paper Review: Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology
Taichi Aida
September 16, 2019
Transcript
Paper Review: Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology
Yang Xu, Jiasheng Zhang, David Reitter
1st International Workshop on Computational Approaches to Historical Language Change, ACL 2019
Presenter: Taichi Aida, Natural Language Processing Laboratory, Nagaoka University of Technology
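Abstract
• Proposes word embeddings (distributed word representations) that take subwords into account, in order to examine a linguistic hypothesis
• In Indo-European languages, the newer a word is, the more weight falls on its subwords; in Chinese, conversely, the weight on subwords decreases and the weight on the word itself increases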
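Motivation
• Linguistic finding: "In Chinese, dominance has shifted over time from monosyllabic to disyllabic words"
• Hypothesis: "In modern Chinese, the characters (subwords) contained in a word carry a smaller share of its meaning"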
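Related Work
• CBOW (predicts the target from the context)
  • Character-enhanced word embedding (CWE)
• Skip-gram (predicts the context from the target)
  • fastText
• These models treat words and characters with equal importance
[Figures: model diagrams labeled with the vectors $v_c$ and $u_i$]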
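The difference between the two prediction directions named on this slide can be sketched by listing the training examples each one would see. This is a minimal illustration, not the reviewed paper's code: the function names, the `window` parameter, and the toy sentence are all assumptions made for the example.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context words, target) examples: CBOW predicts the target from its context."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:  # skip tokens that have no neighbours at all
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (target, context word) examples: Skip-gram predicts each context word from the target."""
    pairs = []
    for i, target in enumerate(tokens):
        for ctx in tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]:
            pairs.append((target, ctx))
    return pairs


# Hypothetical toy sentence, used only to show the shape of the examples.
tokens = ["subword", "embeddings", "model", "language", "change"]
print(cbow_pairs(tokens, window=1)[1])       # one example: several inputs, one output
print(skipgram_pairs(tokens, window=1)[:3])  # several examples: one input, one output each
```

CBOW produces one example per position with the whole context as input, while Skip-gram produces one example per (target, neighbour) pair.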
Method
• Dynamic subword-incorporated embedding model (DSE)
  • DSE-CBOW
  • DSE-SG
• Each word is weighted by its word weight $h_i^w$, and its subwords are weighted by $1 - h_i^w$
Method (continued)
[Figure not captured in the transcript]
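The weighting described on the Method slide, $h_i^w$ for the word and $1 - h_i^w$ for its subwords, can be sketched as a simple interpolation. The slides do not say how the subword vectors are aggregated or how $h_i^w$ is obtained, so the code below makes two assumptions: subword vectors are averaged, and `h` is passed in as a plain number in [0, 1]. The function name `combine` is hypothetical.

```python
from typing import List

Vector = List[float]


def combine(word_vec: Vector, subword_vecs: List[Vector], h: float) -> Vector:
    """Return h * word_vec + (1 - h) * mean(subword_vecs).

    Assumptions (not confirmed by the slides): subword vectors are averaged,
    and h is a given scalar rather than a learned parameter.
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    if not subword_vecs:
        # With no subwords available, fall back to the word vector alone.
        return list(word_vec)
    dim = len(word_vec)
    mean_sub = [sum(v[d] for v in subword_vecs) / len(subword_vecs) for d in range(dim)]
    return [h * word_vec[d] + (1.0 - h) * mean_sub[d] for d in range(dim)]


# Hypothetical 2-dimensional vectors for a word and its two characters.
word = [1.0, 0.0]
chars = [[0.0, 1.0], [0.0, 3.0]]
print(combine(word, chars, h=0.75))  # mostly the word vector
print(combine(word, chars, h=0.0))   # only the subword average
```

Under this reading, a large `h` means the word is treated mostly as a whole, and a small `h` means the representation leans on what is inside the word.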
Experiment
• Datasets
  • Training word embeddings: Wikipedia database dumps
  • Chinese, English, French, German, Italian, Spanish
• Models
  • DSE-CBOW, DSE-SG (proposed methods)
  • CWE, fastText
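Experiment
• Experimental items
  • Correlation between $h_i^w$ and the time a word emerged
    • Time of emergence: when a word first appeared in Google Books Ngram
  • Word semantic tasks
    • Measure the performance of the embeddings
    • Use Similarity and Analogy tasks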
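The first experimental item, correlating each word's weight with its time of emergence, can be sketched as follows. The slides do not name the correlation coefficient, so Pearson's r is used here as an assumption. The years and weights are invented toy values for illustration only, not results from the paper.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented toy data: first-appearance year and word weight for five words.
first_year = [1800, 1850, 1900, 1950, 2000]
word_weight = [0.9, 0.8, 0.7, 0.6, 0.5]
r = pearson(first_year, word_weight)
print(f"r = {r:.3f}")  # negative: in this toy data, newer words have smaller word weights
```

A negative coefficient corresponds to word weights shrinking for newer words, and a positive one to word weights growing.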
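Result
Correlation between the word weight $h_i^w$ and time of emergence: opposite trends in Indo-European languages and Chinese
• Indo-European languages: as time advances, the weight on the word decreases, so subwords are emphasized over the word
• Chinese: as time advances, the weight on the word increases, so the word is emphasized over subwords (the hypothesis was confirmed)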
Result
• Compared within each group
  • CBOW family (DSE-CBOW, CWE)
  • Skip-gram family (DSE-SG, fastText)
• A performance improvement was confirmed for DSE-SG
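The Similarity and Analogy tasks used for this comparison are commonly scored with cosine similarity and the vector-offset method. The slides do not describe the scoring procedure, so the sketch below shows that common approach as an assumption. The embedding table is a hypothetical toy, and `solve_analogy` is an illustrative name.

```python
from math import sqrt
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def solve_analogy(emb: Dict[str, List[float]], a: str, b: str, c: str) -> str:
    """Answer 'a is to b as c is to ?' with the word nearest to b - a + c."""
    dim = len(emb[a])
    query = [emb[b][d] - emb[a][d] + emb[c][d] for d in range(dim)]
    candidates = [w for w in emb if w not in (a, b, c)]  # exclude the question words
    return max(candidates, key=lambda w: cosine(emb[w], query))


# Hypothetical 2-dimensional embeddings, chosen by hand for the example.
emb = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 3.0],
}
print(cosine(emb["king"], emb["queen"]))
print(solve_analogy(emb, "man", "woman", "king"))
```

In a similarity task the cosine scores are compared against human ratings, and in an analogy task the predicted word is checked against the expected answer.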
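Conclusion
• To test the hypothesis, the paper proposes word embeddings that take subwords into account
• In Indo-European languages, the more recently a word was coined, the more its meaning is weighted toward its subwords; in Chinese, conversely, the subword weight decreases and the weight shifts to the word itself (the hypothesis was verified)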
Discussion
• Concrete comparisons were made for the experiments
• China: did new words enter the language because science and technology developed during modernization?