Paper Introduction: Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology
Taichi Aida
September 16, 2019
Transcript
Paper Introduction: Treat the Word As a Whole or Look Inside? Subword Embeddings Model Language Change and Typology
Yang Xu, Jiasheng Zhang, David Reitter
1st International Workshop on Computational Approaches to Historical Language Change, ACL 2019
Presenter: Taichi Aida, Natural Language Processing Laboratory, Nagaoka University of Technology
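Abstract
• Proposes distributed word representations that take subwords into account, in order to examine a linguistic hypothesis
• In Indo-European languages, the newer a word is, the more weight falls on its subwords; in Chinese, conversely, the weight on subwords decreases and the weight on the word itself increases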
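Motivation
• Linguistic finding: "In Chinese, dominance has shifted over time from monosyllabic to disyllabic words"
• Hypothesis: "In modern Chinese, the characters contained in a word (subwords) carry a smaller share of its meaning"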
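Related Work
• CBOW (predicts the target from the context)
  • Character-enhanced word embedding (CWE)
• Skip-gram (predicts the context from the target)
  • fastText
• These models treat words and characters as equally important
(Diagram labels on this slide: v_c, u_i)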
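The difference between the two prediction directions named on this slide can be sketched as follows. This is a minimal illustration in plain Python, not code from the paper: the function names `cbow_examples` and `skipgram_examples`, the `window` parameter, and the toy sentence are all assumptions made for the example.

```python
def cbow_examples(tokens, window):
    """CBOW view: each example is (context words, target word)."""
    examples = []
    for i, target in enumerate(tokens):
        # Words up to `window` positions to the left and right of the target.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            examples.append((context, target))
    return examples


def skipgram_examples(tokens, window):
    """Skip-gram view: each example is (target word, one context word)."""
    return [
        (target, context_word)
        for context, target in cbow_examples(tokens, window)
        for context_word in context
    ]


# Hypothetical toy sentence, for illustration only.
tokens = ["the", "cat", "sat"]
cbow = cbow_examples(tokens, window=1)
sg = skipgram_examples(tokens, window=1)

print(cbow)  # [(['cat'], 'the'), (['the', 'sat'], 'cat'), (['cat'], 'sat')]
print(sg)    # [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
```

CBOW pools the whole context into one prediction of the target, while Skip-gram emits one prediction per (target, context word) pair. CWE and fastText add character or subword information on top of these two schemes respectively.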
Method
• Dynamic subword-incorporated embedding model (DSE)
  • DSE-CBOW
  • DSE-SG
• The word is weighted by the word weight $h^w_i$, and its subwords are weighted by $1 - h^w_i$
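The weighting on this slide (a weight on the word, and one minus that weight on its subwords) can be sketched as a convex combination. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `combine` is hypothetical, the subword vectors are simply averaged, and the weight `h` is passed in as a fixed number because the slide does not show how it is parameterized or trained.

```python
import numpy as np


def combine(word_vec, subword_vecs, h):
    """Blend a word vector with the mean of its subword vectors.

    h is the weight on the whole word; (1 - h) goes to the subwords.
    Averaging the subword vectors is an assumption made for this sketch.
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    subword_mean = np.mean(subword_vecs, axis=0)
    return h * word_vec + (1.0 - h) * subword_mean


# Hypothetical toy vectors, for illustration only.
word_vec = np.array([1.0, 0.0])
subword_vecs = np.array([[0.0, 1.0], [0.0, 3.0]])

blended = combine(word_vec, subword_vecs, h=0.75)
print(blended)  # [0.75 0.5]
```

With `h` close to 1 the representation is dominated by the word as a whole; with `h` close to 0 it is dominated by the subwords. This is the quantity the experiments correlate with each word's time of emergence.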
Experiment
• Datasets
  • Training word embeddings: Wikipedia database dumps
  • Chinese, English, French, German, Italian, Spanish
• Models
  • DSE-CBOW, DSE-SG (proposed methods)
  • CWE, fastText
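Experiment
• Experiment 1: correlation between the word weight $h^w_i$ and the word's time of emergence
  • Time of emergence: when the word first appeared in Google Books Ngram
• Experiment 2: word semantics tasks
  • Measures the quality of the embeddings
  • Uses Similarity and Analogy tasks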
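The first experiment above relates each word's weight to the time the word first appeared. A minimal sketch of such a correlation follows. It assumes a Pearson coefficient, which the slide does not specify, and the helper name `pearson` is hypothetical. The numbers are made-up toy values chosen to be perfectly linear, not results from the paper.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Made-up toy values: year of first appearance and word weight per word.
first_year = [1800, 1850, 1900, 1950, 2000]
word_weight = [0.9, 0.8, 0.7, 0.6, 0.5]

r = pearson(first_year, word_weight)
print(round(r, 3))  # -1.0
```

A negative coefficient means newer words put less weight on the word as a whole, and therefore more on subwords. A positive coefficient means the opposite.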
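Result
• Correlation between the word weight $h^w_i$ and time of emergence: Indo-European languages and Chinese show opposite trends
• Indo-European: as time advances, the weight on the word decreases → subwords are emphasized over the word
• Chinese: as time advances, the weight on the word increases → the word is emphasized over subwords (the hypothesis was confirmed)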
Result
• Models were compared within each group
  • CBOW family (DSE-CBOW, CWE)
  • Skip-gram family (DSE-SG, fastText)
• A performance improvement was confirmed for DSE-SG
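Conclusion
• To test the hypothesis, the paper proposes distributed word representations that take subwords into account
• In Indo-European languages, the more recently a word emerged, the more of its meaning is weighted toward its subwords; in Chinese, conversely, the weight on subwords decreased and the weight shifted to the word itself (the hypothesis was verified)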
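Discussion
• Concrete comparisons were carried out for the experiments
• China: did new words enter the language because science and technology developed during modernization?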