Multimodal Grounding for Language Processing
onizuka laboratory
October 17, 2018
Research
Presentation slides from the COLING 2018 paper-reading session held in our laboratory.
Transcript
Multimodal Grounding for Language Processing 20181017 NomotoEriko 1
Contents: [four outline items, Japanese text lost in extraction; the last item mentions NLP] 2
Multimodal Grounding for Language Processing 3
NLP: [Japanese text lost in extraction]
Conceptual grounding: [Japanese text lost in extraction] 4
Multimodal Grounding for Language Processing 5
Cross-modal transfer; Cross-modal interpretation; Joint multimodal processing [Japanese labels lost in extraction] 6
Cross-modal transfer: [Japanese text lost in extraction; three bullet points, each containing an arrow (→)] 7
Cross-modal interpretation: [Japanese text lost in extraction; two bullet points] 8
Joint multimodal processing: [Japanese text lost in extraction; two bullet points] 9
Multimodal Grounding for Language Processing 10
Concept representations; Projection; Compositional representations [Japanese labels lost in extraction] 11
Concept representations: [Japanese text lost in extraction] 12
Concept representations: [Japanese text lost in extraction; the example involves the words “cat”, “panther”, and “dog”] 13
Projection: Mapping; Joint representation space [Japanese text lost in extraction; figure labels: Mapping, Joint learning] 14
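Projection, Mapping: [Japanese text lost in extraction]. The slide gives a mapping from one representation to another (symbols lost) and an objective: maximize sim(·, ·) for one pair of representations and minimize sim(·, ·) for a second pair that shares the same first argument. 15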
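The maximize/minimize objective on this slide can be illustrated with a margin-based ranking loss, which is one common way to train such a mapping. The sketch below is an assumption, not the formulation from the slide, whose symbols did not survive extraction: the linear map `W`, the cosine similarity, the `margin` value, and all toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def apply_map(W, x):
    """Project x into the target space with the linear map W (one row per output dimension)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def margin_loss(W, source, target, negative, margin=0.5):
    """Hinge loss that reaches zero once sim(mapped, target) beats sim(mapped, negative) by `margin`."""
    mapped = apply_map(W, source)
    return max(0.0, margin - cosine(mapped, target) + cosine(mapped, negative))

# Hypothetical toy data: a textual vector, its paired visual vector,
# and an unrelated visual vector used as the negative example.
text_vec = [1.0, 0.0]
paired_image_vec = [0.9, 0.1]
random_image_vec = [0.0, 1.0]

W_identity = [[1.0, 0.0], [0.0, 1.0]]  # keeps the text vector near its pair
W_swapped = [[0.0, 1.0], [1.0, 0.0]]   # sends the text vector toward the negative

loss_good = margin_loss(W_identity, text_vec, paired_image_vec, random_image_vec)
loss_bad = margin_loss(W_swapped, text_vec, paired_image_vec, random_image_vec)
print(loss_good, loss_bad)
```

A mapping that places the projected vector near its paired target incurs no loss, while one that places it near the negative example is penalized. Training would adjust `W` to reduce this loss over many pairs.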
Projection, Joint representation space: [Japanese text lost in extraction] 16
[Japanese text lost in extraction; no English terms survive on this slide] 17
NLP Multimodal Grounding for Language Processing 18
NLP: [Japanese text lost in extraction; four bullet points, the first citing simLex-999 etc…] 19
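Word-similarity benchmarks such as SimLex-999 are usually scored by the Spearman correlation between a model's similarity scores and human ratings. The sketch below shows that computation only. The embeddings and ratings are hypothetical toy values, not SimLex-999 data, and how the slide itself uses the dataset is not recoverable.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy embeddings and human ratings (illustrative values only).
embeddings = {
    "cat": [0.9, 0.1],
    "panther": [0.8, 0.3],
    "dog": [0.6, 0.6],
    "car": [0.0, 1.0],
}
rated_pairs = [("cat", "panther", 8.0), ("cat", "dog", 6.0), ("cat", "car", 1.0)]

model_scores = [cosine(embeddings[a], embeddings[b]) for a, b, _ in rated_pairs]
human_scores = [rating for _, _, rating in rated_pairs]
rho = spearman(model_scores, human_scores)
print(rho)
```

Because only the ordering of the pairs matters, a model is rewarded for ranking pairs the way human raters do, regardless of the scale of its similarity scores.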
NLP: [Japanese text lost in extraction; two bullet points; the slide references imSitu (http://imsitu.org/)] 20
[Japanese text lost in extraction; three bullet points] NLP 21
Multimodal Grounding for Language Processing 22
NLP: [Japanese text lost in extraction] 23