Multimodal Grounding for Language Processing
onizuka laboratory
October 17, 2018
Slides presented at our laboratory's COLING 2018 reading group.
Transcript
Multimodal Grounding for Language Processing, 2018-10-17, Nomoto Eriko 1
Contents: [four bullet items; Japanese slide text not recoverable, the last item mentions NLP] 2
Multimodal Grounding for Language Processing 3
Conceptual grounding: [Japanese slide text not recoverable] 4
Multimodal Grounding for Language Processing 5
Cross-modal transfer / Cross-modal interpretation / Joint multimodal processing 6
Cross-modal transfer: [Japanese slide text not recoverable] 7
Cross-modal interpretation: [Japanese slide text not recoverable] 8
Joint multimodal processing: [Japanese slide text not recoverable] 9
Multimodal Grounding for Language Processing 10
Concept representations / Projection / Compositional representations 11
Concept representations: [Japanese slide text not recoverable] 12
Concept representations: [Japanese slide text not recoverable; the example mentions “cat”, “panther”, and “dog”] 13
Projection: Mapping; Joint representation space. [Japanese slide text not recoverable; figure labels: ← Mapping, ↓ Joint learning] 14
Projection (Mapping): a mapping from one representation to the other. Objective: maximize sim(·, ·), minimize sim(·, ·) [symbols and remaining Japanese slide text not recoverable] 15
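The mapping slide states its objective only as maximizing one similarity and minimizing another. A minimal sketch of one common way to realize such an objective is a margin-based ranking loss over paired text and image vectors. Reading the second term as similarity to a non-matching, randomly drawn item is an assumption, as are the linear map `W`, the cosine similarity, the margin value, and every function and variable name below; none of them are taken from the slides.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def apply_mapping(W, x):
    """Hypothetical linear mapping into the target space: W @ x."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def margin_ranking_loss(W, text_vecs, image_vecs, margin=0.5, rng=None):
    """Average hinge loss over aligned (text, image) pairs.

    The mapped text vector should be more similar to its own image vector
    than to a randomly drawn other image vector, by at least `margin`.
    Requires at least two pairs so that a negative example exists.
    """
    rng = rng or random.Random(0)
    n = len(text_vecs)
    total = 0.0
    for i, (t, v) in enumerate(zip(text_vecs, image_vecs)):
        # Assumption: the negative is sampled uniformly from the other items.
        j = rng.choice([k for k in range(n) if k != i])
        mapped = apply_mapping(W, t)
        pos = cosine(mapped, v)              # similarity to maximize
        neg = cosine(mapped, image_vecs[j])  # similarity to minimize
        total += max(0.0, margin - pos + neg)
    return total / n


# Toy data (hypothetical): two concepts with 2-d vectors in each modality.
text_vecs = [[1.0, 0.0], [0.0, 1.0]]
image_vecs = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # maps each text vector onto its image
swap = [[0.0, 1.0], [1.0, 0.0]]      # maps each text vector onto the wrong image

good_loss = margin_ranking_loss(identity, text_vecs, image_vecs)
bad_loss = margin_ranking_loss(swap, text_vecs, image_vecs)
```

With this toy data the correctly aligned mapping incurs no loss and the swapped mapping is penalized, which is the behavior the maximize/minimize pair is meant to encourage. In practice `W` would be learned by gradient descent on this loss rather than fixed by hand.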
Projection (Joint representation space): [Japanese slide text not recoverable] 16
[Japanese slide text not recoverable] 17
Multimodal Grounding for Language Processing 18
NLP: [Japanese slide text not recoverable; cites simLex-999 etc.] 19
NLP: [Japanese slide text not recoverable; refers to imSitu (http://imsitu.org/)] 20
NLP: [Japanese slide text not recoverable] 21
Multimodal Grounding for Language Processing 22
NLP: [Japanese slide text not recoverable] 23