Multimodal Grounding for Language Processing
onizuka laboratory
October 17, 2018
Research
Presentation slides from the COLING 2018 paper reading group held in our laboratory.
Transcript
Multimodal Grounding for Language Processing 2018-10-17 Nomoto Eriko 1
Contents 2
Multimodal Grounding for Language Processing 3
Conceptual grounding 4
Multimodal Grounding for Language Processing 5
Cross-modal transfer, Cross-modal interpretation, Joint multimodal processing 6
Cross-modal transfer 7
Cross-modal interpretation 8
Joint multimodal processing 9
Multimodal Grounding for Language Processing 10
Concept representations, Projection, Compositional representations 11
Concept representations 12
Concept representations: “cat”, “panther”, “dog” 13
Projection: Mapping, Joint representation space (figure labels: Mapping, Joint learning) 14
Projection: Mapping. maximize sim(·, ·), minimize sim(·, ·) 15
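The Mapping slide pairs a "maximize sim" term with a "minimize sim" term. One common way to realize such an objective is a max-margin (hinge) loss over a learned linear map: the projected source vector should score higher against its paired target than against a randomly drawn other target. The sketch below illustrates that reading only. The linear map `W`, the dot-product similarity, the margin, the random negative sampling, the toy one-hot data and all function names are assumptions for illustration and are not taken from the slides.

```python
import random


def dot(a, b):
    """Dot-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(W, x):
    """Map x (length dx) through W (dx rows, dy columns) into the target space."""
    return [sum(x[k] * W[k][c] for k in range(len(x))) for c in range(len(W[0]))]


def train_mapping(X, Y, epochs=50, lr=0.1, margin=1.0, seed=0):
    """Learn W so that sim(X[i] W, Y[i]) exceeds sim(X[i] W, Y[j]) by a margin.

    Hypothetical setup: X[i] and Y[i] are paired vectors from two modalities,
    and Y[j] (j != i) is a randomly sampled negative example.
    """
    rng = random.Random(seed)
    n, dx, dy = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * dy for _ in range(dx)]
    for _ in range(epochs):
        for i in rng.sample(range(n), n):  # visit every pair once per epoch
            j = rng.randrange(n - 1)       # pick a negative index different from i
            if j >= i:
                j += 1
            p = project(W, X[i])
            # Hinge loss: margin - sim(positive) + sim(negative)
            if margin - dot(p, Y[i]) + dot(p, Y[j]) > 0:
                # Gradient step: raise sim with Y[i], lower sim with Y[j]
                for k in range(dx):
                    for c in range(dy):
                        W[k][c] += lr * X[i][k] * (Y[i][c] - Y[j][c])
    return W


def nearest(W, x, Y):
    """Index of the target vector most similar to the projection of x."""
    p = project(W, x)
    return max(range(len(Y)), key=lambda j: dot(p, Y[j]))


# Toy paired data (assumed): three source vectors and their three targets.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Y = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
W = train_mapping(X, Y)
matches = [nearest(W, x, Y) for x in X]
```

After training, `matches` holds, for each source vector, the index of the closest target under the learned map, so a correct mapping pairs each `X[i]` with `Y[i]`. Cosine similarity or a non-linear map could be substituted without changing the overall structure.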
Projection: Joint representation space 16
NLP Multimodal Grounding for Language Processing 18
NLP: SimLex-999 etc. 19
NLP: imSitu (http://imsitu.org/) 20
NLP 21
Multimodal Grounding for Language Processing 22
NLP 23