Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions

Presentation slides from the EMNLP 2018 reading group held in our laboratory.

onizuka laboratory

December 18, 2018
Transcript

  1. Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions. Q. Li, J. Fu, D. Yu et al., EMNLP 2018. 2018-12-18
  2. […] VQA, CNN, RNN, end-to-end […] VQA, end-to-end […]
  3. Visual Q&A. Q: where is the man swinging the racket? A: tennis court
  4. Visual Q&A. Q: what kind of drink is in the glass? A: water
  5. Visual Q&A. Q: what is walking next to the bus? A: cow
  6. Visual Q&A. Q: does the man need a haircut? A: yes
  7. [Figure. Labels: question "where is the man swinging the racket?"; CNN; RNN; answer candidates: yes, no, water, tennis court, ⋮]
  8. […] end-to-end […]
  9. […] ver. […] ver. […] ver. […]
  10. […] ResNet152 […] cos […] cos […]
  11. […] ResNet152 […] LSTM […] (e.g. BLEU) […] cos […]
  12. VQA-real […] 3 […] 10 […] min(#humans giving that answer / 3, 1) […]
  13. […] VQA […]
  14. Attributes: tennis, ball, man, racket, hit, court, play, player, swing, hold. Caption: a man holding a tennis racket on a tennis court. Predicted answer: tennis court. Q: where is the man swinging the racket? A: tennis court
  15. Attributes: bicycle, man, sit, eat, bike, look, outside, food, person, table. Caption: a man sitting at a table with a plate of food. Predicted answer: beer. Q: what kind of drink is in the glass? A: water
  16. Attributes: street, bus, cow, city, walk, car, drive, stand, road, white. Caption: a cow that is walking in the street. Predicted answer: car. Q: what is walking next to the bus? A: cow
  17. Attributes: woman, bear, teddy, hold, sit, glass, animal, large, lady. Caption: a woman holding a sandwich in her hands. Predicted answer: yes. Q: does the man need a haircut? A: yes
  18. […] 30% […] 65% […] yes/no […] 80% […]
  19. […] VQA […] VQA […]
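
The evaluation formula on slide 12, min(#humans giving that answer / 3, 1), can be sketched in a few lines of Python. This is a minimal illustration and not the authors' evaluation code: the function name `vqa_accuracy`, the lowercase/strip normalization, and the sample lists of ten human answers are all assumptions made for the example.

```python
from typing import Sequence


def vqa_accuracy(predicted: str, human_answers: Sequence[str]) -> float:
    """Score one prediction as min(#humans giving that answer / 3, 1).

    Assumption: answers are compared after lowercasing and trimming
    whitespace; the slides do not specify any normalization.
    """
    target = predicted.strip().lower()
    # Count how many annotators gave exactly the predicted answer.
    matches = sum(1 for a in human_answers if a.strip().lower() == target)
    return min(matches / 3.0, 1.0)


if __name__ == "__main__":
    # Hypothetical annotator answers (ten per question), for illustration only.
    all_agree = ["tennis court"] * 10
    mostly_beer = ["water", "water"] + ["beer"] * 8
    print(vqa_accuracy("tennis court", all_agree))  # every annotator agrees
    print(vqa_accuracy("water", mostly_beer))       # only two annotators agree
    print(vqa_accuracy("car", ["cow"] * 10))        # no annotator agrees
```

Under this formula a prediction gets full credit once at least three annotators gave the same answer, and partial credit (1/3 or 2/3) when only one or two did.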