Visually Grounded Neural Syntax Acquisition

Yuichiroh
September 28, 2019

A summary of the paper presented at ACL 2019, given at the State-of-the-Art NLP Study Group.

Transcript

  1. Visually Grounded Neural Syntax Acquisition
    Haoyue Shi, Jiayuan Mao, Kevin Gimpel, Karen Livescu (presented at ACL 2019)
    Reader: Yuichiroh Matsubayashi (Tohoku University)
    State-of-the-Art NLP Study Group
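  2. Overview
    • Unsupervised learning of constituency parsing.
    • Builds on the visual-semantic embedding space approach [Ngiam et al., 2011].
    • Using images and their captions, defines concreteness/abstractness scores for text spans and uses them to guide how spans are merged.
    • Training is more efficient and more stable than learning from text alone.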
  3. Concept
    • Assumption: being able to compute image-phrase similarity more accurately = being able to form good constituent units.
    • This assumption is used as the training signal.

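  4. Related work (1): Linguistic structure induction from text
    • Earlier work mostly started from part-of-speech tags.
      – Mochihashi-san has discussed the limitations of this line of work.
    • Induction from downstream tasks via distant supervision [many studies].
      – These did not succeed in deriving structures that look linguistically plausible.
    • Induction via language modeling.
      – "This approach has achieved remarkable performance."
      – Parsing-Reading-Predict Network [Shen et al., 2018a]
      – Ordered Neuron LSTM [Shen et al., 2019] (introduced by Mizuki-san)
    • For lack of time, please see the paper for the reference list.
    • This work introduces the idea of guiding induction by matching against images rather than by language modeling.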
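  5. Related work (2): Grounded language acquisition
    • Induction from images or videos and their captions.
      – Mostly relies on hand-made labels or rules about visual attributes and actions. This work is fully unsupervised.
    • Visual-semantic embedding space [Ngiam et al., 2011]. This work borrows this idea.
      – Many studies exist in settings that handle image-text pairs: image-caption retrieval, image caption generation, visual question answering.
    • For lack of time, please see the paper for the reference list.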
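  6. Method overview
    • Image side: ResNet-101 features, mapped by Φ: ℝ^2048 → ℝ^512 into the joint visual-semantic embedding space [Ngiam et al., 2011].
    • Text side: bottom-up binary tree parsing, trained with REINFORCE [Williams, 1992].
    • Every constituent gets a vector representation, learned in the same space as the images.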
  7. Parsing step
    • Candidate pairs are scored by a two-layer feedforward network.
    • The selected pair is combined to form a single new constituent.
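
The parsing step on slide 7 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it picks the best pair greedily instead of sampling, only considers adjacent pairs, uses a hypothetical dot-product `score_pair` in place of the two-layer feedforward network, and represents a merged constituent as the normalized sum of its children.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def score_pair(left, right):
    """Hypothetical stand-in for the two-layer feedforward scorer: a dot product."""
    return sum(a * b for a, b in zip(left, right))

def parse_greedy(words, vectors, scorer=score_pair):
    """Repeatedly merge the best-scoring adjacent pair until one tree remains."""
    trees = list(words)
    vecs = [normalize(v) for v in vectors]
    while len(trees) > 1:
        scores = [scorer(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]
        i = max(range(len(scores)), key=scores.__getitem__)
        # Assumption: the new constituent is the normalized sum of its children.
        merged = normalize([a + b for a, b in zip(vecs[i], vecs[i + 1])])
        trees[i:i + 2] = [(trees[i], trees[i + 1])]
        vecs[i:i + 2] = [merged]
    return trees[0]

# Toy 2-d embeddings: "a" and "cat" point the same way, so they merge first.
words = ["a", "cat", "sleeps"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
tree = parse_greedy(words, vectors)
print(tree)
```

During training a parser of this kind would sample merges from the score distribution so that REINFORCE can be applied; the greedy choice here only shows the data flow.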
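  8. Training: visual representation mapping
    • Each constituent in an image's caption should be more similar to that image than constituents of other captions are.
    • The image should be more similar to the constituents of its own caption than other images are.
    • Both constraints are expressed through the image-constituent similarity.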
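
The two constraints on slide 8 are commonly written as a hinge-based triplet loss. The sketch below assumes cosine similarity as the image-constituent similarity and a hypothetical margin of 0.2; the function and parameter names are illustrative and do not come from the slides.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def hinge_triplet_loss(image, constituent, other_images, other_constituents,
                       margin=0.2):
    """Penalize negatives that come within `margin` of the matching pair."""
    pos = cosine(image, constituent)
    loss = 0.0
    # Constraint 1: constituents of other captions should match this image less.
    for c_neg in other_constituents:
        loss += max(0.0, cosine(image, c_neg) - pos + margin)
    # Constraint 2: other images should match this constituent less.
    for i_neg in other_images:
        loss += max(0.0, cosine(i_neg, constituent) - pos + margin)
    return loss

# A well-aligned pair incurs no loss; a misaligned pair is penalized.
aligned = hinge_triplet_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], [[0.0, 1.0]])
misaligned = hinge_triplet_loss([1.0, 0.0], [0.0, 1.0], [[0.0, 1.0]], [[1.0, 0.0]])
```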

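  9. Training: textual structure representations
    • Reward function: a caption's constituents should be similar to what actually appears in the image.
    • Because this is a reward rather than a loss, the sign is reversed.
    • It is computed for a single constituent only.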
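
One way to read the reward on slide 9 is as the sign-flipped counterpart of a hinge loss, evaluated for one constituent at a time. The sketch below is written under that assumption; cosine similarity, the margin value of 0.5, and all names are hypothetical choices for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def concreteness_reward(image, constituent, other_images, other_constituents,
                        margin=0.5):
    """Reward one constituent for beating the negatives by more than `margin`."""
    pos = cosine(image, constituent)
    reward = 0.0
    # Sign is flipped relative to a loss: the matching pair minus the negative.
    for c_neg in other_constituents:
        reward += max(0.0, pos - cosine(image, c_neg) - margin)
    for i_neg in other_images:
        reward += max(0.0, pos - cosine(i_neg, constituent) - margin)
    return reward

# A constituent that matches its image earns a reward; a mismatched one earns none.
matched = concreteness_reward([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], [[0.0, 1.0]])
mismatched = concreteness_reward([1.0, 0.0], [0.0, 1.0], [[0.0, 1.0]], [[1.0, 0.0]])
```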

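  10. Training: textual structure representations, Head-Initial Inductive Bias
    • If the right-hand side contains a function word, merging should be delayed as long as possible.
      – Such a word should not be attached on its own.
      – It should be attached only after it has formed a phrase or clause.
    • [Figure: two examples built from the words "a", "white", "cat", "on the lawn" and "the", "desk", "where …"]
    • Uses a score that is the exact opposite of (⋅,⋅), i.e. one that is high for constituents only weakly related to the image.
    • It measures the abstractness of the right-hand constituent.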
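
The intent of the head-initial bias on slide 10 can be illustrated by discounting the merge reward when the right-hand constituent is abstract. The form of the discount and the weight `lam` below are assumptions made for this sketch; the slide only states that merges with an abstract right-hand side should be delayed.

```python
def head_initial_reward(base_reward, right_abstractness, lam=20.0):
    """Shrink the reward as the right-hand constituent becomes more abstract.

    `right_abstractness` is assumed to be a non-negative score that is high
    for constituents only weakly related to the image.
    """
    if right_abstractness < 0:
        raise ValueError("abstractness must be non-negative")
    return base_reward / (lam * right_abstractness + 1.0)

# A concrete right-hand side keeps the full reward; an abstract one is discounted.
concrete_right = head_initial_reward(1.0, 0.0)
abstract_right = head_initial_reward(1.0, 0.2)
```

With this discount, attaching a bare function word on the right pays little, so the parser is nudged toward first building a larger phrase on the right and merging it afterwards.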
  11. Experiments
    • Data: MSCOCO
      – Split into train / dev / test.
      – The output of Benepar [Kitaev & Klein, 2018] is treated as the gold trees.
    • Benepar's F1 was measured on randomly sampled captions. (Huh…)
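
Comparing induced trees against parser-produced reference trees, as on slide 11, usually comes down to an F1 score over constituent spans. The sketch below assumes trees are nested tuples of word strings and counts every multi-word span, including the whole sentence; the exact evaluation protocol is not stated on the slide, so these choices are assumptions.

```python
def spans(tree, start=0):
    """Return (set of (start, end) spans of multi-word constituents, end index)."""
    if isinstance(tree, str):
        return set(), start + 1
    result = set()
    pos = start
    for child in tree:
        child_spans, pos = spans(child, pos)
        result |= child_spans
    result.add((start, pos))
    return result, pos

def bracket_f1(pred_tree, gold_tree):
    """Unlabeled bracket F1 between a predicted tree and a reference tree."""
    pred, _ = spans(pred_tree)
    gold, _ = spans(gold_tree)
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# The two trees disagree only on how "on the mat" is bracketed.
pred = (("a", "cat"), ("on", ("the", "mat")))
gold = (("a", "cat"), (("on", "the"), "mat"))
f1 = bracket_f1(pred, gold)
```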
  12. Results

  13. Results 2
    • The high correlation between VG-NSL and the concreteness scores produced by Turney et al. (2011) and Brysbaert et al. (2014) supports the argument that the linguistic concept of concreteness can be acquired in an unsupervised way.
    • Compared to PRPN trained on the full training set, VG-NSL and VG-NSL+HI reach comparable performance using only 20% of the data.
    • VG-NSL tends to quickly become more stable as the amount of data increases, while PRPN and ON-LSTM remain less stable.
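  14. Summary
    • Learns sentence tree structure in a fully unsupervised way from image-caption pairs.
    • Training is more efficient and more stable than with methods based on language models.
    • Instead of using the whole image, giving the image its own substructure (e.g., Lu et al.; Wu et al.) and aligning against it might work even better.
    • Open question: how could something like the Head-Initial Inductive Bias introduced here be acquired automatically?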