Learning to compose neural networks for question answering

himkt
July 10, 2016

Andreas, Jacob, et al. "Learning to compose neural networks for question answering." arXiv preprint arXiv:1601.01705 (2016).

This is the presentation material used in a journal club at the University of Tsukuba, Kasuga Area (2016/07/16).

Transcript

  1. NAACL HLT 2016 (Best Paper)
     Paper reading group @ Kasuga Area, Saturday, July 16, 2016
     Read by: himkt
     * All figures in the slides are taken from the paper

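  2. Overview
     • Proposes a model that can handle multiple question answering tasks
       • Images
       • Structured knowledge bases
     • Parses the question and dynamically builds the corresponding network (Dynamic Neural Module Network)
     • Uses reinforcement learning to learn the parameters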
  3. Overview
     [Figure: overview diagram, annotated with parts 1, 2, 3]

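  4. Overview - 1. Network Layout
     • Run a dependency parse on the question (Stanford Dependency Parser)
     • Enumerate the candidate network structures that are possible given the parse
     • Evaluate the conditional probability of each network given the question, and choose the network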
  5. Overview - 2. Module inventory
     [Figure: overview diagram, annotated with parts 1, 2, 3]

  6. Module inventory
     • Six kinds of functions, called modules
     • Each outputs either an Attention or a Label
       • Attention: pixels
       • Label: true/false or lexicon (e.g. “bird”)
     • Each module has "type" constraints on its arguments and output
       • Lookup :: input -> Attention
       • Find :: input -> Attention
       • Relate :: Attention -> Attention
       • And :: Attention* -> Attention
       • Describe :: Attention -> Labels
       • Exists :: Attention -> Labels
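The type constraints listed on this slide can be checked mechanically. The following is a minimal sketch, assuming a layout is encoded as nested Python tuples `(module_name, child, ...)` with the word argument kept in the name, as in `find[bird]`. The `SIGNATURES` table mirrors the slide; the tuple encoding and the `output_type` helper are hypothetical names chosen for illustration, not the paper's implementation.

```python
# Module signatures as listed on the slide: (argument types, output type).
# Lookup and Find read the raw input, so they take no Attention arguments.
SIGNATURES = {
    "lookup":   ([], "Attention"),
    "find":     ([], "Attention"),
    "relate":   (["Attention"], "Attention"),
    "and":      (["Attention*"], "Attention"),
    "describe": (["Attention"], "Labels"),
    "exists":   (["Attention"], "Labels"),
}


def output_type(layout):
    """Return the output type of a layout, or raise TypeError if ill-typed.

    A layout is a tuple (module_name, child_layout, ...). Leaf modules such
    as ("find[bird]",) have no children. This encoding is an assumption.
    """
    name = layout[0].split("[")[0]  # drop the word argument, e.g. "[bird]"
    children = layout[1:]
    arg_types, out_type = SIGNATURES[name]
    child_types = [output_type(child) for child in children]
    if arg_types == ["Attention*"]:
        # Variadic module: one or more arguments, all of type Attention.
        ok = len(child_types) >= 1 and all(t == "Attention" for t in child_types)
    else:
        ok = child_types == arg_types
    if not ok:
        raise TypeError(f"{name} expects {arg_types}, got {child_types}")
    return out_type
```

For example, `output_type(("exists", ("find[state]",)))` is well-typed, while feeding the output of `exists` into `relate` raises `TypeError` because `relate` expects an Attention.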
  7. Attention
     • find :: input -> Attention
     • Outputs a part of the image (a set of pixels)

  8. Overview - 3. Produce an answer
     [Figure: overview diagram, annotated with parts 1, 2, 3]

  9. Produce an answer
     • What color is the bird? -> (describe[color] find[bird]) -> black and white (lexicon)
     • Are there any states? -> (exists find[state]) -> true
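The layouts on this slide are written in a parenthesized prefix notation. As a rough sketch of how such a string could be turned into a tree, the hypothetical helper `parse_layout` below reads `(head child ...)` expressions into nested tuples. The function name and the tuple encoding are assumptions made for illustration and are not taken from the paper.

```python
def parse_layout(text):
    """Parse a layout such as "(describe[color] find[bird])" into tuples.

    A leaf like find[bird] becomes ("find[bird]",). A parenthesized
    expression becomes (head, child, ...).
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return (token,)  # leaf module
        head = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(parse())
        pos += 1  # consume the closing ")"
        return (head, *children)

    tree = parse()
    if pos != len(tokens):
        raise ValueError("trailing tokens after layout")
    return tree
```

With this encoding, `parse_layout("(exists find[state])")` yields a root `exists` node with a single `find[state]` child, which is the shape the answer-producing step walks over.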
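  10. Components
      • Layout model: $p(z \mid x; \theta_l)$
        • Estimates the network structure
      • Execution model: $p_z(y \mid w; \theta_e)$
        • Generates the answer
      • Training
        • Learns the two sets of parameters jointly
        • Reinforcement learning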
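  11. Layout Model
      • The conditional probability is the output of a softmax:
        $p(z_i \mid x; \theta_l) = \dfrac{e^{s(z_i \mid x)}}{\sum_{j=1}^{n} e^{s(z_j \mid x)}}$
      • where $s(z_i \mid x) = a^\top \sigma(B h_q(x) + C f(z_i) + d)$
      • $\theta_l = (a, B, C, d)$ are the parameters
      • $h_q(x)$ is the output of an LSTM
      • $f(z_i)$ is an embedding(?) (feature vector) of $z_i$, the $i$-th candidate network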
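The scoring and softmax steps of the layout model can be sketched in a few lines. This is a minimal pure-Python illustration, assuming the score is a linear readout $a$ of an elementwise nonlinearity applied to $B h_q(x) + C f(z_i) + d$, with the logistic sigmoid used here as a stand-in nonlinearity (an assumption). The helper names `matvec`, `layout_score`, and `layout_distribution` are hypothetical, and the vectors are plain lists rather than LSTM outputs.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def layout_score(a, B, C, d, h_q, f_z):
    """Score one candidate layout: a^T sigma(B h_q + C f_z + d)."""
    pre = [bh + cf + dk for bh, cf, dk in zip(matvec(B, h_q), matvec(C, f_z), d)]
    return sum(a_k * sigmoid(p) for a_k, p in zip(a, pre))


def layout_distribution(a, B, C, d, h_q, candidate_features):
    """Softmax over the scores of all candidate layouts z_1..z_n."""
    scores = [layout_score(a, B, C, d, h_q, f) for f in candidate_features]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the question encoding `h_q` is shared across candidates, only the feature vector of each candidate changes its score, so candidates with identical features receive identical probability.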
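  12. Execution Model
      • The model that generates the answer
      • When a module's own inputs are known, its output can be written as $[\![ m(h^1, h^2) ]\!]$
      • $p_z(y \mid w) = ([\![ z ]\!]_w)_y$
      • Example layout: and(find, relate(lookup))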
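To make the execution step concrete, the sketch below evaluates a layout bottom-up and reads the answer probability from the root's output. It is a toy under strong assumptions: the modules in the paper are neural networks, whereas `find` and `exists` here are hand-written stand-ins, the world is a hypothetical dictionary from entity names to tag sets, and layouts are nested tuples `(module_name, child, ...)`.

```python
def find(world, word):
    # Toy stand-in: attend fully to entities tagged with `word`.
    return {entity: (1.0 if word in tags else 0.0) for entity, tags in world.items()}


def exists(attention):
    # Toy stand-in: "true" is as likely as the strongest attention weight.
    p_true = max(attention.values(), default=0.0)
    return {"true": p_true, "false": 1.0 - p_true}


def evaluate(layout, world):
    """Evaluate a layout on a world: children first, then the module itself."""
    head, children = layout[0], layout[1:]
    name, _, arg = head.partition("[")
    arg = arg.rstrip("]")
    inputs = [evaluate(child, world) for child in children]
    if name == "find":
        return find(world, arg)
    if name == "exists":
        return exists(*inputs)
    raise ValueError(f"no toy implementation for module {name!r}")


def answer_probability(layout, world, y):
    """Index the root module's label distribution by the answer y."""
    return evaluate(layout, world)[y]
```

With `world = {"Georgia": {"state"}, "Atlanta": {"city"}}`, the layout `("exists", ("find[state]",))` assigns all of its probability to the answer "true" in this toy setting.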
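  13. Training
      • Reinforcement learning
      • Sample $z$ from $p(z \mid x; \theta_l)$
      • Once the network is fixed, update $\theta_e$ by directly maximizing $\log p(y \mid z, w; \theta_e)$
      • Update $\theta_l$ with the policy gradient method
      • Gradient: $\nabla J(\theta_l) = \mathbb{E}[\nabla \log p(z \mid x; \theta_l) \cdot r]$ ($r$ is the reward)
      • With the log-likelihood as the reward: $\nabla J(\theta_l) = \mathbb{E}[\nabla \log p(z \mid x; \theta_l) \cdot \log p(y \mid z, w; \theta_e)]$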
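The policy gradient update for the layout parameters can be illustrated with a small REINFORCE-style sketch. It simplifies heavily: the "parameters" are just one logit per candidate layout rather than the weights of a scoring network, and `reward_fn` is a hypothetical callback standing in for the reward that the execution model would provide. For a softmax over logits, the derivative of $\log p(z)$ with respect to logit $k$ is $\mathbb{1}[k = z] - p_k$.

```python
import math
import random


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, lr, rng):
    """One single-sample policy-gradient update of the layout logits.

    Samples z from softmax(logits), observes reward r = reward_fn(z), and
    moves each logit k by lr * r * (1[k == z] - p_k).
    """
    probs = softmax(logits)
    z = rng.choices(range(len(logits)), weights=probs)[0]
    r = reward_fn(z)
    return [
        v + lr * r * ((1.0 if k == z else 0.0) - probs[k])
        for k, v in enumerate(logits)
    ]
```

Repeating `reinforce_step` with a reward that favors one candidate shifts probability mass toward that candidate, which is the behavior the sampled gradient estimate is meant to produce.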
  14. Experimental result
      • State of the art on VisualQA (Table 1) and GeoQA (Table 2)
        • VisualQA: images
        • GeoQA: structured domains
      • Demonstrates that the model can handle multiple question answering tasks