Recovering Dependency Information from N-gram Statistics (N-gram統計量からの係り受け情報の復元)

Yuya Unno
September 11, 2011


Transcript

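  1. Idea: if most dependency relations hold between adjacent words, the dependencies should be recoverable from adjacency frequency information
     • Adjacent words are likely to be in a dependency relation
       • More than half of all dependencies are known to hold between adjacent words
       • Adjacency frequency looks like a useful indicator of how likely one word is to depend on another
     • Modifiers are easily omitted
       • Example: 並列・分散・処理 ("parallel / distributed / processing")
       • The expression 並列・処理 ("parallel / processing"), in which the modifier 分散 ("distributed") is omitted, also occurs frequently
     • Could dependency likelihood be expressed using N-gram statistics alone?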
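  2. This work targets long compound nouns
     • Compound nouns are especially common in formal documents, and there is also a need to search for parts of them
     • No gold-standard data exists for dependency relations within compound nouns
     • Medical terms
       • 大腿骨頸部内側骨折 (medial femoral neck fracture)
       • 閉塞性動脈硬化症 (arteriosclerosis obliterans)
     • Political terms
       • 各府省情報化統括責任者補佐官等連絡会議 (liaison conference of assistants to the ministries' chief information officers)
       • 国家安全保障問題担当大統領補佐官 (Assistant to the President for National Security Affairs)
     • Other
       • 記録的短時間大雨情報 (record short-duration heavy rain information)
       • 戦略的創造研究推進事業 (strategic creative research promotion program)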
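  3. Eisner algorithm [Eisner96]
     • The score S(T) of a dependency tree T is expressed as a sum of local scores
       • $S(T) = \sum_{(m, h) \in T} s(m, h)$
       • (m, h) ranges over all modifier-head pairs in T
     • The tree $T_{opt}$ that maximizes S(T) can be found in $O(n^3)$ time
     [Figure: the score of a dependency tree over the words A B C D E (plus root) shown as the sum of the scores of its five individual arcs]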
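The dynamic program behind the $O(n^3)$ bound can be sketched as follows. This is a minimal, textbook-style version of first-order projective parsing in the spirit of [Eisner96], not the author's implementation. The function name `eisner`, the artificial root at index 0, and the toy scores are assumptions made for illustration, and this sketch does not restrict the root to a single child.

```python
NEG_INF = float("-inf")


def eisner(n, s):
    """Return (best_score, heads) for the best projective tree over tokens 1..n.

    s(m, h) is the local score for modifier m attaching to head h.
    Token 0 is an artificial root; heads[m] is the head of token m.
    """
    size = n + 1

    def arc(m, h):
        # The artificial root never takes a head.
        return NEG_INF if m == 0 else s(m, h)

    # C[i][j][d][c] is the best score of the span i..j.
    # d = 0: the head is j; d = 1: the head is i.
    # c = 0: incomplete span (the arc between i and j was just added); c = 1: complete.
    C = [[[[NEG_INF, NEG_INF], [NEG_INF, NEG_INF]] for _ in range(size)]
         for _ in range(size)]
    B = [[[[-1, -1], [-1, -1]] for _ in range(size)] for _ in range(size)]
    for i in range(size):
        C[i][i][0][1] = C[i][i][1][1] = 0.0

    for width in range(1, size):
        for i in range(size - width):
            j = i + width
            # Incomplete spans: join two complete halves, then add one arc.
            best, r = max((C[i][k][1][1] + C[k + 1][j][0][1], k) for k in range(i, j))
            C[i][j][0][0] = best + arc(i, j)  # i modifies j
            C[i][j][1][0] = best + arc(j, i)  # j modifies i
            B[i][j][0][0] = B[i][j][1][0] = r
            # Complete spans: extend an incomplete span with a complete one.
            C[i][j][0][1], B[i][j][0][1] = max(
                (C[i][k][0][1] + C[k][j][0][0], k) for k in range(i, j))
            C[i][j][1][1], B[i][j][1][1] = max(
                (C[i][k][1][0] + C[k][j][1][1], k) for k in range(i + 1, j + 1))

    heads = [-1] * size

    def backtrack(i, j, d, c):
        if i == j:
            return
        r = B[i][j][d][c]
        if c == 0:
            if d == 0:
                heads[i] = j
            else:
                heads[j] = i
            backtrack(i, r, 1, 1)
            backtrack(r + 1, j, 0, 1)
        elif d == 0:
            backtrack(i, r, 0, 1)
            backtrack(r, j, 0, 0)
        else:
            backtrack(i, r, 1, 0)
            backtrack(r, j, 1, 1)

    backtrack(0, n, 1, 1)
    return C[0][n][1][1], heads


# Toy scores (made up for illustration): token 1 -> 2 -> 3 -> root.
toy = {(1, 2): 5.0, (2, 3): 5.0, (3, 0): 5.0}
best, heads = eisner(3, lambda m, h: toy.get((m, h), 0.0))
print(best, heads)  # 15.0 [-1, 2, 3, 0]
```

The three nested loops over span width, start position, and split point give the cubic running time. Any local score function can be plugged in as `s`.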
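  4. Design the score function using pointwise mutual information (PMI) estimated from Google N-gram data
     • Uses the frequencies in the Google Japanese N-gram data
     • #(mh) is the bigram frequency of m, h
     • #(m) is the unigram frequency of m
     • The number of s(m, h) terms summed in Eisner's formula is the same for every T, so the const term in the score can be ignored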
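The score formula itself was shown on the slide as an image and does not appear in this transcript. Assuming the standard definition of PMI with probabilities estimated by relative frequency, it would plausibly take the form below. The totals $N_1$ and $N_2$ (numbers of unigram and bigram tokens) are symbols introduced here, not taken from the slides.

```latex
\begin{align*}
s(m, h) &= \log \frac{p(m, h)}{p(m)\,p(h)}
         = \log \frac{\#(mh) / N_2}{\bigl(\#(m) / N_1\bigr)\bigl(\#(h) / N_1\bigr)} \\
        &= \log \#(mh) - \log \#(m) - \log \#(h) + \mathrm{const},
        \qquad \mathrm{const} = \log \frac{N_1^2}{N_2}, \\
S(T)    &= \sum_{(m, h) \in T} s(m, h)
         = \sum_{(m, h) \in T} \bigl[ \log \#(mh) - \log \#(m) - \log \#(h) \bigr]
           + |T| \cdot \mathrm{const}.
\end{align*}
```

Because the number of arcs $|T|$ is the same for every tree over the same words, the last term shifts every candidate tree's score by the same amount and leaves the maximizing tree unchanged.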
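  5. Results: works better than expected, but with some odd spots
     [Figure: dependency trees for four segmented compounds, with a 成功 (success) / 失敗 (failure) legend]
     • 戦略・的・創造・研究・推進・事業
     • 記録・的・短時間・大雨・情報
     • 大腿・骨・頸部・内側・骨折
     • 国家・安全・保障・問題・担当・大統領・補佐・官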
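  6. Sorting the errors by concrete example
     1. Cases that produce an unnatural modifier
        • The modifier starts at a suffix expression that clearly cannot be split off
        • 〜性, 〜県, 〜的, 〜系, 〜検討, 〜補償, ...
     2. Cases where a word attaches to a modifier
        • The word attaches to an obvious modifier such as a prefix expression
        • 大〜, 超〜, 準〜, 特別〜, 簡易〜, ...
     3. Structurally unnatural cases
        • When several candidates are plausible as pairs, the resulting structure becomes unnatural
        • Example: 超・大・規模・分散・並列・処理 ("ultra / large / scale / distributed / parallel / processing")
        • 超・大, 超・分散, and 超・並列 are each natural as pairs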
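  7. Related work: probabilistic word segmentation [工藤05][岡野原+06]
     • To be robust against word segmentation errors, the word segmentation is output probabilistically
     • Whether a word is present is scored by the product of the probabilities that each position is or is not a word boundary
     • Search results become more robust
     [Figure: word-boundary values for the string 確率的単語分割コーパス ("probabilistically segmented corpus"), 12 values per row in string order]
     (1) Morphological analysis result:    1    0    1    1    0    1    0    1    0    0    0    1
     (2) Conventional SSC (α = 0.95):      0.95 0.05 0.95 0.95 0.05 0.95 0.05 0.95 0.05 0.05 0.05 0.95
     (3) Probabilistic word segmentation:  0.99 0.01 0.99 0.89 0.18 0.85 0.19 0.95 0.0  0.0  0.0  0.99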
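The product-of-probabilities scoring can be illustrated with a short sketch. It assumes that the 12 values in the figure's probabilistic row are boundary probabilities at the 12 positions of the 11-character string (before the first character, between each pair of characters, and after the last). That alignment is an inference from the figure residue, and the function name `word_probability` is hypothetical.

```python
from math import prod


def word_probability(boundary_probs, start, end):
    """Probability that text[start:end] is exactly one word.

    boundary_probs[k] is the probability of a word boundary immediately
    before character k (k == len(text) means the end of the string).
    The word exists if there is a boundary at both ends and none inside.
    """
    no_boundary_inside = prod(1.0 - boundary_probs[k] for k in range(start + 1, end))
    return boundary_probs[start] * no_boundary_inside * boundary_probs[end]


text = "確率的単語分割コーパス"
# Row (3) of the figure; the position alignment is an assumption.
probs = [0.99, 0.01, 0.99, 0.89, 0.18, 0.85, 0.19, 0.95, 0.0, 0.0, 0.0, 0.99]

for start, end in [(3, 5), (3, 7), (7, 11)]:
    print(text[start:end], round(word_probability(probs, start, end), 4))
```

Under this reading, 単語 scores 0.89 × (1 − 0.18) × 0.85, while the longer candidate 単語分割 still receives a small nonzero score, which is what lets a search match segmentations that a single hard analysis would miss.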
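  8. References
     • [Eisner96] J. M. Eisner. Three New Probabilistic Models for Dependency Parsing: An Exploration. COLING '96.
     • [工藤05] Taku Kudo (工藤拓). 形態素周辺確率を用いた分かち書きの一般化とその応用 (Generalizing word segmentation using morpheme marginal probabilities, and its applications). 言語処理学会全国大会 (national meeting of the Association for Natural Language Processing) '05.
     • [岡野原+06] Daisuke Okanohara (岡野原大輔), Taku Kudo (工藤拓), Shinsuke Mori (森信介). 形態素周辺確率を用いた確率的単語分割コーパスの構築とその応用 (Building a probabilistically segmented corpus using morpheme marginal probabilities, and its applications). NLP若手の会シンポジウム (NLP Young Researchers' Symposium) '06.
     • [Zhou+11] G. Zhou, J. Zhao, K. Liu, L. Cai. Exploiting Web-Derived Selectional Preference to Improve Statistical Dependency Parsing. ACL '11.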