Named Entity Recognition When Complete Annotations Are Unavailable

Koga Kobayashi
September 18, 2019

Transcript

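  1. Data available for training NER: annotated corpora
     • The kind of dataset generally used for NER tasks
     • Every word carries its corresponding label
     Drawbacks
     • Has to be rebuilt for each domain
     • Annotation is expensive
     Example (token label):
     Donald B-PER, John I-PER, Trump E-PER, is O, president O, of O, the O, US S-LOC, the O
     Goal: reduce, or eliminate, the cost of building an annotated corpus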
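
The BIOES labels in the example above can be produced from entity spans. The following is a minimal sketch, not taken from the deck: the function name `spans_to_bioes` and the `(start, end, type)` span format with inclusive indices are assumptions made for illustration.

```python
from typing import List, Tuple

def spans_to_bioes(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert entity spans (start, end inclusive, type) into BIOES tags."""
    tags = ["O"] * n_tokens  # every token starts as "outside"
    for start, end, etype in spans:
        if start == end:
            tags[start] = f"S-{etype}"       # single-token entity
        else:
            tags[start] = f"B-{etype}"       # beginning
            for i in range(start + 1, end):
                tags[i] = f"I-{etype}"       # inside
            tags[end] = f"E-{etype}"         # end
    return tags

tokens = ["Donald", "John", "Trump", "is", "president", "of", "the", "US"]
tags = spans_to_bioes(len(tokens), [(0, 2, "PER"), (7, 7, "LOC")])
for token, tag in zip(tokens, tags):
    print(token, tag)
```

Every token needs a decision, including the many `O` tokens, which is part of why a fully annotated corpus is costly to build.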
  2. Data available for training NER: partially annotated corpora
     • A corpus in which only some of the words are annotated
     • Annotators do not have to label words they are unsure about
     Drawbacks
     • Training needs some extra care (a standard CRF cannot be trained on it)
     Example of a partially annotated corpus (token label, "-" = no annotation):
     Donald -, John B-PER, Trump E-PER, is -, president -, of O, the -, US -, the O
     References
     - Eraldo R. Fernandes and Ulf Brefeld. 2011. Learning from partially annotated sequences. In Proceedings of ECML-KDD.
     - Andrew Carlson, Scott Gaffney, and Flavian Vasile. 2009. Learning a named entity tagger from gazetteers with the partial perceptron. In Proceedings of AAAI Spring Symposium: Learning by Reading and Learning to Read.
     - Zhanming Jie, Pengjun Xie, Wei Lu, Ruixue Ding, and Linlin Li. 2019. Better Modeling of Incomplete Annotations for Named Entity Recognition. In Proceedings of NAACL.
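
One way to see why a standard CRF does not apply directly is to count how many complete label sequences are consistent with a partial annotation. The sketch below is illustrative only: the label inventory `LABELS`, the use of `None` for an unannotated token, and the helper `candidate_sets` are assumptions, and the count ignores BIOES transition constraints.

```python
from typing import List, Optional, Set

# Hypothetical label inventory for a PER-only tagger.
LABELS = ["O", "B-PER", "I-PER", "E-PER", "S-PER"]

def candidate_sets(partial: List[Optional[str]], labels: List[str]) -> List[Set[str]]:
    """Annotated tokens keep their single label; unannotated ones may take any label."""
    return [{tag} if tag is not None else set(labels) for tag in partial]

# Donald -, John B-PER, Trump E-PER, is -, president -, of O, the -, US -, the O
partial = [None, "B-PER", "E-PER", None, None, "O", None, None, "O"]
cands = candidate_sets(partial, LABELS)

n_paths = 1
for c in cands:
    n_paths *= len(c)  # independent choices per token (no transition constraints)
print(n_paths)
```

A standard CRF maximizes the probability of exactly one gold sequence, whereas here many sequences are compatible with what the annotator provided.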
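  3. Data available for training NER: dictionaries
     • Dictionaries in the usual sense, although the definitions and meanings of the entries are rarely used
     • In many fields they already exist
     • Examples: dictionaries of personal names, pharmaceutical term dictionaries, substance and materials databases
     NER by dictionary matching runs into problems:
     • Words that are not in the dictionary cannot be detected
     • Named entities can receive the wrong label
     Example of dictionary matching (Person Dictionary: John, Michael):
     Donald -, John S-PER, Trump -, is -, president -, of -, the -, US -, the -
     Distantly supervised methods, which use a dictionary plus a raw corpus, have emerged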
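
The dictionary-matching example above can be reproduced with a few lines. This is a deliberately naive sketch under stated assumptions: the function `dictionary_match`, single-token matching, and the `"-"` placeholder for unlabeled tokens are illustrative choices, not the deck's implementation.

```python
from typing import List, Set

def dictionary_match(tokens: List[str], dictionary: Set[str], etype: str) -> List[str]:
    """Tag a token as a single-token entity if it appears in the dictionary."""
    return [f"S-{etype}" if tok in dictionary else "-" for tok in tokens]

person_dict = {"John", "Michael"}
tokens = ["Donald", "John", "Trump", "is", "president", "of", "the", "US", "the"]
tags = dictionary_match(tokens, person_dict, "PER")
for token, tag in zip(tokens, tags):
    print(token, tag)
```

Both problems from the slide show up: "Donald" and "Trump" are missed because they are not in the dictionary, and "John" is labeled as a complete entity (`S-PER`) although it is only the middle of one.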
  4. Papers related to this area
     Partially annotated corpora
     • Better Modeling of Incomplete Annotations for Named Entity Recognition
     Distantly Supervised NER
     • Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning
     • Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning
     • Learning Named Entity Tagger using Domain-Specific Dictionary
     • Distant supervision for relation extraction without labeled data
     • Distant supervision for relation extraction via piecewise convolutional neural networks
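  5. Better Modeling of Incomplete Annotations for Named Entity Recognition
     Zhanming Jie, Pengjun Xie, Wei Lu, Ruixue Ding, Linlin Li. NAACL 2019
     Proposes a model that performs NER from data with incomplete annotations
     Background
     • When assuming incomplete annotations, removing labels at the word level is not realistic (A.1)
     • Annotators never explicitly annotate the O tag (A.1, A.2)
     The authors argue that the label loss that can occur in real use looks like A.3
     Reimplementation by the presenter: https://github.com/kajyuuen/Incomplete-NER-Methods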
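  6. Better Modeling of Incomplete Annotations for Named Entity Recognition (proposed method)
     (Figure: the standard CRF and the CRF of the proposed method)
     Proposed method: extends the CRF into a model that takes into account the label combinations a sequence could take
     Estimating the probability distribution q
     • Hard: assign probability 1 to the most likely label sequence
     • Soft: assign a probability to every possible label sequence
     These probability distributions are estimated by k-fold cross-validation
     (Figure: CRF-PA compared with the proposed method; labels "q" and "q = 1")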
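
Training on partial annotations typically replaces the single gold path of a CRF with a sum over all paths compatible with the annotation. The sketch below computes that marginal negative log-likelihood with a masked forward algorithm in plain Python. It is an illustration of the CRF-PA style objective under assumptions: the names `forward_log_score` and `partial_crf_nll` and the `allowed` mask format are invented here, and the weighting by q described on the slide is not modeled.

```python
import math
from typing import List

NEG_INF = float("-inf")

def logsumexp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_score(emit: List[List[float]], trans: List[List[float]],
                      allowed: List[List[bool]]) -> float:
    """Log of the summed exp(path score) over paths that use only allowed labels.

    emit[t][k]  : emission score of label k at position t
    trans[j][k] : transition score from label j to label k
    allowed[t][k]: whether label k is permitted at position t
    """
    n_labels = len(trans)
    alpha = [emit[0][k] if allowed[0][k] else NEG_INF for k in range(n_labels)]
    for t in range(1, len(emit)):
        alpha = [
            logsumexp([alpha[j] + trans[j][k] for j in range(n_labels)]) + emit[t][k]
            if allowed[t][k] else NEG_INF
            for k in range(n_labels)
        ]
    return logsumexp(alpha)

def partial_crf_nll(emit, trans, allowed) -> float:
    """-log of the total probability of all label paths compatible with `allowed`."""
    everything = [[True] * len(trans) for _ in emit]
    log_z = forward_log_score(emit, trans, everything)   # partition function
    log_compatible = forward_log_score(emit, trans, allowed)
    return log_z - log_compatible

# Toy case: 3 tokens, 2 labels, all scores zero, first token fixed to label 0.
emit = [[0.0, 0.0] for _ in range(3)]
trans = [[0.0, 0.0], [0.0, 0.0]]
allowed = [[True, False], [True, True], [True, True]]
nll = partial_crf_nll(emit, trans, allowed)
print(nll)
```

With all scores equal, 4 of the 8 possible paths are compatible, so the loss is log 2. When every position is fully annotated the mask selects a single path and the expression reduces to the usual CRF loss.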
  7. Better Modeling of Incomplete Annotations for Named Entity Recognition (results)
     Experimental setting: 50% of the labels are dropped and all O tags are removed
     • Performance holds up reasonably well even against training with complete annotations
     • Wins by a wide margin over Simple, the baseline that treats unlabeled positions as O
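
A setting like the one above can be simulated from a fully annotated corpus. The following sketch is an assumption about how such a corruption might be done, not the paper's or the presenter's code: it drops labels at the entity level, keeps a fixed fraction of entities chosen with a seeded random generator, and marks every other position, including all `O` tokens, as unannotated (`None`).

```python
import random
from typing import List, Optional, Tuple

def entity_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end inclusive, of BIOES entities."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("S-"):
            spans.append((i, i))
        elif tag.startswith("B-"):
            start = i
        elif tag.startswith("E-") and start is not None:
            spans.append((start, i))
            start = None
    return spans

def drop_labels(tags: List[str], keep_ratio: float = 0.5,
                seed: int = 0) -> List[Optional[str]]:
    """Keep a fraction of whole entities; everything else becomes None."""
    rng = random.Random(seed)           # fixed seed so the corruption is repeatable
    spans = entity_spans(tags)
    kept = rng.sample(spans, round(len(spans) * keep_ratio))
    out: List[Optional[str]] = [None] * len(tags)
    for start, end in kept:
        out[start:end + 1] = tags[start:end + 1]
    return out

full = ["B-PER", "I-PER", "E-PER", "O", "O", "O", "O", "S-LOC"]
partial = drop_labels(full, keep_ratio=0.5, seed=0)
print(partial)
```

Fixing the seed is one simple way to make the dropped labels repeatable across runs.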
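  8. Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning
     Minlong Peng, Xiaoyu Xing, Qi Zhang, Jinlan Fu, Xuanjing Huang. ACL 2019
     Performs NER using only a dictionary and raw text, by means of PU learning
     Proposed method
     • Annotate the raw corpus from the dictionary using longest matching
     • Train with the labeled data as Positive and everything else as Unlabeled
     • Not using a labeling scheme such as BIO or BIOES reduces the annotation errors introduced by the dictionary
     • Build one PU classifier per class and adopt the class with the highest predicted probability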
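
To make the Positive/Unlabeled idea concrete, here is a sketch of a generic non-negative PU risk estimator with an absolute-error loss, followed by the per-class argmax described above. This is a standard formulation from the PU learning literature swapped in for illustration; it is not confirmed to be the exact loss of the paper, and the names `nn_pu_risk`, `predict_type`, and the class prior value are assumptions.

```python
from typing import Dict, Sequence

def nn_pu_risk(pos_scores: Sequence[float], unl_scores: Sequence[float],
               prior: float) -> float:
    """Non-negative PU risk for classifier outputs f(x) in [0, 1].

    pos_scores: f(x) on dictionary-labeled (positive) tokens
    unl_scores: f(x) on unlabeled tokens
    prior     : assumed fraction of positives among all tokens
    """
    # Absolute-error loss: predicting "positive" costs 1 - f(x), "negative" costs f(x).
    risk_pos_as_pos = sum(1.0 - s for s in pos_scores) / len(pos_scores)
    risk_pos_as_neg = sum(pos_scores) / len(pos_scores)
    risk_unl_as_neg = sum(unl_scores) / len(unl_scores)
    # Unlabeled data is a mixture of positives and negatives; subtract the
    # estimated positive part and clip at zero to keep the estimate non-negative.
    negative_part = max(0.0, risk_unl_as_neg - prior * risk_pos_as_neg)
    return prior * risk_pos_as_pos + negative_part

def predict_type(class_scores: Dict[str, float]) -> str:
    """Pick the entity class whose PU classifier gives the highest score."""
    return max(class_scores, key=class_scores.get)

risk = nn_pu_risk([0.5, 0.5], [0.5, 0.5], prior=0.5)
best = predict_type({"PER": 0.2, "LOC": 0.7, "ORG": 0.4})
print(risk, best)
```

The key point is that unlabeled tokens are never treated as confirmed negatives; the class prior corrects for the entities hidden among them.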
  9. Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning (results)
     (Results table; column groups: annotated corpus / dictionary and raw corpus)
     Performance improved substantially compared with dictionary matching
  10. Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning
     Yaosheng Yang, Wenliang Chen, Zhenghua Li, Zhengqiu He, Min Zhang. COLING 2018
     NER that combines distant supervision with reinforcement learning
     Background
     • Training data created by dictionary matching (distant supervision) suffers from incomplete and noisy labels
     Proposed method
     • Use reinforcement learning to remove noisy labels
     • Use CRF-PA so that training is possible even on incomplete sequences
  11. Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning (results)
     Achieves higher extraction performance than dictionary matching or LSTM-CRF-PA alone
     (Results table legend: one symbol denotes a smallish annotated corpus and another the training data created by distant supervision; of the two symbols only ℋ appears in the transcript)
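  12. Survey: impressions
     • Models that learn from partially annotated corpora came up frequently
       • They are used or cited in a large share of distantly supervised NER research
     • The "what to do about the dataset" problem
       • Very often the authors build their own
       • Randomly dropping labels from CoNLL2003 is also seen
       • Either way, it is hard to drop labels in a reproducible manner
     • The area does feel like it is gaining popularity
       • Many papers use complex task settings such as nested NER and DS NER
       • There seem to be plenty of people who would like to do NER with a dictionary alone
       • Is the ordinary sequence labeling task reaching its limits?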