Introduction to Survey Papers on Named Entity Recognition and a Case Study of Extracting Company Names from News with a Named Entity Recognizer / Introduction of NER Survey Paper and Practical Example of Organization Extraction

Sansan
January 30, 2019

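■ Event
[Held in Kyoto] 1st SIL Study Session: Natural Language Processing
sansan.connpass.com/event/116853/

■ Talk overview
Title:
Introduction to survey papers on named entity recognition and a case study of extracting company names from news with a named entity recognizer

Speaker:
高橋寛治, DSOC R&D Group, Sansan, Inc.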

▼Sansan Builders Box
https://buildersbox.corp-sansan.com/

Transcript

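  1. Introduction to survey papers on named entity recognition and a case study of extracting company names from news with a named entity recognizer

    1st SIL Study Session: Natural Language Processing (2019/01/30)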
  2. Nadeau, David and Sekine, Satoshi. A survey of named entity recognition and classification. Lingvisticae Investigationes, Vol. 30, No. 1, pp. 3-26, 2007.

    Yadav, Vikas and Bethard, Steven. A Survey on Recent Advances in Named Entity Recognition from Deep Learning models. COLING, pp. 2145-2158, 2018.
  5. The two surveys: 2007 and 2018. Terms on the slide: SNS; Precision, Recall, F1.

  6. A survey of named entity recognition and classification (2007)

  7. 1991 (Lisa F. Rau 1991); 1996: MUC-6

  9. SVM, CRF; WordNet (E. Alfonseca et al. 2002); (Shinyama et al. 2004)

  10. IOB tags: I-Inside, O-Outside, B-Beginning

    Example from CoNLL2003 (columns: word, part-of-speech tag, chunk tag, named entity tag):

        U.N.      NNP  I-NP  I-ORG
        official  NN   I-NP  O
        Ekeus     NNP  I-NP  I-PER
        heads     VBZ  I-VP  O
        for       IN   I-PP  O
        Baghdad   NNP  I-NP  I-LOC
        .         .    O     O

  11. (Sansan)

  12. MUC (the Sixth Message Understanding Conference): Precision, Recall, F-measure

  14. A Survey on Recent Advances in Named Entity Recognition from Deep Learning models (2018)

  15. (Collobert et al. 2008) → 2011

  16. LSTM; Label; CRF

  18. CoNLL: 90%

  19. n-gram

  20. CoNLL2002, CoNLL2003; State-of-the-art

  21. NN: word + character + affix

  22. IOB2; State-of-the-art embedding

  24. Web

  29. (1991), (2018), Sansan
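
Slide 10 introduces IOB tags with a CoNLL2003 sentence, and slide 12 names Precision, Recall, and F-measure as the evaluation metrics. The sketch below ties the two together: it turns a tag sequence into entity spans and scores a prediction against the gold tags. It is an illustration only, not code from the talk. The function names `iob_to_spans` and `precision_recall_f1` and the "predicted" tag sequence are hypothetical, and exact-match scoring (span boundaries and entity type must both agree) is an assumption, since the slides do not say which matching rule was used.

```python
from typing import List, Optional, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def iob_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from IOB tags (accepts both IOB1 and IOB2)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    current: Optional[str] = None
    # A sentinel "O" at the end closes an entity that runs to the last token.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, etype = tag.partition("-")
        continues = prefix == "I" and etype == current
        if current is not None and not continues:
            spans.add((start, i, current))
            start, current = None, None
        if prefix in ("B", "I") and current is None:
            start, current = i, etype
    return spans


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level scores under exact match (an assumed matching rule)."""
    gold_spans, pred_spans = iob_to_spans(gold), iob_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Named entity column of the CoNLL2003 sentence on slide 10:
# U.N. official Ekeus heads for Baghdad .
gold = ["I-ORG", "O", "I-PER", "O", "O", "I-LOC", "O"]
# Hypothetical system output that misses the person name "Ekeus".
pred = ["I-ORG", "O", "O", "O", "O", "I-LOC", "O"]

p, r, f = precision_recall_f1(gold, pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this made-up prediction both predicted entities are correct but one of the three gold entities is missed, so precision stays at 1.00 while recall drops to 0.67. Scoring whole entities instead of individual tokens keeps the large number of O tags from inflating the result.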