Introduction to Applied Case Studies Related to Word Embeddings / Case Study in Word Embedding

Sansan
January 30, 2019

■Event
[Held in Kyoto] 1st SIL Study Session: Natural Language Processing
https://sansan.connpass.com/event/116853/

■Talk overview
Title:
Introduction to Applied Case Studies Related to Word Embeddings

Speaker:
奥田裕樹 (Yuki Okuda), Researcher, DSOC R&D

▼Sansan Builders Box
https://buildersbox.corp-sansan.com/

Transcript

  1. SIL
  2. None
  3. Character, Word, Clause, Sentence, Document; Sansan
  4. A La Carte Embedding, EmbedRank

  5. A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

    Sansan Advent Calendar, https://yag-ays.github.io/project/alacarte/
  6. n-gram, A La Carte Embedding

  7. None

  8. SGNS, CBOW
  9. None

  10. None

  11. 2010, 2009, 2012, 2008, 2011, 2013, 2007, 2006, 2014, 2011, 2006; PSM, DDR, TV; bigram
  12. A La Carte Embedding, on-the-fly, skip-thought
  13. Simple Unsupervised Keyphrase Extraction using Sentence Embeddings

    Maximal Marginal Relevance; EmbedRank, EmbedRank++
    https://github.com/swisscom/ai-research-keyphrase-extraction
  14. EmbedRank

    Sentence Embedding: sent2vec, doc2vec
    • Pagliardini et al., 2017, “Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features.”
    • Lau and Baldwin, 2016, “An empirical evaluation of doc2vec with practical insights into document embedding generation.”
  15. naïve, Maximal Marginal Relevance
  16. MMR: topological shape, topological shapes
  17. PositionRank https://github.com/ymym3412/position-rank ACL2018, PageRank

    termextract https://github.com/kanjirz50/termextract
  18. EmbedRank, PositionRank, termextract

    https://www.jstage.jst.go.jp/article/pjsai/JSAI2017/0/JSAI2017_1J14/_article/-char/ja/
  19. https://ja.wikinews.org/wiki/

    EmbedRank, PositionRank, termextract
  20. None
  21. EmbedRank, MMR, sentence embedding
  22. A La Carte Embedding, EmbedRank
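
The slides on EmbedRank name Maximal Marginal Relevance (MMR) as the step that diversifies the extracted keyphrases, but the transcript does not preserve how it is computed. The following is a minimal sketch of greedy MMR selection, not the speaker's or the EmbedRank authors' code. It assumes each candidate phrase and the document are already embedded as plain lists of floats (for example by sent2vec or doc2vec). The names `cosine`, `mmr_select`, and the parameter `lam` are hypothetical, and the similarity normalization used in the EmbedRank++ paper is omitted for brevity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr_select(doc_vec, cand_vecs, top_n, lam=0.5):
    """Greedily pick up to top_n candidate indices by Maximal Marginal Relevance.

    Each step scores a remaining candidate as
        lam * sim(candidate, document) - (1 - lam) * max sim(candidate, already selected)
    so lam=1.0 ranks by relevance only, and smaller lam penalizes near-duplicates more.
    """
    relevance = [cosine(c, doc_vec) for c in cand_vecs]
    selected = []
    remaining = list(range(len(cand_vecs)))

    while remaining and len(selected) < top_n:
        def score(i):
            # Redundancy is the similarity to the closest already-selected phrase.
            redundancy = max(
                (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    # Toy 2-D vectors (made up for illustration): candidates 0 and 1 are near-duplicates.
    doc = [1.0, 1.0]
    cands = [[1.0, 0.8], [1.0, 0.79], [0.2, 1.0]]
    print(mmr_select(doc, cands, top_n=2, lam=1.0))
    print(mmr_select(doc, cands, top_n=2, lam=0.5))
```

With `lam=1.0` the selection reduces to ranking candidates by similarity to the document, which corresponds to the naïve variant mentioned on the slides. Lowering `lam` trades some relevance for diversity, so that near-identical phrases such as "topological shape" and "topological shapes" are less likely to both be returned.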