Introduction to Applications of Word Embeddings / Case Study in Word Embedding

Sansan DSOC
January 30, 2019

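■ Event
[Held in Kyoto] 1st SIL Study Session: Natural Language Processing
https://sansan.connpass.com/event/116853/

■ Talk Overview
Title:
Introduction to Applications of Word Embeddings

Speaker:
Yuki Okuda (奥田裕樹), Researcher, DSOC R&D

▼Sansan DSOC
https://sansan-dsoc.com/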

Transcript

  1. A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

    Sansan Advent Calendar: https://yag-ays.github.io/project/alacarte/
  2. [Slide text not recoverable; only a run of years (2006 to 2014) and the fragments "PSM", "DDR", "TV", and "bigram" survive.]
  3. A La Carte Embedding: on-the-fly; skip-thought [remaining slide text not recoverable]
  4. Simple Unsupervised Keyphrase Extraction using Sentence Embeddings

    EmbedRank and EmbedRank++; Maximal Marginal Relevance. https://github.com/swisscom/ai-research-keyphrase-extraction
  5. EmbedRank: sentence embedding (sent2vec, doc2vec)

    • Pagliardini et al., 2017, "Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features."
    • Lau and Baldwin, 2016, "An empirical evaluation of doc2vec with practical insights into document embedding generation."
  6. Naïve; Maximal Marginal Relevance [remaining slide text not recoverable]
  7. EmbedRank, PositionRank, termextract [bulleted results not recoverable]

    https://www.jstage.jst.go.jp/article/pjsai/JSAI2017/0/JSAI2017_1J14/_article/-char/ja/
  8. EmbedRank, PositionRank, termextract [bulleted results not recoverable]

    https://ja.wikinews.org/wiki/
  9. [Slide text not recoverable]
  10. EmbedRank; MMR; sentence embedding [remaining slide text not recoverable]
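
The transcript names Maximal Marginal Relevance (MMR) alongside EmbedRank and EmbedRank++ but the slide text describing it did not survive extraction. As a rough illustration only, the sketch below shows a generic greedy MMR selection over candidate phrase vectors: each step picks the candidate that is most similar to the document while being least similar to phrases already chosen. The function names (`cosine`, `mmr_select`), the `lam` trade-off parameter, and the toy two-dimensional vectors are all assumptions made for this example; it is not the speaker's code or the implementation in the linked swisscom repository, and it omits any similarity normalization a real system might apply.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr_select(doc_vec, candidates, top_n=2, lam=0.5):
    """Greedy MMR selection (hypothetical helper).

    candidates maps phrase -> embedding vector.
    lam = 1.0 ranks by relevance to the document only;
    smaller values penalize similarity to already selected phrases.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < top_n:
        best_phrase, best_score = None, -math.inf
        for phrase, vec in remaining.items():
            relevance = cosine(vec, doc_vec)
            # Highest similarity to any phrase chosen so far (0.0 when none yet).
            redundancy = max(
                (cosine(vec, candidates[s]) for s in selected), default=0.0
            )
            score = lam * relevance - (1.0 - lam) * redundancy
            if score > best_score:
                best_phrase, best_score = phrase, score
        selected.append(best_phrase)
        del remaining[best_phrase]
    return selected


# Toy vectors, invented for illustration (real embeddings would come from
# a sentence-embedding model such as sent2vec or doc2vec).
doc = [1.0, 1.0]
cands = {
    "word embedding": [1.0, 0.9],
    "word embeddings": [1.0, 0.8],
    "keyphrase extraction": [0.2, 1.0],
}

print(mmr_select(doc, cands, top_n=2, lam=1.0))
print(mmr_select(doc, cands, top_n=2, lam=0.5))
```

With these toy vectors, `lam=1.0` returns the two near-duplicate phrases, while `lam=0.5` keeps "word embedding" and swaps the near-duplicate for "keyphrase extraction", which is the diversity effect MMR is meant to provide.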