
Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach

Wataru Hirota
September 29, 2018

A 15-minute introduction to the paper


Transcript

  1. Wataru Hirota (M1) [email protected]
     Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach
     SIGIR. Shen, J., Xiao, J., He, X., Shang, J., Sinha, S., & Han, J. (University of Illinois)
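  2. What is an entity?
     - Roughly, a concrete real-world thing
     - Roughly, the title of a Wikipedia article
       - e.g. "Ichiro Suzuki", "database"
     - Has a hierarchical type
     [Figure: entity hierarchy, from general to specific: Thing > Person > RealPerson > Athlete > IchiroSuzuki]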
  3. What is Entity Set Search?
     - Search in which the query contains multiple entities
     - The desired results are documents that contain all of those entities
     - e.g. "knowledge base for document retrieval"
       - matching only "knowledge base" or only "document retrieval" is not enough
  4. 1. Entity Extraction
     - Extract the entities from the query
     - At the same time, compute the LCA (lowest common ancestor) of the extracted entities
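The LCA step can be illustrated on a small type hierarchy. The following is a minimal sketch, not the authors' implementation: the chain Thing, Person, RealPerson, Athlete, IchiroSuzuki follows the deck's entity hierarchy, while `Scientist`, `AlbertEinstein`, and the function names are hypothetical additions for illustration.

```python
from typing import Dict, List, Optional

# Child -> parent links. The IchiroSuzuki chain follows the deck's hierarchy;
# "Scientist" and "AlbertEinstein" are hypothetical additions.
PARENT: Dict[str, Optional[str]] = {
    "Thing": None,
    "Person": "Thing",
    "RealPerson": "Person",
    "Athlete": "RealPerson",
    "Scientist": "RealPerson",
    "IchiroSuzuki": "Athlete",
    "AlbertEinstein": "Scientist",
}


def path_to_root(node: str) -> List[str]:
    """Return the node followed by its ancestors, ending at the root."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def lowest_common_ancestor(a: str, b: str) -> str:
    """Return the deepest type that is an ancestor of (or equal to) both nodes."""
    ancestors_of_a = set(path_to_root(a))
    for candidate in path_to_root(b):  # walks upward from b
        if candidate in ancestors_of_a:
            return candidate
    raise ValueError("nodes do not share a root")
```

Walking upward from one node and stopping at the first type that also lies on the other node's path is enough for a hierarchy this small; a large knowledge base would likely need a precomputed index instead.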
  5. 2. Generation of Query Graph
     - Represent the query as two graphs
       - word graph, entity graph
     [Figure labels: "word graph": edge weight, adjacent words only; "word graph": edge weight LCA, adjacent words only]
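A rough sketch of how a query might be turned into the two graphs. Several details are assumptions, since the slide does not state them legibly: word-graph edges link adjacent query words with unit weight, and entity-graph edges link every pair of extracted entities and carry their LCA type. `toy_lca` is a hypothetical stand-in for a real type-hierarchy lookup, and all function names are illustrative.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Edge = Tuple[str, str]


def build_word_graph(words: List[str]) -> Dict[Edge, float]:
    """Link each pair of adjacent query words (unit weight is an assumption)."""
    return {(left, right): 1.0 for left, right in zip(words, words[1:])}


def build_entity_graph(
    entities: List[str], lca: Callable[[str, str], str]
) -> Dict[Edge, str]:
    """Link every pair of entities and label the edge with their LCA type."""
    return {(a, b): lca(a, b) for a, b in combinations(entities, 2)}


def toy_lca(a: str, b: str) -> str:
    """Hypothetical stand-in for a real type-hierarchy lookup."""
    return "Thing"


word_graph = build_word_graph(["knowledge", "base", "for", "document", "retrieval"])
entity_graph = build_entity_graph(["knowledge base", "document retrieval"], toy_lca)
```

Here the query "knowledge base for document retrieval" yields four word-graph edges and a single entity-graph edge between its two entities.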
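  6. Retrieval score
     - The sum of the node potentials and the edge potentials in $G_{q|d_i}$
       - $G_{q|d_i}$: the part of the query graph $q$ that document $d_i$ covers
     - Node potential: $\sum_{t} a(P(t \mid d_i))$
     - Edge potential: $\sum_{(t_1, t_2)} a(P(t_1 \mid d_i))\, a(P(t_2 \mid d_i))$
       - $t$ is a word or entity in $G_{q|d_i}$
       - $a(\cdot)$ is an activation function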
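The score can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: P(t | d_i) is estimated by plain relative frequency, the default activation is a square root, and a node counts as covered when it occurs in the document at least once. The function name `retrieval_score` is hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def retrieval_score(
    nodes: Iterable[str],
    edges: Iterable[Tuple[str, str]],
    doc_tokens: List[str],
    activation: Callable[[float], float] = math.sqrt,
) -> float:
    """Sum node and edge potentials over the covered part of the query graph."""
    counts = Counter(doc_tokens)
    total = len(doc_tokens)

    def prob(t: str) -> float:
        # Relative frequency as a stand-in for P(t | d_i) (assumption).
        return counts[t] / total if total else 0.0

    # Nodes that occur in the document; edges need both endpoints covered.
    covered = {t for t in nodes if counts[t] > 0}
    node_potential = sum(activation(prob(t)) for t in covered)
    edge_potential = sum(
        activation(prob(t1)) * activation(prob(t2))
        for t1, t2 in edges
        if t1 in covered and t2 in covered
    )
    return node_potential + edge_potential
```

Because an edge contributes only when both endpoints are covered, a document that mentions every query term together is rewarded over one that mentions the terms in isolation.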
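  7. Parameter selection
     - Key idea: the more closely a retrieval result resembles the aggregate of the retrieval results obtained under all parameter settings, the better that result is
     [Figure labels: parameter, parameter, parameter n; retrieval results; aggregated result; parameter m; the more similar, the better ("good")]
     - Note: the set of parameter settings is finite and supplied by hand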
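One way to make this idea concrete is sketched below. The choice of Borda count for aggregation and pairwise agreement for similarity is an assumption made for illustration (the slide names neither), all rankings are assumed to cover the same documents, and the setting names such as `mu=100` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def borda_aggregate(rankings: Dict[str, List[str]]) -> List[str]:
    """Merge rankings by Borda count; ties are broken by document id."""
    points: Dict[str, int] = {}
    for ranking in rankings.values():
        for position, doc in enumerate(ranking):
            points[doc] = points.get(doc, 0) + len(ranking) - position
    return sorted(points, key=lambda doc: (-points[doc], doc))


def pairwise_agreement(ranking: List[str], reference: List[str]) -> float:
    """Fraction of document pairs that both rankings order the same way."""
    ref_pos = {doc: i for i, doc in enumerate(reference)}
    pairs = list(combinations(ranking, 2))  # (a, b) with a ranked above b
    if not pairs:
        return 1.0
    agree = sum(1 for a, b in pairs if ref_pos[a] < ref_pos[b])
    return agree / len(pairs)


def select_parameter(rankings: Dict[str, List[str]]) -> str:
    """Pick the setting whose ranking is most similar to the aggregate."""
    aggregate = borda_aggregate(rankings)
    return max(rankings, key=lambda name: pairwise_agreement(rankings[name], aggregate))


rankings = {
    "mu=100": ["d1", "d2", "d3"],
    "mu=1000": ["d1", "d3", "d2"],
    "mu=2000": ["d2", "d1", "d3"],
}
best = select_parameter(rankings)
```

No relevance labels are needed: each setting is judged only by how well its ranking agrees with the consensus of all settings.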
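  8. Experiments (evaluation metric)
     - nDCG [1] is used as the metric
       - The higher the scores (roughly, the degree of relevance) of the highly ranked documents, the higher the nDCG
       - A widely used evaluation metric in information retrieval
     $\mathrm{nDCG} = \sum_{i=2}^{K} \frac{\mathrm{score}_i}{\log i}$
     (simplified on the slide: the full metric also counts $\mathrm{score}_1$ undiscounted and divides by the same quantity computed for the ideal ordering)
     [1] Järvelin, Kalervo, et al. "Cumulated gain-based evaluation of IR techniques." TOIS. 2002.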
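For reference, a minimal sketch of nDCG at cutoff K, using the common form in which rank 1 is not discounted and rank i >= 2 is divided by log2(i). The ideal ordering is derived from the same score list, which assumes every relevant document appears in that list; the function names are illustrative.

```python
import math
from typing import List


def dcg(scores: List[float], k: int) -> float:
    """Rank 1 counts in full; rank i >= 2 is discounted by log2(i)."""
    top = scores[:k]
    return sum(s if i == 1 else s / math.log2(i) for i, s in enumerate(top, start=1))


def ndcg(scores: List[float], k: int) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(scores, reverse=True), k)
    return dcg(scores, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists documents in decreasing relevance scores exactly 1.0, and any other ordering of the same documents scores lower.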
  9. Experimental setup
     - Compare nDCG against three standard keyword-search methods
       - BM25 [1]
       - Likelihood Model [2]
         ▪ Dirichlet prior smoothing (LM-DIR)
         ▪ Jelinek-Mercer smoothing (LM-JM)
       - Information-Based Model [3] (IB)
     [1] Stephen E. Robertson and Hugo Zaragoza. The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval. 2009.
     [2] ChengXiang Zhai and John D. Lafferty. A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval. SIGIR. 2001.
     [3] Stéphane Clinchant and Éric Gaussier. Information-based models for ad hoc IR. SIGIR. 2010.
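The two smoothing baselines differ only in how they blend the document's term frequencies with the collection's. The sketch below uses the standard textbook formulas; the function names, the token-list representation, and the default values of `mu` and `lam` are assumptions for illustration, not settings taken from the experiments.

```python
import math
from collections import Counter
from typing import Callable, List


def collection_prob(word: str, collection: List[str]) -> float:
    """Relative frequency of the word in the whole collection."""
    return Counter(collection)[word] / len(collection)


def dirichlet_prob(word: str, doc: List[str], collection: List[str], mu: float = 2000.0) -> float:
    """LM-DIR: (c(w, d) + mu * P(w | C)) / (|d| + mu)."""
    return (doc.count(word) + mu * collection_prob(word, collection)) / (len(doc) + mu)


def jelinek_mercer_prob(word: str, doc: List[str], collection: List[str], lam: float = 0.5) -> float:
    """LM-JM: (1 - lam) * c(w, d) / |d| + lam * P(w | C)."""
    ml = doc.count(word) / len(doc) if doc else 0.0
    return (1.0 - lam) * ml + lam * collection_prob(word, collection)


def query_log_likelihood(
    query: List[str],
    doc: List[str],
    collection: List[str],
    smoother: Callable[[str, List[str], List[str]], float],
) -> float:
    """Sum of log P(w | d) over query words; assumes each word occurs in the collection."""
    return sum(math.log(smoother(w, doc, collection)) for w in query)
```

Dirichlet smoothing trusts long documents more, because the prior weight `mu` is fixed while the document length grows; Jelinek-Mercer mixes in the collection model at the same rate `lam` regardless of length.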
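  10. Experimental results (usefulness of the graph structure)
      - -t: fix the weight of every entity edge to 1 instead of using the LCA
      - -s: remove the edges
      - The full method achieves a higher nDCG than both -t and -ts
        → the proposed graph structure is effective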