About "Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling"

Reading Group 17.10.2018
Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling
Xiong, Liu, Callan, and Liu
SIGIR 2018

Motivation
• Interest in knowing how salient (important and central) a term (word, entity) is in a document
• Word frequency is widely exploited for document retrieval
• But frequency is not necessarily the same as salience
• Entity salience is still a young task
• The effectiveness of salience for ad hoc search has not been explored yet

Main Messages
• They represent an entity by combining its textual information with semantics from a knowledge base
• They obtain a better model of salience, beyond frequency, through entity interactions
• They can improve web document search with salience, generalizing from a news corpus

Knowledge Enriched Embedding
• Adds the semantics from the knowledge graph to the textual embedding

Kernel Interaction Model
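The figures for this slide are not reproduced here, so the following is only a minimal sketch of kernel pooling over embedding similarities, which is the general idea behind a kernel interaction model. It assumes RBF kernels applied to cosine similarities between a target entity and the other terms in the document. The function names, the kernel means `mus`, and the width `sigma` are illustrative assumptions, not values from the slides.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def kernel_features(target, others, mus=(1.0, 0.5, 0.0, -0.5), sigma=0.1):
    """One soft-count feature per kernel.

    Each RBF kernel is centred at a similarity level mu and counts, softly,
    how many of the other terms interact with the target at that level.
    The mus and sigma defaults are hypothetical.
    """
    sims = [cosine(target, other) for other in others]
    return [
        sum(math.exp(-((s - mu) ** 2) / (2.0 * sigma ** 2)) for s in sims)
        for mu in mus
    ]


# Toy 2-d embeddings (made up for illustration).
target_entity = [1.0, 0.0]
other_terms = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
features = kernel_features(target_entity, other_terms)
print(features)
```

A salience score could then be obtained by feeding such features to a learned scoring layer; that step is omitted here because its details are not given on the slide.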

Experimental Setup
• Two datasets
  • New York Times (~500k articles with summaries)
    • e in article is salient if it's also in summary
  • Semantic Scholar (~1M abstracts)
    • e in abstract is salient if it's also in title
• Ranking-focused metrics: P@1, P@5, R@1, R@5
• Well-documented parameter settings
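The labeling rule and the ranking metrics above can be sketched as follows. This is an illustrative sketch only: the entity names and the ranked list are made up, and the helper names are hypothetical rather than taken from the paper's code.

```python
def salience_labels(doc_entities, reference_entities):
    """Label a document entity as salient if it also occurs in the reference text
    (the summary for a news article, the title for an abstract)."""
    reference = set(reference_entities)
    return {entity: entity in reference for entity in doc_entities}


def precision_at_k(ranked, salient, k):
    """Fraction of the top-k ranked entities that are salient."""
    if k <= 0:
        return 0.0
    hits = sum(1 for entity in ranked[:k] if entity in salient)
    return hits / k


def recall_at_k(ranked, salient, k):
    """Fraction of all salient entities that appear in the top-k."""
    if not salient:
        return 0.0
    hits = sum(1 for entity in ranked[:k] if entity in salient)
    return hits / len(salient)


# Hypothetical article and summary entities.
article_entities = ["obama", "white_house", "chicago", "senate"]
summary_entities = ["obama", "senate"]

labels = salience_labels(article_entities, summary_entities)
salient = {entity for entity, is_salient in labels.items() if is_salient}

# Hypothetical model output: entities ordered by predicted salience.
ranked = ["obama", "chicago", "senate", "white_house"]

print(precision_at_k(ranked, salient, 1))  # 1.0
print(recall_at_k(ranked, salient, 1))     # 0.5
```

With at most a handful of salient entities per document, R@1 is bounded well below 1 whenever more than one entity is salient, which is one reason to report precision and recall together.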

Results: Entity Salience

Analysis vs. Frequency
• KESM is able to model the salience of tail entities
• KESM is more reliable on short documents

Salience for Ad hoc Search
• Entity salience should reflect a better understanding of the text
• So it should help document search
• Ranking uses the salience of the query entities in a candidate document
• End-to-end training with sufficient relevance labels, or using a model pre-trained on salience
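As a rough illustration of ranking by the salience of query entities, the sketch below scores each candidate document by summing the salience of the query's entities in that document. The summation is a deliberate simplification standing in for a learned ranking layer, and the document IDs, entity names, and salience values are made up.

```python
def score_document(query_entities, doc_salience):
    """Sum the salience of each query entity in one document.

    doc_salience maps entity -> salience score for that document;
    entities absent from the document contribute 0.0.
    """
    return sum(doc_salience.get(entity, 0.0) for entity in query_entities)


def rank_documents(query_entities, candidates):
    """Return (doc_id, score) pairs sorted by descending score."""
    scored = [
        (doc_id, score_document(query_entities, doc_salience))
        for doc_id, doc_salience in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical per-document salience scores, e.g. produced by a salience model.
candidates = {
    "doc_a": {"barack_obama": 0.9, "chicago": 0.1},
    "doc_b": {"barack_obama": 0.2, "us_senate": 0.7},
    "doc_c": {"chicago": 0.8},
}

ranking = rank_documents(["barack_obama"], candidates)
print(ranking)
```

Here a document that mentions the query entity only in passing (`doc_b`) ranks below one where it is central (`doc_a`), which frequency-based matching alone would not necessarily capture.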

Salience-based Ranking

Results: Ad hoc Search

• Figures taken from these slides: http://www.cs.cmu.edu/~cx/slides/KESM-slides.pdf

Darío Garigliotti
