
Type-Aware Entity Retrieval

Date: March 3rd, 2016
Venue: Trondheim, Norway. Doctoral Seminar at NTNU

Please cite, link to or credit this presentation when using it or part of it in your work.

#InformationRetrieval #IR #EntityRanking #EntityRetrieval #ER #EntityTypes #EntityOrientedSearch #KnowledgeBases #SemanticSearch

Darío Garigliotti

Transcript

  1. From Information Retrieval to Entity Retrieval • Traditional Information Retrieval

    recently extended to entity-oriented search • Satisfaction of more complex information needs • Current support in search engines
  2. Countries where one can pay with the euro • Related

    entities (via a relation or predicate) • Types or categories or classes Entity Retrieval
  3. Countries where one can pay with the euro Impressionist art

    museums in The Netherlands • Related entities (via a relation or predicate) • Types or categories or classes Entity Retrieval
  6. Entity Retrieval Evaluated tasks • Entity ranking (given a textual

    query and target categories) • List completion (given a query Q and example entities, and optionally types) • Related entity finding (given entity E, relation R and type T), e.g. E = "Schumacher", R = "His teammates when he was on Ferrari", T = "Person", from Q = "Schumacher teammates when he was on Ferrari"
  7. Type-aware entity retrieval Our research questions 1. How to represent

    type-based information? 2. How to combine type-based and textual information? 3. How to estimate type-based information?
  8. Type-aware entity retrieval RQ2. How to combine type-based and textual

    information? • Basics: term-based models • A variety of related tasks across the literature • Entity retrieval approaches • Where to look for entities? How to find them? How to rank them? • Major model families • Common main insight: types help!
  9. Type-aware entity retrieval RQ1. How to represent type-based information? •

    Dimensions we identified • type taxonomies • hierarchical structure • dataset version • These dimensions receive minimal attention in the related work
  10. Type taxonomies • We consider four well-known type taxonomies

    Type system                 Wikipedia  DBpedia  Freebase     YAGO
    #types                        753,524      591      1719  568,672
    #top-level types                   NA       58        92       61
    #most-specific-level types    753,524      472      1626  549,623
    depth                              NA        7         2       19
    entities w/ type                4.12M    3.24M     3.77M    2.89M
    avg #types/entity                4.02     6.30      9.57    16.44
  11. Type representation • We consider different ways of modeling type

    assignments: top level, most specific level, and path-to-top
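The three type representations named on this slide can be sketched over a toy hierarchy. This is only an illustration: the `PARENT` taxonomy and all function names are hypothetical, and reading "path-to-top" as "the most specific types plus all of their ancestors" is an assumption, not something the slide states.

```python
# Hypothetical toy taxonomy: child type -> parent type (None marks a top-level type).
PARENT = {
    "Agent": None,
    "Person": "Agent",
    "Artist": "Person",
    "MusicalArtist": "Artist",
    "Work": None,
    "MusicalWork": "Work",
    "Album": "MusicalWork",
}


def path_to_top(t):
    """Return the chain of types from t up to its top-level ancestor."""
    path = [t]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def top_level(types):
    """Map every assigned type to its top-level ancestor."""
    return {path_to_top(t)[-1] for t in types}


def most_specific(types):
    """Keep only assigned types that are not an ancestor of another assigned type."""
    ancestors = set()
    for t in types:
        ancestors.update(path_to_top(t)[1:])
    return {t for t in types if t not in ancestors}


def along_path_to_top(types):
    """Assumed reading of path-to-top: most specific types plus all their ancestors."""
    result = set()
    for t in most_specific(types):
        result.update(path_to_top(t))
    return result


assigned = {"Person", "MusicalArtist"}
print(top_level(assigned))
print(most_specific(assigned))
print(along_path_to_top(assigned))
```

A flat type system such as Wikipedia categories (depth NA in the taxonomy table) would not fit this parent map directly; the sketch only applies where a hierarchy is available.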
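  12. Experimental setup • Our experimental environment looks like this:

    • Ranking: $P(e \mid q) \propto P(q \mid e)\,P(e)$
    • Combination: $P(q \mid e) = (1 - \lambda)\,P(\theta^{T'}_q \mid \theta^{T'}_e) + \lambda\,P(\theta^{T}_q \mid \theta^{T}_e)$
    • Term-based representation: query model $p(t' \mid \theta^{T'}_q)$, entity model $p(t' \mid \theta^{T'}_e)$, compared via $\mathrm{KL}(\theta^{T'}_q \,\|\, \theta^{T'}_e)$
    • Type-based representation: query model $p(t \mid \theta^{T}_q)$, entity model $p(t \mid \theta^{T}_e)$, compared via $\mathrm{KL}(\theta^{T}_q \,\|\, \theta^{T}_e)$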
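As a rough illustration of the KL divergence between a query model and an entity model, the sketch below compares two discrete distributions stored as dictionaries. The function name and the toy distributions are hypothetical. The slide does not say how the divergence is turned into a probability, so the sketch stops at the divergence itself. It assumes the entity model is smoothed, so that it is non-zero wherever the query model is.

```python
import math


def kl_divergence(p_query, p_entity):
    """KL(p_query || p_entity), summed over items with non-zero query probability.

    Assumes p_entity[t] > 0 for every such item (i.e. a smoothed entity model).
    """
    total = 0.0
    for t, pq in p_query.items():
        if pq > 0.0:
            total += pq * math.log(pq / p_entity[t])
    return total


# Toy type distributions (illustrative values only).
query_model = {"Album": 0.5, "MusicalWork": 0.5}
entity_a = {"Album": 0.5, "MusicalWork": 0.5}
entity_b = {"Album": 0.25, "MusicalWork": 0.75}

# A smaller divergence means the entity's types match the query's target types better.
print(kl_divergence(query_model, entity_a))
print(kl_divergence(query_model, entity_b))
```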
  13. Experimental setup • Term-based component: Mixture of LM method •

    We obtain combinations of these elements: • Type taxonomies • Models • Type-based representations
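  14. Ingredients • Model instantiations for $P(e \mid q) \propto P(q \mid e)\,P(e)$:

    • M1 (Mixture): $P(q \mid e) = (1 - \lambda)\,P(\theta^{T'}_q \mid \theta^{T'}_e) + \lambda\,P(\theta^{T}_q \mid \theta^{T}_e)$
    • M2 (Multiplicative): $P(q \mid e) = P(\theta^{T'}_q \mid \theta^{T'}_e)\,P(\theta^{T}_q \mid \theta^{T}_e)$
    • M3 (Filtering): $P(\theta^{T}_q \mid \theta^{T}_e) \in \{0, 1\}$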
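The three combination strategies can be sketched as plain functions over a term-based score and a type-based score. All names and the default weight are hypothetical. Two points are assumptions rather than statements from the slide: that the filtering model multiplies the term-based score by its binary type component, and that this component is 1 when the entity shares at least one type with the target types.

```python
def score_mixture(p_term, p_type, lam=0.5):
    """M1: linear interpolation of the term-based and type-based components."""
    return (1.0 - lam) * p_term + lam * p_type


def score_multiplicative(p_term, p_type):
    """M2: product of the term-based and type-based components."""
    return p_term * p_type


def score_filtering(p_term, entity_types, target_types):
    """M3: binary type component (assumed here: 1 if any type overlaps, else 0)."""
    type_match = 1.0 if set(entity_types) & set(target_types) else 0.0
    return p_term * type_match


# Illustrative values only.
print(score_mixture(0.2, 0.6, lam=0.25))
print(score_multiplicative(0.2, 0.5))
print(score_filtering(0.2, {"Album"}, {"Album", "Person"}))
print(score_filtering(0.2, {"Film"}, {"Album", "Person"}))
```

Under these assumptions, the mixture keeps entities with no type match in the ranking, while the multiplicative and filtering variants can push them to zero.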
  15. Ingredients • Query model for the type-based representation is provided

    by a target types oracle, $P(t \mid \theta^{T}_q)$ • Query: guitar origin blues • DBpedia Types: <dbo:Album>: 4, <dbo:MusicalArtist>: 43, ... • Freebase Types: <fb:music.group_member>: 34, <fb:people.deceased_person>: 17, ... • Wikipedia Categories: <dbpedia:Category:Blues_musicians_from_New_Orleans,_Louisiana>: 2, <dbpedia:Category:Blues_songs>: 2, ...
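One plausible way to turn per-type counts like those on this slide into a query model is to normalise them. The sketch below assumes the counts are the number of relevant entities carrying each type; the slide shows the counts but does not spell out how they are used. The function name and the toy input are hypothetical.

```python
from collections import Counter


def oracle_query_type_model(relevant_entity_types):
    """Estimate a type distribution for a query from its relevant entities.

    relevant_entity_types: one collection of types per relevant entity.
    Returns a dict mapping each type to its relative frequency.
    """
    counts = Counter()
    for types in relevant_entity_types:
        counts.update(types)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


# Toy input: the types of three hypothetical relevant entities.
relevant = [
    {"dbo:Album", "dbo:MusicalWork"},
    {"dbo:Album"},
    {"dbo:MusicalArtist"},
]
print(oracle_query_type_model(relevant))
```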
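  16. Ingredients • Our experimental environment looks like this:

    • Ranking: $P(e \mid q) \propto P(q \mid e)\,P(e)$
    • Combination: $P(q \mid e) = (1 - \lambda)\,P(\theta^{T'}_q \mid \theta^{T'}_e) + \lambda\,P(\theta^{T}_q \mid \theta^{T}_e)$
    • Type-based representation: query model $p(t \mid \theta^{T}_q)$, entity model $p(t \mid \theta^{T}_e)$, compared via $\mathrm{KL}(\theta^{T}_q \,\|\, \theta^{T}_e)$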
  17. Ingredients • Entity model for the type-based representation is a

    distribution estimated through the entity types Query: guitar origin blues Relevant entities: <dbpedia:The_Merle_Travis_Guitar> <dbpedia:Blues_Breakers_with_Eric_Clapton> <dbpedia:Poor_Boy_Blues> ... ... Freebase Types: ... DBpedia Types: <dbo:Album> <dbo:MusicalWork> ... ... Freebase Types: ... Wikipedia Categories: <Category:1950_albums> <Category:Merle_Travis_albums> ...
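A minimal sketch of an entity type model follows. It makes assumptions the slide does not state: a uniform distribution over the entity's assigned types, interpolated with a background type distribution so that no type gets zero probability. The function name, the `mu` parameter and the toy background are all hypothetical, and the background is assumed to cover every type that can be assigned.

```python
def entity_type_model(entity_types, background, mu=0.1):
    """Smoothed type distribution for one entity.

    entity_types: set of types assigned to the entity.
    background:   dict mapping every known type to a background probability (sums to 1).
    mu:           hypothetical smoothing weight given to the background.
    """
    n = len(entity_types)
    model = {}
    for t, p_bg in background.items():
        # Uniform maximum-likelihood estimate over the entity's own types.
        p_ml = 1.0 / n if t in entity_types else 0.0
        model[t] = (1.0 - mu) * p_ml + mu * p_bg
    return model


# Toy background distribution over three types (illustrative values only).
background = {"dbo:Album": 0.5, "dbo:MusicalWork": 0.25, "dbo:Person": 0.25}
print(entity_type_model({"dbo:Album", "dbo:MusicalWork"}, background, mu=0.2))
```

Smoothing of this kind keeps a KL divergence against the query model finite when the entity lacks one of the query's target types.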
  18. Results (1) RQ1. How to represent type-based information?

    [Figure: three bar-chart panels, "Type representation - Model M1", "Type representation - Model M2" and "Type representation - Model M3". Each reports MAP for four type settings (all assigned types, most specific level, path-to-top, top level) across four taxonomies (YAGO, Freebase, Wikipedia, DBpedia).]
  19. Results (2) RQ2. How to combine type-based and textual information?

    [Figure: two bar-chart panels, "Combining information - All assigned types" and "Combining information - Most-specific-level types". Each reports MAP for models M1, M2 and M3 across four taxonomies (YAGO, Freebase, Wikipedia, DBpedia).]
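  20. Future work RQ3: How to estimate type-based information?

    • Ranking: $P(e \mid q) \propto P(q \mid e)\,P(e)$
    • Combination: $P(q \mid e) = (1 - \lambda)\,P(\theta^{T'}_q \mid \theta^{T'}_e) + \lambda\,P(\theta^{T}_q \mid \theta^{T}_e)$
    • Term-based representation: query model $p(t' \mid \theta^{T'}_q)$, entity model $p(t' \mid \theta^{T'}_e)$, compared via $\mathrm{KL}(\theta^{T'}_q \,\|\, \theta^{T'}_e)$
    • Type-based representation: query model $p(t \mid \theta^{T}_q)$, entity model $p(t \mid \theta^{T}_e)$, compared via $\mathrm{KL}(\theta^{T}_q \,\|\, \theta^{T}_e)$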
  21. Future work • Main focus will be on query typing,

    but eventually on entity typing as well • How to take the best from different type taxonomies