Type Information in Entity Retrieval

Date: October 7, 2016
Venue: Stavanger, Norway. Technical talk at UiS TN-IDE

Please cite, link to or credit this presentation when using it or part of it in your work.

#InformationRetrieval #IR #EntityRanking #EntityRetrieval #ER #EntityTypes #EntityOrientedSearch #KnowledgeBases #SemanticSearch

Darío Garigliotti

Transcript

  1. Entities, Properties, and Knowledge Bases | Types and Entity Retrieval | Dimensions of Type Information

    Type Information in Entity Retrieval. Darío Garigliotti, University of Stavanger, October 7th, 2016.
  2. Outline

    (1) Entities, Properties, and Knowledge Bases; (2) Types and Entity Retrieval; (3) Dimensions of Type Information: type taxonomies, type representations, retrieval models.
  3. Entity Retrieval - An example: Henrik Ibsen
  4. An example: Henrik Ibsen (in Wikipedia)
  5. Entities and properties

    An entity is an individual or thing, uniquely identified. We describe its properties using triples. Attributes: (Henrik Ibsen, birthdate, 20 March 1828). Types: (Henrik Ibsen, is a, writer). Relations: (Henrik Ibsen, child, Sigurd Ibsen); (Henrik Ibsen, work, A Doll's House).
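
As a minimal sketch of the triple representation on the slide above, the snippet below stores the four Henrik Ibsen triples as plain Python tuples. The `Triple` class and the helpers `objects_of` and `types_of` are hypothetical names introduced for illustration; they are not part of RDF or of any knowledge base API.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One (subject, predicate, object) statement about an entity."""
    subject: str
    predicate: str
    obj: str


# The example triples from the slide: one attribute, one type, two relations.
TRIPLES = [
    Triple("Henrik Ibsen", "birthdate", "20 March 1828"),
    Triple("Henrik Ibsen", "is a", "writer"),
    Triple("Henrik Ibsen", "child", "Sigurd Ibsen"),
    Triple("Henrik Ibsen", "work", "A Doll's House"),
]


def objects_of(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def types_of(triples: List[Triple], subject: str) -> List[str]:
    """Types are the objects of the 'is a' predicate (an assumed label)."""
    return objects_of(triples, subject, "is a")
```

A knowledge base in this sketch is simply the list `TRIPLES`; calling `types_of(TRIPLES, "Henrik Ibsen")` looks up the entity's types.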
  6. RDF and knowledge bases

    RDF (Resource Description Framework): a family of specifications to describe Web resources; a way to represent structured knowledge.
  7. RDF and knowledge bases (continued)

    A knowledge base is a set of triples. For example, our entity Henrik Ibsen in DBpedia.
  8. E.g. Henrik Ibsen in DBpedia
  9. E.g. Henrik Ibsen in DBpedia (continued)
  10. RDF and knowledge bases (continued)

    There are many knowledge bases. Domain-specific, e.g. GeoNames, DOI, BBCMusic. Cross-domain, e.g. DBpedia, YAGO, Freebase, Google Knowledge Graph.
  11. Knowledge bases as knowledge graphs
  12. Knowledge bases as knowledge graphs
  13. RDF and knowledge bases (continued)

    Knowledge bases are interconnected as Linked Open Data.
  14. Linked Open Data
  15. Linked Open Data
  16. Entity types

    A typical property of an entity is its type(s): (Henrik Ibsen, is a, writer); (Henrik Ibsen, is a, Norwegian writer); (Henrik Ibsen, is a, person).
  17. E.g. Henrik Ibsen types in DBpedia
  18. E.g. Henrik Ibsen types in Wikipedia
  19. Entity types (continued)

    Types are organized in hierarchies (or taxonomies, or ontologies).
  20. E.g. DBpedia Ontology
  21. Entity types (continued)

    Types group similar information. They help to reduce the search space.
  22. E.g. Buying a book on Amazon
  23. E.g. Buying a book on Amazon
  24. E.g. Buying a book on Amazon
  25. E.g. Buying a book on Amazon
  26. Type information in Entity Retrieval

    Types are useful for entity retrieval. They naturally appear in many queries, e.g. "countries where one can pay with the euro" and "art museums in Amsterdam". Queries could (somehow) have assigned types.
  27. Query types
  28. Query types
  29. Dimensions of type information

    We analyse 3 dimensions: type taxonomies, type representations, retrieval models.
  30. Type taxonomies

    Which type taxonomy to use? DBpedia Ontology (7 levels, 600 types); Freebase Types (2 levels, 2K types); Wikipedia Categories (34 levels, 600K types); YAGO Taxonomy (19 levels, 500K types). These vary a lot in terms of hierarchical structure and in how entity-type assignments are recorded.
  31. Type representations

    How to represent the hierarchical information?
  32. Type representations (continued)

    [Figure: a type hierarchy with type nodes t0 to t12 and an entity e.] Representation shown: type(s) along path to top.
  33. Type representations (continued)

    [Figure: the same type hierarchy, shown twice.] Representations shown: type(s) along path to top; top-level type(s).
  34. Type representations (continued)

    [Figure: the same type hierarchy, shown three times.] Representations shown: type(s) along path to top; top-level type(s); most specific type(s).
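
The three type representations named on the slides above can be sketched as follows. This is an illustrative reading under stated assumptions, not the talk's implementation: the toy taxonomy in `PARENT` and its type names are invented for the example (the hierarchy drawn on the slides is not recoverable from the transcript), each type has at most one parent, and all function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical toy taxonomy: maps each type to its parent (None = top level).
PARENT: Dict[str, Optional[str]] = {
    "Agent": None,
    "Person": "Agent",
    "Artist": "Person",
    "Writer": "Artist",
    "Work": None,
    "WrittenWork": "Work",
}


def ancestors(t: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return t followed by its ancestors, ending at a top-level type."""
    chain: List[str] = []
    current: Optional[str] = t
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain


def path_to_top(assigned: Iterable[str], parent) -> Set[str]:
    """All types on the paths from the assigned types up to the top."""
    result: Set[str] = set()
    for t in assigned:
        result.update(ancestors(t, parent))
    return result


def top_level(assigned: Iterable[str], parent) -> Set[str]:
    """Only the top-level type reached from each assigned type."""
    return {ancestors(t, parent)[-1] for t in assigned}


def most_specific(assigned: Iterable[str], parent) -> Set[str]:
    """Assigned types that are not an ancestor of another assigned type."""
    assigned = set(assigned)
    covered: Set[str] = set()
    for t in assigned:
        covered.update(ancestors(t, parent)[1:])  # proper ancestors only
    return assigned - covered
```

For an entity assigned `{"Person", "Writer"}` in this toy taxonomy, the path-to-top representation keeps the whole chain from `Writer` up to `Agent`, the top-level representation keeps only `Agent`, and the most specific representation keeps only `Writer`.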
  35. Retrieval models

    How to add type information into entity retrieval? The retrieval task is defined in a generative probabilistic framework, $P(q \mid e)$. [Figure: a query and an entity, with the labels "Olympic games" and "Rio de Janeiro", are compared through term-based similarity and through type-based similarity between the query's target types and the entity's types.] Both query and entity are considered in the term space as well as in the type space.
  36. Retrieval models: (Strict) Filtering model

    $P(q \mid e) = P(\theta^{T}_q \mid \theta^{T}_e) \cdot \chi[\mathit{types}(q) \cap \mathit{types}(e) \neq \emptyset]$, with $\theta^{T}$ the term-based representation of the query or entity; the indicator $\chi[\cdot]$ is 1 when query and entity share at least one type and 0 otherwise. [Figure: the sets Types(q) and Types(e).]
  37. Retrieval models: (Soft) Filtering model

    $P(q \mid e) = P(\theta^{T}_q \mid \theta^{T}_e) \cdot P(\theta^{\mathcal{T}}_q \mid \theta^{\mathcal{T}}_e)$, with $\theta^{T}$ the term-based and $\theta^{\mathcal{T}}$ the type-based representation.
  38. Retrieval models: Interpolation model

    $P(q \mid e) = (1 - \lambda) \cdot P(\theta^{T}_q \mid \theta^{T}_e) + \lambda \cdot P(\theta^{\mathcal{T}}_q \mid \theta^{\mathcal{T}}_e)$, with $\theta^{T}$ the term-based and $\theta^{\mathcal{T}}$ the type-based representation.
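
The three combination models on the slides above can be sketched as scoring functions. This is a minimal sketch under assumptions: `term_score` and `type_score` stand for the term-based and type-based probabilities, whose estimation the slides do not specify, and the strict filter is read as keeping an entity only when it shares at least one type with the query. All function and parameter names are hypothetical.

```python
from typing import Iterable


def strict_filtering(term_score: float,
                     query_types: Iterable[str],
                     entity_types: Iterable[str]) -> float:
    """Keep the term-based score only if query and entity share a type."""
    shares_type = bool(set(query_types) & set(entity_types))
    return term_score if shares_type else 0.0


def soft_filtering(term_score: float, type_score: float) -> float:
    """Multiply the term-based score by the type-based score."""
    return term_score * type_score


def interpolation(term_score: float, type_score: float, lam: float) -> float:
    """Linear mixture of the two scores; lam weights the type-based part."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * term_score + lam * type_score
```

With `lam = 0` the interpolation falls back to the term-based score alone, so an entity with missing type information still receives a non-zero score, whereas both filtering variants can drive its score to zero.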
  39. What did we do?

    We systematically identified and compared all combinations of those dimensions. 4 type taxonomies: DBpedia Ontology (3.9), Freebase Types (2015-03-31), Wikipedia Categories (for DBpedia 3.9) and YAGO Taxonomy (3.0.2). 3 type representations: path-to-top, top-level, most specific. 3 models: strict filtering, soft filtering, interpolation. Environment: from idealized to realistic. Entities fully typed in all the taxonomies.
  40. What did we do? Results
  41. Lessons learned

    Summary of insights: type information proves most useful when larger, deeper type taxonomies provide very specific types. How to represent hierarchical entity type information? Using the most specific types is the most effective way. What (kind of) type taxonomies to use? Wikipedia performs best in most of the cases. What combination model to choose? All models suffer from missing type information, but interpolation appears to be the most robust.
  42. Future work

    Identify the queries suitable for type-aware entity retrieval. Move the environment from idealized to realistic: we used a query types oracle; the next step is automatic detection of query target types.
  43. Thanks! Questions?