Learning to Rank from Relevance Feedback


When searches involve ambiguous terms, require the retrieval of many documents, or are conducted over multiple interactions with the search system, user feedback is especially useful for improving search results. To address these common scenarios, we design a search system that uses novel methods to learn from the user's relevance judgements of documents returned for their search. By combining the traditional method of query expansion with learning to rank, our search system uses the interactive nature of search to improve result ordering, even when there are only a small number of judged documents. We present experimental results indicating that our learning to rank method improves result ordering beyond that achievable when using query expansion alone.

Peter Lubell-Doughtie

August 26, 2011

Transcript

  1. Learning to Rank from Relevance Feedback. Peter Lubell-Doughtie, University of Amsterdam ([email protected]), supervised by Maarten van Someren and Katja Hofmann. August 29th, 2011
  2. Learning to Rank from Relevance Feedback
     Search is inherently interactive, how can we use this to make search better?
     - Our method uses judged documents for query expansion and then for learning to rank
     - We improve search result ordering beyond that achieved when using only query expansion
     - Success depends on how we extract features
     - And on how we request document judgements
  3. Contents: Motivation, Background, Method, Experiments and Results, Conclusions
  4. Motivation: Learn from Users
     Search needs extra help when
     - Queries and documents contain ambiguous or domain specific terms
     - The user wants to retrieve many or all relevant documents
     - There are multiple interactions with the search system
     Search is interactive, let’s learn from this
     - Interaction is especially useful for difficult search tasks
     - But providing explicit feedback is costly for the user
     - We must use the feedback we receive well
  5. Overarching Research Question
     How can a system that is aware of the interactive nature of search exploit this interactivity to better serve the user’s needs?
  6. Background: Learning to Rank from Relevance Feedback
     The Rocchio Algorithm and its Discontents
     - Combines an initial query, a positive contribution from relevant documents, and a negative contribution from non-relevant documents
     - Judged documents only affect ordering through the expanded query
     - Negative feedback is often ignored because it is hard to use
     - But negative feedback is found to be especially beneficial for difficult queries (Wang and Wang et al.)
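
The deck does not show the update rule itself; as a reference point, here is the standard textbook form of the Rocchio expansion, where α, β, γ are mixing weights and D_r, D_nr are the judged relevant and non-relevant document sets:

```latex
\vec{q}_m = \alpha\,\vec{q}_0
          + \frac{\beta}{|D_r|}     \sum_{\vec{d}_j \in D_r}    \vec{d}_j
          - \frac{\gamma}{|D_{nr}|} \sum_{\vec{d}_k \in D_{nr}} \vec{d}_k
```

Dropping the γ term gives the positive-only variant that remains when, as the slide notes, negative feedback is ignored.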
  7. Background: Learning to Rank from Relevance Feedback
     Our method addresses these challenges
     - To make better use of judged documents: we use feedback in both learning to rank and query expansion
     - To avoid the problems of negative feedback: we use non-relevant documents in learning to rank, not query expansion
  8. Background: Learning to Rank from Relevance Feedback
     - Extract feature vectors from data
     - Use the training data to learn a model h
     - Use this model to predict the relevance of unlabeled test data
     - Sort documents by their predicted relevance
  9. Background: Learning to Rank from Relevance Feedback
     Pairwise Learning to Rank
     - Use document judgements to construct a preference pair set
     - (i, j) is in the preference pair set iff document i is preferred to document j
     - E.g. if d1 and d2 are preferred to d3, then P = {(d1, d3), (d2, d3)}
     Minimize incorrectly ordered pairs (see the sketch below)
     - Using feature vectors for documents in the preference pair set
     - Calculate a weight vector over features
     - Score unlabeled documents using this weight vector
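
A minimal sketch of the pairwise idea with a perceptron-style update; the learner used in the thesis may differ (e.g. an SVM-based ranker), and names like train_pairwise are illustrative:

```python
import numpy as np

def train_pairwise(features, pref_pairs, epochs=50, lr=0.1):
    """Learn a weight vector w such that w . x_i > w . x_j
    for every preference pair (i, j)."""
    dim = len(next(iter(features.values())))
    w = np.zeros(dim)
    for _ in range(epochs):
        mistakes = 0
        for i, j in pref_pairs:
            # A pair is incorrectly ordered when the preferred
            # document does not score higher; nudge w toward the
            # difference of the two feature vectors.
            diff = features[i] - features[j]
            if w @ diff <= 0:
                w += lr * diff
                mistakes += 1
        if mistakes == 0:   # no incorrectly ordered pairs left
            break
    return w

# Toy run mirroring the slide: d1 and d2 are preferred to d3,
# so P = {(d1, d3), (d2, d3)}.
docs = {"d1": np.array([1.0, 1.0]),
        "d2": np.array([2.0, 1.0]),
        "d3": np.array([0.5, 2.0])}
w = train_pairwise(docs, [("d1", "d3"), ("d2", "d3")])
ranking = sorted(docs, key=lambda d: w @ docs[d], reverse=True)
print(ranking)  # d1 and d2 end up above d3
```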
  10. Background: A Difficult Recall Oriented Task
      e-Discovery and TREC Legal
      - Ideally the search system returns all documents relevant to the query, ordered with the most relevant first
      - A hard task: the best systems have trouble with some queries
      - Learning is important: previous TREC Legal participants’ systems benefit substantially from incorporating feedback
      - We can simulate feedback by using already judged documents
  11. Open Questions
      - What documents should the user judge?
      - How do we represent documents for learning?
  12. How to Request Judgements: Exploitative
      Exploitative Sampling Strategy
      - Suppose we have a set of seed documents we expect are informative
      - Sample these documents descending from the highest ranked
      Example
      - Suppose the returned documents are ordered: d1* > d2 > d3*
      - Those marked * are expected to be informative
      - We would request judgements on d1 and then d3
      - Close to the way documents are judged in a real-life search system
      - Biases judged documents towards those ranked higher
  13. How to Request Judgements: Random Seeds
      Random Seed Sampling Strategy
      - Randomly sample from the set of informative documents
      - Same document set as in exploitative sampling
      - No rank based bias
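
A small sketch contrasting the two judgement-request strategies; `seeds` stands for the set of documents expected to be informative, and the helper names are hypothetical:

```python
import random

def exploitative_sample(ranked_docs, seeds, k):
    """Walk the ranking from the top and request judgements on the
    first k documents that belong to the informative seed set."""
    return [d for d in ranked_docs if d in seeds][:k]

def random_seed_sample(seeds, k):
    """Request judgements on k seed documents chosen uniformly at
    random, ignoring rank entirely."""
    return random.sample(sorted(seeds), k)

# The slide's example: d1 and d3 (marked *) are the informative seeds.
ranked = ["d1", "d2", "d3"]
seeds = {"d1", "d3"}
print(exploitative_sample(ranked, seeds, 2))  # ['d1', 'd3']: biased to high ranks
print(random_seed_sample(seeds, 2))           # same pool, no rank bias
```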
  14. Open Questions
      - What documents should the user judge? (1) Exploitative (2) Random
      - How do we represent documents for learning?
  15. Representing Documents: Cumulative Features
      Features are the retrieval scores for queries expanded with different sets of judged documents
      Example
      - Suppose there are three documents
      - The retrieval scores before expansion are ⟨1, 2, 3⟩
      - After judging one document on the first iteration, the document profile is P = {d} and the expansion term is t1
      - Retrieval scores with the expanded query are ⟨1, 1, 3⟩
      - We are only collecting features up to and including the first iteration
      - We are reordering the top two documents
  16. Example: Cumulative Features
      The input to the cumulative feature extractor Φ will be:
          [ {q0, ∅, ⟨1, 2, 3⟩}
            {q0 ◦ t1, {d}, ⟨1, 1, 3⟩}
            ... ],
      and the output features will be:
          [ 1, 2
            1, 1 ].
      - The top row are the retrieval scores without expansion
      - The bottom row are retrieval scores with expansion
      - The first column are features for the first document
      - The second column are features for the second document
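
In code, the cumulative extractor amounts to stacking one row of retrieval scores per feedback iteration; a minimal sketch, where the (query, profile, scores) tuple layout is an assumption for illustration:

```python
import numpy as np

def cumulative_features(iterations, num_docs):
    """One row per iteration, one column per document to reorder.
    Each iteration's query was expanded with the judged documents
    accumulated so far."""
    return np.array([scores[:num_docs] for _, _, scores in iterations])

# The slide's example: scores <1, 2, 3> before expansion, <1, 1, 3>
# after expanding q0 with term t1 learned from the judged profile {d}.
iterations = [("q0",      set(), [1, 2, 3]),
              ("q0 . t1", {"d"}, [1, 1, 3])]
print(cumulative_features(iterations, num_docs=2))
# [[1 2]
#  [1 1]]   column i holds the feature vector of document i
```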
  17. Representing Documents: Constant Features
      Features are the retrieval scores for the same set of judged documents but using different numbers of expansion terms
      Example
      - Suppose there are three documents
      - The judged document profile is P = {d}
      - The first two expansion terms are {t1, t2}
      - The retrieval scores when expanding with t1 are ⟨2, 2, 3⟩
      - The retrieval scores when expanding with t1 and t2 are ⟨1, 2, 3⟩
      - We are reordering the top two documents
  18. Example: Constant Features
      Our input to the constant feature extractor Φ will be:
          [ {q0 ◦ t1, {d}, ⟨2, 2, 3⟩}
            {q0 ◦ t1 ◦ t2, {d}, ⟨1, 2, 3⟩} ],
      and the output features will be:
          [ 2, 2
            1, 2 ].
      - The top row are the retrieval scores for the query q0 ◦ t1
      - The bottom row are retrieval scores for the query q0 ◦ t1 ◦ t2
      - The first column are features for the first document
      - The second column are features for the second document
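
The constant extractor stacks scores the same way; only the inputs differ, holding the judged profile fixed while varying the number of expansion terms. A sketch under the same illustrative tuple layout as above:

```python
import numpy as np

def constant_features(runs, num_docs):
    """One row per expanded query (1 term, 2 terms, ...), all runs
    sharing the same judged document profile."""
    return np.array([scores[:num_docs] for _, _, scores in runs])

runs = [("q0 . t1",      {"d"}, [2, 2, 3]),   # one expansion term
        ("q0 . t1 . t2", {"d"}, [1, 2, 3])]   # two expansion terms
print(constant_features(runs, num_docs=2))
# [[2 2]
#  [1 2]]
```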
  19. Representing Documents: Term Frequency Features
      Features are the term frequency of the most common terms in judged documents and highly ranked documents
      Example
      - Suppose there are three documents
      - Their retrieval scores are ⟨1, 2, 3⟩
      - The term based features for the query are fq = [t1, t2]
      - For the first document the frequency of t1 is 1 and that of t2 is 2, i.e. its term frequency vector is [1, 2]
      - For the second document the frequency of t1 is 1 and that of t2 is 1, i.e. its term frequency vector is [1, 1]
      - We are reordering the top two documents
  20. Example: Term Frequency Features
      Our input to the term frequency feature extractor Φ will be:
          [ {q, Pq, ⟨1, 2, 3⟩} ],
      and we define the output features as the term frequency vectors:
          [ 1, 1
            2, 1 ].
      - The top row are the term frequencies of t1
      - The bottom row are the term frequencies of t2
      - The first column are features for the first document
      - The second column are features for the second document
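
A sketch of the term frequency extractor, assuming documents are plain token strings and that fq lists the query's term-based features; the toy data is chosen to reproduce the slide's matrix:

```python
from collections import Counter

def term_frequency_features(docs, query_terms, num_docs):
    """One row per query term, one column per document to reorder;
    entry (t, d) is how often term t occurs in document d."""
    counts = [Counter(doc.split()) for doc in docs[:num_docs]]
    return [[c[t] for c in counts] for t in query_terms]

# First document: t1 once, t2 twice. Second document: each term once.
docs = ["t1 t2 t2", "t1 t2 other"]
print(term_frequency_features(docs, ["t1", "t2"], num_docs=2))
# [[1, 1], [2, 1]]   rows are t1 and t2, columns are the documents
```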
  21. Open Questions
      - What documents should the user judge? (1) Exploitative: choose highly ranked first (2) Random: choose at random
      - How do we represent documents for learning? (1) Cumulative: retrieval scores for different document sets (2) Constant: retrieval scores with different numbers of expansion terms (3) Term frequency: vector space model
  22. Learning to Rank improves Query Expansion
  23. Learning to Rank improves Query Expansion
      Random Seeds and Constant Features

      Metric | Baseline | Expansion | Learning
      MAP    | 0.0674   | 0.130     | 0.137
      NDCG   | 0.565    | 0.646     | 0.652

      - Baseline is language model retrieval with the original query
      - Expansion is language model retrieval with the expanded query
      - Learning is from 20 relevant and 20 non-relevant documents per query
      - Scores are averaged over all (8) queries; improvements are statistically significant at the 0.01 level and at the 0.001 level
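
For readers who want to recompute the metrics, minimal binary-relevance versions of average precision and NDCG; the thesis evaluation may use graded judgements or rank cutoffs, so treat these as sketches:

```python
import math

def average_precision(relevance):
    """relevance: 0/1 judgements in ranked order. Assumes every
    relevant document appears somewhere in the ranking. MAP is
    this value averaged over all queries."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / max(hits, 1)

def ndcg(relevance):
    """Discounted cumulative gain normalised by the ideal ordering."""
    dcg = sum(r / math.log2(k + 1) for k, r in enumerate(relevance, start=1))
    ideal = sorted(relevance, reverse=True)
    idcg = sum(r / math.log2(k + 1) for k, r in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0

print(average_precision([1, 0, 0, 1]))  # 0.75
print(ndcg([1, 0, 0, 1]))               # about 0.88
```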
  24. Learning to Rank improves Query Expansion
      By combining traditional query expansion with learning to rank, a search system can use the interactive nature of search to better serve the user’s needs.
  25. How to Best Learn from Documents
      The best method for representing documents is codependent with the best method for requesting document judgements
      Experiment
      - Compare evaluation scores when using different feature spaces
      - Compare evaluation scores when using different strategies to choose documents for judgement
  26. Retrieval score features outperform term frequency features when using random sampling
  27. Why do retrieval score features outperform term frequency features when using random sampling?
      - Random sampling judges documents at many different ranks
      - This provides an unbiased picture of the relationship between retrieval scores and document judgements
      - Learning to rank with low ranked documents could increase the rank of other relevant but incorrectly low ranked documents
      - This could significantly improve scores
      - Term frequency features only consider content and ignore rank
      - Content of low ranked documents may be uninformative
  28. Term frequency features outperform retrieval score features for exploitative sampling
  29. Why do term frequency features outperform retrieval score features when using exploitative sampling?
      - Exploitative sampling overrepresents highly ranked judged documents
      - With term frequency features the content of judged documents must be helpful for reordering
      - Our retrieval method orders documents based on the relevance of their terms
      - Highly ranked documents are likely to contain more informative terms than random documents
      - frequency(top five terms in exploitatively sampled documents) = 5 × frequency(top five terms in randomly sampled documents)
  30. Summary and Conclusions
      - Our method combines learning to rank and query expansion
      - This improves search result ordering beyond no expansion and beyond using only query expansion
      - Both constant and term frequency features can improve ordering
      - TREC Legal is not significantly different from other search tasks, such as ad-hoc web page search
      - Our method’s improvement in scores may be a general result that will apply to other corpora
  31. Future Work
      - Design a better hybrid method
      - Apply to additional corpora