
08 Interpreting Feedback

LiLa'16
March 20, 2016

Transcript

  1. Interpreting Feedback

    Anne Schuth (Blendle / University of Amsterdam, The Netherlands)
    Krisztian Balog (University of Stavanger, Norway)
    Tutorial at ECIR 2016 in Padua, Italy
  5. Why do interleaving?

    • Within subject design
    • As opposed to between subject (A/B testing)
    • Reduces variance (same users/queries for both A and B)
    • Need 1 to 2 orders of magnitude less data
    • ~100K queries for interleaving in a mature web search engine (>>1M for A/B testing)
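
The variance argument above can be illustrated with a toy simulation. This is only a sketch under invented assumptions: every query has a random "difficulty" shared by both systems, system B adds a small fixed effect, and each observation has independent noise. The function name `simulate` and all parameter values are hypothetical and do not come from the slides.

```python
import random
import statistics


def simulate(n_queries=2000, effect=0.1, seed=0):
    """Compare the variance of B-minus-A differences for paired and unpaired designs."""
    rng = random.Random(seed)
    paired, unpaired = [], []
    for _ in range(n_queries):
        difficulty = rng.gauss(0.0, 1.0)        # query seen by both systems (within subject)
        other_difficulty = rng.gauss(0.0, 1.0)  # a different query (between subject)
        score_b = difficulty + effect + rng.gauss(0.0, 0.2)
        score_a_same_query = difficulty + rng.gauss(0.0, 0.2)
        score_a_other_query = other_difficulty + rng.gauss(0.0, 0.2)
        # Within subject: the shared difficulty cancels in the difference.
        paired.append(score_b - score_a_same_query)
        # Between subject: the two difficulties do not cancel.
        unpaired.append(score_b - score_a_other_query)
    return statistics.variance(paired), statistics.variance(unpaired)


paired_var, unpaired_var = simulate()
print(f"paired variance:   {paired_var:.3f}")
print(f"unpaired variance: {unpaired_var:.3f}")
```

In this toy model the paired differences only carry the observation noise, while the unpaired differences also carry the query-to-query variation, which is why fewer queries are needed for the same confidence.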
  6. Downsides of interleaving

    • Only possible for measuring differences in ranking algorithms, such as:
      - new ranking algorithms
      - new ranking features
      - new (types of) documents
    • So,
      - not for UI changes
      - not for ways of displaying snippets
      - not for other aspects such as colors/fonts/… that change
  7. Interleaving Methods

    • Interleaving
      - Balanced interleave (Joachims et al., 2006)
      - Team Draft interleave (Radlinski et al., 2008)
      - Document constraints interleave (He et al., 2009)
      - Probabilistic interleave (Hofmann et al., 2011)
      - Optimized interleave (Radlinski and Craswell, 2013)
      - Upper bound interleave (Kharitonov et al., 2013)
      - Vertical aware team draft interleave (Chuklin et al., 2013)
      - Generalized team draft interleave (Kharitonov et al., 2015)
    • Multileaving
      - Team draft multileave (Schuth et al., 2014)
      - Optimized multileave (Schuth et al., 2014)
      - Probabilistic multileave (Schuth et al., 2015)
  8. Team Draft Interleave

    Ranking A: doc 1, doc 2, doc 3, doc 4, doc 5
    Ranking B: doc 2, doc 4, doc 7, doc 1, doc 3
  10. Team Draft Interleave

    Documents shown: doc 1, doc 3, doc 2, doc 4, doc 7
    Teams: A, B
    Inference: A > B
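
A minimal sketch of how a team draft interleaving for rankings A and B could be built and scored. It follows one common formulation (a coin flip per round decides who picks first, each ranker contributes its best unused document, clicks are credited to the contributing team); the function names and the click used in the example are hypothetical and are not taken from the slides.

```python
import random
from collections import Counter


def team_draft_interleave(a, b, length, rng):
    """Build an interleaved list and record which ranker ("team") supplied each document."""
    interleaved, teams = [], []
    while len(interleaved) < length:
        # A coin flip decides which ranker picks first in this round.
        order = [("A", a), ("B", b)]
        rng.shuffle(order)
        added = False
        for team, ranking in order:
            if len(interleaved) == length:
                break
            # Each ranker contributes its highest-ranked document not yet shown.
            pick = next((d for d in ranking if d not in interleaved), None)
            if pick is not None:
                interleaved.append(pick)
                teams.append(team)
                added = True
        if not added:  # both rankings are exhausted
            break
    return interleaved, teams


def infer_preference(teams, clicked_ranks):
    """Credit each click (0-based rank) to the team that supplied the clicked document."""
    clicks = Counter(teams[r] for r in clicked_ranks)
    if clicks["A"] > clicks["B"]:
        return "A > B"
    if clicks["B"] > clicks["A"]:
        return "B > A"
    return "A = B"


ranking_a = [1, 2, 3, 4, 5]
ranking_b = [2, 4, 7, 1, 3]
shown, teams = team_draft_interleave(ranking_a, ranking_b, 5, random.Random(0))
print(shown, teams)
# Hypothetical click on the top result.
print(infer_preference(teams, clicked_ranks=[0]))
```

Because the pick order is random in every round, repeated impressions of the same query give both rankers an equal chance of owning each position.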
  11. Team Draft Multileave

    Ranking A: doc 1, doc 2, doc 3, doc 4, doc 5
    Ranking B: doc 2, doc 4, doc 7, doc 1, doc 3
    Ranking C: doc 1, doc 2, doc 8, doc 3, doc 9
    Ranking D: doc 4, doc 2, doc 1, doc 9, doc 5
    Ranking E: doc 3, doc 1, doc 2, doc 5, doc 7
  15. Team Draft Multileave

    Documents shown: doc 2, doc 4, doc 1, doc 9, doc 3
    Rankers: A, B, C, D, E
    X X X X X
    Inference: A > E & B & C & D
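
The same drafting idea extends to more than two rankers. The sketch below is an assumption-laden illustration, not the authors' implementation: in every round the rankers pick in random order, and a ranker is preferred over another if its documents received more clicks. The team assignment and click in the usage example are hypothetical.

```python
import random
from collections import Counter


def team_draft_multileave(rankings, length, rng):
    """rankings maps a ranker name to its ranked list of documents."""
    shown, teams = [], []
    while len(shown) < length:
        names = list(rankings)
        rng.shuffle(names)  # random pick order in every round
        added = False
        for name in names:
            if len(shown) == length:
                break
            # The ranker contributes its highest-ranked document not yet shown.
            pick = next((d for d in rankings[name] if d not in shown), None)
            if pick is not None:
                shown.append(pick)
                teams.append(name)
                added = True
        if not added:  # all rankings are exhausted
            break
    return shown, teams


def pairwise_preferences(rankings, teams, clicked_ranks):
    """Return (winner, loser) pairs based on clicks credited to each team."""
    clicks = Counter(teams[r] for r in clicked_ranks)
    return [(x, y) for x in rankings for y in rankings if clicks[x] > clicks[y]]


rankings = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 7, 1, 3],
    "C": [1, 2, 8, 3, 9],
    "D": [4, 2, 1, 9, 5],
    "E": [3, 1, 2, 5, 7],
}
shown, teams = team_draft_multileave(rankings, 5, random.Random(1))
print(shown, teams)
# Hypothetical assignment and click: the document at rank 2 belongs to A.
print(pairwise_preferences(rankings, ["B", "D", "A", "C", "E"], clicked_ranks=[2]))
```

With five rankers and five result slots, each ranker contributes exactly one document, so a single click already yields a preference of one ranker over the four others.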
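  32. Optimized Interleave (OI)

    Ranking A: doc 1, doc 2, doc 3, doc 4
    Ranking B: doc 2, doc 4, doc 3, doc 1

    Constraints:
    1. Prefix. Allowed interleavings, each with its probability $p_i$ and the credit shown for each rank:
       $p_1$: doc 1, doc 2, doc 3, doc 4    credits  3, -1,  0, -2
       $p_2$: doc 1, doc 2, doc 4, doc 3    credits  3, -1, -2,  0
       $p_3$: doc 2, doc 1, doc 3, doc 4    credits -1,  3,  0, -2
       $p_4$: doc 2, doc 1, doc 4, doc 3    credits -1,  3, -2,  0
       $p_5$: doc 2, doc 4, doc 1, doc 3    credits -1, -2,  3,  0
       $p_6$: doc 2, doc 4, doc 3, doc 1    credits -1, -2,  0,  3
    2. Unbiased. With $c_{i,k}$ the credit of interleaving $i$ at rank $k$, one equation for each rank $k = 1, \dots, 4$:
       $c_{1,k}\,p_1 + c_{2,k}\,p_2 + c_{3,k}\,p_3 + c_{4,k}\,p_4 + c_{5,k}\,p_5 + c_{6,k}\,p_6 = 0$
       Solution shown: $p_2 = .25$, $p_4 = .35$, $p_5 = .40$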
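
The numbers on this slide can be checked with a short script. It is a sketch under two assumptions that the transcript does not state explicitly: the prefix constraint is read as "the next document is always the top unused document of A or of B", and the credit of a document is its rank in B minus its rank in A (which reproduces the 24 credit values on the slide). The function names are hypothetical.

```python
from fractions import Fraction

RANKING_A = [1, 2, 3, 4]
RANKING_B = [2, 4, 3, 1]


def allowed_interleavings(a, b, length):
    """Enumerate lists in which every next document is the top unused one of A or B."""
    results = []

    def extend(prefix):
        if len(prefix) == length:
            results.append(tuple(prefix))
            return
        candidates = []
        for ranking in (a, b):
            top = next((d for d in ranking if d not in prefix), None)
            if top is not None and top not in candidates:
                candidates.append(top)
        for doc in candidates:
            extend(prefix + [doc])

    extend([])
    return results


def credits(interleaving, a, b):
    """Assumed credit: rank in B minus rank in A, so positive values favour A."""
    return [(b.index(d) + 1) - (a.index(d) + 1) for d in interleaving]


lists = allowed_interleavings(RANKING_A, RANKING_B, 4)
credit_rows = [credits(l, RANKING_A, RANKING_B) for l in lists]

# Probabilities from the slide: p2 = .25, p4 = .35, p5 = .40, all others zero.
p = [Fraction(0), Fraction(25, 100), Fraction(0),
     Fraction(35, 100), Fraction(40, 100), Fraction(0)]

# Unbiasedness: the probability-weighted credit at every rank must be zero.
expected_credit_per_rank = [
    sum(p_i * row[k] for p_i, row in zip(p, credit_rows)) for k in range(4)
]

for l, row in zip(lists, credit_rows):
    print(l, row)
print(expected_credit_per_rank)
```

Exact fractions are used so that the zero check does not depend on floating-point rounding.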
  35. Optimized Interleave (OI)

    Ranking A: doc 1, doc 2, doc 3, doc 4
    Ranking B: doc 2, doc 4, doc 3, doc 1

    Constraints:
    1. Prefix
    2. Unbiased. Interleavings with an assigned probability:
       $p_2 = .25$: doc 1, doc 2, doc 4, doc 3    credits  3, -1, -2,  0
       $p_4 = .35$: doc 2, doc 1, doc 4, doc 3    credits -1,  3, -2,  0
       $p_5 = .40$: doc 2, doc 4, doc 1, doc 3    credits -1, -2,  3,  0
    3. Sensitivity

    Selected: $p_5$
  38. Optimized Interleave (OI)

    Ranking A: doc 1, doc 2, doc 3, doc 4
    Ranking B: doc 2, doc 4, doc 3, doc 1
    Interleaving shown: doc 2, doc 4, doc 1, doc 3    credits -1, -2, 3, 0

    Constraints:
    1. Prefix
    2. Unbiased
    3. Sensitivity

    Inference: A > B
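
A sketch of the last two steps: sampling one interleaving from the probabilities and turning clicks into a preference. The transcript does not record which document was clicked, so the click on doc 1 below is a hypothetical choice that is consistent with the inference A > B. The credit function (rank in B minus rank in A) is an assumption that matches the credit values on the slides, and the function names are invented for illustration.

```python
import random

RANKING_A = [1, 2, 3, 4]
RANKING_B = [2, 4, 3, 1]

# Interleavings and probabilities as listed on the slides.
DISTRIBUTION = {
    (1, 2, 4, 3): 0.25,
    (2, 1, 4, 3): 0.35,
    (2, 4, 1, 3): 0.40,
}


def sample_interleaving(distribution, rng):
    """Draw one interleaving according to its probability."""
    lists = list(distribution)
    return rng.choices(lists, weights=[distribution[l] for l in lists])[0]


def infer_preference(clicked_docs):
    """Sum the assumed credits (rank in B minus rank in A) of the clicked documents."""
    score = sum(RANKING_B.index(d) - RANKING_A.index(d) for d in clicked_docs)
    if score > 0:
        return "A > B"
    if score < 0:
        return "B > A"
    return "A = B"


shown = sample_interleaving(DISTRIBUTION, random.Random(0))
print(shown)
# Hypothetical click on doc 1, which carries credit 3.
print(infer_preference([1]))
```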
  42. Optimized Multileave

    Ranking A: doc 1, doc 2, doc 3, doc 4
    Ranking B: doc 2, doc 4, doc 7, doc 1
    Ranking C: doc 1, doc 2, doc 8, doc 3
    Ranking D: doc 4, doc 2, doc 1, doc 9
    Ranking E: doc 3, doc 1, doc 2, doc 5

    • Prefix constraint: too many multileavings
    • Sampling
    • In expectation unbiased
  45. Optimized Multileave

    Ranking A: doc 1, doc 2, doc 3, doc 4
    Ranking B: doc 2, doc 4, doc 7, doc 1
    Ranking C: doc 1, doc 2, doc 8, doc 3
    Ranking D: doc 4, doc 2, doc 1, doc 9
    Ranking E: doc 3, doc 1, doc 2, doc 5

    Multileavings:
       $p_1 = .25$: doc 1, doc 2, doc 8, doc 4
       $p_2 = .10$: doc 3, doc 2, doc 4, doc 7
       $p_3 = .35$: doc 1, doc 2, doc 4, doc 9
       $p_4 = .30$: doc 2, doc 4, doc 1, doc 7

    Selected: $p_2$
  49. Optimized Multileave

    Ranking A: doc 1, doc 2, doc 3, doc 4
    Ranking B: doc 2, doc 4, doc 7, doc 1
    Ranking C: doc 1, doc 2, doc 8, doc 3
    Ranking D: doc 4, doc 2, doc 1, doc 9
    Ranking E: doc 3, doc 1, doc 2, doc 5
    Multileaving shown: doc 3, doc 2, doc 4, doc 7

    Credits:
       A: 1/2 + 1/4
       B: 1/1 + 1/2
       C: 1/2 + 1/5
       D: 1/2 + 1/1
       E: 1/3 + 1/5

    Inference: B & D > A & C & E; A > C & E; C > E
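
The credit sums on this slide are consistent with clicks on doc 2 and doc 4, where each ranker receives 1/rank for a clicked document and a document outside its top four counts as rank 5. That reading is an assumption inferred from the numbers, not something the transcript states. A small sketch that reproduces the five sums (the function name `credit` is hypothetical):

```python
from fractions import Fraction

RANKINGS = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 7, 1],
    "C": [1, 2, 8, 3],
    "D": [4, 2, 1, 9],
    "E": [3, 1, 2, 5],
}


def credit(ranking, clicked_docs):
    """Sum 1/rank over the clicked documents; unranked documents get rank len + 1."""
    missing_rank = len(ranking) + 1
    total = Fraction(0)
    for doc in clicked_docs:
        rank = ranking.index(doc) + 1 if doc in ranking else missing_rank
        total += Fraction(1, rank)
    return total


clicked = [2, 4]  # assumed clicks on doc 2 and doc 4
credits = {name: credit(ranking, clicked) for name, ranking in RANKINGS.items()}

# Sort rankers from most to least credit.
for name, value in sorted(credits.items(), key=lambda item: item[1], reverse=True):
    print(name, value)
```

Sorting by credit gives B and D tied at the top, followed by A, then C, then E.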
  50. Probabilistic Interleave

    Ranking A: doc 1, doc 2, doc 3, doc 4, doc 5
    Ranking B: doc 2, doc 4, doc 7, doc 1, doc 3
  52. Probabilistic Interleave

    Ranking A: doc 1, doc 2, doc 3, doc 4, doc 5
    Ranking B: doc 2, doc 4, doc 7, doc 3
  57. Probabilistic Interleave

    Ranking A: doc 1, doc 2, doc 3, doc 4, doc 5
    Ranking B: doc 2, doc 4, doc 7, doc 1, doc 3
    Interleaving shown: doc 1, doc 4, doc 3, doc 2, doc 5

    The interleaving is listed five times, followed by …, with the values .3, .2, .1, .1, .1 (illustrative example, not actual …)

    Inference: A > B
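
A sketch of how a probabilistic interleaving could be constructed. It assumes the usual description of the method: each ranking is turned into a distribution in which a document at rank r has weight 1/r^tau, a ranker is chosen at random for every slot, a document is sampled from that ranker's distribution, and the document is then removed for both rankers. The exponent `tau` and its default of 3 are assumptions here, as are all function names.

```python
import random

RANKING_A = [1, 2, 3, 4, 5]
RANKING_B = [2, 4, 7, 1, 3]


def softmax_probs(ranking, remaining, tau=3.0):
    """Rank-based distribution over the documents of a ranking that are still available."""
    weights = {
        doc: 1.0 / (rank ** tau)
        for rank, doc in enumerate(ranking, start=1)
        if doc in remaining
    }
    total = sum(weights.values())
    return {doc: w / total for doc, w in weights.items()}


def probabilistic_interleave(a, b, length, rng, tau=3.0):
    """Return the interleaved list and the ranker each document was sampled from."""
    remaining = set(a) | set(b)
    interleaved, teams = [], []
    while len(interleaved) < length and remaining:
        name, ranking = rng.choice([("A", a), ("B", b)])
        probs = softmax_probs(ranking, remaining, tau)
        if not probs:  # this ranker has nothing left, fall back to the other one
            name, ranking = ("B", b) if name == "A" else ("A", a)
            probs = softmax_probs(ranking, remaining, tau)
        docs = list(probs)
        doc = rng.choices(docs, weights=[probs[d] for d in docs])[0]
        interleaved.append(doc)
        teams.append(name)
        remaining.discard(doc)  # removed from both rankers' distributions
    return interleaved, teams


shown, teams = probabilistic_interleave(RANKING_A, RANKING_B, 5, random.Random(42))
print(shown, teams)
```

Because any remaining document can be sampled at any position, the same interleaved list can arise from many different ranker assignments, which is what the repeated list with probabilities on the slide illustrates.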
  61. Interleaving Methods

                               Team Draft   Optimized   Probabilistic
    In between interleavings   yes          yes         no
    Sensitive                  no           yes         yes
    Allows for data reuse      no           no (?)      yes
    Fast                       yes          no          no
    Multileave                 yes          yes         yes
    Used in practice           yes          no (?)      no (?)
  62. References

    • T. Joachims, L. A. Granka, B. Pan, H. Hembrooke, F. Radlinski, and G. Gay. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search. ACM Transactions on Information Systems (TOIS), 25(2), 2007.
    • O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. Large-scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems (TOIS), 30(1), 2012.
    • T. Joachims. Evaluating Retrieval Performance using Clickthrough Data. In Text Mining. Physica/Springer, 2003.
    • F. Radlinski, M. Kurup, and T. Joachims. How does clickthrough data reflect retrieval quality? In CIKM '08. ACM Press, 2008.
    • J. He, C. Zhai, and X. Li. Evaluation of methods for relative comparison of retrieval systems based on clickthroughs. In CIKM '09. ACM Press, 2009.
    • K. Hofmann, S. Whiteson, and M. de Rijke. A probabilistic method for inferring preferences from clicks. In CIKM '11. ACM Press, 2011.
    • E. Kharitonov, C. Macdonald, P. Serdyukov. Using Historical Click Data to Increase Interleaving Sensitivity. In CIKM '13. ACM Press, 2013.
    • F. Radlinski and N. Craswell. Optimized interleaving for online retrieval evaluation. In WSDM '13. ACM Press, 2013.
    • E. Kharitonov, C. Macdonald, P. Serdyukov, and I. Ounis. Generalized Team Draft Interleaving. In CIKM '15. ACM Press, 2015.