Slide 58
References (1/2)
[Saito+,21] Yuta Saito, Shunsuke Aihara, Megumi Matsutani, and Yusuke Narita.
“Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy
Evaluation.” NeurIPS Datasets and Benchmarks Track, 2021. https://arxiv.org/abs/2008.07146
[Li+,18] Shuai Li, Yasin Abbasi-Yadkori, Branislav Kveton, S. Muthukrishnan, Vishwa
Vinay, and Zheng Wen. “Offline Evaluation of Ranking Policies with Click Models.”
KDD, 2018. https://arxiv.org/abs/1804.10488
[McInerney+,20] James McInerney, Brian Brost, Praveen Chandar, Rishabh Mehrotra,
and Ben Carterette. “Counterfactual Evaluation of Slate Recommendations with
Sequential Reward Interactions.” KDD, 2020. https://arxiv.org/abs/2007.12986
[Strehl+,10] Alex Strehl, John Langford, Sham Kakade, and Lihong Li. “Learning from
Logged Implicit Exploration Data.” NeurIPS, 2010. https://arxiv.org/abs/1003.0120
[Athey&Imbens,16] Susan Athey and Guido Imbens. “Recursive Partitioning for
Heterogeneous Causal Effects.” PNAS, 2016. https://arxiv.org/abs/1504.01132
August 2023 Adaptive OPE of Ranking Policies @ KDD'23 58