George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, and Tom Le Paine. “Benchmarks for Deep Off-Policy Evaluation.” ICLR, 2021. https://arxiv.org/abs/2103.16596 [Voloshin+,21 (COBS)] Cameron Voloshin, Hoang M. Le, Nan Jiang, and Yisong Yue. “Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning.” NeurIPS dataset&benchmark, 2021. https://arxiv.org/abs/1911.06854 [Rohde+,18 (RecoGym)] David Rohde, Stephen Bonner, Travis Dunlop, Flavian Vasile, and Alexandros Karatzoglou “RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising.” 2018. https://arxiv.org/abs/1808.00720 September 2022 OFRL: Offline RL and OPE platform @ CONSEQUENCES+REVEAL WS 47