Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, and Nando de Freitas. “RL Unplugged: Benchmarks for Offline Reinforcement Learning”. arXiv preprint, 2020. https://arxiv.org/abs/2006.13888
[A. Kumar, 2019] A. Kumar. “Data-Driven Deep Reinforcement Learning”. BAIR blog, 2019. https://bair.berkeley.edu/blog/2019/12/05/bear/
[A. Kendall+, 2019] Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur, Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley, and Amar Shah. “Learning to Drive in a Day”. ICRA, 2019. https://arxiv.org/abs/1807.00412
[H. Zhu+, 2017] Han Zhu, Junqi Jin, Chang Tan, Fei Pan, Yifan Zeng, Han Li, and Kun Gai. “Optimized Cost per Click in Taobao Display Advertising”. KDD, 2017. https://arxiv.org/abs/1703.02091
[B. Zoph & Q. V. Le, 2016] Barret Zoph and Quoc V. Le. “Neural Architecture Search with Reinforcement Learning”. ICLR, 2016. https://arxiv.org/abs/1611.01578
2021/03/22 Offline Reinforcement Learning Tutorial @ 強化学習若手の会