Slide 77
References (4/6)
[C. Gulcehre+, 2020] Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine,
Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel
Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi,
Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, and Nando de Freitas.
“RL Unplugged: Benchmarks for Offline Reinforcement Learning”. arXiv preprint,
2020. https://arxiv.org/abs/2006.13888
[A. Kumar, 2019] Aviral Kumar. “Data-Driven Deep Reinforcement Learning”. BAIR blog,
2019. https://bair.berkeley.edu/blog/2019/12/05/bear/
[A. Kendall+, 2019] Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur,
Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley, and Amar Shah.
“Learning to Drive in a Day”. ICRA, 2019. https://arxiv.org/abs/1807.00412
[H. Zhu+, 2017] Han Zhu, Junqi Jin, Chang Tan, Fei Pan, Yifan Zeng, Han Li, and Kun
Gai. “Optimized Cost per Click in Taobao Display Advertising”. KDD, 2017.
https://arxiv.org/abs/1703.02091
[B. Zoph & Q. V. Le, 2016] Barret Zoph and Quoc V. Le. “Neural Architecture Search
with Reinforcement Learning”. ICLR, 2017. https://arxiv.org/abs/1611.01578
2021/03/22 Offline Reinforcement Learning Tutorial @ 強化学習若手の会 77