References
[Brown+,20] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan,
Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al.
2020. Language models are few-shot learners. In Advances in Neural Information Processing
Systems, volume 33, pages 1877–1901. Curran Associates, Inc.
[Wei+,22] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia,
Ed Chi, Quoc V Le, and Denny Zhou. 2022. Chain-of-thought prompting elicits reasoning in
large language models. In Advances in Neural Information Processing Systems, volume 35,
pages 24824–24837. Curran Associates, Inc.
[Cobbe+,21] Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Jacob Hilton, Reiichiro
Nakano, Christopher Hesse, and John Schulman. 2021. Training verifiers to solve math word
problems. arXiv preprint arXiv:2110.14168.
[Ouyang+,22] Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L Wainwright, Pamela
Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training
language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155.