Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998–6008, 2017. https://arxiv.org/abs/1706.03762

Transformer-XL: Zihang Dai, Zhilin Yang, Yiming Yang, William W. Cohen, Jaime Carbonell, Quoc V. Le, and Ruslan Salakhutdinov. Transformer-XL: Attentive language models beyond a fixed-length context. In ACL, 2019. https://aclanthology.org/P19-1285.pdf

Meena: Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, and Quoc V. Le. Towards a human-like open-domain chatbot. CoRR, abs/2001.09977, 2020. https://arxiv.org/abs/2001.09977

T5: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67, 2020. https://arxiv.org/abs/1910.10683

RETRO: Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, and Laurent Sifre. Improving language models by retrieving from trillions of tokens. arXiv preprint arXiv:2112.04426, 2021. https://proceedings.mlr.press/v162/borgeaud22a.html

Safety: Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, and Iason Gabriel. Ethical and social risks of harm from language models. arXiv preprint arXiv:2112.04359, 2021. https://arxiv.org/abs/2112.04359