Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. Attention Is All You Need. In Neural Information Processing Systems (NIPS), 2017.

David R. So, Wojciech Manke, Hanxiao Liu, Zihang Dai, Noam Shazeer, Quoc V. Le. Primer: Searching for Efficient Transformers for Language Modeling. In Neural Information Processing Systems (NeurIPS), 2021.