for time series forecasting: The influenza prevalence case." arXiv preprint arXiv:2001.08317 (2020).

[Yu, 2020] Yu, Cunjun, Xiao Ma, Jiawei Ren, Haiyu Zhao, and Shuai Yi. "Spatio-temporal graph transformer networks for pedestrian trajectory prediction." In European Conference on Computer Vision, pp. 507-523. Springer, Cham, 2020.

[Vaswani, 2017] Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." In Advances in Neural Information Processing Systems, pp. 5998-6008, 2017.

[Yu, 2017] Yu, Fisher, Vladlen Koltun, and Thomas Funkhouser. "Dilated residual networks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.

[Chen, 2019] Chen, Nanxin, et al. "Listen and fill in the missing letters: Non-autoregressive transformer for speech recognition." arXiv preprint (2019).

[Li, 2019] Li, Shiyang, et al. "Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting." Advances in Neural Information Processing Systems 32 (2019): 5243-5253.

[Taylor, 2018] Taylor, Sean J., and Benjamin Letham. "Forecasting at scale." The American Statistician 72.1 (2018): 37-45.

[Bahdanau, 2014] Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).

[Lai, 2018] Lai, Guokun, et al. "Modeling long- and short-term temporal patterns with deep neural networks." In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018.

[Salinas, 2020] Salinas, David, et al. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks." International Journal of Forecasting 36.3 (2020): 1181-1191.

[Kitaev, 2019] Kitaev, Nikita, Łukasz Kaiser, and Anselm Levskaya. "Reformer: The efficient transformer." In International Conference on Learning Representations (ICLR), 2020.