Slide 32
References
• [Gu+, NeurIPS20]: Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, and Christopher Ré. “HiPPO: Recurrent Memory with Optimal Polynomial Projections”. NeurIPS, 2020.
• [Gu+, NeurIPS21]: Albert Gu, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, and Christopher Ré. “Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers”. NeurIPS, 2021.
• [Gu+, ICLR22]: Albert Gu, Karan Goel, and Christopher Ré. “Efficiently Modeling Long Sequences with Structured State Spaces”. ICLR, 2022.
• [Gu+, NeurIPS22]: Albert Gu, Ankit Gupta, Karan Goel, and Christopher Ré. “On the Parameterization and Initialization of Diagonal State Space Models”. NeurIPS, 2022.
• [Ma+, ICLR23]: Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, and Luke Zettlemoyer. “Mega: Moving Average Equipped Gated Attention”. ICLR, 2023.
• [Smith+, ICLR23]: Jimmy T.H. Smith, Andrew Warrington, and Scott Linderman. “Simplified State Space Layers for Sequence Modeling”. ICLR, 2023.
• [Mehta+, ICLR23]: Harsh Mehta, Ankit Gupta, Ashok Cutkosky, and Behnam Neyshabur. “Long Range Language Modeling via Gated State Spaces”. ICLR, 2023.
• [Tay+, ICLR21]: Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, and Donald Metzler. “Long Range Arena: A Benchmark for Efficient Transformers”. ICLR, 2021.