Slide 17
References
• Dai, Z. et al., Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, Proc. ACL (2019)
• Gehring, J. et al., Convolutional Sequence to Sequence Learning, Proc. ICML (2017)
• Huang, C.-Z. et al., Music Transformer: Generating Music with Long-Term Structure, Proc. ICLR (2019)
• Ko, T. et al., Audio Augmentation for Speech Recognition, Proc. Interspeech (2015)
• Park, D. S. et al., SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, Proc. Interspeech (2019)
• Pham, N.-Q. et al., Very Deep Self-Attention Networks for End-to-End Speech Recognition, Proc. Interspeech (2019a)
• Pham, N.-Q. et al., The IWSLT 2019 KIT Speech Translation System, Proc. IWSLT (2019b)
• Shaw, P. et al., Self-Attention with Relative Position Representations, Proc. NAACL-HLT (2018)
• Sukhbaatar, S. et al., End-To-End Memory Networks, Proc. NIPS (2015)
• Takase, S. et al., Positional Encoding to Control Output Sequence Length, Proc. NAACL-HLT (2019)
• Vaswani, A. et al., Attention Is All You Need, Proc. NIPS (2017)