References (1/2)

• segmentation." European Conference on Computer Vision, Cham: Springer Nature Switzerland, 2022.
• Dosovitskiy, Alexey, et al. "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale." International Conference on Learning Representations, 2020.
• Cho, S., Hong, S., Jeon, S., Lee, Y., Sohn, K., Kim, S. "CATs: Cost Aggregation Transformers for Visual Correspondence." Thirty-Fifth Conference on Neural Information Processing Systems, 2021.
• Cheng, H. K., Chung, J., Tai, Y.-W., Tang, C.-K. "CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement." CVPR, 2020.
• Shaban, Amirreza, et al. "One-Shot Learning for Semantic Segmentation." arXiv preprint arXiv:1709.03410, 2017.
• Wu, Chuhan, et al. "Fastformer: Additive Attention Can Be All You Need." arXiv preprint arXiv:2108.09084, 2021.
• Katharopoulos, Angelos, et al. "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention." International Conference on Machine Learning, PMLR, 2020.