Slide 20
Compared with SE-Net and SK-Net...
Relation to Existing Attention Methods. First introduced in SE-Net [29], the idea of
squeeze-and-attention (called excitation in the original paper) is to employ a global context to
predict channel-wise attention factors. With radix = 1, our Split-Attention block is applying a
squeeze-and-attention operation to each cardinal group, while the SE-Net operates on top of the
entire block regardless of multiple groups. Previous models like SK-Net [38] introduced feature
attention between two network branches, but their operation is not optimized for training
efficiency and scaling to large neural networks. Our method generalizes prior work on
feature-map attention [29, 38] within a cardinal group setting [60], and its implementation
remains computationally efficient. Figure 1 shows an overall comparison with SE-Net and
SK-Net blocks.
(Quoted from the paper)
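To make the radix = 1 comparison concrete, here is a minimal PyTorch sketch (my own illustration, not the ResNeSt reference code): with groups=1 the module reduces to plain SE-Net squeeze-and-attention over the whole block, while groups=K uses grouped 1x1 convolutions so that each of the K cardinal groups gets its own independent squeeze-and-attention, as the quoted passage describes. The class name `GroupedSqueezeAttention`, the `reduction` ratio, and the layer sizes are hypothetical choices for illustration.

```python
import torch
import torch.nn as nn

class GroupedSqueezeAttention(nn.Module):
    """SE-style channel attention, applied per cardinal group.

    groups=1 ~ SE-Net: one global squeeze-and-attention over the block.
    groups=K ~ radix = 1 Split-Attention: independent attention per group,
    realized here with grouped 1x1 convolutions (a simplified sketch).
    """
    def __init__(self, channels: int, groups: int = 1, reduction: int = 4):
        super().__init__()
        assert channels % groups == 0
        assert (channels // reduction) % groups == 0
        # Squeeze: global average pooling collapses spatial context.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excite: bottleneck MLP (as 1x1 convs); groups>1 keeps the
        # attention factors of each cardinal group independent.
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, groups=groups),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, groups=groups),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel-wise attention factors predicted from global context,
        # then broadcast-multiplied back onto the feature map.
        w = self.fc(self.pool(x))
        return x * w

x = torch.randn(2, 64, 8, 8)
se_like = GroupedSqueezeAttention(64, groups=1)    # whole-block, SE-Net style
per_group = GroupedSqueezeAttention(64, groups=2)  # per cardinal group
print(se_like(x).shape, per_group(x).shape)        # both: torch.Size([2, 64, 8, 8])
```

The only structural difference between the two instances is the groups argument of the 1x1 convolutions, which is what keeps the per-group variant as cheap as the whole-block one; this mirrors the paper's claim that the generalization remains computationally efficient.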