Slide 7
Lambda Layer
A Transformer variant specialized for images
Captures the relationships among all pixels
at a lower computational cost than ViT
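Apply convolution to $\boldsymbol{h}$ to generate the query, key, and value
$Q = \mathrm{Conv}(\boldsymbol{h}), \quad V = \mathrm{Conv}(\boldsymbol{h}), \quad K = \mathrm{Softmax}(\mathrm{Conv}(\boldsymbol{h}))$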
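The query/key/value step above can be sketched in NumPy. This is a minimal illustration under assumptions, not the authors' implementation: each convolution is taken to be a 1x1 convolution (so it reduces to a per-pixel matrix product on the flattened feature map), the softmax on the key is assumed to normalize over pixel positions, and the names `make_qkv`, `w_q`, `w_k`, `w_v` and all sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    # Subtract the per-axis max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def make_qkv(h, w_q, w_k, w_v):
    """h: (n, d) flattened feature map; w_q, w_k: (d, k); w_v: (d, v)."""
    q = h @ w_q                    # query, shape (n, k)
    k = softmax(h @ w_k, axis=0)   # key, shape (n, k); normalized over the n pixels (assumed axis)
    v = h @ w_v                    # value, shape (n, v)
    return q, k, v

# Hypothetical sizes: a 4x4 feature map (n = 16 pixels) with 8 channels.
rng = np.random.default_rng(0)
n, d, k_dim, v_dim = 16, 8, 4, 6
h = rng.normal(size=(n, d))
q, k, v = make_qkv(
    h,
    rng.normal(size=(d, k_dim)),
    rng.normal(size=(d, k_dim)),
    rng.normal(size=(d, v_dim)),
)
print(q.shape, k.shape, v.shape)  # (16, 4) (16, 4) (16, 6)
```

Only the key is normalized; the query and value stay as plain linear projections of the input features.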
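Apply convolution to the value to generate $\boldsymbol{\lambda}_p$
Compute the product of key and value to generate $\boldsymbol{\lambda}_c$
$\boldsymbol{\lambda}_p = \mathrm{Conv}(V), \quad \boldsymbol{\lambda}_c = K^\top V$
Compute the output $\boldsymbol{h}'$ by the following equation:
$\boldsymbol{h}' = (\boldsymbol{\lambda}_p + \boldsymbol{\lambda}_c)^\top Q$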
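The full computation on this slide can be sketched end to end in NumPy. This is a rough sketch under stated assumptions rather than the reference implementation: the query/key/value convolutions are modeled as 1x1 convolutions, the key softmax is taken over pixel positions, and the convolution applied to the value is modeled as a local r x r convolution with zero padding whose kernel `e` is shared across value channels. The names `lambda_layer`, `lam_p` (the lambda obtained by convolving the value), `lam_c` (the lambda obtained from the key-value product), `e`, and all sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    # Subtract the per-axis max before exponentiating for numerical stability.
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

def lambda_layer(h, w_q, w_k, w_v, e):
    """h: (H, W, d); w_q, w_k: (d, k); w_v: (d, v); e: (r, r, k) with r odd."""
    H, W, d = h.shape
    r, k_dim = e.shape[0], e.shape[2]
    pad = r // 2
    x = h.reshape(H * W, d)

    q = x @ w_q                      # (n, k)
    k = softmax(x @ w_k, axis=0)     # (n, k), normalized over the n pixels
    v = x @ w_v                      # (n, v)
    v_dim = v.shape[1]

    # Lambda from the key-value product: one (k, v) matrix shared by all pixels.
    lam_c = k.T @ v

    # Lambda from convolving the value: one (k, v) matrix per pixel.
    v_img = np.pad(v.reshape(H, W, v_dim), ((pad, pad), (pad, pad), (0, 0)))
    lam_p = np.zeros((H, W, k_dim, v_dim))
    for i in range(r):
        for j in range(r):
            patch = v_img[i:i + H, j:j + W]                       # (H, W, v)
            lam_p += e[i, j][None, None, :, None] * patch[:, :, None, :]
    lam_p = lam_p.reshape(H * W, k_dim, v_dim)

    # Output: apply the summed lambda (transposed) to each pixel's query.
    lam = lam_c[None] + lam_p                                     # (n, k, v)
    out = np.einsum("nkv,nk->nv", lam, q)
    return out.reshape(H, W, v_dim)

# Hypothetical sizes: a 4x4 feature map with 8 channels and a 3x3 kernel.
rng = np.random.default_rng(0)
H, W, d, k_dim, v_dim, r = 4, 4, 8, 4, 6, 3
h = rng.normal(size=(H, W, d))
w_q = rng.normal(size=(d, k_dim))
w_k = rng.normal(size=(d, k_dim))
w_v = rng.normal(size=(d, v_dim))
e = rng.normal(size=(r, r, k_dim))
out = lambda_layer(h, w_q, w_k, w_v, e)
print(out.shape)  # (4, 4, 6)
```

In this sketch the key-value product is a small k x v matrix whatever the number of pixels, so no n x n attention map is ever formed. That is one way to read the slide's claim of lower cost than ViT.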
Related Work: Lambda Networks [Bello+, ICLR21]