[28] Mor Geva et al. (2021) "Transformer Feed-Forward Layers Are Key-Value Memories" — viewing the feed-forward network as a memory device

§ The feed-forward network (a 2-layer MLP) is similar to the attention mechanism

[Figure 2 of the screenshotted paper: Illustration of how an FFN module in a Transformer block works as a key-value memory. The first linear layer FFN(key) computes intermediate neurons through inner product. Taking the activation of these neurons as weights, the second linear layer FFN(val) integrates value vectors through weighted sum. We hypothesize that knowledge neurons in the FFN module are responsible for expressing factual knowledge. The figure contrasts the FFN pathway (Hidden State → FFN(key) → Activation → weighted sum over FFN(val) → FFN Output, illustrated on the prompt "The capital of Ireland is [MASK]" → "Dublin") with an attention head (key vectors → inner product → attention weights → weighted sum over value vectors).]

Excerpt visible in the screenshot:

"... the effectiveness of the proposed knowledge attribution method. First, suppressing and amplifying knowledge neurons notably affects the expression of the corresponding knowledge. Second, we find that knowledge neurons of a fact tend to be activated more by corresponding knowledge-expressing prompts. Third, given the knowledge neurons of a fact, the top activating prompts retrieved from open-domain texts usually express the corresponding fact, while the bottom activating prompts do not express the correct relation. In our case studies, we try to leverage knowledge neurons to explicitly edit factual knowledge in pretrained Transformers without any fine-tuning. We present two preliminary studies: updating facts, and erasing relations. After identifying the knowledge neurons, we perform a knowledge surgery for pretrained Transformers by directly modifying [...] in Transformers, even without any fine-tuning."
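To make the key-value-memory reading above concrete, here is a minimal NumPy sketch (not the paper's code; the names ffn_key, ffn_val, hidden and the sizes are illustrative assumptions): the rows of the first FFN matrix act as keys matched against the hidden state by inner product, and the resulting activations weight a sum over the rows of the second matrix, which act as values.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU (assumption: the exact variant is not specified here)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

d_model, d_ff = 8, 32                            # illustrative sizes
rng = np.random.default_rng(0)

ffn_key = rng.standard_normal((d_ff, d_model))   # rows of W1^T: one "key" per intermediate neuron
ffn_val = rng.standard_normal((d_ff, d_model))   # rows of W2:   one "value" per intermediate neuron

hidden = rng.standard_normal(d_model)            # hidden state of one token

# 1) FFN(key): inner products of the hidden state with every key -> neuron activations
activations = gelu(ffn_key @ hidden)             # shape (d_ff,)

# 2) FFN(val): activations act as weights in a weighted sum over the value vectors
ffn_output = activations @ ffn_val               # shape (d_model,)

# Same computation as the ordinary 2-layer MLP  gelu(h W1) W2  with W1 = ffn_key.T, W2 = ffn_val
mlp_output = gelu(hidden @ ffn_key.T) @ ffn_val
assert np.allclose(ffn_output, mlp_output)
```

The final assert only confirms that this "memory lookup" view is literally the same computation as the usual two-layer MLP; nothing about the network changes, only the interpretation.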
2 Background: Transformer

Transformer (Vaswani et al., 2017) is one of the most popular and effective NLP architectures. A Transformer encoder is stacked with $L$ identical blocks. Each Transformer block mainly contains two modules: a self-attention module and a feed-forward network (abbreviated as FFN) module. Let $X \in \mathbb{R}^{n \times d}$ denote the input matrix; the two modules can be formulated as follows:

$$Q_h = X W_h^Q, \quad K_h = X W_h^K, \quad V_h = X W_h^V, \tag{1}$$

$$\mathrm{Self\text{-}Att}_h(X) = \mathrm{softmax}\!\left(Q_h K_h^{\top}\right) V_h, \tag{2}$$

$$\mathrm{FFN}(H) = \mathrm{gelu}(H W_1)\, W_2, \tag{3}$$

where $W_h^Q$, $W_h^K$, $W_h^V$, $W_1$, $W_2$ are parameter matrices, and $\mathrm{Self\text{-}Att}_h(X)$ computes a single attention head.

Slide annotations on the equations: Q, K, V; (weight matrices) ×2 — the attention and FFN formulations are very similar.

Attention mechanism (per query vector)
• A query vector is given as input
• Attention weights are computed as inner products with the key vectors (the pieces of context information)
• Each value vector is multiplied by its corresponding attention weight and the results are summed (see the sketch below)
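The same comparison in code: a rough NumPy sketch of the single-head attention of Eqs. (1)-(2) next to the FFN of Eq. (3). Matrix names and sizes are illustrative assumptions, and the usual 1/√d scaling is omitted to match the formulation above. Both follow the pattern "inner products produce weights, weights mix value vectors"; the difference is that attention's keys and values are computed from the input, while the FFN's keys (columns of W1) and values (rows of W2) are fixed learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gelu(x):
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

n, d = 4, 8                                   # illustrative: 4 tokens, model width 8
rng = np.random.default_rng(1)
X = rng.standard_normal((n, d))

# --- Self-attention head, Eqs. (1)-(2): queries score keys, weights mix values ---
W_Q, W_K, W_V = (rng.standard_normal((d, d)) for _ in range(3))
Q, K, V = X @ W_Q, X @ W_K, X @ W_V
attn_weights = softmax(Q @ K.T)               # inner products query·key -> attention weights
attn_out = attn_weights @ V                   # weighted sum over the value vectors

# --- FFN, Eq. (3): hidden states score fixed keys (W1), activations mix fixed values (W2) ---
d_ff = 16
W1, W2 = rng.standard_normal((d, d_ff)), rng.standard_normal((d_ff, d))
neuron_acts = gelu(X @ W1)                    # inner products hidden·key -> neuron activations
ffn_out = neuron_acts @ W2                    # weighted sum over the value rows of W2

print(attn_out.shape, ffn_out.shape)          # both (4, 8): the same weights-times-values pattern
```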