Slide 53
① Viewing the feed-forward network as a knowledge-storage module
[28] Mor Geva et al. (2021) “Transformer Feed-Forward Layers Are Key-Value Memories”
Treating the feed-forward network as a memory device
§ The feed-forward network (a 2-layer MLP) is similar to the attention mechanism
[Figure: a Feed-Forward Network (FFN) as a key-value memory — the Hidden State is matched against FFN(key) by inner product, the resulting Activation weights the FFN(val) vectors, and their weighted sum gives the FFN Output. In the example, the input "The capital of Ireland is [MASK]" passes through L stacked blocks of Self-Attention Layer + Feed-Forward Network, and Knowledge Neurons inside the FFN drive the prediction "Dublin".]
Figure 2: Illustration of how an FFN module in a Transformer block works as a key-value memory. The first linear layer FFN(key) computes intermediate neurons through inner product. Taking the activation of these neurons as weights, the second linear layer FFN(val) integrates value vectors through weighted sum. We hypothesize that knowledge neurons in the FFN module are responsible for expressing factual knowledge.
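To make this key-value reading concrete, here is a minimal PyTorch sketch (not from the slide; the sizes d_model and d_ff and the random weights are illustrative assumptions): each column of W1 acts as a key matched against the hidden state by an inner product, and each row of W2 is the value vector weighted by that neuron's activation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_model, d_ff = 8, 32              # illustrative sizes (assumption, not from the slide)
W1 = torch.randn(d_model, d_ff)    # each column of W1 acts as a "key"
W2 = torch.randn(d_ff, d_model)    # each row of W2 acts as a "value vector"
h  = torch.randn(d_model)          # hidden state of one token

# Standard 2-layer FFN: FFN(h) = gelu(h W1) W2
ffn_out = F.gelu(h @ W1) @ W2

# Key-value-memory view: inner product with every key -> activation,
# then a weighted sum of the value vectors using those activations.
activations = F.gelu(h @ W1)                              # one coefficient per neuron
memory_out  = sum(activations[i] * W2[i] for i in range(d_ff))

print(torch.allclose(ffn_out, memory_out, atol=1e-5))     # True: identical computation
```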
[Figure: a single Attention head — the Query vector is compared with the Key vectors by inner product to give the Attention weights, and the output is the weighted sum of the Value vectors.]
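Reading the figure above as an equation (a standard per-query expansion; the symbols q, k_i, v_i, α_i are our notation, not text from the slide): inner products of the query with the key vectors give the attention weights, and the head output is a weighted sum of the value vectors.

\[
\mathrm{Att}(q) \;=\; \sum_{i} \alpha_i\, v_i,
\qquad
\alpha_i \;=\; \frac{\exp(q \cdot k_i)}{\sum_{j} \exp(q \cdot k_j)} .
\]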
2 Background: Transformer

Transformer (Vaswani et al., 2017) is one of the most popular and effective NLP architectures. A Transformer encoder is stacked with L identical blocks. Each Transformer block mainly contains two modules: a self-attention module and a feed-forward network (abbreviated as FFN) module. Let $X \in \mathbb{R}^{n \times d}$ denote the input matrix; the two modules can be formulated as follows:

\[
Q_h = X W^{Q}_{h}, \quad K_h = X W^{K}_{h}, \quad V_h = X W^{V}_{h}, \tag{1}
\]
\[
\text{Self-Att}_h(X) = \operatorname{softmax}\!\left(Q_h K_h^{\top}\right) V_h, \tag{2}
\]
\[
\operatorname{FFN}(H) = \operatorname{gelu}\!\left(H W_1\right) W_2, \tag{3}
\]

where $W^{Q}_{h}$, $W^{K}_{h}$, $W^{V}_{h}$, $W_1$, $W_2$ are parameter matrices, and $\text{Self-Att}_h(X)$ computes a single attention head.
Q, K, V (weight matrices) — the self-attention of Eq. (2) and the FFN of Eq. (3) take a very similar form.
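Spelling out the "very similar" annotation (this side-by-side follows the reading of [28] Geva et al. and is not verbatim from the slide): both modules compute inner products with keys, turn them into coefficients, and return a weighted sum of values. The differences are that the FFN's keys (columns of W1) and values (rows of W2) are static parameters rather than computed from the input, and the coefficients use an unnormalized gelu instead of a softmax.

\[
\underbrace{\operatorname{softmax}\!\left(Q_h K_h^{\top}\right) V_h}_{\text{Eq. (2): self-attention}}
\quad\longleftrightarrow\quad
\underbrace{\operatorname{gelu}\!\left(H W_1\right) W_2}_{\text{Eq. (3): FFN}}
\]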
… the effectiveness of the proposed knowledge attribution method. First, suppressing and amplifying knowledge neurons notably affects the expression of the corresponding knowledge. Second, we find that knowledge neurons of a fact tend to be activated more by corresponding knowledge-expressing prompts. Third, given the knowledge neurons of a fact, the top activating prompts retrieved from open-domain texts usually express the corresponding fact, while the bottom activating prompts do not express the correct relation.

In our case studies, we try to leverage knowledge neurons to explicitly edit factual knowledge in pretrained Transformers without any fine-tuning. We present two preliminary studies: updating facts, and erasing relations. After identifying the knowledge neurons, we perform a knowledge surgery for pretrained Transformers by directly modifying …
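As a rough operational picture of what "suppressing and amplifying knowledge neurons" means, here is a minimal sketch over a generic 2-layer FFN (the neuron index and the 0×/2× scaling factors are illustrative assumptions, not the paper's exact procedure):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_model, d_ff = 8, 32
W1 = torch.randn(d_model, d_ff)
W2 = torch.randn(d_ff, d_model)
h  = torch.randn(d_model)          # hidden state of one token

def ffn(h, neuron_scale=None):
    """2-layer FFN; neuron_scale maps neuron index -> factor (0.0 suppresses, 2.0 amplifies)."""
    act = F.gelu(h @ W1)           # intermediate activations = candidate knowledge neurons
    if neuron_scale:
        for idx, scale in neuron_scale.items():
            act[idx] = act[idx] * scale
    return act @ W2

out_original = ffn(h)
out_suppress = ffn(h, {7: 0.0})    # zero out neuron 7 (hypothetical knowledge neuron)
out_amplify  = ffn(h, {7: 2.0})    # double neuron 7's activation

# Suppressing shifts the output by exactly minus that neuron's activation times its value vector (row 7 of W2).
delta = out_suppress - out_original
print(torch.allclose(delta, -F.gelu(h @ W1)[7] * W2[7], atol=1e-5))   # True
```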