LiberalArts
June 05, 2021
1.1k

# Inductive Bias and Graph Networks

Inductive biasとGraph Networkに基づくDeepLearningのトレンドについて取りまとめを行いました。

（細かい記載に誤りがある可能性がありますので、なるべく出典を参照しつつご確認ください。）

なお、完成版は下記で取り扱いました。
https://lib-arts.booth.pm/items/3162254

June 05, 2021

## Transcript

1. ### 7 痥 2 畍 Graph Network 睗 2 皹ךעյ"Relational inductive

biases, deep learning, and graph networks"؅⩧מ Graph Network כ Inductive bias מחַי 牞霼׊ױ׌ն2-1 硼ך Inductive biasյ2-2 硼ך Graph Network מח ַיא׿ב׿牞霼׊ג䔿מյ2-3 硼ךאסױכ״׷脝㷋؅鉿ַױ׌ն 2.1 Inductive bias הכ 2.1.1 Inductive bias ך嚊銲 Inductive bias(䊟等溷فؕؓت) עրDeepLearning סؾشع٠٭ؠ圸ꅎ ؅脝ֻ׾מֵגזי陭㴻׌׾⯼䳀ցכ槏闋׌׾כ虘ַ־כ䘼ַױ׌նגכֻ ף氺⦐מ㸐׊י CNN ؅氠ַ׾갾עյ ր氺⦐עꁿ⤒סمؠجٜכ潸◦✑氠׊ יַ׾סךյنٜؔذ⭦槏؅鉿ֹ׆כך杅䖇ꓪ؅䬂⮂ך׀׾ցכַֹ⯼䳀؅ 氠ַיֽ׽յ׆׿؅ Inductive bias כײם׌׆כֿך׀ױ׌նCNN מꮹ׼ ׍յRNN ךע笠⮬ظ٭ذֿ꽄沁מ潸◦✑氠؅׊יַׂ坎㲳؅銨׊יֽ׽յ ׆׿׵ Inductive bias כ脝ֻ׾׆כֿך׀ױ׌ն ׆ס׻ֹמյInductive bias עⷃם׾㳔肪כעױגꇙזג镄掾־׼㳔肪 מֵגזיסقنؚ٭ُ٤ت⻔┪؅雧ײ׾⺅׽篁ײכ脝ֻ׾׆כֿך׀ױ 7
2. ### 8 痥 2 畍 Graph Network 2.1 Inductive bias הכ

׌ն׆׆ך峜䟨׊גַסֿ bias כ臝ׂכꇃ㳔肪 (overﬁtting) ؅䞯鱍׊㎇ꉌ ׌׬׀כַֹⷦ骭׵⺇ׄױ׌ֿյ րInductive bias ؅㎇ꉌ׌׾ցכַֹ׻׽ עրInductive bias ؅嵛氠׌׾ցכַֹ镄掾־׼饗韢׈׿׾׆כֿ㝂ַ׆כ ך׌ն 嚣锡؅䲖׳מֵגזיעյ րלס׻ֹם⯼䳀؅ַֽיٓةٖ٭ٜٝيٜך לס׻ֹם⭦槏圸ꅎ؅脝ֻ׾־ց؅⺅׽䪒ֹסֿ Inductive bias דכ脝ֻ׾ סֿ虘ַ־כ䘼ַױ׌ն 2.1.2 Graph Network ך锷俑ך鎸鯹 [2018] Inductive bias מחַיס饗韢ע♧⯼׻׽׈׿יַױ׌ֿյꁿ䌑ס DeepLearning ס煝疴מֽׄ׾ Inductive bias מחַיס脝㷋מֵגזי䫅 ֻיֽׂכ虘ַסֿ Graph Network[2018] ס韢倀ך׌ն ㎫ 2.1: Graph Network 韢倀 ٬Relational inductive biases, deep learning, and graph networks https://arxiv.org/abs/1806.01261 ׆ס韢倀ךע Graph Neural Network מꫀꅙ׌׾ MPNN(Message Pass- 8
3. ### 9 痥 2 畍 Graph Network 2.1 Inductive bias הכ

ing Neural Network) ׷յ Transformer מꫀꅙ׌׾ NLNN(Non-local Neural Network) םל؅簡⻉溷מ⺅׽䪒ֹنٝ٭ّ٠٭ؠס Graph Network ؅㸬 ⪜׌׾מֵגזיյInductive bias מחַי⻎免מ饗韢ֿ鉿؂׿יַױ׌ն ㎫ 2.2: Inductive bias ס阾鼥 (Graph Network 韢倀׻׽) ┪阾ֿ Graph Network 韢倀מֽׄ׾ Inductive bias ס阾鼥ך׌ֿյ糽ך 虝♕ׄ׊גכ׆؀ֿ嚣锡؅䲖׳מֵגזיעꓨ锡ך׌ն րInductive bias ؅ 氠ַ׾׆כך㳔肪ٜؓإٛثّע⨲⩰꽄⛺؅尴״׾׆כֿך׀׾ցכ׈׿י ֽ׽յ2-1-1 硼ך牞霼׊ג⫐㵼כ⻉蔹׊ױ׌ն 9
4. ### 1 0 痥 2 畍 Graph Network 2.1 Inductive bias

הכ ㎫ 2.3: Table.1(Graph Network 韢倀׻׽) ㎫ 2.3 ֿ Graph Network 韢倀ס Table.1 ך׌ֿյDeepLearning סٓ ةٖ٭ٜ (׆׆ךע Component כ銨槁׈׿יַ׾) ׇכס Inductive bias מחַי饗韢ֿ鉿؂׿יַױ׌ն׆׆ך銨מַֽי Entity ע阛砯ס㸐骭յ Relation ע Entity סꫀꅙ䙎յInductive bias ע⭦槏圸ꅎמֽׄ׾⯼䳀מח ַי銨槁׈׿יַױ׌ն沑ײꁎײ (Convolutional) ע㹾䨾溷յٛ؜ٝ٤ع (Recurrent) עꅙ籽溷 (Sequential)յGraph Network ע⚈䟨 (Arbitrary) כ 銨阾׈׿יַ׾סע䫅ֻיֽׂכ虘ַכ䘼ַױ׌ն ׆׿׼ס雄❿מֵגזיעյ沑ײꁎײ׷ٛ؜ٝ٤عמ♳㴻׌׾ Inductive bias ע㝕׀ׂյGraph Network ע⚈䟨ךյFully connected מ♳㴻׌׾ Inductive bias ע㸯׈ַ׆כֿ鞅ײ⺅׿ױ׌ն׆׆ך׈׼ם׾圸ꅎ溷ם䱱筺 ؅鉿ֹמֵגזיע Graph Network ؅氠ַ׾סֿ┞薭溷דכ䘼ַױ׌ֿյ ┞䍲 Fully connected ؅╚䖥מنؚ٭؜ت׊י Inductive bias ؅㸯׈ׂ׊ג ┪ך䱱寛׌׾כַֹס׵脝ֻ偙ס┞חדכ䘼ַױ׌ն ׆׿מꫀꅙ׊י⮂י׀גסֿ MLP-Mixer ׷ gMLP דכ脝ֻי虘ַכ 䘼ַױ׌ֿյ׆ס MLP-Mixer ׷ gMLP ׵ Graph Network ס阾岺ך銨׎ ׾כ脝ֻ׼׿ױ׌סךյא׿ב׿؅❈ַ⮔ׄ׾כַֹ׻׽ע⪢✄؅ Graph Network ס銨阾؅⩧מ俠槏׊ג┪ך Inductive bias ؅脝ֻ׾כַֹסֿꈌ ⮗ךעםַ־כַֹסֿ瞉脢ס锶闋ך׌ն כַֹ׆כךյ籽ׂ 2-2 硼ךע׆׆ױךס需מꫀꅙ׊י Graph Net- work[2018] ס韢倀ס銨阾ס牞霼؅鉿ַױ׌ն 10
5. ### 1 1 痥 2 畍 Graph Network 2.2 Graph Network

ה DeepLearning ך侭椚 2.2 Graph Network ה DeepLearning ך侭椚 2.2.1 Graph Network ך邌鎸 (GN block) 2-2-1 硼ךע Graph Network 韢倀מֽׄ׾ Graph Network ס銨阾מח ַי牞霼؅鉿ַױ׌նGraph Network ס韢倀מַֽיյْؕ٤ס⭦槏ע GN block ךٓةٖ٭ٜⵊ׈׿ױ׌ն ㎫ 2.4: GN block(Graph Network 韢倀׻׽) ㎫ 2.4 ֿ Graph Network ס韢倀מֽׄ׾ GN block ס俙䑑ס銨阾ך׌ն 剳偆ꫀ俙 (update function) ס φ כ겏笴ꫀ俙 (aggregation function) ס ρ מ ׻זי GN block ע圸䧯׈׿ױ׌ն剳偆ꫀ俙 (update function) ס φ כ겏笴 ꫀ俙 (aggregation function) ס ρ עא׿ב׿ 3 חֵ׽ױ׌ֿյ剳偆ꫀ俙ע ؙشةյؿ٭غյءٚنס㺲䙎ס 3 חךյ겏笴ꫀ俙עؙشة̕ؿ٭غյؿ٭ غ̕ءٚنյؙشة̕ءٚنס 3 חֿ銨阾׈׿יַױ׌ն ׆ס銨阾דׄך槏闋׌׾סע㝕㜟םסךյ┞薭溷ם MPNN(Message Passing Neural Network) ؅❛מյ⪽✄溷מ銨阾؅牞霼׊ױ׌ն 11
6. ### 1 2 痥 2 畍 Graph Network 2.2 Graph Network

ה DeepLearning ך侭椚 ㎫ 2.5: GN block כ MPNN(Graph Network 韢倀׻׽) ㎫ 2.5 ע Graph Network 韢倀ס Figure.4 ס┞ꌃך׌ֿյGN block ס䓺 䑑ךס MPNN(Message Passing Neural Network) ס銨阾ך׌նMPNN ך ע剳偆ꫀ俙עؙشةյؿ٭غյءٚنס㺲䙎ס 3 ח؅氠ַ׼׿յ겏笴ꫀ俙ע ؙشة̕ؿ٭غյؿ٭غ̕ءٚنס 2 חֿא׿ב׿氠ַ׼׿יַױ׌ն׆ס ⭦槏ע MPNN(Graph Neural Network) ס 1 חס㺽ס阛砯מ㸐䗎׊יַױ ׌ն׆׆ךյu עءٚنס⪢✄ס㺲䙎ךֵ׽յNode Classiﬁcation ذتؠס ׻ֹמذتؠ陭㴻מ׻זיע┮锡ס㖪⻉׵ֵ׽ױ׌ն ׆ס GN block ס銨阾ע㸴չ⫩ꩽםⷦ骭׵⺇ׄױ׌ֿյ RNNյ Transformer םל坎չםؾشع٠٭ؠ圸ꅎ؅⺅׽䪒ֹמֵגזי俠槏׊׷׌ַכַֹ׆כ ך㸬⪜׈׿גסךעםַ־כ䘼؂׿ױ׌׵ֹ㸴׊虘ַ銨阾׵ֵ׽אֹם宜 ׵׌׾סך׌ֿյGraph Network ס韢倀ע䑛氠׈׿׷׌ַ׻ֹםסך韢倀 ס銨阾מ⻉؂׎יֽׂ偙ֿ虘׈אֹך׌ ն 2-2-2 硼ךע GN block ؅氠ַי Transformer ס牞霼؅鉿ַױ׌ն 12
7. ### 1 3 痥 2 畍 Graph Network 2.2 Graph Network

ה DeepLearning ך侭椚 2.2.2 Transformer ה Graph Network(GN block) 2-2-2 硼ךע 2-2-1 硼ך牞霼׊ג GN block ס銨阾מ㕈טַי Transformer ס銨阾מחַי牞霼׊ױ׌նTransformer ע NLNN(Non-local Neural Net- work) ס┞甦כ׈׿יַ׾סךյNLNN מꫀ׌׾阾鼥؅牞霼׊ױ׌ն ㎫ 2.6: GN block כ NLNN(Graph Network 韢倀׻׽) 韢倀ס Figure.4 ס┞ꌃ؅獏׊ג㎫ 2.6 ךע NLNN מחַיס阾鼥ֿ鉿؂ ׿יַױ׌նTransformer ؅⩧מ脝ֻ׾ם׼յφe כ ρe→v ֿ Dot product attentionյφv ֿ Feed Forward Network(FFN) מ㸐䗎׌׾כ脝ֻי虘ַ־ כ䘼ַױ׌ն 13
8. ### 1 4 痥 2 畍 Graph Network 2.2 Graph Network

ה DeepLearning ך侭椚 ㎫ 2.7: GN block כ NLNN Ώ (Graph Network 韢倀׻׽) ㎫ 2.8: GN block כ NLNN ΐ (Graph Network 韢倀׻׽) ㎫ 2.9: GN block כ Transformer(Graph Network 韢倀׻׽) Transformer ס⭦槏ס霄箖ע㎫ 2.7֐㎫ 2.9 מ阾鼥׈׿יַױ׌ն俙䑑מ 14
9. ### 1 5 痥 2 畍 Graph Network 2.2 Graph Network

ה DeepLearning ך侭椚 חַיע坎չם煝疴؅┞חס銨阾ךױכ״ג׆כך韢倀ס阾鼥蔦✄ֿ㸴չ؂ ־׽מַׂכ䘼؂׿׾ג״յ㎫ 2.7 ؅⩧מ Transformer ס銨阾؅⫙㴻聋׊ ױ׌ն φe α (VQ, VK ) = softmax QKT dk φe β (VV al ) = VV al ρe→v(E, VQ, VK, VV al ) = φe α (VQ, VK )φe β (VV al ) = softmax QKT dk VV al = Vres φv = FFN(Vres ) ┪阾ס俙䑑ע㎫ 2.7 כ Transformer ס韢倀؅⹧脝מ⫙圸䧯؅鉿ַױ׊ גնVQ յVK յVV al עא׿ב׿ Transformer 韢倀ס Q כ K כ V ؅銨׊י ַױ׌նױגյVres ע Dot product attention ס⭦槏䔿ס鉿⮬յFFN ע Transformer מֽׄ׾ Feed Forward Network(MLP כ⻎聋) ؅銨׊יַױ ׌ն겏笴ꫀ俙 ρ ס阾鼥؅祔ⷃמ׌׾ג״מ鉿⮬銨阾؅氠ַי⪢יסؿ٭غכ ؙشة؅┞䍲מ⺅׽䪒ַױ׊גֿյGraph Network ס韢倀ךע┞ח┞ח⺅ ׽䪒זיַ׾׻ֹך阾鼥ֿ㸴չ沌ם׽ױ׌ն 2-2-1 硼ךע MPNN מꫀ׊יյ2-2-2 硼ךע Transformer ؅ NLNN ס ┞❛ס镄掾־׼牞霼؅鉿ַױ׊גն籽ׂ 2-2-3 硼ךע׆׆ױך牞霼׊ג GN block(Graph Network) ס銨阾ס寯氠䙎ס牞霼מֵגזיյאס♑ס DeepLearning מחַי׵ GN block ס銨阾؅⩧מ牞霼؅鉿ַױ׌ն 2.2.3 Graph Network כ圫ղז DeepLearning ׾邌ׇ׷ ׆׆ױךך MPNN ׷ NLNN ؅牞霼׊ױ׊גֿյ2-2-3 硼ךע Graph Network ؅氠ַי׆׆ױך⺅׽䪒זג♧㜽ס DeepLearning סؓ٭؞طؠ زٔ؅牞霼׊ױ׌ն 15
10. ### 1 6 痥 2 畍 Graph Network 2.2 Graph Network

ה DeepLearning ך侭椚 ㎫ 2.10: GN block כ RNN(Graph Network 韢倀׻׽) ㎫ 2.10 ע Graph Network 韢倀ס Figure.4 ס┞ꌃך׆׿מ׻׽ RNN ؅ 銨׌כ׈׿יַ׾׻ֹך׌նRNN ךעꅙ籽溷 (Sequential) מؙشة־׼ ؿ٭غמ⚻ꇖֿ鱍׆׾ג״յ׆׿ךע銨槁ך׀םַסךעםַ־כ׵脝ֻ׼ ׿׾סך׌ֿյGraph Network ס韢倀ךע Figure.6 ך鏿俙ס GN block ס ꅙ篙מחַי⺅׽䪒؂׿יֽ׽յ׆ה׼כס篁ײ⻉؂׎ך RNN ؅銨槁׌׾ כ脝ֻ׼׿יַ׾׻ֹך׌ն 16
11. ### 1 7 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ㎫ 2.11: GN block סꅙ篙 (Graph Network 韢倀׻׽) ㎫ 2.11 מ韢倀ס Figure.6 ؅獏׊ױ׊גն2-2-1 硼ך⺅׽䪒זג MPNN כ 2-2-2 硼ך⺅׽䪒זג NLNN(Transformer etc) ךע㎫ 2.11 ס (a) ؅氠 ַיֽ׽յ׆׆ךע (c) ؅氠ַגכ脝ֻ׼׿׾כ䘼ַױ׌ն ׆ ס ׻ ֹ מ GN block כ RNN א ס ꅙ 篙 ؅ 氠 ַ ׾ ד ׄ ך 㝂 ׂ ס DeepLearning סؓ٭؞طؠزٔ؅銨槁ך׀׾כ脝ֻי虘ַ־כ䘼ַױ׌ն 2.3 Inductive bias ה Graph Network 2-1 硼ך Inductive bias מחַיյ2-2 硼ך Graph Network מחַי א׿ב׿⺅׽䪒זגסךյ2-3 硼ךעא׿׼מ㕈טַיⴭ䭇溷ם镄掾־׼ DeepLearning סؓ٭؞طؠزٔמחַיױכ״׷脝㷋؅鉿ַױ׌ն 2.3.1 Inductive bias ׾וך״ֲח⟎㹀ׅ׷ַ 2-3-1 硼ךע Inductive bias מ㕈טַיؼٖ٭ٜٚؾشع٠٭ؠס圸ꅎ؅ לס׻ֹמ♳㴻׌׾־מחַי脝㷋؅鉿ַױ׌ն2-1-2 硼ס㎫ 2.2 ך牞霼׊ 17
12. ### 1 8 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ג Inductive bias ך׌ֿյؼٖ٭ٜٚؾشع٠٭ؠסؓ٭؞طؠزٔדׄ ךםׂيؕث笠ס䩘岺מֽׄ׾◜⯼⮔䉘 (prior distribution) ׷յ㳔肪מֵ גזיס婞⯵ⵊ꽃סꃯⲎםל㝂㼜מ廌׾כ׈׿יַױ׌ն ׆ס׻ֹמ Inductive bias מע坎չם锶偙ֵֿ׽ױ׌ֿյ㝂㸴ꈌ䓜ם锶 偙؅׊י虘ַסךֵ׿ףյ׆׿ױךֵ׾瓦䍲篙卸؅媘׊י׀ג煝疴מֽׄ׾ Inductive bias ע㸴ם־׼׍䟨⽱ֵֿ׾כ脝ֻי虘ַכ䘼ַױ׌նאסג ״յط؞تع׷韢倀םלס劔⻏ם׵סעא׿ב׿潸䗎ס Inductive bias ؅⻻ ؆דٓظٛ٤ءֿם׈׿יַ׾כ脝ֻי虘ַ־כ䘼ַױ׌ն׆ס需עؼٖ٭ ٜٚؾشع٠٭ؠמכלױ׼׍յب٤وٜם㎇䊟⮔卥׷յⴢꏕه٭تطؔ٤ ءםלס Tree-based סؓوٞ٭زםלאס♑ס塌唩㳔肪סٜؓإٛثّך ׵⻎坎ך׌ն ׆׿׼؅⺇ׄיյInductive bias מחַי脝ֻ׾갾עյ׆׿ױךס煝疴מ ֽׄ׾ٓظٛ٤ءס⯼䳀מא׿ב׿✇ֿ翝־׿י׀גס־כַֹ镄掾ך虝չ כ䪻䳢׌׾׆כֿꓨ锡דכ䘼ַױ׌նגכֻף Tree-based סٜؓإٛثّ עր✇־ס匛⚂מ㕈טׂ⮔㼜؅繪׽ꂉ׌׆כցֿ⯼䳀מ翝־׿גٜؓإٛث ّך׌ն׆׿׵ Inductive bias כ脝ֻ׾׆כֿך׀׾־כ䘼ַױ׌ն ׆ס׻ֹמյ րא׿ב׿ס䩘岺מַֽיא׿ב׿לס׻ֹם⯼䳀ֿ翝־׿ יַ׾־ցמחַי䊬מ脝䢩׊םֿ׼劳ױ׊ַ Inductive bias ס♳㴻מחַ י嗱阧׌׾כַֹסֿ䖩锡םסךעםַ־כ䘼ַױ׌ն 㝕卽ס㕈勓溷ם脝ֻ偙מחַי牞霼ך׀גסךյ2-3-1 硼ע׆׆ױךכ׊յ 籽ׂ 2-3-2 硼ךע DeepLearning מꫀ➳׌׾ٓةٖ٭ٜס Inductive bias מ חַי牞霼׊ױ׌ն 2.3.2 CNNծRNNծGNNծTransformerծMLP 2-3-2 硼ךע CNNյRNNյGNNյTransformerյMLP םלמ䞯㴻׈׿׾ Inductive bias מחַי牞霼؅鉿ַױ׌ն CNN מ䞯㴻׈׿׾ Inductive bias עյ ր氺⦐מַֽיעꁿ⤒סمؠجٜס ⡑כ潸ꫀ؅䭥חցכַֹסֿ╭ם♳㴻ך׌նCNN ך氠ַ׼׿יַ׾沑ײꁎ 18
13. ### 1 9 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ײ⭦槏עنٜؔذ⭦槏כ⻎坎ךյنٜؔذ؅ظ٭ذ־׼蔦ⳛ㳔肪׌׾כַֹ סֿ CNN ס㳔肪ס╭ם嵣׿ך׌նױגյCNN ך氠ַ׼׿יַ׾ Pooling ⭦槏ע氺⦐ס㐁緐מ׻׾杅䖇ꓪס䬂⮂؅䟨⽱׊יֽ׽յ׆׿׵闋⦐䍲؅┫ׅ י׵氺⦐ס䟨⽱⻉ַע㜟؂׼םַכַֹ⯼䳀 (Inductive bias) מ㕈טַיַ ׾כ脝ֻ׾׆כֿך׀ױ׌ն RNN(׆׆ךע LSTMյGRU ؅⻻ײױ׌) מ䞯㴻׈׿׾ Inductive bias עյ ր笠⮬ظ٭ذע׆׿ױךס⮂槁篙卸מ㕈טַי姌ֿ气䧯׈׿׾ցכ脝ֻ י虘ַ־כ䘼ַױ׌նRNN 笠ס䩘岺ע♑ס䩘岺מ嬟׬יא׿׮ל篙卸ֿ⮂ יַםַכַֹ׻ֹמ׵䘼؂׿ױ׌ֿյRNN 笠ס䩘岺עꩽַ笠⮬ס⺅׽䪒 ֵַֿױ׽ֹױׂ鉿־םַכַֹסֵֿ׾כ׈׿יַױ׌նאַֹֹ䟨⽱ך RNN 笠׻׽׵ Transformer ס׻ֹם self-attention ي٭تס䩘岺ס偙ֿ笠 ⮬ظ٭ذס⺅׽䪒ַמֵגזיע׻׽虘ַ Inductive bias ؅♳㴻ך׀גכ 锶׾׆כ׵ך׀׾ס־׵׊׿ױ׎؆ն GNN ע⻎免מ┰ֻגءٚنס圸ꅎמ㕈טַי阛砯؅鉿ֹכַֹסֿ Inductive bias כ׊י氠ַ׼׿יַ׾כ脝ֻי虘ַכ䘼ַױ׌նGNN ؅ 栄聋ך脝ֻג갾ע 2-2-1 硼ך⺅׽䪒זג MPNN(Message Passing Neural Network) ؅ْؕ٭ة׌׾כ虘ַכ䘼ַױ׌ֿյ䌮聋ך脝ֻ׾כא׵א׵♑ ס Neural Network ע⪢יءٚنמ㕈טַיַ׾כ锶׾׆כ׵ך׀׾סךյ DeepLearning ⪢✄ֿ GNN דכ脝ֻ׾׆כ׵ך׀׾כ䘼ַױ׌ն2-2 硼ך ⺅׽䪒זג Graph Network ס煝疴ךע׆סꁊס簡┞溷ם⺅׽䪒ַֿ׈׿י ֽ׽յ2021 䌑免ך需꾴ס MLP ي٭تס䩘岺׵ Graph Network ס┞甦כ锶 ׾׆כ׵ך׀ױ׌ն Transformer ע self-attention מ㕈טׂ䩘岺כ׈׿ֿהך׌ֿյ GNN 溷ם ءٚن؅א׿ב׿סؿ٭غ阋靣⭦槏ס㖪⻉עⷃ靣ס꿔⛣䍲 (Dot product attention ס Dot product ע⫐畤ךֵ׽յCos 꿔⛣䍲כ㕈勓溷מע⻎聋ך׌) מ㕈טַי圸碎׊ MPNN כ⻎坎ס⭦槏؅鉿ֹכ脝ֻ׾׆כ׵ך׀ױ׌նא סג״յ րGNN ؅脝ֻ׾מֵגזיסءٚن؅ؿ٭غס꿔⛣䍲מ㕈טַי气 䧯ך׀׾ցכ脝ֻ׾סֿ Transformer ס Inductive bias כ锶ם׊י虘ַ־ כ䘼ַױ׌ն 19
14. ### 2 0 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ױגյ 2021 䌑免掾ך׻ׂ需꾴מ┪ֿ׾סֿ MLP ي٭تס䩘岺ך׌ն րself- attention vs MLPցס镄掾ך锶׼׿ֿהך׌ֿյא׵א׵ Transformer 蔦 ✄յattention ⭦槏♧㜽סכ׆؀ע MLP ךֵ׾סךյ㵅עא׿׮לꇙַֿ ֵ׾־כ阋ֻףאֹךעםַסךעםַ־כַֹסֿ瞉脢ס锶闋ך׌ն2-2 硼ס Graph Network ס煝疴ךעؿ٭غⷃ⛺ס MLP ⭦槏מؙشة؅闋׊ גؿ٭غꪨ⭦槏؅㸬⪜׊ג׵ס؅ Graph Network כ㴻聋׊יַגסךյ MLP-Mixer ׷ gMLP םלס䩘岺؅ Transformer כ嬟鼛׊ג갾מյלה׼ ׵ Graph Network כ锶׿׾׆כ־׼א׿׮ל㝕׀םꇙַעםַסךעםַ ־כ脝ֻי虘ַסךעכ䘼ַױ׌ն ׆׆ױך坎չם DeepLearning ס圸䧯מחַי牞霼؅鉿ַױ׊גֿյ㕈勓 溷מע Graph Network(㴻聋ֿױד㴻ױזיַםַⷦ骭סג״յGNN כ׌ ׾־לֹ־ס偂阋עꉌׄױ׌) מ㕈טַיյ Inductive bias ؅㸬⪜׊י圸䧯؅ 尴״׾כ脝ֻ׾סֿ虘ַסךעכ䘼ַױ׌ն籽ׂ 2-3-3 硼ךע MLP-Mixerյ 2-3-4 硼ךע gMLP מחַיא׿ב׿⺅׽䪒ַױ׌ն 2.3.3 MLP-Mixer(MLP ΍) 2-3-3 硼ךע MLP-Mixer מחַי牞霼؅鉿ַױ׌ն ㎫ 2.12: MLP-Mixer ס⭦槏嚣锡 (MLP-Mixer 韢倀׻׽) ٬MLP-Mixer: An all-MLP Architecture for Vision https://arxiv.org/abs/2105.01601 ㎫ 2.12 ע MLP-Mixer ס韢倀ס Figure.1 ־׼ס䫕穀ך׌ֿյ㎫מֽׄ׾ 20
15. ### 2 1 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network MLP1 עقشزꪨ (ViT םלס氺⦐⭦槏؅⯼䳀ס阾鼥סג״ Patch כ׈׿ יַױ׌ֿյGNN דכؿ٭غյNLP דכ token מם׾׆כע䫅ֻיֽׂ כ虘ַכ䘼ַױ׌) ס MLP מ׻׾悍砯յMLP2 עא׿ב׿סقشز⫐ס MLP 悍砯מ㸐䗎׌׾כ槏闋׌׾כ虘ַ־כ䘼ַױ׌ն MLP-Mixer ס⭦槏؅ GNN ס镄掾־׼锶׾ם׼յMLP2 עؿ٭غ⫐ ךס MLP ךֵ׽յ׆׿ע Transformer ס FFN(Feed Forward Network) ס⭦槏מ┞蔹׊ױ׌ն┞偙ך MLP1 ע Dot product attention מ㕈טׂ Transformer כע沌ם׽յⷃמؿ٭غꪨ (قشزꪨ) ךס MLP מ׻׾悍砯 כם׽ױ׌նMLP1յMLP2 ס悍砯מֵגזיעقْٚ٭ذ؅氠ַיֽ׽յ 杅מ MLP1 ךע GNN ס걋䱸鉿⮬מ㸐䗎׌׾قْٚ٭ذס㳔肪ֿ鉿؂׿י ַ׾׆כע䪻䳢׊יֽׂכ虘ַ־כ䘼ַױ׌ն ׉זׂ׽脝ֻ׾ם׼յ րTransfomer ךעⷃ靣ס꿔⛣䍲מ㕈טַיءٚن؅ 圸碎׊יַגֿյMLP-Mixer ךעءٚنס圸ꅎ؅㳔肪׌׾ցכַֹ锶偙׵ ⺪茣דכ䘼ַױ׌ն 2.3.4 gMLP(MLP Ύ) 2-3-4 硼ךע gMLP מחַי牞霼؅鉿ַױ׌ն 21
16. ### 2 2 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ㎫ 2.13: gMLP ס⭦槏嚣锡Ώ (gMLP 韢倀׻׽) ٬Pay Attention to MLPs https://arxiv.org/abs/2105.08050 ㎫ 2.13 ע gMLP ס韢倀ס Figure.1 ך׌ն㸴׊׆ס㎫דׄדכ؂־׽מׂ ַך׌ֿյ㎫ 2.14 םלמ俙䑑ס阾鼥ֵֿ׽յChannel Proj ע Transformer ס FFN ס⭦槏כ⻎坎ךֵ׾כ韢倀מ阾鼥׈׿יֽ׽յאסג״׆׆ךע ؿ٭غ⫐ךס MLP ס阛砯ֿ鉿؂׿יַ׾׆כֿ؂־׽ױ׌ն 22
17. ### 2 3 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ㎫ 2.14: gMLP ס⭦槏嚣锡ΐ (gMLP 韢倀׻׽) ┞偙ך Spatial Gating Unit ֿ gMLP 栃蔦ס⭦槏ך׌ֿյ┫阾ס㎫ 2.15 ך俙䑑םלס阾鼥ֵֿ׽ױ׌ն 23
18. ### 2 4 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ㎫ 2.15: Spatial Gating Unit ס⭦槏 (gMLP 韢倀׻׽) MLP-Mixer מ嬟鼛׊י㸴չ⭦槏סْؕ٭ةֿח׀מַׂג״յ׆ה׼ ׻׽׵ MLP-Mixer ؅╚䖥ך䪻䳢׊יֽׂ偙ֿ虘׈אֹםⷦ骭ך׌նױגյ 2-3-3 硼ך⺅׽䪒זג MLP-Mixer כ 2-3-4 硼ך⺅׽䪒זג gMLP עלה ׼׵ Graph Network ס阾岺ך⺅׽䪒ֻ׾׆כ׵䫅ֻיֽׂכ虘ַ־כ䘼ַ ױ׌ն 2.3.5 ➙䖓ך㾜劄חאְגך✮䟝 睗 2 皹ךע׆׆ױך坎չם氠靣מחַי牞霼׊ױ׊גֿյ րGraph Network ס韢倀םלמ㕈טׂ✇־׊׼ס簡┞溷ם銨阾؅⩧מ Inductive bias ؅לס ׻ֹמ陭阛׌׾־ցמ煝疴סنؚ٭؜تֿ瓌׾סךעםַ־յכַֹסֿ瞉 脢ס锶闋ך׌ն銨阾מꫀ׊יע✇־׊׼ס簡┞銨阾ֿ尴ױ׾כ虘ַך׌ֿյ ׵ֹ㸴׊免ꪨֿ־־׾־׵׊׿ױ׎؆ն 24
19. ### 2 5 痥 2 畍 Graph Network 2.3 Inductive bias

ה Graph Network ױגյ րTransformer vs MLPցמחַיע♀䔿׵坎չם饗韢ֿ⮂יׂ׾כ 䘼ַױ׌ֿյTransformer 蔦✄ MLP כ锶ם׎׾סךא׿׮ל㝕׀ם䟨⽱ס םַ嬟鼛דכ䘼ַױ׌նꫀꅙ׊י潲䠊溷םْؕ٭ةֿח־״׾כ虘ַכ䘼ַ ױ׌סך MPNN ؅⩧מ GNN סْؕ٭ة؅䲖ײյGraph Network ס銨阾 ך嵞气ס煝疴؅⻻״י俠槏׌׾כַֹסֿꓨ锡ךעםַ־כ䘼ַױ׌ն ױגյInductive bias ס脝㷋ֿ儨♀㙟ֻיַ׾סךյ韢倀؅⹧攍׌׾갾מ ע Inductive bias ס⽟ꁊס饗韢؅⨲⩰溷מ牞霼׌׾כַֹס׵㝕◜ךעם ַ־כ䘼ַױ׌ն 25