Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Inductive Bias and Graph Networks

Inductive Bias and Graph Networks

Inductive biasとGraph Networkに基づくDeepLearningのトレンドについてまとめてみました。
少々記載が難しいと思われるので、追記などを増やして理解しやすい内容とし有料版とできればと考えています。
出典の論文などを読むにあたって並行で参照いただけると良いのではないかと思います。
(細かい記載に誤りがある可能性がありますので、なるべく出典を参照しつつご確認いただけたらと思います)

87c236e94282fcf81192203e84a6e784?s=128

LiberalArts

June 05, 2021
Tweet

Transcript

  1. 7 痥 2 畍 Graph Network 睗 2 皹ךעյ"Relational inductive

    biases, deep learning, and graph networks"؅⩧מ Graph Network כ Inductive bias מחַי 牞霼׊ױ׌ն2-1 硼ך Inductive biasյ2-2 硼ך Graph Network מח ַיא׿ב׿牞霼׊ג䔿מյ2-3 硼ךאסױכ״׷脝㷋؅鉿ַױ׌ն 2.1 Inductive bias הכ 2.1.1 Inductive bias ך嚊銲 Inductive bias(䊟等溷فؕؓت) עրDeepLearning סؾشع٠٭ؠ圸ꅎ ؅脝ֻ׾מֵגזי陭㴻׌׾⯼䳀ցכ槏闋׌׾כ虘ַ־כ䘼ַױ׌նגכֻ ף氺⦐מ㸐׊י CNN ؅氠ַ׾갾עյ ր氺⦐עꁿ⤒סمؠجٜכ潸◦✑氠׊ יַ׾סךյنٜؔذ⭦槏؅鉿ֹ׆כך杅䖇ꓪ؅䬂⮂ך׀׾ցכַֹ⯼䳀؅ 氠ַיֽ׽յ׆׿؅ Inductive bias כײם׌׆כֿך׀ױ׌նCNN מꮹ׼ ׍յRNN ךע笠⮬ظ٭ذֿ꽄沁מ潸◦✑氠؅׊יַׂ坎㲳؅銨׊יֽ׽յ ׆׿׵ Inductive bias כ脝ֻ׾׆כֿך׀ױ׌ն ׆ס׻ֹמյInductive bias עⷃם׾㳔肪כעױגꇙזג镄掾־׼㳔肪 מֵגזיסقنؚ٭ُ٤ت⻔┪؅雧ײ׾⺅׽篁ײכ脝ֻ׾׆כֿך׀ױ 7
  2. 8 痥 2 畍 Graph Network 2.1 Inductive bias הכ

    ׌ն׆׆ך峜䟨׊גַסֿ bias כ臝ׂכꇃ㳔肪 (overfitting) ؅䞯鱍׊㎇ꉌ ׌׬׀כַֹⷦ骭׵⺇ׄױ׌ֿյ րInductive bias ؅㎇ꉌ׌׾ցכַֹ׻׽ עրInductive bias ؅嵛氠׌׾ցכַֹ镄掾־׼饗韢׈׿׾׆כֿ㝂ַ׆כ ך׌ն 嚣锡؅䲖׳מֵגזיעյ րלס׻ֹם⯼䳀؅ַֽיٓةٖ٭ٜٝيٜך לס׻ֹם⭦槏圸ꅎ؅脝ֻ׾־ց؅⺅׽䪒ֹסֿ Inductive bias דכ脝ֻ׾ סֿ虘ַ־כ䘼ַױ׌ն 2.1.2 Graph Network ך锷俑ך鎸鯹 [2018] Inductive bias מחַיס饗韢ע♧⯼׻׽׈׿יַױ׌ֿյꁿ䌑ס DeepLearning ס煝疴מֽׄ׾ Inductive bias מחַיס脝㷋מֵגזי䫅 ֻיֽׂכ虘ַסֿ Graph Network[2018] ס韢倀ך׌ն ㎫ 2.1: Graph Network 韢倀 ٬Relational inductive biases, deep learning, and graph networks https://arxiv.org/abs/1806.01261 ׆ס韢倀ךע Graph Neural Network מꫀꅙ׌׾ MPNN(Message Pass- 8
  3. 9 痥 2 畍 Graph Network 2.1 Inductive bias הכ

    ing Neural Network) ׷յ Transformer מꫀꅙ׌׾ NLNN(Non-local Neural Network) םל؅簡⻉溷מ⺅׽䪒ֹنٝ٭ّ٠٭ؠס Graph Network ؅㸬 ⪜׌׾מֵגזיյInductive bias מחַי⻎免מ饗韢ֿ鉿؂׿יַױ׌ն ㎫ 2.2: Inductive bias ס阾鼥 (Graph Network 韢倀׻׽) ┪阾ֿ Graph Network 韢倀מֽׄ׾ Inductive bias ס阾鼥ך׌ֿյ糽ך 虝♕ׄ׊גכ׆؀ֿ嚣锡؅䲖׳מֵגזיעꓨ锡ך׌ն րInductive bias ؅ 氠ַ׾׆כך㳔肪ٜؓإٛثّע⨲⩰꽄⛺؅尴״׾׆כֿך׀׾ցכ׈׿י ֽ׽յ2-1-1 硼ך牞霼׊ג⫐㵼כ⻉蔹׊ױ׌ն 9
  4. 1 0 痥 2 畍 Graph Network 2.1 Inductive bias

    הכ ㎫ 2.3: Table.1(Graph Network 韢倀׻׽) ㎫ 2.3 ֿ Graph Network 韢倀ס Table.1 ך׌ֿյDeepLearning סٓ ةٖ٭ٜ (׆׆ךע Component כ銨槁׈׿יַ׾) ׇכס Inductive bias מחַי饗韢ֿ鉿؂׿יַױ׌ն׆׆ך銨מַֽי Entity ע阛砯ס㸐骭յ Relation ע Entity סꫀꅙ䙎յInductive bias ע⭦槏圸ꅎמֽׄ׾⯼䳀מח ַי銨槁׈׿יַױ׌ն沑ײꁎײ (Convolutional) ע㹾䨾溷յٛ؜ٝ٤ع (Recurrent) עꅙ籽溷 (Sequential)յGraph Network ע⚈䟨 (Arbitrary) כ 銨阾׈׿יַ׾סע䫅ֻיֽׂכ虘ַכ䘼ַױ׌ն ׆׿׼ס雄❿מֵגזיעյ沑ײꁎײ׷ٛ؜ٝ٤عמ♳㴻׌׾ Inductive bias ע㝕׀ׂյGraph Network ע⚈䟨ךյFully connected מ♳㴻׌׾ Inductive bias ע㸯׈ַ׆כֿ鞅ײ⺅׿ױ׌ն׆׆ך׈׼ם׾圸ꅎ溷ם䱱筺 ؅鉿ֹמֵגזיע Graph Network ؅氠ַ׾סֿ┞薭溷דכ䘼ַױ׌ֿյ ┞䍲 Fully connected ؅╚䖥מنؚ٭؜ت׊י Inductive bias ؅㸯׈ׂ׊ג ┪ך䱱寛׌׾כַֹס׵脝ֻ偙ס┞חדכ䘼ַױ׌ն ׆׿מꫀꅙ׊י⮂י׀גסֿ MLP-Mixer ׷ gMLP דכ脝ֻי虘ַכ 䘼ַױ׌ֿյ׆ס MLP-Mixer ׷ gMLP ׵ Graph Network ס阾岺ך銨׎ ׾כ脝ֻ׼׿ױ׌סךյא׿ב׿؅❈ַ⮔ׄ׾כַֹ׻׽ע⪢✄؅ Graph Network ס銨阾؅⩧מ俠槏׊ג┪ך Inductive bias ؅脝ֻ׾כַֹסֿꈌ ⮗ךעםַ־כַֹסֿ瞉脢ס锶闋ך׌ն כַֹ׆כךյ籽ׂ 2-2 硼ךע׆׆ױךס需מꫀꅙ׊י Graph Net- work[2018] ס韢倀ס銨阾ס牞霼؅鉿ַױ׌ն 10
  5. 1 1 痥 2 畍 Graph Network 2.2 Graph Network

    ה DeepLearning ך侭椚 2.2 Graph Network ה DeepLearning ך侭椚 2.2.1 Graph Network ך邌鎸 (GN block) 2-2-1 硼ךע Graph Network 韢倀מֽׄ׾ Graph Network ס銨阾מח ַי牞霼؅鉿ַױ׌նGraph Network ס韢倀מַֽיյْؕ٤ס⭦槏ע GN block ךٓةٖ٭ٜⵊ׈׿ױ׌ն ㎫ 2.4: GN block(Graph Network 韢倀׻׽) ㎫ 2.4 ֿ Graph Network ס韢倀מֽׄ׾ GN block ס俙䑑ס銨阾ך׌ն 剳偆ꫀ俙 (update function) ס φ כ겏笴ꫀ俙 (aggregation function) ס ρ מ ׻זי GN block ע圸䧯׈׿ױ׌ն剳偆ꫀ俙 (update function) ס φ כ겏笴 ꫀ俙 (aggregation function) ס ρ עא׿ב׿ 3 חֵ׽ױ׌ֿյ剳偆ꫀ俙ע ؙشةյؿ٭غյءٚنס㺲䙎ס 3 חךյ겏笴ꫀ俙עؙشة̕ؿ٭غյؿ٭ غ̕ءٚنյؙشة̕ءٚنס 3 חֿ銨阾׈׿יַױ׌ն ׆ס銨阾דׄך槏闋׌׾סע㝕㜟םסךյ┞薭溷ם MPNN(Message Passing Neural Network) ؅❛מյ⪽✄溷מ銨阾؅牞霼׊ױ׌ն 11
  6. 1 2 痥 2 畍 Graph Network 2.2 Graph Network

    ה DeepLearning ך侭椚 ㎫ 2.5: GN block כ MPNN(Graph Network 韢倀׻׽) ㎫ 2.5 ע Graph Network 韢倀ס Figure.4 ס┞ꌃך׌ֿյGN block ס䓺 䑑ךס MPNN(Message Passing Neural Network) ס銨阾ך׌նMPNN ך ע剳偆ꫀ俙עؙشةյؿ٭غյءٚنס㺲䙎ס 3 ח؅氠ַ׼׿յ겏笴ꫀ俙ע ؙشة̕ؿ٭غյؿ٭غ̕ءٚنס 2 חֿא׿ב׿氠ַ׼׿יַױ׌ն׆ס ⭦槏ע MPNN(Graph Neural Network) ס 1 חס㺽ס阛砯מ㸐䗎׊יַױ ׌ն׆׆ךյu עءٚنס⪢✄ס㺲䙎ךֵ׽յNode Classification ذتؠס ׻ֹמذتؠ陭㴻מ׻זיע┮锡ס㖪⻉׵ֵ׽ױ׌ն ׆ס GN block ס銨阾ע㸴չ⫩ꩽםⷦ骭׵⺇ׄױ׌ֿյ RNNյ Transformer םל坎չםؾشع٠٭ؠ圸ꅎ؅⺅׽䪒ֹמֵגזי俠槏׊׷׌ַכַֹ׆כ ך㸬⪜׈׿גסךעםַ־כ䘼؂׿ױ׌׵ֹ㸴׊虘ַ銨阾׵ֵ׽אֹם宜 ׵׌׾סך׌ֿյGraph Network ס韢倀ע䑛氠׈׿׷׌ַ׻ֹםסך韢倀 ס銨阾מ⻉؂׎יֽׂ偙ֿ虘׈אֹך׌ ն 2-2-2 硼ךע GN block ؅氠ַי Transformer ס牞霼؅鉿ַױ׌ն 12
  7. 1 3 痥 2 畍 Graph Network 2.2 Graph Network

    ה DeepLearning ך侭椚 2.2.2 Transformer ה Graph Network(GN block) 2-2-2 硼ךע 2-2-1 硼ך牞霼׊ג GN block ס銨阾מ㕈טַי Transformer ס銨阾מחַי牞霼׊ױ׌նTransformer ע NLNN(Non-local Neural Net- work) ס┞甦כ׈׿יַ׾סךյNLNN מꫀ׌׾阾鼥؅牞霼׊ױ׌ն ㎫ 2.6: GN block כ NLNN(Graph Network 韢倀׻׽) 韢倀ס Figure.4 ס┞ꌃ؅獏׊ג㎫ 2.6 ךע NLNN מחַיס阾鼥ֿ鉿؂ ׿יַױ׌նTransformer ؅⩧מ脝ֻ׾ם׼յφe כ ρe→v ֿ Dot product attentionյφv ֿ Feed Forward Network(FFN) מ㸐䗎׌׾כ脝ֻי虘ַ־ כ䘼ַױ׌ն 13
  8. 1 4 痥 2 畍 Graph Network 2.2 Graph Network

    ה DeepLearning ך侭椚 ㎫ 2.7: GN block כ NLNN Ώ (Graph Network 韢倀׻׽) ㎫ 2.8: GN block כ NLNN ΐ (Graph Network 韢倀׻׽) ㎫ 2.9: GN block כ Transformer(Graph Network 韢倀׻׽) Transformer ס⭦槏ס霄箖ע㎫ 2.7֐㎫ 2.9 מ阾鼥׈׿יַױ׌ն俙䑑מ 14
  9. 1 5 痥 2 畍 Graph Network 2.2 Graph Network

    ה DeepLearning ך侭椚 חַיע坎չם煝疴؅┞חס銨阾ךױכ״ג׆כך韢倀ס阾鼥蔦✄ֿ㸴չ؂ ־׽מַׂכ䘼؂׿׾ג״յ㎫ 2.7 ؅⩧מ Transformer ס銨阾؅⫙㴻聋׊ ױ׌ն φe α (VQ, VK ) = softmax QKT dk φe β (VV al ) = VV al ρe→v(E, VQ, VK, VV al ) = φe α (VQ, VK )φe β (VV al ) = softmax QKT dk VV al = Vres φv = FFN(Vres ) ┪阾ס俙䑑ע㎫ 2.7 כ Transformer ס韢倀؅⹧脝מ⫙圸䧯؅鉿ַױ׊ גնVQ յVK յVV al עא׿ב׿ Transformer 韢倀ס Q כ K כ V ؅銨׊י ַױ׌նױגյVres ע Dot product attention ס⭦槏䔿ס鉿⮬յFFN ע Transformer מֽׄ׾ Feed Forward Network(MLP כ⻎聋) ؅銨׊יַױ ׌ն겏笴ꫀ俙 ρ ס阾鼥؅祔ⷃמ׌׾ג״מ鉿⮬銨阾؅氠ַי⪢יסؿ٭غכ ؙشة؅┞䍲מ⺅׽䪒ַױ׊גֿյGraph Network ס韢倀ךע┞ח┞ח⺅ ׽䪒זיַ׾׻ֹך阾鼥ֿ㸴չ沌ם׽ױ׌ն 2-2-1 硼ךע MPNN מꫀ׊יյ2-2-2 硼ךע Transformer ؅ NLNN ס ┞❛ס镄掾־׼牞霼؅鉿ַױ׊גն籽ׂ 2-2-3 硼ךע׆׆ױך牞霼׊ג GN block(Graph Network) ס銨阾ס寯氠䙎ס牞霼מֵגזיյאס♑ס DeepLearning מחַי׵ GN block ס銨阾؅⩧מ牞霼؅鉿ַױ׌ն 2.2.3 Graph Network כ圫ղז DeepLearning ׾邌ׇ׷ ׆׆ױךך MPNN ׷ NLNN ؅牞霼׊ױ׊גֿյ2-2-3 硼ךע Graph Network ؅氠ַי׆׆ױך⺅׽䪒זג♧㜽ס DeepLearning סؓ٭؞طؠ زٔ؅牞霼׊ױ׌ն 15
  10. 1 6 痥 2 畍 Graph Network 2.2 Graph Network

    ה DeepLearning ך侭椚 ㎫ 2.10: GN block כ RNN(Graph Network 韢倀׻׽) ㎫ 2.10 ע Graph Network 韢倀ס Figure.4 ס┞ꌃך׆׿מ׻׽ RNN ؅ 銨׌כ׈׿יַ׾׻ֹך׌նRNN ךעꅙ籽溷 (Sequential) מؙشة־׼ ؿ٭غמ⚻ꇖֿ鱍׆׾ג״յ׆׿ךע銨槁ך׀םַסךעםַ־כ׵脝ֻ׼ ׿׾סך׌ֿյGraph Network ס韢倀ךע Figure.6 ך鏿俙ס GN block ס ꅙ篙מחַי⺅׽䪒؂׿יֽ׽յ׆ה׼כס篁ײ⻉؂׎ך RNN ؅銨槁׌׾ כ脝ֻ׼׿יַ׾׻ֹך׌ն 16
  11. 1 7 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ㎫ 2.11: GN block סꅙ篙 (Graph Network 韢倀׻׽) ㎫ 2.11 מ韢倀ס Figure.6 ؅獏׊ױ׊גն2-2-1 硼ך⺅׽䪒זג MPNN כ 2-2-2 硼ך⺅׽䪒זג NLNN(Transformer etc) ךע㎫ 2.11 ס (a) ؅氠 ַיֽ׽յ׆׆ךע (c) ؅氠ַגכ脝ֻ׼׿׾כ䘼ַױ׌ն ׆ ס ׻ ֹ מ GN block כ RNN א ס ꅙ 篙 ؅ 氠 ַ ׾ ד ׄ ך 㝂 ׂ ס DeepLearning סؓ٭؞طؠزٔ؅銨槁ך׀׾כ脝ֻי虘ַ־כ䘼ַױ׌ն 2.3 Inductive bias ה Graph Network 2-1 硼ך Inductive bias מחַיյ2-2 硼ך Graph Network מחַי א׿ב׿⺅׽䪒זגסךյ2-3 硼ךעא׿׼מ㕈טַיⴭ䭇溷ם镄掾־׼ DeepLearning סؓ٭؞طؠزٔמחַיױכ״׷脝㷋؅鉿ַױ׌ն 2.3.1 Inductive bias ׾וך״ֲח⟎㹀ׅ׷ַ 2-3-1 硼ךע Inductive bias מ㕈טַיؼٖ٭ٜٚؾشع٠٭ؠס圸ꅎ؅ לס׻ֹמ♳㴻׌׾־מחַי脝㷋؅鉿ַױ׌ն2-1-2 硼ס㎫ 2.2 ך牞霼׊ 17
  12. 1 8 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ג Inductive bias ך׌ֿյؼٖ٭ٜٚؾشع٠٭ؠסؓ٭؞طؠزٔדׄ ךםׂيؕث笠ס䩘岺מֽׄ׾◜⯼⮔䉘 (prior distribution) ׷յ㳔肪מֵ גזיס婞⯵ⵊ꽃סꃯⲎםל㝂㼜מ廌׾כ׈׿יַױ׌ն ׆ס׻ֹמ Inductive bias מע坎չם锶偙ֵֿ׽ױ׌ֿյ㝂㸴ꈌ䓜ם锶 偙؅׊י虘ַסךֵ׿ףյ׆׿ױךֵ׾瓦䍲篙卸؅媘׊י׀ג煝疴מֽׄ׾ Inductive bias ע㸴ם־׼׍䟨⽱ֵֿ׾כ脝ֻי虘ַכ䘼ַױ׌նאסג ״յط؞تع׷韢倀םלס劔⻏ם׵סעא׿ב׿潸䗎ס Inductive bias ؅⻻ ؆דٓظٛ٤ءֿם׈׿יַ׾כ脝ֻי虘ַ־כ䘼ַױ׌ն׆ס需עؼٖ٭ ٜٚؾشع٠٭ؠמכלױ׼׍յب٤وٜם㎇䊟⮔卥׷յⴢꏕه٭تطؔ٤ ءםלס Tree-based סؓوٞ٭زםלאס♑ס塌唩㳔肪סٜؓإٛثّך ׵⻎坎ך׌ն ׆׿׼؅⺇ׄיյInductive bias מחַי脝ֻ׾갾עյ׆׿ױךס煝疴מ ֽׄ׾ٓظٛ٤ءס⯼䳀מא׿ב׿✇ֿ翝־׿י׀גס־כַֹ镄掾ך虝չ כ䪻䳢׌׾׆כֿꓨ锡דכ䘼ַױ׌նגכֻף Tree-based סٜؓإٛثّ עր✇־ס匛⚂מ㕈טׂ⮔㼜؅繪׽ꂉ׌׆כցֿ⯼䳀מ翝־׿גٜؓإٛث ّך׌ն׆׿׵ Inductive bias כ脝ֻ׾׆כֿך׀׾־כ䘼ַױ׌ն ׆ס׻ֹמյ րא׿ב׿ס䩘岺מַֽיא׿ב׿לס׻ֹם⯼䳀ֿ翝־׿ יַ׾־ցמחַי䊬מ脝䢩׊םֿ׼劳ױ׊ַ Inductive bias ס♳㴻מחַ י嗱阧׌׾כַֹסֿ䖩锡םסךעםַ־כ䘼ַױ׌ն 㝕卽ס㕈勓溷ם脝ֻ偙מחַי牞霼ך׀גסךյ2-3-1 硼ע׆׆ױךכ׊յ 籽ׂ 2-3-2 硼ךע DeepLearning מꫀ➳׌׾ٓةٖ٭ٜס Inductive bias מ חַי牞霼׊ױ׌ն 2.3.2 CNNծRNNծGNNծTransformerծMLP 2-3-2 硼ךע CNNյRNNյGNNյTransformerյMLP םלמ䞯㴻׈׿׾ Inductive bias מחַי牞霼؅鉿ַױ׌ն CNN מ䞯㴻׈׿׾ Inductive bias עյ ր氺⦐מַֽיעꁿ⤒סمؠجٜס ⡑כ潸ꫀ؅䭥חցכַֹסֿ╭ם♳㴻ך׌նCNN ך氠ַ׼׿יַ׾沑ײꁎ 18
  13. 1 9 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ײ⭦槏עنٜؔذ⭦槏כ⻎坎ךյنٜؔذ؅ظ٭ذ־׼蔦ⳛ㳔肪׌׾כַֹ סֿ CNN ס㳔肪ס╭ם嵣׿ך׌նױגյCNN ך氠ַ׼׿יַ׾ Pooling ⭦槏ע氺⦐ס㐁緐מ׻׾杅䖇ꓪס䬂⮂؅䟨⽱׊יֽ׽յ׆׿׵闋⦐䍲؅┫ׅ י׵氺⦐ס䟨⽱⻉ַע㜟؂׼םַכַֹ⯼䳀 (Inductive bias) מ㕈טַיַ ׾כ脝ֻ׾׆כֿך׀ױ׌ն RNN(׆׆ךע LSTMյGRU ؅⻻ײױ׌) מ䞯㴻׈׿׾ Inductive bias עյ ր笠⮬ظ٭ذע׆׿ױךס⮂槁篙卸מ㕈טַי姌ֿ气䧯׈׿׾ցכ脝ֻ י虘ַ־כ䘼ַױ׌նRNN 笠ס䩘岺ע♑ס䩘岺מ嬟׬יא׿׮ל篙卸ֿ⮂ יַםַכַֹ׻ֹמ׵䘼؂׿ױ׌ֿյRNN 笠ס䩘岺עꩽַ笠⮬ס⺅׽䪒 ֵַֿױ׽ֹױׂ鉿־םַכַֹסֵֿ׾כ׈׿יַױ׌նאַֹֹ䟨⽱ך RNN 笠׻׽׵ Transformer ס׻ֹם self-attention ي٭تס䩘岺ס偙ֿ笠 ⮬ظ٭ذס⺅׽䪒ַמֵגזיע׻׽虘ַ Inductive bias ؅♳㴻ך׀גכ 锶׾׆כ׵ך׀׾ס־׵׊׿ױ׎؆ն GNN ע⻎免מ┰ֻגءٚنס圸ꅎמ㕈טַי阛砯؅鉿ֹכַֹסֿ Inductive bias כ׊י氠ַ׼׿יַ׾כ脝ֻי虘ַכ䘼ַױ׌նGNN ؅ 栄聋ך脝ֻג갾ע 2-2-1 硼ך⺅׽䪒זג MPNN(Message Passing Neural Network) ؅ْؕ٭ة׌׾כ虘ַכ䘼ַױ׌ֿյ䌮聋ך脝ֻ׾כא׵א׵♑ ס Neural Network ע⪢יءٚنמ㕈טַיַ׾כ锶׾׆כ׵ך׀׾סךյ DeepLearning ⪢✄ֿ GNN דכ脝ֻ׾׆כ׵ך׀׾כ䘼ַױ׌ն2-2 硼ך ⺅׽䪒זג Graph Network ס煝疴ךע׆סꁊס簡┞溷ם⺅׽䪒ַֿ׈׿י ֽ׽յ2021 䌑免ך需꾴ס MLP ي٭تס䩘岺׵ Graph Network ס┞甦כ锶 ׾׆כ׵ך׀ױ׌ն Transformer ע self-attention מ㕈טׂ䩘岺כ׈׿ֿהך׌ֿյ GNN 溷ם ءٚن؅א׿ב׿סؿ٭غ阋靣⭦槏ס㖪⻉עⷃ靣ס꿔⛣䍲 (Dot product attention ס Dot product ע⫐畤ךֵ׽յCos 꿔⛣䍲כ㕈勓溷מע⻎聋ך׌) מ㕈טַי圸碎׊ MPNN כ⻎坎ס⭦槏؅鉿ֹכ脝ֻ׾׆כ׵ך׀ױ׌նא סג״յ րGNN ؅脝ֻ׾מֵגזיסءٚن؅ؿ٭غס꿔⛣䍲מ㕈טַי气 䧯ך׀׾ցכ脝ֻ׾סֿ Transformer ס Inductive bias כ锶ם׊י虘ַ־ כ䘼ַױ׌ն 19
  14. 2 0 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ױגյ 2021 䌑免掾ך׻ׂ需꾴מ┪ֿ׾סֿ MLP ي٭تס䩘岺ך׌ն րself- attention vs MLPցס镄掾ך锶׼׿ֿהך׌ֿյא׵א׵ Transformer 蔦 ✄յattention ⭦槏♧㜽סכ׆؀ע MLP ךֵ׾סךյ㵅עא׿׮לꇙַֿ ֵ׾־כ阋ֻףאֹךעםַסךעםַ־כַֹסֿ瞉脢ס锶闋ך׌ն2-2 硼ס Graph Network ס煝疴ךעؿ٭غⷃ⛺ס MLP ⭦槏מؙشة؅闋׊ גؿ٭غꪨ⭦槏؅㸬⪜׊ג׵ס؅ Graph Network כ㴻聋׊יַגסךյ MLP-Mixer ׷ gMLP םלס䩘岺؅ Transformer כ嬟鼛׊ג갾מյלה׼ ׵ Graph Network כ锶׿׾׆כ־׼א׿׮ל㝕׀םꇙַעםַסךעםַ ־כ脝ֻי虘ַסךעכ䘼ַױ׌ն ׆׆ױך坎չם DeepLearning ס圸䧯מחַי牞霼؅鉿ַױ׊גֿյ㕈勓 溷מע Graph Network(㴻聋ֿױד㴻ױזיַםַⷦ骭סג״յGNN כ׌ ׾־לֹ־ס偂阋עꉌׄױ׌) מ㕈טַיյ Inductive bias ؅㸬⪜׊י圸䧯؅ 尴״׾כ脝ֻ׾סֿ虘ַסךעכ䘼ַױ׌ն籽ׂ 2-3-3 硼ךע MLP-Mixerյ 2-3-4 硼ךע gMLP מחַיא׿ב׿⺅׽䪒ַױ׌ն 2.3.3 MLP-Mixer(MLP ΍) 2-3-3 硼ךע MLP-Mixer מחַי牞霼؅鉿ַױ׌ն ㎫ 2.12: MLP-Mixer ס⭦槏嚣锡 (MLP-Mixer 韢倀׻׽) ٬MLP-Mixer: An all-MLP Architecture for Vision https://arxiv.org/abs/2105.01601 ㎫ 2.12 ע MLP-Mixer ס韢倀ס Figure.1 ־׼ס䫕穀ך׌ֿյ㎫מֽׄ׾ 20
  15. 2 1 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network MLP1 עقشزꪨ (ViT םלס氺⦐⭦槏؅⯼䳀ס阾鼥סג״ Patch כ׈׿ יַױ׌ֿյGNN דכؿ٭غյNLP דכ token מם׾׆כע䫅ֻיֽׂ כ虘ַכ䘼ַױ׌) ס MLP מ׻׾悍砯յMLP2 עא׿ב׿סقشز⫐ס MLP 悍砯מ㸐䗎׌׾כ槏闋׌׾כ虘ַ־כ䘼ַױ׌ն MLP-Mixer ס⭦槏؅ GNN ס镄掾־׼锶׾ם׼յMLP2 עؿ٭غ⫐ ךס MLP ךֵ׽յ׆׿ע Transformer ס FFN(Feed Forward Network) ס⭦槏מ┞蔹׊ױ׌ն┞偙ך MLP1 ע Dot product attention מ㕈טׂ Transformer כע沌ם׽յⷃמؿ٭غꪨ (قشزꪨ) ךס MLP מ׻׾悍砯 כם׽ױ׌նMLP1յMLP2 ס悍砯מֵגזיעقْٚ٭ذ؅氠ַיֽ׽յ 杅מ MLP1 ךע GNN ס걋䱸鉿⮬מ㸐䗎׌׾قْٚ٭ذס㳔肪ֿ鉿؂׿י ַ׾׆כע䪻䳢׊יֽׂכ虘ַ־כ䘼ַױ׌ն ׉זׂ׽脝ֻ׾ם׼յ րTransfomer ךעⷃ靣ס꿔⛣䍲מ㕈טַיءٚن؅ 圸碎׊יַגֿյMLP-Mixer ךעءٚنס圸ꅎ؅㳔肪׌׾ցכַֹ锶偙׵ ⺪茣דכ䘼ַױ׌ն 2.3.4 gMLP(MLP Ύ) 2-3-4 硼ךע gMLP מחַי牞霼؅鉿ַױ׌ն 21
  16. 2 2 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ㎫ 2.13: gMLP ס⭦槏嚣锡Ώ (gMLP 韢倀׻׽) ٬Pay Attention to MLPs https://arxiv.org/abs/2105.08050 ㎫ 2.13 ע gMLP ס韢倀ס Figure.1 ך׌ն㸴׊׆ס㎫דׄדכ؂־׽מׂ ַך׌ֿյ㎫ 2.14 םלמ俙䑑ס阾鼥ֵֿ׽յChannel Proj ע Transformer ס FFN ס⭦槏כ⻎坎ךֵ׾כ韢倀מ阾鼥׈׿יֽ׽յאסג״׆׆ךע ؿ٭غ⫐ךס MLP ס阛砯ֿ鉿؂׿יַ׾׆כֿ؂־׽ױ׌ն 22
  17. 2 3 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ㎫ 2.14: gMLP ס⭦槏嚣锡ΐ (gMLP 韢倀׻׽) ┞偙ך Spatial Gating Unit ֿ gMLP 栃蔦ס⭦槏ך׌ֿյ┫阾ס㎫ 2.15 ך俙䑑םלס阾鼥ֵֿ׽ױ׌ն 23
  18. 2 4 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ㎫ 2.15: Spatial Gating Unit ס⭦槏 (gMLP 韢倀׻׽) MLP-Mixer מ嬟鼛׊י㸴չ⭦槏סْؕ٭ةֿח׀מַׂג״յ׆ה׼ ׻׽׵ MLP-Mixer ؅╚䖥ך䪻䳢׊יֽׂ偙ֿ虘׈אֹםⷦ骭ך׌նױגյ 2-3-3 硼ך⺅׽䪒זג MLP-Mixer כ 2-3-4 硼ך⺅׽䪒זג gMLP עלה ׼׵ Graph Network ס阾岺ך⺅׽䪒ֻ׾׆כ׵䫅ֻיֽׂכ虘ַ־כ䘼ַ ױ׌ն 2.3.5 ➙䖓ך㾜劄חאְגך✮䟝 睗 2 皹ךע׆׆ױך坎չם氠靣מחַי牞霼׊ױ׊גֿյ րGraph Network ס韢倀םלמ㕈טׂ✇־׊׼ס簡┞溷ם銨阾؅⩧מ Inductive bias ؅לס ׻ֹמ陭阛׌׾־ցמ煝疴סنؚ٭؜تֿ瓌׾סךעםַ־յכַֹסֿ瞉 脢ס锶闋ך׌ն銨阾מꫀ׊יע✇־׊׼ס簡┞銨阾ֿ尴ױ׾כ虘ַך׌ֿյ ׵ֹ㸴׊免ꪨֿ־־׾־׵׊׿ױ׎؆ն 24
  19. 2 5 痥 2 畍 Graph Network 2.3 Inductive bias

    ה Graph Network ױגյ րTransformer vs MLPցמחַיע♀䔿׵坎չם饗韢ֿ⮂יׂ׾כ 䘼ַױ׌ֿյTransformer 蔦✄ MLP כ锶ם׎׾סךא׿׮ל㝕׀ם䟨⽱ס םַ嬟鼛דכ䘼ַױ׌նꫀꅙ׊י潲䠊溷םْؕ٭ةֿח־״׾כ虘ַכ䘼ַ ױ׌סך MPNN ؅⩧מ GNN סْؕ٭ة؅䲖ײյGraph Network ס銨阾 ך嵞气ס煝疴؅⻻״י俠槏׌׾כַֹסֿꓨ锡ךעםַ־כ䘼ַױ׌ն ױגյInductive bias ס脝㷋ֿ儨♀㙟ֻיַ׾סךյ韢倀؅⹧攍׌׾갾מ ע Inductive bias ס⽟ꁊס饗韢؅⨲⩰溷מ牞霼׌׾כַֹס׵㝕◜ךעם ַ־כ䘼ַױ׌ն 25