SYNTHESIZER: Rethinking Self-Attention in Transformer Models

Scatter Lab Inc.

May 08, 2020

Transcript

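  1. Overview
     • Dot-product self-attention is widely regarded as a core, indispensable component of state-of-the-art Transformer models.
     • The paper questions whether this expensive operation is really decisive for Transformer performance.
     • It drops explicit token-token attention such as dot-product attention and proposes the Synthesizer, a model built on Synthetic Attention.
     • On a range of tasks such as NMT and LM, the Synthesizer performs comparably to SOTA Transformer models.
     • It also finds that random learnable alignment matrices can perform well, and that the token-token dependencies of the standard Transformer are not strictly necessary for good performance.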
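  2. Dense Synthesizer
     • For an input $X \in \mathbb{R}^{l \times d}$ of length $l$ and dimension $d$, define a function $F$ that projects the $i$-th token $X_i$ from $d$ dimensions to $l$ dimensions: $B_i = F(X_i)$. Stacking the rows $B_i$ gives an $l \times l$ alignment matrix $B$.
     • $F$ is implemented as a two-layer feed-forward network: $F(X) = W(\sigma_R(W(X) + b)) + b$, where $\sigma_R$ is the ReLU activation and each layer has its own $W$ and $b$.
     • The alignment matrix is then used in the same attention operation as in the Transformer: $Y = \mathrm{Softmax}(B)\,G(X)$, where $G(X)$ is identical to the Value of the standard Transformer.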
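The Dense Synthesizer above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the hidden width of $F$ is set to $d$, $G(X)$ is taken to be a single linear projection `X @ Wg`, and all helper names (`matmul`, `dense_alignment`, `dense_synthesizer`) are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_bias(m, bias):
    """Add a bias vector to every row of m."""
    return [[x + c for x, c in zip(row, bias)] for row in m]

def softmax_rows(m):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    out = []
    for row in m:
        top = max(row)
        exps = [math.exp(x - top) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def dense_alignment(X, W1, b1, W2, b2):
    """B_i = F(X_i): a two-layer ReLU network mapping each d-dim token to l scores."""
    hidden = [[max(0.0, h) for h in row] for row in add_bias(matmul(X, W1), b1)]
    return add_bias(matmul(hidden, W2), b2)  # shape l x l

def dense_synthesizer(X, W1, b1, W2, b2, Wg):
    """Y = Softmax(B) G(X), with G(X) assumed to be the linear map X @ Wg."""
    B = dense_alignment(X, W1, b1, W2, b2)
    return matmul(softmax_rows(B), matmul(X, Wg))

rng = random.Random(0)
l, d = 4, 6                                  # toy sequence length and model dimension
X = rand_matrix(l, d, rng)
W1, b1 = rand_matrix(d, d, rng), [0.0] * d   # first layer (hidden width d is an assumption)
W2, b2 = rand_matrix(d, l, rng), [0.0] * l   # second layer projects to l scores per token
Wg = rand_matrix(d, d, rng)
B = dense_alignment(X, W1, b1, W2, b2)
Y = dense_synthesizer(X, W1, b1, W2, b2, Wg)
```

Note that each row of `B` is computed from one token only; no query-key dot product between tokens is ever formed.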
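  3. Random Synthesizer
     • More simply, the alignment matrix is a randomly initialized matrix $R$ that does not depend on the input $X$: $Y = \mathrm{Softmax}(R)\,G(X)$.
     • The key idea is not to use token-by-token interaction or the information of each individual token, but to learn an alignment suited to the specific task.
     • Two variants of $R$ are evaluated: Trainable and Fixed.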
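A small sketch may make the input independence of the Random Synthesizer concrete. It assumes $G(X)$ is a linear projection `X @ Wg` (an assumption for illustration), and the helper names are hypothetical. Whether $R$ is Trainable or Fixed only changes whether it receives gradient updates during training, which this forward-only sketch does not model.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    out = []
    for row in m:
        top = max(row)
        exps = [math.exp(x - top) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def random_synthesizer(X, R, Wg):
    """Y = Softmax(R) G(X); the attention weights never look at X."""
    weights = softmax_rows(R)
    return matmul(weights, matmul(X, Wg))

rng = random.Random(0)
l, d = 4, 3
R = rand_matrix(l, l, rng)     # randomly initialized l x l alignment matrix
Wg = rand_matrix(d, d, rng)    # assumed linear value projection
X1 = rand_matrix(l, d, rng)
X2 = rand_matrix(l, d, rng)

Y1 = random_synthesizer(X1, R, Wg)
Y2 = random_synthesizer(X2, R, Wg)

# Because the weights are the same for every input and G is linear here,
# the whole layer is linear in X: Y(X1 + X2) equals Y(X1) + Y(X2).
X_sum = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(X1, X2)]
Y_sum = random_synthesizer(X_sum, R, Wg)
```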
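  4. Factorized Dense Synthesizer
     • $A, B = F_A(X_i), F_B(X_i)$, where $F_A$ and $F_B$ are functions that project $X_i$ to $a$ and $b$ dimensions respectively, with $a \cdot b = l$.
     • $Y = \mathrm{Softmax}(C)\,G(X)$ with $C = H_A(A) * H_B(B)$, where $H$ is a simple tiling function that duplicates its input $k$ times.
     • In other words, the original function $F$, which projects the $d$-dimensional input $X_i$ to $l$ dimensions, is factorized into two functions $F_A$ and $F_B$.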
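The tiling step can be illustrated with a short sketch. Several choices here are assumptions, since the slide does not spell them out: $F_A$ and $F_B$ are reduced to single linear projections, the tiling function repeats the whole vector end to end, and `*` is read as an element-wise product. All function names are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def tile(vec, k):
    """Tiling function H: repeat the whole vector k times (assumed tiling order)."""
    return list(vec) * k

def factorized_dense_scores(X, Wa, Wb):
    """Build the l x l score matrix C = H_A(A) * H_B(B), one row per token."""
    a, b = len(Wa[0]), len(Wb[0])
    A = matmul(X, Wa)    # l x a, stands in for F_A
    Bm = matmul(X, Wb)   # l x b, stands in for F_B
    C = []
    for row_a, row_b in zip(A, Bm):
        tiled_a = tile(row_a, b)   # length a * b = l
        tiled_b = tile(row_b, a)   # length b * a = l
        C.append([p * q for p, q in zip(tiled_a, tiled_b)])  # element-wise (assumed)
    return A, Bm, C

rng = random.Random(0)
a, b, d = 2, 3, 5
l = a * b                          # the factorization requires a * b = l
X = rand_matrix(l, d, rng)
Wa = rand_matrix(d, a, rng)
Wb = rand_matrix(d, b, rng)
A, Bm, C = factorized_dense_scores(X, Wa, Wb)
```

`C` would then be passed through a row-wise softmax and multiplied by $G(X)$, exactly as in the unfactorized Dense Synthesizer. The projections now output $a + b$ values per token instead of $l$.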
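  5. Factorized Random Synthesizer
     • $Y = \mathrm{Softmax}(R_1 R_2^{T})\,G(X)$
     • The random alignment matrix $R$ is factorized into two matrices $R_1, R_2$ of size $l \times k$, with $k \ll l$.
     • The parameter count drops from $l^2$ to $2kl$. In practice, $k = 8$ is used.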
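The parameter saving is easy to check numerically. The sketch below compares $l^2$ with $2kl$ for $k = 8$ and rebuilds an $l \times l$ score matrix from two $l \times k$ factors. The sequence length 512 is a hypothetical value chosen for illustration, and the function names are not from the source.

```python
import random

def param_counts(l, k):
    """Return (full, factorized) parameter counts: l^2 versus 2 * k * l."""
    return l * l, 2 * k * l

def low_rank_scores(R1, R2):
    """Compute R1 @ R2^T for two l x k matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(r1, r2)) for r2 in R2] for r1 in R1]

full, factorized = param_counts(512, 8)   # l = 512 is a hypothetical sequence length
print(full, factorized)                   # 262144 8192

rng = random.Random(0)
l, k = 5, 2                               # tiny sizes so the result is easy to inspect
R1 = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(l)]
R2 = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(l)]
R = low_rank_scores(R1, R2)               # l x l scores with rank at most k
```

For $l = 512$ the factorized form uses 32 times fewer parameters, at the cost of restricting $R$ to rank at most $k$.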
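  6. Mixture of Synthesizers
     • $Y = \mathrm{Softmax}(\alpha_1 S_1(X) + \dots + \alpha_N S_N(X))\,G(X)$
     • Several synthesizing functions are mixed and used together.
     • Each $S$ is one of the Dense or Random Synthesizers described above, and the $\alpha$ are learnable parameters with $\sum \alpha = 1$.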
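The mixture can be sketched as a weighted sum of score matrices followed by a single softmax. How the constraint $\sum \alpha = 1$ is enforced is not stated on the slide; this sketch assumes a softmax over unconstrained logits, and the two score matrices are made-up stand-ins for the outputs of a Random and a Dense Synthesizer.

```python
import math

def softmax(values):
    """Softmax over a flat list, shifted by the max for numerical stability."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def mix_scores(score_mats, logits):
    """Return (alphas, sum_n alpha_n * S_n) for equally shaped score matrices."""
    alphas = softmax(logits)   # assumed way of keeping sum(alpha) == 1
    rows, cols = len(score_mats[0]), len(score_mats[0][0])
    mixed = [[sum(a * S[i][j] for a, S in zip(alphas, score_mats))
              for j in range(cols)] for i in range(rows)]
    return alphas, mixed

# Made-up 2 x 2 score matrices, e.g. one from a Random and one from a Dense Synthesizer.
S1 = [[0.0, 2.0], [4.0, 6.0]]
S2 = [[2.0, 0.0], [0.0, 2.0]]

alphas, mixed = mix_scores([S1, S2], logits=[0.0, 0.0])   # equal logits: plain average
attention = [softmax(row) for row in mixed]               # Softmax applied after mixing
print(alphas)   # [0.5, 0.5]
print(mixed)    # [[1.0, 1.0], [2.0, 4.0]]
```

The resulting `attention` matrix would then multiply $G(X)$ as in the single-synthesizer variants. Mixing happens on the scores before the softmax, so the mixture still yields one normalized attention matrix.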