Paper Reading: January 24
Pay Less Attention with Lightweight and Dynamic Convolutions
gumigumi7
January 24, 2019
Transcript
Felix Wu, Angela Fan, Alexei Baevski, Yann Dauphin, Michael Auli. International Conference on Learning Representations (ICLR), 2019.
Slide 2: Overview
• Proposes lightweight and dynamic convolutions as a replacement for the Transformer's self-attention.
• Achieves state-of-the-art results on machine translation (paper accepted to ICLR 2019).
• The convolutions are simpler and cheaper to compute than self-attention.
Slide 3: Background
• RNNs, CNNs, and self-attention are all widely used for sequence modeling.
• Recently, models built on self-attention have become dominant.
  • Ex.: the Transformer.
  • Self-attention computes content-based attention weights over the entire context.
Slide 4: Motivation
• Self-attention compares every position with every other position, so its cost grows quadratically with sequence length.
• It is not obvious that attending to the entire context is actually necessary.
• Content-based attention over the full context does not clearly outperform simpler alternatives (Tang et al., 2018).
• This motivates replacing it with a cheaper, fixed-width operation.
Slide 5: Key idea
• Self-attention: a separate weight for every pair of positions, computed from the whole context.
• Dynamic convolution: weights over a fixed-width window, computed from the current timestep alone.
  • Cost is linear, not quadratic, in the sequence length.
Slide 6: Module overview
• The self-attention block is replaced by a convolutional module:
• a gated linear unit (GLU) on the input, followed by a lightweight convolution;
• dynamic convolution extends the lightweight convolution with per-timestep kernels.
Slide 7: Self-attention (review)
• Each output position is a weighted sum of values, with weights softmax(QKᵀ/√d) computed over all positions.
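As a point of reference (my illustration, not from the slides), a minimal PyTorch sketch of single-head scaled dot-product self-attention; the n×n score matrix is what makes the cost quadratic in sequence length:

```python
# Minimal single-head scaled dot-product self-attention (illustrative sketch).
import math
import torch
import torch.nn.functional as F

def self_attention(x, Wq, Wk, Wv):
    """x: (n, d); Wq/Wk/Wv: (d, d) projection matrices."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / math.sqrt(k.shape[-1])   # (n, n): one weight per position pair
    return F.softmax(scores, dim=-1) @ v        # weighted sum over the whole context

n, d = 7, 512
x = torch.randn(n, d)
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)      # (7, 512)
```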
Slide 8: Depthwise convolutions
• Convolve each channel independently instead of mixing all d channels.
• This reduces the number of kernel weights from d²·k to d·k.
• (Figure: normal vs. depthwise convolutions over the example sentence "we have to go to Tokyo tonight".)
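To make the parameter saving concrete, a minimal PyTorch sketch (my illustration, not the deck's code) contrasting a standard 1-D convolution with a depthwise one (groups=d):

```python
# Standard vs. depthwise 1-D convolution over a sequence of word embeddings.
import torch
import torch.nn as nn

d, k, n = 512, 3, 7          # embedding dim, kernel width, sequence length
x = torch.randn(1, d, n)     # (batch, channels, time), e.g. "we have to go to Tokyo tonight"

normal = nn.Conv1d(d, d, k, padding=k // 2)               # mixes all channels: d*d*k weights
depthwise = nn.Conv1d(d, d, k, padding=k // 2, groups=d)  # one filter per channel: d*k weights

print(sum(p.numel() for p in normal.parameters()))     # 786944 (d*d*k + d bias terms)
print(sum(p.numel() for p in depthwise.parameters()))  # 2048   (d*k + d bias terms)
print(normal(x).shape, depthwise(x).shape)             # both (1, 512, 7)
```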
Slide 9: Lightweight convolutions
• A depthwise convolution whose weights are shared across groups of channels (heads), shrinking the kernel further to H·k weights.
• Kernel weights are normalized with a softmax over the kernel width.
• (Figure: lightweight convolutions over the example sentence "we have to go to Tokyo tonight".)
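A minimal sketch of the lightweight convolution, assuming PyTorch; the function name and shapes are my own, not the fairseq implementation:

```python
# Lightweight convolution: depthwise conv with head-shared, softmax-normalized kernels.
import torch
import torch.nn.functional as F

def lightweight_conv(x, weight):
    """x: (batch, d, n); weight: (H, k) raw kernel weights, H divides d."""
    B, d, n = x.shape
    H, k = weight.shape
    w = F.softmax(weight, dim=-1)              # normalize each head's kernel over width k
    w = w.repeat_interleave(d // H, dim=0)     # share each head across d/H channels -> (d, k)
    w = w.unsqueeze(1)                         # (d, 1, k): depthwise filter bank
    return F.conv1d(x, w, padding=k // 2, groups=d)

x = torch.randn(2, 512, 7)
weight = torch.randn(16, 3)                    # H=16 heads, kernel width k=3
print(lightweight_conv(x, weight).shape)       # (2, 512, 7)
```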
Slide 10: Dynamic convolutions
• The kernel is not fixed: at every timestep it is predicted from the current input by a learned linear function.
• Like self-attention, the weights vary with the input; unlike self-attention, they depend only on the current position, not on the entire context.
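A minimal sketch of a dynamic convolution under the same assumptions; the DynamicConv class and kernel_proj name here are illustrative, not the paper's released code:

```python
# Dynamic convolution: the kernel at each timestep is predicted from that
# timestep's input alone, then applied as a softmax-normalized windowed sum.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv(nn.Module):
    def __init__(self, d, H, k):
        super().__init__()
        self.H, self.k = H, k
        self.kernel_proj = nn.Linear(d, H * k)   # one kernel per head per timestep

    def forward(self, x):                        # x: (batch, n, d)
        B, n, d = x.shape
        H, k = self.H, self.k
        w = self.kernel_proj(x).view(B, n, H, k)
        w = F.softmax(w, dim=-1)                 # normalize over kernel width
        pad = k // 2
        xp = F.pad(x, (0, 0, pad, pad))                    # pad the time axis
        windows = xp.unfold(1, k, 1)                       # (B, n, d, k) sliding windows
        windows = windows.view(B, n, H, d // H, k)         # split channels into heads
        out = torch.einsum('bnhck,bnhk->bnhc', windows, w) # weighted sum over each window
        return out.reshape(B, n, d)

x = torch.randn(2, 7, 512)
print(DynamicConv(512, 16, 3)(x).shape)          # (2, 7, 512)
```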
Slide 11: Model
• Encoder-decoder architecture.
• The Transformer's self-attention modules are replaced with Lightweight Conv / Dynamic Conv; the rest of the architecture is kept as in the Transformer.
Slide 12: (architecture figure)
Slide 13: Results (machine translation)
• On WMT En-De and En-Fr, outperforms the self-attention model of Vaswani et al. (2017) and sets a new state of the art.
• Also evaluated on WMT Zh-En.
Slide 14: Analysis (ablations)
• Compared against a plain convolutional baseline (CNN, k=3).
• Widening the kernel improves accuracy.
• Softmax normalization of the kernel weights contributes to accuracy.
Slide 15: Results (language modeling)
• Performance comparable to self-attention.
Slide 16: Results (summarization)
• Abstractive summarization with a sequence-to-sequence model; outperforms the self-attention baseline.
• Compared against task-specific systems such as Bottom-Up summarization and Celikyilmaz et al. (2018); LightConv and DynamicConv are competitive without task-specific extensions.
Slide 17: Conclusion
• Proposed replacing self-attention with lightweight and dynamic convolutions.
• Achieves state-of-the-art machine translation results and strong performance on other tasks.
• The proposed modules are simpler and faster than self-attention, suggesting full-context content-based attention is not essential.