文献紹介 1月24日 (Paper Introduction, January 24)
Pay Less Attention with Lightweight and Dynamic Convolutions
gumigumi7
January 24, 2019
Transcript
Slide 1: Pay Less Attention with Lightweight and Dynamic Convolutions
Felix Wu, Angela Fan, Alexei Baevski, Yann Dauphin, Michael Auli
International Conference on Learning Representations, 2019
Slide 2: Overview
• Proposes lightweight and dynamic convolutions as a replacement for self-attention in the Transformer
• Achieves state-of-the-art results on machine translation (ICLR 2019)
Slide 3: Background
• RNNs, CNNs, and self-attention are the main approaches to sequence modeling
• Self-attention currently gives the strongest results (e.g., the Transformer)
• Self-attention compares every position with every other position, so its cost grows quadratically with sequence length
Slide 4: Motivation
• It is unclear how much of self-attention's strength comes from computing content-based weights over the entire context
• Whether this is actually necessary has been questioned (Tang et al., 2018)
• This work examines whether a fixed, limited context is sufficient
Slide 5: Key idea
• Self-attention: the weights for each position are computed from the entire context
• Dynamic convolution (proposed): the weights are computed from the current time step only
Slide 6: Modules
• Self-attention block vs. the proposed convolution block
• Proposed block: input projection with gated linear units (GLU), then a lightweight convolution (or dynamic convolution), then an output projection (sketched below)
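A minimal PyTorch sketch of this block structure, with a plain depthwise convolution standing in for LightConv/DynamicConv; class and layer names are illustrative, not the fairseq implementation.

import torch
import torch.nn as nn

class ConvBlockSketch(nn.Module):
    # Input projection widened for the GLU gate, a depthwise convolution over
    # time standing in for LightConv/DynamicConv, then an output projection.
    def __init__(self, d_model=512, kernel_size=3):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.glu = nn.GLU(dim=-1)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2, groups=d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):                                  # x: (batch, time, d_model)
        y = self.glu(self.in_proj(x))                      # gate halves the doubled width
        y = self.conv(y.transpose(1, 2)).transpose(1, 2)   # convolve over the time axis
        return self.out_proj(y)

block = ConvBlockSketch()
print(block(torch.randn(2, 10, 512)).shape)                # torch.Size([2, 10, 512])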
Slide 7: Self-attention
• Each output position is a weighted sum over all positions; the weights are computed by comparing the current position with every other position (standard formulation sketched below)
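For reference, a minimal single-head scaled dot-product self-attention, which the following slides contrast with the proposed convolutions; the projection matrices here are random placeholders.

import math
import torch
import torch.nn.functional as F

def self_attention(x, wq, wk, wv):
    # x: (batch, time, d); wq/wk/wv: (d, d) projection matrices.
    q, k, v = x @ wq, x @ wk, x @ wv
    # the weights compare every position with every other position: O(time^2)
    scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
    return F.softmax(scores, dim=-1) @ v

d = 512
x = torch.randn(2, 10, d)
out = self_attention(x, torch.randn(d, d), torch.randn(d, d), torch.randn(d, d))
print(out.shape)                                           # torch.Size([2, 10, 512])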
Slide 8: Depthwise convolutions
• Apply a separate kernel to each channel instead of mixing all channels, reducing the number of parameters from d²·k to d·k
• Example: "we have to go to Tokyo tonight" (normal convolutions vs. depthwise convolutions; parameter counts checked below)
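A small PyTorch check of the parameter savings; the dimensions d=512 and k=3 are illustrative.

import torch
import torch.nn as nn

d, k = 512, 3
x = torch.randn(1, d, 8)                                   # (batch, channels, time)

standard = nn.Conv1d(d, d, kernel_size=k, padding=k // 2)             # mixes channels
depthwise = nn.Conv1d(d, d, kernel_size=k, padding=k // 2, groups=d)  # one kernel per channel

print(sum(p.numel() for p in standard.parameters()))       # d*d*k + d = 786,944
print(sum(p.numel() for p in depthwise.parameters()))      # d*k + d   = 2,048
print(depthwise(x).shape)                                  # torch.Size([1, 512, 8])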
Slide 9: Lightweight convolutions
• A depthwise convolution whose kernel weights are normalized with a softmax over the kernel positions
• Kernels are shared across the channels within a head, shrinking the parameter count further
• Example: "we have to go to Tokyo tonight" (sketch below)
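A minimal sketch of this operation, assuming inputs shaped (batch, time, channels); the head count and kernel size are illustrative.

import torch
import torch.nn.functional as F

def lightweight_conv(x, weight):
    # x: (batch, time, channels); weight: (num_heads, kernel_size).
    B, T, C = x.shape
    H, K = weight.shape
    w = F.softmax(weight, dim=-1)                          # normalize over kernel positions
    w = w.repeat_interleave(C // H, dim=0).unsqueeze(1)    # shared kernel per head -> (C, 1, K)
    out = F.conv1d(x.transpose(1, 2), w, padding=K // 2, groups=C)  # depthwise conv over time
    return out.transpose(1, 2)

x = torch.randn(2, 10, 512)
weight = torch.randn(16, 3)                                # 16 heads, kernel size 3
print(lightweight_conv(x, weight).shape)                   # torch.Size([2, 10, 512])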
Slide 10: Dynamic convolutions
• The kernel is predicted from the current time step by a learned linear function, then softmax-normalized as in lightweight convolutions
• Unlike self-attention, the weights do not depend on the entire context, so the cost is linear in sequence length (sketch below)
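A readable (not efficiency-oriented) sketch, assuming one kernel is predicted per head from each time step's input; names and shapes are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConvSketch(nn.Module):
    # Predict a softmax-normalized kernel of size K per time step and per head,
    # and apply it to a local window of the input.
    def __init__(self, channels, num_heads, kernel_size):
        super().__init__()
        self.H, self.K = num_heads, kernel_size
        self.kernel_proj = nn.Linear(channels, num_heads * kernel_size)

    def forward(self, x):                                  # x: (batch, time, channels)
        B, T, C = x.shape
        H, K = self.H, self.K
        w = F.softmax(self.kernel_proj(x).view(B, T, H, K), dim=-1)
        pad = K // 2
        x_pad = F.pad(x, (0, 0, pad, pad))                 # pad the time dimension
        windows = x_pad.unfold(1, K, 1)                    # local windows: (B, T, C, K)
        windows = windows.reshape(B, T, H, C // H, K)
        out = torch.einsum('bthck,bthk->bthc', windows, w) # per-position weighted sum
        return out.reshape(B, T, C)

layer = DynamicConvSketch(channels=512, num_heads=16, kernel_size=3)
print(layer(torch.randn(2, 10, 512)).shape)                # torch.Size([2, 10, 512])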
Slide 11: Experimental setup
• Encoder-decoder models
• The self-attention modules of the Transformer are replaced with Lightweight Conv or Dynamic Conv
Slide 12
Slide 13: Results (machine translation)
• En-De and En-Fr: outperforms the self-attention baseline (Vaswani et al., 2017) and sets a new state of the art
• Zh-En: comparable or better results as well
Slide 14: Ablation
• Compared against a plain convolutional baseline (CNN, k=3)
• Increasing the kernel size improves accuracy
• Softmax normalization of the weights also contributes to the improvement
Slide 15: Results (language modeling)
• Comparable to self-attention

Slide 16: Results (summarization)
• Sequence-to-sequence summarization; baselines include self-attention models, Bottom-Up summarization, and Celikyilmaz et al. (2018)
• LightConv and DynamicConv are competitive with these baselines
Slide 17: Conclusion
• Convolutions that do not compute weights over the entire context can match or exceed self-attention
• Achieves state-of-the-art results on machine translation