LSPC deep-people for music processing #06 RNN
Yuya Yamamoto
May 05, 2022
Research
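These are the materials from a study group on deep learning and music data held at the Human and Sound Informatics Laboratory (人と音の情報学研究室), University of Tsukuba.
They may contain errors. If you find any, please point them out.
#06 Recurrent Neural Networks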
Transcript
Recurrent Neural Network (RNN) Deep-people #6
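Review of the previous session
• CNN
• Locality and position invariance
• Convolution, pooling, fully connected layers
• Introduction to well-known models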
References for this session
• Juhan's slides
• https://mac.kaist.ac.kr/~juhan/gct634/Slides/[week9-1]%20recurrent%20neural%20network.pdf
• Understanding LSTM Networks
• http://colah.github.io/posts/2015-08-Understanding-LSTMs/
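Topic of this session
• Recurrent neural networks (hereafter, RNNs)
• Neural networks that achieved success in natural language processing
• Considered especially strong at processing sequential data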
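Sequential data
• Audio tasks often require producing sequential data as output
• A CNN can also be used, but it only processes each frame independently (figure below)
• → Can context information be used to account for the influence of other frames?
Figure (ex. pitch estimation by CNN): even while the same pitch is being sustained, the pitch estimate is decided only from the information at that time point, so the estimates at other time points are not being used.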
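Connecting information from other frames
• In addition to the input at the time step of interest, use the hidden-layer state from the past or the future
• Feed the hidden-layer state in as context information
• → This makes it possible to take a wider range of influence into account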
Recurrent Neural Network (RNN)
• A neural network that takes the hidden-layer state of the previous time step together with the current input
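Vanilla RNN
• The so-called simplest RNN
• There are two kinds of weights: W_t for the hidden state of the previous time step and W_h for the input at the current time step (plus a bias b)
• tanh is often used as the activation function g(·)
• The output is obtained by applying an affine transform and an activation function f (sometimes f is not distinguished from g)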
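The step described on this slide can be sketched in NumPy as follows. The names `W_t` (previous hidden state) and `W_h` (current input) follow the slide; the output weights `W_o` and `b_o`, the layer sizes, and the choice of softmax for f are assumptions made only for illustration.

```python
import numpy as np

def rnn_step(x, h_prev, W_h, W_t, b, W_o, b_o):
    """One vanilla RNN step: hidden update with g = tanh, then an output layer."""
    h = np.tanh(W_h @ x + W_t @ h_prev + b)   # new hidden state
    z = W_o @ h + b_o                         # affine transform for the output
    y = np.exp(z - z.max())
    y /= y.sum()                              # softmax (assumed choice of f)
    return h, y

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 4, 2                  # hypothetical sizes
W_h = rng.normal(size=(n_hid, n_in)) * 0.5
W_t = rng.normal(size=(n_hid, n_hid)) * 0.5
b = np.zeros(n_hid)
W_o = rng.normal(size=(n_out, n_hid)) * 0.5
b_o = np.zeros(n_out)

h, y = rnn_step(rng.normal(size=n_in), np.zeros(n_hid), W_h, W_t, b, W_o, b_o)
print(h.shape, y.shape)
```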
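Forward propagation (forward computation)
• Unroll the state at each time step horizontally
• Think of it as one huge NN that shares the weights (W_h^(i), W_t^(i)) across all time steps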
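A minimal sketch of the unrolled forward pass, assuming a tanh activation and hypothetical sizes. The loop reuses the same `W_h`, `W_t`, and `b` at every time step, which is the weight sharing the slide refers to.

```python
import numpy as np

def rnn_forward(xs, h0, W_h, W_t, b):
    """Unroll a vanilla RNN over time; the same weights are used at every step."""
    h = h0
    hs = []
    for x in xs:                      # xs has shape (T, n_in)
        h = np.tanh(W_h @ x + W_t @ h + b)
        hs.append(h)
    return np.stack(hs)               # shape (T, n_hid)

rng = np.random.default_rng(0)
T, n_in, n_hid = 5, 3, 4              # hypothetical sizes
xs = rng.normal(size=(T, n_in))
W_h = rng.normal(size=(n_hid, n_in)) * 0.5
W_t = rng.normal(size=(n_hid, n_hid)) * 0.5
b = np.zeros(n_hid)
hs = rnn_forward(xs, np.zeros(n_hid), W_h, W_t, b)
print(hs.shape)  # (5, 4)
```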
Backpropagation (backward)
• Gradients are propagated from later time steps toward earlier ones
• Also called back-propagation through time (BPTT)
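As a rough illustration of propagating gradients backward in time, the sketch below runs a tanh RNN forward, then pushes an assumed gradient at the last hidden state back to the first step using the chain rule. The weights, sizes, and the all-ones starting gradient are hypothetical. With small recurrent weights, the gradient norm shrinks quickly toward earlier steps.

```python
import numpy as np

def grad_norms_through_time(W_t, W_h, xs):
    """Forward pass, then propagate dL/dh from the last step back to the first."""
    n_hid = W_t.shape[0]
    h = np.zeros(n_hid)
    hs = []
    for x in xs:
        h = np.tanh(W_h @ x + W_t @ h)
        hs.append(h)
    grad = np.ones(n_hid)             # assumed dL/dh at the final time step
    norms = []
    for h in reversed(hs):            # later time steps first
        norms.append(np.linalg.norm(grad))
        grad = W_t.T @ ((1.0 - h ** 2) * grad)   # chain rule through tanh and W_t
    return norms[::-1]                # index 0 = earliest time step

rng = np.random.default_rng(0)
T, n_in, n_hid = 30, 3, 8             # hypothetical sizes
xs = rng.normal(size=(T, n_in))
W_h = rng.normal(size=(n_hid, n_in)) * 0.1
small = grad_norms_through_time(0.1 * np.eye(n_hid), W_h, xs)
print(small[-1], small[0])            # norm at the last step vs. the first step
```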
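Problems with BPTT
• The vanishing/exploding gradient problem
• The longer the sequence, the more unstable the gradient values become
• For exploding gradients, gradient clipping is used (when the gradient exceeds a threshold, it is set to that threshold)
• Vanishing gradients, on the other hand, cannot be handled this way, so the architecture itself has to be improved
Figure: gradient clipping, from https://medium.com/@ayushch612/vanishing-gradient-and-exploding-gradient-problems-7737c0aa535f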
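Gradient clipping can be sketched in a few lines. The version below rescales the gradient by its L2 norm, which is one common variant. The element-wise variant via `np.clip` is closer to the literal "set it to the threshold" wording. The threshold value is an arbitrary assumption.

```python
import numpy as np

def clip_by_norm(grad, threshold):
    """Rescale grad so that its L2 norm does not exceed threshold."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        return grad * (threshold / norm)
    return grad

g = np.array([3.0, 4.0])              # L2 norm is 5
clipped = clip_by_norm(g, 1.0)        # direction kept, norm reduced to 1
elementwise = np.clip(g, -1.0, 1.0)   # each component limited to [-1, 1]
print(clipped, elementwise)
```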
Gated RNNs
• Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM)
• Can be trained even on data with long sequences
How LSTM works
• Explained in terms of four modules
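① Forget gate: "forget" past information that is no longer needed
• f_t: takes values in the range [0, 1] and controls how much influence the memory cell of the previous time step has
• The closer to 0, the more of the previous information is discarded; the closer to 1, the more is retained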
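② "New memory cell" that uses the input at the current time step: add new memory for the current time
• C̃_i: adds the information of the current time step
• i_t: a weight for judging how valuable the information in C̃_i is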
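③ Memory cell update: compute the memory cell passed on to the next time step
• ★: the memory of the previous time step that was not forgotten
• ★: the memory of the current time step that should be remembered
* denotes the element-wise product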
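④ Output gate: the main part
• O_t: an affine transform and activation function applied to the hidden state of the previous time step h_{t-1} and the input at the current time step X_t
• h_t: the element-wise product of O_t and the memory cell C_t passed through an activation function → passed on to the next time step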
Summary: six equations in total (① to ⑥)
• Four sets of parameters
• GRU is omitted here, but it is broadly the same
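A minimal NumPy sketch of one LSTM step with six equations and four weight sets. It follows the standard formulation in the "Understanding LSTM Networks" post listed in the references, and its notation may differ from the slide. The dictionary layout of `params`, the sizes, and the random initialisation are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'c', 'o' to (W, b) pairs."""
    v = np.concatenate([h_prev, x])                          # [h_{t-1}, x_t]
    f = sigmoid(params["f"][0] @ v + params["f"][1])         # forget gate
    i = sigmoid(params["i"][0] @ v + params["i"][1])         # input gate
    c_tilde = np.tanh(params["c"][0] @ v + params["c"][1])   # candidate memory
    c = f * c_prev + i * c_tilde                             # memory cell update
    o = sigmoid(params["o"][0] @ v + params["o"][1])         # output gate
    h = o * np.tanh(c)                                       # new hidden state
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4                                           # hypothetical sizes
params = {k: (rng.normal(size=(n_hid, n_hid + n_in)) * 0.5, np.zeros(n_hid))
          for k in "fico"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):                         # run over a short sequence
    h, c = lstm_step(x, h, c, params)
print(h.shape, c.shape)
```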
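Various input/output configurations for RNNs: choose according to the task setting
• One-to-Many: generate sequential data from scratch (the input x is a start token). Example: music generation
• Many-to-One: estimate global information from time-series data. Example: sentiment estimation for text
• Many-to-Many (aligned): estimate an output that corresponds to the input at each time step. Example: part-of-speech tagging of words
• Many-to-Many (unaligned): also called Seq2Seq. Conversion between sequences of different lengths. Example: machine translation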
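Two of these configurations differ only in which hidden states are passed to the output layer, as the sketch below illustrates. It assumes a tanh RNN with hypothetical weights and sizes, and it does not cover the One-to-Many or Seq2Seq cases.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 4, 2                   # hypothetical sizes
W_h = rng.normal(size=(n_hid, n_in)) * 0.5     # input weights
W_t = rng.normal(size=(n_hid, n_hid)) * 0.5    # recurrent weights
W_o = rng.normal(size=(n_out, n_hid)) * 0.5    # output weights (assumed)

def hidden_states(xs):
    """Return the hidden state at every time step, shape (T, n_hid)."""
    h = np.zeros(n_hid)
    hs = []
    for x in xs:
        h = np.tanh(W_h @ x + W_t @ h)
        hs.append(h)
    return np.stack(hs)

xs = rng.normal(size=(6, n_in))
hs = hidden_states(xs)
many_to_many = hs @ W_o.T      # one output per input step (aligned case)
many_to_one = W_o @ hs[-1]     # a single output from the last hidden state
print(many_to_many.shape, many_to_one.shape)   # (6, 2) (2,)
```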
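Bi-directional RNN: consider future → past information in addition to past → future
• When future information helps with classification, the RNN can be made bidirectional
• (It may be better not to use it for real-time processing...?)
• That said, papers that use it anyway are quite common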
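A bidirectional RNN can be sketched as two independent RNNs, one reading the sequence forward and one reading it reversed, with their hidden states concatenated per time step. The weights and sizes below are hypothetical. Because the backward pass needs the whole sequence, the output at the first step already depends on the last input, which is why real-time use is awkward.

```python
import numpy as np

def run_rnn(xs, W_h, W_t):
    """Plain tanh RNN over xs; returns hidden states of shape (T, n_hid)."""
    h = np.zeros(W_t.shape[0])
    hs = []
    for x in xs:
        h = np.tanh(W_h @ x + W_t @ h)
        hs.append(h)
    return np.stack(hs)

def bidirectional(xs, fwd, bwd):
    """fwd and bwd are (W_h, W_t) pairs for the two directions."""
    h_fwd = run_rnn(xs, *fwd)                  # past -> future
    h_bwd = run_rnn(xs[::-1], *bwd)[::-1]      # future -> past, re-aligned to time order
    return np.concatenate([h_fwd, h_bwd], axis=1)

rng = np.random.default_rng(0)
T, n_in, n_hid = 5, 3, 4                       # hypothetical sizes
xs = rng.normal(size=(T, n_in))
fwd = (rng.normal(size=(n_hid, n_in)) * 0.5, rng.normal(size=(n_hid, n_hid)) * 0.5)
bwd = (rng.normal(size=(n_hid, n_in)) * 0.5, rng.normal(size=(n_hid, n_hid)) * 0.5)
out = bidirectional(xs, fwd, bwd)
print(out.shape)  # (5, 8)
```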
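Convolutional recurrent NN (CRNN): success on audio tasks by combining with a CNN
• For video, audio, and similar data, a CNN extracts the features at each time step
• In addition, an RNN captures the features related to how they change over time
Figure from https://ys0510.hatenablog.com/entry/crnn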
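CRNN takes the best of both CNN and RNN
• The CNN captures frequency information well, and the RNN captures temporal information well
• A very good match for processing that uses spectrograms
Cakir et al., "Convolutional recurrent neural networks for polyphonic sound event detection," TASLP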
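The division of labour can be sketched on a toy spectrogram: a convolution along the frequency axis summarises each frame, and an RNN then runs over the resulting per-frame features. This is a deliberately simplified assumption, not the architecture of Cakir et al. (practical CRNNs typically use stacked 2-D convolutions). The kernels, sizes, and random input are hypothetical.

```python
import numpy as np

def conv_features(spec, kernels):
    """Per-frame 1-D convolution along frequency, ReLU, then max over frequency."""
    feats = []
    for frame in spec:                                   # spec: (T, F)
        maps = [np.convolve(frame, k, mode="valid") for k in kernels]
        feats.append([np.maximum(m, 0.0).max() for m in maps])
    return np.array(feats)                               # (T, n_kernels)

def rnn_over_time(feats, W_h, W_t):
    """tanh RNN over the per-frame features; returns (T, n_hid)."""
    h = np.zeros(W_t.shape[0])
    hs = []
    for x in feats:
        h = np.tanh(W_h @ x + W_t @ h)
        hs.append(h)
    return np.stack(hs)

rng = np.random.default_rng(0)
T, F, n_kernels, n_hid = 10, 32, 4, 6                    # hypothetical sizes
spec = np.abs(rng.normal(size=(T, F)))                   # stand-in for a magnitude spectrogram
kernels = rng.normal(size=(n_kernels, 5))
feats = conv_features(spec, kernels)
hs = rnn_over_time(feats,
                   rng.normal(size=(n_hid, n_kernels)) * 0.5,
                   rng.normal(size=(n_hid, n_hid)) * 0.5)
print(feats.shape, hs.shape)  # (10, 4) (10, 6)
```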
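Because of the recurrent structure, training is on the slow side: faster alternative models exist
• Temporal Convolutional Network (TCN)
• Performs 1-D convolution along the time axis and enlarges the receptive field with dilation
• Training is fast, and accuracy is sometimes better than LSTM (Yamamoto's impression)
In fact, RNNs are tending to be displaced...
Bai et al., "An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling"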
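The effect of dilation on the receptive field can be checked with a small sketch. It assumes causal convolutions (each output sees only current and past samples), kernel size 2, and dilations 1, 2, 4. These choices are illustrative and are not taken from the slide.

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """y[t] = sum_k w[k] * x[t - k * dilation], with zeros before the sequence start."""
    T, K = len(x), len(w)
    pad = (K - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])    # left padding keeps the convolution causal
    return np.array([
        sum(w[k] * xp[t + pad - k * dilation] for k in range(K))
        for t in range(T)
    ])

def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)

x = np.zeros(16)
x[0] = 1.0                                     # unit impulse at t = 0
y = x
for d in (1, 2, 4):                            # stacked layers with growing dilation
    y = dilated_causal_conv(y, np.ones(2), d)

rf = receptive_field(2, (1, 2, 4))
print(rf)                  # 8
print(np.nonzero(y)[0])    # [0 1 2 3 4 5 6 7]
```

Three layers are enough for the impulse to reach eight time steps, so the receptive field grows exponentially with depth while each layer stays cheap and parallel over time.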
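Because of the recurrent structure, training is on the slow side: faster alternative models exist
• Transformer
• "The thing that has been taking off lately"
• A model based on the attention mechanism (Attention)
• A CNN looks at the range of its kernel and an RNN looks at the elements of the sequence one by one, whereas a Transformer looks at the entire sequence at once and decides which points to attend to
In fact, RNNs are tending to be displaced...
Details will follow in a future Transformer session
Vaswani et al., "Attention is all you need," NeurIPS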
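Looking at the whole sequence at once can be sketched with single-head scaled dot-product self-attention. The projection matrices and sizes below are hypothetical, and a full Transformer adds multiple heads, positional information, and feed-forward layers that are left out here.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a whole sequence X of shape (T, d)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[1])          # (T, T): every step vs. every step
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over the sequence
    return weights @ V, weights

rng = np.random.default_rng(0)
T, d = 5, 4                                         # hypothetical sizes
X = rng.normal(size=(T, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) * 0.5 for _ in range(3))
out, weights = self_attention(X, W_q, W_k, W_v)
print(out.shape, weights.shape)  # (5, 4) (5, 5)
```

Each row of `weights` says how strongly one time step attends to every other step, and all rows are computed in one matrix product rather than step by step.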
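Summary
• RNNs are effective for sequential data
• Gated RNNs address the vanishing gradient problem on long sequences
• RNN configurations according to the input and output
• CRNN, which combines a CNN with an RNN
• An introduction to models that are displacing RNNs in recent sequence processing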