Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
ディープラーニングで音楽生成
Search
masa-ita
October 13, 2018
Technology
0
1.2k
ディープラーニングで音楽生成
DL4USの最終課題として、ディープラーニングでの音楽生成を試みた。
LSTMによる予測モデル、VAE、GANを試した。
Python機械学習勉強会in新潟 2018-10-13での発表スライド。
masa-ita
October 13, 2018
Tweet
Share
More Decks by masa-ita
See All by masa-ita
Run Instant NeRF on Docker
itagakim
1
2.1k
3D Clustering and Metric Learning
itagakim
0
270
Cloud TPUの使い方〜BigBirdの日本語学習済みモデルを作る〜
itagakim
0
600
多言語学習済みモデルmT5とは?
itagakim
1
590
AWSのGPUを安く使ってTensorFlowモデルを訓練する方法
itagakim
0
310
最近の自然言語処理モデルの動向
itagakim
1
520
ディープラーニングで芸術はできるか?〜生成系ネットワークの進展〜
itagakim
0
270
AWSとTerraform初心者がやってみたこと
itagakim
1
410
IntroductionToTensorFlow2_0.pdf
itagakim
1
310
Other Decks in Technology
See All in Technology
AWS Lambda のトラブルシュートをしていて思うこと
kazzpapa3
2
180
Oracle Cloud Infrastructureデータベース・クラウド:各バージョンのサポート期間
oracle4engineer
PRO
28
13k
Incident Response Practices: Waroom's Features and Future Challenges
rrreeeyyy
0
160
Introduction to Works of ML Engineer in LY Corporation
lycorp_recruit_jp
0
140
B2B SaaSから見た最近のC#/.NETの進化
sansantech
PRO
0
890
Shopifyアプリ開発における Shopifyの機能活用
sonatard
4
250
Can We Measure Developer Productivity?
ewolff
1
150
FlutterアプリにおけるSLI/SLOを用いたユーザー体験の可視化と計測基盤構築
ostk0069
0
100
AI前提のサービス運用ってなんだろう?
ryuichi1208
8
1.4k
組織成長を加速させるオンボーディングの取り組み
sudoakiy
2
210
初心者向けAWS Securityの勉強会mini Security-JAWSを9ヶ月ぐらい実施してきての近況
cmusudakeisuke
0
130
Amazon CloudWatch Network Monitor のススメ
yuki_ink
1
210
Featured
See All Featured
Being A Developer After 40
akosma
87
590k
jQuery: Nuts, Bolts and Bling
dougneiner
61
7.5k
Git: the NoSQL Database
bkeepers
PRO
427
64k
Bootstrapping a Software Product
garrettdimon
PRO
305
110k
RailsConf 2023
tenderlove
29
900
Building Flexible Design Systems
yeseniaperezcruz
327
38k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
47
2.1k
Fantastic passwords and where to find them - at NoRuKo
philnash
50
2.9k
Java REST API Framework Comparison - PWX 2021
mraible
PRO
28
8.2k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
169
50k
Happy Clients
brianwarren
98
6.7k
Visualization
eitanlees
145
15k
Transcript
Pythonػցֶशษڧձ in ৽ׁ Restart #2 Kerasで ⾳楽を作る 2018/10/13 ൘֞ ਖ਼හ
DL4USに参加し た DL4US౦ژେֶদඌ๛ݚڀ ࣨʹΑΔɺҰൠʹެ։͞Εͨɺ σΟʔϓϥʔχϯάͷΦϯϥ Πϯߨ࠲
7週にわたるカリキュラム ❖ Lesson1: खॻ͖จࣈೝࣝ —- χϡʔϥϧωοτϫʔΫ, Keras, ࠷దԽख๏, աֶशରࡦ ❖
Lesson2: ΈࠐΈχϡʔϥϧωοτϫʔΫͰը૾ೝࣝ —- CNN, σʔλ֦ு, Batch Normalization, Skip Connection ❖ Lesson3: ܥྻσʔλͰ༧ଌ —- RNNجૅ, LSTM, BPTT, Clipping, γϣʔτΧοτ, ήʔτ ❖ Lesson4: χϡʔϥϧ༁Ϟσϧ —- ݴޠϞσϧ, Seq2Seq, Attentionػߏ ❖ Lesson5: ը૾͔ΒΩϟϓγϣϯੜ —- Ωϟϓγϣϯੜ, సҠֶश, ϏʔϜαʔν ❖ Lesson6: χϡʔϥϧωοτͰը૾ੜ —- ਂੜϞσϧ, VAE, GAN ❖ Lesson7: χϡʔϥϧωοτͰήʔϜΛ߈ུ͢ΔAI —- DQN, OpenAI Gym, Double DQN, Dueling Network
iLect ΦϯϥΠϯߨ࠲Ͱఏڙ͞Εͨ GPU͕͑ΔԾڥ JupterLabͰڭࡐఏڙ ՝ίϯςετܗࣜ
最終課題[レポート] ❖ ʮσΟʔϓϥʔχϯάʹؔ͢Δ͜ͱͳΒ ԿΛ͍͍ͬͯΑʯ
⾳楽⽣成をやってみよう!
作戦 ❖ ࣮Ͱ͋·ΓΘͳ͍ੜϞσϧΛ͍Ζ͍Ζͱࢼͯ͠ ΈΔ ❖ RNNʢLSTMʣʹΑΔ༧ଌϞσϧ ❖ VAEʢVariational Auto EncoderʣʹΑΔੜ
❖ GANʢGenerative Adversarial NetworkʣʹΑΔੜ
LSTM ❖ Long Short Term Memory ❖ γʔέϯγϟϧσʔλΛѻ͏ߏͷද֨ http://colah.github.io/posts/2015-08-Understanding-LSTMs/
LSTMによる予測モデル ❖ աڈͷγʔέϯε͔Βɺ࣍ʹԿ͕དྷΔ͔ͷ֬Λग़͢ɻ https://towardsdatascience.com/lstm-by-example-using-tensorflow-feb0c1968537
VAE ❖ Auto Encoderʢࣗݾූ߸Խثʣೖྗͷ࣍ݩΛݮͨ͠જࡏۭؒΛɺೖྗͱग़ྗ͕ಉ͡Α͏ʹͳ ΔΑ͏ʹֶश͢Δɻ ❖ Variational Auto EncoderʢมΦʔτΤϯίʔμʣɺજࡏۭؒΛଟมྔਖ਼نۭؒͱఆ͠ɺ ͦͷฏۉͱΛֶशʹΑͬͯٻΊΔɻ
http://mlexplained.com/2017/12/28/an-intuitive-explanation-of-variational-autoencoders-vaes-part-1/
VAE で⽣成される潜在空間 ❖ VAEʹΑͬͯੜ͞ΕΔજࡏۭؒʹʮҙຯͷ͋Δ࠲ඪ࣠ʯ͕ظ͞ΕΔɻ ❖ ্هMNISTͷखॻ͖ࣈͷྫ͕ͩɺإࣸਅͰʮײʯʮϝΨωʯʮͻ͛ʯʮஉ ঁʯͳͲͷ࠲ඪ͕࣠ݟग़͞Ε͍ͯΔɻ https://tiao.io/post/tutorial-on-variational-autoencoders-with-a-concise-keras-implementation/
GAN ❖ GANʢఢରతੜωοτϫʔΫʣͰɺʮآ࡞ऀʢGeneratorʣʯͱʮؑఆՈ ʢDiscriminatorʣʯ͕᛭ୖຏ͠ͳ͕ΒֶशΛߦ͏ɻ ❖ ݁Ռͱͯ͠ʮآ࡞ऀʯ͕ϥϯμϜͳϊΠζΛͱʹɺʮؑఆՈʯʹݟഁΒΕͳ͍Α͏ ͳʮຊΒ͍͠࡞ʯΛੜ͢Δ͜ͱ͕ظ͞ΕΔɻ https://skymind.ai/wiki/generative-adversarial-network-gan
LSTMのモデル ❖ LSTMΛ3ॏͶͨϞσϧ ❖ աֶशͷ੍ʹDropoutΛೖΕ͍ͯΔ͕ɾɾɾ model = Sequential() model.add(LSTM(512, input_shape=(sequence_length,
n_vocab), return_sequences=True)) model.add(Dropout(0.3)) model.add(LSTM(512, return_sequences=True)) model.add(Dropout(0.3)) model.add(LSTM(512)) model.add(Dense(256)) model.add(Dropout(0.3)) model.add(Dense(n_vocab, activation='softmax')) model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['acc'])
VAEのモデル # Encoder x = Input(shape=(max_length, n_vocab)) h = LSTM(lstm_dim,
return_sequences=False, name='lstm_1')(x) z_mean = Dense(latent_dim)(h) # જࡏมͷฏۉ μ z_log_var = Dense(latent_dim)(h) #જࡏมͷࢄ σͷlog encoder = Model(inputs=x, outputs=[z_mean, z_log_var]) def sampling(args): z_mean, z_log_var = args epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.0) return z_mean + K.exp(z_log_var) * epsilon z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var]) # Decoder decoder_input = Input(shape=(latent_dim,)) repeated_context = RepeatVector(max_length)(decoder_input) h_decoded = LSTM(lstm_dim, return_sequences=True)(repeated_context) decoder_output = TimeDistributed(Dense(n_vocab, activation='softmax'))(h_decoded) decoder = Model(inputs=decoder_input, outputs=decoder_output) x_decoded = decoder(z)
VAEの損失関数 ❖ VAEͷଛࣦؔɺೖྗͱग़ྗͷؒͷࠩҟΛද͢ʮ෮ݩޡࠩʯʹՃ͑ ͯɺજࡏۭؒͷύϥϝʔλΛنఆ͢Δʮਖ਼ଇԽ߲ʯΛ༻͍Δɻ class CustomVariationalLayer(Layer): # Layer classͷܧঝ def
__init__(self, **kwargs): self.is_placeholder = True super(CustomVariationalLayer, self).__init__(**kwargs) def vae_loss(self, x, x_decoded): x = K.flatten(x) x_decoded = K.flatten(x_decoded) xent_loss = max_length * metrics.binary_crossentropy(x, x_decoded) # ෮ݩޡࠩ kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1) # ਖ਼ଇԽ߲ return K.mean(xent_loss + kl_loss) def call(self, inputs): x = inputs[0] x_decoded = inputs[1] loss = self.vae_loss(x, x_decoded) self.add_loss(loss, inputs=inputs) # Layer class ͷadd_lossΛར༻ return x # ࣮࣭తʹग़ྗར༻͠ͳ͍ y = CustomVariationalLayer()([x, x_decoded]) vae = Model(x, y) # xΛinputʹyΛग़ྗ, ग़ྗ࣮࣭ؔͳ͍ vae.compile(optimizer='rmsprop', loss=None) # CustomVariationalLayerͰՃͨ͠LossΛར༻͢ΔͷͰ͜͜ͰͷlossNoneͱ͢Δ
GANのモデル ❖ GANͷ܇࿅GͱDΛަޓʹֶशͤ͞Δϓϩηε # Generator generator_input = Input(shape=(max_length, latent_dim,)) x
= LSTM(lstm_dim, return_sequences=True)(generator_input) generator_output = TimeDistributed(Dense(n_vocab, activation='softmax'))(x) generator = Model(generator_input, generator_output) # Discriminator discriminator_input = Input(shape=(max_length, n_vocab)) x = LSTM(lstm_dim)(discriminator_input) dense_output = Dense(256, activation='relu')(x) discriminator_output = Dense(2, activation='softmax')(dense_output) discriminator = Model(discriminator_input, discriminator_output) discriminator.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.1)) # GAN gan_input = Input(shape=(max_length, latent_dim)) x = generator(gan_input) gan_output = discriminator(x) model = Model(gan_input, gan_output) model.compile(loss='binary_crossentropy', optimizer=opt)
実験 ❖ σʔλmidiworld.comͷόοϋͷ2ΠϯϕϯγϣϯͷMIDIϑΝΠϧΛ༻ ❖ LSTMͱGANͰɺ܇࿅༻ָۂ͔ΒΓग़ͨ͠அยΛɺVAEͰ܇࿅༻ָۂͦͷ ͷΛ༻ͨ͠ ❖ LSTMͰ܇࿅༻ָۂͷχϡΞϯεʹ͍ۙϝϩσΟ͕ੜ͞Εͨ ❖ VAEͰ܇࿅༻ָۂΛೖྗʹͨ͠߹ɺݪۂͷχϡΞϯεʹ͍ۙϝϩσΟ͕ੜ
͞ΕΔ͕ɺͦΕҎ֎ͷજࡏۭؒͷΛࢦఆͨ͠߹ʹϥϯμϜੑͷڧ͍ϝ ϩσΟ͕ੜ͞Εͨ ❖ GANͰֶश͕͏·͘Ώ͔ͣɺ࣌ંύλʔϯੑͷڧ͍ϝϩσΟ͕ੜ͞Εͨ ͕ɺ΄ͱΜͲಉ͡Իූͷ܁Γฦ͠ʹͳͬͯ͠·ͬͨ
https://github.com/masa-ita/keras-music-generators https://soundcloud.com/itagakim
宣伝 ❖ python/django meetup in ৽ׁ ❖ 10݄24ʢਫʣ19:00-21:00 @ Prototype
Cafe ❖ https://pyml-niigata.connpass.com/event/104872/ ❖ ΦʔϓϯιʔεΧϯϑΝϨϯε 2018 Niigata ❖ 11݄10ʢʣ11:00-17:30 @ ΄ΜΆʔͱ ❖ https://www.ospn.jp/osc2018-niigata/
Python機械学習勉強会in新潟では、 Slackを使った情報交換を⾏っています。 後ほどconnpassのグループで招待リンクをお送りします。