Music Generation with Deep Learning (ディープラーニングで音楽生成)
masa-ita
October 13, 2018
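For the final assignment of DL4US, I attempted music generation with deep learning.
I tried an LSTM-based prediction model, a VAE, and a GAN.
Slides presented at Python機械学習勉強会 in 新潟 (Python Machine Learning Study Group in Niigata) on 2018-10-13.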
Transcript
Python機械学習勉強会 in 新潟 (Python Machine Learning Study Group in Niigata) Restart #2
Making Music with Keras
2018/10/13
板垣 正敏 (Itagaki Masatoshi)
I took part in DL4US
❖ DL4US is an online deep learning course, open to the general public, run by the Matsuo Yutaka Laboratory at the University of Tokyo.
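A seven-week curriculum
❖ Lesson 1: Handwritten character recognition. Topics: neural networks, Keras, optimization methods, countermeasures against overfitting
❖ Lesson 2: Image recognition with convolutional neural networks. Topics: CNN, data augmentation, Batch Normalization, Skip Connection
❖ Lesson 3: Prediction on sequence data. Topics: RNN basics, LSTM, BPTT, Clipping, shortcuts, gates
❖ Lesson 4: Neural translation models. Topics: language models, Seq2Seq, Attention mechanism
❖ Lesson 5: Generating captions from images. Topics: caption generation, transfer learning, beam search
❖ Lesson 6: Image generation with neural networks. Topics: deep generative models, VAE, GAN
❖ Lesson 7: AI that beats games with neural networks. Topics: DQN, OpenAI Gym, Double DQN, Dueling Network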
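iLect
❖ A virtual environment with GPU access, provided for the online course
❖ Course materials are delivered in JupyterLab
❖ Assignments are run in a contest format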
Final assignment [report]
❖ "You may do anything you like, as long as it relates to deep learning."
Let's try music generation!
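Strategy
❖ Try out a variety of generative models that I rarely use in my day-to-day work
❖ A prediction model based on an RNN (LSTM)
❖ Generation with a VAE (Variational Auto Encoder)
❖ Generation with a GAN (Generative Adversarial Network)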
LSTM
❖ Long Short Term Memory
❖ The best-known architecture for handling sequential data
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Prediction model with LSTM
❖ Given the past sequence, output the probability of what comes next.
https://towardsdatascience.com/lstm-by-example-using-tensorflow-feb0c1968537
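As a rough illustration of this slide: a prediction model of this kind emits one probability per candidate next note, and a melody is generated by repeatedly drawing from that distribution. The sketch below is not taken from the deck; `sample_next`, the `temperature` parameter, and the toy probabilities are assumptions for illustration only.

```python
import math
import random


def sample_next(probs, temperature=1.0, rng=random):
    """Draw an index from a list of probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Rescale log-probabilities; temperature < 1 favours the most likely note.
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    # random.choices normalises the weights itself.
    return rng.choices(range(len(probs)), weights=weights, k=1)[0]


# Hypothetical softmax output over a 4-note vocabulary.
probs = [0.1, 0.6, 0.2, 0.1]
rng = random.Random(0)
melody = [sample_next(probs, temperature=0.8, rng=rng) for _ in range(8)]
```

In a real generation loop the probabilities would be recomputed by the model after each drawn note; here they are held fixed to keep the example self-contained.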
VAE
❖ An Auto Encoder learns a latent space with fewer dimensions than the input, trained so that the output reproduces the input.
❖ A Variational Auto Encoder assumes that the latent space is a multivariate normal space and learns its mean and variance.
http://mlexplained.com/2017/12/28/an-intuitive-explanation-of-variational-autoencoders-vaes-part-1/
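A VAE encoder typically outputs a mean and a log-variance for each latent dimension, and a latent vector is then drawn as the mean plus the standard deviation times standard normal noise (the reparameterization step). A minimal pure-Python sketch of that step follows; `sample_latent` and the toy values are hypothetical and not taken from the deck.

```python
import math
import random


def sample_latent(z_mean, z_log_var, rng=random):
    """Reparameterization: z = mean + sigma * epsilon, with epsilon ~ N(0, 1)."""
    z = []
    for mean, log_var in zip(z_mean, z_log_var):
        sigma = math.exp(0.5 * log_var)  # log-variance -> standard deviation
        epsilon = rng.gauss(0.0, 1.0)
        z.append(mean + sigma * epsilon)
    return z


rng = random.Random(42)
z = sample_latent([0.0, 1.0], [0.0, -2.0], rng=rng)
```

Because the randomness enters only through `epsilon`, the mean and log-variance stay differentiable in a framework implementation, which is why the sampling is written this way.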
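The latent space generated by a VAE
❖ The latent space generated by a VAE is expected to have "meaningful coordinate axes".
❖ The figure on this slide shows MNIST handwritten digits; for face photographs, axes such as "emotion", "glasses", "beard", and "gender" have been found.
https://tiao.io/post/tutorial-on-variational-autoencoders-with-a-concise-keras-implementation/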
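GAN
❖ In a GAN (Generative Adversarial Network), a "forger" (Generator) and an "appraiser" (Discriminator) learn by competing with and sharpening each other.
❖ As a result, the "forger" is expected to generate, from random noise, "realistic works" that the "appraiser" cannot see through.
https://skymind.ai/wiki/generative-adversarial-network-gan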
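LSTM model
❖ A model with three stacked LSTM layers
❖ Dropout is included to curb overfitting, but...

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, Dropout

    # sequence_length and n_vocab come from data preparation (not shown on the slide)
    model = Sequential()
    model.add(LSTM(512, input_shape=(sequence_length, n_vocab), return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(512, return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(512))
    model.add(Dense(256))
    model.add(Dropout(0.3))
    # One softmax probability per entry in the note vocabulary
    model.add(Dense(n_vocab, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['acc'])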
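The `input_shape=(sequence_length, n_vocab)` in the model above implies one-hot encoded windows of notes, each paired with the note that follows as the target. A hedged sketch of how such pairs could be built is shown below; `make_windows`, `one_hot`, and the toy note sequence are assumptions and not the author's preprocessing.

```python
def one_hot(index, n_vocab):
    """Return a list of length n_vocab with a single 1.0 at the given index."""
    vec = [0.0] * n_vocab
    vec[index] = 1.0
    return vec


def make_windows(notes, sequence_length):
    """Slide a fixed-length window over the notes; the target is the next note."""
    vocab = sorted(set(notes))
    note_to_index = {note: i for i, note in enumerate(vocab)}
    n_vocab = len(vocab)
    inputs, targets = [], []
    for start in range(len(notes) - sequence_length):
        window = notes[start:start + sequence_length]
        following = notes[start + sequence_length]
        inputs.append([one_hot(note_to_index[n], n_vocab) for n in window])
        targets.append(one_hot(note_to_index[following], n_vocab))
    return inputs, targets, vocab


# Hypothetical note names standing in for parsed MIDI events.
notes = ["C4", "D4", "E4", "C4", "E4", "G4"]
X, y, vocab = make_windows(notes, sequence_length=3)
```

Each element of `X` then has shape `(sequence_length, n_vocab)` and each element of `y` has length `n_vocab`, matching the input layer and the softmax output of the model.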
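VAE model

    from keras import backend as K
    from keras.layers import Input, LSTM, Dense, Lambda, RepeatVector, TimeDistributed
    from keras.models import Model

    # max_length, n_vocab, lstm_dim, latent_dim and batch_size are set elsewhere (not shown on the slide)

    # Encoder
    x = Input(shape=(max_length, n_vocab))
    h = LSTM(lstm_dim, return_sequences=False, name='lstm_1')(x)
    z_mean = Dense(latent_dim)(h)     # mean of the latent variable
    z_log_var = Dense(latent_dim)(h)  # log of the variance of the latent variable
    encoder = Model(inputs=x, outputs=[z_mean, z_log_var])

    def sampling(args):
        z_mean, z_log_var = args
        epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.0)
        # z_log_var is a log-variance, so the standard deviation is exp(0.5 * z_log_var);
        # this matches the KL term used in the loss.
        return z_mean + K.exp(0.5 * z_log_var) * epsilon

    z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])

    # Decoder
    decoder_input = Input(shape=(latent_dim,))
    repeated_context = RepeatVector(max_length)(decoder_input)
    h_decoded = LSTM(lstm_dim, return_sequences=True)(repeated_context)
    decoder_output = TimeDistributed(Dense(n_vocab, activation='softmax'))(h_decoded)
    decoder = Model(inputs=decoder_input, outputs=decoder_output)
    x_decoded = decoder(z)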
VAE loss function
❖ The VAE loss function adds, to the "reconstruction error" that measures the difference between input and output, a "regularization term" that constrains the parameters of the latent space.

    from keras import backend as K
    from keras import metrics
    from keras.layers import Layer
    from keras.models import Model

    class CustomVariationalLayer(Layer):  # subclass of Layer
        def __init__(self, **kwargs):
            self.is_placeholder = True
            super(CustomVariationalLayer, self).__init__(**kwargs)

        def vae_loss(self, x, x_decoded):
            x = K.flatten(x)
            x_decoded = K.flatten(x_decoded)
            # Reconstruction error
            xent_loss = max_length * metrics.binary_crossentropy(x, x_decoded)
            # Regularization term (KL divergence to a standard normal)
            kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
            return K.mean(xent_loss + kl_loss)

        def call(self, inputs):
            x = inputs[0]
            x_decoded = inputs[1]
            loss = self.vae_loss(x, x_decoded)
            self.add_loss(loss, inputs=inputs)  # use add_loss from the Layer class
            return x  # the output is not actually used

    y = CustomVariationalLayer()([x, x_decoded])
    vae = Model(x, y)  # x is the input and y the output; the output itself is irrelevant
    # The loss added in CustomVariationalLayer is used, so loss is None here
    vae.compile(optimizer='rmsprop', loss=None)
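The regularization term in `vae_loss` is the closed-form KL divergence between the encoder's Gaussian and a standard normal. The pure-Python sketch below evaluates the same expression for a single sample so that its behaviour can be checked by hand; `kl_to_standard_normal` is a hypothetical helper name, not part of the original code.

```python
import math


def kl_to_standard_normal(z_mean, z_log_var):
    """KL( N(mean, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + log_var - mean ** 2 - math.exp(log_var)
        for mean, log_var in zip(z_mean, z_log_var)
    )


# Encoder output identical to the prior: no penalty.
matched = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
# Mean shifted by 1 in one dimension: penalty of 0.5.
shifted = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
```

The term is zero when the encoder output matches the prior and grows as the mean moves away from zero or the variance moves away from one, which is what pulls the latent codes toward a common region.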
GAN model
❖ Training a GAN is a process of training G and D alternately

    from keras.layers import Input, LSTM, Dense, TimeDistributed
    from keras.models import Model
    from keras.optimizers import Adam

    # Generator
    generator_input = Input(shape=(max_length, latent_dim,))
    x = LSTM(lstm_dim, return_sequences=True)(generator_input)
    generator_output = TimeDistributed(Dense(n_vocab, activation='softmax'))(x)
    generator = Model(generator_input, generator_output)

    # Discriminator
    discriminator_input = Input(shape=(max_length, n_vocab))
    x = LSTM(lstm_dim)(discriminator_input)
    dense_output = Dense(256, activation='relu')(x)
    discriminator_output = Dense(2, activation='softmax')(dense_output)
    discriminator = Model(discriminator_input, discriminator_output)
    discriminator.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.1))

    # GAN (generator and discriminator chained)
    gan_input = Input(shape=(max_length, latent_dim))
    x = generator(gan_input)
    gan_output = discriminator(x)
    model = Model(gan_input, gan_output)
    # opt is an optimizer defined elsewhere (not shown on the slide)
    model.compile(loss='binary_crossentropy', optimizer=opt)
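The alternating training of G and D mentioned on this slide can be sketched as a loop over batches. This is an illustration under stated assumptions rather than the author's training code: `train_gan` is a hypothetical helper, the `[fake, real]` ordering of the two softmax outputs is assumed, and the discriminator is assumed to have been frozen (`discriminator.trainable = False`) before the combined model was compiled, which is the usual Keras pattern. Only `predict` and `train_on_batch`, which Keras models provide, are called on the models.

```python
import numpy as np


def train_gan(generator, discriminator, gan, real_data, latent_dim,
              epochs=1, batch_size=32, seed=0):
    """Alternate discriminator and generator updates; return per-step losses."""
    rng = np.random.default_rng(seed)
    max_length = real_data.shape[1]
    # Assumed label layout for the 2-unit softmax output: [fake, real].
    real_labels = np.tile([0.0, 1.0], (batch_size, 1))
    fake_labels = np.tile([1.0, 0.0], (batch_size, 1))
    history = []
    for _ in range(epochs):
        for start in range(0, len(real_data) - batch_size + 1, batch_size):
            real = real_data[start:start + batch_size]
            noise = rng.normal(size=(batch_size, max_length, latent_dim))
            fake = generator.predict(noise)
            # Step 1: update D on a real batch and on a generated batch.
            d_loss_real = discriminator.train_on_batch(real, real_labels)
            d_loss_fake = discriminator.train_on_batch(fake, fake_labels)
            # Step 2: update G through the combined model, asking D to answer "real".
            noise = rng.normal(size=(batch_size, max_length, latent_dim))
            g_loss = gan.train_on_batch(noise, real_labels)
            history.append((0.5 * (d_loss_real + d_loss_fake), g_loss))
    return history
```

With the models from the slide, a call would look like `train_gan(generator, discriminator, model, X_train, latent_dim)`, where `X_train` is a hypothetical array of one-hot fragments with shape `(n_samples, max_length, n_vocab)`.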
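Experiments
❖ Data: MIDI files of Bach's Two-Part Inventions from midiworld.com
❖ For the LSTM and the GAN, fragments cut out of the training pieces were used; for the VAE, the training pieces themselves were used
❖ The LSTM generated melodies close in character to the training pieces
❖ The VAE generated melodies close in character to the original when a training piece was given as input, but generated strongly random melodies when other values in the latent space were specified
❖ Training of the GAN did not go well: it occasionally generated strongly patterned melodies, but most of the output ended up as repetitions of the same note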
https://github.com/masa-ita/keras-music-generators
https://soundcloud.com/itagakim
Announcements
❖ python/django meetup in 新潟 (Niigata)
❖ October 24 (Wed) 19:00-21:00 @ Prototype Cafe
❖ https://pyml-niigata.connpass.com/event/104872/
❖ オープンソースカンファレンス (Open Source Conference) 2018 Niigata
❖ November 10 (Sat) 11:00-17:30 @ ほんぽーと
❖ https://www.ospn.jp/osc2018-niigata/
Python機械学習勉強会 in 新潟 exchanges information over Slack.
An invitation link will be sent to the connpass group later.