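This talk explains why the Wasserstein distance works well as the error function for the inverse FM synthesizer (逆FM音源). These are the slides from a presentation given at Kernel/VM探検隊 online part2 on March 20, 2021. Sample code: https://github.com/Fadis/wifm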
Wasserstein Inverse FM Synthesizer (Wasserstein逆FM音源)
WASSERSTEIN INVERTED FREQUENCY MODULATION SYNTHESIZER
NAOMASA MATSUBAYASHI, Twitter: @fadis_
Sample code: https://github.com/Fadis/wifm
A test sound is playing right now. Can you hear it?
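FM synthesizer: hardware for playing music that was common in old PCs and game consoles.
Fourier transforms of real instrument sounds: grand piano, alto saxophone, vibraphone, violin, marimba.
[Figure: spectra of the five instruments; frequency axis marked at 261.63 Hz, 523.25 Hz, 784.88 Hz, 1046.5 Hz, 1308.1 Hz, 1569.8 Hz, 1831.4 Hz, 2093.0 Hz, 2354.6 Hz] Fundamental: the frequency that corresponds to the musical note.
[Same figure] Harmonics: components at integer multiples of that frequency.
Grand piano: among the components at integer multiples, the higher the frequency, the faster it decays.
If this behavior can be imitated with a simple calculation, simple hardware can produce sounds that resemble real instruments.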
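FM synthesis: $f(t) = E_c(t) \sin(2\pi\omega_c t + E_m(t) B \sin(2\pi\omega_m t))$. The inner sine (the modulator) distorts the outer sine (the carrier).
[Figure: six spectra for $1\omega_c = \omega_m$, $2\omega_c = \omega_m$, $3\omega_c = \omega_m$, $4\omega_c = \omega_m$, $5\omega_c = \omega_m$, $6\omega_c = \omega_m$] Depending on the parameters, harmonics of various magnitudes appear at various frequencies.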
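The formula can be evaluated directly, one sample at a time. The following is a minimal sketch, assuming constant envelopes (E_c = E_m = 1) instead of the time-varying E_c(t) and E_m(t) of the slides; the function name `fm_render` and its parameters are illustrative and are not taken from the sample code.

```cpp
#include <cmath>
#include <vector>

// Two-operator FM voice with constant envelopes (an assumption of this sketch).
// fc: carrier frequency in Hz (omega_c), fm: modulator frequency in Hz (omega_m),
// b: modulation strength (B), rate: sampling rate in Hz.
std::vector<double> fm_render(double fc, double fm, double b,
                              double rate, std::size_t n_samples) {
  const double two_pi = 2.0 * std::acos(-1.0);
  std::vector<double> out(n_samples);
  for (std::size_t i = 0; i < n_samples; ++i) {
    const double t = static_cast<double>(i) / rate;
    // The modulator sine is added to the phase of the carrier sine.
    out[i] = std::sin(two_pi * fc * t + b * std::sin(two_pi * fm * t));
  }
  return out;
}
```

With `b = 0` the output is a plain sine at `fc`; raising `b` spreads energy into sidebands spaced `fm` apart, which is where the harmonics in the figure come from.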
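$f(t) = E_c(t) \sin(2\pi\omega_c t + E_m(t) B \sin(2\pi\omega_m t))$ (five parameters are marked in the formula). If these parameters are tuned so that the output has the same harmonics as a real instrument, it sounds like that instrument.
FM synthesis died out, because it was considered unable to produce very good sounds.
Genetic FM synthesizer (遺伝的FM音源), 2016, 第12回 カーネル/VM探検隊, https://speakerdeck.com/fadis/yi-chuan-de-fmyin-yuan : a genetic algorithm keeps alive the parameters whose sound is closer to the spectrum of the given sound. Flowchart boxes: frequency analysis of the sampled sound; generate random genes; play each gene on the FM synthesizer; frequency analysis of each gene's sound; generate genes by uniform crossover and mutation; compute the distance to the sampled sound; select individuals by roulette selection; no change among the top individuals at maximum resolution? (no / yes); decide whether to raise the resolution.
Real piano vs. FM synthesizer piano: in the course of this experiment, parameters were found that reproduce a piano sound reasonably well, something FM synthesis was thought to be incapable of.
"FM synthesis cannot reproduce instrument timbres" → "Human intuition cannot reproduce instrument timbres with FM synthesis."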
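Inverse FM synthesizer (逆FM音源), 2020, カーネル/VM探検隊@関西, https://speakerdeck.com/fadis/niyurarufmyin-yuan : differentiate the FM synthesis formula and find the parameters by gradient descent.
$f(t) = A \sum_{n=-\infty}^{\infty} J_n(B) \cos(2\pi t (n\omega_m + \omega_c))$
$\frac{dJ_n(x)}{dx} = \frac{1}{2}\left(J_{n-1}(x) - J_{n+1}(x)\right)$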
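The Bessel expansion gives the spectrum directly: the sideband at frequency nω_m + ω_c has amplitude A·J_n(B). The sketch below evaluates it under two assumptions of this example: J_n is computed from the integral representation J_n(x) = (1/π)∫₀^π cos(nτ − x sin τ) dτ with the trapezoid rule, and the infinite sum is truncated to |n| ≤ n_max. The names `bessel_j`, `fm_spectrum`, and `bessel_j_derivative` are illustrative, not taken from the sample code.

```cpp
#include <cmath>
#include <vector>

// Bessel function of the first kind of integer order n, from
// J_n(x) = (1/pi) * integral over [0, pi] of cos(n*tau - x*sin(tau)) d tau.
double bessel_j(int n, double x, int steps = 2000) {
  const double pi = std::acos(-1.0);
  const double h = pi / steps;
  double sum = 0.0;
  for (int k = 0; k <= steps; ++k) {
    const double tau = k * h;
    const double weight = (k == 0 || k == steps) ? 0.5 : 1.0;  // trapezoid ends
    sum += weight * std::cos(n * tau - x * std::sin(tau));
  }
  return sum * h / pi;
}

struct Partial {
  double freq;  // n * wm + wc
  double amp;   // A * J_n(B)
};

// Sidebands of f(t) = A * sum_n J_n(B) cos(2*pi*t*(n*wm + wc)) for |n| <= n_max.
std::vector<Partial> fm_spectrum(double a, double b, double wc, double wm,
                                 int n_max) {
  std::vector<Partial> out;
  for (int n = -n_max; n <= n_max; ++n) {
    out.push_back({n * wm + wc, a * bessel_j(n, b)});
  }
  return out;
}

// dJ_n/dx = (J_{n-1}(x) - J_{n+1}(x)) / 2, the identity needed for the
// gradient of each sideband amplitude with respect to B.
double bessel_j_derivative(int n, double x) {
  return 0.5 * (bessel_j(n - 1, x) - bessel_j(n + 1, x));
}
```

Working in the frequency domain this way avoids rendering and Fourier-transforming a waveform for every candidate (B, ω), and the derivative identity makes each amplitude differentiable in B.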
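Goal: find the ω and B at which the map turns most yellow.
Gradient descent: to find the lowest point, look at the slope at the current position and move a little toward the lower side ("maybe a bit more to the left") until arriving around here. It is like rolling a ball and hoping it stops at the lowest point.
Gradient descent arrives around here ("a bit more to the left? a bit more to the right?") and cannot reach the true minimum, which lies over there.
In this state, gradient descent cannot be applied.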
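The rolling-ball picture, and the way it gets trapped, can be reproduced in a few lines. This is a minimal sketch on a made-up one-variable loss with two valleys (`two_valley_loss` is a hypothetical stand-in, not the error surface of the synthesizer); the slope is taken by a forward difference, and the step size and step count are arbitrary choices.

```cpp
#include <functional>

// Hypothetical loss with a shallow valley near x = 0.93 and a deeper one
// near x = -1.06.
double two_valley_loss(double x) {
  return x * x * x * x - 2.0 * x * x + 0.5 * x;
}

// Plain gradient descent on a one-variable loss: repeatedly look at the
// local slope and move a little toward the lower side.
double gradient_descent(const std::function<double(double)>& loss, double x,
                        double rate, int steps, double delta = 1e-6) {
  for (int i = 0; i < steps; ++i) {
    const double slope = (loss(x + delta) - loss(x)) / delta;
    x -= rate * slope;
  }
  return x;
}
```

Starting at `x = 2.0` with `rate = 0.01` the search settles in the shallow valley on the right, while starting at `x = -2.0` it finds the deeper one on the left; where the ball stops depends entirely on where it was dropped.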
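[Figure: spectra of grand piano, alto saxophone, vibraphone, violin, marimba; frequency axis marked at 261.63 Hz through 2354.6 Hz] Harmonics appear near integer multiples of the fundamental, so $\omega = \omega_m / \omega_c$ is constrained to always be an integer.
It is a problem if the search crosses over this hill.
Start the search from multiple initial values and pick the one that reached the lowest value. As the number of variables grows, the number of trials grows rapidly, so this is hard to apply to three or more operators.
Real violin vs. inverse FM synthesizer violin (2 operators, no feedback): it does not actually sound like a violin. Figure labels: detune; this region has vanished entirely.
Real violin vs. inverse FM synthesizer violin: the real violin has harmonics at frequencies slightly off the integer multiples (the synthesized harmonics sit at exact integer multiples). Because $\omega = \omega_m / \omega_c$ was constrained to be an integer, all of the shifted harmonics are ignored.
The error surface has a bad shape. Improving the shape with small tricks loses the characteristics of the instrument. Maybe the loss function itself is the problem? This is where the new material of this talk begins.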
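Mean squared error, the loss function of the inverse FM synthesizer: $L = \frac{1}{n}\sum_{i=0}^{n}(e_i - g_i)^2$, where $e_i$ are the frequency components of the real sound and $g_i$ those of the generated sound. The errors over all frequencies are summed and averaged. The error is positive whether the difference between real and generated is positive or negative. The more similar $e$ and $g$ are, the smaller $L$ becomes.
Mean squared error, real vs. generated: a component appears at a frequency that is absent from the real sound (A: everything present here counts as error), and a component is missing at a frequency that is present in the real sound (B: everything missing here counts as error). Error = A + B.
Mean squared error, real vs. generated: a component is missing at a frequency that is present in the real sound (everything missing here counts as error). Error = B. Producing no sound is still better than producing sound at the wrong frequency.
The further the parameters move in the direction of the arrow, the more high-frequency components leave the audible range = no sound is produced, which is the "better" state. This is the cause of the hill.
Mean squared error, real vs. generated: error = all or none. When ω changes, the harmonic frequencies move, but under mean squared error, except for the instant they coincide with the real sound, producing nothing is equally better everywhere.
As a result, along the ω direction the map turns yellow only at the brief instants where the frequencies coincide with the real sound.
It is clear that mean squared error, $L = \frac{1}{n}\sum_{i=0}^{n}(e_i - g_i)^2$, is not suitable as a way to measure the distance between timbres.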
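Wasserstein distance
$W(\mathbb{P}, \mathbb{Q}) = \inf_{J \in \mathcal{J}(\mathbb{P}, \mathbb{Q})} \int \|x - y\| \, dJ(x, y)$
$\mathbb{P}$: distribution of the real sound. $\mathbb{Q}$: distribution of the generated sound. $\|x - y\|$: distance from a sample of $\mathbb{P}$ to a sample of $\mathbb{Q}$. $dJ(x, y)$: amount transported from a sample of $\mathbb{P}$ to a sample of $\mathbb{Q}$. A distance that measures the similarity of two distributions as the sum of (transported amount × distance) under the most efficient transport plan.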
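For one-dimensional histograms on equally spaced bins, such as magnitude spectra, the infimum does not require a search: the optimal cost equals the sum of the absolute running differences between the two histograms. The sketch below assumes unit bin spacing and equal total mass in both inputs; the name `wasserstein_1d` is illustrative and is not the function used in the sample code.

```cpp
#include <cmath>
#include <vector>

// 1-D Wasserstein distance between two histograms on the same unit-spaced
// bins. Both inputs are assumed to carry the same total mass.
double wasserstein_1d(const std::vector<double>& p,
                      const std::vector<double>& q) {
  double carried = 0.0;  // surplus mass that must be moved past bin i
  double total = 0.0;
  for (std::size_t i = 0; i < p.size() && i < q.size(); ++i) {
    carried += p[i] - q[i];
    // Moving |carried| units across one bin boundary costs |carried| * 1.
    total += std::fabs(carried);
  }
  return total;
}
```

For example, moving a single unit of mass from bin 0 to bin 2 gives 2, and shifting `{0.5, 0.5, 0}` to `{0, 0.5, 0.5}` gives 1.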
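Wasserstein distance [Figure: two histograms, $\mathbb{P}$ and $\mathbb{Q}$, over positions 0 to 5 with heights from 0 to 1]: $W(\mathbb{P}, \mathbb{Q})$ = (amount) × 1 + (amount) × 1 + (amount) × 2, the sum of (amount that must be carried × distance carried) needed to turn state $\mathbb{P}$ into state $\mathbb{Q}$. The smaller this value, the more similar the two distributions are.
For a mathematically rigorous explanation and efficient ways to compute it, see for example: "Optimal Transport and Wasserstein Distance" in the materials of 36-708 Statistical Methods for Machine Learning (a CMU course), http://www.stat.cmu.edu/~larry/=sml/ ; and Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein generative adversarial networks." International conference on machine learning. PMLR, 2017. https://arxiv.org/abs/1701.07875 (Wasserstein GAN: a paper on a GAN that uses the Wasserstein distance to prevent mode collapse).
What is great about the Wasserstein distance (real vs. generated). Mean-squared-error man: "Something is wrong with the generated sound. Its component is growing where it is not needed, so it would be better without it." Wasserstein man: "So close! If the generated component were just a little further to the left it would have been perfect. Only a tiny difference."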
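The difference between the two judges shows up on a toy spectrum with a single peak. The sketch below is an illustration under assumptions: the spectra are made-up eight-bin histograms, bins are unit-spaced, and `mean_squared_error` and `emd_1d` are names chosen for this example rather than functions from the sample code.

```cpp
#include <cmath>
#include <vector>

// Mean squared error between two spectra of equal length.
double mean_squared_error(const std::vector<double>& e,
                          const std::vector<double>& g) {
  double sum = 0.0;
  for (std::size_t i = 0; i < e.size(); ++i) {
    sum += (e[i] - g[i]) * (e[i] - g[i]);
  }
  return sum / static_cast<double>(e.size());
}

// 1-D Wasserstein (earth mover's) distance on unit-spaced bins, assuming
// both spectra have the same total mass.
double emd_1d(const std::vector<double>& e, const std::vector<double>& g) {
  double carried = 0.0;
  double total = 0.0;
  for (std::size_t i = 0; i < e.size(); ++i) {
    carried += e[i] - g[i];
    total += std::fabs(carried);
  }
  return total;
}

// Hypothetical spectra: the real peak is in bin 2; one candidate misses by
// a single bin, the other by five bins.
const std::vector<double> real_peak = {0, 0, 1, 0, 0, 0, 0, 0};
const std::vector<double> near_miss = {0, 0, 0, 1, 0, 0, 0, 0};
const std::vector<double> far_miss  = {0, 0, 0, 0, 0, 0, 0, 1};
```

Mean squared error rates both candidates the same (0.25 each), so it gives no hint about which way to move the peak. The Wasserstein distance is 1 for the near miss and 5 for the far miss, so the loss falls steadily as the peak approaches the right bin.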
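The optimal B and ω are now at a level where a human can see their position by eye.
[Diagram blocks: real instrument sound, Fourier transform, envelope estimation, normalization, Σ, frequency-domain FM synthesizer, Wasserstein distance function. Quantities: volume A, modulation strength B, frequency ratio of the two sines ω, distance L between the real sound and the generated sound.] First, compute L for some arbitrary B and ω.
[Same diagram, with the derivative of the frequency-domain FM synthesizer, the derivative of the Wasserstein distance function, and Adam] Then use backpropagation to find, from L, how B and ω should be corrected.
The Wasserstein distance function cannot be differentiated analytically.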
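If it cannot be differentiated analytically, differentiate it numerically.

    #include <cstddef>
    #include <tuple>
    #include <vector>

    // forward_x is defined elsewhere in the sample code; it returns the distance
    // for the given inputs. dist is its value at the unmodified inputs.
    template< typename T >
    std::tuple< std::vector< T >, std::vector< T > > backward(
      const std::vector< T > &A, const std::vector< T > &AWeights,
      const std::vector< T > &B, const std::vector< T > &BWeights,
      T dist, T delta, T wdelta
    ) {
      std::vector< T > dAWeights( AWeights.size() );
      std::vector< T > dA( A.size() );
      // Forward difference with respect to each element of A.
    #pragma omp parallel for
      for( std::size_t i = 0; i < A.size(); ++i ) {
        auto modif_ = A;
        modif_[ i ] += delta;
        auto modified_dist = forward_x( modif_, AWeights, B, BWeights );
        dA[ i ] = ( modified_dist - dist ) / delta;
      }
      // Forward difference with respect to each element of AWeights.
    #pragma omp parallel for
      for( std::size_t i = 0; i < AWeights.size(); ++i ) {
        auto modif_ = AWeights;
        modif_[ i ] += wdelta;
        auto modified_dist = forward_x( A, modif_, B, BWeights );
        dAWeights[ i ] = ( modified_dist - dist ) / wdelta;
      }
      return std::make_tuple( dA, dAWeights );
    }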
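The pipeline slides name Adam as the optimizer that turns these gradients into corrections of B and ω. The following is a minimal sketch of one Adam update for a single scalar parameter, using the commonly cited default coefficients; the struct, the function name, and the learning rate are assumptions of this example and do not come from the sample code.

```cpp
#include <cmath>

// Running state Adam keeps per parameter.
struct AdamState {
  double m = 0.0;  // exponential average of the gradient
  double v = 0.0;  // exponential average of the squared gradient
  int t = 0;       // number of updates so far
};

// One Adam update: returns the corrected parameter value.
double adam_step(double param, double grad, AdamState& s, double lr = 0.01,
                 double beta1 = 0.9, double beta2 = 0.999,
                 double eps = 1e-8) {
  ++s.t;
  s.m = beta1 * s.m + (1.0 - beta1) * grad;
  s.v = beta2 * s.v + (1.0 - beta2) * grad * grad;
  // Bias correction compensates for m and v starting at zero.
  const double m_hat = s.m / (1.0 - std::pow(beta1, s.t));
  const double v_hat = s.v / (1.0 - std::pow(beta2, s.t));
  return param - lr * m_hat / (std::sqrt(v_hat) + eps);
}
```

On the first update the step size is close to `lr` regardless of the gradient's magnitude, which helps when the numerically estimated gradients of B and ω have very different scales.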
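Start from B = 5, ω = 5. There is no need to try multiple initial values.
Results
Real violin vs. inverse FM synthesizer violin (2 operators, no feedback): this region has vanished entirely, and it does not actually sound like a violin.
Real violin vs. Wasserstein inverse FM synthesizer violin (2 operators, no feedback): it sounds like a violin.
Real piano vs. FM piano made by a human (2 operators, no feedback): the kind often heard in PCs and games of the '80s and '90s.
Real piano vs. FM piano made with the inverse FM synthesizer (2 operators, no feedback).
Real piano vs. FM piano made with the Wasserstein inverse FM synthesizer (2 operators, no feedback).
Open issues
Searching for piano parameters with the inverse FM synthesizer and with the Wasserstein inverse FM synthesizer, on an Intel Core i5-6500 (Gentoo Linux):
$ time wav2fm2 -i Piano.ff.C4.aiff -d 1 -n 60 -t 40
...
real 4m43.748s
user 17m5.382s
sys 0m3.239s
$ time wav2fm -i Piano.ff.C4.aiff -d 1 -n 60
...
real 3m27.189s
user 3m29.071s
sys 0m0.036s
Numerical differentiation is slow after all.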
Approximating the Wasserstein distance: Cuturi, Marco. "Sinkhorn distances: lightspeed computation of optimal transport." NIPS. Vol. 2. No. 3. 2013. https://papers.nips.cc/paper/2013/hash/af21d0c97db2e27e13572cbf59eb343d-Abstract.html
Differentiating the approximation of the Wasserstein distance: Luise, Giulia, et al. "Differential properties of sinkhorn approximation for learning with wasserstein distance." arXiv preprint arXiv:1805.11897 (2018). https://arxiv.org/abs/1805.11897
With some care, these might be usable.
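To give a feel for what the Sinkhorn approach involves, here is a minimal sketch of the basic Sinkhorn iteration for two histograms on unit-spaced bins with ground cost |i − j|. It is an illustration under assumptions, not part of the sample code: the regularization strength `eps`, the fixed iteration count, and the function name are arbitrary choices, and the differentiation scheme of the second paper is not implemented.

```cpp
#include <cmath>
#include <vector>

// Entropy-regularized transport cost between histograms a and b.
// Both are assumed non-negative with the same positive total mass.
double sinkhorn_cost(const std::vector<double>& a,
                     const std::vector<double>& b, double eps = 0.5,
                     int iterations = 200) {
  const std::size_t n = a.size();
  const std::size_t m = b.size();
  auto cost = [](std::size_t i, std::size_t j) {
    return std::fabs(static_cast<double>(i) - static_cast<double>(j));
  };
  auto kernel = [&](std::size_t i, std::size_t j) {
    return std::exp(-cost(i, j) / eps);
  };
  std::vector<double> u(n, 1.0), v(m, 1.0);
  for (int it = 0; it < iterations; ++it) {
    // Rescale rows so that the plan's row sums match a.
    for (std::size_t i = 0; i < n; ++i) {
      double kv = 0.0;
      for (std::size_t j = 0; j < m; ++j) kv += kernel(i, j) * v[j];
      u[i] = a[i] / kv;
    }
    // Rescale columns so that the plan's column sums match b.
    for (std::size_t j = 0; j < m; ++j) {
      double ku = 0.0;
      for (std::size_t i = 0; i < n; ++i) ku += kernel(i, j) * u[i];
      v[j] = b[j] / ku;
    }
  }
  // Cost of the resulting plan P(i, j) = u[i] * kernel(i, j) * v[j].
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < m; ++j) {
      total += u[i] * kernel(i, j) * v[j] * cost(i, j);
    }
  }
  return total;
}
```

Every step is a smooth arithmetic operation, which is what makes the approximation attractive here: in principle its gradient can be obtained without perturbing each input and re-running the forward computation.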
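For some instruments ω should be aligned to an integer, and for others it must not be. In the Wasserstein inverse FM synthesizer no gradient arises that pulls ω toward an integer, so non-integer values tend to come out.
For some instruments, like the violin, harmonics that are not at integer multiples are what makes them sound real; for others, like the trumpet, the sound is not convincing unless the harmonics are at integer multiples.
Real trumpet vs. FM trumpet made with the Wasserstein inverse FM synthesizer (2 operators, no feedback): ω not fixed to an integer.
Real trumpet vs. FM trumpet made with the Wasserstein inverse FM synthesizer (2 operators, no feedback): ω fixed to 1.
So far there is no way to decide automatically whether fixing ω to a particular value or leaving it free gives the better result.
Summary: when searching by gradient descent for FM synthesizer parameters that sound like a real instrument, use the Wasserstein distance.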