Wasserstein Inverted FM Synthesizer (Wasserstein逆FM音源)

Fadis
March 18, 2021

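This talk explains why Wasserstein distance works well as the error function for an inverted FM synthesizer (逆FM音源).
These are the slides for a presentation given at Kernel/VM探検隊online part2 on March 20, 2021.
Sample code: https://github.com/Fadis/wifm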

Transcript

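  1. Wasserstein 逆FM音源 (Wasserstein Inverted Frequency Modulation Synthesizer). Naomasa Matsubayashi, Twitter: @fadis_. Sample code: https://github.com/Fadis/wifm. A test sound is playing right now. Can you hear it?

  2. Wasserstein 逆FM音源 (Wasserstein Inverted Frequency Modulation Synthesizer). Naomasa Matsubayashi, Twitter: @fadis_. Sample code: https://github.com/Fadis/wifm

  3. FM synthesizer: music-playing hardware that was commonly built into older PCs and game consoles.

  4. Fourier transforms of real instrument sounds: grand piano, alto saxophone, vibraphone, violin, marimba.

  5. Fundamental: the frequency component that corresponds to the pitch of the note. (Spectra of grand piano, alto saxophone, vibraphone, violin and marimba, with the frequency axis marked at 261.63 Hz, 523.25 Hz, 784.88 Hz, 1046.5 Hz, 1308.1 Hz, 1569.8 Hz, 1831.4 Hz, 2093.0 Hz and 2354.6 Hz.)

  6. Harmonics: the frequency components at integer multiples of the fundamental. (Same spectra and frequency axis as the previous slide.)

  7. Grand piano: waves at integer multiples of the fundamental frequency. The higher the frequency component, the faster it decays.

  8. If this behavior can be imitated with simple arithmetic, simple hardware can produce sounds that resemble real instruments.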

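  9. FM synthesis: $f(t) = E_c(t) \sin(2\pi\omega_c t + E_m(t) B \sin(2\pi\omega_m t))$. This sine wave (the inner one, at $\omega_m$) distorts this sine wave (the outer one, at $\omega_c$).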
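
As a rough illustration of the formula on slide 9, the following C++ sketch evaluates a two-operator FM voice sample by sample. The names `FmParams`, `fm_sample` and `fm_render` are invented for this example and are not taken from the wifm repository. The envelopes $E_c$ and $E_m$ are held constant, which is a simplification.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Parameters of f(t) = Ec * sin(2*pi*wc*t + Em * B * sin(2*pi*wm*t)).
// Envelopes are constants here (an assumption made for brevity).
struct FmParams {
  double carrier_hz;     // wc
  double modulator_hz;   // wm
  double index;          // B, the modulation strength
  double carrier_env;    // Ec
  double modulator_env;  // Em
};

// One sample of the FM voice at time t (seconds).
double fm_sample(const FmParams &p, double t) {
  const double two_pi = 2.0 * std::acos(-1.0);
  const double mod =
      p.modulator_env * p.index * std::sin(two_pi * p.modulator_hz * t);
  return p.carrier_env * std::sin(two_pi * p.carrier_hz * t + mod);
}

// Render n samples at the given sample rate.
std::vector<double> fm_render(const FmParams &p, double sample_rate,
                              std::size_t n) {
  assert(sample_rate > 0.0);
  std::vector<double> out(n);
  for (std::size_t i = 0; i < n; ++i) out[i] = fm_sample(p, i / sample_rate);
  return out;
}
```

With `index` set to 0 the modulator has no effect and the output is a plain sine wave at `carrier_hz`. Raising it adds sidebands around the carrier.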
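  10. $f(t) = E_c(t) \sin(2\pi\omega_c t + E_m(t) B \sin(2\pi\omega_m t))$, plotted for $\omega_m = 1\omega_c$, $2\omega_c$, $3\omega_c$, $4\omega_c$, $5\omega_c$ and $6\omega_c$. Depending on the parameters, harmonics of various magnitudes appear at various frequencies.

  11. $f(t) = E_c(t) \sin(2\pi\omega_c t + E_m(t) B \sin(2\pi\omega_m t))$. This, this, this, this, and this too: if these parameters are tuned so that the output has the same harmonics as a real instrument, it will sound like that instrument.

  12. If these parameters are tuned so that the output has the same harmonics as a real instrument, it will sound like that instrument. $f(t) = E_c(t) \sin(2\pi\omega_c t + E_m(t) B \sin(2\pi\omega_m t))$. That is hard. FM synthesis was judged unable to produce very good sounds, and it died out.

  13. Genetic FM synthesizer (遺伝的FM音源), 2016, 12th Kernel/VM探検隊. https://speakerdeck.com/fadis/yi-chuan-de-fmyin-yuan. A genetic algorithm lets the parameters whose sound is closer to the spectrum of the given sound survive. Flowchart boxes: frequency-analyze the sampled sound; generate random genes; play each gene on the FM synthesizer; frequency-analyze the sound of each gene; generate genes by uniform crossover and mutation; compute the distance to the sampled sound; pick individuals by roulette selection; "at the maximum resolution, do the top individuals no longer change?" (yes / no); decide whether to raise the resolution.

  14. Sound of a real piano vs. sound of the FM synthesizer piano. In the course of this experiment, parameters were found that reproduce the piano reasonably well, a sound that FM synthesis was believed unable to produce. Genetic FM synthesizer, 2016, 12th Kernel/VM探検隊.

  15. "FM synthesis cannot reproduce the timbre of instruments." Rather: human intuition cannot reproduce the timbre of instruments with FM synthesis.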

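  16. Inverted FM synthesizer (逆FM音源), 2020, Kernel/VM探検隊@関西. https://speakerdeck.com/fadis/niyurarufmyin-yuan. Differentiate the FM synthesis formula and find the parameters by gradient descent. $f(t) = A \sum_{n=-\infty}^{\infty} J_n(B) \cos(2\pi t (n\omega_m + \omega_c))$, with $\frac{dJ_n(x)}{dx} = \frac{1}{2}\left(J_{n-1}(x) - J_{n+1}(x)\right)$.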
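
The derivative identity on slide 16 is what makes the spectrum differentiable with respect to $B$. The sketch below checks it numerically. It is only an illustration: `bessel_j` and `bessel_j_derivative` are names made up here, and $J_n$ is evaluated from Bessel's integral $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - x\sin\tau)\,d\tau$ with the trapezoidal rule, which is not necessarily how the sample code computes it.

```cpp
#include <cassert>
#include <cmath>

// Bessel function of the first kind for integer order n, from Bessel's
// integral, evaluated with the trapezoidal rule on [0, pi].
double bessel_j(int n, double x, int steps = 2000) {
  assert(steps > 0);
  const double pi = std::acos(-1.0);
  const double h = pi / steps;
  auto f = [&](double tau) { return std::cos(n * tau - x * std::sin(tau)); };
  double sum = 0.5 * (f(0.0) + f(pi));
  for (int k = 1; k < steps; ++k) sum += f(k * h);
  return sum * h / pi;
}

// dJ_n/dx = (J_{n-1}(x) - J_{n+1}(x)) / 2, the identity quoted on the slide.
double bessel_j_derivative(int n, double x) {
  return 0.5 * (bessel_j(n - 1, x) - bessel_j(n + 1, x));
}
```

Comparing `bessel_j_derivative(1, 2.0)` with a central difference of `bessel_j(1, x)` around `x = 2.0` gives matching values to well within the finite-difference error.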
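  17. We want to find the $\omega$ and $B$ at which the plot is most yellow.

  18. Gradient descent: we want to find the lowest point. Look at the slope at the current position and move a little toward the lower side ("maybe further left"), ending up around here. It is like rolling a ball and hoping it stops at the lowest point.

  19. Gradient descent: "maybe further left", "maybe further right". It ends up around here and never reaches the true minimum, which is over here.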
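
The failure mode on slides 18 and 19 can be reproduced with a toy example. The loss below, $x^4 - 3x^2 + x$, is a made-up function with two valleys and has nothing to do with the actual FM loss surface. `toy_loss`, `toy_grad` and `descend` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical loss with a shallow valley near x = 1.13 and a deeper one
// near x = -1.30.
double toy_loss(double x) { return x * x * x * x - 3.0 * x * x + x; }
double toy_grad(double x) { return 4.0 * x * x * x - 6.0 * x + 1.0; }

// Plain gradient descent: repeatedly step against the local slope.
double descend(double x, double rate = 0.01, int steps = 10000) {
  assert(rate > 0.0);
  for (int i = 0; i < steps; ++i) {
    const double step = rate * toy_grad(x);
    if (std::fabs(step) < 1e-15) break;  // slope is flat: the ball has stopped
    x -= step;
  }
  return x;
}
```

Starting from `x = 2.0` the search settles in the shallow valley, while starting from `x = -2.0` it reaches the deeper one, so the result depends entirely on the initial value.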

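  20. In this state, gradient descent cannot be applied.

  21. Harmonics appear near integer multiples of the fundamental frequency. (Spectra of grand piano, alto saxophone, vibraphone, violin and marimba, with the frequency axis marked at 261.63 Hz, 523.25 Hz, 784.88 Hz, 1046.5 Hz, 1308.1 Hz, 1569.8 Hz, 1831.4 Hz, 2093.0 Hz and 2354.6 Hz.) So we decide that $\omega = \omega_m / \omega_c$ is always an integer.

  22. We decide that $\omega = \omega_m / \omega_c$ is always an integer.

  23. It is a problem if the search climbs over this hill.

  24. Start the search from several initial values and pick the one that reached the minimum. The number of trials grows rapidly as the number of variables increases, so this is hard to apply to three or more operators.

  25. Real violin vs. inverted FM synthesizer violin (2 operators, no feedback). In practice it does not sound like a violin. The detune problem: this region has vanished entirely.

  26. Real violin vs. inverted FM synthesizer violin. The real harmonics appear at frequencies slightly off the integer multiples. Because we imposed the constraint that $\omega = \omega_m / \omega_c$ (the multiplier of the harmonic frequencies) is always an integer, every detuned harmonic is ignored.

  27. The distribution of the error is poor. Improving it with small tricks loses the characteristics of the instrument. Maybe the loss function itself is the problem? From here on is the new material for this talk.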

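  28. Mean squared error, the loss function of the inverted FM synthesizer: $L = \frac{1}{n} \sum_{i=0}^{n} (e_i - g_i)^2$, where $e$ holds the frequency components of the real sound and $g$ those of the generated sound. The errors of all frequency components are summed and averaged. The error is positive whether the difference between real and generated is positive or negative. The more similar $e$ and $g$ are, the smaller $L$ becomes.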
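  29. Mean squared error, real vs. generated. A: components are present at frequencies where the real sound has none, and everything present here counts as error. B: components are missing at frequencies where the real sound has them, and everything missing here counts as error. Error = A + B.

  30. Mean squared error, real vs. generated. B: components are missing at frequencies where the real sound has them, and everything missing here counts as error. Error = B. Producing no sound is still better than producing sound at the wrong frequency.

  31. The further the search moves in the direction of the arrow, the more high-frequency components leave the audible range, which means no sound is produced there and the state counts as "less bad". This is the cause of the hill.

  32. Mean squared error, real vs. generated: the error is either everything or nothing. When $\omega$ changes, the harmonic frequencies move, but under mean squared error, except for the instant in which they coincide with the real ones, it is uniformly better for them not to sound at all.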
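
A small experiment makes the point of slides 29 to 32 concrete. The sketch below assumes spectra stored as equally sized vectors of bin magnitudes. `mean_squared_error` and `single_peak` are names chosen for this example, and the single-peak spectra are artificial test inputs rather than real instrument data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mean of the squared per-bin differences between two spectra.
double mean_squared_error(const std::vector<double> &e,
                          const std::vector<double> &g) {
  assert(e.size() == g.size() && !e.empty());
  double sum = 0.0;
  for (std::size_t i = 0; i < e.size(); ++i) {
    const double d = e[i] - g[i];
    sum += d * d;
  }
  return sum / e.size();
}

// Artificial spectrum: one unit-height peak at bin `pos`, zero elsewhere.
std::vector<double> single_peak(std::size_t bins, std::size_t pos) {
  assert(pos < bins);
  std::vector<double> s(bins, 0.0);
  s[pos] = 1.0;
  return s;
}
```

With the real peak at bin 2 of 8, a generated peak at bin 3 and one at bin 6 both score 0.25, while complete silence scores 0.125. The loss cannot tell "almost right" from "far off", and it prefers no sound to either.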

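  33. As a result, along the $\omega$ direction the plot turns yellow only at the brief instants where the frequencies match the real sound.

  34. $L = \frac{1}{n} \sum_{i=0}^{n} (e_i - g_i)^2$. Mean squared error is clearly unsuited as a way to measure the distance between timbres.

  35. Wasserstein distance

  36. Wasserstein distance: $W(\mathbb{P}, \mathbb{Q}) = \inf_{J \in \mathcal{J}(\mathbb{P}, \mathbb{Q})} \int \|x - y\| \, dJ(x, y)$. Here $\mathbb{P}$ is the distribution of the real sound and $\mathbb{Q}$ the distribution of the generated sound, $\|x - y\|$ is the distance from a sample of $\mathbb{P}$ to a sample of $\mathbb{Q}$, and $dJ(x, y)$ is the amount transported from that sample of $\mathbb{P}$ to that sample of $\mathbb{Q}$. It measures the similarity of two distributions as the sum of (amount transported × distance) under the most efficient transport plan.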
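  37. Wasserstein distance. (Two histograms, $\mathbb{P}$ and $\mathbb{Q}$, over positions 0 to 5 with heights from 0 to 1.) $W(\mathbb{P}, \mathbb{Q})$ is the sum of (what has to be carried) × (distance carried) needed to turn state $\mathbb{P}$ into state $\mathbb{Q}$. In the example, three portions are carried over distances 1, 1 and 2. The smaller this value, the more alike the two distributions are.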
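
For one-dimensional histograms the "carry things over a distance" picture of slide 37 has a simple closed form: sweep from left to right and add up how much mass is still being carried past each bin boundary. The sketch below assumes both histograms have the same total mass and bins spaced one unit apart. `wasserstein_1d` is a name chosen for this example and is not the repository's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// 1-D Wasserstein (earth mover's) distance between two histograms with
// equal total mass and unit bin spacing.
double wasserstein_1d(const std::vector<double> &p,
                      const std::vector<double> &q) {
  assert(p.size() == q.size());
  double carried = 0.0;  // surplus of p over q that must move on to the next bin
  double cost = 0.0;
  for (std::size_t i = 0; i < p.size(); ++i) {
    carried += p[i] - q[i];
    cost += std::fabs(carried);  // moving `carried` across one bin costs |carried|
  }
  return cost;
}
```

A unit peak shifted by one bin is at distance 1 and the same peak shifted by three bins is at distance 3, so the value grows with how far the generated component is from where it should be.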
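  38. For a mathematically rigorous explanation and efficient ways to compute it, see for example:
      - "Optimal Transport and Wasserstein Distance", course material for 36-708 Statistical Methods for Machine Learning (a CMU lecture). http://www.stat.cmu.edu/~larry/=sml/
      - Wasserstein GAN, a paper on a GAN that uses Wasserstein distance to prevent mode collapse: Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein generative adversarial networks." International conference on machine learning. PMLR, 2017. https://arxiv.org/abs/1701.07875

  39. What is great about Wasserstein distance (real vs. generated). Mean-squared-error man: "The generated sound gets everything wrong. Its component is growing where it is not wanted, so it would be better gone." Wasserstein man: "So close! If the generated component were just a little further left, it would have been perfect. Only a tiny difference."

  40. The optimal position of $B$ and $\omega$ is now clear enough for a human to see.

  41. Pipeline: sound of the real instrument, Fourier transform, envelope estimation, summation ($\sum$), normalization; frequency-domain FM synthesizer with volume $A$, modulation strength $B$ and frequency ratio of the two sine waves $\omega$; Wasserstein distance function; distance $L$ between the real sound and the generated sound. First, compute $L$ for some arbitrary $B$ and $\omega$.

  42. Same pipeline, now with the derivative of the frequency-domain FM synthesizer, the derivative of the Wasserstein distance function, and Adam. Backpropagation determines from $L$ how $B$ and $\omega$ should be corrected.

  43. Same pipeline as the previous slide. Backpropagation determines from $L$ how $B$ and $\omega$ should be corrected. The Wasserstein distance function cannot be differentiated analytically.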
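  44. If it cannot be differentiated analytically, differentiate it numerically:

        // Excerpt as shown on the slide. T and forward_x come from the
        // surrounding sample code and are not shown here; the excerpt
        // relies on <tuple>, <vector> and <cstddef>.
        // Returns forward-difference estimates of the gradient of the
        // distance with respect to A and AWeights. `dist` is the distance
        // at the unmodified inputs.
        std::tuple< std::vector< T >, std::vector< T > > backward(
          const std::vector<T> &A,
          const std::vector<T> &AWeights,
          const std::vector<T> &B,
          const std::vector<T> &BWeights,
          T dist,
          T delta,
          T wdelta
        ) {
          std::vector<T> dAWeights( AWeights.size() );
          std::vector<T> dA( A.size() );
          // Nudge each element of A by delta and measure the change.
        #pragma omp parallel for
          for( size_t i = 0; i < A.size(); ++i ) {
            auto modif_ = A;
            modif_[ i ] += delta;
            auto modified_dist = forward_x( modif_, AWeights, B, BWeights );
            dA[ i ] = ( modified_dist - dist ) / delta;
          }
          // Same for each weight, using its own step size wdelta.
        #pragma omp parallel for
          for( size_t i = 0; i < AWeights.size(); ++i ) {
            auto modif_ = AWeights;
            modif_[ i ] += wdelta;
            auto modified_dist = forward_x( A, modif_, B, BWeights );
            dAWeights[ i ] = ( modified_dist - dist ) / wdelta;
          }
          return std::make_tuple( dA, dAWeights );
        }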
  45. Start from $B = 5$, $\omega = 5$. There is no need to try multiple initial values.
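
To show how numerical differentiation and Adam fit together, here is a toy version of the loop that starts from $B = 5$, $\omega = 5$. Everything in it is an assumption made for illustration. `toy_distance` is a made-up smooth loss with its minimum at $B = 2$, $\omega = 3$, standing in for the real Wasserstein distance between spectra, and the step size and iteration count are arbitrary. The Adam update itself follows the standard formulation with its usual decay constants.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>

using Params = std::array<double, 2>;  // {B, omega}
using Loss = std::function<double(const Params &)>;

// Forward-difference gradient, the same idea as backward() on slide 44.
Params numeric_gradient(const Loss &loss, const Params &x, double delta) {
  assert(delta > 0.0);
  const double base = loss(x);
  Params grad{};
  for (std::size_t i = 0; i < x.size(); ++i) {
    Params moved = x;
    moved[i] += delta;
    grad[i] = (loss(moved) - base) / delta;
  }
  return grad;
}

// Standard Adam update driven by the numerical gradient.
Params adam_minimize(const Loss &loss, Params x, int steps,
                     double rate = 0.05) {
  const double beta1 = 0.9, beta2 = 0.999, eps = 1e-8;
  Params m{}, v{};  // running averages of the gradient and its square
  for (int t = 1; t <= steps; ++t) {
    const Params g = numeric_gradient(loss, x, 1e-4);
    for (std::size_t i = 0; i < x.size(); ++i) {
      m[i] = beta1 * m[i] + (1.0 - beta1) * g[i];
      v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i];
      const double m_hat = m[i] / (1.0 - std::pow(beta1, t));
      const double v_hat = v[i] / (1.0 - std::pow(beta2, t));
      x[i] -= rate * m_hat / (std::sqrt(v_hat) + eps);
    }
  }
  return x;
}

// Hypothetical stand-in loss with its minimum at B = 2, omega = 3.
double toy_distance(const Params &p) {
  return (p[0] - 2.0) * (p[0] - 2.0) + (p[1] - 3.0) * (p[1] - 3.0);
}
```

Calling `adam_minimize(toy_distance, Params{5.0, 5.0}, 2000)` ends close to `{2.0, 3.0}`. A single starting point is enough here only because the toy loss has one valley, which is the property the slides attribute to the Wasserstein-based loss surface.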

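  46. Results

  47. Real violin vs. inverted FM synthesizer violin (2 operators, no feedback). This region has vanished entirely. In practice it does not sound like a violin.

  48. Real violin vs. Wasserstein inverted FM synthesizer violin (2 operators, no feedback). It sounds like a violin.

  49. Real piano vs. an FM synthesizer piano made by a human (2 operators, no feedback). This is the one often heard in PC games of the 1980s and 1990s.

  50. Real piano vs. an FM synthesizer piano made with the inverted FM synthesizer (2 operators, no feedback).

  51. Real piano vs. an FM synthesizer piano made with the Wasserstein inverted FM synthesizer (2 operators, no feedback).

  52. Open issues

  53. Searching for piano parameters with the inverted FM synthesizer and with the Wasserstein inverted FM synthesizer, measured on an Intel Core i5-6500 (Gentoo Linux). As expected, numerical differentiation is slow.

        $ time wav2fm2 -i Piano.ff.C4.aiff -d 1 -n 60 -t 40
        ...
        real 4m43.748s
        user 17m5.382s
        sys 0m3.239s

        $ time wav2fm -i Piano.ff.C4.aiff -d 1 -n 60
        ...
        real 3m27.189s
        user 3m29.071s
        sys 0m0.036s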
  54. With some care, work along these lines might be usable:
      - An approximation of Wasserstein distance: Cuturi, Marco. "Sinkhorn distances: lightspeed computation of optimal transport." NIPS. Vol. 2. No. 3. 2013. https://papers.nips.cc/paper/2013/hash/af21d0c97db2e27e13572cbf59eb343d-Abstract.html
      - The derivative of that approximation: Luise, Giulia, et al. "Differential properties of sinkhorn approximation for learning with wasserstein distance." arXiv preprint arXiv:1805.11897 (2018). https://arxiv.org/abs/1805.11897
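  55. For some instruments $\omega$ should be aligned to an integer, and for others it must not be. In the Wasserstein inverted FM synthesizer no gradient arises that tries to tie $\omega$ to an integer, so $\omega$ tends to come out as a non-integer value.

  56. For some instruments $\omega$ should be aligned to an integer, and for others it must not be. In the Wasserstein inverted FM synthesizer no gradient arises that tries to tie $\omega$ to an integer, so $\omega$ tends to come out as a non-integer value. Some instruments, like the violin, sound real precisely because their harmonics are not at integer multiples, while others, like the trumpet, do not sound real unless their harmonics are at integer multiples.

  57. Real trumpet vs. an FM synthesizer trumpet made with the Wasserstein inverted FM synthesizer (2 operators, no feedback), with $\omega$ not fixed to an integer value.

  58. Real trumpet vs. an FM synthesizer trumpet made with the Wasserstein inverted FM synthesizer (2 operators, no feedback), with $\omega$ fixed to 1.

  59. For now there is no way to decide automatically whether fixing $\omega$ to a particular value or leaving it free gives the better result.

  60. Summary: when searching by gradient descent for FM synthesizer parameters that produce a sound like a real instrument, use Wasserstein distance.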