Upgrade to Pro — share decks privately, control downloads, hide ads and more …

muana vol.11 音楽識別の事前学習モデル

Avatar for Yuya Yamamoto Yuya Yamamoto
October 28, 2023
730

muana vol.11 音楽識別の事前学習モデル

Muana vol.11 WIP.

Avatar for Yuya Yamamoto

Yuya Yamamoto

October 28, 2023
Tweet

Transcript

  1. ࣗݾ঺հ • ࢁຊ ༤໵ʢyamathcyʣ • ஜ೾େֶେֶӃത࢜ޙظ՝ఔ3೥ • ઐ໳ɿԻָɾԻڹ৘ใॲཧ • ՎखͷಛघͳՎ͍ํʮՎএςΫχοΫʯͷ෼ੳ

    • ࠷ۙ͸SSLϞσϧͰՎ੠෼ੳλεΫΛղ͍͍ͯΔ • ࠷ۙͷԻָͷϚΠϒʔϜ • 4’33" Tsukuba Remix. • ΠϯυͷଧָثͷλϒϥͷۂΛௌ͘͜ͱ 2 ಛٕ: ͡Ό͕Γࣦ͜ഊ github.io
  2. ͪΐͬͱ͚ͩએ఻͍ͤͯͩ͘͞͞m(_ _)m • Իָ৘ใॲཧͷࠃࡍձٞɼISMIR2023 ͷ࿦จಡΈձΛओ࠵͍ͯ͠·͢ʂ • ΦϯϥΠϯɼ2023 11/22ʢਫʣ 18:00- •

    ະͩProceedingsະެ։ͷͨΊϦεέ Մೳੑ͋Γ🙇 ʢޙ΄ͲΞφ΢ϯε͠ ·͢ʣ • (Ϧεέޙ͸12্݄०༧ఆ) 3
  3. ͦ΋ͦ΋ͳͥԻָࣝผʹࣄલֶशϞσϧ͕ඞཁʁ • ཧ༝1ɿσʔληοτෆ଍ • Իָ͸ѹ౗తʹσʔλ͕଍Γͳ͍ • Ξϊςʔγϣϯ͸΋ͬͱ଍Γͳ͍͠࡞Δ ίετ͕ߴ͍ • ཧ༝2ɿλεΫࣗମͷ೉͠͞

    • ͦ΋ͦ΋Իָͷղੳࣗମ͕೉͍͠ • DNN࢖ͬͨํ͕ੑೳ͸ग़Δ 7 https://yamathcy.github.io/ISMIR2022J-POP/ ࢁຊ΋ˢͷΑ͏ͳ5࣌ؒ͘Β͍ͷՎͷσʔληοτ Λ࡞Γ·͕ͨ͠࡞੒ʹ1೥͔͔ۙ͘Γ·ͨ͠... DNNͷύϫʔΛ૊ΈࠐΈ͍͕ͨσʔλෆ଍ɼͷδϨϯϚ
  4. ͬ͘͟ΓΧςΰϥΠζʢಠஅʣ 9 ڭࢣ͋Γֶशʹجͮ͘Ϟσϧ ࣗݾڭࢣ͋Γֶशʹجͮ͘Ϟσϧ ੜ੒౳ଞͷ໨తͷ ϞσϧΛస༻ Musicnn CREPE ੺ɿԻָɼ੨ɿԻ੠ɼࠇɿ؀ڥԻɼҰൠ VGGish

    PANNs AST Whisper Wav2Vec2.0 HuBERT WavLM CLAP MapMusic2Vec MERT Data2Vec JukeMIR: JukeBox (ੜ੒༻) ͷ ಛ௃Λར༻ Encodec (ߴੑೳͳԻ੠ѹॖ༻) ͋ΔλεΫΛڭࢣ͋Γֶश ͯ͠ɼࣅͨλεΫʹస༻ ϥϕϧͷͳ͍σʔλʹ ٖࣅతͳϥϕϧΛ෇༩ֶͯ͠श ݩʑࣝผ༻Ͱ͸ͳ͍Ϟσϧͷ தؒग़ྗΛࣝผʹ࢖͏
  5. ԻָԻڹ৴߸ʹର͢ΔBERTͷΑ͏ͳEncoderϞσϧ 11 MERT [Li 22] • ԻָࣝผʹಛԽͨ͠ࣄલֶशϞσϧ • BERTͷΑ͏ʹϚεΫ෦ਪఆʹجͮ ֶ͘शΛߦ͏

    • ԻָʹಛԽֶͨ͠शΛ௥Ճʢϐο νಛ௃ͷ෮ݩ౳ʣ • ଟ͘ͷλεΫͰSoTAʹඖఢ ղઆهࣄॻ͍ͯ·ͨ͠ˠhttps://qiita.com/yamathcy/items/f2f27468c5b5c4dc24a9
  6. Իָͱݴޠͷͭͳ͕Γͷؔ܎Λֶश 12 ϚϧνϞʔμϧܥʢಛʹԻָxࣗવݴޠʣ ݴޠԻָ - MusicLM - MusicGen Իָݴޠ -

    MuLan - MusCALL - LPMusicCaps - MU-LLaMa - LLark->New!! ※αʔϏεͷΈͳΒ΋ͬͱͨ͘͞Μ͋Γ·͕͢झࢫ͕ҟͳΔͨΊׂѪ
  7. ԻָΛղੳ͢ΔͷʹݴޠΛ࢖͏Ϟσϧ 13 ԻָxLLM LP-MusicCaps: ೖྗͨ͠ۂʹ͍ͭͯͷ هड़จΛੜ੒ [Doh 23] Mu-LLaMA: MERT+LLaMAͰɼ

    Իָͷ಺༰ʹର͢Δ࣭໰Ԡ౴ [Liu 23] LLark: ࣭໰Ԡ౴΍هड़จੜ੒ΛҰൠԽɽ ΑΓԻָͷཁૉ ʢίʔυ/ςϯϙ౳ʣʹಛԽ [Gardner 23]
  8. 14 • ϥΠϒϥϦ • Hugging face: https://huggingface.co/ • s3prl: https://s3prl.github.io/s3prl/index.html

    • Ի੠ͷࣗݾڭࢣ͋ΓֶशʹಛԽͨ͠πʔϧΩοτ • ࿦จɼσϞ౳ͷಈ޲Λ௥͏ • ೔ਐ݄าͲ͜Ζ͔ඵਐ෼าɽίʔυ/Ϟσϧ΋ެ։͞ΕΔ৔߹͕ଟ͍ͷ ͰɼΞϯςφΛுΔͱ͍͍͜ͱ͕͋Δ͔΋ զʑ͕࢖͍ͬͯ͘ʹ͸
  9. ۙ೥ͷσΧ͍Ϟσϧʹ͍ͭͯͷTips 16 զʑ͕࢖͍ͬͯ͘ʹ͸ 🔥 🔥 🔥 🔥 ❄ Transformer Encoder૚ͷΈΛֶश

    ֤૚ͷग़ྗΛॏΈ͚ͮฏۉ Adapter, LoRA, Pre fi x tuning౳ͷ Parameter-ef fi cient FT [Chen 23] Ұ෦ͷ༗༻ͳ૚ͷग़ྗͷΈΛ࢖͏ 㱻Իͷ৔߹૚ͷલ൒ޙ൒Ͱ࣋ͭ৘ใ͕ҟͳΔ [Chen 22]
  10. ݩͷυϝΠϯʹ஫ҙ 18 ͲΕΛ࢖͑͹͍͍͔ • Վ੠ͷ৔߹ͳΒԻ੠Ͱ΋͋Δఔ౓͸ɹ ࢖͑ΔɼԻָͷϞσϧΑΓྑ͍৔߹΋ • ͦΕҎ֎ͩͱ͋·Γ͏·͍͔͘ͳ͍ έʔε΋ •

    Reprogrammingͱ͍͏࿮૊ΈΛ࢖ͬ ͯదԠͤͯ͞͠·͏ݚڀ΋ ՎखΛ౰ͯΔλεΫͩͱԻ੠>Իָ [Yamamoto 23] Wav2Vec2.0͸ϐον/ָثࣝผʹ͸ͦͷ··FTͰ࢖͏ΑΓ ԻָͰ࠶ֶश͢Δํ͕ྑ͍ [Legano 23]