Upgrade to Pro — share decks privately, control downloads, hide ads and more …

muana vol.11 音楽識別の事前学習モデル

Yuya Yamamoto
October 28, 2023
700

muana vol.11 音楽識別の事前学習モデル

Muana vol.11 WIP.

Yuya Yamamoto

October 28, 2023
Tweet

Transcript

  1. ࣗݾ঺հ • ࢁຊ ༤໵ʢyamathcyʣ • ஜ೾େֶେֶӃത࢜ޙظ՝ఔ3೥ • ઐ໳ɿԻָɾԻڹ৘ใॲཧ • ՎखͷಛघͳՎ͍ํʮՎএςΫχοΫʯͷ෼ੳ

    • ࠷ۙ͸SSLϞσϧͰՎ੠෼ੳλεΫΛղ͍͍ͯΔ • ࠷ۙͷԻָͷϚΠϒʔϜ • 4’33" Tsukuba Remix. • ΠϯυͷଧָثͷλϒϥͷۂΛௌ͘͜ͱ 2 ಛٕ: ͡Ό͕Γࣦ͜ഊ github.io
  2. ͪΐͬͱ͚ͩએ఻͍ͤͯͩ͘͞͞m(_ _)m • Իָ৘ใॲཧͷࠃࡍձٞɼISMIR2023 ͷ࿦จಡΈձΛओ࠵͍ͯ͠·͢ʂ • ΦϯϥΠϯɼ2023 11/22ʢਫʣ 18:00- •

    ະͩProceedingsະެ։ͷͨΊϦεέ Մೳੑ͋Γ🙇 ʢޙ΄ͲΞφ΢ϯε͠ ·͢ʣ • (Ϧεέޙ͸12্݄०༧ఆ) 3
  3. ͦ΋ͦ΋ͳͥԻָࣝผʹࣄલֶशϞσϧ͕ඞཁʁ • ཧ༝1ɿσʔληοτෆ଍ • Իָ͸ѹ౗తʹσʔλ͕଍Γͳ͍ • Ξϊςʔγϣϯ͸΋ͬͱ଍Γͳ͍͠࡞Δ ίετ͕ߴ͍ • ཧ༝2ɿλεΫࣗମͷ೉͠͞

    • ͦ΋ͦ΋Իָͷղੳࣗମ͕೉͍͠ • DNN࢖ͬͨํ͕ੑೳ͸ग़Δ 7 https://yamathcy.github.io/ISMIR2022J-POP/ ࢁຊ΋ˢͷΑ͏ͳ5࣌ؒ͘Β͍ͷՎͷσʔληοτ Λ࡞Γ·͕ͨ͠࡞੒ʹ1೥͔͔ۙ͘Γ·ͨ͠... DNNͷύϫʔΛ૊ΈࠐΈ͍͕ͨσʔλෆ଍ɼͷδϨϯϚ
  4. ͬ͘͟ΓΧςΰϥΠζʢಠஅʣ 9 ڭࢣ͋Γֶशʹجͮ͘Ϟσϧ ࣗݾڭࢣ͋Γֶशʹجͮ͘Ϟσϧ ੜ੒౳ଞͷ໨తͷ ϞσϧΛస༻ Musicnn CREPE ੺ɿԻָɼ੨ɿԻ੠ɼࠇɿ؀ڥԻɼҰൠ VGGish

    PANNs AST Whisper Wav2Vec2.0 HuBERT WavLM CLAP MapMusic2Vec MERT Data2Vec JukeMIR: JukeBox (ੜ੒༻) ͷ ಛ௃Λར༻ Encodec (ߴੑೳͳԻ੠ѹॖ༻) ͋ΔλεΫΛڭࢣ͋Γֶश ͯ͠ɼࣅͨλεΫʹస༻ ϥϕϧͷͳ͍σʔλʹ ٖࣅతͳϥϕϧΛ෇༩ֶͯ͠श ݩʑࣝผ༻Ͱ͸ͳ͍Ϟσϧͷ தؒग़ྗΛࣝผʹ࢖͏
  5. ԻָԻڹ৴߸ʹର͢ΔBERTͷΑ͏ͳEncoderϞσϧ 11 MERT [Li 22] • ԻָࣝผʹಛԽͨ͠ࣄલֶशϞσϧ • BERTͷΑ͏ʹϚεΫ෦ਪఆʹجͮ ֶ͘शΛߦ͏

    • ԻָʹಛԽֶͨ͠शΛ௥Ճʢϐο νಛ௃ͷ෮ݩ౳ʣ • ଟ͘ͷλεΫͰSoTAʹඖఢ ղઆهࣄॻ͍ͯ·ͨ͠ˠhttps://qiita.com/yamathcy/items/f2f27468c5b5c4dc24a9
  6. Իָͱݴޠͷͭͳ͕Γͷؔ܎Λֶश 12 ϚϧνϞʔμϧܥʢಛʹԻָxࣗવݴޠʣ ݴޠԻָ - MusicLM - MusicGen Իָݴޠ -

    MuLan - MusCALL - LPMusicCaps - MU-LLaMa - LLark->New!! ※αʔϏεͷΈͳΒ΋ͬͱͨ͘͞Μ͋Γ·͕͢झࢫ͕ҟͳΔͨΊׂѪ
  7. ԻָΛղੳ͢ΔͷʹݴޠΛ࢖͏Ϟσϧ 13 ԻָxLLM LP-MusicCaps: ೖྗͨ͠ۂʹ͍ͭͯͷ هड़จΛੜ੒ [Doh 23] Mu-LLaMA: MERT+LLaMAͰɼ

    Իָͷ಺༰ʹର͢Δ࣭໰Ԡ౴ [Liu 23] LLark: ࣭໰Ԡ౴΍هड़จੜ੒ΛҰൠԽɽ ΑΓԻָͷཁૉ ʢίʔυ/ςϯϙ౳ʣʹಛԽ [Gardner 23]
  8. 14 • ϥΠϒϥϦ • Hugging face: https://huggingface.co/ • s3prl: https://s3prl.github.io/s3prl/index.html

    • Ի੠ͷࣗݾڭࢣ͋ΓֶशʹಛԽͨ͠πʔϧΩοτ • ࿦จɼσϞ౳ͷಈ޲Λ௥͏ • ೔ਐ݄าͲ͜Ζ͔ඵਐ෼าɽίʔυ/Ϟσϧ΋ެ։͞ΕΔ৔߹͕ଟ͍ͷ ͰɼΞϯςφΛுΔͱ͍͍͜ͱ͕͋Δ͔΋ զʑ͕࢖͍ͬͯ͘ʹ͸
  9. ۙ೥ͷσΧ͍Ϟσϧʹ͍ͭͯͷTips 16 զʑ͕࢖͍ͬͯ͘ʹ͸ 🔥 🔥 🔥 🔥 ❄ Transformer Encoder૚ͷΈΛֶश

    ֤૚ͷग़ྗΛॏΈ͚ͮฏۉ Adapter, LoRA, Pre fi x tuning౳ͷ Parameter-ef fi cient FT [Chen 23] Ұ෦ͷ༗༻ͳ૚ͷग़ྗͷΈΛ࢖͏ 㱻Իͷ৔߹૚ͷલ൒ޙ൒Ͱ࣋ͭ৘ใ͕ҟͳΔ [Chen 22]
  10. ݩͷυϝΠϯʹ஫ҙ 18 ͲΕΛ࢖͑͹͍͍͔ • Վ੠ͷ৔߹ͳΒԻ੠Ͱ΋͋Δఔ౓͸ɹ ࢖͑ΔɼԻָͷϞσϧΑΓྑ͍৔߹΋ • ͦΕҎ֎ͩͱ͋·Γ͏·͍͔͘ͳ͍ έʔε΋ •

    Reprogrammingͱ͍͏࿮૊ΈΛ࢖ͬ ͯదԠͤͯ͞͠·͏ݚڀ΋ ՎखΛ౰ͯΔλεΫͩͱԻ੠>Իָ [Yamamoto 23] Wav2Vec2.0͸ϐον/ָثࣝผʹ͸ͦͷ··FTͰ࢖͏ΑΓ ԻָͰ࠶ֶश͢Δํ͕ྑ͍ [Legano 23]