International Conference ISMIR 2022 Report (Yamamoto's Part)

Yuya Yamamoto
February 27, 2023

Presentation slides from the Special Interest Group on Music and Computer (SIGMUS) meeting held on February 27, 2023.

Transcript

  1. Three Good Things About ISMIR
    2023.02.27 SIGMUS 136
    ISMIR 2022 Report: From One Attendee's Perspective
    Yuya Yamamoto, University of Tsukuba
    2022.12.08 With the members staying at the same hotel.

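  2. Self-Introduction
    • Yuya Yamamoto
    • University of Tsukuba, Laboratory of Informatics on People and Sound (Terasawa Lab), second-year doctoral student
    • (2022.08-12: visiting researcher at KAIST, South Korea)
    • Presented on-site on a technique for automatically detecting "singing techniques" in J-POP from vocals with accompaniment
      • Analysis and Detection of Singing Techniques in Repertoires of J-POP Solo Singers (Yamamoto, Nam, Terasawa)
      • Joint research with Juhan Nam (KAIST)
    Riding the wave of CULTURAL DIVERSITY, I presented in a kimono!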

  3. Good Thing About ISMIR ①
    You can learn the state of the art in music information processing

  4. A Treasure Trove of Knowledge!
    Excellent research papers
    Tutorials that teach the state of the art
    https://ismir2022.ismir.net/program/tutorials/

  5. Good Thing About ISMIR ②
    The venue is set up to encourage discussion

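  6. How Presentations Work
    Oral presentation x 4 minutes
    Poster presentation x 1 hour (on-site only)
    Everyone arrives already knowing the content to some degree
    + in-depth discussion is possible!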

  7. Interaction with Remote Participants
    In-depth discussion on the Slack workspace provided by the conference
    Video chats with remote participants also happened!

  8. Good Thing About ISMIR ③
    You can build connections with researchers from around the world

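  9. Characteristics of ISMIR as a Conference
    • (Relatively) small number of attendees
    • Single track
    • Actively encourages connections among participants
    • Everyone is doing research related to music (← important)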

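  10. Things You Only Get On-Site
    Casual conversation at the banquet and coffee breaks
    Music-related programs
    With MIR researchers from around the world, you can chat about research and about each other's culture and daily life, with music as the conversation starter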

  11. Outside the sessions you are free to do as you like! Go sightseeing, or head out with people you hit it off with
    Recommended spots are also posted on Slack

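  12. What I Personally See as the Appeal of (In-Person) International Conferences
    • You can learn about cutting-edge research firsthand
    • You can discuss things in depth
    • You meet researchers from around the world
    • You can experience a different culture in a different place
    • ISMIR is an international conference that maximizes that appeal!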

  13. SHALL WE ISMIR?
    END

  14. Supplementary slides from here on (not part of the main talk)
    Amenities

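  15. Bonus: Five Things I Was Glad I Did When Attending
    • Skim through all of the papers
    • Ask questions no matter what (on Slack or at the posters)
    • Get familiar with world geography and cultures so you have conversation topics ready
    • Wear a kimono: have something from your own country to talk about
    • Buy a tablet (handier than a laptop for poster presentations)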

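  16. English-Related Things I Was Personally Glad I Did
    • Don't get too intimidated just because you are not good at English
    • Above all, sharpen your listening
      • Naturally, everyone around you speaks at native speed
      • Recommended for learning about pronunciation -> https://youtu.be/nEpewJsEgUg
      • It may also help to know the accent habits of each country
    • Make full use of digital tools at times
      • When a word wouldn't come to mind, there were times I got through with "Ah~ What should I say… (g○ogle Translate at hand)"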

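  17. Other Things That Left an Impression
    • The rise of the Korean contingent
      • Nearly 20 people attended this time. The Korean-SMIR dinner was held twice (thanks to my KAIST connection, I joined the second one)
    • How active the young researchers were
      • Even students actively posted on Slack and elsewhere
      • I jumped on the bandwagon →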

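  18. Bonus: Anecdotes from India ①
    • A rich variety of curries
      • There were always at least 10 kinds, with a wide repertoire. Heaven for a curry lover like me.
    • Using Uber was a struggle every time…
      • Rides were hard to catch
      • I found a four-star hotel on my own, but it might have been better to stay at an officially offered hotel with shuttle bus service
    • Watch out for stomachaches!
      • 5 of the 6 members in my group got stomachaches…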

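  19. Bonus: Anecdotes from India ②
    • Many places don't accept cards!
      • Traveling from Korea, I only had KRW in cash (and on top of that, KRW could not be exchanged). I couldn't even eat at the airport and was at a loss for an hour… (later I found an ATM and managed)
    • Stray dogs everywhere!
      • It felt like there were as many as there are pigeons in Japan. Surprisingly friendly
      • Near the venue, a wild dog and a deer(?) even got into a fight
    • The variety of car horn timbres
      • The culture is to use the horn instead of turn signals. They sound like musical instruments, and the whole city feels like a parade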

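  20. Research That Left an Impression ①: Multifaceted Recognition
    • Focus on multiple elements of music and use multiple features
      • End-to-End Lyrics Transcription Informed by Pitch and Onset Estimation
        Uses pitch and onsets for lyrics recognition
      • And what if two musical versions don't share melody, harmony, rhythm, or lyrics?
        Combines pitch, harmony, rhythm, and lyrics features for cover song identification and achieves SoTA (adapts to diverse kinds of version differences)
      • Verse versus Chorus: Structure-aware Feature Extraction for Lyrics-based Genre Recognition
        Investigates how "which section are the lyrics from?" affects lyrics-based genre recognition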

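  21. Research That Left an Impression ②: Dealing with Data Scarcity
    • Transfer learning
      • Melody transcription via generative pre-training
        Outputs a score of the melody, using embeddings from Jukebox and MT3
      • Singing beat tracking with Self-supervised front-end and linear transformers
        Beat estimation that takes "only" the singing voice as input. Uses WavLM and DistilHuBERT as the input front-end
      • Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
        Lyrics recognition using wav2vec 2.0. Even with little training data, achieves performance comparable to models trained on large amounts of data
    • Data augmentation
      • PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription
        Lyrics recognition that "melodizes" a large amount of speech and adds it to the training data
      • Scaling Polyphonic Transcription with Mixtures of Monophonic Transcriptions
        Multi-instrument transcription trained on data made by overlaying multiple monophonic performance recordings
    • Cost-aware learning
      • Toward postprocessing-free neural networks for joint beat and downbeat estimation
        Beat estimation that needs no post-processing, using an efficient convolution module (separable conv) and loss functions (Focal loss + DICE loss)
      • Analysis and Detection of Singing Techniques in Repertoires of J-POP Solo Singers (my own)
        Uses Focal loss to improve detection performance for short singing techniques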
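Focal loss is mentioned for two of the papers on the slide above. As a rough illustration only, here is a minimal pure-Python sketch of the commonly used binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names and the default values `gamma=2.0` and `alpha=0.25` are assumptions made for this sketch; they are not taken from either paper, whose actual loss configurations are not described in the slides.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for a single binary prediction (illustrative sketch).

    p: predicted probability of the positive class, y: label (0 or 1).
    gamma down-weights easy examples; alpha balances the two classes.
    Pass alpha=None to disable class weighting.
    """
    p = min(max(p, eps), 1.0 - eps)      # avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability given to the true class
    if alpha is None:
        alpha_t = 1.0
    else:
        alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a sequence of frame-level predictions."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0.0` and `alpha=None` this reduces to ordinary binary cross-entropy. Raising `gamma` shrinks the contribution of frames the model already classifies confidently, which is one plausible reason such a loss helps with short, rare events like brief singing techniques.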

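  22. Research That Left an Impression ③: Miscellaneous
    • Traces of Globalization in Online Music Consumption Patterns and Results of Recommendation Algorithms
      Investigated whether music recommender systems promote the globalization of music -> US music dominates the market in many countries, and the results suggest that NN-based recommender systems promote globalization but also reinforce the dominance of US music
    • Violin Etudes: A Comprehensive Dataset for f0 Estimation and Performance Analysis
      An f0-annotated dataset for violin performance. After confirming that CREPE alone cannot estimate f0 correctly, the dataset was built through a loop of iteratively estimating f0 and adding the results to the training data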