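These are the slides for a talk given at the SIGMUS (Special Interest Group on Music and Computer) meeting held on February 27, 2023.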
Three Great Things About ISMIR
2023.02.27 SIGMUS 136
ISMIR 2022 Report: From One Attendee's Point of View
University of Tsukuba, Yuya Yamamoto
(Photo: 2022.12.08, with the members staying at the same hotel.)
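Self-introduction
• Yuya Yamamoto
• University of Tsukuba, Human and Sound Informatics Laboratory (Terasawa Lab), 2nd-year doctoral student
• (2022.08-12: Visiting researcher at KAIST, South Korea)
• Presented on-site on a technique for automatically detecting J-POP "singing techniques" from singing with accompaniment
• Analysis and Detection of Singing Techniques in Repertoires of J-POP Solo Singers (Yamamoto, Nam, Terasawa)
• Joint research with Juhan Nam (KAIST)
Presented in […] dress, in keeping with CULTURAL DIVERSITY!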
Good Thing About ISMIR ①
You can learn the state of the art in music information processing
A treasure trove of insights!
Wonderful research
Tutorials that bring you up to the state of the art
https://ismir2022.ismir.net/program/tutorials/
Good Thing About ISMIR ②
The space is set up to encourage discussion
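Presentation format
Oral presentation x 4 min
Poster presentation x 1 hour (on-site only)
Everyone arrives already knowing the content to some degree, so you can discuss it in depth!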
Interaction with remote participants
In-depth discussion on the Slack workspace provided by the conference
Video chats with remote participants happened too!
Good Thing About ISMIR ③
You can build connections with researchers around the world
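What characterizes ISMIR as a conference
• (Relatively) small number of attendees
• Single track
• Actively encourages connections among attendees
• Everyone is doing research related to music (← important)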
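Things you only get on-site
Banquet
Chatting during coffee breaks
Music-related program
You can chat with MIR researchers from around the world about research and about each other's culture and daily life, with music as the icebreaker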
Outside the sessions you are free to do as you like!
Go sightseeing, or head out with people you hit it off with
Recommended spots also get posted on Slack
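What I personally see as the payoff of (in-person) international conferences
• You can learn cutting-edge research by experiencing it firsthand
• You can discuss things in depth
• You meet researchers from around the world
• You experience a different culture in a different place
• ISMIR is an international conference that maximizes that payoff!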
SHALL WE ISMIR?
END
Supplementary slides from here on (not part of the main talk)
Amenities
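Extras: 5 things I was glad I did while attending
• Skim through all the papers
• Ask questions whenever you can (both on Slack and at the posters)
• Read up on world cuisines and cultures so you have conversation starters
• Wear […] dress and keep topics from your own country ready to talk about
• Buy a tablet (handier than a laptop for poster presentations)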
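English-related things I personally found worth doing
• Do not be overly intimidated just because English is not your strong suit
• Above all, polish your listening
• Unsurprisingly, everyone around you speaks at native speed
• Recommended for learning about pronunciation -> https://youtu.be/nEpewJsEgUg
• It may also help to know the quirks of each country's accent
• Lean on digital tools at times
• When a word would not come to mind, there were moments I got through with "Ah~ What should I say… (g○ogle Translate in hand)"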
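Other things that left an impression
• The rise of South Korea
• Nearly 20 people attended this time. The Korean-SMIR dinner was held twice (thanks to my KAIST connection, I joined the second one)
• How active the junior researchers are
• Even students post actively on Slack
• I followed suit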
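Extras: India side stories ①
• A rich variety of curries
• There were always at least 10 kinds at every meal. A wide repertoire. Heaven for a curry lover like me.
• Using Uber every time was a hassle…
• It is hard to get a ride
• I found a 4-star hotel on my own, but it might have been better to stay at one of the official hotels, which had a shuttle bus
• Watch out for stomachaches!
• 5 of the 6 people in our group came down with one…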
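Extras: India side stories ②
• Cards are not accepted in many places!
• Traveling from South Korea, the only cash I had was KRW (and on top of that, KRW could not be exchanged). Unable even to buy a meal at the airport, I was at a loss for an hour… (I later found an ATM and it worked out)
• Dogs everywhere!
• It feels like they are about as common as pigeons in Japan. Surprisingly friendly
• There was even an incident near the venue where a stray dog and a deer(?) got into a fight
• The variety of car horn timbres
• A culture where cars use the horn in place of turn signals. They sound like musical instruments, so the streets feel like a parade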
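Research that left an impression ①: Multifaceted recognition
• Focus on the multiple elements that music has, and use multiple features
• End-to-End Lyrics Transcription Informed by Pitch and Onset Estimation
  Uses pitch and onsets for lyrics transcription
• And what if two musical versions don't share melody, harmony, rhythm, or lyrics?
  Combines pitch, harmony, rhythm, and lyrics features for cover song identification and reaches SoTA (adapts to many kinds of differences between versions)
• Verse versus Chorus: Structure-aware Feature Extraction for Lyrics-based Genre Recognition
  Investigates how "which section do the lyrics come from?" affects lyrics-based genre recognition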
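Research that left an impression ②: Coping with data scarcity
• Transfer learning
  • Melody transcription via generative pre-training
    Outputs a score of the melody; uses embeddings from Jukebox and MT3
  • Singing beat tracking with Self-supervised front-end and linear transformers
    Beat estimation that takes "only" the singing voice as input. Uses WavLM and DistilHuBERT as the input front-end
  • Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
    Lyrics transcription with wav2vec 2.0; with little training data it achieves performance comparable to models trained on large amounts of data
• Data augmentation
  • PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription
    Lyrics transcription that "melodizes" large amounts of speech and adds it to the training data
  • Scaling Polyphonic Transcription with Mixtures of Monophonic Transcriptions
    Multi-instrument transcription trained on data made by overlaying multiple monophonic performance recordings
• Cost-sensitive learning
  • Toward postprocessing-free neural networks for joint beat and downbeat estimation
    Beat estimation that needs no post-processing, using an efficient convolution module (separable conv) and loss functions (focal loss + Dice loss)
  • Analysis and Detection of Singing Techniques in Repertoires of J-POP Solo Singers (my own)
    Uses focal loss to improve singing-technique detection performance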
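Focal loss, which two of the papers on this slide use, down-weights examples the model already classifies well so that rare or hard cases dominate training. The sketch below is a minimal pure-Python version of the standard binary form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The function names and the defaults γ = 2 and α = 0.25 are assumptions chosen for illustration, not settings reported in either paper.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for a single binary prediction (illustrative helper, hypothetical name).

    prob   : predicted probability of the positive class, in [0, 1]
    target : ground-truth label, 0 or 1
    gamma  : focusing parameter; gamma = 0 removes the down-weighting
    alpha  : weight for positives (negatives get 1 - alpha); assumed default
    """
    # p_t is the probability the model assigned to the true class.
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # Clamp to avoid log(0) on confidently wrong predictions.
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch given as two equal-length sequences."""
    losses = [binary_focal_loss(p, t, gamma, alpha) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)  # confident and correct
    hard = binary_focal_loss(0.1, 1)  # confident and wrong
    print(f"easy example: {easy:.6f}, hard example: {hard:.6f}")
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary cross-entropy, which makes a quick sanity check.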
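Research that left an impression ③: Miscellaneous
• Traces of Globalization in Online Music Consumption Patterns and Results of Recommendation Algorithms
  Examined whether music recommender systems promote the globalization of music -> US music dominates the market in many countries, and the results suggest that NN-based recommender systems promote globalization but also reinforce the dominance of US music
• Violin Etudes: A Comprehensive Dataset for f0 Estimation and Performance Analysis
  An f0 annotation dataset for violin performance. After confirming that CREPE alone cannot estimate f0 correctly, the dataset was built through a loop of iteratively estimating f0 and adding the result to the training data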