
BirdCLEF2021 Summary

start
June 12, 2021

Transcript

  1. BirdCLEF2021 Summary: competition overview and top solutions
     English version also available (still looking for an icon...)
     start (@startjapan)
     (Each link can be reached from the Speaker Deck description.)
  2. Self-introduction

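  3. Self-introduction
     • Medical student in Hiroshima.
     • Alongside studying for the national medical licensing exam, I intern at MNES Inc., a medical venture, and am building my skills in LAIME, a machine learning study circle for students that the same venture sponsors.
     • I won this competition and became a Master (largely thanks to my teammates...).
     • Also, I am looking for a good icon.
     • kaggle: @startjapan
     • twitter: @startjapanml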
  4. Competition overview

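  5. Competition overview
     • A competition to identify which bird species are calling in audio segments of 5 seconds each
       (the same organizers held a similar competition in 2020, referred to below as the 2020 bird competition)
     • Train data: recordings obtained from xeno-canto, a site for sharing bird calls (train_short_audio)
     • Test data: 80 audio files of 10 minutes each, to be split into 5-second segments and predicted (test_soundscapes)
     • Separately, validation audio (20 files of 10 minutes each) was also provided (train_soundscapes)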
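  6. [test_soundscapes]
     Test data. 80 files of 10 minutes each, accessible only by submitting. Recorded at 4 locations in total.
     [train_short_audio]
     Training data, with recordings grouped by bird species. 62874 recordings in total.
     [train_soundscapes]
     Acoustic domain close to test_soundscapes. 20 files of 10 minutes each, recorded at 2 of the 4 locations where test_soundscapes was recorded.
     [train_metadata.csv]
     Metadata for train_short_audio. Shape is (62874, 14).
     [train_soundscape_labels.csv / test.csv]
     Provide the skeleton obtained by splitting each 10-minute file into 5-second segments. train_soundscape_labels.csv corresponds to train_soundscapes and test.csv to test_soundscapes; only the former carries ground-truth labels.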
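As a rough illustration of what that segment skeleton amounts to: a 10-minute file cut every 5 seconds yields 120 rows. The function name `segment_end_times` and the choice of identifying each segment by its end time are assumptions made for this sketch, not details taken from the competition files.

```python
from typing import List


def segment_end_times(duration_sec: int = 600, segment_sec: int = 5) -> List[int]:
    """Return the end time in seconds of every full segment in one file."""
    # range() stops before duration_sec + 1, so the final segment ends at duration_sec.
    return list(range(segment_sec, duration_sec + 1, segment_sec))


ends = segment_end_times()
print(len(ends), ends[:3], ends[-1])
```

With 80 test files this gives 80 x 120 = 9600 rows to predict.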
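  7. Submission format & evaluation metric
     • Multiple bird species can be submitted as the prediction for a single segment
     • For segments in which no bird is calling, submit the string "nocall"
     • The evaluation metric is the mean of the row-wise micro-F1 scores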
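The metric can be sketched as follows, under one common reading of "mean of row-wise micro-F1": each row is a set of labels (with "nocall" treated as an ordinary label), F1 is computed per row, and the rows are averaged. The function names are hypothetical, and the official scorer may treat edge cases differently.

```python
from typing import List, Set


def row_f1(true: Set[str], pred: Set[str]) -> float:
    """F1 for a single row: 2*TP / (|true| + |pred|)."""
    if not true and not pred:
        return 1.0  # assumption: two empty label sets count as a perfect match
    true_positives = len(true & pred)
    return 2.0 * true_positives / (len(true) + len(pred))


def mean_row_f1(trues: List[Set[str]], preds: List[Set[str]]) -> float:
    """Average the per-row F1 over all rows."""
    if len(trues) != len(preds) or not trues:
        raise ValueError("trues and preds must be non-empty and equally long")
    return sum(row_f1(t, p) for t, p in zip(trues, preds)) / len(trues)


# Row 1 is a correct nocall; row 2 finds one of two calling birds.
score = mean_row_f1([{"nocall"}, {"a", "b"}], [{"nocall"}, {"a"}])
print(round(score, 4))
```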

  8. Differences from the 2020 bird competition

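  9. Differences from the 2020 bird competition
     • The existence of train_soundscapes
       - There is a large acoustic domain gap between train_short_audio and the test data
       - This competition provided train_soundscapes, whose acoustic domain is closer to the test data
       - It was mostly used for validation, but some participants found ways to use it for training
     • The location information of the test data was accessible
       - Each test file name was guaranteed to contain the recording location (and the date)
       - These therefore also had to be built into the pipeline in some form
     (Reference: Starter and some thoughts by @hidehisaarai1213)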
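A minimal sketch of pulling the location and date out of a soundscape file name. The layout `<audio_id>_<site>_<yyyymmdd>.ogg` is an assumption for illustration (the slides only state that location and date are present in the name), and `parse_soundscape_name` and the example file name are hypothetical.

```python
from datetime import date, datetime
from pathlib import Path
from typing import NamedTuple


class SoundscapeInfo(NamedTuple):
    audio_id: str
    site: str
    recorded_on: date


def parse_soundscape_name(file_name: str) -> SoundscapeInfo:
    """Split an assumed '<audio_id>_<site>_<yyyymmdd>.ogg' name into its parts."""
    audio_id, site, day = Path(file_name).stem.split("_")
    return SoundscapeInfo(audio_id, site, datetime.strptime(day, "%Y%m%d").date())


info = parse_soundscape_name("1234_ABC_20190904.ogg")  # made-up file name
print(info.site, info.recorded_on.month)
```

Fields like the site and the month can then be fed to a model or used to filter out species that are implausible for that place and season.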
  10. EDA (train_short_audio)

  11. Audio length per file (train_short_audio)
      ※ 1000 audio files randomly sampled from train_short_audio (out of 62874)
      ※ Horizontal axis: audio length per file [seconds]
      ※ Vertical axis: frequency (1000 files in total)
  12. How many audio files are there per bird species? (train_short_audio)
      ※ Horizontal axis: number of audio files per bird (within train_short_audio)
      ※ Vertical axis: frequency (397 species in total)
  13. secondary labels are explicitly stated to have missing entries (train_short_audio) (quoted from BirdCLEF2021: Exploring the data)

  14. EDA (soundscapes)

  15. Submitting nocall for every row reveals the nocall rate of the Public LB
      [Figure panels: BirdCLEF2021; (reference) the 2020 bird competition; each labeled Public and Private]

  16. By contrast, the nocall rate in train_soundscapes is somewhat higher

  17. Distribution of the target variable in train_soundscapes (including nocall)
      • nocall is overwhelmingly the most frequent
      • Some 5-second segments have 2 or more species calling

  18. Distribution of the target variable in train_soundscapes (nocall removed)
      • Some combinations of bird species are observed frequently

  19. Number of birds calling simultaneously within a 5-second segment in train_soundscapes

  20. A basic solution for audio recognition tasks

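  21. A basic solution for audio recognition tasks
      Audio data can be converted into an image (a spectrogram) whose horizontal axis is time, whose vertical axis is frequency, and whose pixels give the intensity of the signal component. Applying a CNN or similar to it lets the task be handled as conventional image processing.
      ※ In this competition, mel spectrograms, which use the mel scale on the vertical (frequency) axis, were widely used
      ※ The mel scale: a scale of frequency on which equal differences correspond to equal differences in the pitch perceived by the human ear
      [Figure: spectrogram fed to a CNN] (image quoted from BirdCLEF2021: Processing audio data)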
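To make the mel scale concrete, here is a small sketch using one widely used definition (the HTK-style formula, mel = 2595 * log10(1 + f / 700)). Audio libraries may use a slightly different variant, and the function names are hypothetical. The sketch shows that bands equally spaced in mel get wider in Hz as frequency rises.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to mel (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> List[float]:
    """Return n_bands + 1 band edges in Hz that are equally spaced on the mel scale."""
    low, high = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (high - low) / n_bands
    return [mel_to_hz(low + i * step) for i in range(n_bands + 1)]


edges = mel_band_edges(0.0, 16000.0, 8)
widths = [b - a for a, b in zip(edges, edges[1:])]
print([round(w) for w in widths])  # widths in Hz grow with frequency
```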
  22. Quirks specific to this competition

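  23. Quirks specific to this competition
      • train_short_audio has only weak labels (weak label problem)
        - Labels are assigned to the whole recording, which is tens of seconds long
        - It is unknown which birds are calling at the level of 5-second segments
      • Some labels in train_short_audio are missing (noisy label problem)
        - secondary_labels (※) in particular are explicitly stated to have missing entries
      • Metadata such as recording date and location must be incorporated in some form (incorporating metadata)
      • Whether birds are calling in the segments before and after the target segment may also be informative (incorporating neighboring-segment information)
      • The nocall rate differs greatly between train_soundscapes and test_soundscapes (difficulty of establishing a CV strategy)
      ※ The labels of train_short_audio come in two kinds: primary_label and secondary_labels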
  24. Top solutions
      top solutions and approaches
      The discussion above collects links to the top solutions

  25. 1st place (ours!)
      [1st Place] Quick Solution
      [1st Place] Detailed Solution
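  26. tl;dr
      1st stage: build a binary nocall detector using external data (freefield1010) (1: some bird is calling / 0: nocall)
      2nd stage: build a 397-dimensional multi-label classifier after using the nocall detector to reduce the weight of the nocall portions of train_short_audio
      3rd stage: build our own table competition from the nocall detector output, the metadata, the 2nd stage output and so on
      Turning it into our own table competition in the end solves the weak label problem, the noisy label problem, incorporating metadata, incorporating neighboring-segment information and more, all in one go!!
      ※ This outlines the Inference Part only; the 1st stage is omitted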
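  27. Why does converting to a table solve the weak label & noisy label problems?
      • The 3rd stage target variable (0: miss row / 1: hit row) is determined as follows
      • Combining the outputs of the nocall detector and the multi-label classifier gives segment-level predictions for the primary & secondary labels assigned to recordings tens of seconds long, which solves the weak label problem
      • If a secondary label is missing, the row gets label 0, but label 0 rows are relatively numerous, so the noise gets buried reasonably well (mitigating the noisy label problem)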
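The slides do not spell out the exact rule for building the table, so the following is only one plausible reading, not the team's implementation. It assumes each 5-second segment contributes one row per top-k candidate bird, with the classifier probability and the nocall detector's call probability as features, and a target of 1 when the candidate appears among the recording's primary or secondary labels. `build_rows`, `top_k`, and the column names are all hypothetical.

```python
from typing import Dict, List, Set


def build_rows(
    segment_probs: List[Dict[str, float]],  # per segment: bird -> classifier probability
    call_probs: List[float],                # per segment: nocall detector output (1 = calling)
    clip_labels: Set[str],                  # primary + secondary labels of the whole recording
    top_k: int = 2,                         # assumed number of candidates kept per segment
) -> List[dict]:
    """Turn segment-level model outputs into rows for a tabular second-level model."""
    rows = []
    for seg_idx, (probs, call_prob) in enumerate(zip(segment_probs, call_probs)):
        candidates = sorted(probs, key=probs.get, reverse=True)[:top_k]
        for bird in candidates:
            rows.append({
                "segment": seg_idx,
                "bird": bird,
                "bird_prob": probs[bird],
                "call_prob": call_prob,
                "target": int(bird in clip_labels),  # 1: hit row / 0: miss row
            })
    return rows


rows = build_rows([{"a": 0.9, "b": 0.2, "c": 0.1}], [0.8], {"a"})
print(rows)
```

Metadata columns (site, month, and so on) and features from neighboring segments could be appended to each row in the same way before training a gradient-boosting model on the table.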
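  28. more details...
      • My teammate kami (twitter: @634kami / kaggle: @kami634) wrote up our solution in Japanese in the article below
        Kaggle の鳥コンペで1位を取った話:BirdCLEF 2021 優勝解法 ("How we took 1st place in Kaggle's bird competition: the BirdCLEF 2021 winning solution")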
  29. 2nd place 2nd place solution

  30. 2nd place (quoted from 2nd place solution)

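  31. 2nd place
      • Extract 30-second chunks from train_short_audio, cut them into 5-second pieces, and apply mixup (addressing the weak label problem)
      • Exclude the 3 train_soundscapes files in which no bird calls for the full 10 minutes & use bootstrap sampling (robust CV strategy)
      • Label smoothing & weighting by the rating column in the metadata (addressing the noisy label problem)
      • Tips for threshold selection
        - The nocall rate on the LB is lower than in CV, so lower the threshold to predict more birds
        - The distribution of probabilities differs between models, so using a single probability value as the threshold makes no sense; a percentile-based threshold was used instead
      • Other (post-processing)
        - Correct individual probabilities using each bird's mean predicted probability
        - Use neighboring-segment information
        - Take the nocall detector output into account
        - Remove predictions of bird species that are impossible given the time and place (incorporating metadata)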
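The percentile-based threshold idea can be sketched like this: instead of fixing a probability cut-off, fix the fraction of predictions to treat as positive and read the cut-off from each model's own score distribution. The function name `percentile_threshold` and the example numbers are assumptions for illustration; the 2nd place write-up may differ in detail.

```python
from typing import List


def percentile_threshold(probs: List[float], top_fraction: float) -> float:
    """Return the score such that roughly top_fraction of probs are >= it."""
    if not probs:
        raise ValueError("probs must not be empty")
    ranked = sorted(probs, reverse=True)
    # Keep at least one and at most all predictions.
    keep = max(1, min(len(ranked), round(top_fraction * len(ranked))))
    return ranked[keep - 1]


# Two models with the same ranking but very different probability scales.
model_a = [0.9, 0.8, 0.3, 0.2, 0.1]
model_b = [0.09, 0.08, 0.03, 0.02, 0.01]
for scores in (model_a, model_b):
    cut = percentile_threshold(scores, top_fraction=0.4)
    print(cut, [s >= cut for s in scores])
```

Both models end up selecting the same two predictions, which a single shared probability threshold would not achieve.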
  32. 4th place 4th place solution

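  33. 4th place
      • Used an SED model with 10-30 second inputs (addressing the weak label problem)
        (Reference: Introduction to Sound Event Detection by @hidehisaarai1213)
      • Also used mixup
      • Ran pseudo labeling (addressing the noisy label problem)
      • Combined the SED outputs for the target 5-second segment and for the 30-second segment centered on it into the final output (incorporating neighboring-segment information)
        - A small threshold for the 5-second segment, a large threshold for the 30-second segment
      • As in the 2nd place solution, removed bird species judged unlikely to be observed given the time and place (incorporating metadata)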
  34. 5th place 5th place solution

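  35. 5th place
      • Placed 2nd in the 2020 bird competition and based this solution on that one
      • Improvements over last time: switching to SED, +1% (※1) / threshold tuning, +1% / improved ensembling method, +1%
        (+ apparently also narrowed down the predicted labels using region information (※2))
      • Distinctive augmentation: raising the image to a power of 0.5-3 / n-times playback speed / adding sounds such as rain and conversation / adding noise / frequency adjustment with probability 0.5
        (1st-4th place seem to have stopped at mixup and noise addition)
      • Label 1 for the primary label, label 0.3 for secondary labels
      • Adjusted so that a bird observed in one segment is picked up more easily across the whole 10-minute audio file (※3)
      ※1: addressing the weak label problem
      ※2: incorporating metadata
      ※3: incorporating neighboring-segment information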
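The primary/secondary labeling scheme can be sketched as a soft target vector. The values 1 and 0.3 come from the slide; the function name `soft_target`, the class list, and the choice to ignore secondary labels that are not in the class list are assumptions for illustration.

```python
from typing import List, Sequence


def soft_target(
    primary: str,
    secondary: Sequence[str],
    classes: Sequence[str],
    secondary_value: float = 0.3,
) -> List[float]:
    """Build a training target: 1.0 for the primary label, 0.3 for secondary labels."""
    index = {name: i for i, name in enumerate(classes)}
    target = [0.0] * len(classes)
    for name in secondary:
        if name in index:  # assumption: labels outside the class list are skipped
            target[index[name]] = secondary_value
    target[index[primary]] = 1.0  # set last so the primary label always wins
    return target


print(soft_target("b", ["c", "zzz"], ["a", "b", "c"]))
```

A vector like this can be used directly with a binary cross-entropy loss, so uncertain secondary labels pull the model less strongly than the primary one.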
  36. 8th place 8th place writeup

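  37. 8th place
      • Placed 6th in the 2020 bird competition; used SED again this time (addressing the weak label problem)
      • Trained on 5-second or 20-second segments and inferred on 40-second segments; longer worked better
        Inference windows also overlapped: 0-40 seconds, then 20-60 seconds, and so on (incorporating neighboring-segment information)
      • Augmentation: Gaussian noise, pink noise, volume adjustment, pitch shift
        (mixup also worked well, but reportedly could not be included in the final submission because of compute constraints)
      • Distinctive loss function (BCEFocal2WayLoss)
      • primary label and secondary labels were treated the same way
      • Ran pseudo labeling (addressing the noisy label problem)
      • Two thresholds, a call threshold and a nocall threshold: bird species above the call threshold are marked positive, while segments in which no species exceeds the nocall threshold also get nocall (bird labels and nocall can coexist)
      • Bird species that cannot exist given the region information were excluded even when predicted (incorporating metadata)
      • Computed F1 separately for rows with calling birds and for nocall rows, and derived CV as 0.54 * nocall_f1 + 0.46 * call_f1 (robust CV strategy)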
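The two-threshold decoding can be sketched as follows. It assumes the nocall threshold is at least as high as the call threshold, which is what allows a bird label and nocall to appear together; the function name `decode_segment` and the example threshold values are hypothetical, not taken from the 8th place write-up.

```python
from typing import Dict, List


def decode_segment(
    probs: Dict[str, float], call_threshold: float, nocall_threshold: float
) -> List[str]:
    """Turn per-bird probabilities for one segment into submitted labels."""
    if nocall_threshold < call_threshold:
        raise ValueError("this sketch assumes nocall_threshold >= call_threshold")
    # Every bird above the call threshold is predicted as present.
    labels = sorted(bird for bird, p in probs.items() if p > call_threshold)
    # If no bird is confident enough, nocall is added as well.
    if all(p <= nocall_threshold for p in probs.values()):
        labels.append("nocall")
    return labels


print(decode_segment({"a": 0.4, "b": 0.1}, 0.3, 0.6))  # bird and nocall coexist
print(decode_segment({"a": 0.7, "b": 0.1}, 0.3, 0.6))  # confident bird only
print(decode_segment({"a": 0.1, "b": 0.1}, 0.3, 0.6))  # nocall only
```

Submitting both a bird and nocall hedges the row-wise F1 when the model is unsure whether anything is calling.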
  38. 9th place 9th Place solution

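  39. 9th place
      • Training inputs were 5-7 second segments
      • secondary labels were given a smaller weight
      • Used mixup
      • As much diversity as possible
        - mel-spectrograms with different time resolutions, hop_length of 200 and 320
        - Various backbones
        - Augmentation: white noise, pink noise, band noise, nocall clips, raising the mel-spectrogram image to a power
      • Post-processing
        - Using each bird's maximum or mean call probability over the whole 10-minute recording (incorporating neighboring-segment information)
        - Using an estimate, from region information, of how likely each bird is to call (incorporating metadata)
        - Using each bird's maximum call probability within a day (incorporating metadata)
      • Squeeze width of test soundscapes by 2-5% (mostly to reverse far field effects)
      • Also set two thresholds (call, nocall) and allowed bird labels and nocall to coexist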
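The slide only says that file-level maximum or mean probabilities were used in post-processing, not how. As one hypothetical way to do it, the sketch below blends each segment's probability with the bird's maximum over the whole file; the linear blend, the `weight` parameter, and the function name are assumptions, not the 9th place method.

```python
from typing import Dict, List


def blend_with_file_max(
    segment_probs: List[Dict[str, float]], weight: float = 0.5
) -> List[Dict[str, float]]:
    """Pull each segment's probability toward the bird's maximum over the file."""
    if not segment_probs:
        return []
    birds = list(segment_probs[0])
    file_max = {b: max(seg[b] for seg in segment_probs) for b in birds}
    return [
        {b: (1.0 - weight) * seg[b] + weight * file_max[b] for b in birds}
        for seg in segment_probs
    ]


# A bird heard clearly in one segment raises its score in the others.
blended = blend_with_file_max([{"a": 0.2}, {"a": 0.8}], weight=0.5)
print(blended)
```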
  40. 11th place My journey (11th solution)

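  41. 11th place
      • CPMP, who held the top of the Public LB for a long time
      • Placed 18th in the 2020 bird competition and 11th in the Rainforest competition, and reportedly based this solution on a mix of the two
        - 2020 bird competition solution: 18th place solution: efficientnet b3
        - Rainforest competition solution: 11th place, The 0.931 Magic Explained: Image Classification
      • Like the 8th place solution, computed CV as 0.54 * nocall_f1 + 0.46 * call_f1 (robust CV strategy)
      • Regrets having been overconfident that ensembling would reliably raise the score and not submitting an ensemble version until a few days before the competition ended (in practice the ensemble reportedly did not help)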
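The weighted CV score 0.54 * nocall_f1 + 0.46 * call_f1 can be sketched as below. The weights come from the slides; how F1 is computed inside each group is not stated, so this sketch assumes the mean row-wise F1 within each group, and the function names are hypothetical.

```python
from typing import List, Set


def row_f1(true: Set[str], pred: Set[str]) -> float:
    """F1 for a single row: 2*TP / (|true| + |pred|)."""
    if not true and not pred:
        return 1.0
    return 2.0 * len(true & pred) / (len(true) + len(pred))


def weighted_cv(
    trues: List[Set[str]], preds: List[Set[str]], nocall_weight: float = 0.54
) -> float:
    """Score nocall rows and call rows separately, then mix them with fixed weights."""
    nocall_scores, call_scores = [], []
    for true, pred in zip(trues, preds):
        group = nocall_scores if true == {"nocall"} else call_scores
        group.append(row_f1(true, pred))
    if not nocall_scores or not call_scores:
        raise ValueError("need at least one nocall row and one call row")
    nocall_f1 = sum(nocall_scores) / len(nocall_scores)
    call_f1 = sum(call_scores) / len(call_scores)
    return nocall_weight * nocall_f1 + (1.0 - nocall_weight) * call_f1


score = weighted_cv(
    [{"nocall"}, {"nocall"}, {"a"}],
    [{"nocall"}, {"a"}, {"a"}],
)
print(round(score, 2))
```

Fixing the mix of the two groups makes the validation score less sensitive to the nocall rate of train_soundscapes, which differs from that of the test set.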
  42. Thank you!