Slide 1

Cornell Birdcall Identification 6th place solution
Team「Deepでポン」
Hidehisa Arai

Slide 2

About the competition
https://www.kaggle.com/c/birdsong-recognition

The task: build a system that can detect and classify birdcall events in audio clips.

Train dataset
• Around 20k audio clips
• Additional ~25k audio clips (not officially provided but allowed to use)
• 264 classes (bird species)
• Each audio clip has a primary label, and 1/3 of them also have secondary labels
• Can be treated as multi-label, but it is basically a multi-class classification problem
• Annotated at clip level
• Data comes from Xeno Canto, i.e. user-uploaded recordings; the annotations are also done by the uploaders
• File format, sampling rate, audio quality, audio duration, etc. vary a lot

Test dataset
• Around 150 audio clips
• Each clip is 10 min long
• 265 classes (bird species + `nocall` class)
• Predictions must be submitted at the 5s-chunk level
• The data is soundscape recordings, which contain birdcall events of multiple species (sometimes overlapping)
• Multi-label classification problem
• Annotation was done in a controlled environment with trained annotators

Slide 3

About the competition
https://www.kaggle.com/c/birdsong-recognition

Large difference between train and test

Slide 4

How to use audio data with a NN

1D CNN approach
• Operates on the raw waveform (e.g. EnvNet, WaveNet encoder)
• Slow and usually vulnerable to noise

2D CNN approach
• Convert the raw waveform to a (log-)(mel-)spectrogram and treat it as an image
• Use the rich pool of image recognition models to classify the spectrogram
• Learns fast, often used
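For concreteness, a minimal sketch of the 2D CNN preprocessing step with librosa; the file name and all spectrogram parameters below are illustrative assumptions, not values from the solution.

import librosa
import numpy as np

# Raw waveform -> log-mel spectrogram "image" (parameters illustrative).
y, sr = librosa.load("example.wav", sr=32000)  # resample to a fixed rate
melspec = librosa.feature.melspectrogram(
    y=y, sr=sr, n_fft=2048, hop_length=512, n_mels=128
)                                              # (n_mels, n_frames) power spectrogram
logmel = librosa.power_to_db(melspec).astype(np.float32)
# `logmel` can now be fed to any image-recognition CNN as a 1-channel image.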

Slide 5

Result: Public 2nd (0.628) → Private 6th (0.668)

Slide 6

Abstract
• 3-stage training to gradually remove the noise in the labels
• Sound Event Detection (SED) style training and inference
• Weighted blending of 11 models with no-fold training + EMA

Slide 7

Major challenges of the competition

Discrepancy between train/test (domain shift)
1. Source of the audio clips
  • Train: uploaded to Xeno Canto by the recordist
  • Test: (possibly) taken by microphones set up outdoors
2. Annotation
  • Train: clip-wise labels annotated by the uploaders of those clips
  • Test: chunk-level labels annotated by trained annotators

Weak and noisy labels
• Labels for the train data are clip-level (weak labels), and each clip has only one primary label
• Secondary labels may exist but are not that trustworthy (noisy labels)
• Sometimes uploaders just don't put secondary labels even though multiple species are present in the clip (missing labels)
• The class `nocall` only exists in the test dataset

Slide 8

Motivation ‒ domain shift

Decompose the domain shift. Notation: x = input, y = label, y* = true label (unobservable). Estimating p_test(y|x), i.e. the mapping x_test → y_test, is the task.

p_train(x) ≠ p_test(x): the distribution of the input is different between train/test
• Comes from the sound collection environment
• The SNR of the test data is comparatively small because birds often sing far away
• The frequency and type of non-birdcall events are also different

p_train(y*) ≠ p_test(y*): the distribution of the true label is different between train/test
• Comes from the difference of the recording locations
• The distribution of bird species and call types differs
• Test recording is passive, so the nocall class occurs more often than in train

p_train(y|x) ≠ p_test(y|x): the input-output relationship is different between train/test
• Comes from the difference of the annotation methods
• In the train dataset, annotation quality varies a lot since the uploaders are the annotators
• In the test dataset, annotation is controlled and done at chunk level
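In standard domain-adaptation terminology, these three components are covariate shift, label (prior) shift, and concept shift; a compact restatement of the decomposition (my notation, following the slide's p(x), p(y*), p(y|x)):

\begin{aligned}
p(x, y) &= p(y \mid x)\, p(x) \\
\text{covariate shift:}\quad & p_{\mathrm{train}}(x) \neq p_{\mathrm{test}}(x) \\
\text{label shift:}\quad & p_{\mathrm{train}}(y^{*}) \neq p_{\mathrm{test}}(y^{*}) \\
\text{concept shift:}\quad & p_{\mathrm{train}}(y \mid x) \neq p_{\mathrm{test}}(y \mid x)
\end{aligned}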

Slide 9

Motivation ‒ domain shift: mitigation for each component

p_train(x) ≠ p_test(x)
→ Provide all possible variation with Data Augmentation

p_train(y*) ≠ p_test(y*)
→ Ensemble of models trained on datasets with different class distributions

p_train(y|x) ≠ p_test(y|x)
→ Make y_train closer to y_test by label correction

Slide 10

Motivation ‒ label noise

Decompose the label noise.

Weak labels are noisy labels
• The fact that labels exist only at clip level can itself be treated as label noise
• There are cases where birdcall events that are in the labels don't actually exist
• We can trust the primary labels, but the corresponding events are distributed over the whole clip
• Birds in the secondary labels are often only scarcely audible

Missing labels
• There are cases where birdcall events exist that aren't in the labels
• 2/3 of the train data don't have secondary labels
• It is up to the uploaders whether to put secondary labels or not, so missing labels exist

(Example clips on the slide: one annotated with the labels aldfly, leafly, amered, canwar; one with the single label warvir.)

Slide 11

Motivation ‒ label noise: mitigation for each component

Weak labels
→ Make sure birdcall events are in the chunk by training with long chunks

Noisy labels
→ Eliminate low-confidence labels by pseudo-labeling

Missing labels
→ Find missing labels in the audio clips that don't have secondary labels, using trained models

Slide 12

Sound Event Detection (SED)

Two major approaches to tackle the task:

Audio Tagging
• Clip-level labeling of the audio input
• Pipeline: input (waveform, melspec, …) → feature extraction (CNN, etc.) → feature map → aggregate over the time axis (max, mean, attention, …) → classifier → clip-level prediction

Sound Event Detection
• Segment-level (time-annotated) labeling of the audio
• Pipeline: input (waveform, melspec, …) → feature extraction (CNN, etc.) → feature map → pointwise classifier → frame-level prediction, then aggregate over the time axis (max, mean, attention, …) for the clip level
• There are two outputs: a clip-wise prediction and a segment-wise prediction
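A minimal PyTorch sketch of an SED-style head that produces both outputs, loosely following PANNs' attention block; the class name and layer choices here are my own, not taken from the slides.

import torch
import torch.nn as nn

class SEDHead(nn.Module):
    # Frame-wise predictions plus an attention-weighted clip-wise
    # prediction on top of a CNN encoder's feature map.
    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.att = nn.Conv1d(in_features, num_classes, kernel_size=1)
        self.cla = nn.Conv1d(in_features, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor):
        # x: (batch, in_features, n_frames) feature map
        att = torch.softmax(torch.tanh(self.att(x)), dim=-1)  # attention over time
        cla = torch.sigmoid(self.cla(x))                      # frame-wise probabilities
        clipwise = (att * cla).sum(dim=-1)                    # (batch, num_classes)
        return clipwise, cla                                  # clip-level, frame-level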

Slide 13

Stage 1: find missing labels with 5-fold PANNs

• Cnn14DecisionLevelAtt of PANNs (an SED model)
• Input: 30s randomly cropped chunk ← for the weak labels
• Data Augmentation ← for the shift in p(x)
  • Gaussian Noise
  • Pitch Shift
  • Volume Up/Down
• The loss is applied to both the clip-wise and the frame-wise output: L = L_att + 0.5 · L_max, where the attention map is multiplied element-wise with the segment-wise predictions and aggregated over the time axis to give the attention-based clip-level prediction, and max aggregation gives the second clip-level prediction
• Create predictions for the whole train dataset by 5-fold training
• Add a class to the secondary labels when its predicted probability is > 0.9 and it is not the primary label (only for the clips without secondary labels) ← for the missing labels

→ 0.578 (public) / 0.619 (private), silver medal zone
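A hedged sketch of the two mechanics above, assuming probability tensors/arrays of shape (batch or n_clips, 264); the function names are hypothetical.

import numpy as np
import torch.nn.functional as F

def stage1_loss(clip_att, clip_max, target):
    # L = L_att + 0.5 * L_max: BCE on the attention-aggregated clip
    # prediction plus 0.5x BCE on the max-aggregated one.
    return (F.binary_cross_entropy(clip_att, target)
            + 0.5 * F.binary_cross_entropy(clip_max, target))

def mine_secondary_labels(oof_probs, primary_onehot, threshold=0.9):
    # Add a class as a secondary label when its out-of-fold probability
    # exceeds 0.9 and it is not the primary label (applied only to clips
    # that have no secondary labels). Arrays: (n_clips, 264).
    return np.logical_and(oof_probs > threshold, primary_onehot == 0)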

Slide 14

Stage 2: pseudo-labeling with a 5-fold SED model

• Inherit the PANNs architecture, changing the CNN encoder to ResNeSt50
• Use the additional labels found in stage 1 ← for the missing labels
• Input: 20s randomly cropped chunk ← for the weak labels
• Data Augmentation ← for the shift in p(x)
  • Gaussian Noise
  • Pitch Shift
  • Volume Up/Down
• 3-channel input (melspec, pcen, melspec ** 1.5)
• The loss is applied to both the clip-wise and the frame-wise output
• Create predictions for the whole train dataset by 5-fold training
• Correct the original labels with the oof predictions ← for the noisy labels: take the frame-wise prediction of each cropped chunk (shape n_frame × 264, where n_frame ≠ len(y)), aggregate it over the time axis with max, apply a threshold, and combine it with the original clip-level label via np.logical_and to obtain the corrected label
• Alter the correction level by changing the threshold, for diversity

→ 0.601 (public) / 0.655 (private), gold medal zone
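A minimal sketch of this correction rule; the function name, threshold default, and exact shapes are assumptions.

import numpy as np

def correct_chunk_label(framewise_probs, clip_label, threshold=0.5):
    # framewise_probs: (n_frame, 264) oof frame-wise probabilities
    #                  for one cropped chunk (n_frame != len(y))
    # clip_label:      (264,) original clip-level multi-hot label
    # Aggregate over the time axis with max, threshold, then keep only
    # classes that the original label also contains (np.logical_and).
    chunk_pred = framewise_probs.max(axis=0) > threshold
    return np.logical_and(chunk_pred, clip_label.astype(bool))

Raising or lowering the threshold changes how aggressively labels are removed, which is the "correction level" used for diversity.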

Slide 15

Stage 3: train several models with no-fold training + EMA, changing the correction level of the labels

• PANNs architecture + ResNeSt50 / EfficientNet-B0
• No-fold training with Train Dataset / Train Extended
• Use the additional labels found in stage 1 ← for the missing labels
• Input: 20s randomly cropped chunk ← for the weak labels
• Data Augmentation ← for the shift in p(x)
  • Gaussian Noise
  • Pitch Shift
  • Volume Up/Down
• 3-channel input (melspec, pcen, melspec ** 1.5):
  • librosa.power_to_db(librosa.feature.melspectrogram(y))
  • librosa.pcen(librosa.feature.melspectrogram(y))
  • librosa.power_to_db(librosa.feature.melspectrogram(y) ** 1.5)
• The loss is applied to both the clip-wise and the frame-wise output (FocalLoss for EfficientNet)
• Correct the original labels with the oof predictions ← for the noisy labels
• Alter the label correction level by changing the threshold on the oof predictions from 0.3 to 0.7 ← for the shift in p(y*)
• Weighted average of 11 models (the weights were decided based on the public LB)

※ no-fold + EMA: use the whole dataset for training and apply an Exponential Moving Average to the weights

→ 0.628 (public) / 0.668 (private), 6th place
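Two short sketches for the pieces above: building the 3-channel input from the librosa calls on the slide, and a generic weight-EMA helper for the no-fold training (the decay value and class name are assumptions, not from the slide).

import copy
import librosa
import numpy as np
import torch

def three_channel_input(y, sr):
    # The 3-channel input; melspectrogram parameters are left at
    # librosa defaults here for brevity.
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    ch1 = librosa.power_to_db(mel)            # log-mel spectrogram
    ch2 = librosa.pcen(mel)                   # per-channel energy normalization
    ch3 = librosa.power_to_db(mel ** 1.5)     # log of the 1.5-powered spectrogram
    return np.stack([ch1, ch2, ch3], axis=0)  # (3, n_mels, n_frames)

class ModelEMA:
    # "no-fold + EMA": keep an exponential moving average of the weights
    # while training on the whole dataset.
    def __init__(self, model, decay=0.999):
        self.ema = copy.deepcopy(model).eval()
        self.decay = decay

    @torch.no_grad()
    def update(self, model):
        for e, p in zip(self.ema.parameters(), model.parameters()):
            e.mul_(self.decay).add_(p, alpha=1.0 - self.decay)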

Slide 16

Miscellaneous

Reflections
• There was a bug in the seed-fixing function; I found it while training the third stage
• Loss of reproducibility due to code modifications (caused by the above)
• "1 experiment, 1 script" is the best approach after all
• I kept the Data Augmentation configuration that worked at first, but pitch shift is actually super slow; I could have run more experiments if I had decided not to use it
• Hard voting seems to work very well in the winner's solution, but I didn't even try it

What didn't work
• Mixup
  • Mixing before taking the log (2nd, 36th) and taking the union of the labels (3rd, 36th) are reported to have worked
• Call-type classification
  • In fact, there are many call types even within a single species; calls and songs are quite different
  • I tried 871-class classification using call-type labels, but it didn't work
• Larger models
  • According to Oleg (5th), too large a receptive field wasn't good for this dataset

What I want to do further
• Add more models
• Continue with further label correction
• Use location and elevation for label correction
• Use co-occurrence information of bird calls for post-processing
• Mix in background noise