Slide 1

github.com/NickleDave/hybrid-vocal-classifier
David Nicholson, Graduate Student, Neuroscience, Emory University

Slide 2

• Birdsong
  • consists of elements called syllables
  • segment sound file into syllables by threshold crossings of amplitude
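The threshold-crossing segmentation described above can be sketched in a few lines of numpy. This is an illustrative sketch only; the function name, signature, and toy envelope are mine, not the library's API:

```python
import numpy as np

def segment_syllables(amplitude, threshold):
    """Return syllable onset and offset indices, defined as the points
    where the amplitude envelope crosses the threshold.
    Hypothetical helper, not hybrid-vocal-classifier's actual API."""
    above = amplitude > threshold
    # +1 where the envelope rises above threshold, -1 where it falls below
    crossings = np.diff(above.astype(int))
    onsets = np.where(crossings == 1)[0] + 1
    offsets = np.where(crossings == -1)[0] + 1
    return onsets, offsets

# toy amplitude envelope with two "syllables" above a threshold of 0.5
env = np.array([0.1, 0.2, 0.9, 0.8, 0.1, 0.1, 0.7, 0.9, 0.6, 0.2])
on, off = segment_syllables(env, 0.5)  # on = [2, 6], off = [4, 9]
```

In practice the envelope would be a smoothed, rectified audio signal, and very short crossings would be filtered out as noise.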

Slide 3

• Each bird’s song is similar to the song of its tutor
• But each individual of a species will have a unique song
• So syllable “c” for Bengalese finch #1 is not syllable “c” for Bengalese finch #2

Slide 4

• Problem:
  • Birds sing 100s of songs a day
  • Many more than can be labeled by hand

Slide 5

• Problem:
  • Birds sing 100s of songs a day
  • Many more than can be labeled by hand
• Previous work:
  • Sound Analysis Pro (soundanalysispro.com):
    • Great software! Open source! Drove the field forward!
    • Avoids labeling song; produces similarity scores based on cross-correlation of spectrograms
  • Some groups automate labeling with clustering
    • Clustering doesn’t work very well for some bird species
[figure: similarity score, 0–100%, computed between spectrogram 1 and spectrogram 2]

Slide 6

• Problem:
  • Birds sing 100s of songs a day
  • Many more than can be labeled by hand
• Previous work:
  • Sound Analysis Pro
  • Other machine learning methods applied to Bengalese finch song:
    • k-Nearest Neighbors (k-NN)
    • Support Vector Machines (SVM)
    • Convolutional Neural Networks (CNN)

Slide 7

• Problem:
  • Birds sing 100s of songs a day
  • Many more than can be labeled by hand
• Previous work:
  • Sound Analysis Pro
  • Other machine learning methods applied to Bengalese finch song
  • Hard to compare different machine learning methods:
    • not all open source
    • not all well-documented software
    • very little in the way of publicly available repositories of song

Slide 8

• Enter hybrid-vocal-classifier:
  • a library that automates labeling vocalizations
• What it is not: Shazam for songbirds
• What it is: like voice-to-text, but for songbirds

Slide 9

• hybrid-vocal-classifier:
  • open source
  • built on the scipy-numpy stack
  • implements previously proposed approaches:
    • SVM and k-NN via scikit-learn
    • neural nets using Keras
  • easy to use: runs on YAML config scripts
  • released with a large data set:
    • hand-labeled data
    • well-segmented song
    • days of song, ~20k data points/day
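Since the library runs on YAML config scripts, a config might look something like the sketch below. All keys, values, and paths here are illustrative assumptions for the sake of the example, not the library's actual schema:

```yaml
# hypothetical config sketch -- keys are illustrative,
# not hybrid-vocal-classifier's real schema
spectrogram:
  fft_size: 512
  step_size: 64
data:
  data_dir: ./bird1/day1    # made-up path
  label_set: [a, b, c, d]
model:
  type: knn                 # or: svm, cnn
  k: 5
```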

Slide 10

• Goals of the library:
  1. make it easy to label song in an automated way
  2. make it easy to compare different previously proposed machine learning methods for automated labeling of song
  3. make it easy to test new machine learning methods

Slide 11

• Comparing previous models:
  • k-Nearest Neighbors
    • http://www.utsa.edu/troyerlab/software.html
[image: k-NN classification diagram — https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm#/media/File:KnnClassification.svg, Antti Ajanki, Wikipedia, CC 3.0]

Slide 12

• Comparing previous models:
  • k-Nearest Neighbors
    • http://www.utsa.edu/troyerlab/software.html
[figure: example spectrogram, 50 ms scale bar]
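To make the k-NN comparison concrete, a minimal nearest-neighbor classifier can be written in numpy alone. This is a sketch of the algorithm, not the scikit-learn implementation the library actually uses:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify one feature vector by majority vote among its k nearest
    training points (Euclidean distance). Illustrative sketch only."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

# toy 2-D "syllable features": class 0 clusters near the origin,
# class 1 clusters near (5, 5)
X = np.array([[0., 0.], [1., 0.], [0., 1.], [5., 5.], [6., 5.], [5., 6.]])
y = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(X, y, np.array([0.5, 0.5]))  # nearest neighbors are class 0
```

In the real pipeline each point would be a vector of acoustic features extracted from one segmented syllable.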

Slide 13

• Comparing previous models:
  • Support Vector Machine (Tachibana et al. 2014)
[image: maximum-margin hyperplane — https://en.wikipedia.org/wiki/Support_vector_machine#/media/File:Svm_max_sep_hyperplane_with_margin.png]

Slide 14

• Comparing previous models:
  • Support Vector Machine (Tachibana et al. 2014)
    • features:
      • average spectra and cepstra, plus
      • many from the CUIDADO feature set (Peeters 2004):
        • spectral centroid
        • spectral spread
        • etc.
[image: maximum-margin hyperplane — https://en.wikipedia.org/wiki/Support_vector_machine#/media/File:Svm_max_sep_hyperplane_with_margin.png]
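Two of the CUIDADO-style features named above are simple amplitude-weighted moments of the spectrum. A sketch in numpy (the function names are mine, not the library's):

```python
import numpy as np

def spectral_centroid(magnitudes, freqs):
    """Amplitude-weighted mean frequency of a spectrum."""
    return np.sum(freqs * magnitudes) / np.sum(magnitudes)

def spectral_spread(magnitudes, freqs):
    """Amplitude-weighted standard deviation around the centroid."""
    c = spectral_centroid(magnitudes, freqs)
    return np.sqrt(np.sum(magnitudes * (freqs - c) ** 2) / np.sum(magnitudes))

freqs = np.array([100., 200., 300., 400.])
mags = np.array([1., 1., 1., 1.])    # flat toy spectrum
centroid = spectral_centroid(mags, freqs)  # = 250.0, the mean frequency
```

For a flat spectrum the centroid is just the mean frequency; energy concentrated at high frequencies would pull it upward.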

Slide 15

• Comparing previous models:
  • Convolutional neural net (Koumura & Okanoya 2016)

Slide 16

• Comparing previous models:
  • Convolutional neural net
    • architecture:
      • convolutional layer
      • max-pooling
      • “window” layer
    • goal: segmentation + classification
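The convolution and max-pooling building blocks can be illustrated in 1-D with numpy. The real layers in Koumura & Okanoya's network operate on 2-D spectrograms (e.g. via Keras); these helpers are illustrative only:

```python
import numpy as np

def conv1d_valid(x, kernel):
    """'Valid' 1-D convolution as used in deep learning (i.e.
    cross-correlation): dot the kernel with each window of x."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel)
                     for i in range(len(x) - k + 1)])

def max_pool_1d(x, pool_size=2):
    """Downsample by taking the max over non-overlapping windows."""
    n = len(x) // pool_size * pool_size
    return x[:n].reshape(-1, pool_size).max(axis=1)

feat = conv1d_valid(np.array([1., 2., 3., 4., 5.]), np.array([1., 1.]))
pooled = max_pool_1d(feat)  # feat = [3, 5, 7, 9] -> pooled = [5, 9]
```

Stacking such layers lets the network learn local spectro-temporal patterns while pooling makes the features robust to small shifts in time.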

Slide 17

• Comparing previous models:
  • Convolutional neural net
    • open code + data!!! Thank you!

Slide 18

• Comparisons
  • Plot learning curves:
    • accuracy vs. training data (# of hand-labeled songs)
  • I want the best model for the least data

Slide 19

• Comparisons
  • Plot learning curves:
    • accuracy vs. training data (# of hand-labeled songs)
  • I want the best model for the least data
  • 5-fold cross-validation

Slide 20

• Comparisons
  • Plot learning curves:
    • accuracy vs. training data (# of hand-labeled songs)
  • I want the best model for the least data
  • 5-fold cross-validation
  • For each fold: randomly grab n songs from the training set, measure average accuracy across syllables
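The comparison protocol on this slide (5 folds; within each fold, a random grab of n training songs) amounts to index bookkeeping, sketched below. Model fitting and accuracy measurement are omitted, and the function name is mine, not the library's:

```python
import numpy as np

rng = np.random.default_rng(0)

def learning_curve_folds(n_songs, n_folds=5, train_sizes=(10, 20, 40)):
    """Split song indices into k folds; for each fold, draw random
    training subsets of each size from the remaining songs. In the
    real pipeline a model would be trained on each subset and scored
    on the held-out fold. Illustrative sketch only."""
    songs = rng.permutation(n_songs)
    folds = np.array_split(songs, n_folds)
    samples = {}
    for i, test_fold in enumerate(folds):
        train_pool = np.setdiff1d(songs, test_fold)
        samples[i] = {n: rng.choice(train_pool, size=n, replace=False)
                      for n in train_sizes}
    return folds, samples

folds, samples = learning_curve_folds(100)
```

Plotting mean held-out accuracy against each training size n then gives one learning curve per model.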

Slide 21

• Using hybrid-vocal-classifier on our data set, I find:
  • SVM outperforms k-NN if a radial basis function kernel is used

Slide 22

• Using hybrid-vocal-classifier on our data set, I find:
  • A simple convolutional neural net with minimal training data outperforms SVM

Slide 23

• Sober lab
• Jonah Queen