VocalPy: a core Python package for acoustic communication research

A core Python package for researchers studying acoustic communication David
Nicholson PyData Global 2023 @nicholdav NickleDave @[email protected] nicholdav.info

Acknowledgements Dr. Yarden Cohen Dept of Brain Sciences Weizmann Institute
of Science, Israel https://github.com/NeuralSyntaxLab https://www.weizmann.ac.il/brain-sciences/labs/cohen/ @YardenJCohen

A core package for animal acoustic communication research in Python
VocalPy

VocalPy Adapted from: Bass Chagnaud 2012, Branstetter et al. 2016, Chen et al. 2020

Acoustic behavior and animal communication What makes us human? Language
and speech How is speech like birdsong and bat calls? How did speech evolve? How do animals learn their vocalizations? Boraud Leblois Rougier 2014 https://inria.hal.science/hal-01874690

VocalPy • interdisciplinary (Wirthlin et al. 2019) • big team science (Hauser et al. 2002) • big data, automated analyses • → cutting edge computational methods ◦ Deep learning, AKA neural network models (Sainburg et al 2021, Stowell 2022) Hauser et al. 2002

• pykanto: https://github.com/nilomr/pykanto • koe: https://github.com/fzyukio/koe • Hundreds of diﬀerent
libraries and diﬀerent formats for audio, array data, annotations: https://github.com/rhine3/bioacoustics-software Do we really need a core package?

Goals: • Develop a library that is general and robust • Enable better collaboration VocalPy

Features: • Data types • Works with a wide variety of formats • Classes for common steps in workﬂows • Designed for reproducibility • Allows scientist-coders to interactively build datasets then share as a database VocalPy

VocalPy • Built on the scientiﬁc Python stack (thank you
NumFocus!) ◦ numpy ◦ scipy ◦ matplotlib ◦ pandas ◦ dask ◦ xarray

Case study: TweetyNet • A neural network model o Automates
annotations o Avoids issues with segmenting audio Cohen et al., https://elifesciences.org/articles/63853

Annotating animal sounds manually, with a GUI

Annotating animals sounds Annotation requires segmenting audio into a sequence
of units (Kershenbaum et al. 2016) Oﬀ-the-shelf signal processing algorithms for segmenting often don't work

Annotating animals sounds Oﬀ-the-shelf signal processing algorithms for segmenting often
don't work

Annotating animals sounds Oﬀ-the-shelf signal processing algorithms for segmenting often
don't work → Neural network model

Statistical models of acoustic behavior Syntax McGregor et al. 2022
Cohen et al. 2022 Koparkar et al. 2023 Yang et al. 2022 Dialect Motor learning Fit to data annotated with TweetyNet

Case study: TweetyNet • TweetyNet, the code ◦ Started as
a single Jupyter notebook, written in Tensorﬂow.

Case study: TweetyNet To be good computational scientists, we developed
a neural network framework: vak • SciPy 2023 talk: https://www.youtube.com/watch?v=tpL0 m5UwpZM • proceedings paper: https://conference.scipy.org/proceeding s/scipy2023/david_nicholson.html Version 1.0 in alpha now • Lightning backend • Better abstractions for models, tasks, and datasets https://github.com/vocalpy/vak

To work with annotation formats, we developed crowsetta: https://github.com/vocalpy/crowsetta •
now a pyOpenSci package ◦ https://www.pyopensci.org ◦ published in Journal of Open Source Software • PyCon 2023 lightning talk: https://youtu.be/54q_cPCNNS8?list=PL2 Uw4_HvXqvY2zhJ9AMUa_Z6dtMGF3gt b&t=1082 Case study: TweetyNet

Case study: TweetyNet • TweetyNet (the code) ◦ Separate repository
to replicate results in paper: https://github.com/yardencsGitHub/tweetynet ▪ people often asked us for functionality ▪ that was hidden in this code -- hard to read ▪ and it was hard for them to adapt to their data

What's the problem? Hundreds of diﬀerent libraries and diﬀerent formats
for audio, array data, annotations: https://github.com/rhine3/bioacoustics-software

What's the problem? Challenges: • GUIs ◦ Don't capture parameters
/ steps in analysis ◦ Other issues: proprietary language / no longer developed / single lab or PI or developer / closed source or source not easily accessible • Analysis scripts and libraries ◦ Deal with many low level details ◦ Because of data formats ◦ → dataset preparation + analysis tightly coupled to formats ◦ scripts are hard to read and hard for other groups to re-use ◦ libraries are hard for user to conﬁgure

What's the problem?

What's the problem? Key parameters hidden in defaults

What's the problem? Key parameters hidden in defaults Lack of
data types leads to proliferation of variables

VocalPy

Data types for acoustic communication vocalpy.Audio • works with a
wide array of formats Helper function to get paths to all audio ﬁles

wide array of formats Create list of Audio instances with read method

wide array of formats Audio encapsulates signal data along with samplerate and channels

wide array of formats Audio captures metadata like path

vocalpy.Spectrogram • save expensive-to-compute spectrograms in array ﬁles Data types
for acoustic communication Spectrogram encapsulates data with frequencies and times

Data types for acoustic communication vocalpy.Spectrogram • plot and inspect

Data types for acoustic communication vocalpy.Annotation • Load many diﬀerent
annotation formats using Crowsetta

Data types for acoustic communication • vocalpy.Annotation ◦ Plot with
spectrogram

Classes for common steps in workﬂows vocalpy.Segmenter for segmentation of
audio into sequences of units (Kershenbaum et al. 2016)

audio into sequences of units (Kershenbaum et al. 2016) Encourages explicit declaration of parameters

audio into sequences of units (Kershenbaum et al. 2016) Callbacks allow re-use of code

Classes for common steps in workﬂows vocalpy.SpectrogramMaker to compute spectrograms

Classes for common steps in workﬂows vocalpy.SpectrogramMaker to compute spectrograms
Parallelize with

Build datasets interactively, share as databases vocalpy.datasets.SequenceDataset

Build datasets interactively, share as databases vocalpy.datasets.SequenceDataset • Single-ﬁle database
• built into Python • archival format

Results

Benchmarking neural network models Poster presented at Society for Neuroscience
meeting 2023 https://github.com/vocalpy/Nicholson-Cohen-SfN-2023-poster

Benchmarking neural network models Cohen et al., https://elifesciences.org/articles/63853 Steinfath et
al., https://elifesciences.org/articles/68837

Experimental set-up

Window size matters

Evaluating segmentation with VocalPy Evaluate with vocalpy.metrics.segmentation

Window size matters

TCNs tend to oversegment

Additional loss terms mitigate oversegmentation

Work in Progress

How do we measure song similarity? "A procedure for an
automated measurement of song similarity", Tchernichovski et al. 1999

How do we measure song similarity? (predeﬁned acoustic) features (!)
• soundanalysispro.com ◦ http://soundanalysispro.com/matlab-sat • https://github.com/PaulEcoﬀet/birdsonganalysis • https://github.com/theresekoch/avn

Predeﬁned Acoustic Features vocalpy.feature.sat

Predeﬁned Acoustic Features vocalpy.feature.sat See also: https://github.com/theunissenlab/soundsig https://github.com/NeuroBatLab/SoundAnalysisBats

Predeﬁned acoustic features: a use case for xarray

Roadmap

Roadmap Working towards a community-developed software for acoustic communication research
Core package Annotation formats Neural networks

Roadmap: VocalPy community • We already have many contributors (thank
you to all of you!)

Roadmap: VocalPy community • We already have a growing forum:
https://forum.vocalpy.org/

Ways you can contribute • Star our repositories on GitHub
• Join the forum ◦ Ask and answer questions ◦ Share examples • Contribute code • Attend development meetings

Links • VocalPy organization on GitHub: https://github.com/vocalpy ◦ https://github.com/vocalpy/vocalpy ◦
https://github.com/vocalpy/vak ◦ https://github.com/vocalpy/crowsetta • VocalPy forum: https://forum.vocalpy.org/ @nicholdav NickleDave @[email protected] nicholdav.info

References (that are not linked elsewhere) • Bass Chagnaud 2012:
https://www.pnas.org/doi/abs/10.1073/pnas.1201886109 • Chen Wiens 2020: https://www.nature.com/articles/s41467-020-14356-3 (but see also Jorgewich-Cohen et al 2022 https://www.nature.com/articles/s41467-022-33741-8) • Wirthlin et al 2019: https://www.sciencedirect.com/science/article/pii/S0896627319308396 • Hauser Fitch Chomsky 2002: https://www.science.org/doi/full/10.1126/science.298.5598.1569?casa_token=iB9i4_rZEvUAAAAA%3AfhEc BMWcpTjRpmIrAFaxTe361utCWj-kB6eGvQ1UikCcPz6pDrtiWKywJJdJWl95pa2FwXJusxZ5xoc • Sainburg Gentner 2020: https://www.frontiersin.org/articles/10.3389/fnbeh.2021.811737/full • Stowell 2020: https://peerj.com/articles/13152/ • Kershenbaum et al. 2016: https://onlinelibrary.wiley.com/doi/abs/10.1111/brv.12160?casa_token=OzzQcd7O9AQAAAAA:jkcnLBuWBlia 0tHFmﬀTMziy9cXtAVwVraJm43mw7GQgDqGsmpZ-9omjv4X6FABVnd7KMcZST1Gl8mA

VocalPy: a core Python package for acoustic com...

VocalPy: a core Python package for acoustic communication research

More Decks by David Nicholson

Featured

Transcript