Analyzing Song Structure with Spectral Clustering

Brian McFee, Dan Ellis

Musical structure analysis
1. Detect change-points (verse → chorus)
2. Label repeated sections (ABAC)
• … also representation and visualization

The Infinite Jukebox [Lamere, 2012]
1. Build a chain graph; each beat is a vertex
2. Connect repeated beats
3. Playback follows a random walk
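The three steps above can be sketched as a random walk over beat indices. This is only an illustration: the jump probability `p_jump`, the `jumps` dictionary format, and the wrap-around at the end of the track are assumptions made for the example, not details taken from the slides or from the Infinite Jukebox itself.

```python
import numpy as np

def jukebox_walk(n_beats, jumps, n_steps, p_jump=0.2, seed=0):
    """Random walk over beats: usually step forward, sometimes jump.

    jumps maps a beat index to a list of similar ("repeated") beats.
    p_jump is a hypothetical parameter chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    beat, path = 0, [0]
    for _ in range(n_steps):
        targets = jumps.get(beat, [])
        if targets and rng.random() < p_jump:
            beat = int(rng.choice(targets))  # follow a repetition edge
        else:
            beat = (beat + 1) % n_beats      # follow the chain, wrapping at the end
        path.append(beat)
    return path

# Toy track with 8 beats where beat 3 resembles beat 1 and beat 6 resembles beat 2.
path = jukebox_walk(n_beats=8, jumps={3: [1], 6: [2]}, n_steps=20)
```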

What does this have to do with structure analysis?

Structure from the ∞ Jukebox
1. Sequential repetitions form dense subgraphs
   ◦ Ignore edge directions, examine connectivity
2. Links between subgraphs are sparse
   ◦ Pruning these should reveal structure

Our strategy
1. Construct a graph over beats
2. Partition the graph to recover structure
3. Vary the partition size to expose multi-level structure

Building the graph [1/3]: The local graph
1. Add a vertex for each beat
2. Add local edges (i, i ± 1)
3. Weight edges by MFCC similarity
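A minimal sketch of the local graph, assuming beat-synchronous MFCCs are already available as an array with one row per beat. The slide says only "MFCC similarity"; the Gaussian kernel and the median-based bandwidth used here are assumptions, and `local_graph` is a hypothetical helper name.

```python
import numpy as np

def local_graph(mfcc):
    """Affinity matrix with nonzero weights only between beats i and i ± 1.

    mfcc: array of shape (n_beats, n_coefficients).
    """
    n = len(mfcc)
    # Squared distance between each beat and its successor, shape (n - 1,)
    dist2 = np.sum(np.diff(mfcc, axis=0) ** 2, axis=1)
    # Assumed bandwidth: median squared distance (guarded against zero)
    bandwidth = max(float(np.median(dist2)), 1e-12)
    weights = np.exp(-dist2 / bandwidth)

    A = np.zeros((n, n))
    i = np.arange(n - 1)
    A[i, i + 1] = weights  # edge (i, i + 1)
    A[i + 1, i] = weights  # edge (i + 1, i), keeping the graph undirected
    return A
```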

Building the graph [2/3]: The repetition graph
1. Link k-nearest neighbors (in CQT space)
2. Enhance sequences by windowed majority vote
3. Weight edges by feature similarity
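One way to read the three steps above is sketched below. Several choices are assumptions made for illustration: Euclidean distance between beat-synchronous CQT frames, symmetrizing the nearest-neighbor links, taking the majority vote along diagonals of the link matrix (so that a link survives only if its neighbors in time are also linked), and a Gaussian similarity weight. `repetition_graph`, `k`, and `window` are hypothetical names.

```python
import numpy as np

def repetition_graph(cqt, k=3, window=3):
    """cqt: (n_beats, n_bins) features. window should be odd."""
    n = len(cqt)
    # Pairwise squared distances; exclude self-matches
    d2 = np.sum((cqt[:, None, :] - cqt[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)

    # 1. Link each beat to its k nearest neighbors, then symmetrize
    R = np.zeros((n, n), dtype=bool)
    nn = np.argsort(d2, axis=1)[:, :k]
    R[np.repeat(np.arange(n), k), nn.ravel()] = True
    R = R | R.T

    # 2. Majority vote over R[i + t, j + t] for t in the window
    half = window // 2
    votes = np.zeros((n, n))
    for t in range(-half, half + 1):
        shifted = np.zeros((n, n))
        if t >= 0:
            shifted[:n - t, :n - t] = R[t:, t:]
        else:
            shifted[-t:, -t:] = R[:n + t, :n + t]
        votes += shifted
    R_filtered = votes > window / 2

    # 3. Weight surviving links by feature similarity (assumed Gaussian)
    linked = d2[R_filtered]
    bandwidth = max(float(np.median(linked)), 1e-12) if linked.size else 1.0
    return R_filtered * np.exp(-d2 / bandwidth)
```

Filtering along diagonals favors runs of consecutive matches (beat i matches j, i + 1 matches j + 1, and so on), which is what a repeated section looks like in the link matrix.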

Building the graph [3/3]: The combination
1. Take a weighted combination of local and repetition
   A = μ * Local + (1 - μ) * Repetition
2. Optimize μ: P[Local move] ≅ P[Repetition move]
3. μ* has a closed-form solution
   ∀ i: μ ∑_j Local[i, j] ≅ (1 - μ) ∑_j Repetition[i, j]
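The slide states that μ* has a closed form but does not spell it out. One plausible reading, assumed here, is to treat "≅" as a least-squares fit over all beats i. Writing a_i = ∑_j Local[i, j] and b_i = ∑_j Repetition[i, j], minimizing ∑_i (μ a_i - (1 - μ) b_i)² gives μ* = ∑_i b_i (a_i + b_i) / ∑_i (a_i + b_i)². The helper names below are hypothetical.

```python
import numpy as np

def optimal_mu(local, repetition):
    """Least-squares balance between local and repetition degrees.

    Assumes at least one beat has a nonzero degree in either graph.
    """
    a = local.sum(axis=1)       # local degree of each beat
    b = repetition.sum(axis=1)  # repetition degree of each beat
    total = a + b
    return float(b @ total / (total @ total))

def combine(local, repetition):
    """Return A = mu * Local + (1 - mu) * Repetition and the mu used."""
    mu = optimal_mu(local, repetition)
    return mu * local + (1 - mu) * repetition, mu
```

When both graphs have identical degrees, this yields μ = 0.5. A beat with many repetition links pulls μ up, which shrinks the repetition weights relative to the local ones.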

Example: The Beatles - Come Together

Partitioning via spectral clustering
• Affinity matrix A, degree matrix D_ii = ∑_j A_ij
• Normalized Laplacian L = I - D^{-1} A
  ◦ Bottom eigenvectors encode component membership for each beat
  ◦ … i.e., the regions likely to trap a random walk
• Cluster the eigenvectors of L to reveal structure
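A small sketch of the eigenvector computation, assuming A is symmetric and every beat has at least one edge. Because I - D^{-1} A is not symmetric, the code diagonalizes the symmetric form I - D^{-1/2} A D^{-1/2}, which has the same eigenvalues, and then rescales the eigenvectors by D^{-1/2}. The function name is hypothetical.

```python
import numpy as np

def laplacian_eigenvectors(A):
    """Eigenpairs of L = I - D^{-1} A, sorted by ascending eigenvalue."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)  # assumes all degrees are positive
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    evals, V = np.linalg.eigh(L_sym)  # eigh returns ascending eigenvalues
    Y = d_inv_sqrt[:, None] * V       # eigenvectors of I - D^{-1} A
    return evals, Y
```

For a graph with c disconnected components, the first c eigenvalues are zero and the rows of Y[:, :c] are identical for beats in the same component, which is the "component membership" property the slide refers to.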

Example: The Beatles - Come Together
Low-rank reconstructions expose structure
L ≅ Y[:n, :m] Y[:n, :m]^T

Multi-level segmentation
1. Construct the n-by-n graph A
2. Compute Laplacian eigenvectors Y
3. for m in [2, 3, …]
   a. Partitions[m] := spectral_clustering(Y[:n, :m], n_components=m)
4. Return Partitions
One discrete parameter controls complexity
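The loop above could look roughly like the following in NumPy. This is a sketch rather than the authors' implementation: the clustering step is a bare-bones k-means with deterministic farthest-point initialization, the row normalization is an assumption borrowed from common spectral clustering practice, and all function names are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=20):
    """Minimal Lloyd's algorithm with farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def multi_level_segmentation(Y, max_m):
    """Cluster the first m eigenvectors for each m in 2..max_m."""
    partitions = {}
    for m in range(2, max_m + 1):
        X = Y[:, :m]
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        X = X / np.maximum(norms, 1e-12)  # assumed row normalization
        partitions[m] = kmeans(X, m)
    return partitions

def boundaries(labels):
    """Beat indices where the segment label changes."""
    return np.flatnonzero(np.diff(labels)) + 1
```

Each value of m yields one labeling of the beats, so increasing m moves from a coarse segmentation to a finer one without recomputing the eigenvectors.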

Interactive visualization demo

Quantitative evaluation
• Metrics: boundary detection and pairwise frame labeling
  ◦ Beatles_TUT (174 tracks)
  ◦ SALAMI small (735 tracks) and functions
• Choosing the number of components
  ◦ Maximize label entropy with duration constraints
  ◦ Oracle: Best m per track, per metric (simulates interactive display)
• Baseline: [Serrà, Müller, Grosche & Arcos, 2012]
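To make the pairwise frame labeling metric concrete, here is a small sketch under the usual definition: a pair of frames counts as a hit when both the reference and the estimate assign the two frames the same label. It assumes both labelings are sampled on the same frame grid. The evaluation behind the reported numbers may differ in details, and `pairwise_f` is a hypothetical name.

```python
import numpy as np

def pairwise_f(ref_labels, est_labels):
    """Pairwise frame-labeling F-measure between two label sequences."""
    ref = np.asarray(ref_labels)
    est = np.asarray(est_labels)
    upper = np.triu_indices(len(ref), k=1)  # each unordered pair once
    same_ref = (ref[:, None] == ref[None, :])[upper]
    same_est = (est[:, None] == est[None, :])[upper]

    hits = np.sum(same_ref & same_est)
    precision = hits / max(same_est.sum(), 1)
    recall = hits / max(same_ref.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))
```

The score depends only on which frames share a label, so renaming the labels (A/B versus 0/1) does not change it.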

Results: Beatles
               F 0.5s          F 3.0s          F-Pairwise
Automatic m    0.312 ± 0.15    0.579 ± 0.16    0.628 ± 0.13
Oracle         0.414 ± 0.14    0.684 ± 0.13    0.694 ± 0.12
SMGA           0.293 ± 0.13    0.699 ± 0.16    0.715 ± 0.15

Results: SALAMI small
               F 0.5s          F 3.0s          F-Pairwise
Automatic m    0.192 ± 0.11    0.344 ± 0.15    0.448 ± 0.16
Oracle         0.292 ± 0.15    0.525 ± 0.19    0.561 ± 0.16
SMGA           0.173 ± 0.08    0.518 ± 0.12    0.493 ± 0.16

Results: SALAMI functions
               F 0.5s          F 3.0s          F-Pairwise
Automatic m    0.304 ± 0.13    0.455 ± 0.16    0.546 ± 0.14
Oracle         0.406 ± 0.13    0.579 ± 0.15    0.652 ± 0.13
SMGA           0.224 ± 0.11    0.550 ± 0.18    0.553 ± 0.15

Summary
• We demonstrated a novel graphical method for musical structure analysis
• Laplacian eigenvectors encode relevant musical information
• Future directions:
  ◦ Improve edge prediction/weighting
  ◦ Enforce consistency between layers

Thanks!
[email protected]
https://github.com/bmcfee/laplacian_segmentation
https://github.com/urinieto/msaf
