Improving Genre Annotations for the Million Song Dataset

Improving Genre Annotations for the Million Song Dataset
Hendrik Schreiber, tagtraum industries incorporated
[email protected] / @h_schreiber
October 27, ISMIR 2015, Málaga

TL;DR
1. New, high-quality genre annotations for part of the Million Song Dataset (MSD)
2. Involved in automatic music genre recognition? Please use them!
3. http://www.tagtraum.com/msd_genre_datasets.html
   Linked to from the MSD site. Thanks, Colin!

Thank you.

Automatic Music Genre Recognition is among the most popular MIR tasks.*
Why? But we don’t use a large, standard dataset.
* Zhouyu Fu, Guojun Lu, Kai Ming Ting, and Dengsheng Zhang. A survey of audio-based music classification and annotation. Multimedia, IEEE Transactions on, 13(2):303–319, 2011.

Used Datasets

Table 2 (J Intell Inf Syst (2013) 41:371–406, p. 377). Datasets used in MGR, the type of data they contain, and the percentage of experimental work (435 references) in our survey (Sturm 2012a) that use them. All datasets listed after Private are public.

Dataset                                    Description                                                             %
Private                                    Constructed for research but not made available                         58
GTZAN                                      Audio; http://marsyas.info/download/data_sets                           23
ISMIR2004                                  Audio; http://ismir2004.ismir.net/genre_contest                         17
Latin (Silla et al. 2008)                  Features; http://www.ppgia.pucpr.br/~silla/lmd/                         5
Ballroom                                   Audio; http://mtg.upf.edu/ismir2004/contest/tempoContest/               3
Homburg (Homburg et al. 2005)              Audio; http://www-ai.cs.uni-dortmund.de/audio.html                      3
Bodhidharma                                Symbolic; http://jmir.sourceforge.net/Codaich.html                      3
USPOP2002 (Berenzweig et al. 2004)         Audio; http://labrosa.ee.columbia.edu/projects/musicsim/uspop2002.html  2
1517-artists                               Audio; http://www.seyerlehner.info                                      1
RWC (Goto et al. 2003)                     Audio; http://staff.aist.go.jp/m.goto/RWC-MDB/                          1
SOMeJB                                     Features; http://www.ifs.tuwien.ac.at/~andi/somejb/                     1
SLAC                                       Audio & symbols; http://jmir.sourceforge.net/Codaich.html               1
SALAMI (Smith et al. 2011)                 Features; http://ddmal.music.mcgill.ca/research/salami                  0.7
Unique                                     Features; http://www.seyerlehner.info                                   0.7
Million song (Bertin-Mahieux et al. 2011)  Features; http://labrosa.ee.columbia.edu/millionsong/                   0.7
ISMIS2011                                  Features; http://tunedit.org/challenge/music-retrieval                  0.4

Sturm, B.L. A survey of evaluation in music genre recognition. In Proc. Adaptive Multimedia Retrieval. 2012.

Not reproducible (Private)
Has its issues (GTZAN)
Hardly used (Million song)

Too Much Britney
• Small: 1,000 tracks
• Replicas
• Excerpts from the same recording
• Versions (same music but different recordings)
• Mis-labelings
• Distortions
• Excerpts by the same artists: 35% of Reggae excerpts are by Bob Marley, 24% of Pop excerpts are by Britney Spears, …
Sturm, B.L. An analysis of the GTZAN music genre dataset. In Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies, pages 7–12. ACM, 2012.

What’s wrong with MSD?
No song-level genre annotations.

Last.fm MSD Tags
• 522,366 unique tags
• 505,216 tracks with at least one tag, i.e. multiple tags per song
• No explicit relationships between tags
• Many tags are genre related, but not all of them:
    rock 101,071
    pop 69,159
    alternative 55,777
    indie 48,175
    electronic 46,270
    female vocalists 42,565
    favorites 39,921
    love 34,901
    dance 33,618
    00s 31,432
    …
http://labrosa.ee.columbia.edu/millionsong/lastfm
Yes, great data. But no tag hierarchies/relationships. What’s a genre, what’s not? What exactly is the ground truth here that’s usable for MGR?

Top-MAGD Annotations
• Album-level annotations scraped from All Music Guide website
• 13 unique tags
• 406,427 labeled tracks:
    Pop/Rock 238,786
    Electronic 41,075
    Rap 20,939
    Jazz 17,836
    Latin 17,590
    R&B 14,335
    International 14,242
    Country 11,772
    Reggae 6,946
    Blues 6,836
    Vocal 6,195
    Folk 5,865
    New Age 4,010
Alexander Schindler, Rudolf Mayer, and Andreas Rauber. Facilitating comprehensive benchmarking experiments on the million song dataset. In Proceedings of the 13th International Conference on Music Information Retrieval (ISMIR), pages 469–474, 2012.
Is this still useful?
Also, great job. But the granularity is not really what we want: Metallica just isn’t Britney. Has anybody verified the annotations? Are album-level annotations good enough?

Goal
Intelligently merge several datasets into one high-quality genre annotation dataset.

How
1. Make sure all datasets use the same labels (via taxonomies)
2. Use datasets to evaluate each other
3. Generate new & improved dataset

Excerpt from the accompanying paper (slide background):

By finding all direct sub-genres and their parents, we can now create a set of trees. The number of created trees depends on the threshold τ. We found that, to properly distinguish between genres like Pop, Rock, Dance, R&B, Folk, and Other, τ := 0.085 proved to be useful, resulting in 141 trees. The roots of these trees are typically the names of seed-genres like Jazz, Pop, Rock, etc. (see Figure 1). Not all generated trees have children. For example, the tree with the seed-genre Groove consists of just the root. Although Groove co-occurs with R&B, Rock, Funk, and Soul, the co-occurrence rates with genres other than itself are all below τ. Even the co-occurrence with itself is low (0.157). This suggests that Groove is not really a genre, but more a property of a genre. Another example of a root-only tree is Calypso. Here the co-occurrence with itself is much higher (0.606), and indeed Calypso qualifies as a stand-alone genre that simply does not have any sub-genres in this database.

Naturally, the generated taxonomies are only simplified mappings of the more complex relationship graph represented by C. In reality, genres aren’t necessarily exclusive members of one tree or another (e.g. fusion genres). An ontology is the much better construct. But, as we will see, for the purpose of mapping most sub-genres to their seed-genre, trees are useful.

2.4 Matching with Million Song Dataset

To create song-level genre annotations for the MSD, we queried the beaTunes database for songs with artist/title pairs contained in the MSD and were able to match 677,038 songs. In order to ease the comparison with the HO and Top-MAGD datasets, we associated each matched song with the seed-genre of its most often occurring genre label, taking advantage of the taxonomies created in Section 2.3. Motown, for example, is represented by its seed-genre RnB. In many cases, the found seed-genres are …

But first, a little…

beaTunes
• Consumer application for Windows and Mac
• Encourages users to correct metadata
• Collects anonymized, user-submitted metadata in central database
https://www.beatunes.com/

beaTunes Database
• 870 million song submissions by 200 thousand users
• 772 million submissions are labeled with a genre
• Mapped to more than 85 million distinct songs (one song, many genre labels)
• 677,038 songs have been matched to MSD
https://www.beatunes.com/

Mapping User Genre Labels to a Genre Taxonomy
1. Normalization (lowercase, smart subs, etc.)
2. Inferring hierarchical relationships via co-occurrence

Co-Occurrence Matrix

Table 1 (fragment: ranks 2 to 10 only; rank 1 and the language column headers are cut off). Top ten genres used by beaTunes users with different languages. N denotes the number of submissions in millions.

 2.  Pop          Pop             Rock         Pop         Pop          Pop
 3.  Alternative  Alternative     Electronic   Jazz        Jazz         J-Pop
 4.  Jazz         Hip-Hop/Rap     Hip-Hop      Hip-Hop     Soundtrack   R&B
 5.  Hip-Hop      Hip-Hop         Jazz         Reggae      Latin        Soundtrack
 6.  Hip-Hop/Rap  R&B             Alternative  R&B         Dance        Jazz
 7.  Soundtrack   Soundtrack      Dance        Soundtrack  House        Electronica/Dance
 8.  R&B          Jazz            R&B          Blues       Otros        [illegible] (Rock)
 9.  Electronic   Country         Rock/Pop     Electronic  Blues        Altern. & Punk
10.  Country      Altern. & Punk  Soundtrack   Rap         Electronica  Hip-Hop/Rap

Table 2. Genre labels in the beaTunes database and their top four co-occurring labels, ordered by relative strength (given in parentheses). The underlying values from the co-occurrence matrix C were computed taking only submissions by English speakers and the 1,000 most-used labels into account.

             Co-Occurrence Rank
Genre        1.                   2.            3.                   4.
Rock         Rock (0.609)         Pop (0.057)   Alternative (0.026)  Rock/Pop (0.016)
Pop          Pop (0.593)          Rock (0.077)  Rock/Pop (0.014)     R&B (0.013)
Alternative  Alternative (0.394)  Rock (0.156)  Pop (0.052)          Alternative/Punk (0.036)
R&B          R&B (0.566)          Pop (0.061)   Soul (0.036)         R&B/Soul (0.033)
Soundtrack   Soundtrack (0.754)   Rock (0.024)  Pop (0.022)          Game (0.011)
...          ...                  ...           ...                  ...

Co-occurrence rates aren’t symmetric! For example, the Pop row lists Rock at 0.077, while the Rock row lists Pop at 0.057.

Rules
1. If a genre a co-occurs with another genre b more than a minimum threshold τ, and a co-occurs with b more than the other way around, then we assume that a is a sub-genre of b.
2. a is a direct sub-genre of b, iff a is a sub-genre of b and $C_{a,b} > C_{a,c}$ with $c \neq a$ and $c \neq b$.
   $a, b, c \in G$, with $C_{a,b}$ being the co-occurrence rate between a and b.
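The two rules can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the rates are the four complete rows of Table 2, unlisted pairs are assumed to be 0.0, and Rule 2 is read as "b must be the single strongest co-occurring label other than a itself". The names `rate`, `is_subgenre`, and `direct_parent` are made up for this sketch.

```python
from typing import Dict, Optional

# Co-occurrence rates C[a][b], copied from the Table 2 excerpt (English-speaking
# users). Pairs that are not listed are treated as 0.0, which is an assumption.
C: Dict[str, Dict[str, float]] = {
    "Rock": {"Rock": 0.609, "Pop": 0.057, "Alternative": 0.026, "Rock/Pop": 0.016},
    "Pop": {"Pop": 0.593, "Rock": 0.077, "Rock/Pop": 0.014, "R&B": 0.013},
    "Alternative": {"Alternative": 0.394, "Rock": 0.156, "Pop": 0.052,
                    "Alternative/Punk": 0.036},
    "R&B": {"R&B": 0.566, "Pop": 0.061, "Soul": 0.036, "R&B/Soul": 0.033},
}

TAU = 0.085  # threshold value quoted in the paper excerpt


def rate(a: str, b: str) -> float:
    """Co-occurrence rate of label a with label b (0.0 if not listed)."""
    return C.get(a, {}).get(b, 0.0)


def is_subgenre(a: str, b: str, tau: float = TAU) -> bool:
    """Rule 1: a co-occurs with b above tau, and more than b co-occurs with a."""
    return a != b and rate(a, b) > tau and rate(a, b) > rate(b, a)


def direct_parent(a: str, tau: float = TAU) -> Optional[str]:
    """Rule 2: the single label b that a co-occurs with most, if Rule 1 holds."""
    others = {c: r for c, r in C.get(a, {}).items() if c != a}
    if not others:
        return None
    best = max(others, key=others.get)
    # The strict inequality in Rule 2 rules out ties for first place.
    if list(others.values()).count(others[best]) > 1:
        return None
    return best if is_subgenre(a, best, tau) else None


for genre in C:
    print(genre, "->", direct_parent(genre))
```

With these four rows and τ = 0.085, only Alternative receives a parent (Rock). Pop's strongest other label is Rock at 0.077, which stays below the threshold, so Pop remains a root. Lowering the threshold to a hypothetical 0.07 would make Pop a sub-genre of Rock, which illustrates why the choice of τ matters for keeping the two apart.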

Generated Taxonomies

By finding all direct sub-genres and their parents, we can now create a set of trees. The number of created trees depends on the threshold τ. We found that, to properly distinguish between Pop and Rock, τ := 0.085 proved to be useful, resulting in 141 trees. The roots of these trees are typically the names of seed-genres like Jazz, Pop, Rock, etc. (see Figure 1).

Figure 1. Partial, generated trees for the seed-genres Rock, Pop, Hip-Hop, and R&B.
Rock: Metal, Alternative, Punk, ...
Pop: Folk Pop, Acoustic Pop, Top 40, ...
Hip-Hop: East Coast Rap, Turntablism, ...
RnB: Motown, Funk, Soul, Urban, ...

No parent = seed genre
Seed genres can easily be found and mapped to Top-MAGD labels (Pop/Rock, Electronic, Rap, Jazz, Latin, R&B, International, Country, Reggae, Blues, Vocal, Folk, New Age).
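Looking up a seed genre amounts to following parent links until a label has no parent. The sketch below assumes the labels in Figure 1 are direct children of their seed genres (the flattened figure does not show deeper nesting), and the seed-to-Top-MAGD table is a hypothetical mapping for illustration, not one taken from the slides.

```python
from typing import Dict

# Child -> parent links read off the partial trees in Figure 1. Treating every
# listed label as a direct child of its seed genre is an assumption.
PARENT: Dict[str, str] = {
    "Metal": "Rock", "Alternative": "Rock", "Punk": "Rock",
    "Folk Pop": "Pop", "Acoustic Pop": "Pop", "Top 40": "Pop",
    "East Coast Rap": "Hip-Hop", "Turntablism": "Hip-Hop",
    "Motown": "RnB", "Funk": "RnB", "Soul": "RnB", "Urban": "RnB",
}

# Hypothetical seed-genre -> Top-MAGD mapping, for illustration only.
SEED_TO_TOP_MAGD: Dict[str, str] = {
    "Rock": "Pop/Rock", "Pop": "Pop/Rock", "Hip-Hop": "Rap", "RnB": "R&B",
}


def seed_genre(label: str) -> str:
    """Follow parent links until a label without a parent (a seed genre)."""
    seen = set()
    while label in PARENT:
        if label in seen:
            raise ValueError(f"cycle in taxonomy at {label!r}")
        seen.add(label)
        label = PARENT[label]
    return label


def top_magd_label(label: str) -> str:
    """Map any known label to a Top-MAGD label via its seed genre."""
    return SEED_TO_TOP_MAGD[seed_genre(label)]


print(seed_genre("Motown"), top_magd_label("Metal"))
```

A label that has no entry in `PARENT` is returned unchanged, which matches the "no parent = seed genre" rule on the slide.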

Building Genre Taxonomies with Last.fm Tags
• Last.fm tags come with a relative strength (0-100)
• Same procedure can be applied
• Many more different tags -> minimum threshold τ has to be adjusted
• Allows us to find seed genres (top-level)

Comparing Annotations
• beaTunes and Last.fm labels can now be matched to Top-MAGD labels using the generated taxonomies
• Let’s compare!

Pairwise Comparison

            Last.fm   beaTunes
Top-MAGD    75.7%     84.0%
Last.fm     -         80.9%

High agreement rates, especially between beaTunes and Top-MAGD
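A pairwise agreement rate of this kind could be computed as sketched below. The slides do not spell out the exact definition, so this assumes the rate is the share of tracks present in both datasets that carry the same label. The track IDs and labels are toy data.

```python
from typing import Dict


def agreement(x: Dict[str, str], y: Dict[str, str]) -> float:
    """Share of tracks found in both datasets that have identical labels."""
    shared = x.keys() & y.keys()
    if not shared:
        raise ValueError("no tracks in common")
    matches = sum(1 for track in shared if x[track] == y[track])
    return matches / len(shared)


# Toy data with made-up track IDs; labels already mapped to Top-MAGD names.
beatunes = {"TR1": "Pop/Rock", "TR2": "Jazz", "TR3": "Rap", "TR4": "Latin"}
top_magd = {"TR1": "Pop/Rock", "TR2": "Jazz", "TR3": "R&B", "TR5": "Blues"}

print(f"{agreement(beatunes, top_magd):.1%}")
```

Tracks that appear in only one of the two datasets (TR4 and TR5 here) are ignored, so the rate describes label agreement rather than coverage.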

Glass-ceiling for ground truth with just one value/song?

Combined Dataset 1
• Find songs occurring in all datasets
• For which at least two of the datasets agree (majority voting)
• Take note of minority vote, if existent (i.e. allow ambiguity)
• => Combined Dataset 1 (CD1):
    133,676 tracks
    98,149 (73.4%) found by unanimous consent
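The voting steps above can be sketched as follows, assuming each source dataset is a plain mapping from track ID to a Top-MAGD-style label. The function names and the toy data are hypothetical, and the sketch makes no claim about how the published CD1 files were actually produced.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

Vote = Tuple[str, Optional[str]]  # (majority label, minority label or None)


def vote(labels: Tuple[str, str, str]) -> Optional[Vote]:
    """Majority vote over three labels; None if all three disagree."""
    ranked = Counter(labels).most_common()
    majority, count = ranked[0]
    if count < 2:
        return None
    minority = ranked[1][0] if len(ranked) > 1 else None
    return majority, minority


def build_cd1(a: Dict[str, str], b: Dict[str, str],
              c: Dict[str, str]) -> Dict[str, Vote]:
    """Keep tracks present in all three datasets that reach a majority."""
    combined: Dict[str, Vote] = {}
    for track in a.keys() & b.keys() & c.keys():
        result = vote((a[track], b[track], c[track]))
        if result is not None:
            combined[track] = result
    return combined


# Toy data with made-up track IDs.
beatunes = {"TR1": "Pop/Rock", "TR2": "Jazz", "TR3": "Rap", "TR4": "Folk"}
lastfm = {"TR1": "Pop/Rock", "TR2": "Blues", "TR3": "R&B", "TR4": "Folk"}
top_magd = {"TR1": "Pop/Rock", "TR2": "Jazz", "TR3": "Electronic"}

cd1 = build_cd1(beatunes, lastfm, top_magd)
unanimous = sum(1 for _, minority in cd1.values() if minority is None)
print(len(cd1), unanimous)
```

Storing the minority label next to the majority label is what lets a consumer of the dataset treat a track as ambiguous instead of forcing a single answer.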

CD1 Genre Distribution
Figure 2. Majority genre distribution of tracks in CD1. Tracks per genre [%]:
Blues 2.2, Country 3.9, Electronic 11.4, Folk 2.2, Intern. 1.1, Jazz 5.8, Latin 2.1, New Age 1, Pop/Rock 59.8, Rap 4.6, Reggae 2.7, RnB 2.9, Vocal 0.2

Metallica is not Britney ≠

Combined Dataset 2
• Just beaTunes and Last.fm tracks, because Top-MAGD can’t distinguish between Pop and Rock
• Split Pop and Rock
• Add Metal and Punk
• Remove Vocal
• Combine R&B and Soul
• Rename International to World
• Find songs in both beaTunes and Last.fm datasets
• => Combined Dataset 2 (CD2):
    280,831 tracks
    191,401 (68.2%) have the same genre label
• Combined Dataset 2 Consensus (CD2C):
    Convenience dataset with only the songs that have the same genre label
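A minimal sketch of the CD2 and CD2C construction, under the assumption that both sources are mappings from track ID to label. The `RELABEL` table is a hypothetical way to express "combine R&B and Soul" and "rename International to World"; the handling of the removed Vocal label is not described on the slides and is not modeled here.

```python
from typing import Dict, Tuple

# Hypothetical relabeling table reflecting two of the CD2 label changes.
RELABEL: Dict[str, str] = {"International": "World", "Soul": "RnB", "R&B": "RnB"}


def normalize(label: str) -> str:
    """Apply the relabeling table, leaving unknown labels unchanged."""
    return RELABEL.get(label, label)


def build_cd2(beatunes: Dict[str, str], lastfm: Dict[str, str]
              ) -> Tuple[Dict[str, Tuple[str, str]], Dict[str, str]]:
    """Return (CD2, CD2C): all shared tracks, and those with matching labels."""
    cd2 = {
        track: (normalize(beatunes[track]), normalize(lastfm[track]))
        for track in beatunes.keys() & lastfm.keys()
    }
    cd2c = {track: pair[0] for track, pair in cd2.items() if pair[0] == pair[1]}
    return cd2, cd2c


# Toy data with made-up track IDs.
bt = {"TR1": "Soul", "TR2": "International", "TR3": "Rock", "TR4": "Jazz"}
lf = {"TR1": "R&B", "TR2": "World", "TR3": "Pop", "TR5": "Jazz"}

cd2, cd2c = build_cd2(bt, lf)
print(len(cd2), len(cd2c))
```

Keeping both labels in CD2 preserves the disagreement between the two sources, while CD2C is the stricter subset for users who want a single label per track.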

CD2C Genre Distribution
Figure 3. Genre distribution of tracks in CD2C. Tracks per genre [%]:
Blues 3.2, Country 4.7, Electronic 11.4, Folk 2.2, Jazz 7.7, Latin 1.6, Metal 4.8, New Age 0.6, Pop 6.8, Punk 1.7, Rap 5.7, Reggae 4.2, RnB 5.1, Rock 39.2, World 1

Benchmarking Partitions
• Main “feature” of Schindler et al. paper
• Increase reproducibility
• Traditional training/test splits (90%, 80%, …)
• Training/test splits with genre stratification
• Splits with fixed number per genre (1,000, 2,000, 3,000)
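For readers unfamiliar with genre stratification, the idea can be sketched as below: each genre is split separately with the same training fraction, so the genre proportions are preserved in both parts. This is a generic illustration with a hypothetical `stratified_split` function and toy data. It does not reproduce the published partitions, which should be used as distributed when comparing results.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def stratified_split(labels: Dict[str, str], train_fraction: float,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Split track IDs into train/test so each genre keeps the same ratio."""
    by_genre: Dict[str, List[str]] = defaultdict(list)
    for track, genre in sorted(labels.items()):
        by_genre[genre].append(track)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[str] = []
    test: List[str] = []
    for genre in sorted(by_genre):
        tracks = by_genre[genre]
        rng.shuffle(tracks)
        cut = round(len(tracks) * train_fraction)
        train.extend(tracks[:cut])
        test.extend(tracks[cut:])
    return train, test


# Toy data: 10 Rock tracks and 20 Jazz tracks with made-up IDs.
toy = {f"R{i}": "Rock" for i in range(10)}
toy.update({f"J{i}": "Jazz" for i in range(20)})

train, test = stratified_split(toy, 0.8, seed=1)
print(len(train), len(test))
```

Sorting the tracks and genres before shuffling, together with the fixed seed, makes the split deterministic, which is the property that matters for reproducibility.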

Summary
• Multiple large ground truth datasets for the MSD
• Despite large size, reasonable quality
• Allow for ambiguity
• Benchmark partitions to promote experimentation and comparability

Thank you. Questions?
http://www.tagtraum.com/msd_genre_datasets.html
[email protected] / @h_schreiber
