FMA Music Big Data Analysis


1. About the Dataset: Exploring the FMA Data

FMA music data
◎ It was accessed from https://github.com/mdeff/fma
◎ It consists of multiple files:
  o tracks.csv
  o echonest.csv
  o genre.csv
◎ The original data was scraped from the Free Music Archive (FMA) and The Echo Nest (now part of Spotify)

◎ 14,000+ rows: this is just part of the data
◎ 3 datasets: the analysis was done using these files
◎ 23 columns: after cleaning the data

2. Preparing the Data: Data Cleansing and Preparation

The FMA data preparation
1. Collecting the Data: The dataset had multiple .csv files containing information about songs, their features, and their genres.
2. Cleaning the Data: Because the data was large, a lot of cleaning was needed, and it had to be done without losing information.
3. Preparing the Data: The files had linking columns such as Track_ID and Genre_ID.
  o Tracks: information about the track ID, track interest, and track duration
  o Genre: information about the various genres
  o Echo Nest (now Spotify): details about a song's features, such as danceability, song hotness, and valence
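The linking step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the deck's actual pipeline: the inline CSV text, the `track_listens` and `genre_title` columns, and the helper names `read_rows` and `join_on` are hypothetical stand-ins. Only the linking columns Track_ID and Genre_ID come from the slides.

```python
import csv
import io

# Hypothetical miniature stand-ins for the tracks and genre files.
TRACKS_CSV = """Track_ID,Genre_ID,track_listens
1,10,500
2,20,120
3,10,80
"""
GENRES_CSV = """Genre_ID,genre_title
10,Rock
20,Jazz
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on(left, right, key):
    """Left-join two lists of row dicts on a shared key column."""
    lookup = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[key], {})  # empty dict when no match is found
        joined.append({**match, **row})   # left-hand values win on conflicts
    return joined


tracks = read_rows(TRACKS_CSV)
genres = read_rows(GENRES_CSV)
merged = join_on(tracks, genres, "Genre_ID")
```

Each row of `merged` then carries both the track fields and the title of the genre it links to, which is the shape the later per-genre summaries need.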

3. Business Problem: What questions can we answer with this data?

I) Exploratory Analysis

Top 7 genres by the number of songs created

Top 7 genres by the number of song listens

Bottom 7 genres by the number of song listens
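The genre rankings above could be computed along these lines. This is a sketch under assumptions: the `(genre, listens)` pairs are made-up sample values, and `rank_genres` is a hypothetical helper, not code from the deck.

```python
from collections import defaultdict

# Hypothetical (genre, listens) pairs standing in for the merged track data.
rows = [("Rock", 500), ("Jazz", 120), ("Rock", 80), ("Folk", 40), ("Jazz", 30)]


def rank_genres(rows, n=7):
    """Return (top n by song count, top n by listens, bottom n by listens)."""
    counts = defaultdict(int)
    listens = defaultdict(int)
    for genre, track_listens in rows:
        counts[genre] += 1
        listens[genre] += track_listens
    by_count = sorted(counts, key=counts.get, reverse=True)[:n]
    by_listens = sorted(listens, key=listens.get, reverse=True)
    return by_count, by_listens[:n], by_listens[::-1][:n]


# n=2 only because the sample has three genres; the slides use the top and bottom 7.
top_count, top_listens, bottom_listens = rank_genres(rows, n=2)
```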

Mean Song Duration

Valence Analysis

II) Analyzing popularity of a song

Song Hotness
◎ Song currency is highly positively correlated
◎ Acousticness and Instrumentalness are negatively correlated
◎ Speechiness and artist hotness are positively correlated

Track Listens
◎ Artist familiarity is highly positively correlated
◎ Danceability and Energy are negatively correlated
◎ Track favorites and speechiness are positively correlated
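Correlation statements like these are typically based on a pairwise correlation coefficient. The sketch below assumes Pearson correlation, which the slides do not confirm, and uses made-up feature values chosen so that the two columns move in exactly opposite directions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, not taken from the FMA data.
acousticness = [0.9, 0.7, 0.4, 0.2]
instrumentalness = [0.1, 0.3, 0.6, 0.8]

r = pearson(acousticness, instrumentalness)  # close to -1 for this toy sample
```

A coefficient near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little linear relationship. Real feature pairs will usually fall somewhere in between.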

Evaluating song popularity measures

                                   Song Hotness   Track Listens
Accuracy                           88%            64%
Accuracy after Feature Selection   84%            54%

III) Analyzing polarity of a song

Song Happiness Quotient
Logistic Regression:
Features:
- Genre
- Acousticness
- Danceability
- Energy
- Liveness
- Speechiness
- Tempo
- Artist Familiarity
- Artist Hotness
Accuracy: 77.25%

III) Analyzing polarity of a song

Song Happiness Quotient
Decision Tree (DT), Random Forest (RF), GBC:
Features:
- Acousticness
- Danceability
- Energy
- Instrumentalness
- Liveness
- Speechiness
- Tempo
- Song Hotness
- Artist Familiarity
- Artist Hotness

Evaluation of Song Valence

           Decision Tree   Random Forest   GBC
Accuracy   61%             74%             70%
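For reference, an accuracy figure of this kind is the fraction of songs whose predicted class matches the true class. The sketch below assumes valence was turned into a binary happy/sad label with a 0.5 cutoff. The slides do not state how the label was built, so the threshold, the sample values, and the helper names are all hypothetical.

```python
def binarize(valence, threshold=0.5):
    """Label a song 1 (happy) if its valence meets the assumed threshold, else 0."""
    return [1 if v >= threshold else 0 for v in valence]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical valence scores and model output, not taken from the deck.
valence = [0.9, 0.2, 0.6, 0.4, 0.8]
y_true = binarize(valence)
y_pred = [1, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)  # 4 of 5 predictions match
```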

Thanks! ANY QUESTIONS?


Ruchira
