A Hybrid Method for Rating Prediction Using Linked Data Features and Text Reviews

The presentation for our entry to the Linked Data Mining Challenge 2016 organized by Know@LOD workshop at ESWC 2016

Emir Muñoz

May 30, 2016

Transcript

  1. A Hybrid Method for Rating Prediction Using Linked Data Features and Text Reviews
     Semih, Y., Emir M., Pasquale, M., Erdogan, D., Halife, K.
     Linked Data Mining Challenge 2016 - Know@LOD, ESWC 2016, Heraklion, Crete, Greece
  2. What makes a good/bad album of music? Can Linked Open Data help with the classification of music albums as “good” or “bad”?
  3. Music genres
     Some genres are more popular than others (dbo:genre).
     http://hpo.org/two-things-you-need-to-know-about-genre-hopping/
  4. Reviews
     Words used for good albums differ from the ones used for bad albums.
     http://www.youtube.com
  5. Award winners
     Albums of award-winning artists are likely to be more successful (# awards of dbo:artist).
  6. Datasets
     ◎ Training dataset: 1,280 album URIs
     ◎ Test dataset: 320 album URIs
     ◎ DBpedia
     ◎ Metacritic.com
  7. Experimental Setup
     ◎ Python 3.5.1
     ◎ Beautiful Soup Library 4.4.0 (Web Scraping)
     ◎ scikit-learn Library 0.17 (Data Mining)
     ◎ Jena Fuseki (LD Caching)
  8. Different Classifiers and Different Feature Sets

     Feature Set   Linear SVM  KNN     RBF SVM  Dec. Tree  Rand. Forest  AdaBoost  Naïve Bayes
     LD            76.64%      60.47%  48.05%   72.66%     53.91%        75.00%    76.41%
     LDA           54.53%      52.58%  54.69%   54.45%     48.91%        54.53%    52.89%
     LD+LDA        76.72%      60.23%  48.05%   72.66%     52.34%        75.00%    76.41%
     TEXT          85.00%      50.00%  47.27%   67.27%     52.81%        78.91%    68.44%
     LD+LDA+TEXT   87.81%      52.81%  47.27%   72.03%     52.58%        82.50%    77.19%

     + = 90% test set
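
The accuracy table on slide 8 can be summarized with a few lines of Python. The sketch below is not part of the original deck. It assumes the seven values in each row follow the column order given in the slide header, and the helper names (`best_per_feature_set`, `overall_best`) are hypothetical. It only reads back the reported numbers; it does not reproduce the experiments.

```python
# Accuracies (percent) transcribed from slide 8.
# Assumption: values in each row follow the header's column order.
CLASSIFIERS = ["Linear SVM", "KNN", "RBF SVM", "Dec. Tree",
               "Rand. Forest", "AdaBoost", "Naive Bayes"]

ACCURACY = {
    "LD":          [76.64, 60.47, 48.05, 72.66, 53.91, 75.00, 76.41],
    "LDA":         [54.53, 52.58, 54.69, 54.45, 48.91, 54.53, 52.89],
    "LD+LDA":      [76.72, 60.23, 48.05, 72.66, 52.34, 75.00, 76.41],
    "TEXT":        [85.00, 50.00, 47.27, 67.27, 52.81, 78.91, 68.44],
    "LD+LDA+TEXT": [87.81, 52.81, 47.27, 72.03, 52.58, 82.50, 77.19],
}


def best_per_feature_set(table):
    """Map each feature set to its (classifier, accuracy) with the top score."""
    best = {}
    for feature_set, scores in table.items():
        idx = max(range(len(scores)), key=lambda i: scores[i])
        best[feature_set] = (CLASSIFIERS[idx], scores[idx])
    return best


def overall_best(table):
    """Return (feature_set, classifier, accuracy) for the highest cell."""
    per_set = best_per_feature_set(table)
    feature_set = max(per_set, key=lambda fs: per_set[fs][1])
    classifier, score = per_set[feature_set]
    return feature_set, classifier, score


if __name__ == "__main__":
    for fs, (clf, acc) in best_per_feature_set(ACCURACY).items():
        print("{:<12} {:<11} {:.2f}%".format(fs, clf, acc))
    print("Overall best:", overall_best(ACCURACY))
    # Gain of the hybrid feature set over text alone, for Linear SVM.
    gain = ACCURACY["LD+LDA+TEXT"][0] - ACCURACY["TEXT"][0]
    print("Hybrid vs. TEXT (Linear SVM): +{:.2f} points".format(gain))
```

Read this way, the table shows Linear SVM leading on every feature set except LDA alone, with the hybrid LD+LDA+TEXT set giving the highest reported accuracy.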