A Hybrid Method for Rating Prediction Using Linked Data Features and Text Reviews

Presentation of our entry to the Linked Data Mining Challenge 2016, organized by the Know@LOD workshop at ESWC 2016.

Emir Muñoz

May 30, 2016

Transcript

  1. 1.

    A Hybrid Method for Rating Prediction Using Linked Data Features and Text Reviews

    Semih Y., Emir M., Pasquale M., Erdogan D., Halife K.

    Linked Data Mining Challenge 2016 - Know@LOD, ESWC 2016, Heraklion, Crete, Greece
  2. 2.

    What makes a good/bad album of music? Can Linked Open Data help with the classification of music albums as “good” or “bad”?
  3. 5.

    Music genres

    Some genres are more popular than others.

    dbo:genre

    http://hpo.org/two-things-you-need-to-know-about-genre-hopping/
  4. 8.

    Reviews

    Words used for good albums differ from those used for bad albums.

    http://www.youtube.com
  5. 9.

    Award winners

    Albums of award-winning artists are likely to be more successful.

    # awards of dbo:artist
  6. 11.

    Datasets

    ◎ Training dataset: 1,280 album URIs
    ◎ Test dataset: 320 album URIs
    ◎ DBpedia
    ◎ Metacritic.com
  7. 12.
  8. 14.

    Experimental Setup

    ◎ Python 3.5.1
    ◎ Beautiful Soup Library 4.4.0 (Web Scraping)
    ◎ scikit-learn Library 0.17 (Data Mining)
    ◎ Jena Fuseki (LD Caching)
  9. 15.

    Different Classifiers and Different Feature Sets

    | Feature Set | Linear SVM | KNN | RBF SVM | Dec. Tree | Rand. Forest | AdaBoost | Naïve Bayes |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | LD | 76.64% | 60.47% | 48.05% | 72.66% | 53.91% | 75.00% | 76.41% |
    | LDA | 54.53% | 52.58% | 54.69% | 54.45% | 48.91% | 54.53% | 52.89% |
    | LD+LDA | 76.72% | 60.23% | 48.05% | 72.66% | 52.34% | 75.00% | 76.41% |
    | TEXT | 85.00% | 50.00% | 47.27% | 67.27% | 52.81% | 78.91% | 68.44% |
    | LD+LDA+TEXT | 87.81% | 52.81% | 47.27% | 72.03% | 52.58% | 82.50% | 77.19% |

    + = 90% test set
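
The combined feature sets in the table merge Linked Data features with review text. The slides do not say how the feature groups were encoded or merged, so the following is only a minimal sketch of one possible approach, using the Python standard library alone. The function names, the `LD:`/`TEXT:` key prefixes, and the sample album record are all hypothetical and not taken from the presentation.

```python
from collections import Counter
import re


def text_features(review):
    """Bag-of-words counts for a review, keyed as 'TEXT:<token>'."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return Counter("TEXT:" + tok for tok in tokens)


def ld_features(genres, artist_award_count):
    """Linked Data features: one binary flag per genre plus the artist's award count."""
    feats = {"LD:genre=" + genre: 1 for genre in genres}
    feats["LD:artist_awards"] = artist_award_count
    return feats


def hybrid_features(album):
    """Merge both feature groups into a single sparse feature dict."""
    feats = dict(ld_features(album["genres"], album["artist_awards"]))
    feats.update(text_features(album["review"]))
    return feats


# Hypothetical album record, invented purely for illustration.
album = {
    "genres": ["Indie rock"],
    "artist_awards": 2,
    "review": "A brilliant, brilliant record.",
}
features = hybrid_features(album)
```

The prefixes keep the two feature groups from colliding when a review word happens to match a genre name. A dict of this shape could then be turned into a numeric matrix (for instance with scikit-learn's `DictVectorizer`) before training any of the classifiers listed in the table.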