Upgrade to Pro — share decks privately, control downloads, hide ads and more …

【2018/09/06】 Semi-Stacking for Semi-supervised Sentiment Classification

756fcabd5aabf52ab37e9ac247294c07?s=47 vhqviet
September 06, 2018
36

【2018/09/06】 Semi-Stacking for Semi-supervised Sentiment Classification

756fcabd5aabf52ab37e9ac247294c07?s=128

vhqviet

September 06, 2018
Tweet

Transcript

  1. Literature review: Shoushan Li | Lei Huang | Jingjing Wang

    | Guodong Zhou. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 27– 31, July 2015. Nagaoka University of Technology VO HUYNH QUOC VIET ➢ Natural Language Processing Laboratory 2018 / 09 / 06 Semi-Stacking for Semi-supervised Sentiment Classification
  2. Introduction • Various semi-supervised learning algorithms are available and have

    been shown to be successful in exploiting unlabeled data. ➡ Each algorithm has its own characteristic with different pros and cons. • For example in Li et al. (2013): • the co-training algorithm with personal and impersonal views yields better performances in: Book and Kitchen • the label propagation algorithm yields better performances in: DVD and Electronic • This paper address semi-supervised sentiment learning via semi-stacking: • Combine two or more semi-supervised learning algorithms. 2
  3. Method 3 Semi-Stacking for Semi-supervised Sentiment Classification

  4. Method 4 Semi-Stacking for Semi-supervised Sentiment Classification

  5. Method 5 Meta-learning with N-fold Cross Validation One problem of

    meta-learning is that the data size of Lun might be too small. ➡ Employ N-fold cross validation to generate more meta- samples.
  6. Experiments 6 Dataset: • The dataset contains product reviews from

    four different domains: Book, DVD, Electronics and Kitchen appliances. • Each contains 1000 positive and 1000 negative labeled reviews. ⇀ Randomly select: 100 instances: labeled data 400 instances: test data 1500 instances: unlabeled data.
  7. Experiments 7 Approaches: • Supervised learning algorithm: The maximum entropy

    (ME) classifier implemented with the public tool, Mallet Toolkits. • Semi-supervised learning algorithms: (1) self-trainingFS by Gao et al. (2014) (2) label propagation by Zhu and Ghahramani (2002) Evaluation: • Perform t-test to evaluate the significance of the performance (Yang and Liu, 1999)
  8. Result 8

  9. Conclusions 9 • This paper present a novel ensemble learning

    approach named semi-stacking to semi-supervised sentiment classification. • Semi-stacking is implemented by re-predicting the labels of the unlabeled samples with meta-learning after two or more member semi-supervised learning approaches have been performed. • Experimental evaluation in four domains demonstrates that semi-stacking outperforms both member algorithms.