Lifelong Learning CRF for Supervised Aspect Extraction

vhqviet

September 28, 2018

Transcript

  1. Literature review: Lei Shu, Hu Xu, Bing Liu. "Lifelong Learning CRF for Supervised Aspect Extraction."

    ACL 2017, 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Volume 2: Short Papers), pages 148-154, 2017.
    Presenter: VO HUYNH QUOC VIET, Natural Language Processing Laboratory, Nagaoka University of Technology, 2018/09/28.
  2. Introduction

    • Aspect Extraction: identify aspect terms in customer reviews.
      • For example: "The battery of this camera is great" → aspect: "battery".
    • Supervised sequence labeling using a Conditional Random Field (CRF).
    • Lifelong Machine Learning (LML): used to help the supervised extraction method markedly improve its results.
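To make "supervised sequence labeling" concrete, the example sentence can be tagged token by token. The BIO tag scheme below is a common convention for CRF-based extraction and is an illustrative assumption; the slide does not show the paper's exact tag set.

```python
# Aspect extraction as sequence labeling: each token gets a tag,
# B/I marking aspect words and O everything else (illustrative tag
# scheme, not necessarily the one used in the paper).
sentence = ["The", "battery", "of", "this", "camera", "is", "great"]
tags     = ["O",   "B",       "O",  "O",    "O",      "O",  "O"]

# Reading the aspects back out of the tag sequence:
aspects = [w for w, t in zip(sentence, tags) if t in ("B", "I")]
print(aspects)  # → ['battery']
```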
  3. Method: Key idea

    • Assume the model has been applied to many domains d1, d2, ..., dn and has produced their extraction result sets r1, r2, ..., rn.
    • Then apply the model to the new domain dn+1:
      • Use Reliable Results (R) drawn from r1, r2, ..., rn to generate features for dn+1.
      • A result is reliable if it is highly frequent in one domain, or common across multiple past domains.
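The two reliability criteria above can be sketched as a small function. The thresholds and the Counter-based representation of past results are illustrative assumptions, not values taken from the paper.

```python
from collections import Counter

def reliable_aspects(past_results, freq_threshold=3, domain_threshold=2):
    """Build the reliable set R from past extraction results r1..rn.

    An aspect counts as reliable if it was extracted frequently within
    a single domain, or appears across several past domains.  The
    thresholds here are hypothetical, not from the paper.
    """
    reliable = set()
    domain_count = Counter()            # how many domains each aspect appears in
    for result in past_results:         # result: Counter of aspect -> frequency
        for aspect, freq in result.items():
            domain_count[aspect] += 1
            if freq >= freq_threshold:  # highly frequent in one domain
                reliable.add(aspect)
    # common across multiple past domains
    reliable |= {a for a, c in domain_count.items() if c >= domain_threshold}
    return reliable

past = [Counter({"battery": 5, "lens": 1}),
        Counter({"battery": 2, "screen": 4}),
        Counter({"screen": 1, "battery": 2})]
print(reliable_aspects(past))  # "battery" and "screen" qualify, "lens" does not
```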
  4. Method: Dependency feature

    • Example: apply the model to a new domain dn+1, "Phone".
    • Leveraging the past reliable result "battery":
      • "phone" in "the battery of this phone" is likely to be predicted as an aspect, since "battery" is known to be a reliable aspect.
    • Without any past reliable result:
      • "signal" and "phone" in "The signal of this phone" are unlikely to be predicted as aspects, since neither "signal" nor "phone" is a reliable aspect.
  5. Method: Dependency feature

    • A dependency relation is a tuple: (type, gov word, gov POS tag, dep word, dep POS tag).
    • Generalize the dependency relation and link it with R (the reliable result set): if a word is in R, replace it with the label "A"; otherwise use the label "O".
    • Example, for the sentence "The battery of this phone is great" (battery is in R):
      • Step 1: dependency relation: (nmod, battery, NN, phone, NN)
      • Step 2: replace the current word's info with the wildcard "*": (nmod, battery, NN, *)
      • Step 3: replace the related word's info with its label: (nmod, A, NN, *) → fed to the learning algorithm
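The three generalization steps above can be sketched as a single function over relation tuples. The function name and tuple layout follow the slide; the handling of which side is "current" vs. "related" is an assumption drawn from the example.

```python
def generalize(relation, current_word, reliable):
    """Generalize a dependency relation into an R-linked feature.

    relation: (type, gov_word, gov_pos, dep_word, dep_pos)
    Step 2 masks the current word with the wildcard '*'; step 3
    replaces the related word with 'A' if it is in the reliable set R,
    otherwise 'O'.  Sketch based on the slide's example, not the
    paper's exact implementation.
    """
    rel_type, gov, gov_pos, dep, dep_pos = relation
    if current_word == dep:                 # related word is the governor
        related, related_pos = gov, gov_pos
    else:                                   # related word is the dependent
        related, related_pos = dep, dep_pos
    label = "A" if related in reliable else "O"
    return (rel_type, label, related_pos, "*")

R = {"battery"}
# "The battery of this phone", current word "phone":
print(generalize(("nmod", "battery", "NN", "phone", "NN"), "phone", R))
# → ('nmod', 'A', 'NN', '*')

# "The signal of this phone", no reliable aspect involved:
print(generalize(("nmod", "signal", "NN", "phone", "NN"), "phone", set()))
# → ('nmod', 'O', 'NN', '*')
```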
  6. Method: Lifelong-CRF (L-CRF) algorithm, which works in two phases

    • Training phase: trains a CRF model on the training data as usual.
    • Lifelong extraction phase:
      • The model has extracted aspects from data in n previous domains D1, ..., Dn, yielding the extracted aspect sets A1, ..., An.
      • The system is then faced with data from a new domain, Dn+1.
      • The model can leverage reliable prior knowledge in A1, ..., An to make a better extraction from Dn+1.
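The lifelong extraction phase can be sketched as a fixed-point loop: the trained CRF is re-run while its reliable-set features are updated from its own output, until the extraction stabilizes. The loop structure, the `min_domains` threshold, and the toy predictor below are illustrative assumptions; the slide only states that reliable knowledge from A1, ..., An is leveraged.

```python
from collections import Counter

def reliable_set(aspect_sets, min_domains=2):
    # An aspect seen in >= min_domains sets counts as reliable
    # (hypothetical criterion for this sketch).
    counts = Counter(a for s in aspect_sets for a in s)
    return {a for a, c in counts.items() if c >= min_domains}

def lifelong_extract(predict, sentences, past_aspect_sets):
    """Lifelong extraction phase (sketch): re-run the already-trained
    model, letting the R-dependent features grow with its own output,
    until the extracted set no longer changes."""
    extracted, previous = set(), None
    while extracted != previous:
        previous = extracted
        R = reliable_set(list(past_aspect_sets) + [extracted])
        extracted = predict(sentences, R)   # decode with R-based features
    return extracted

# Toy stand-in for the CRF: tags a word as an aspect whenever a
# reliable aspect modifies it (mimicking the dependency feature).
def toy_predict(sentences, R):
    out = set()
    for head, dep in sentences:             # (governor, dependent) pairs
        if dep in R:
            out.add(head)
    return out

past = [{"battery", "screen"}, {"battery", "phone"}]
sents = [("phone", "battery"), ("case", "phone")]
print(lifelong_extract(toy_predict, sents, past))
```

In the toy run, "phone" is first extracted via the reliable aspect "battery"; once "phone" itself becomes reliable, "case" is extracted on the next pass, which is the feedback the lifelong phase is meant to provide.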
  7. Experiments: Dataset

    • Labeled datasets for training/testing: reviews of 7 products, labeled with their aspects.
    • Unlabeled review datasets for LML: 50 diverse domains (1,000 reviews per domain).
  8. Experiments

    • Cross-domain: combine 6 labeled domain datasets for training and test on the 7th domain.
    • In-domain: train and test on the same domain.
    • Evaluation method: precision, recall and F1-score as evaluation measures.
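For reference, the three evaluation measures over extracted aspect sets reduce to a few lines; the example values below are made up for illustration.

```python
def prf1(predicted, gold):
    """Precision, recall and F1 over sets of extracted aspect terms."""
    tp = len(predicted & gold)                      # true positives
    p = tp / len(predicted) if predicted else 0.0   # precision
    r = tp / len(gold) if gold else 0.0             # recall
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

p, r, f1 = prf1({"battery", "screen", "case"},
                {"battery", "screen", "lens", "price"})
print(round(p, 3), round(r, 3), round(f1, 3))  # → 0.667 0.5 0.571
```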
  9. Experiments: Compared methods

    • CRF: a linear-chain CRF with all features, including the dependency feature, but without LML.
    • CRF+R: treats the reliable aspect set as a dictionary, and adds to the final result those aspects that the CRF did not extract but that appear in the test data.
    • L-CRF (this paper's model): employs LML.
  10. Conclusions

    • This paper introduced a novel idea that adds a new capability to lifelong learning: continuously improving the model's performance after training.
    • It improves the CRF model's performance by leveraging knowledge gained from the extraction results of previous domains.
    • Future work: modify the CRF so that it can consider previous extraction results as well as the knowledge in previous CRF models.