Using Phrasal Patterns to Identify Discourse Relations

Manami Saito, Kazuhide Yamamoto and Satoshi Sekine. Using Phrasal Patterns to Identify Discourse Relations. Proceedings of Human Language Technology conference - North American chapter of the Association for Computational Linguistics annual meeting (HLT-NAACL 2006), Companion Volume, pp.133-136 (2006.6)

Transcript

  1. Using Phrasal Patterns to Identify Discourse Relations

    Manami Saito and Kazuhide Yamamoto (Nagaoka University of Technology), Satoshi Sekine (New York University)
  2. Overview

    - Task: identify the discourse relation between two consecutive sentences
      - out of six relations (cf. next slide)
    - Marcu and Echihabi [2002] identify discourse relations between text segments using Naïve Bayes classifiers.
      - They proposed a method using lexical pairs.
      - On top of the lexical information, we use phrasal patterns.
  3. Discourse relations

    | Discourse relation | Examples of cue phrase (English translation) | Freq. in corpus [%] |
    |---|---|---|
    | ELABORATION | and, also, then, moreover | 43.0 |
    | CONTRAST | although, but, while, however | 32.2 |
    | CAUSE-EFFECT | because, and so, thus, therefore | 12.1 |
    | EQUIVALENCE | in fact, alternatively, similarly | 6.0 |
    | CHANGE-TOPIC | by the way, and now, meanwhile | 5.1 |
    | EXAMPLE | for example, for instance | 1.5 |
    | OTHER | most of all, in general | 0.2 |
  4. Our Method: Outline

    - Use of two viewpoints
      - lexical information
      - phrasal patterns (“*should have done*.*did*.”)
    - Use phrasal patterns for the sentences that cannot be decided by lexical information
    - Calculate a score for each discourse relation and each viewpoint
    - Determine a relation by ranking the scores
  5. Identification using Lexical Information

    Clue: a pair of words in two consecutive sentences.

    Ex1)
    a. It is ideal that people all over the world accept independence and associate on an equal footing with each other.
    b. (However,) Reality is not that simple.
    CONTRAST relation
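The lexical clue can be pictured as the set of cross-sentence word pairs. The following is a minimal sketch, assuming whitespace tokenization and that every word of the first sentence is paired with every word of the second; the slides specify neither the tokenization nor the pairing used for Japanese, and `word_pairs` is a hypothetical helper name.

```python
from itertools import product


def word_pairs(first, second):
    """Return all (w1, w2) pairs with w1 from the first sentence and w2 from the second.

    Assumption: lowercase, whitespace-split tokens stand in for real
    Japanese word segmentation.
    """
    return set(product(first.lower().split(), second.lower().split()))


pairs = word_pairs("It is ideal", "Reality is not that simple")
print(sorted(pairs)[:3])
```

Each pair would then be looked up in counts gathered from the training corpus, where the relation label comes from the cue phrase (such as "However") linking the two sentences.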
  6. Score Calculation

    - Score1: fraction of a given relation among all the word pairs
    - Score2: adjustment of Score1 by average appearances

    These scores are calculated for each discourse relation.
    If a relation scores highest (among the six) on both scores, it is chosen as the relation. Otherwise, the phrasal patterns are used.
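One possible reading of Score1 and the agreement rule is sketched below. It assumes Score1 is the share of a relation among the corpus occurrences of the sentence pair's word pairs. The adjustment behind Score2 is not spelled out on the slide, so the second score is passed in as given. The names `score1`, `decide`, and `pair_counts` are hypothetical, and the counts are toy values.

```python
from collections import Counter


def score1(pairs, pair_counts):
    """Fraction of each relation among the occurrences of the given word pairs.

    pair_counts maps relation -> Counter of word pairs seen with that relation
    (an assumed data layout, not taken from the slides).
    """
    totals = {rel: sum(counts[p] for p in pairs) for rel, counts in pair_counts.items()}
    grand = sum(totals.values())
    return {rel: (t / grand if grand else 0.0) for rel, t in totals.items()}


def decide(score_a, score_b):
    """Return the relation ranked first by both scores, or None to signal fallback."""
    best_a = max(score_a, key=score_a.get)
    best_b = max(score_b, key=score_b.get)
    return best_a if best_a == best_b else None


# Toy counts for illustration only.
pair_counts = {
    "CONTRAST": Counter({("ideal", "not"): 3}),
    "ELABORATION": Counter({("ideal", "not"): 1}),
}
s1 = score1([("ideal", "not")], pair_counts)
print(s1)
print(decide(s1, {"CONTRAST": 0.6, "ELABORATION": 0.4}))
```

Returning `None` on disagreement mirrors the "Otherwise" branch of the slide: the sentence pair is handed over to the phrasal patterns.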
  7. Identification using Phrasal Pattern (1/2)

    Clue: fragments of the two sentences.

    E.g., 1st sent.: “X should have done Y”; 2nd sent.: “A did B”
    → very likely that the discourse relation is CONTRAST (89% in our Japanese corpus).

    Phrasal pattern: “*should have done*.*did*.” (*: anything)
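The wildcard notation can be illustrated with a regular expression. This is only a sketch on the English gloss shown on the slide: it assumes `*` matches any text and everything else is literal, and `pattern_to_regex` is a hypothetical helper, not the authors' matcher for Japanese.

```python
import re


def pattern_to_regex(pattern):
    """Compile a phrasal pattern where '*' matches any text and the rest is literal."""
    parts = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*".join(parts), re.DOTALL)


rx = pattern_to_regex("*should have done*.*did*.")

# Two consecutive sentences joined into one string.
matching = "You should have done it earlier. Instead you did nothing."
reversed_order = "I did it. You should have done it."

print(bool(rx.fullmatch(matching)))
print(bool(rx.fullmatch(reversed_order)))
```

The second string fails because the pattern requires the "should have done" fragment in the first sentence and the "did" fragment in the second.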
  8. Identification using Phrasal Pattern (2/2)

    How to choose phrases for a phrasal pattern?
    1. Deleting unnecessary phrases
    2. Restricting phrasal pattern
    3. Combining phrases and selecting words in a phrase

    Ex2)
    a. “kanojo-no kokoro-ni donna omoi-ga at-ta-ka-ha wakara-nai.” (No one knows what feeling she had in her mind.)
    b. “sore-ha totemo yuuki-ga iru koto-dat-ta-ni-chigai-nai.” (I think that she must have needed courage.)
  9. Score Calculation

    - Score3: fraction of a given relation among all the patterns (cf. Score1)
    - Score4: adjustment of Score3 by average appearances (cf. Score2)

    These scores are calculated for each discourse relation.
    If a relation scores highest (among the six) on both scores, it is chosen as the relation. Otherwise, discard the relations for which no pattern matches, and output the remaining relation with the highest Score2 (using word pairs).
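The decision procedure on this slide can be sketched as follows. The scores are assumed to be precomputed dictionaries from relation to value, `matched` is the set of relations with at least one matching pattern, and the behavior when no relation matches is an assumption because the slide does not cover that case. `phrasal_decision` is a hypothetical name.

```python
def phrasal_decision(score3, score4, score2, matched):
    """Pick a relation following the slide's rule for the phrasal-pattern stage.

    score2, score3, score4: dicts mapping relation -> score.
    matched: relations for which at least one phrasal pattern matched.
    """
    best3 = max(score3, key=score3.get)
    best4 = max(score4, key=score4.get)
    if best3 == best4:
        return best3
    # Discard relations with no matching pattern.
    candidates = [rel for rel in score2 if rel in matched]
    # Assumption: if nothing matched, fall back to all relations.
    if not candidates:
        candidates = list(score2)
    # Output the remaining relation with the highest word-pair Score2.
    return max(candidates, key=score2.get)


# Toy scores for illustration only.
score3 = {"CONTRAST": 0.5, "ELABORATION": 0.4, "EXAMPLE": 0.1}
score4 = {"CONTRAST": 0.3, "ELABORATION": 0.6, "EXAMPLE": 0.1}
score2 = {"CONTRAST": 0.2, "ELABORATION": 0.3, "EXAMPLE": 0.5}
result = phrasal_decision(score3, score4, score2, {"CONTRAST", "ELABORATION"})
print(result)
```

In the toy input, EXAMPLE has the highest Score2 but is dropped because no pattern matched it, so the choice falls to the best remaining relation.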
  10. Evaluation

    - Corpus: 1.3M sentence pairs from the Web + 150K pairs from newspapers.
    - Test set: 300 pairs; 50 for each relation from the Web
    - Ran two experiments
      - Using only lexical information (selected by Score2)
      - Using both lexical pairs and phrasal patterns
  11. Experimental Result

    | Discourse relation | Lexical info. only | With phrasal pattern |
    |---|---|---|
    | ELABORATION | 44% (22/50) | 52% (26/50) |
    | CONTRAST | 62% (31/50) | 86% (43/50) |
    | CAUSE-EFFECT | 56% (28/50) | 56% (28/50) |
    | EQUIVALENCE | 58% (29/50) | 58% (29/50) |
    | CHANGE-TOPIC | 66% (33/50) | 72% (36/50) |
    | EXAMPLE | 56% (28/50) | 60% (30/50) |
    | Total | 57% (171/300) | 64% (192/300) |
    | Weighted accuracy | 53% | 65% |
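The slides do not define "weighted accuracy". One reading that reproduces the reported 53% and 65% is to weight each relation's accuracy by its corpus frequency from slide 3 (excluding OTHER, which is not in the test set). The sketch below checks that reading; the weighting scheme is an assumption, and `weighted_accuracy` is a hypothetical helper.

```python
# Corpus frequencies [%] from slide 3 (OTHER omitted; it has no test pairs).
freq = {
    "ELABORATION": 43.0, "CONTRAST": 32.2, "CAUSE-EFFECT": 12.1,
    "EQUIVALENCE": 6.0, "CHANGE-TOPIC": 5.1, "EXAMPLE": 1.5,
}
# Correct answers out of 50 per relation, from the result table.
correct_lexical = {
    "ELABORATION": 22, "CONTRAST": 31, "CAUSE-EFFECT": 28,
    "EQUIVALENCE": 29, "CHANGE-TOPIC": 33, "EXAMPLE": 28,
}
correct_phrasal = {
    "ELABORATION": 26, "CONTRAST": 43, "CAUSE-EFFECT": 28,
    "EQUIVALENCE": 29, "CHANGE-TOPIC": 36, "EXAMPLE": 30,
}


def weighted_accuracy(correct, weights, per_class=50):
    """Average per-relation accuracy, weighted by corpus frequency (assumed scheme)."""
    total = sum(weights.values())
    return sum(weights[r] * correct[r] / per_class for r in weights) / total


print(round(100 * weighted_accuracy(correct_lexical, freq)))
print(round(100 * weighted_accuracy(correct_phrasal, freq)))
```

Because ELABORATION and CONTRAST together cover about three quarters of the corpus, the large CONTRAST gain (62% to 86%) dominates the weighted figure.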
  12. Conclusion

    - Identify discourse relations between two successive Japanese sentences.
    - Use of phrasal pattern information as well as lexical information.
    - Future work
      - Applying other machine learning methods, such as SVM.
      - Analyzing discourse relation categorization strategy.
      - Including a longer context beyond two sentences.