Natural Language Processing (7) Discourse

自然言語処理研究室

November 01, 2013

Transcript

  1. 1 / 24 Natural Language Processing (7) Discourse

    Kazuhide Yamamoto, Dept. of Electrical Engineering, Nagaoka University of Technology
  2. 3 / 24 Discourse

    • Discourse is an extended sequence of sentences produced by one or more people.
    • Significance: it is not enough to analyze a text sentence by sentence (lexically, syntactically, and semantically), for the following reasons:
      – Relations between sentences must be recognized.
      – A sentence cannot be analyzed if it contains anaphora or ellipsis; information from other sentences is required to resolve such problems.
  3. 4 / 24 Characteristics of text

    • A text is a meaningful collection of sentences; sentences merely gathered from somewhere do not form a text.
    • Phenomena that make a text meaningful include:
      – conjunction, e.g., hence, however, ...
      – pronouns and reference words, e.g., this, that, so, ...
      – ellipsis
      – repetition (of words, phrases, ...)
      – lexical relation
      – comparison
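
Some of the phenomena listed on this slide can be spotted at the surface level. The following is a minimal sketch, not a method from the lecture: the word lists contain only the slide's own examples, the length-based test for repetition is a crude assumption, and the second example sentence is hypothetical.

```python
import re

# Illustrative word lists holding only the slide's examples; real inventories are far larger.
CONJUNCTIONS = {"hence", "however"}
REFERENCE_WORDS = {"this", "that", "so"}

def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())

def cohesion_markers(previous, current):
    """Return surface cohesion markers that link `current` to `previous`."""
    prev_tokens = set(tokenize(previous))
    cur_tokens = set(tokenize(current))
    return {
        "conjunctions": sorted(cur_tokens & CONJUNCTIONS),
        "reference_words": sorted(cur_tokens & REFERENCE_WORDS),
        # Repetition: words longer than 3 letters that occur in both sentences
        # (a crude stand-in for "content word", assumed for this sketch).
        "repetitions": sorted(w for w in cur_tokens & prev_tokens if len(w) > 3),
    }

markers = cohesion_markers(
    "The upstairs restaurant is so good.",
    "However, that restaurant is always crowded.",  # hypothetical follow-up sentence
)
print(markers)
```

Ellipsis, lexical relation, and comparison are left out on purpose: they cannot be detected by word matching alone.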
  4. 7 / 24 Focused information and omission

    • We tend not to repeat the same information, since it sounds redundant to the hearer.
    • In discourse analysis, "focused information" refers to what we want to talk about now, that is, the focus of the talk; it is not always information that is new to the hearer.
  5. 8 / 24 Anaphora (1)

    "The upstairs restaurant is so good. It's heaven."
    • antecedent / 先行詞: the upstairs restaurant
    • anaphor / 照応詞: it
    "Do you know the legend? A student eats three set meals in the upstairs restaurant."
    • cataphora / 後方照応 vs. anaphora / 前方照応
    "Could you pass me the salt, please?"
    • exophora / 外界照応 vs. endophora / 文脈照応
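  6. 9 / 24 Anaphora (2)

    • noun anaphora / 名詞照応
      – ある男がいた。男は杉田研の学生である。 (There was a man. The man is a student of the Sugita lab.)
    • pronoun anaphora / 代名詞照応
      – ある男がいた。その男は圓道研の学生である。 (There was a man. That man is a student of the Endo lab.)
    • ellipsis = zero anaphora / ゼロ代名詞照応
      – ある男がいた。(φは)坪根研の学生である。 (There was a man. (φ) is a student of the Tsubone lab.)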
  7. 10 / 24 Anaphora (3)

    • direct anaphora
      – the antecedent is found within the text.
    • indirect anaphora
      – the antecedent is indeed within the text (unlike exophora), but no expression in the text indicates it explicitly.
      – "I sold a car." "What will you do with it?"
      – "The mouse intends to surprise the cat, but the cat doesn't think so."
  8. 11 / 24 Anaphora resolution: example

    「ソニーが新しいパソコンを発表。高性能CPUを搭載、夏に発売予定。」
    Sony has announced the release of a new PC. It makes a PC with a high-end CPU, and will sell it this summer.
    Q: What is "it"?
    Candidates: Sony, (new) PC
    Semantic constraint: what makes a PC is a [company]
    A: It's Sony.
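
The semantic-constraint step in this example can be sketched as a filter over the candidate antecedents. This is a minimal illustration only: the lexicon `SEMANTIC_CLASS`, the table `SUBJECT_CONSTRAINT`, and the function `resolve_subject` are hypothetical names introduced here, and the class label "product" for the PC is an assumption.

```python
# Hypothetical toy lexicon: the semantic class of each candidate antecedent.
SEMANTIC_CLASS = {"Sony": "company", "PC": "product"}

# Hypothetical selectional constraint from the slide: whatever makes a PC is a company.
SUBJECT_CONSTRAINT = {"make": "company"}

def resolve_subject(verb, candidates):
    """Keep only the candidates whose semantic class the verb accepts as its subject."""
    required = SUBJECT_CONSTRAINT[verb]
    return [c for c in candidates if SEMANTIC_CLASS.get(c) == required]

answer = resolve_subject("make", ["Sony", "PC"])
print(answer)  # ['Sony']
```

If several candidates pass the filter, the function returns all of them, so a caller can tell an ambiguous case from a resolved one.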
  9. 13 / 24 Anaphora resolution

    • 拓哉のレポートを静香に見せた。彼は後悔した。
      I showed Mr. Takuya's report to Ms. Shizuka. He felt regret.
    • 拓哉はレポートを慎吾に見せた。彼は後悔した。
      Mr. Takuya showed his report to Mr. Shingo. He felt regret.
  10. 14 / 24 Anaphora resolution

    • 拓哉のレポートを静香に見せた。彼は後悔した。
      I showed Mr. Takuya's report to Ms. Shizuka. He felt regret.
    • 拓哉はレポートを慎吾に見せた。彼は後悔した。
      Mr. Takuya showed his report to Mr. Shingo. He felt regret.
    • 拓哉のレポートを慎吾に見せた。彼は後悔した。
      I showed Mr. Takuya's report to Mr. Shingo. He felt regret.
  11. 15 / 24 Anaphora resolution

    • 拓哉のレポートを静香に見せた。彼は後悔した。
      I showed Mr. Takuya's report to Ms. Shizuka. He felt regret.
    • 拓哉はレポートを慎吾に見せた。彼は後悔した。
      Mr. Takuya showed his report to Mr. Shingo. He felt regret.
    • 拓哉のレポートを慎吾に見せた。彼は後悔した。
      I showed Mr. Takuya's report to Mr. Shingo. He felt regret.
    • 拓哉のレポートを慎吾が見た。彼は後悔した。
      Mr. Shingo saw Mr. Takuya's report. He felt regret.
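
One constraint that separates these examples is gender agreement between the pronoun and its candidates. The sketch below is an assumption about how such a filter might look, not the lecture's method: the `Mention` type and `candidates_for_he` are hypothetical, and gender is taken from the Mr./Ms. honorifics in the English glosses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    name: str
    gender: str  # "m" or "f"; assumed known from the honorifics Mr./Ms.

def candidates_for_he(mentions):
    """Return the names of the third-person mentions that agree in gender with 'he'."""
    return [m.name for m in mentions if m.gender == "m"]

takuya = Mention("Takuya", "m")
shizuka = Mention("Shizuka", "f")
shingo = Mention("Shingo", "m")

# First example: Takuya and Shizuka are mentioned; only Takuya agrees with "he".
first = candidates_for_he([takuya, shizuka])
# Third example: Takuya and Shingo are mentioned; both agree, so "he" stays ambiguous.
third = candidates_for_he([takuya, shingo])
print(first)  # ['Takuya']
print(third)  # ['Takuya', 'Shingo']
```

Gender agreement alone settles only the first example. The others leave two male candidates, so a resolver would need further cues such as grammatical role, which this sketch does not attempt.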
  12. 16 / 24 Difficult examples

    • 「太郎が自分の車で帰省した。次郎もそうした。」
      Taro went home in his car. Jiro did so, too.
    • 「名前は?」「さつきです」「5月生まれですか?」「いいえ9月です」「それって詐欺みたい」
      Your name? / May. / Oh, were you born in May? / No, September. / That's puzzling.
    • 「私は風邪を引いたので学校を休んだ」
      「私が風邪を引いたので学校を休んだ」
      I caught a cold, so I was absent from school.
  13. 18 / 24 Dialogue and its processing

    Dialogue is an exchange of conversation between two or more agents.
    • Not necessarily spoken; computer chat is also a dialogue.
    • Not necessarily between humans; human-computer interaction (HCI), or even computer-computer interaction (in natural language), is also a kind of dialogue.
    Dialogue can be regarded as discourse in the sense that it is a sequence of sentences, but its nature is not the same as that of other discourse.
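  14. 19 / 24 Characteristics of (Japanese) dialogs

    Many ellipses
    • 「退学します」「本当?」「嘘だぴょん」 ("I'm dropping out." / "Really?" / "Just kidding!")
    Many anaphors
    • 「あの件はこんな感じでああやって」 ("As for that matter, do it like this, in that way.")
    Interjection
    • 「ええっと、あの、それはその、うーん」 ("Well, um, that is, er, hmm.")
    Inversion
    • 「知ってた?来週宿題があるんだって」 ("Did you know? I hear there is homework next week.")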
  15. 20 / 24 Dialogues: three properties

    Although dialogue is a kind of discourse, it differs from other kinds in:
    • turn-taking
      – conversation may have overlaps and silences.
    • grounding
      – the speaker and the hearer act collectively in order to establish common ground.
    • implicature / 含意
      – the speaker communicates more information than seems to be present in the uttered words.
  16. 21 / 24 The four Gricean maxims

    • proposed by the philosopher Paul Grice.
    • a way to explain the link between utterances and what is understood from them.
    • based on his cooperative principle:
      – "Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged."
  17. 22 / 24 Gricean maxims

    Maxim of Quantity
    • give the right amount of information.
    Maxim of Quality
    • try to make your contribution one that is true.
    Maxim of Relation
    • be relevant.
    Maxim of Manner
    • avoid obscurity and ambiguity; be brief and orderly.
  18. 23 / 24 What is "understanding"?

    On the street:
    A: Excuse me, do you have a watch?
    B: Oh, yeah, this watch is a premium one made in Switzerland. Glad you know how cool it is. This one is ...
    At a ticket counter:
    customer: I would like to go to Tokyo today.
    counter: Oh, that's very nice. Have a nice trip.
    Literal interpretation is not enough; in these cases, the hearer should recognize the speaker's intention.
  19. 24 / 24 Summary: today's key words

    • discourse analysis
    • characteristics of text
    • anaphora and its resolution
    • dialogue and its analysis