Input: a Japanese sentence
• Process: morphological analysis, parsing, semantic analysis (and discourse analysis)
• The output of this analysis is the input to the generation process for the target language (English).
question.
• What time is the next train to Tokyo?
The system analyzes the question.
• Q-type: time, destination: Tokyo, condition: the earliest
The system searches for an answer.
• 1430 hrs
The system generates a sentence.
• The next train is at 1430 hrs.
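The three steps on this slide (analyze, search, generate) can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the timetable data, the regular expression, and the function names `analyze`, `search`, and `generate` are hypothetical and do not come from the slides.

```python
import re

# Hypothetical timetable (assumed data): destination -> departure times as HHMM.
TIMETABLE = {"Tokyo": [1210, 1430, 1655], "Osaka": [1300, 1515]}


def analyze(question):
    """Turn the question into a small frame like the one on the slide."""
    match = re.search(r"next train to (\w+)", question)
    if match is None:
        raise ValueError("unsupported question")
    return {"q_type": "time", "destination": match.group(1), "condition": "earliest"}


def search(frame, now):
    """Find the earliest departure after `now` for the requested destination."""
    later = [t for t in TIMETABLE[frame["destination"]] if t > now]
    return min(later) if later else None


def generate(answer):
    """Fill the answer into a fixed sentence pattern."""
    if answer is None:
        return "There are no more trains today."
    return f"The next train is at {answer:04d} hrs."


def answer_question(question, now):
    return generate(search(analyze(question), now))


print(answer_question("What time is the next train to Tokyo?", now=1400))
```

Only the last function is generation in the sense of these slides; the first two produce the non-linguistic input that generation starts from.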
given.
• The system analyzes all the sentences and generates a semantic representation.
• It then selects the important parts of the semantic representation.
• Finally, it generates sentences.
However, current summarizers do not actually implement the process above.
Case frames, semantic networks, and other semantic representations
– These are sometimes used in machine translation.
• Non-linguistic data
– stock prices
– baseball scores
– weather forecasts
sentences can be generated from the same semantic network.
• How do we select one out of many?
• What is the difference between them?
• What affects the generation result?
ago an experiment was conducted that generated sentences at random to verify hand-made syntax rules.
• But this gives us no way to proceed when a generated sentence is ungrammatical.
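Random generation from syntax rules can be sketched as follows. The toy grammar, its vocabulary, and the function name `generate` are assumptions made for illustration; they are not the rules used in the experiment mentioned on the slide.

```python
import random

# Toy hand-made rules (assumed): each nonterminal maps to its possible expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["dog"], ["train"]],
    "V": [["sees"], ["sleeps"]],
}


def generate(symbol="S", rng=random):
    """Expand `symbol` by choosing one rule at random, recursively."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    words = []
    for child in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(child, rng))
    return words


for _ in range(5):
    print(" ".join(generate()))
```

Because these rules let any verb take an object, a run may produce a string such as "the dog sleeps a train". Reading the random output exposes the flaw, but the generator itself cannot tell which rule to repair.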
a sentence is to provide a template and fill in its slots to produce the output. This is called template generation, and it is widely used for fixed expressions.
N時N分発、(のぞみ|ひかり|こだま)N号はN番線から発車します。自由席は、N号車からN号車まで、...
(Departing at N:N, the (Nozomi|Hikari|Kodama) No. N leaves from track N. Non-reserved seats are in cars N through N, ...)
N error is found.
– The sentence changes depending on whether N is 1 or not!
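The "N error is found" problem can be made concrete with a short sketch. The function names `naive_message` and `message` are hypothetical; the point is only that a bare template needs an extra rule for number agreement.

```python
def naive_message(n):
    """Pure slot filling: correct only when n == 1."""
    return f"{n} error is found."


def message(n):
    """Slot filling plus a number-agreement rule for the noun and the verb."""
    if n == 1:
        return "1 error is found."
    return f"{n} errors are found."


for n in (1, 3):
    print(naive_message(n), "|", message(n))
```

Each such rule is simple, but every slot whose value affects the surrounding words needs one, which is where plain templates begin to strain.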
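different referential expressions that depend on the situation of the entity.
「.....本を貸してください」 ("Please lend me the ..... book.")
• When there is only one book in front of you
• When there is a thin book and a thick book
• When the thick book is a cookbook
• When there is only one book on the table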
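One way to think about choosing the words that fill "....." is to add modifiers only until the intended book can no longer be confused with the others. The sketch below is a simple greedy attribute selection under assumed data: the attribute names, the attribute order, and the function name `refer` are all hypothetical.

```python
def refer(target, distractors, attribute_order=("thickness", "topic")):
    """Return a noun phrase that singles out `target` among `distractors`."""
    modifiers = []
    remaining = list(distractors)
    for attr in attribute_order:
        if not remaining:
            break  # already unambiguous, so stop adding modifiers
        value = target[attr]
        still_confusable = [d for d in remaining if d[attr] == value]
        if len(still_confusable) < len(remaining):
            modifiers.append(value)  # this attribute rules out some distractor
            remaining = still_confusable
    return " ".join(["the", *modifiers, "book"])


thick_cookbook = {"thickness": "thick", "topic": "cooking"}
thin_novel = {"thickness": "thin", "topic": "fiction"}
thick_novel = {"thickness": "thick", "topic": "fiction"}

print(refer(thick_cookbook, []))             # only one book present
print(refer(thick_cookbook, [thin_novel]))   # a thin and a thick book
print(refer(thick_cookbook, [thick_novel]))  # two thick books
```

The same entity receives a different expression in each situation, which is the dependence on context that the slide describes.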
noun sounds redundant, so some occurrences are replaced by pronouns or definite articles, and some are deleted (as zero anaphora).
Taro says Taro and Taro's friend went back to Taro's home by Taro's car.
– Absolutely unnatural; instead of Taro, we use he (his), the, or nothing (= ellipsis).
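A very rough sketch of this replacement keeps the first mention of the name and turns every later mention into a pronoun. It assumes a single male referent and does not decide between a pronoun, "the", and ellipsis; the function name `pronominalize` is hypothetical.

```python
import re


def pronominalize(sentence, name, subject="he", possessive="his"):
    """Keep the first mention of `name`; replace later ones with pronouns."""
    seen = 0

    def replace(match):
        nonlocal seen
        seen += 1
        if seen == 1:
            return match.group(0)  # first mention stays as the full name
        # group(1) is "'s" for a possessive mention, otherwise None
        return possessive if match.group(1) else subject

    return re.sub(rf"\b{re.escape(name)}('s)?", replace, sentence)


text = "Taro says Taro and Taro's friend went back to Taro's home by Taro's car."
print(pronominalize(text, "Taro"))
```

With several entities in the text, blind replacement like this quickly becomes ambiguous, so a real generator has to track which referent each pronoun would pick out.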
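same, but they feel different to us. It is not clear (and thus a big problem for researchers) what makes them feel different.
「太郎は花を買った。その花はバラである」 ("Taro bought a flower. That flower is a rose.")
「太郎は花を買った。花はバラである」 ("Taro bought a flower. The flower is a rose.")
「太郎は花を買った。それはバラである」 ("Taro bought a flower. It is a rose.")
「太郎は花を買った。バラである」 ("Taro bought a flower. (It) is a rose.", with zero anaphora)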
problem. Consider the examples here and choose the appropriate one. Think about why you chose it.
I was bitten by a dog of the Suzuki family.
• I was bitten by an animal.
• I was bitten by a mammal.
• I was bitten by a beast.
• I was bitten by a puppy.
• I was bitten by a West Highland White Terrier.
used to replace an expression. The result may be logically strange, or it may sound odd in terms of style or formality.
Last night I went to bed early.
The night of yesterday I went to bed early.
• It depends on the situation and the environment.
• Paraphrasing is not always possible.
– Last night means the night of yesterday.
– 「お誕生日に食事(ごはん|めし)に招待された」 ("I was invited to a meal for a birthday"; 食事, ごはん, and めし all mean "meal" but differ in formality)
we emphasize the subordinate first. Causal relations should be considered here.
e.g.
S1: He caught a cold.
S2: He is absent.
S1+S2:
He caught a cold, so he is absent.
He is absent since he caught a cold.
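Combining S1 and S2 under a causal relation can be sketched as below. The function name `aggregate` and the `order` parameter are hypothetical, and lowercasing the first letter of the second clause assumes that it does not begin with a proper noun.

```python
def _clause(sentence):
    """Drop the final period so the sentence can be embedded as a clause."""
    return sentence.rstrip(".")


def _lower_first(text):
    return text[0].lower() + text[1:]


def aggregate(cause, effect, order="cause_first"):
    """Join a cause sentence and an effect sentence into one sentence."""
    c, e = _clause(cause), _clause(effect)
    if order == "cause_first":
        return f"{c}, so {_lower_first(e)}."
    return f"{e} since {_lower_first(c)}."


s1 = "He caught a cold."
s2 = "He is absent."
print(aggregate(s1, s2))
print(aggregate(s1, s2, order="effect_first"))
```

The two outputs carry the same content; which one is chosen depends on which clause should receive the emphasis.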
to take viewpoint into account when generating an expression.
e.g. Football: Japan vs. Korea
• Japan won last night thanks to Honda.
• Last night Korea was beaten by Honda.
• It is no surprise that Japan performed so well.
• Japan was lucky since Korea just happened to be in bad condition.
no strong constraint;
– In language analysis there should be one answer, while in language generation many answers are possible.
• It depends on the application or the domain.
– A "general" language generation application is difficult to conceive.
• It is difficult to evaluate;
– It is hard to define good generation in engineering terms.