not synthesis of constituents.
  – Not like math, where (3+5)*(9-7) = 8*2 = 16.
  – hot (熱い) + water (水) = hot water (熱い水?); the natural Japanese word is 湯, not 熱い水.
• It depends on context and/or situation.
  – e.g., 「はい」 (hai) can mean "yes" or merely "I am listening".
• It also depends on one's cultural/social background.
  – Unlike image / audio processing.
translation, MT – also called automatic translation
• source language – the input language of the translator
• target language – the output language of the translator
syntactic structure of the source language is transferred into the target language.
• semantic transfer method – the semantic representation of the source language is transferred into the target language.
• interlingua method – an interlingua, a language-independent semantic representation, is defined; every input expression in any source language is first converted into the interlingua, and the target language is then generated from it.
words in the source sentences into their corresponding target-language words without analysis.
• It may work well for similar language pairs, such as Spanish and Portuguese, or Malay and Indonesian.
• Although it is regarded as an obsolete approach, it is receiving more attention again. (explained next time)
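A minimal sketch of word-for-word replacement, assuming a tiny hand-made Spanish-to-Portuguese dictionary. The dictionary `ES_TO_PT` and the function `direct_translate` are hypothetical names for illustration, not part of any real MT system.

```python
# Toy Spanish-to-Portuguese dictionary (hypothetical, for illustration only).
ES_TO_PT = {
    "el": "o",
    "gato": "gato",
    "come": "come",
    "pescado": "peixe",
}


def direct_translate(sentence, dictionary):
    """Replace each word with its dictionary entry; no parsing, no reordering."""
    words = sentence.lower().split()
    # Words missing from the dictionary are passed through unchanged.
    return " ".join(dictionary.get(word, word) for word in words)


print(direct_translate("El gato come pescado", ES_TO_PT))  # o gato come peixe
```

Because the word order of the two languages matches in this sentence, the output is acceptable without any analysis. For language pairs with different word order, the same procedure would produce scrambled output.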
translation based on syntactic transformation parses input sentences and changes their structure according to transformation rules, before replacing words.
• Most commonly used in commercial MT systems.
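The parse, transform, replace order can be sketched as follows. This is a toy under strong assumptions: the "parser" only accepts three-word subject-verb-object sentences, there is a single reordering rule (English SVO to Japanese SOV), and the lexicon `EN_TO_JA` and all function names are hypothetical.

```python
# Toy English-to-Japanese lexicon (hypothetical, for illustration only).
EN_TO_JA = {"i": "私", "eat": "食べる", "fish": "魚"}


def parse_svo(sentence):
    """Stand-in for a real parser: assumes exactly 'subject verb object'."""
    subject, verb, obj = sentence.lower().split()
    return {"subject": subject, "verb": verb, "object": obj}


def transfer_svo_to_sov(tree):
    """Transformation rule: English SVO order becomes Japanese SOV order."""
    return [
        ("subject", tree["subject"]),
        ("object", tree["object"]),
        ("verb", tree["verb"]),
    ]


def generate(ordered, lexicon):
    """Replace words after reordering, attaching simple case particles."""
    particles = {"subject": "は", "object": "を", "verb": ""}
    return "".join(lexicon[word] + particles[role] for role, word in ordered)


def translate(sentence):
    return generate(transfer_svo_to_sov(parse_svo(sentence)), EN_TO_JA)


print(translate("I eat fish"))  # 私は魚を食べる
```

The structural change happens on the parsed representation, and word replacement comes last. A realistic system would need a full parser and many more rules.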
translation based on semantic transformation is partially introduced into commercial MT systems.
  – e.g., weather forecasts
• Full semantic analysis of input sentences is still a difficult, unsolved problem.
• Partial semantic analysis, particularly word sense disambiguation, is also realized in commercial systems.
is a method to translate twice:
  – source to interlingua;
  – and interlingua to target.
• In this sense it seems NOT to be efficient, but it becomes efficient as the number of language pairs grows.
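The efficiency argument can be checked by counting modules. The sketch assumes that direct translation needs one translator per ordered (source, target) pair, and that the interlingua method needs one analyzer and one generator per language. The function names are illustrative.

```python
def direct_modules(n):
    """One translator per ordered (source, target) pair of n languages."""
    return n * (n - 1)


def interlingua_modules(n):
    """One analyzer (to interlingua) and one generator (from it) per language."""
    return 2 * n


for n in (2, 3, 5, 10):
    print(n, direct_modules(n), interlingua_modules(n))
```

Under these assumptions the two counts are equal at three languages (6 each), and the interlingua method needs fewer modules beyond that: 10 against 20 for five languages, and 20 against 90 for ten.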
and an original semantic representation are the candidates for an interlingua.
• English – largest number of speakers in the world. Easy to develop a system since many people can use it. Fair in a sense.
• Esperanto – the most famous artificial language. Few exceptions in its grammar rules, so it is easy to implement.
• any semantic representation – we can define whatever we want.
I know) semantic representation is mostly used as the interlingua.
• There are also attempts to use English as the interlingua.
• However, all of those are still experimental.
  – impossible? unrealized dream?
  – Europeans cannot abandon the attempt, since they face serious problems in cross-language communication.
easy; it depends on culture and many other factors.
• no vagueness – direction (north/south), number, season, etc.
• different granularity – Japanese: 氷/水/湯 (ice / cold water / hot water), English: water and ice, Malay: air
• looks similar but different – English: hip/waist, Japanese: 腰 (koshi)
• culture-dependent – こたつ (kotatsu), 浴衣 (yukata), 畳 (tatami)
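The granularity problem can be made concrete with a small lookup table. This is only a sketch: the concept inventory (state and temperature of H2O) is an assumption made for illustration, and the Malay row follows the slide's simplification that "air" covers every case.

```python
# Concepts are (state, temperature) pairs for H2O. This inventory is a toy
# assumption; the Malay row follows the slide's simplification.
CONCEPTS = [("solid", "cold"), ("liquid", "cold"), ("liquid", "hot")]

LEXICON = {
    "ja": dict(zip(CONCEPTS, ["氷", "水", "湯"])),
    "en": dict(zip(CONCEPTS, ["ice", "water", "water"])),
    "ms": dict(zip(CONCEPTS, ["air", "air", "air"])),
}


def candidates(word, src, tgt):
    """Target words for every concept that the source word can denote."""
    return {LEXICON[tgt][c] for c in CONCEPTS if LEXICON[src][c] == word}


print(candidates("湯", "ja", "en"))     # one candidate: fine-to-coarse is easy
print(candidates("water", "en", "ja"))  # two candidates: 水 or 湯
print(candidates("air", "ms", "ja"))    # three candidates
```

Going from a fine-grained language to a coarse one loses information but is unambiguous. The reverse direction needs extra context to choose among the candidates, and an interlingua would have to keep the finest distinction made by any language.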
interlingua is considered to be difficult – or hopeless
• since it has to include every concept in every language
• even very minute concepts that a single language may have.
is hard to design an interlingua in general, but it is still useful in a limited domain that is independent of culture.
• In some specific tasks we do not need to consider cultural differences, so we can design a language-independent representation.
  – flight reservation, sightseeing guides, business negotiation, and so on.
• In a restricted region (such as European countries) many concepts are regarded as the same or similar, which may enable designing common concepts.
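For a narrow task such as flight reservation, a language-independent representation can be as simple as a fixed frame. The sketch below is a toy under stated assumptions: the frame `FlightRequest`, the single English pattern, and the sentence templates are all hypothetical and cover one sentence shape only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FlightRequest:
    """Language-independent frame (hypothetical) for one narrow task."""
    origin: str       # airport code
    destination: str  # airport code
    date: str         # ISO date, YYYY-MM-DD


# Analysis: a toy English pattern mapped onto the frame.
EN_PATTERN = re.compile(
    r"book a flight from (\w{3}) to (\w{3}) on (\d{4}-\d{2}-\d{2})",
    re.IGNORECASE,
)


def analyze_en(text):
    match = EN_PATTERN.search(text)
    if match is None:
        raise ValueError("sentence not covered by the toy grammar")
    origin, destination, date = match.groups()
    return FlightRequest(origin.upper(), destination.upper(), date)


# Generation: one template per target language, driven only by the frame.
def generate_ja(req):
    return f"{req.date}の{req.origin}発{req.destination}行きの便を予約したいです。"


def generate_en(req):
    return (
        f"I would like to book a flight from {req.origin} "
        f"to {req.destination} on {req.date}."
    )


frame = analyze_en("I would like to book a flight from NRT to KUL on 2024-05-01.")
print(frame)
print(generate_ja(frame))
```

Generation never looks at the source sentence, only at the frame, so adding a language means adding one analyzer and one generator. This works here because airport codes and dates carry no cultural nuance.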