Mutsugu Kuboki and Kazuhide Yamamoto. Generation of Descriptive Elements for Text. Proceedings of 7th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE 2011), pp.56-59 (2011.11)
“LPF”: we can't recognize it immediately. The text may not describe the query. We want to know the content from the web search results. … But it is difficult.
use the following rules.
(1) Pattern: 'noun or compound noun' of 'query' (eng); 'query “no” noun or compound noun' (jpn)
    ex) enforcement of LawProtectingPersonalInformation (eng); kojinjyouhouhogohou-no-shikou (jpn)
(2) DEs are one word in Japanese.
Note: the Japanese word “no” means “of”.
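The pattern rule above can be illustrated with a minimal sketch. It assumes romanized text in which morphemes are joined by hyphens, as in the slide's example; the function name `extract_de_candidates` is hypothetical, and real Japanese text would need a morphological analyzer instead of a regular expression.

```python
import re

def extract_de_candidates(query, text):
    """Return the single words that directly follow 'query-no-' in romanized text."""
    # Assumption: morphemes are hyphen-delimited lowercase romaji, as in the example.
    # Capturing one [a-z]+ token mirrors the rule that a DE is one word.
    pattern = re.compile(re.escape(query) + r"-no-([a-z]+)")
    return pattern.findall(text)

candidates = extract_de_candidates(
    "kojinjyouhouhogohou", "kojinjyouhouhogohou-no-shikou"
)
print(candidates)
```

Here `shikou` ("enforcement") is picked up as the candidate DE for the query.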
DEs 77 (21%)
The next experiment uses 54 DEs from the adequate candidates.
The above results include many low-frequency DEs. These DEs are rejected from the candidates.
words.
(1) Extract text from the web.
    Paragraphs of DE X: Paragraph 1 = {w1, w2, w3, …}, Paragraph 2 = {w2, w3, w4, …}, …
(2) Extract co-occurrence words: {w2, w3}
(3) Collect Triggers: {w2, w3} are the Triggers of DE X.
Triggers consist of 1, 2, or 3 morphemes.
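As a rough sketch of the co-occurrence step, the words shared across paragraphs of the same DE can be collected by counting how many paragraphs each word appears in. The function name `cooccurrence_words` and the `min_paragraphs` threshold are assumptions for illustration; the slides do not state how many paragraphs a word must appear in.

```python
from collections import Counter

def cooccurrence_words(paragraphs, min_paragraphs=2):
    """Return words that occur in at least min_paragraphs of the given paragraphs."""
    counts = Counter()
    for words in paragraphs:
        # Count each word once per paragraph, not once per occurrence.
        counts.update(set(words))
    return {word for word, n in counts.items() if n >= min_paragraphs}

# Paragraphs collected for one DE X, using the placeholder words from the slide.
paragraphs_of_x = [{"w1", "w2", "w3"}, {"w2", "w3", "w4"}]
triggers = cooccurrence_words(paragraphs_of_x)
print(sorted(triggers))  # ['w2', 'w3']
```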
“query-no-DE (jpn)” from the web.
2. Extract the content words from the paragraphs.
3. Extract co-occurrence words from the paragraphs of the same DE.
→ Triggers
use a combination of the following rules.
(1) [used Trigger] used in the pilot test
    Effective in increasing accuracy
(2) [unused Trigger] mistakenly assigned over two
(3) [unused Trigger] used by over 2 DEs
    Leads to error (not a decisive factor)
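One possible reading of rule (3) is that a Trigger shared by too many DEs is not discriminative and is therefore left unused. The sketch below follows that reading only; the function name `drop_shared_triggers`, the `max_des` threshold, and the sample data are all hypothetical, and the slides do not specify whether "over 2" includes 2.

```python
def drop_shared_triggers(triggers_by_de, max_des=2):
    """Remove triggers that are associated with more than max_des DEs."""
    usage = {}
    for de, triggers in triggers_by_de.items():
        for trigger in triggers:
            usage.setdefault(trigger, set()).add(de)
    # Keep only the triggers shared by at most max_des DEs.
    return {
        de: {t for t in triggers if len(usage[t]) <= max_des}
        for de, triggers in triggers_by_de.items()
    }

# Hypothetical data: "cabinet" is attached to three DEs, so it is dropped.
example = {
    "operation": {"cabinet", "citizen"},
    "enforcement-status": {"cabinet", "year"},
    "description": {"cabinet", "word"},
}
filtered = drop_shared_triggers(example)
print(filtered["operation"])  # {'citizen'}
```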
know factor to decide DEs. Let's try to use stronger rules: modification relation and Triggers.
Note: the next experiment uses 19 DEs for simplicity (similar DEs are rejected from the candidates).
             Precision  p/p  p/n  n/p   n/n
All               0.31   11   24  181  1615
DE                0.67    6    3    -     -
Synonym           0.21    3   11    -     -
Hyponym           0.17    2   10    -     -
Answer data          -  192    -    -  1708

The results contain many mistakes. Are Triggers not effective for judging true or false? → Check the results.
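The reported ratios can be checked with a short calculation. This sketch assumes that the first two counts in each row (for example 11 and 24 for All) are the correct and incorrect positive outputs, and that the leading value is precision; under that reading the numbers reproduce the reported values.

```python
def precision(true_positives, false_positives):
    """Precision = correct positive outputs / all positive outputs."""
    return true_positives / (true_positives + false_positives)

# (correct positives, incorrect positives) per row, as read from the results.
counts = {"All": (11, 24), "DE": (6, 3), "Synonym": (3, 11), "Hyponym": (2, 10)}
precisions = {name: round(precision(tp, fp), 2) for name, (tp, fp) in counts.items()}
print(precisions)  # {'All': 0.31, 'DE': 0.67, 'Synonym': 0.21, 'Hyponym': 0.17}
```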
relate DE.
22/24 results are constructed from related words.
Examples: Operation (cabinet, citizen, month); Enforcement-status (announcement, cabinet, year)
The words in the Triggers relate to the DE, but the precision is low.
people. We check the n/p results (181 pairs) manually.
Factor is unclear
• 28 pairs (15%)
• In 11 pairs, the DE is “description”
• The others have low frequency.
Factor is clear
• 153 pairs (85%)
• These have specific expressions: a word, words, or a phrase.
use only part of the text to decide almost all DEs. (The text only explains the query.)
Point: don't use all of the text.
Example: …Law protecting personal information is established for fiscal year 2003…