A System to Solve Language Tests for Second Grade Students

Manami Saito, Satoshi Sekine, Hitoshi Isahara, and Kazuhide Yamamoto. A System to Solve Language Tests for Second Grade Students. Proceedings of The Second International Joint Conference on Natural Language Processing (IJCNLP-05), pp.45-50 (2005.10)

自然言語処理研究室

October 30, 2005

Transcript

  1. A System to Solve Language Tests for Second Grade Students

    Manami Saito (Nagaoka University of Technology), Kazuhide Yamamoto (Nagaoka University of Technology), Satoshi Sekine (Language Craft, New York University), Hitoshi Isahara (National Institute of Information and Communications Technology)
  2. Abstract

    - We built a system that solves language tests for 2nd grade students (http://languagecraft.jp).
    - Two aims:
      - To put NLP technologies into a form that ordinary people can easily observe.
      - To expose the problems of NLP technologies by lowering the level of the target materials.
  3. Question Types

    - Four major types: Chinese character (Kanji), Word knowledge, Comprehension, and Composition
    - Kanji: Reading, Writing, Order of writing, etc.
    - Word knowledge: Antonym, Synonym, Particle, Onomatopoeia, etc.
    - Comprehension: 5W1H, Fill in the blanks, Progress order of a story, etc.
  4. Kanji questions

    - Reading: use the result of morphological analysis
    - Writing: get the candidates from a Kanji dictionary, then choose the feasible one using a large corpus (38 years of newspapers, 350GB Web corpus)
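The candidate-selection step for Kanji writing questions can be illustrated with a minimal sketch. It assumes the candidates have already been looked up in a dictionary and that the "feasible" one is simply the candidate whose filled-in phrase occurs most often in the corpus. The function name `choose_kanji`, the `_` blank marker, the toy corpus, and the candidate list are all hypothetical and not taken from the authors' system.

```python
from collections import Counter


def choose_kanji(context_template, candidates, corpus):
    """Return the candidate whose filled-in phrase is most frequent in the corpus.

    context_template marks the blank with "_" (a hypothetical convention).
    Returns None when no candidate occurs at all.
    """
    counts = Counter()
    for cand in candidates:
        phrase = context_template.replace("_", cand)
        # Count raw substring occurrences of the phrase over all sentences.
        counts[cand] = sum(sentence.count(phrase) for sentence in corpus)
    best, best_count = max(counts.items(), key=lambda kv: kv[1])
    return best if best_count > 0 else None


# Toy stand-in for the large newspaper/Web corpus (assumption).
toy_corpus = ["背が高い人がいる。", "値段が高い店だ。", "山が高い。"]
# Hypothetical dictionary candidates for the reading "たか".
candidates = ["高", "鷹"]

answer = choose_kanji("背が_い", candidates, toy_corpus)
print(answer)
```

With a real corpus the counts would come from an index rather than a linear scan; the scan is kept here only to make the idea visible.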
  5. Discussion (4)

    - Reading of Kanji
    - The correct answer must be selected from multiple possible readings
    - The large corpus solved the Kanji writing questions by counting occurrences of the candidates
    - No corpus with reading annotation is available
  6. Word knowledge questions (1)

    - Most question types can be solved by using a dictionary and a large corpus (38 years of newspapers, 350GB Web corpus)
    - There are many types of word knowledge questions
  7. Word knowledge questions (2)

    Ex1. Write the opposite word.
      せが 高い。 (He is tall.) ↔ 低い (short)
      ねだんが 高い。 (Price is high.) ↔ 安い (low)
    Ex2. Choose the word that groups “horse”, “rabbit”, “monkey” and “giraffe”.
      Options: “bird”, “musical instrument”, “animal”
      Answer: animal
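A question like Ex2 can be approached with a plain dictionary lookup. The sketch below assumes a small is-a dictionary mapping each word to its category labels and picks the option that covers the most words. The dictionary contents and the function name `choose_category` are hypothetical illustrations, not the resources used in the system.

```python
def choose_category(words, options, hypernyms):
    """Pick the option that is a listed category for the most words."""
    def coverage(option):
        return sum(1 for w in words if option in hypernyms.get(w, set()))
    return max(options, key=coverage)


# Toy is-a dictionary (assumption; a real system would use a full dictionary).
toy_hypernyms = {
    "horse": {"animal"},
    "rabbit": {"animal"},
    "monkey": {"animal"},
    "giraffe": {"animal"},
    "sparrow": {"bird", "animal"},
    "piano": {"musical instrument"},
    "violin": {"musical instrument"},
}

options = ["bird", "musical instrument", "animal"]
answer = choose_category(["horse", "rabbit", "monkey", "giraffe"], options, toy_hypernyms)
print(answer)
```

Words missing from the dictionary simply contribute nothing to any option, which is one place where corpus statistics could take over.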
  8. Discussion (1)

    - Effectiveness of the large corpus
    - Very useful
    - The questions above are difficult to solve using only a dictionary
    - Should we edit the knowledge corpus?
  9. Reading comprehension (1)

    - Five typical techniques, used for different types of questions:
      (a) Pattern matching (Ex3)
      (b) Standard named entities (NE) and grammatical forms
      (c) Partial matching with keywords
      (d) Use of frequencies in the large corpus
      (e) Use of distance between keywords in question and answers
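Technique (e) can be sketched as follows, assuming the story is already tokenized and that the best answer candidate is the one closest (in token positions) to a keyword from the question. The function names, the English toy story, and the tokenization are hypothetical; the slides do not specify how distance is measured.

```python
def keyword_distance(tokens, keywords, candidate):
    """Smallest token distance between any occurrence of candidate and any keyword."""
    cand_positions = [i for i, t in enumerate(tokens) if t == candidate]
    key_positions = [i for i, t in enumerate(tokens) if t in keywords]
    if not cand_positions or not key_positions:
        return float("inf")  # candidate or keywords absent from the story
    return min(abs(c - k) for c in cand_positions for k in key_positions)


def choose_by_distance(tokens, keywords, candidates):
    """Return the candidate nearest to the question keywords."""
    return min(candidates, key=lambda c: keyword_distance(tokens, keywords, c))


# Toy story (assumption), whitespace-tokenized.
story = "the fox found an apple in the morning and the crow found a coin".split()

# Question: "What did the crow find?" -> keyword "crow"
print(choose_by_distance(story, {"crow"}, ["apple", "coin"]))
# Question: "What did the fox find?" -> keyword "fox"
print(choose_by_distance(story, {"fox"}, ["apple", "coin"]))
```

Ties and repeated keywords (here "found" appears twice) are the obvious weak points of a pure distance heuristic, which is presumably why it is only one of several techniques.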
  10. Reading comprehension (2)

    Ex3. Fill in the blanks in the expression.
      Story: 二、三日たつと、その花は しぼんで、だんだん 黒っぽい 色に かわって いきます。 (In a few days, the flower withers and gradually changes its color to black.)
      Expression: 花は (1), (2) 色に かわる。 (The flower (1) and changes its color to (2).)
      Answer: (1) しぼんで (withers) (2) 黒っぽい (black)
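The pattern-matching technique for Ex3 can be sketched with a regular expression. The Japanese story sentence below is written out as an assumption based on the slide's English gloss, and the pattern is hand-written for this one expression. The function name `fill_blanks` and the group names are hypothetical, and the actual system's patterns are not described in the slides.

```python
import re

# Story sentence for Ex3 (assumed reading of the slide text).
story = "二、三日たつと、その花は しぼんで、だんだん 黒っぽい 色に かわって いきます。"

# Expression "花は (1), (2) 色に かわる。" turned into a pattern:
# blank 1 is the word right after "花は", blank 2 is the word right before "色に".
pattern = re.compile(r"花は\s*(?P<blank1>[^、\s]+)、.*?(?P<blank2>\S+)\s*色に")


def fill_blanks(text):
    """Return the two blanks as a dict, or None if the pattern does not match."""
    m = pattern.search(text)
    if m is None:
        return None
    return {"(1)": m.group("blank1"), "(2)": m.group("blank2")}


print(fill_blanks(story))
```

The lazy `.*?` lets the pattern skip the adverb だんだん, which appears in the story but not in the expression.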
  11. Discussion (2)

    - World knowledge
    - Necessary even at the 2nd grade level, e.g. “A student enters junior high school after graduating from elementary school”, “A person becomes happy if he receives something nice from someone”
  12. Discussion (3)

    - Difference between Reading comprehension and Question answering
    - Named Entity categories need to be flexible
      - “Who” = “raccoon dog behind our house”, “the moon”
    - The answer can be a clause
      - “When” = “the time when new leaves are growing on a branch”
  13. Evaluation

    - Training data: a published language test book for 2nd grade students
    - Test data
      - “Targeted questions” are the questions for which a subsystem was prepared
      - “All questions” are the targeted questions plus the questions that this system cannot even attempt
    - Rate of the questions targeted by this system: Kanji 97.4%, Word knowledge 57.1%, Reading comprehension 64.7%
  14. Results on test data

    We made 47 kinds of subsystems to solve 90% of the questions.

    [Table: number of questions in total, number of known-type questions, number of correct answers, and RCA [%] over known-type and all questions, with rows Kanji, Word Knowledge, Reading Comprehension, and Total; the cell values are not recoverable from the transcript.]

    *RCA: Rate of Correct Answer
  15. Discussion (5)

    - Recognition and classification of questions
      - About 100 minor types
      - Is this classification right? Are there any unknown types?
      - We can't solve questions of an unknown type
      - Reclassification is under review
    - Accuracy of the classification program
      - We are building a system that classifies the questions automatically