
Language Model Based Grammatical Error Correction without Annotated Training Data

Nagaoka University of Technology
Natural Language Processing Laboratory
Literature review (2018-07-25)

youichiro

July 25, 2018

Transcript

  1. Language Model Based Grammatical Error Correction without Annotated Training Data

    Christopher Bryant and Ted Briscoe. Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 247–253, 2018. Literature review (2018-07-25), Nagaoka University of Technology, Natural Language Processing Laboratory, 小川 耀一朗
  2. Method: correction candidate sets

    - Target English error types: non-words, morphology, articles and prepositions
    - Non-words, e.g. [freind → friend]: correction candidates are generated with CyHunspell*1

    *1 https://pypi.org/project/CyHunspell/
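As a rough illustration of non-word candidate generation, the sketch below proposes in-vocabulary words that are one edit away from an unknown token. It is a stand-in, not the slide's method: the slides use CyHunspell, whereas here the vocabulary `VOCAB` and the helpers `edits1` and `nonword_candidates` are hypothetical names invented for this example.

```python
import string

# Hypothetical toy vocabulary; a real system would consult a full dictionary
# (the slides use CyHunspell for this step).
VOCAB = {"friend", "fried", "find", "field", "the", "my", "is"}


def edits1(word):
    """Return every string one delete, transpose, replace, or insert away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def nonword_candidates(word, vocab=VOCAB):
    """Return sorted in-vocabulary candidates for a non-word; known words get none."""
    if word in vocab:
        return []
    return sorted(edits1(word) & vocab)


print(nonword_candidates("freind"))
```

A dictionary-backed spell checker typically returns a ranked suggestion list rather than an unranked set, so the candidate set from a real tool may be larger than what this one-edit sketch produces.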
  3. Method: correction candidate sets

    - Morphology, e.g.
      - noun number: [cat → cats]
      - verb tense: [eat → ate]
      - adjective form: [big → bigger]
      Correction candidates are generated from the Automatically Generated Inflection Database (AGID)*2
    - Articles and prepositions
      - article: {φ, a, an, the}
      - preposition: {φ, about, at, by, for, from, in, of, on, to, with}

    *2 http://wordlist.aspell.net/other/
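The confusion sets above can be turned into candidate sentences along the following lines. This is a minimal sketch under stated assumptions: `INFLECTION_GROUPS` is a hypothetical hand-written table standing in for AGID, φ is represented by the empty string, and only substitution and deletion of an existing token are covered (inserting an article or preposition where none exists is left out).

```python
# Confusion sets from the slide; "" plays the role of φ (no word).
ARTICLES = {"", "a", "an", "the"}
PREPOSITIONS = {"", "about", "at", "by", "for", "from", "in", "of", "on", "to", "with"}

# Hypothetical toy inflection groups; the slides take these from AGID.
INFLECTION_GROUPS = [
    {"cat", "cats"},
    {"eat", "eats", "ate", "eaten", "eating"},
    {"big", "bigger", "biggest"},
]


def alternatives(token):
    """Return the replacement options for one token (may include "" for deletion)."""
    low = token.lower()
    if low in ARTICLES:
        return ARTICLES - {low}
    if low in PREPOSITIONS:
        return PREPOSITIONS - {low}
    for group in INFLECTION_GROUPS:
        if low in group:
            return group - {low}
    return set()


def candidate_sentences(tokens):
    """Return sentences that differ from `tokens` by one substitution or deletion."""
    results = []
    for i, token in enumerate(tokens):
        for alt in sorted(alternatives(token)):
            replacement = [alt] if alt else []  # "" means delete the token
            results.append(" ".join(tokens[:i] + replacement + tokens[i + 1:]))
    return results


for sentence in candidate_sentences("I eat a cat".split()):
    print(sentence)
```

Changing one token at a time keeps the candidate list small; combining edits across positions would grow it multiplicatively.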
  4. Experiment

    - Language model: a 5-gram language model trained on the One Billion Word Benchmark dataset*3 with KenLM
    - Development and test sets: CoNLL-2013, CoNLL-2014, FCE, and JFLEG

    *3 https://arxiv.org/pdf/1312.3005.pdf
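To show how a language model might choose among correction candidates, the sketch below scores sentences with a toy add-one-smoothed bigram model and keeps the highest-scoring one. It is only an assumption-laden stand-in for the 5-gram KenLM model named on the slide: the class `BigramLM`, the function `best_candidate`, and the four training sentences are all invented for illustration.

```python
import math
from collections import Counter


class BigramLM:
    """Toy bigram model with add-one smoothing (stand-in for a 5-gram KenLM model)."""

    def __init__(self, sentences):
        self.contexts = Counter()  # how often each token appears as a left context
        self.bigrams = Counter()   # how often each (previous, current) pair appears
        vocab = set()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.contexts.update(tokens[:-1])
            self.bigrams.update(zip(tokens, tokens[1:]))
            vocab.update(tokens[1:])
        self.vocab_size = len(vocab)

    def score(self, sentence):
        """Return the log10 probability of the sentence under the model."""
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.contexts[prev] + self.vocab_size
            total += math.log10(numerator / denominator)
        return total


def best_candidate(lm, candidates):
    """Return the candidate with the highest language-model score."""
    return max(candidates, key=lm.score)


# Hypothetical training data and candidates, for illustration only.
lm = BigramLM([
    "i am looking forward to the trip",
    "she is looking forward to it",
    "we went to the park",
    "he is in the park",
])
candidates = [
    "i am looking forward for the trip",
    "i am looking forward to the trip",
    "i am looking forward at the trip",
]
print(best_candidate(lm, candidates))
```

The candidates here all have the same length. When candidates differ in length, for example after deleting an article, raw log probabilities favor the shorter sentence, so some form of length normalization may be worth considering.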