
Automated Essay Scoring with Discourse-Aware Neural Models

youichiro
January 27, 2020


Nagaoka University of Technology
Natural Language Processing Laboratory
Paper reading (2020-01-28)
Automated Essay Scoring with Discourse-Aware Neural Models
https://www.aclweb.org/anthology/W19-4450.pdf

Transcript

  1. Automated Essay Scoring with Discourse-Aware Neural Models
     Paper reading (2020-01-28)
     Nagaoka University of Technology, Natural Language Processing Laboratory
     Youichiro Ogawa
     Farah Nadeem, Huy Nguyen, Yang Liu, and Mari Ostendorf
     BEA Workshop 2019, pages 484-493
  2. Automated Essay Scoring
     Prompt: More and more people use computers, but not everyone agrees
     that this benefits society. … Write a letter to your local newspaper
     in which you state your opinion on the effects computers have on people.
     Essay: Dear local newspaper, I think effects computers have on people
     are great learning skills/affects because they give us time to chat
     with friends/new people, helps us learn about the globe(astronomy)
     and keeps us out of troble! … Thank you for listening.
     Score: 4 (scale 1-6)
     Automated essay scoring: the task of automatically assigning a score
     to an essay.
  3. Related Work
     The progression so far:
     • feature-based
       ◦ features such as length, N-grams, word categories, readability,
         syntax, and semantics
       ◦ score prediction with linear regression
     • LSTM-based
       ◦ performance on par with or better than feature-based systems
     • NN + features (Liu et al., 2019)
       ◦ the state-of-the-art model
  4. Related Work
     The progression so far:
     • feature-based
       ◦ features such as length, N-grams, word categories, readability,
         syntax, and semantics
       ◦ score prediction with linear regression
     • LSTM-based
       ◦ performance on par with or better than feature-based systems
     • NN + features (Liu et al., 2019)
       ◦ the state-of-the-art model
     This work:
     • extends the architecture to the document level
     • adds pre-training to make the model discourse-aware
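The feature-based line of work above can be illustrated with a minimal sketch: a few toy hand-crafted features (length, mean word length, type-token ratio) stand in for the much richer feature sets used in the cited systems, and a least-squares linear regression predicts the score. All features, data, and names here are illustrative assumptions, not the actual systems.

```python
import numpy as np

def essay_features(text):
    """Toy hand-crafted features: essay length, mean word length,
    type-token ratio (stand-ins for richer feature-based AES features)."""
    words = text.split()
    n = len(words)
    return np.array([
        n,                               # essay length
        sum(len(w) for w in words) / n,  # mean word length
        len(set(words)) / n,             # type-token ratio
    ])

# Toy (essay, gold score) pairs.
essays = [
    ("Dear newspaper I think computers are great because they help us learn", 4),
    ("computers bad", 1),
    ("Computers let us chat with friends and learn about the globe every day", 5),
    ("I do not like computers they are boring", 2),
]
X = np.array([essay_features(t) for t, _ in essays])
X = np.hstack([X, np.ones((len(X), 1))])      # bias column
y = np.array([s for _, s in essays], dtype=float)

# Score prediction with linear regression, as in the feature-based baselines.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ w
```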
  5. Bidirectional context with attention (BCA) [Nadeem and Ostendorf, 2018]
     Extends a hierarchical RNN to capture dependencies with both the
     preceding and the following sentences. In addition to the hidden
     representation h_t at time t, attention over the word sequences of
     the previous and next sentences is computed, and the resulting
     attention-weighted average vectors are concatenated.
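The attention step described above can be sketched as follows. Dot-product attention is used here as one illustrative scoring choice (the paper's exact attention parameterization is not reproduced), and all dimensions and inputs are toy values.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def bca_step(h_t, prev_words, next_words):
    """Sketch of one BCA step: given the hidden state h_t for the
    current word, attend over the word representations of the previous
    and next sentences and concatenate the attention-weighted averages."""
    ctxs = []
    for words in (prev_words, next_words):
        scores = words @ h_t       # dot-product attention (illustrative)
        alpha = softmax(scores)    # attention weights over words
        ctxs.append(alpha @ words) # weighted average vector
    return np.concatenate([h_t] + ctxs)

d = 8
h_t = np.random.randn(d)
prev_words = np.random.randn(5, d)  # 5 words in the previous sentence
next_words = np.random.randn(7, d)  # 7 words in the next sentence
out = bca_step(h_t, prev_words, next_words)  # concatenated vector, size 3*d
```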
  6. Pre-training Tasks
     Pre-training on related tasks to cope with the shortage of training data.
     • Natural language inference (NLI)
       ◦ predict the relation between two sentences (contradiction,
         entailment, neutral)
     • Discourse marker prediction (DM)
       ◦ predict the discourse marker connecting two sentences
       ◦ (e.g., however, in other words, meanwhile)
     • BERT embeddings
       ◦ used as contextualized word embeddings
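DM training data can be harvested automatically from raw text. The sketch below shows one simple, assumed way to build (sentence1, sentence2, marker) examples by stripping a leading discourse marker from the second sentence; the marker list and the helper name `dm_pairs` are illustrative, not the paper's procedure.

```python
# Illustrative marker inventory (the paper's full set is larger).
MARKERS = ["however", "in other words", "meanwhile"]

def dm_pairs(sentence_pairs):
    """Turn adjacent sentence pairs into (context, continuation, marker)
    training examples by stripping a leading discourse marker from the
    second sentence; pairs without a marker are skipped."""
    examples = []
    for s1, s2 in sentence_pairs:
        low = s2.lower()
        for m in MARKERS:
            if low.startswith(m):
                examples.append((s1, s2[len(m):].lstrip(" ,"), m))
                break
    return examples

pairs = [
    ("The model is accurate.", "However, it is slow to train."),
    ("He left early.", "Meanwhile, the meeting continued."),
    ("The results were mixed.", "The data was noisy."),
]
examples = dm_pairs(pairs)  # the third pair has no marker and is skipped
```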
  7. Dataset
     Non-Native Written English from the Linguistic Data Consortium (LDC)
     - TOEFL essays annotated with [high, medium, low]
     Automated Student Assessment Prize (ASAP)
     - a dataset from a Kaggle competition
  8. Results on LDC TOEFL corpus
     Model settings:
     (1) trained on LDC (essay data)
     (2) pre-training on either NLI or DM, then (1)
     (3) pre-training on both NLI and DM, then (1)
     (4) (1) with BERT embeddings as input
     • BCA, which captures relations with the preceding and following
       sentences, outperforms HAN
     • the contribution of the NLI task is small
     • BERT-BCA performs best
  9. References
     • [Yang et al., 2016]
       ◦ Hierarchical Attention Networks for Document Classification
       ◦ Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola,
         Eduard Hovy
       ◦ NAACL-HLT 2016, pages 1480-1489.
     • [Nadeem and Ostendorf, 2018]
       ◦ Estimating Linguistic Complexity for Science Texts
       ◦ Farah Nadeem, Mari Ostendorf
       ◦ BEA Workshop 2018, pages 45-55.
  10. Additions
      QWK
      - https://ktrw.hatenablog.com/entry/2019/05/03/005011
      Automated Essay Scoring: A Survey of the State of the Art
      - https://www.ijcai.org/Proceedings/2019/0879.pdf
      - https://qiita.com/r-takahama/items/8f87aa1425cabb5d9a26
      Kaggle
      - https://www.kaggle.com/c/asap-aes
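QWK (quadratic weighted kappa), linked above, is the standard agreement metric for ASAP-style essay scoring: it measures rater agreement with a penalty that grows quadratically with the score difference. A minimal sketch, with illustrative toy scores:

```python
import numpy as np

def quadratic_weighted_kappa(a, b, min_r, max_r):
    """Quadratic weighted kappa between two integer ratings on [min_r, max_r]."""
    a = np.asarray(a) - min_r
    b = np.asarray(b) - min_r
    n = max_r - min_r + 1
    O = np.zeros((n, n))                      # observed agreement matrix
    for i, j in zip(a, b):
        O[i, j] += 1
    # Quadratic disagreement weights, zero on the diagonal.
    W = np.array([[(i - j) ** 2 for j in range(n)] for i in range(n)]) / (n - 1) ** 2
    # Expected matrix from the marginals, scaled to the same total count.
    E = np.outer(np.bincount(a, minlength=n), np.bincount(b, minlength=n))
    E = E / E.sum() * O.sum()
    return 1 - (W * O).sum() / (W * E).sum()

human = [4, 3, 5, 2, 4, 3]   # toy gold scores on a 1-6 scale
model = [4, 3, 4, 2, 5, 3]   # toy predicted scores
qwk = quadratic_weighted_kappa(human, model, 1, 6)
```

Perfect agreement yields 1.0, chance-level agreement yields 0, and larger score gaps are penalized more heavily than adjacent ones.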