
Taichi Aida
January 10, 2020

Paper introduction: OpenKiwi: An Open Source Framework for Quality Estimation

OpenKiwi: An Open Source Framework for Quality Estimation
Fabio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, André F. T. Martins
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 117-122, Florence, July 2019.


Transcript

  1. OpenKiwi: An Open Source Framework for Quality Estimation
     Fabio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, André F. T. Martins
     Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 117-122, Florence, July 2019.
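  2. Quality Estimation (QE)
     - Word-level
       - Compare the post-edited translation (PE) with the machine translation (MT): a position that needs a word inserted (Gap tag) or a word that needs to be replaced or deleted (MT tag) is labeled BAD, and everything else is labeled OK.
       - In the source, words likely to cause an MT error are labeled BAD and everything else OK (Source tag).
     - Sentence-level
       - Compute the edit distance between PE and MT (lower is better).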
  3. Methods
     - Ensemble/stack of four existing models
       - QUality Estimation from scraTCH (QUETCH)
       - Neural Quality Estimation (NuQE)
       - Predictor-Estimator
       - Automatic Post-Editing adapted for QE (APE-QE)
  4. Neural Quality Estimation (NuQE)
     - An improved version of QUETCH (left figure on the slide)
     - The hidden layers are changed:
       - feedforward layers
       - bi-directional GRU layers
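  5. Automatic Post-Editing adapted for QE (APE-QE)
     - QE recap
       - Training: source sentence, translation, and the human post-edited translation
       - Test: source sentence and translation only
     - "If no post-edited sentence is available at test time, why not generate a pseudo post-edit?"
       - Train a model on the training data to generate post-edits.
       - Tagging and edit-distance computation are then done against the predicted post-edit.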
  6. Experiment
     - Task
       - Word-level QE
       - Sentence-level QE
     - Data
       - WMT18 English-German, train/dev = 39,715/2,000 sentences
       - Pre-training for pred-est (Predictor-Estimator): WMT, 3,396,364 sentences, in-domain
  7. References
     - Julia Kreutzer, Shigehiko Schamoni, and Stefan Riezler. QUality Estimation from ScraTCH (QUETCH): Deep Learning for Word-level Translation Quality Estimation. In Proc. of WMT, pp. 316-322, 2015.
     - André F. T. Martins, Ramón Astudillo, Chris Hokamp, and Fabio Kepler. Unbabel’s Participation in the WMT16 Word-Level Translation Quality Estimation Shared Task. In Proc. of WMT, pp. 806-811, 2016.
     - Hyun Kim, Jong-Hyeok Lee, and Seung-Hoon Na. Predictor-Estimator using Multilevel Task Learning with Stack Propagation for Neural Quality Estimation. In Proc. of WMT, pp. 562-568, 2017.
     - André F. T. Martins, Marcin Junczys-Dowmunt, Fabio N. Kepler, Ramón Astudillo, Chris Hokamp, and Roman Grundkiewicz. Pushing the Limits of Translation Quality Estimation. Trans. of ACL, pp. 205-218, 2017.
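
As a rough illustration of the labels described on the Quality Estimation (QE) slide, the sketch below derives word-level OK/BAD tags and a sentence-level score from a tokenized MT output and its post-edit. It is not OpenKiwi code and not the official WMT labeling procedure: the alignment uses Python's `difflib.SequenceMatcher` as a stand-in for a proper alignment tool, the function names (`word_level_tags`, `edit_distance`, `sentence_score`) are hypothetical, and dividing the edit distance by the PE length is an assumption made here so that scores are comparable across sentence lengths.

```python
from difflib import SequenceMatcher


def word_level_tags(mt, pe):
    """Return (mt_tags, gap_tags) for MT tokens compared with PE tokens.

    mt_tags has one tag per MT token; gap_tags has len(mt) + 1 tags,
    one for the gap before each token plus one after the last token.
    """
    mt_tags = ["OK"] * len(mt)
    gap_tags = ["OK"] * (len(mt) + 1)
    # Assumption: difflib's alignment approximates the real MT/PE alignment.
    matcher = SequenceMatcher(a=mt, b=pe, autojunk=False)
    for op, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if op in ("replace", "delete"):
            # MT words that must be replaced or deleted are BAD (MT tag).
            # Simplification: a replacement that also adds words is not
            # additionally marked as a gap error.
            for i in range(i1, i2):
                mt_tags[i] = "BAD"
        elif op == "insert":
            # PE has extra words at this MT position: the gap is BAD (Gap tag).
            gap_tags[i1] = "BAD"
    return mt_tags, gap_tags


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def sentence_score(mt, pe):
    """Edit distance between MT and PE divided by PE length (lower is better)."""
    return edit_distance(mt, pe) / max(len(pe), 1)


if __name__ == "__main__":
    mt = "the cat sat on mat".split()
    pe = "the cat sits on the mat".split()
    print(word_level_tags(mt, pe))
    print(sentence_score(mt, pe))
```

Under the APE-QE idea on slide 5, the same functions would be called with a predicted post-edit in place of `pe` at test time, since no human post-edit is available then.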