dataset [Hermann et al., 2015; Nallapati et al., 2016]
  ◦ Online news articles (800 tokens on average) and their abstracts (60 tokens on average)
  ◦ 300k for training, 13k for validation, 11k for test
  ◦ BPE with 32k, truncate articles to 400 subwords
• APE: two variants
  ◦ English-German (En-De): WMT18/WMT19 APE shared task (IT domain)
    ▪ Translated by an NMT system (45.8 BLEU)
    ▪ 13k for training, 1k for validation, 1k for test
  ◦ English-Latvian (En-Lv): life sciences domain [Specia et al., 2018]
    ▪ Translated by an NMT system (38.4 BLEU)
    ▪ 13k for training, 1k for validation, 1k for test
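
The article truncation listed for the news-article dataset (truncate to 400 subwords after BPE) might look roughly like the sketch below. It assumes the article has already been segmented into a list of subword strings and that truncation keeps the first 400 subwords; the slides do not say which end is cut. The names `truncate_subwords` and `MAX_ARTICLE_SUBWORDS` are hypothetical.

```python
from typing import List

# Truncation length taken from the slide; the constant name is hypothetical.
MAX_ARTICLE_SUBWORDS = 400


def truncate_subwords(subwords: List[str],
                      max_len: int = MAX_ARTICLE_SUBWORDS) -> List[str]:
    """Return at most the first `max_len` subword tokens.

    Assumption: truncation drops the tail of the article, not the head.
    """
    if max_len < 0:
        raise ValueError("max_len must be non-negative")
    return subwords[:max_len]


# Toy stand-in for a BPE-segmented article of 800 subwords.
article = [f"tok{i}" for i in range(800)]
truncated = truncate_subwords(article)
```

Articles shorter than the limit pass through unchanged, so the same function can be applied to every example in the corpus.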