In Statistical Machine Translation, the translation quality depends mainly on the corpus. Out-of-vocabulary words, that is, words absent from the corpus, appear untranslated in the output; this is considered to be caused by data sparseness in the corpus. In related work, Ullman and Nivre [1] paraphrase high-frequency compound nouns. In their result, the BLEU value, a quantitative score, was lower with the paraphrased corpus. In the graph of token frequency (fig. 1), tokens with a frequency of 1 make up the majority. Knowing this, we investigate how much reducing the number of 1-frequency tokens affects BLEU values.

fig. 1: Token Frequency

1. E. Ullman and J. Nivre, "Paraphrasing Swedish compound nouns in machine translation," EACL 2014, p. 99, 2014.
2. P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin, and E. Herbst, "Moses: Open source toolkit for statistical machine translation," in Proc. 45th ACL, Companion Volume, pp. 177–180, 2007.

Experiment

Instead of deleting the 1-frequency words, we paraphrase 1-frequency verbs following Ullman's method. Paraphrasing low-frequency words into more frequent words not only eliminates the low-frequency words but also makes the paraphrase verb more frequent (fig. 2). The corpus is the KFTT corpus, which contains 440k sentences for training. We paraphrased 200 randomly selected 1-frequency verbs into other, more common verbs. Fig. 3 shows a paraphrasing example.

fig. 2

fig. 3:
It prevented the enemies from listening .
It prevented the enemies from eavesdropping .

For the experimental setup, Moses [2] is used. Paraphrasing is applied to both the training set and the test set. For evaluation, we conducted both a quantitative evaluation (BLEU) and a subjective evaluation. For the subjective evaluation, we rated each translation on a 4-point scale from 0 to 3: 0 means the translation is grammatically incorrect and does not retain the sense, and 3 means it is grammatically correct and retains the sense. Fig. 4 shows the result.
As a result, BLEU drops in the open experiments, the same as in the result by Ullman and Nivre [1]. In the subjective evaluation, the number of translations rated 0 increases with the paraphrased corpus, meaning an increase in low-quality translations, but there is also some increase in translations rated 3.

fig. 4
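
The paraphrasing step described in the Experiment section can be illustrated with a minimal Python sketch. It is not the setup used for the poster: the toy corpus, the function names, and the paraphrase table are hypothetical, and the mapping direction (eavesdropping to listening) is an assumption based on the example sentences. The sketch also replaces any 1-frequency token found in the table, whereas the experiment restricts paraphrasing to verbs, which would additionally require part-of-speech tagging.

```python
from collections import Counter


def token_frequencies(sentences):
    """Count how often each whitespace-separated token occurs in the corpus."""
    return Counter(tok for sent in sentences for tok in sent.split())


def singleton_tokens(freqs):
    """Return the set of tokens that occur exactly once (1-frequency tokens)."""
    return {tok for tok, count in freqs.items() if count == 1}


def paraphrase_singletons(sentences, paraphrases):
    """Replace 1-frequency tokens that have an entry in the paraphrase table.

    `paraphrases` maps a rare token to a more common one. It is a
    hypothetical, hand-written table here.
    """
    rare = singleton_tokens(token_frequencies(sentences))
    result = []
    for sent in sentences:
        tokens = [
            paraphrases.get(tok, tok) if tok in rare else tok
            for tok in sent.split()
        ]
        result.append(" ".join(tokens))
    return result


# Toy corpus (hypothetical, not taken from KFTT).
corpus = [
    "It prevented the enemies from eavesdropping .",
    "They were listening to the radio .",
    "She kept listening .",
]

# Assumed mapping direction: rare verb -> more common verb.
paraphrases = {"eavesdropping": "listening"}

new_corpus = paraphrase_singletons(corpus, paraphrases)

before = token_frequencies(corpus)
after = token_frequencies(new_corpus)
print("1-frequency tokens before:", len(singleton_tokens(before)))
print("1-frequency tokens after: ", len(singleton_tokens(after)))
print("count of 'listening' before/after:", before["listening"], after["listening"])
```

The same table would have to be applied to the test set as well, since paraphrasing is done in both the training set and the test set.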