In Statistical Machine Translation, translation quality depends mainly on the corpus. Out-of-vocabulary words, that is, words that are not in the corpus and therefore appear untranslated in the output, are considered to be caused by data sparseness in the corpus. In related work, Ullman et al. [1] paraphrase high-frequency compound nouns. In their results, the BLEU value, a quantitative score, decreased with the paraphrased corpus. In the graph of token frequency (fig. 1), tokens with frequency 1 make up the majority. Knowing this, we investigate how much reducing the number of 1-frequency tokens affects BLEU values.

1. E. Ullman and J. Nivre, "Paraphrasing Swedish compound nouns in machine translation," EACL 2014, p. 99, 2014.
2. P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin, and E. Herbst, "Moses: Open source toolkit for statistical machine translation," in Proc. 45th ACL, Companion Volume, pp. 177–180, 2007.

Experiment

Instead of deleting the 1-frequency words, we paraphrase 1-frequency verbs following Ullman's method. Paraphrasing low-frequency words into more frequent words not only eliminates the low-frequency words but also makes the paraphrase verb more frequent (fig. 2).

The corpus is the KFTT corpus, which consists of 440k sentences for training. We paraphrased 200 randomly selected 1-frequency verbs into other, more common verbs. Fig. 3 shows a paraphrasing example.

[fig. 1: Token Frequency]
[fig. 2]
[fig. 3: "It prevented the enemies from listening ." / "It prevented the enemies from eavesdropping ."]

For the experiment setup, Moses [2] is used.
Paraphrasing is applied to both the training set and the test set.

For evaluation, we conducted both a quantitative evaluation (BLEU) and a subjective evaluation. For the subjective evaluation, we rated each translation on a 4-point scale from 0 to 3: 0 means the translation is grammatically incorrect and does not retain the sense, and 3 means the opposite. Fig. 4 shows the result.

As a result, BLEU drops in the Open Experiments, the same as in the result by Ullman. In the subjective evaluation, scale 0 increases with the paraphrased corpus, meaning an increase in low-quality translations, but scale 3 also increases somewhat.

[fig. 4]
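
The paraphrasing step described above can be illustrated with a minimal sketch. It is not the implementation used in the experiment: the toy sentences, the hand-made paraphrase table, and the function names `token_frequencies`, `singletons`, and `paraphrase` are all assumptions made for illustration, and tokens are simply split on whitespace with no part-of-speech filtering for verbs. It shows how 1-frequency tokens might be found, and how one paraphrase table could be applied to both the training set and the test set so that the rare verb disappears and the common verb becomes more frequent.

```python
from collections import Counter


def token_frequencies(sentences):
    """Count how often each whitespace-separated token occurs."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.split())
    return counts


def singletons(counts):
    """Return the set of tokens that occur exactly once (1-frequency tokens)."""
    return {token for token, n in counts.items() if n == 1}


def paraphrase(sentences, table):
    """Replace every token that has an entry in `table` by its paraphrase."""
    return [
        " ".join(table.get(token, token) for token in sentence.split())
        for sentence in sentences
    ]


# Toy data (hypothetical, not taken from KFTT).
train = [
    "It prevented the enemies from eavesdropping .",
    "They were listening to the enemies .",
]
test = ["The enemies were eavesdropping ."]

# Hypothetical hand-made table: 1-frequency verb -> more common verb.
table = {"eavesdropping": "listening"}

before = token_frequencies(train)
rare = singletons(before)

# Keep only table entries whose source really is a 1-frequency training token.
active = {src: tgt for src, tgt in table.items() if src in rare}

# The same table is applied to the training set and the test set.
train_para = paraphrase(train, active)
test_para = paraphrase(test, active)
after = token_frequencies(train_para)

print(before["eavesdropping"], before["listening"])
print(after["eavesdropping"], after["listening"])
```

In this toy setting the rare verb no longer occurs in the training data after paraphrasing, while the count of the more common verb goes up by one. Selecting verbs specifically would additionally require part-of-speech information, which this sketch leaves out.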