The Effect of Paraphrases in Statistical Machine Translation
Background
In statistical machine translation (SMT), translation quality depends mainly on the training corpus. Out-of-vocabulary (OOV) words, that is, words that do not occur in the corpus, are left untranslated in the output; this is considered a consequence of data sparseness in the corpus.
In related work, Ullman and Nivre [1] paraphrased high-frequency compound nouns. In their results, the BLEU score, a quantitative measure of translation quality, dropped when the paraphrased corpus was used.
The graph of token frequency (Fig. 1) shows that tokens occurring only once (1-frequency tokens) make up the majority. Given this, we investigate how much reducing the number of 1-frequency tokens affects BLEU scores.
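The share of 1-frequency tokens described above can be estimated with a simple count. The following is a minimal sketch, not the procedure used for Fig. 1: it assumes whitespace tokenization and measures the share over distinct token types, and the function name `singleton_share` and the two-sentence corpus are hypothetical illustrations.

```python
from collections import Counter


def singleton_share(sentences):
    """Count token types that occur exactly once in the corpus.

    Returns (number of 1-frequency types, number of types, share).
    Assumption: tokens are separated by whitespace.
    """
    counts = Counter(tok for sent in sentences for tok in sent.split())
    singletons = [tok for tok, c in counts.items() if c == 1]
    return len(singletons), len(counts), len(singletons) / len(counts)


# Toy corpus for illustration only; not taken from KFTT.
corpus = [
    "it prevented the enemies from eavesdropping .",
    "it prevented the fire .",
]
n_single, n_types, share = singleton_share(corpus)
print(n_single, n_types, share)
```

On a real corpus the same count would be run over the tokenized training file, one sentence per line.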
1. E. Ullman and J. Nivre, “Paraphrasing Swedish compound nouns in machine translation,” EACL 2014, p. 99, 2014.
2. P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin, and E. Herbst, “Moses: Open source toolkit for statistical machine translation,” in Proc. 45th ACL, Companion Volume, pp. 177–180, 2007.
Experiment
Instead of deleting the 1-frequency words, we paraphrase 1-frequency verbs following Ullman's method [1]. Paraphrasing low-frequency words as more frequent words not only eliminates the low-frequency words but also makes the paraphrase verbs more frequent (Fig. 2).
The corpus is the KFTT corpus, which provides 440k sentences for training. We paraphrased 200 randomly selected 1-frequency verbs as other, more common verbs. Fig. 3 shows a paraphrasing example.
Fig. 1: Token frequency.
Fig. 2.
Fig. 3: Paraphrasing example: “It prevented the enemies from listening .” / “It prevented the enemies from eavesdropping .”
The experiments use Moses [2]. Paraphrasing is applied to both the training set and the test set.
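Applying the same paraphrases to the training set and the test set can be sketched as a token-level substitution. This is an illustrative sketch under stated assumptions, not the implementation used in the experiment: the function `paraphrase`, the whitespace tokenization, and the one-entry mapping (taken from the eavesdropping/listening example) are all hypothetical.

```python
def paraphrase(sentence, mapping):
    """Replace each token found in `mapping` with its paraphrase.

    Assumption: tokens are separated by whitespace and matched exactly.
    """
    return " ".join(mapping.get(tok, tok) for tok in sentence.split())


# Hypothetical mapping from a 1-frequency verb to a more common verb.
mapping = {"eavesdropping": "listening"}

# Toy data for illustration only.
train = ["It prevented the enemies from eavesdropping ."]
test = ["They were eavesdropping ."]

# The same mapping is applied to both sets so that they stay consistent.
train_paraphrased = [paraphrase(s, mapping) for s in train]
test_paraphrased = [paraphrase(s, mapping) for s in test]
print(train_paraphrased[0])
print(test_paraphrased[0])
```

Using one shared mapping for both sets keeps a paraphrased verb from reappearing in its original form at test time.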
For evaluation, we conducted both a quantitative evaluation (BLEU) and a subjective evaluation. In the subjective evaluation, each translation was rated on a 4-point scale from 0 to 3: 0 means the translation is grammatically incorrect and does not retain the meaning, and 3 means it is grammatically correct and retains the meaning. Fig. 4 shows the results.
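The subjective ratings can be summarized as a count per scale value, which makes shifts at the two ends of the scale easy to compare between systems. The sketch below is only an illustration: the function name `score_distribution` is hypothetical, and the ratings are made-up placeholder values, not the results reported in Fig. 4.

```python
from collections import Counter


def score_distribution(scores):
    """Count how many translations received each rating on the 0-3 scale."""
    if any(s not in (0, 1, 2, 3) for s in scores):
        raise ValueError("ratings must be integers from 0 to 3")
    counts = Counter(scores)
    # Include every scale value, even those with no ratings.
    return {s: counts.get(s, 0) for s in range(4)}


# Placeholder ratings for illustration only; not experimental data.
example_ratings = [0, 3, 3, 1]
print(score_distribution(example_ratings))
```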
BLEU drops in the open experiments, consistent with the result of Ullman and Nivre [1]. In the subjective evaluation, the number of translations rated 0 increases with paraphrasing, meaning more low-quality translations, but the number rated 3 also increases somewhat.
Fig. 4