N03-1024 |
al. ( 2002 ) have proposed an
|
automatic MT
|
system evaluation technique (
|
N04-1022 |
AMTA , 2003 ) . We expect new
|
automatic MT
|
evaluation metrics to emerge
|
D13-1025 |
the translations produced by any
|
automatic MT
|
system still remain below than
|
N07-1005 |
system parameters based on the
|
automatic MT
|
evaluation measures . Acknowledgments
|
N06-1057 |
translation ( MT ) community for
|
automatic MT
|
evaluation . A problem with ROUGE
|
N06-1058 |
correlate with its utility for
|
automatic MT
|
evaluation . Our results suggest
|
N04-1036 |
translation quality is measured by the
|
automatic MT
|
evaluation metrics , such as
|
C88-2160 |
be accomplished by any existing
|
automatic MT
|
system . The problem remains
|
N04-1022 |
translation in two scenarios . Given an
|
automatic MT
|
metric , we design a loss function
|
D09-1074 |
so as to directly optimize an
|
automatic MT
|
performance evaluation metric
|
D10-1090 |
evaluation . Inspired by the success of
|
automatic MT
|
evaluation , Lin ( 2004 )
|
D12-1090 |
systems . The early seminal work on
|
automatic MT
|
metrics ( e.g. , BLEU and NIST
|
N09-2006 |
riddled error surface computed by
|
automatic MT
|
evaluation metrics . We showed
|
N04-1022 |
) , to the problem of building
|
automatic MT
|
systems tuned for specific metrics
|
D11-1035 |
significantly improve the quality of
|
automatic MT
|
compared to BLEU , as measured
|
I05-5003 |
translations . Fortunately , the
|
automatic MT
|
evaluation techniques commonly
|
J03-3003 |
, approaches other than fully
|
automatic MT
|
might provide interesting characteristics
|
D14-1020 |
Judgment A common means of assessing
|
automatic MT
|
evaluation metrics is Spearman
|
D13-1011 |
by short term improvements in
|
automatic MT
|
evaluation metrics such as BLEU
|
N07-1005 |
tion . However , a measure for
|
automatic MT
|
evaluation that strongly correlates
|