tech,4-1-J05-4003,bq |
generation
</term>
. We present a novel
<term>
|
method
|
</term>
for
<term>
discovering parallel sentences
|
#8987
We present a novel method for discovering parallel sentences in comparable, non-parallel corpora. |
tech,6-1-J05-4003,bq |
present a novel
<term>
method
</term>
for
<term>
|
discovering parallel sentences
|
</term>
in
<term>
comparable , non-parallel
|
#8989
We present a novel method for discovering parallel sentences in comparable, non-parallel corpora. |
lr,10-1-J05-4003,bq |
discovering parallel sentences
</term>
in
<term>
|
comparable , non-parallel corpora
|
</term>
. We train a
<term>
maximum entropy
|
#8993
We present a novel method for discovering parallel sentences in comparable, non-parallel corpora. |
tech,3-2-J05-4003,bq |
non-parallel corpora
</term>
. We train a
<term>
|
maximum entropy classifier
|
</term>
that , given a pair of
<term>
sentences
|
#9001
We train a maximum entropy classifier that, given a pair of sentences, can reliably determine whether or not they are translations of each other. |
other,12-2-J05-4003,bq |
classifier
</term>
that , given a pair of
<term>
|
sentences
|
</term>
, can reliably determine whether
|
#9010
We train a maximum entropy classifier that, given a pair of sentences, can reliably determine whether or not they are translations of each other. |
other,22-2-J05-4003,bq |
determine whether or not they are
<term>
|
translations
|
</term>
of each other . Using this
<term>
approach
|
#9020
We train a maximum entropy classifier that, given a pair of sentences, can reliably determine whether or not they are translations of each other. |
tech,2-3-J05-4003,bq |
translations
</term>
of each other . Using this
<term>
|
approach
|
</term>
, we extract
<term>
parallel data
</term>
|
#9027
Using this approach, we extract parallel data from large Chinese, Arabic, and English non-parallel newspaper corpora. |
lr,6-3-J05-4003,bq |
this
<term>
approach
</term>
, we extract
<term>
|
parallel data
|
</term>
from large
<term>
Chinese , Arabic
|
#9031
Using this approach, we extract parallel data from large Chinese, Arabic, and English non-parallel newspaper corpora. |
lr,10-3-J05-4003,bq |
<term>
parallel data
</term>
from large
<term>
|
Chinese , Arabic , and English non-parallel newspaper corpora
|
</term>
. We evaluate the
<term>
quality of
|
#9035
Using this approach, we extract parallel data from large Chinese, Arabic, and English non-parallel newspaper corpora. |
measure(ment),3-4-J05-4003,bq |
newspaper corpora
</term>
. We evaluate the
<term>
|
quality of the extracted data
|
</term>
by showing that it improves the performance
|
#9048
We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system. |
tech,18-4-J05-4003,bq |
performance of a state-of-the-art
<term>
|
statistical machine translation system
|
</term>
. We also show that a good-quality
|
#9063
We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system. |
tech,6-5-J05-4003,bq |
. We also show that a good-quality
<term>
|
MT system
|
</term>
can be built from scratch by starting
|
#9074
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. |
lr,19-5-J05-4003,bq |
scratch by starting with a very small
<term>
|
parallel corpus
|
</term>
( 100,000
<term>
words
</term>
) and
|
#9087
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. |
other,23-5-J05-4003,bq |
<term>
parallel corpus
</term>
( 100,000
<term>
|
words
|
</term>
) and exploiting a large
<term>
non-parallel
|
#9091
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. |
lr,29-5-J05-4003,bq |
words
</term>
) and exploiting a large
<term>
|
non-parallel corpus
|
</term>
. Thus , our method can be applied
|
#9097
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. |
other,11-6-J05-4003,bq |
can be applied with great benefit to
<term>
|
language pairs
|
</term>
for which only scarce
<term>
resources
|
#9111
Thus, our method can be applied with great benefit to language pairs for which only scarce resources are available. |
lr,17-6-J05-4003,bq |
pairs
</term>
for which only scarce
<term>
|
resources
|
</term>
are available . In this paper we
|
#9117
Thus, our method can be applied with great benefit to language pairs for which only scarce resources are available. |