lr,10-1-J05-4003,bq discovering parallel sentences </term> in <term> comparable , non-parallel corpora </term> . We train a <term> maximum entropy
lr,10-3-J05-4003,bq <term> parallel data </term> from large <term> Chinese , Arabic , and English non-parallel newspaper corpora </term> . We evaluate the <term> quality of
lr,17-6-J05-4003,bq pairs </term> for which only scarce <term> resources </term> are available . In this paper we
lr,19-5-J05-4003,bq scratch by starting with a very small <term> parallel corpus </term> ( 100,000 <term> words </term> ) and
lr,29-5-J05-4003,bq words </term> ) and exploiting a large <term> non-parallel corpus </term> . Thus , our method can be applied
lr,6-3-J05-4003,bq this <term> approach </term> , we extract <term> parallel data </term> from large <term> Chinese , Arabic
measure(ment),3-4-J05-4003,bq newspaper corpora </term> . We evaluate the <term> quality of the extracted data </term> by showing that it improves the performance
other,11-6-J05-4003,bq can be applied with great benefit to <term> language pairs </term> for which only scarce <term> resources
other,12-2-J05-4003,bq classifier </term> that , given a pair of <term> sentences </term> , can reliably determine whether
other,22-2-J05-4003,bq determine whether or not they are <term> translations </term> of each other . Using this <term> approach
other,23-5-J05-4003,bq <term> parallel corpus </term> ( 100,000 <term> words </term> ) and exploiting a large <term> non-parallel
tech,18-4-J05-4003,bq performance of a state-of-the-art <term> statistical machine translation system </term> . We also show that a good-quality
tech,2-3-J05-4003,bq translations </term> of each other . Using this <term> approach </term> , we extract <term> parallel data </term>
tech,3-2-J05-4003,bq non-parallel corpora </term> . We train a <term> maximum entropy classifier </term> that , given a pair of <term> sentences
tech,4-1-J05-4003,bq generation </term> . We present a novel <term> method </term> for <term> discovering parallel sentences
tech,6-1-J05-4003,bq present a novel <term> method </term> for <term> discovering parallel sentences </term> in <term> comparable , non-parallel
tech,6-5-J05-4003,bq . We also show that a good-quality <term> MT system </term> can be built from scratch by starting