#467The key features of the system include: (i) Robust efficient parsing of Korean (a verb final language with overt case markers, relatively free word order, and frequent omissions of arguments).
tech,7-4-H01-1041,ak
quality
<term>
translation
</term>
via
<term>
word
sense disambiguation
</term>
and accurate
#484(ii) High quality translation via word sense disambiguation and accurate word order generation of the target language.
tech,12-4-H01-1041,ak
disambiguation
</term>
and accurate
<term>
word
order generation
</term>
of the
<term>
target
#489(ii) High quality translation via word sense disambiguation and accurate word order generation of the target language.
other,8-10-H01-1042,ak
Additionally , they were asked to mark the
<term>
word
</term>
at which they made this decision
#747Additionally, they were asked to mark the word at which they made this decision.
other,4-3-H01-1058,ak
<term>
oracle
</term>
knows the
<term>
reference
word
string
</term>
and selects the
<term>
word
#1075The oracle knows the reference word string and selects the word string with the best performance (typically, word or semantic error rate) from a list of word strings, where each word string has been obtained by using a different LM.
other,10-3-H01-1058,ak
word string
</term>
and selects the
<term>
word
string
</term>
with the best
<term>
performance
#1080The oracle knows the reference word string and selects the word string with the best performance (typically, word or semantic error rate) from a list of word strings, where each word string has been obtained by using a different LM.
measure(ment),19-3-H01-1058,ak
<term>
performance
</term>
( typically ,
<term>
word
or semantic error rate
</term>
) from a list
#1089The oracle knows the reference word string and selects the word string with the best performance (typically, word or semantic error rate) from a list of word strings, where each word string has been obtained by using a different LM.
other,29-3-H01-1058,ak
error rate
</term>
) from a list of
<term>
word
strings
</term>
, where each
<term>
word string
#1099The oracle knows the reference word string and selects the word string with the best performance (typically, word or semantic error rate) from a list of word strings, where each word string has been obtained by using a different LM.
other,34-3-H01-1058,ak
<term>
word strings
</term>
, where each
<term>
word
string
</term>
has been obtained by using
#1104The oracle knows the reference word string and selects the word string with the best performance (typically, word or semantic error rate) from a list of word strings, where each word string has been obtained by using a different LM.
model,24-3-P01-1004,ak
</term>
superior to any of the tested
<term>
word
N-gram models
</term>
. Further , in their
#1555Over two distinct datasets, we find that indexing according to simple character bigrams produces a retrieval accuracy superior to any of the tested word N-gram models.
other,3-3-P01-1008,ak
approach yields
<term>
phrasal and single
word
lexical paraphrases
</term>
as well as
<term>
#1807Our approach yields phrasal and single word lexical paraphrases as well as syntactic paraphrases.
measure(ment),20-3-N03-1018,ak
significantly reduce
<term>
character and
word
error rate
</term>
, and provide evaluation
#2767We present an implementation of the model based on finite-state models, demonstrate the model's ability to significantly reduce character and word error rate, and provide evaluation results involving automatic extraction of translation lexicons from printed text.
other,66-1-N03-1033,ak
) fine-grained modeling of
<term>
unknown
word
features
</term>
. Using these ideas together
#2977We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features.
tech,6-1-N03-2017,ak
<term>
syntax-based constraint
</term>
for
<term>
word
alignment
</term>
, known as the
<term>
cohesion
#3235We present a syntax-based constraint for word alignment, known as the cohesion constraint.
model,14-4-N03-2036,ak
projections
</term>
using an underlying
<term>
word
alignment
</term>
. We show experimental
#3459During training, the blocks are learned from source interval projections using an underlying word alignment.
other,11-1-P03-1051,ak
morphology
</term>
by a model that a
<term>
word
</term>
consists of a sequence of
<term>
morphemes
#4613We approximate Arabic's rich morphology by a model that a word consists of a sequence of morphemes in the pattern prefix*-stem-suffix* (* denotes zero or more occurrences of a morpheme).
tech,22-2-P03-1051,ak
algorithm
</term>
to build the
<term>
Arabic
word
segmenter
</term>
from a
<term>
large unsegmented
#4663Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus.
lr,18-5-P03-1051,ak
<term>
stems
</term>
from a
<term>
155 million
word
unsegmented corpus
</term>
, and re-estimate
#4728To improve the segmentation accuracy, we use an unsupervised algorithm for automatically acquiring new stems from a 155 million word unsegmented corpus, and re-estimate the model parameters with the expanded vocabulary and training corpus.
tech,2-6-P03-1051,ak
corpus
</term>
. The resulting
<term>
Arabic
word
segmentation system
</term>
achieves around
#4748The resulting Arabic word segmentation system achieves around 97% exact match accuracy on a test corpus containing 28,449 word tokens.
other,19-6-P03-1051,ak
test corpus
</term>
containing 28,449
<term>
word
tokens
</term>
. We believe this is a state-of-the-art
#4764The resulting Arabic word segmentation system achieves around 97% exact match accuracy on a test corpus containing 28,449 word tokens.