|
documentation . The question is , however , how
|
an
|
interesting information piece would be
|
#43
The question is, however, how an interesting information piece would be found in a large database. |
|
whether they believed the sample output to be
|
an
|
<term>
expert human translation
</term>
or
|
#730
The subjects were given three minutes per extract to determine whether they believed the sample output to be an expert human translation or a machine translation. |
|
</term>
. We have built and will demonstrate
|
an
|
application of this approach called
<term>
|
#820
We have built and will demonstrate an application of this approach called LCS-Marine. |
|
fall short of the
<term>
performance
</term>
of
|
an
|
<term>
oracle
</term>
. The
<term>
oracle
</term>
|
#1067
We find that simple interpolation methods, like log-linear and linear interpolation, improve the performance but fall short of the performance of an oracle. |
|
example , after
<term>
translation
</term>
into
|
an
|
equivalent
<term>
RCG
</term>
, any
<term>
tree
|
#1661
For example, after translation into an equivalent RCG, any tree adjoining grammar can be parsed in O(n^6) time.
|
collect
<term>
paraphrases
</term>
. We present
|
an
|
<term>
unsupervised learning algorithm
</term>
|
#1779
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. |
|
be improved dramatically by incorporating
|
an
|
approximation of the
<term>
formal analysis
|
#1886
I show that the performance of a search engine can be improved dramatically by incorporating an approximation of the formal analysis that is compatible with the search engine's operational semantics. |
|
human judgments
</term>
. In order to perform
|
an
|
exhaustive comparison , we also evaluate
|
#2075
In order to perform an exhaustive comparison, we also evaluate a hand-crafted template-based generation component, two rule-based sentence planners, and two baseline sentence planners. |
|
sets of
<term>
concepts
</term>
on the basis of
|
an
|
<term>
ontology
</term>
. We apply our
<term>
|
#2453
In this paper we present ONTOSCORE, a system for scoring sets of concepts on the basis of an ontology. |
|
<term>
semantic coherence
</term>
. We conducted
|
an
|
<term>
annotation experiment
</term>
and showed
|
#2481
We conducted an annotation experiment and showed that human annotators can reliably differentiate between semantically coherent and incoherent speech recognition hypotheses. |
|
recognition ( OCR ) model
</term>
that describes
|
an
|
end-to-end process in the
<term>
noisy channel
|
#2685
In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system. |
|
transformation into the
<term>
noisy output
</term>
of
|
an
|
<term>
OCR system
</term>
. The
<term>
model
</term>
|
#2708
In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system. |
|
useful for
<term>
NLP tasks
</term>
. We present
|
an
|
implementation of the
<term>
model
</term>
|
#2746
We present an implementation of the model based on finite-state models, demonstrate the model's ability to significantly reduce character and word error rate, and provide evaluation results involving automatic extraction of translation lexicons from printed text. |
|
from
<term>
printed text
</term>
. We present
|
an
|
application of
<term>
ambiguity packing and
|
#2786
We present an application of ambiguity packing and stochastic disambiguation techniques for Lexical-Functional Grammars (LFG) to the domain of sentence condensation. |
|
</term>
on the
<term>
Penn Treebank WSJ
</term>
,
|
an
|
<term>
error reduction
</term>
of 4.4 % on
|
#2998
Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result. |
|
small-sized
<term>
bilingual corpus
</term>
, we use
|
an
|
out-of-domain
<term>
bilingual corpus
</term>
|
#3097
In order to boost the translation quality of EBMT based on a small-sized bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an in-domain monolingual corpus. |
|
addition , the
<term>
language model
</term>
of
|
an
|
in-domain
<term>
monolingual corpus
</term>
|
#3110
In order to boost the translation quality of EBMT based on a small-sized bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an in-domain monolingual corpus. |
|
corpus
</term>
. We conducted experiments with
|
an
|
<term>
EBMT system
</term>
. The two
<term>
evaluation
|
#3119
We conducted experiments with an EBMT system. |
|
score
</term>
demonstrated the effect of using
|
an
|
out-of-domain
<term>
bilingual corpus
</term>
|
#3140
The two evaluation measures of the BLEU score and the NIST score demonstrated the effect of using an out-of-domain bilingual corpus and the possibility of using the language model. |
|
</term>
by identifying
<term>
hubs
</term>
in
|
an
|
<term>
automaton
</term>
. For our purposes
|
#3166
We describe a simple unsupervised technique for learning morphology by identifying hubs in an automaton. |