|
In this paper we show how two standard
outputs
from
<term>
information extraction ( IE ) systems
</term>
-
<term>
named entity annotations
</term>
and
<term>
scenario templates
</term>
- can be used to enhance access to
<term>
text collections
</term>
via a standard
<term>
text browser
</term>
.
|
#282
In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser. |
other,18-6-H01-1041,bq |
Having been trained on
<term>
Korean newspaper articles
</term>
on missiles and chemical biological warfare , the
<term>
system
</term>
produces the
<term>
translation
output
</term>
sufficient for content understanding of the
<term>
original document
</term>
.
|
#534
Having been trained on Korean newspaper articles on missiles and chemical biological warfare, the system produces the translation output sufficient for content understanding of the original document. |
other,28-1-H01-1042,bq |
The purpose of this research is to test the efficacy of applying
<term>
automated evaluation techniques
</term>
, originally devised for the
<term>
evaluation
</term>
of
<term>
human language learners
</term>
, to the
<term>
output
</term>
of
<term>
machine translation ( MT ) systems
</term>
.
|
#572
The purpose of this research is to test the efficacy of applying automated evaluation techniques, originally devised for the evaluation of human language learners, to the output of machine translation (MT) systems. |
other,16-3-H01-1042,bq |
This , the first experiment in a series of experiments , looks at the
<term>
intelligibility
</term>
of
<term>
MT
output
</term>
.
|
#626
This, the first experiment in a series of experiments, looks at the intelligibility of MT output. |
other,16-6-H01-1042,bq |
We tested this to see if similar criteria could be elicited from duplicating the experiment using
<term>
machine translation
output
</term>
.
|
#680
We tested this to see if similar criteria could be elicited from duplicating the experiment using machine translation output. |
other,11-8-H01-1042,bq |
Some of the extracts were
<term>
expert human translations
</term>
, others were
<term>
machine translation
outputs
</term>
.
|
#710
Some of the extracts were expert human translations, others were machine translation outputs. |
|
The subjects were given three minutes per extract to determine whether they believed the sample
output
to be an
<term>
expert human translation
</term>
or a
<term>
machine translation
</term>
.
|
#727
The subjects were given three minutes per extract to determine whether they believed the sample output to be an expert human translation or a machine translation. |
|
Second , the
<term>
sentence-plan-ranker ( SPR )
</term>
ranks the list of
output
<term>
sentence plans
</term>
, and then selects the top-ranked
<term>
plan
</term>
.
|
#1411
Second, the sentence-plan-ranker (SPR) ranks the list of output sentence plans, and then selects the top-ranked plan. |
|
The
<term>
non-deterministic parsing choices
</term>
of the
<term>
main parser
</term>
for a
<term>
language L
</term>
are directed by a
<term>
guide
</term>
which uses the
<term>
shared derivation forest
</term>
output
by a prior
<term>
RCL parser
</term>
for a suitable
<term>
superset of L
</term>
.
|
#1723
The non-deterministic parsing choices of the main parser for a language L are directed by a guide which uses the shared derivation forest output by a prior RCL parser for a suitable superset of L. |
other,21-3-N03-1001,bq |
In our method ,
<term>
unsupervised training
</term>
is first used to train a
<term>
phone n-gram model
</term>
for a particular
<term>
domain
</term>
; the
<term>
output
</term>
of
<term>
recognition
</term>
with this
<term>
model
</term>
is then passed to a
<term>
phone-string classifier
</term>
.
|
#2276
In our method, unsupervised training is first used to train a phone n-gram model for a particular domain; the output of recognition with this model is then passed to a phone-string classifier. |
other,38-1-N03-1018,bq |
In this paper , we introduce a
<term>
generative probabilistic optical character recognition ( OCR ) model
</term>
that describes an end-to-end process in the
<term>
noisy channel framework
</term>
, progressing from generation of
<term>
true text
</term>
through its transformation into the
<term>
noisy
output
</term>
of an
<term>
OCR system
</term>
.
|
#2706
In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system. |
other,16-2-N03-1018,bq |
The
<term>
model
</term>
is designed for use in
<term>
error correction
</term>
, with a focus on
<term>
post-processing
</term>
the
<term>
output
</term>
of black-box
<term>
OCR systems
</term>
in order to make it more useful for
<term>
NLP tasks
</term>
.
|
#2728
The model is designed for use in error correction, with a focus on post-processing the output of black-box OCR systems in order to make it more useful for NLP tasks. |
tech,26-2-N03-1026,bq |
Our
<term>
system
</term>
incorporates a
<term>
linguistic parser/generator
</term>
for
<term>
LFG
</term>
, a
<term>
transfer component
</term>
for
<term>
parse reduction
</term>
operating on
<term>
packed parse forests
</term>
, and a
<term>
maximum-entropy model
</term>
for
<term>
stochastic
output
selection
</term>
.
|
#2835
Our system incorporates a linguistic parser/generator for LFG, a transfer component for parse reduction operating on packed parse forests, and a maximum-entropy model for stochastic output selection. |
other,15-5-N03-1026,bq |
Overall
<term>
summarization
</term>
quality of the proposed
<term>
system
</term>
is state-of-the-art , with guaranteed
<term>
grammaticality
</term>
of the
<term>
system
output
</term>
due to the use of a
<term>
constraint-based parser/generator
</term>
.
|
#2899
Overall summarization quality of the proposed system is state-of-the-art, with guaranteed grammaticality of the system output due to the use of a constraint-based parser/generator. |
|
We consider the case of
<term>
multi-document summarization
</term>
, where the input
<term>
documents
</term>
are in
<term>
Arabic
</term>
, and the
output
<term>
summary
</term>
is in
<term>
English
</term>
.
|
#7170
We consider the case of multi-document summarization, where the input documents are in Arabic, and the output summary is in English. |
measure(ment),6-3-H05-1117,bq |
The lack of automatic
<term>
methods
</term>
for
<term>
scoring system
output
</term>
is an impediment to progress in the field , which we address with this work .
|
#7578
The lack of automatic methods for scoring system output is an impediment to progress in the field, which we address with this work. |
other,30-2-H05-2007,bq |
We incorporate this analysis into a
<term>
diagnostic tool
</term>
intended for
<term>
developers
</term>
of
<term>
machine translation systems
</term>
, and demonstrate how our application can be used by
<term>
developers
</term>
to explore
<term>
patterns
</term>
in
<term>
machine translation
output
</term>
.
|
#7676
We incorporate this analysis into a diagnostic tool intended for developers of machine translation systems, and demonstrate how our application can be used by developers to explore patterns in machine translation output. |
|
The use of
<term>
BLEU
</term>
at the
<term>
character
</term>
level eliminates the
<term>
word segmentation problem
</term>
: it makes it possible to directly compare commercial
<term>
systems
</term>
outputting
<term>
unsegmented texts
</term>
with , for instance ,
<term>
statistical MT systems
</term>
which usually segment their
<term>
outputs
</term>
.
|
#7769
The use of BLEU at the character level eliminates the word segmentation problem: it makes it possible to directly compare commercial systems outputting unsegmented texts with, for instance, statistical MT systems which usually segment their outputs. |
other,38-4-I05-2014,bq |
The use of
<term>
BLEU
</term>
at the
<term>
character
</term>
level eliminates the
<term>
word segmentation problem
</term>
: it makes it possible to directly compare commercial
<term>
systems
</term>
outputting
<term>
unsegmented texts
</term>
with , for instance ,
<term>
statistical MT systems
</term>
which usually segment their
<term>
outputs
</term>
.
|
#7784
The use of BLEU at the character level eliminates the word segmentation problem: it makes it possible to directly compare commercial systems outputting unsegmented texts with, for instance, statistical MT systems which usually segment their outputs. |
other,11-3-I05-6011,bq |
This
<term>
referential information
</term>
is vital for resolving
<term>
zero pronouns
</term>
and improving
<term>
machine translation
outputs
</term>
.
|
#8614
This referential information is vital for resolving zero pronouns and improving machine translation outputs. |