lr,1-3-P03-1050,bq |
training resources
</term>
. No
<term>
parallel
|
text
|
</term>
is needed after the
<term>
training
|
#4478
No parallel text is needed after the training phase. |
other,37-3-N03-1018,bq |
translation lexicons
</term>
from
<term>
printed
|
text
|
</term>
. We present an application of
<term>
|
#2782
We present an implementation of the model based on finite-state models, demonstrate the model's ability to significantly reduce character and word error rate, and provide evaluation results involving automatic extraction of translation lexicons from printed text. |
other,12-4-P06-1013,bq |
are derived automatically from
<term>
raw
|
text
|
</term>
. Experiments using the
<term>
SemCor
|
#11022
Our combination methods rely on predominant senses which are derived automatically from raw text. |
|
aggregation system
</term>
using each author 's
|
text
|
as a coherent
<term>
corpus
</term>
. Our approach
|
#6133
This paper proposes a new methodology to improve the accuracy of a term aggregation system using each author's text as a coherent corpus. |
other,20-2-P01-1008,bq |
translations
</term>
of the same
<term>
source
|
text
|
</term>
. Our approach yields
<term>
phrasal
|
#1798
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. |
tech,36-1-H01-1040,bq |
text collections
</term>
via a standard
<term>
|
text
|
browser
</term>
. We describe how this information
|
#310
In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser. |
other,28-1-C86-1132,bq |
sublanguages
</term>
with
<term>
stereotyped
|
text
|
structure
</term>
.
<term>
RAREAS
</term>
draws
|
#13943
This paper describes a system (RAREAS) which synthesizes marine weather forecasts directly from formatted weather data. Such synthesis appears feasible in certain natural sublanguages with stereotyped text structure. |
other,14-4-C92-4207,bq |
spatial constraints
</term>
from the
<term>
|
text
|
</term>
, and represent them as the
<term>
|
#18468
To reconstruct the model, the authors extract the qualitative spatial constraints from the text, and represent them as the numerical constraints on the spatial attributes of the entities. |
other,29-3-P84-1078,bq |
antecedence
</term>
of each element in the
<term>
|
text
|
</term>
to select the proper
<term>
substitutions
|
#13816
The system identifies a strength of antecedence recovery for each of the lexical substitutions, and matches them against the strength of potential antecedence of each element in the text to select the proper substitutions for these elements. |
other,17-1-A94-1026,bq |
conversion
</term>
needed to input the
<term>
|
text
|
</term>
. It is critical , therefore , for
|
#20383
Japanese texts frequently suffer from the homophone errors caused by the KANA-KANJI conversion needed to input the text. |
other,37-1-A92-1027,bq |
</term>
are unknown and much of the
<term>
|
text
|
</term>
is irrelevant to the task . The
<term>
|
#17580
We present an efficient algorithm for chart-based phrase structure parsing of natural language that is tailored to the problem of extracting specific information from unrestricted texts where many of the words are unknown and much of the text is irrelevant to the task. |
other,31-1-H01-1040,bq |
- can be used to enhance access to
<term>
|
text
|
collections
</term>
via a standard
<term>
text
|
#305
In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser. |
other,31-1-N03-1018,bq |
progressing from generation of
<term>
true
|
text
|
</term>
through its transformation into the
|
#2699
In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system. |
lr,0-4-P03-1050,bq |
phase
</term>
.
<term>
Monolingual , unannotated
|
text
|
</term>
can be used to further improve the
|
#4489
Monolingual, unannotated text can be used to further improve the stemmer by allowing it to adapt to a desired domain or genre. |
lr,11-4-P04-2010,bq |
<term>
pronouns
</term>
in
<term>
unannotated
|
text
|
</term>
by using a fully automatic sequence
|
#7081
Furthermore, we present a standalone system that resolves pronouns in unannotated text by using a fully automatic sequence of preprocessing modules that mimics the manual annotation process. |
other,16-7-P03-1050,bq |
average precision
</term>
over
<term>
unstemmed
|
text
|
</term>
, and 96 % of the performance of
|
#4586
Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above. |
other,13-2-N03-2003,bq |
data
</term>
can be supplemented with
<term>
|
text
|
</term>
from the
<term>
web
</term>
filtered
|
#3041
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. |
tech,25-1-H94-1084,bq |
<term>
image understanding
</term>
with
<term>
|
text
|
understanding
</term>
. Our
<term>
document
|
#21385
Because of the complexity of documents and the variety of applications which must be supported, document understanding requires the integration of image understanding with text understanding. |