other,16-7-P03-1050,bq |
average precision
</term>
over
<term>
unstemmed
|
text
|
</term>
, and 96 % of the performance of
|
#4586
Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above. |
other,14-4-C92-4207,bq |
spatial constraints
</term>
from the
<term>
|
text
|
</term>
, and represent them as the
<term>
|
#18468
To reconstruct the model, the authors extract the qualitative spatial constraints from the text, and represent them as the numerical constraints on the spatial attributes of the entities. |
lr,26-6-P03-1050,bq |
affix lists
</term>
, and
<term>
human annotated
|
text
|
</term>
, in addition to an
<term>
unsupervised
|
#4560
Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component. |
other,26-2-P82-1035,bq |
that differ significantly from
<term>
neat
|
texts
|
</term>
, posing special problems for readers
|
#13001
However, a great deal of natural language texts e.g., memos, rough drafts, conversation transcripts etc., have features that differ significantly from neat texts, posing special problems for readers, such as misspelled words, missing words, poor syntactic construction, missing periods, etc. |
other,7-6-C94-1026,bq |
experimental objects are
<term>
Chinese-English
|
texts
|
</term>
, which are selected from different
|
#20603
Most importantly, the experimental objects are Chinese-English texts, which are selected from different language families. |
other,8-3-C86-1132,bq |
synthesize
<term>
bilingual or multilingual
|
texts
|
</term>
. A method for
<term>
error correction
|
#13986
The approach can easily be adapted to synthesize bilingual or multilingual texts. |
other,12-4-P06-1013,bq |
are derived automatically from
<term>
raw
|
text
|
</term>
. Experiments using the
<term>
SemCor
|
#11022
Our combination methods rely on predominant senses which are derived automatically from raw text. |
other,27-1-P82-1035,bq |
newspaper stories
</term>
and other
<term>
edited
|
texts
|
</term>
. However , a great deal of
<term>
|
#12972
Most large text-understanding systems have been designed under the assumption that the input text will be in reasonably neat form, e.g., newspaper stories and other edited texts. |
other,17-1-A94-1026,bq |
conversion
</term>
needed to input the
<term>
|
text
|
</term>
. It is critical , therefore , for
|
#20383
Japanese texts frequently suffer from the homophone errors caused by the KANA-KANJI conversion needed to input the text. |
other,20-2-P01-1008,bq |
translations
</term>
of the same
<term>
source
|
text
|
</term>
. Our approach yields
<term>
phrasal
|
#1798
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. |
other,11-7-H01-1042,bq |
six extracts of
<term>
translated newswire
|
text
|
</term>
. Some of the extracts were
<term>
|
#695
Subjects were given a set of up to six extracts of translated newswire text. |
other,24-1-N03-4010,bq |
answering capability
</term>
on
<term>
free
|
text
|
</term>
. The demonstration will focus on
|
#3660
The JAVELIN system integrates a flexible, planning-based architecture with a variety of language processing modules to provide an open-domain question answering capability on free text. |
other,37-3-N03-1018,bq |
translation lexicons
</term>
from
<term>
printed
|
text
|
</term>
. We present an application of
<term>
|
#2782
We present an implementation of the model based on finite-state models, demonstrate the model's ability to significantly reduce character and word error rate, and provide evaluation results involving automatic extraction of translation lexicons from printed text. |
other,12-3-C92-4207,bq |
</term>
, which takes
<term>
natural language
|
texts
|
</term>
and produces a
<term>
model
</term>
of
|
#18444
It is done by an experimental computer program SPRINT, which takes natural language texts and produces a model of the described world. |
other,3-3-C94-1026,bq |
proposed . We postulate that
<term>
source
|
texts
|
</term>
and
<term>
target texts
</term>
should
|
#20560
We postulate that source texts and target texts should share the same concepts, ideas, entities, and events. |
|
aggregation system
</term>
using each author 's
|
text
|
as a coherent
<term>
corpus
</term>
. Our approach
|
#6133
This paper proposes a new methodology to improve the accuracy of a term aggregation system using each author's text as a coherent corpus. |
other,26-4-P04-2005,bq |
exploits the large amount of
<term>
Chinese
|
text
|
</term>
available in
<term>
corpora
</term>
and
|
#6985
Our method takes advantage of the different way in which word senses are lexicalised in English and Chinese, and also exploits the large amount of Chinese text available in corpora and on the Web. |
other,2-1-C94-1026,bq |
homophone errors
</term>
. To align
<term>
bilingual
|
texts
|
</term>
becomes a crucial issue recently
|
#20535
To align bilingual texts becomes a crucial issue recently. |
tech,36-1-H01-1040,bq |
text collections
</term>
via a standard
<term>
|
text
|
browser
</term>
. We describe how this information
|
#310
In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser. |
tech,38-3-H01-1040,bq |
increased potential of
<term>
IE-enhanced
|
text
|
browsers
</term>
. At MIT Lincoln Laboratory
|
#383
We also report results of a preliminary, qualitative user evaluation of the system, which while broadly positive indicates further work needs to be done on the interface to make users aware of the increased potential of IE-enhanced text browsers. |