tech,21-3-H94-1084,bq |
<term>
integration
</term>
of
<term>
image and
|
text
|
understanding
</term>
.
|
#21446
This paper summarizes the areas of research during IDUS development where we have found the most benefit from the integration of image and text understanding. |
other,29-3-P84-1078,bq |
antecedence
</term>
of each element in the
<term>
|
text
|
</term>
to select the proper
<term>
substitutions
|
#13816
The system identifies a strength of antecedence recovery for each of the lexical substitutions, and matches them against the strength of potential antecedence of each element in the text to select the proper substitutions for these elements. |
lr,19-2-N03-4010,bq |
answer candidates
</term>
from the given
<term>
|
text
|
corpus
</term>
. The operation of the
<term>
|
#3681
The demonstration will focus on how JAVELIN processes questions and retrieves the most likely answer candidates from the given text corpus. |
other,13-1-P82-1035,bq |
under the assumption that the input
<term>
|
text
|
</term>
will be in reasonably neat form ,
|
#12957
Most large text-understanding systems have been designed under the assumption that the input text will be in reasonably neat form, e.g., newspaper stories and other edited texts. |
other,20-2-P01-1008,bq |
translations
</term>
of the same
<term>
source
|
text
|
</term>
. Our approach yields
<term>
phrasal
|
#1798
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. |
other,16-7-P03-1050,bq |
average precision
</term>
over
<term>
unstemmed
|
text
|
</term>
, and 96 % of the performance of
|
#4586
Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above. |
other,35-1-I05-4010,bq |
numbering system
</term>
in the
<term>
legal
|
text
|
hierarchy
</term>
. Basic methodology and
|
#8239
In this paper we present our recent work on harvesting English-Chinese bitexts of the laws of Hong Kong from the Web and aligning them to the subparagraph level via utilizing the numbering system in the legal text hierarchy. |
other,13-2-N03-2003,bq |
data
</term>
can be supplemented with
<term>
|
text
|
</term>
from the
<term>
web
</term>
filtered
|
#3041
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. |
other,28-1-C86-1132,bq |
sublanguages
</term>
with
<term>
stereotyped
|
text
|
structure
</term>
.
<term>
RAREAS
</term>
draws
|
#13943
This paper describes a system (RAREAS) which synthesizes marine weather forecasts directly from formatted weather data. Such synthesis appears feasible in certain natural sublanguages with stereotyped text structure. |
other,31-1-H01-1040,bq |
- can be used to enhance access to
<term>
|
text
|
collections
</term>
via a standard
<term>
text
|
#305
In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser. |
lr,0-4-P03-1050,bq |
phase
</term>
.
<term>
Monolingual , unannotated
|
text
|
</term>
can be used to further improve the
|
#4489
Monolingual, unannotated text can be used to further improve the stemmer by allowing it to adapt to a desired domain or genre. |
other,12-4-P06-1013,bq |
are derived automatically from
<term>
raw
|
text
|
</term>
. Experiments using the
<term>
SemCor
|
#11022
Our combination methods rely on predominant senses which are derived automatically from raw text. |
other,11-7-H01-1042,bq |
six extracts of
<term>
translated newswire
|
text
|
</term>
. Some of the extracts were
<term>
|
#695
Subjects were given a set of up to six extracts of translated newswire text. |
tech,3-1-C04-1116,bq |
smaller and more robust . We present a
<term>
|
text
|
mining method
</term>
for finding
<term>
synonymous
|
#6095
We present a text mining method for finding synonymous expressions based on the distributional hypothesis in a set of coherent corpora. |
tech,17-1-H92-1095,bq |
spoken language understanding
</term>
,
<term>
|
text
|
understanding
</term>
, and
<term>
document
|
#19654
Language understanding work at Paramax focuses on applying general-purpose language understanding technology to spoken language understanding, text understanding, and document processing, integrating language understanding with speech recognition, knowledge-based information retrieval and image understanding. |
|
aggregation system
</term>
using each author 's
|
text
|
as a coherent
<term>
corpus
</term>
. Our approach
|
#6133
This paper proposes a new methodology to improve the accuracy of a term aggregation system using each author's text as a coherent corpus. |
tech,36-1-H01-1040,bq |
text collections
</term>
via a standard
<term>
|
text
|
browser
</term>
. We describe how this information
|
#310
In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser. |
tech,8-1-C90-3072,bq |
have become an integral part of most
<term>
|
text
|
processing software
</term>
. From different
|
#16730
Spelling-checkers have become an integral part of most text processing software. |
other,31-1-N03-1018,bq |
progressing from generation of
<term>
true
|
text
|
</term>
through its transformation into the
|
#2699
In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system. |
lr,20-3-I05-4010,bq |
an authoritative and comprehensive
<term>
|
text
|
collection
</term>
covering the specific
|
#8272
The resultant bilingual corpus, 10.4M English words and 18.3M Chinese characters, is an authoritative and comprehensive text collection covering the specific and special domain of HK laws. |