lr,1-3-P03-1050,bq |
training resources
</term>
. No
<term>
parallel
|
text
|
</term>
is needed after the
<term>
training
|
#4478
No parallel text is needed after the training phase. |
tech,3-1-C04-1116,bq |
smaller and more robust . We present a
<term>
|
text
|
mining method
</term>
for finding
<term>
synonymous
|
#6095
We present a text mining method for finding synonymous expressions based on the distributional hypothesis in a set of coherent corpora. |
other,10-2-A88-1001,bq |
heuristically-produced complete
<term>
sentences
</term>
in
<term>
|
text
|
</term>
or
<term>
text-to-speech form
</term>
|
#14892
Multimedia answers include videodisc images and heuristically-produced complete sentences in text or text-to-speech form. |
other,12-4-P06-1013,bq |
are derived automatically from
<term>
raw
|
text
|
</term>
. Experiments using the
<term>
SemCor
|
#11022
Our combination methods rely on predominant senses which are derived automatically from raw text. |
other,24-1-A92-1027,bq |
specific information from
<term>
unrestricted
|
texts
|
</term>
where many of the
<term>
words
</term>
|
#17568
We present an efficient algorithm for chart-based phrase structure parsing of natural language that is tailored to the problem of extracting specific information from unrestricted texts where many of the words are unknown and much of the text is irrelevant to the task. |
other,31-1-N03-1018,bq |
progressing from generation of
<term>
true
|
text
|
</term>
through its transformation into the
|
#2699
In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system. |
other,13-1-P84-1078,bq |
system
</term>
designed to create
<term>
cohesive
|
text
|
</term>
through the use of
<term>
lexical substitutions
|
#13758
This report describes Paul, a computer text generation system designed to create cohesive text through the use of lexical substitutions. |
other,35-1-I05-4010,bq |
numbering system
</term>
in the
<term>
legal
|
text
|
hierarchy
</term>
. Basic methodology and
|
#8239
In this paper we present our recent work on harvesting English-Chinese bitexts of the laws of Hong Kong from the Web and aligning them to the subparagraph level via utilizing the numbering system in the legal text hierarchy. |
other,24-4-I05-2014,bq |
systems
</term>
outputting
<term>
unsegmented
|
texts
|
</term>
with , for instance ,
<term>
statistical
|
#7771
The use of BLEU at the character level eliminates the word segmentation problem: it makes it possible to directly compare commercial systems outputting unsegmented texts with, for instance, statistical MT systems which usually segment their outputs. |
tech,26-3-P04-2005,bq |
Sense Disambiguation ( WSD )
</term>
and
<term>
|
Text
|
Summarisation
</term>
. Our method takes
|
#6955
Topic signatures can be useful in a number of Natural Language Processing (NLP) applications, such as Word Sense Disambiguation (WSD) and Text Summarisation. |
other,3-3-C94-1026,bq |
proposed . We postulate that
<term>
source
|
texts
|
</term>
and
<term>
target texts
</term>
should
|
#20560
We postulate that source texts and target texts should share the same concepts, ideas, entities, and events. |
other,6-2-C88-1044,bq |
</term>
. We examine a broad range of
<term>
|
texts
|
</term>
to show how the distribution of
<term>
|
#15199
We examine a broad range of texts to show how the distribution of demonstrative forms and functions is genre dependent. |
other,14-4-C92-4207,bq |
spatial constraints
</term>
from the
<term>
|
text
|
</term>
, and represent them as the
<term>
|
#18468
To reconstruct the model, the authors extract the qualitative spatial constraints from the text, and represent them as the numerical constraints on the spatial attributes of the entities. |
|
papers in English , many systems to run off
|
texts
|
have been developed . In this paper , we
|
#12231
In order to meet the needs of a publication of papers in English, many systems to run off texts have been developed. |
tech,24-2-H94-1084,bq |
</term>
, which creates the data for a
<term>
|
text
|
retrieval application
</term>
and the
<term>
|
#21412
Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links. |
tech,25-1-H94-1084,bq |
<term>
image understanding
</term>
with
<term>
|
text
|
understanding
</term>
. Our
<term>
document
|
#21385
Because of the complexity of documents and the variety of applications which must be supported, document understanding requires the integration of image understanding with text understanding. |
other,24-1-N03-4010,bq |
answering capability
</term>
on
<term>
free
|
text
|
</term>
. The demonstration will focus on
|
#3660
The JAVELIN system integrates a flexible, planning-based architecture with a variety of language processing modules to provide an open-domain question answering capability on free text. |
other,13-2-N03-2003,bq |
data
</term>
can be supplemented with
<term>
|
text
|
</term>
from the
<term>
web
</term>
filtered
|
#3041
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. |
other,29-3-P84-1078,bq |
antecedence
</term>
of each element in the
<term>
|
text
|
</term>
to select the proper
<term>
substitutions
|
#13816
The system identifies a strength of antecedence recovery for each of the lexical substitutions, and matches them against the strength of potential antecedence of each element in the text to select the proper substitutions for these elements. |
other,6-2-P82-1035,bq |
, a great deal of
<term>
natural language
|
texts
|
</term>
e.g. ,
<term>
memos
</term>
, rough
<term>
|
#12982
However, a great deal of natural language texts e.g., memos, rough drafts, conversation transcripts etc., have features that differ significantly from neat texts, posing special problems for readers, such as misspelled words, missing words, poor syntactic construction, missing periods, etc. |