|
</term> to recover a <term> submanifold </term> of | data | from a <term> high dimensionality space </term> |
#11358
It works by calculating eigenvectors of an adjacency graph's Laplacian to recover a submanifold of data from a high dimensionality space and then performing cluster number estimation on the eigenvectors. |
|
Information System ) domain </term> . This | data | collection effort has been co-ordinated |
#18545
This data collection effort has been co-ordinated by MADCOW (Multi-site ATIS Data COllection Working group). |
|
Understanding System ) </term> , which creates the | data | for a <term> text retrieval application </term> |
#21409
Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links. |
lr,13-1-H05-1012,bq |
</term> based on <term> supervised training | data | </term> . We demonstrate that it is feasible |
#7264
This paper presents a maximum entropy word alignment algorithm for Arabic-English based on supervised training data. |
lr,13-4-P03-1033,bq |
learning </term> using real <term> dialogue | data | </term> collected by the <term> system </term> |
#4365
Moreover, the models are automatically derived by decision tree learning using real dialogue data collected by the system. |
lr,14-1-P03-1058,bq |
is the lack of <term> manually sense-tagged | data | </term> required for <term> supervised learning |
#4815
A central problem of word sense disambiguation (WSD) is the lack of manually sense-tagged data required for supervised learning. |
lr,15-1-N03-1001,bq |
manual transcription </term> of <term> training | data | </term> . The method combines <term> domain |
#2221
This paper describes a method for utterance classification that does not require manual transcription of training data. |
lr,17-5-H05-1095,bq |
better generalize from the <term> training | data | </term> . This paper investigates some <term> |
#7432
Experimental results are presented that demonstrate how the proposed method allows us to better generalize from the training data. |
lr,19-2-P01-1004,bq |
both <term> character - and word-segmented | data | </term> , in combination with a range of <term> |
#1511
We take a selection of both bag-of-words and segment order-sensitive string comparison methods, and run each over both character- and word-segmented data, in combination with a range of local segment contiguity models (in the form of N-grams). |
lr,2-1-N03-2003,bq |
</term> result . Sources of <term> training | data | </term> suitable for <term> language modeling |
#3017
Sources of training data suitable for language modeling of conversational speech are limited. |
lr,27-3-H90-1060,bq |
the usual pooling of all the <term> speech | data | </term> from many <term> speakers </term> prior |
#17062
In addition, combination of the training speakers is done by averaging the statistics of independently trained models rather than the usual pooling of all the speech data from many speakers prior to training. |
lr,6-3-J05-4003,bq |
approach </term> , we extract <term> parallel | data | </term> from large <term> Chinese , Arabic |
#9032
Using this approach, we extract parallel data from large Chinese, Arabic, and English non-parallel newspaper corpora. |
lr,7-4-N03-1012,bq |
<term> system </term> against the <term> annotated | data | </term> shows that , it successfully classifies |
#2509
An evaluation of our system against the annotated data shows that it successfully classifies 73.2% in a German corpus of 2.284 SRHs as either coherent or incoherent (given a baseline of 54.55%). |
lr,8-6-N01-1003,bq |
automatically learned from <term> training | data | </term> . We show that the trained <term> SPR |
#1431
The SPR uses ranking rules automatically learned from training data. |
lr,9-1-H05-2007,bq |
<term> patterns </term> in <term> translation | data | </term> using <term> part-of-speech tag sequences |
#7638
We describe a method for identifying systematic patterns in translation data using part-of-speech tag sequences. |
lr-prod,5-5-P06-1013,bq |
the <term> SemCor </term> and <term> Senseval-3 | data | sets </term> demonstrate that our ensembles |
#11030
Experiments using the SemCor and Senseval-3 data sets demonstrate that our ensembles yield significantly better results when compared with state-of-the-art. |
lr-prod,6-4-C04-1112,bq |
</term> on the <term> Dutch SENSEVAL-2 test | data | </term> , we achieve a significant increase |
#6064
Testing the lemma-based model on the Dutch SENSEVAL-2 test data, we achieve a significant increase in accuracy over the wordform model. |
other,13-1-P03-1005,bq |
</term> for <term> structured natural language | data | </term> . The <term> HDAG Kernel </term> directly |
#3805
This paper proposes the Hierarchical Directed Acyclic Graph (HDAG) Kernel for structured natural language data. |
other,13-1-P05-1067,bq |
statistical models </term> to <term> structured | data | </term> . In this paper , we present a <term> |
#9422
Syntax-based statistical machine translation (MT) aims at applying statistical models to structured data. |
other,15-1-C86-1132,bq |
forecasts directly from <term> formatted weather | data | </term> . Such <term> synthesis </term> appears |
#13931
This paper describes a system (RAREAS) which synthesizes marine weather forecasts directly from formatted weather data. Such synthesis appears feasible in certain natural sublanguages with stereotyped text structure. |