#3615
It gives users the ability to spend their time finding more data relevant to their task, and gives them translingual reach into other languages by leveraging human language technology.

#11358
It works by calculating eigenvectors of an adjacency graph's Laplacian to recover a submanifold of data from a high dimensionality space and then performing cluster number estimation on the eigenvectors.
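Sentence #11358 above describes a spectral method: embed the data via eigenvectors of a graph Laplacian, then estimate the cluster count from those eigenvectors. A minimal numpy sketch of the embedding step, using the unnormalized Laplacian and a toy adjacency matrix (both are illustrative assumptions, not details taken from the paper):

```python
import numpy as np

def spectral_embedding(W, k):
    """Embed points into k dimensions using eigenvectors of the
    (unnormalized) graph Laplacian L = D - W of adjacency matrix W."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    # eigh returns eigenvalues in ascending order for symmetric matrices
    vals, vecs = np.linalg.eigh(L)
    # skip the trivial constant eigenvector; keep the next k
    return vecs[:, 1:k + 1]

# Two obvious clusters: {0, 1} and {2, 3}, connected only within groups.
W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
emb = spectral_embedding(W, 1)
# Points in the same connected component get identical coordinates,
# which is what a downstream cluster-number estimator exploits.
```

Real systems typically use a normalized Laplacian and a k-NN or kernel adjacency graph; the disconnected toy graph just makes the block structure visible in the eigenvectors.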

#18545
This data collection effort has been co-ordinated by MADCOW (Multi-site ATIS Data COllection Working group).

#21409
Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links.

lr,11-2-P03-1058,bq
#4834
In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task.
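Sentence #4834 rests on the observation that different senses of an English word are often rendered by different Chinese words, so a word-aligned parallel corpus can label English occurrences for free. A toy sketch of that projection step; the lexicon, sense labels, and aligned pairs below are all invented for illustration:

```python
# Hypothetical sense lexicon: each Chinese translation signals one
# sense of the English word (labels are made up for this sketch).
sense_of_translation = {
    "银行": "bank%financial",
    "河岸": "bank%river",
}

def tag_from_alignment(aligned_pairs):
    """Label each English occurrence with the sense implied by the
    Chinese word it is aligned to (None if the translation is unknown)."""
    return [(en, sense_of_translation.get(zh)) for en, zh in aligned_pairs]

# Word-aligned (English, Chinese) pairs from a parallel corpus.
pairs = [("bank", "银行"), ("bank", "河岸")]
tagged = tag_from_alignment(pairs)
```

In a real pipeline the alignments come from a statistical word aligner run over the parallel corpora, and grouping translations into senses requires some lexicographic effort up front.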

lr,13-1-H05-1012,bq
#7264
This paper presents a maximum entropy word alignment algorithm for Arabic-English based on supervised training data.

lr,13-4-P03-1033,bq
#4365
Moreover, the models are automatically derived by decision tree learning using real dialogue data collected by the system.
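Sentence #4365 relies on decision tree learning, whose core step is picking the attribute whose split most reduces label entropy. A self-contained sketch of that information-gain computation; the dialogue features and labels are invented, not taken from the paper:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, feature):
    """Entropy reduction from splitting the examples on one feature."""
    base = entropy(labels)
    by_value = {}
    for ex, y in zip(examples, labels):
        by_value.setdefault(ex[feature], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in by_value.values())
    return base - remainder

# Toy dialogue turns: should the system confirm or proceed?
examples = [
    {"asr_confidence": "low", "turn": "first"},
    {"asr_confidence": "low", "turn": "later"},
    {"asr_confidence": "high", "turn": "first"},
    {"asr_confidence": "high", "turn": "later"},
]
labels = ["confirm", "confirm", "proceed", "proceed"]
gain = information_gain(examples, labels, "asr_confidence")
# "asr_confidence" separates the labels perfectly (gain 1 bit);
# "turn" tells us nothing (gain 0), so the tree splits on confidence first.
```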

lr,14-1-P03-1058,bq
#4815
A central problem of word sense disambiguation (WSD) is the lack of manually sense-tagged data required for supervised learning.

lr,15-1-N03-1001,bq
#2221
This paper describes a method for utterance classification that does not require manual transcription of training data.

lr,17-5-H05-1095,bq
#7432
Experimental results are presented that demonstrate how the proposed method allows better generalization from the training data.

lr,19-2-P01-1004,bq
#1511
We take a selection of both bag-of-words and segment order-sensitive string comparison methods, and run each over both character- and word-segmented data, in combination with a range of local segment contiguity models (in the form of N-grams).
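Sentence #1511 compares string-similarity methods run over character-segmented versus word-segmented text. One common bag-of-segments method is the Dice coefficient over segment n-grams; a minimal sketch, with the Japanese strings and the hand-supplied word segmentation invented for illustration:

```python
def ngrams(seq, n):
    """Set of n-grams (as tuples) over a sequence of segments."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def dice(a, b):
    """Dice coefficient between two segment-n-gram sets."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

s1, s2 = "国立国会図書館", "国会図書館"
# Character-segmented comparison: bigrams over raw characters.
char_sim = dice(ngrams(list(s1), 2), ngrams(list(s2), 2))
# Word-segmented comparison: unigrams over (here, hand-segmented) words.
w1, w2 = ["国立", "国会", "図書館"], ["国会", "図書館"]
word_sim = dice(ngrams(w1, 1), ngrams(w2, 1))
```

Character segmentation needs no segmenter, which matters for unsegmented scripts like Japanese; word segmentation gives coarser but more linguistically meaningful units.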

lr,2-1-N03-2003,bq
#3017
Sources of training data suitable for language modeling of conversational speech are limited.

lr,27-3-H90-1060,bq
#17062
In addition, combination of the training speakers is done by averaging the statistics of independently trained models rather than the usual pooling of all the speech data from many speakers prior to training.

lr,37-4-P03-1058,bq
#4908
On a subset of the most difficult SENSEVAL-2 nouns, the accuracy difference between the two approaches is only 14.0%, and the difference could narrow further to 6.5% if we disregard the advantage that manually sense-tagged data have in their sense coverage.

lr,43-2-N03-2003,bq
#3071
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.
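Sentence #3071 mentions interpolation of N-gram models as the way to combine in-domain training data with filtered web text. A minimal sketch of linear interpolation between two unigram models; the toy sentences and the weight 0.8 are invented, and in the class-dependent variant a separate weight would be tuned per word class on held-out data:

```python
from collections import Counter

def unigram_probs(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interpolate(p_in_domain, p_web, lam):
    """Linear interpolation: p(w) = lam * p_in(w) + (1 - lam) * p_web(w)."""
    vocab = set(p_in_domain) | set(p_web)
    return {w: lam * p_in_domain.get(w, 0.0) + (1 - lam) * p_web.get(w, 0.0)
            for w in vocab}

in_domain = unigram_probs("flights from boston to denver".split())
web = unigram_probs("cheap flights and hotel deals on the web".split())

# Trust the in-domain model more, but let web text fill coverage gaps.
p = interpolate(in_domain, web, lam=0.8)
```

Because both component models sum to one and the weights sum to one, the interpolated model is again a proper distribution; the same recipe applies per-history to bigram and trigram models.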

lr,6-3-J05-4003,bq
#9032
Using this approach, we extract parallel data from large Chinese, Arabic, and English non-parallel newspaper corpora.

lr,7-2-N03-2003,bq
#3036
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.

lr,7-4-N03-1012,bq
#2509
An evaluation of our system against the annotated data shows that it successfully classifies 73.2% in a German corpus of 2.284 SRHs as either coherent or incoherent (given a baseline of 54.55%).

lr,8-6-N01-1003,bq
#1431
The SPR uses ranking rules automatically learned from training data.

lr,9-1-H05-2007,bq
#7638
We describe a method for identifying systematic patterns in translation data using part-of-speech tag sequences.
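Sentence #7638 looks for recurring part-of-speech tag sequences in translation data. The published method is more involved, but its starting point can be sketched as frequency counting over tag n-grams; the tag sequences and thresholds below are invented for illustration:

```python
from collections import Counter

def pos_ngram_patterns(tagged_sentences, n=2, min_count=2):
    """Count POS-tag n-grams across sentences; frequent ones are
    candidate systematic patterns."""
    counts = Counter()
    for tags in tagged_sentences:
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    return [(pat, c) for pat, c in counts.most_common() if c >= min_count]

sentences = [
    ["DT", "NN", "VBZ", "DT", "NN"],
    ["DT", "JJ", "NN", "VBZ"],
    ["PRP", "VBZ", "DT", "NN"],
]
patterns = pos_ngram_patterns(sentences, n=2)
# ("DT", "NN") recurs across sentences, so it surfaces first.
```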