lr,19-2-P01-1004,bq |
both
<term>
character - and word-segmented
|
data
|
</term>
, in combination with a range of
<term>
|
#1511
We take a selection of both bag-of-words and segment order-sensitive string comparison methods, and run each over both character- and word-segmented data, in combination with a range of local segment contiguity models (in the form of N-grams). |
lr-prod,6-4-C04-1112,bq |
</term>
on the
<term>
Dutch SENSEVAL-2 test
|
data
|
</term>
, we achieve a significant increase
|
#6064
Testing the lemma-based model on the Dutch SENSEVAL-2 test data, we achieve a significant increase in accuracy over the wordform model. |
other,23-9-J05-1003,bq |
feature space
</term>
in the
<term>
parsing
|
data
|
</term>
. Experiments show significant efficiency
|
#8885
The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data. |
other,13-1-P05-1067,bq |
statistical models
</term>
to
<term>
structured
|
data
|
</term>
. In this paper , we present a
<term>
|
#9422
Syntax-based statistical machine translation (MT) aims at applying statistical models to structured data. |
other,27-1-N03-4004,bq |
inflow of
<term>
multilingual , multimedia
|
data
|
</term>
. It gives users the ability to spend
|
#3602
The TAP-XL Automated Analyst's Assistant is an application designed to help an English-speaking analyst write a topical report, culling information from a large inflow of multilingual, multimedia data. |
other,15-1-C86-1132,bq |
forecasts directly from
<term>
formatted weather
|
data
|
</term>
. Such
<term>
synthesis
</term>
appears
|
#13931
This paper describes a system (RAREAS) which synthesizes marine weather forecasts directly from formatted weather data. Such synthesis appears feasible in certain natural sublanguages with stereotyped text structure. |
other,13-1-P03-1005,bq |
</term>
for
<term>
structured natural language
|
data
|
</term>
. The
<term>
HDAG Kernel
</term>
directly
|
#3805
This paper proposes the Hierarchical Directed Acyclic Graph (HDAG) Kernel for structured natural language data. |
lr,15-1-N03-1001,bq |
manual transcription
</term>
of
<term>
training
|
data
|
</term>
. The method combines
<term>
domain
|
#2221
This paper describes a method for utterance classification that does not require manual transcription of training data. |
lr,17-5-H05-1095,bq |
better generalize from the
<term>
training
|
data
|
</term>
. This paper investigates some
<term>
|
#7432
Experimental results are presented, that demonstrate how the proposed method allows to better generalize from the training data. |
other,30-4-P03-1009,bq |
classifying
</term><term>
undisambiguated SCF
|
data
|
</term>
. We apply a
<term>
decision tree based
|
#3971
A novel evaluation scheme is proposed which accounts for the effect of polysemy on the clusters, offering us a good insight into the potential and limitations of semantically classifying undisambiguated SCF data. |
lr,13-1-H05-1012,bq |
</term>
based on
<term>
supervised training
|
data
|
</term>
. We demonstrate that it is feasible
|
#7264
This paper presents a maximum entropy word alignment algorithm for Arabic-English based on supervised training data. |
other,15-1-P03-1009,bq |
classes
</term>
from undisambiguated
<term>
corpus
|
data
|
</term>
. We describe a new approach which
|
#3900
Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. |
lr,8-6-N01-1003,bq |
automatically learned from
<term>
training
|
data
|
</term>
. We show that the trained
<term>
SPR
|
#1431
The SPR uses ranking rules automatically learned from training data. |
other,34-2-P01-1047,bq |
learning algorithm
</term>
from
<term>
structured
|
data
|
</term>
( based on a
<term>
typing-algorithm
|
#1981
Our logical definition leads to a neat relation to categorial grammar, (yielding a treatment of Montague semantics), a parsing-as-deduction in a resource sensitive logic, and a learning algorithm from structured data (based on a typing-algorithm and type-unification). |
other,5-2-C92-1055,bq |
the problem of
<term>
insufficient training
|
data
|
</term>
and
<term>
approximation error
</term>
|
#17828
Owing to the problem of insufficient training data and approximation error introduced by the language model, traditional statistical approaches, which resolve ambiguities by indirectly and implicitly using maximum likelihood method, fail to achieve high performance in real applications. |
measure(ment),3-4-J05-4003,bq |
evaluate the
<term>
quality of the extracted
|
data
|
</term>
by showing that it improves the performance
|
#9052
We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system. |
lr,43-2-N03-2003,bq |
bigger performance gains from the
<term>
|
data
|
</term>
by using
<term>
class-dependent interpolation
|
#3071
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. |
lr,7-2-N03-2003,bq |
In this paper , we show how
<term>
training
|
data
|
</term>
can be supplemented with
<term>
text
|
#3036
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. |
lr,13-4-P03-1033,bq |
learning
</term>
using real
<term>
dialogue
|
data
|
</term>
collected by the
<term>
system
</term>
|
#4365
Moreover, the models are automatically derived by decision tree learning using real dialogue data collected by the system. |
|
Information System ) domain
</term>
. This
|
data
|
collection effort has been co-ordinated
|
#18545
This data collection effort has been co-ordinated by MADCOW (Multi-site ATIS Data COllection Working group). |