tech,9-1-H01-1049,bq |
paradigm for
<term>
human interaction with
|
data
|
sources
</term>
. We integrate a
<term>
spoken
|
#792
Listen-Communicate-Show (LCS) is a new paradigm for human interaction with data sources. |
lr,8-6-N01-1003,bq |
automatically learned from
<term>
training
|
data
|
</term>
. We show that the trained
<term>
SPR
|
#1431
The SPR uses ranking rules automatically learned from training data. |
lr,19-2-P01-1004,bq |
both
<term>
character - and word-segmented
|
data
|
</term>
, in combination with a range of
<term>
|
#1511
We take a selection of both bag-of-words and segment order-sensitive string comparison methods, and run each over both character- and word-segmented data, in combination with a range of local segment contiguity models (in the form of N-grams). |
other,34-2-P01-1047,bq |
learning algorithm
</term>
from
<term>
structured
|
data
|
</term>
( based on a
<term>
typing-algorithm
|
#1981
Our logical definition leads to a neat relation to categorial grammar (yielding a treatment of Montague semantics), a parsing-as-deduction in a resource sensitive logic, and a learning algorithm from structured data (based on a typing-algorithm and type-unification).
lr,15-1-N03-1001,bq |
manual transcription
</term>
of
<term>
training
|
data
|
</term>
. The method combines
<term>
domain
|
#2221
This paper describes a method for utterance classification that does not require manual transcription of training data. |
lr,7-4-N03-1012,bq |
<term>
system
</term>
against the
<term>
annotated
|
data
|
</term>
shows that it successfully classifies
|
#2509
An evaluation of our system against the annotated data shows that it successfully classifies 73.2% in a German corpus of 2,284 SRHs as either coherent or incoherent (given a baseline of 54.55%).
lr,2-1-N03-2003,bq |
</term>
result . Sources of
<term>
training
|
data
|
</term>
suitable for
<term>
language modeling
|
#3017
Sources of training data suitable for language modeling of conversational speech are limited. |
lr,7-2-N03-2003,bq |
In this paper , we show how
<term>
training
|
data
|
</term>
can be supplemented with
<term>
text
|
#3036
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. |
lr,43-2-N03-2003,bq |
bigger performance gains from the
<term>
|
data
|
</term>
by using
<term>
class-dependent interpolation
|
#3071
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.
other,27-1-N03-4004,bq |
inflow of
<term>
multilingual , multimedia
|
data
|
</term>
. It gives users the ability to spend
|
#3602
The TAP-XL Automated Analyst's Assistant is an application designed to help an English-speaking analyst write a topical report, culling information from a large inflow of multilingual, multimedia data. |
|
ability to spend their time finding more
|
data
|
relevant to their task , and gives them
|
#3615
It gives users the ability to spend their time finding more data relevant to their task, and gives them translingual reach into other languages by leveraging human language technology. |
other,15-3-N03-4010,bq |
browsing the
<term>
repository
</term>
of
<term>
|
data
|
objects
</term>
created by the
<term>
system
|
#3699
The operation of the system will be explained in depth through browsing the repository of data objects created by the system during each question answering session.
other,13-1-P03-1005,bq |
</term>
for
<term>
structured natural language
|
data
|
</term>
. The
<term>
HDAG Kernel
</term>
directly
|
#3805
This paper proposes the Hierarchical Directed Acyclic Graph (HDAG) Kernel for structured natural language data. |
other,15-1-P03-1009,bq |
classes
</term>
from undisambiguated
<term>
corpus
|
data
|
</term>
. We describe a new approach which
|
#3900
Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. |
other,30-4-P03-1009,bq |
classifying
</term><term>
undisambiguated SCF
|
data
|
</term>
. We apply a
<term>
decision tree based
|
#3971
A novel evaluation scheme is proposed which accounts for the effect of polysemy on the clusters, offering us a good insight into the potential and limitations of semantically classifying undisambiguated SCF data. |
lr,13-4-P03-1033,bq |
learning
</term>
using real
<term>
dialogue
|
data
|
</term>
collected by the
<term>
system
</term>
|
#4365
Moreover, the models are automatically derived by decision tree learning using real dialogue data collected by the system. |
lr,14-1-P03-1058,bq |
is the lack of
<term>
manually sense-tagged
|
data
|
</term>
required for
<term>
supervised learning
|
#4815
A central problem of word sense disambiguation (WSD) is the lack of manually sense-tagged data required for supervised learning. |
lr,11-2-P03-1058,bq |
automatically acquire
<term>
sense-tagged training
|
data
|
</term>
from
<term>
English-Chinese parallel
|
#4834
In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task. |
tech,5-3-P03-1058,bq |
this
<term>
method of acquiring sense-tagged
|
data
|
</term>
is promising . On a subset of the
|
#4865
Our investigation reveals that this method of acquiring sense-tagged data is promising. |
lr,37-4-P03-1058,bq |
advantage that
<term>
manually sense-tagged
|
data
|
</term>
have in their
<term>
sense coverage
|
#4908
On a subset of the most difficult SENSEVAL-2 nouns, the accuracy difference between the two approaches is only 14.0%, and the difference could narrow further to 6.5% if we disregard the advantage that manually sense-tagged data have in their sense coverage. |