|
database
</term>
and detect those automatically
|
which
|
is shown on a large
<term>
database
</term>
|
#152
Several extensions of this basic idea are being discussed and/or evaluated: Similar to activities one can define subsets of larger database and detect those automatically which is shown on a large database of TV shows. |
|
mixed-initiative speech dialogue interactions
</term>
|
which
|
reach beyond current capabilities in
<term>
|
#218
To support engaging human users in robust, mixed-initiative speech dialogue interactions which reach beyond current capabilities in dialogue systems, the DARPA Communicator program [1] is funding the development of a distributed message-passing infrastructure for dialogue systems which all Communicator participants are using. |
|
evaluation
</term>
of the
<term>
system
</term>
,
|
which
|
while broadly positive indicates further
|
#359
We also report results of a preliminary, qualitative user evaluation of the system, which while broadly positive indicates further work needs to be done on the interface to make users aware of the increased potential of IE-enhanced text browsers. |
|
Even more illuminating was the factors on
|
which
|
the
<term>
assessors
</term>
made their decisions
|
#655
Even more illuminating was the factors on which the assessors made their decisions. |
|
inter-related but distinct tasks , one of
|
which
|
is
<term>
sentence scoping
</term>
, i.e. the
|
#1306
Sentence planning is a set of inter-related but distinct tasks, one of which is sentence scoping, i.e. the choice of syntactic structure for elementary speech acts and the decision of how to combine them into one or more sentences. |
|
</term>
has revealed many attractive properties
|
which
|
may be used in
<term>
NLP
</term>
. In particular
|
#1614
The theoretical study of the range concatenation grammar [RCG] formalism has revealed many attractive properties which may be used in NLP. |
|
</term>
called
<term>
alternative markers
</term>
,
|
which
|
includes
<term>
other ( than )
</term>
,
<term>
|
#1831
This paper presents a formal analysis for a large class of words called alternative markers, which includes other (than), such (as), and besides. |
|
connection to
<term>
Montague semantics
</term>
|
which
|
can be viewed as a
<term>
formal computation
|
#1999
Here we emphasize the connection to Montague semantics which can be viewed as a formal computation of the logical form. |
|
WH-questions
</term>
. These
<term>
models
</term>
,
|
which
|
are built from
<term>
shallow linguistic
|
#2146
These models, which are built from shallow linguistic features of questions, are employed to predict target variables which represent a user's informational goals. |
|
multi-source approach to question answering
</term>
|
which
|
is based on combining the results from
|
#2334
Motivated by the success of ensemble methods in machine learning and other areas of natural language processing, we developed a multi-strategy and multi-source approach to question answering which is based on combining the results from different answering agents searching for answers in multiple corpora. |
|
word-based models
</term>
. Our empirical results ,
|
which
|
hold for all examined
<term>
language pairs
|
#2592
Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. |
|
<term>
predicate argument structures
</term>
,
|
which
|
is central to our
<term>
IE paradigm
</term>
|
#3742
We also introduce a new way of automatically identifying predicate argument structures, which is central to our IE paradigm. |
|
data
</term>
. We describe a new approach
|
which
|
involves clustering
<term>
subcategorization
|
#3907
We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. |
|
English-Chinese parallel corpora
</term>
,
|
which
|
are then used for disambiguating the
<term>
|
#4840
In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task. |
|
that the
<term>
features
</term>
in terms of
|
which
|
we formulate our
<term>
heuristic principles
|
#5249
The results show that the features in terms of which we formulate our heuristic principles have significant predictive power, and that rules that closely resemble our Horn clauses can be learnt automatically from these features. |
|
selection function
</term>
is presented ,
|
which
|
yields superior
<term>
feature vectors
</term>
|
#5368
Finally, a novel feature weighting and selection function is presented, which yields superior feature vectors and better word similarity performance. |
|
this paper is the first step in a project
|
which
|
aims to cluster and summarise
<term>
electronic
|
#5392
The work presented in this paper is the first step in a project which aims to cluster and summarise electronic discussions in the context of help-desk applications. |
|
WSD ) system
</term>
for
<term>
Dutch
</term>
|
which
|
combines
<term>
statistical classification
|
#5997
In this paper, we present a corpus-based supervised word sense disambiguation (WSD) system for Dutch which combines statistical classification (maximum entropy) with linguistic information. |
|
comparison with previous
<term>
models
</term>
,
|
which
|
either use arbitrary
<term>
windows
</term>
|
#6356
In comparison with previous models, which either use arbitrary windows to compute similarity between words or use lexical affinity to create sequential models, in this paper we focus on models intended to capture the co-occurrence patterns of any pair of words or phrases at any distance in the corpus. |
|
</term>
, a
<term>
probabilistic model
</term>
|
which
|
has performed well on
<term>
information
|
#6832
The information extraction system we evaluate is based on a linear-chain conditional random field (CRF), a probabilistic model which has performed well on information extraction tasks because of its ability to capture arbitrary, overlapping features of the input in a Markov model. |