#253
To support engaging human users in robust, mixed-initiative speech dialogue interactions which reach beyond current capabilities in dialogue systems, the DARPA Communicator program [1] is funding the development of a distributed message-passing infrastructure for dialogue systems which all Communicator participants are using. In this presentation, we describe the features of and requirements for a genuinely useful software infrastructure for this purpose.
#274
In this presentation, we describe the features of and requirements for a genuinely useful software infrastructure for this purpose. In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser.
#1027
We show how research in generation can be adapted to dialog systems, and how the high cost of hand-crafting knowledge-based generation systems can be overcome by employing machine learning techniques. In this paper, we address the problem of combining several language models (LMs).
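Record #1027 concerns combining several language models. A standard baseline for this (a sketch, not necessarily the method of the paper) is linear interpolation, where each model's probability is mixed with a fixed weight; the model dictionaries and weights below are hypothetical toy data:

```python
# Linear interpolation of two hypothetical word-probability models.
# The weights must sum to 1; here they are fixed by hand, though in
# practice they are tuned (e.g. by EM on held-out data).

def interpolate(models, weights):
    """Return a function giving the weighted mixture probability of a word."""
    assert abs(sum(weights) - 1.0) < 1e-9
    def p(word):
        return sum(w * m.get(word, 0.0) for m, w in zip(models, weights))
    return p

# Two toy unigram distributions (hypothetical data).
lm_news = {"stocks": 0.6, "rain": 0.1, "game": 0.3}
lm_weather = {"stocks": 0.05, "rain": 0.8, "game": 0.15}

mix = interpolate([lm_news, lm_weather], [0.7, 0.3])
print(round(mix("rain"), 3))  # 0.7*0.1 + 0.3*0.8 = 0.31
```

The same mixture shape extends to n-gram models by conditioning each component probability on the history.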
#1335
Sentence planning is a set of inter-related but distinct tasks, one of which is sentence scoping, i.e. the choice of syntactic structure for elementary speech acts and the decision of how to combine them into one or more sentences. In this paper, we present SPoT, a sentence planner, and a new methodology for automatically training SPoT on the basis of feedback provided by human judges.
#1461
We show that the trained SPR learns to select a sentence plan whose rating on average is only 5% worse than the top human-ranked sentence plan. In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system.
#1621
The theoretical study of the range concatenation grammar [RCG] formalism has revealed many attractive properties which may be used in NLP. In particular, range concatenation languages [RCL] can be parsed in polynomial time and many classical grammatical formalisms can be translated into equivalent RCGs without increasing their worst-case parsing time complexity.
#1679
For example, after translation into an equivalent RCG, any tree adjoining grammar can be parsed in O(n^6) time. In this paper, we study a parsing technique whose purpose is to improve the practical efficiency of RCL parsers.
#2050
Techniques for automatically training modules of a natural language generator have recently been proposed, but a fundamental concern is whether the quality of utterances produced with trainable components can compete with hand-crafted template-based or rule-based approaches. In this paper we experimentally evaluate a trainable sentence planner for a spoken dialogue system by eliciting subjective human judgments.
#2071
In this paper we experimentally evaluate a trainable sentence planner for a spoken dialogue system by eliciting subjective human judgments. In order to perform an exhaustive comparison, we also evaluate a hand-crafted template-based generation component, two rule-based sentence planners, and two baseline sentence planners.
#2255
The method combines domain independent acoustic models with off-the-shelf classifiers to give utterance classification performance that is surprisingly close to what can be achieved using conventional word-trigram recognition requiring manual transcription. In our method, unsupervised training is first used to train a phone n-gram model for a particular domain; the output of recognition with this model is then passed to a phone-string classifier.
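Record #2255 describes a pipeline in which unsupervised phone recognition output is passed to a phone-string classifier. The sketch below mocks that shape with hypothetical data: phone strings stand in for recognizer output, phone bigram counts are the features, and a simple nearest-centroid overlap rule plays the role of the off-the-shelf classifier:

```python
# Toy utterance-classification pipeline over phone strings.
# All phone strings and class labels are hypothetical.
from collections import Counter

def bigrams(phones):
    """Bag of phone bigrams from a recognized phone string."""
    return Counter(zip(phones, phones[1:]))

# Toy "training" phone strings per class label.
train = {
    "billing": ["b ih l".split(), "p ey b ih l".split()],
    "repair": ["r ih p eh r".split(), "f ih k s".split()],
}
centroids = {label: sum((bigrams(p) for p in ex), Counter())
             for label, ex in train.items()}

def classify(phones):
    """Pick the class whose bigram centroid overlaps the input most."""
    bg = bigrams(phones)
    def overlap(c):
        return sum(min(bg[k], c[k]) for k in bg)
    return max(centroids, key=lambda lab: overlap(centroids[lab]))

print(classify("ih p eh r".split()))  # "repair": shares bigrams with that class
```

A real system would replace the overlap rule with a trained classifier, but the feature shape (phone n-gram counts) is the same.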
#2435
Experiments evaluating the effectiveness of our answer resolution algorithm show a 35.0% relative improvement over our baseline system in the number of questions correctly answered, and a 32.8% improvement according to the average precision metric. In this paper we present ONTOSCORE, a system for scoring sets of concepts on the basis of an ontology.
|
performance of our
<term>
systems
</term>
.
|
In
|
this paper , we introduce a
<term>
generative
|
#2667
Learning only syntactically motivated phrases degrades the performance of our systems. In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system. |
|
conversational speech
</term>
are limited .
|
In
|
this paper , we show how
<term>
training
|
#3028
Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. |
|
interpolation
</term>
of
<term>
N-grams
</term>
.
|
In
|
order to boost the
<term>
translation quality
|
#3079
In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams. In order to boost the translation quality of EBMT based on a small-sized bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an in-domain monolingual corpus. |
|
performance for some
<term>
NE types
</term>
.
|
In
|
this paper , we describe a
<term>
phrase-based
|
#3389
The resulting NE system approaches supervised NE performance for some NE types. In this paper, we describe a phrase-based unigram model for statistical machine translation that uses a much simpler set of model parameters than similar phrase-based models. |
|
</term>
counts and
<term>
phrase
</term>
length .
|
In
|
this paper , we propose a novel
<term>
Cooperative
|
#3477
We show experimental results on block selection criteria based on unigram counts and phrase length. In this paper, we propose a novel Cooperative Model for natural language understanding in a dialogue system. |
|
<term>
question answering session
</term>
.
|
In
|
this paper we present a novel , customizable
|
#3711
The operation of the system will be explained in depth through browsing the repository of data objects created by the system during each question answering session. In this paper we present a novel, customizable IE paradigm that takes advantage of predicate-argument structures. |
|
and
<term>
nearest neighbour
</term>
methods .
|
In
|
contrast to previous work , we particularly
|
#3925
We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. In contrast to previous work, we particularly focus on clustering polysemic verbs. |
|
</term>
of
<term>
new event detection
</term>
.
|
In
|
this paper we formulate
<term>
story link
|
#4063
Link detection has been regarded as a core technology for the Topic Detection and Tracking tasks of new event detection. In this paper we formulate story link detection and new event detection as information retrieval task and hypothesize on the impact of precision and recall on both systems. |
|
required for
<term>
supervised learning
</term>
.
|
In
|
this paper , we evaluate an approach to
|
#4821
A central problem of word sense disambiguation (WSD) is the lack of manually sense-tagged data required for supervised learning. In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task. |