|
At MIT Lincoln Laboratory , we have
been
developing a
<term>
Korean-to-English machine translation system
</term><term>
CCLINC ( Common Coalition Language System at Lincoln Laboratory )
</term>
.
|
#393
At MIT Lincoln Laboratory, we have been developing a Korean-to-English machine translation system CCLINC (Common Coalition Language System at Lincoln Laboratory). |
|
Having
been
trained on
<term>
Korean newspaper articles
</term>
on missiles and chemical biological warfare , the
<term>
system
</term>
produces the
<term>
translation output
</term>
sufficient for content understanding of the
<term>
original document
</term>
.
|
#516
Having been trained on Korean newspaper articles on missiles and chemical biological warfare, the system produces the translation output sufficient for content understanding of the original document. |
|
The issue of
<term>
system response
</term>
to
<term>
users
</term>
has
been
extensively studied by the
<term>
natural language generation community
</term>
, though rarely in the context of
<term>
dialog systems
</term>
.
|
#976
The issue of system response to users has been extensively studied by the natural language generation community, though rarely in the context of dialog systems. |
|
The
<term>
oracle
</term>
knows the
<term>
reference word string
</term>
and selects the
<term>
word string
</term>
with the best
<term>
performance
</term>
( typically ,
<term>
word or semantic error rate
</term>
) from a list of
<term>
word strings
</term>
, where each
<term>
word string
</term>
has
been
obtained by using a different
<term>
LM
</term>
.
|
#1107
The oracle knows the reference word string and selects the word string with the best performance (typically, word or semantic error rate) from a list of word strings, where each word string has been obtained by using a different LM. |
|
<term>
Techniques for automatically training
</term>
modules of a
<term>
natural language generator
</term>
have recently
been
proposed , but a fundamental concern is whether the
<term>
quality
</term>
of
<term>
utterances
</term>
produced with
<term>
trainable components
</term>
can compete with
<term>
hand-crafted template-based or rule-based approaches
</term>
.
|
#2024
Techniques for automatically training modules of a natural language generator have recently been proposed, but a fundamental concern is whether the quality of utterances produced with trainable components can compete with hand-crafted template-based or rule-based approaches. |
|
<term>
Link detection
</term>
has
been
regarded as a core technology for the
<term>
Topic Detection and Tracking tasks
</term>
of
<term>
new event detection
</term>
.
|
#4045
Link detection has been regarded as a core technology for the Topic Detection and Tracking tasks of new event detection. |
|
<term>
Dialogue strategies
</term>
based on the
<term>
user modeling
</term>
are implemented in
<term>
Kyoto city bus information system
</term>
that has
been
developed at our laboratory .
|
#4397
Dialogue strategies based on the user modeling are implemented in Kyoto city bus information system that has been developed at our laboratory. |
|
Along the way , we present the first comprehensive comparison of
<term>
unsupervised methods for part-of-speech tagging
</term>
, noting that published results to date have not
been
comparable across
<term>
corpora
</term>
or
<term>
lexicons
</term>
.
|
#5551
Along the way, we present the first comprehensive comparison of unsupervised methods for part-of-speech tagging, noting that published results to date have not been comparable across corpora or lexicons. |
|
While
<term>
sentence extraction
</term>
as an approach to
<term>
summarization
</term>
has
been
shown to work in
<term>
documents
</term>
of certain
<term>
genres
</term>
, because of the conversational nature of
<term>
email communication
</term>
where
<term>
utterances
</term>
are made in relation to one made previously ,
<term>
sentence extraction
</term>
may not capture the necessary
<term>
segments
</term>
of
<term>
dialogue
</term>
that would make a
<term>
summary
</term>
coherent .
|
#6211
While sentence extraction as an approach to summarization has been shown to work in documents of certain genres, because of the conversational nature of email communication where utterances are made in relation to one made previously, sentence extraction may not capture the necessary segments of dialogue that would make a summary coherent. |
|
This paper investigates some
<term>
computational problems
</term>
associated with
<term>
probabilistic translation models
</term>
that have recently
been
adopted in the literature on
<term>
machine translation
</term>
.
|
#7448
This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation. |
|
Much effort has
been
put in designing and evaluating dedicated
<term>
word sense disambiguation ( WSD ) models
</term>
, in particular with the
<term>
Senseval
</term>
series of workshops .
|
#7832
Much effort has been put in designing and evaluating dedicated word sense disambiguation (WSD) models, in particular with the Senseval series of workshops. |
|
Surprisingly however , the
<term>
WSD
</term><term>
accuracy
</term>
of
<term>
SMT models
</term>
has never
been
evaluated and compared with that of the dedicated
<term>
WSD models
</term>
.
|
#7905
Surprisingly however, the WSD accuracy of SMT models has never been evaluated and compared with that of the dedicated WSD models. |
|
Over the last few years dramatic improvements have
been
made , and a number of comparative evaluations have shown , that
<term>
SMT
</term>
gives competitive results to
<term>
rule-based translation systems
</term>
, requiring significantly less development time .
|
#8013
Over the last few years dramatic improvements have been made, and a number of comparative evaluations have shown, that SMT gives competitive results to rule-based translation systems, requiring significantly less development time. |
|
<term>
STTK
</term>
has
been
developed by the presenter and co-workers over a number of years and is currently used as the basis of
<term>
CMU 's SMT system
</term>
.
|
#8143
STTK has been developed by the presenter and co-workers over a number of years and is currently used as the basis of CMU's SMT system. |
|
It has also successfully
been
coupled with
<term>
rule-based and example based machine translation modules
</term>
to build a
<term>
multi engine machine translation system
</term>
.
|
#8172
It has also successfully been coupled with rule-based and example based machine translation modules to build a multi engine machine translation system. |
|
In this paper we study a set of problems that are of considerable importance to
<term>
Statistical Machine Translation ( SMT )
</term>
but which have not
been
addressed satisfactorily by the
<term>
SMT research community
</term>
.
|
#9947
In this paper we study a set of problems that are of considerable importance to Statistical Machine Translation (SMT) but which have not been addressed satisfactorily by the SMT research community. |
|
Over the last decade , a variety of
<term>
SMT algorithms
</term>
have
been
built and empirically tested whereas little is known about the
<term>
computational complexity
</term>
of some of the fundamental problems of
<term>
SMT
</term>
.
|
#9967
Over the last decade, a variety of SMT algorithms have been built and empirically tested whereas little is known about the computational complexity of some of the fundamental problems of SMT. |
|
The correlation of the new
<term>
measure
</term>
with
<term>
human judgment
</term>
has
been
investigated systematically on two different
<term>
language pairs
</term>
.
|
#10415
The correlation of the new measure with human judgment has been investigated systematically on two different language pairs. |
|
We first apply approaches that have
been
proposed for
<term>
predicting top-level topic shifts
</term>
to the problem of
<term>
identifying subtopic boundaries
</term>
.
|
#10494
We first apply approaches that have been proposed for predicting top-level topic shifts to the problem of identifying subtopic boundaries. |
|
As evidence of its usefulness and usability , it has
been
used successfully in a research context to uncover relationships between
<term>
language
</term>
and
<term>
behavioral patterns
</term>
in two distinct domains :
<term>
tutorial dialogue
</term>
( Kumar et al. , submitted ) and
<term>
on-line communities
</term>
( Arguello et al. , 2006 ) .
|
#10904
As evidence of its usefulness and usability, it has been used successfully in a research context to uncover relationships between language and behavioral patterns in two distinct domains: tutorial dialogue (Kumar et al., submitted) and on-line communities (Arguello et al., 2006). |