|
One of the distinguishing features of a more
<term>
linguistically sophisticated representation of documents
</term>
over a
<term>
word set based representation
</term>
of them is that
<term>
linguistically sophisticated units
</term>
are more frequently individually good predictors of
<term>
document descriptors ( keywords )
</term>
than single
<term>
words
</term>
are .
This
leads us to consider the assignment of
<term>
descriptors
</term>
from individual
<term>
phrases
</term>
rather than from the
<term>
weighted sum
</term>
of a
<term>
word set representation
</term>
.
|
#20045
One of the distinguishing features of a more linguistically sophisticated representation of documents over a word set based representation of them is that linguistically sophisticated units are more frequently individually good predictors of document descriptors (keywords) than single words are. This leads us to consider the assignment of descriptors from individual phrases rather than from the weighted sum of a word set representation. |
|
It therefore shows that
<term>
statistical systems
</term>
can exploit
<term>
sophisticated representations of documents
</term>
, and lends some support to the use of more
<term>
linguistically sophisticated representations
</term>
for
<term>
document classification
</term>
.
This
paper reports on work done for the
<term>
LRE project SmTA double check
</term>
, which is creating a
<term>
PC based tool
</term>
to be used in the
<term>
technical abstracting industry
</term>
.
|
#20163
It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification. This paper reports on work done for the LRE project SmTA double check, which is creating a PC based tool to be used in the technical abstracting industry. |
|
This paper reports on work done for the
<term>
LRE project SmTA double check
</term>
, which is creating a
<term>
PC based tool
</term>
to be used in the
<term>
technical abstracting industry
</term>
.
This
paper proposes a model using
<term>
associative processors ( APs )
</term>
for
<term>
real-time spoken language translation
</term>
.
|
#20193
This paper reports on work done for the LRE project SmTA double check, which is creating a PC based tool to be used in the technical abstracting industry. This paper proposes a model using associative processors (APs) for real-time spoken language translation. |
|
We have already proposed a model ,
<term>
TDMT ( Transfer-Driven Machine Translation )
</term>
, that translates a
<term>
sentence
</term>
utilizing examples effectively and performs accurate
<term>
structural disambiguation
</term>
and
<term>
target word selection
</term>
.
This
paper will concentrate on the second requirement .
|
#20259
We have already proposed a model, TDMT (Transfer-Driven Machine Translation), that translates a sentence utilizing examples effectively and performs accurate structural disambiguation and target word selection. This paper will concentrate on the second requirement. |
|
It is critical , therefore , for
<term>
Japanese revision support systems
</term>
to detect and to correct
<term>
homophone errors
</term>
.
This
paper proposes a method for detecting and correcting
<term>
Japanese homophone errors
</term>
in
<term>
compound nouns
</term>
.
|
#20404
It is critical, therefore, for Japanese revision support systems to detect and to correct homophone errors. This paper proposes a method for detecting and correcting Japanese homophone errors in compound nouns. |
|
This paper proposes a method for detecting and correcting
<term>
Japanese homophone errors
</term>
in
<term>
compound nouns
</term>
.
This
method can not only detect
<term>
Japanese homophone errors
</term>
in
<term>
compound nouns
</term>
, but also find the correct candidates for the detected errors automatically .
|
#20420
This paper proposes a method for detecting and correcting Japanese homophone errors in compound nouns. This method can not only detect Japanese homophone errors in compound nouns, but also find the correct candidates for the detected errors automatically. |
|
Finding the correct candidates is one advantage of
this
method over existing methods .
|
#20454
Finding the correct candidates is one advantage of this method over existing methods. |
|
The basic idea of
this
method is that a
<term>
compound noun
</term>
component places some restrictions on the
<term>
semantic categories
</term>
of the
<term>
adjoining words
</term>
.
|
#20464
The basic idea of this method is that a compound noun component places some restrictions on the semantic categories of the adjoining words. |
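The restriction idea in the record above can be sketched in a few lines; every word, semantic category, and allowed set below is an invented illustration, not data from the described method:

```python
# Hypothetical sketch: a compound-noun component restricts the semantic
# categories of the adjoining word; a homophone whose category violates the
# restriction is flagged, and homophones that do satisfy it are offered as
# correction candidates. All lexicon entries here are made up.
CATEGORY = {"hoshou": "GUARANTEE", "hoshuu": "REPAIR"}      # homophone pair
HOMOPHONES = {"hoshou": ["hoshuu"], "hoshuu": ["hoshou"]}
ALLOWED_NEXT = {"kikai": {"REPAIR"}}   # 'kikai' expects a repair-type noun next

def check_compound(first, second):
    """Return (ok, candidates) for the compound `first + second`."""
    allowed = ALLOWED_NEXT.get(first)
    if allowed is None or CATEGORY.get(second) in allowed:
        return True, []
    candidates = [h for h in HOMOPHONES.get(second, [])
                  if CATEGORY.get(h) in allowed]
    return False, candidates
```

Under these toy entries, `check_compound("kikai", "hoshou")` rejects the compound and proposes the homophone whose category fits the restriction.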
|
A
<term>
simulated annealing approach
</term>
is used to implement
this
<term>
alignment algorithm
</term>
.
|
#20584
A simulated annealing approach is used to implement this alignment algorithm. |
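How simulated annealing can drive an alignment search can be sketched as below; the swap move, cost function, and linear cooling schedule are hypothetical stand-ins, not the algorithm from the record above:

```python
import math
import random

def anneal_alignment(cost, n, steps=4000, t0=1.0, seed=0):
    """Search for a low-cost 1-to-1 alignment (source i -> target perm[i])
    by simulated annealing over random swaps. A toy stand-in for an
    alignment algorithm; the cost and cooling choices are illustrative."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)                            # start from a random alignment
    total = lambda p: sum(cost(i, p[i]) for i in range(n))
    cur = total(perm)
    best, best_cost = perm[:], cur
    for step in range(steps):
        t = max(t0 * (1 - step / steps), 1e-9)   # linear cooling
        i, j = rng.randrange(n), rng.randrange(n)
        perm[i], perm[j] = perm[j], perm[i]      # propose a swap
        new = total(perm)
        if new <= cur or rng.random() < math.exp((cur - new) / t):
            cur = new                            # accept (possibly uphill)
            if cur < best_cost:
                best, best_cost = perm[:], cur
        else:
            perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap
    return best
```

The uphill-acceptance term `exp((cur - new) / t)` is what lets the search escape local minima early on, while the cooling schedule makes it increasingly greedy.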
|
In order to judge three types of
<term>
errors
</term>
, which are characters wrongly substituted , deleted or inserted in a
<term>
Japanese bunsetsu
</term>
and an
<term>
English word
</term>
, and to correct these
<term>
errors
</term>
,
this
paper proposes new methods using an
<term>
m-th order Markov chain model
</term>
for
<term>
Japanese kanji-kana characters
</term>
and
<term>
English alphabets
</term>
assuming that the
<term>
Markov probability
</term>
of a correct chain of
<term>
syllables
</term>
or
<term>
kanji-kana characters
</term>
is greater than that of
<term>
erroneous chains
</term>
.
|
#20675
In order to judge three types of errors, which are characters wrongly substituted, deleted or inserted in a Japanese bunsetsu and an English word, and to correct these errors, this paper proposes new methods using an m-th order Markov chain model for Japanese kanji-kana characters and English alphabets, assuming that the Markov probability of a correct chain of syllables or kanji-kana characters is greater than that of erroneous chains. |
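The core assumption above — that a correct character chain has higher Markov probability than an erroneous one — can be illustrated with a first-order (m = 1) character model; the toy corpus and floor probability are hypothetical:

```python
from collections import defaultdict

def train_bigram_model(corpus):
    """Estimate P(next_char | char) from a training corpus (m = 1)."""
    counts = defaultdict(lambda: defaultdict(int))
    for text in corpus:
        for a, b in zip(text, text[1:]):
            counts[a][b] += 1
    probs = {}
    for a, nexts in counts.items():
        total = sum(nexts.values())
        probs[a] = {b: n / total for b, n in nexts.items()}
    return probs

def chain_probability(probs, text, floor=1e-6):
    """Markov probability of a character chain; unseen transitions get a floor."""
    p = 1.0
    for a, b in zip(text, text[1:]):
        p *= probs.get(a, {}).get(b, floor)
    return p

# A correct chain should score higher than one containing a character error.
model = train_bigram_model(["the cat sat", "the hat", "that cat"])
```

An error detector would flag a chain (or the low-probability transition inside it) whenever its score falls below a threshold, then rank correction candidates by the same probability.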
|
From the results of the experiments , it is concluded that the methods are useful for detecting as well as correcting these errors in
<term>
Japanese bunsetsu
</term>
and
<term>
English words
</term>
.
This
paper describes the enhancements made , within a
<term>
unification framework
</term>
, based on
<term>
typed feature structures
</term>
, in order to support linking of
<term>
lexical entries
</term>
to their
<term>
translation equivalents
</term>
.
|
#20745
From the results of the experiments, it is concluded that the methods are useful for detecting as well as correcting these errors in Japanese bunsetsu and English words. This paper describes the enhancements made, within a unification framework, based on typed feature structures, in order to support linking of lexical entries to their translation equivalents. |
|
To help
this
task we have developed an
<term>
interactive environment
</term>
:
<term>
TGE
</term>
.
|
#20778
To help this task we have developed an interactive environment: TGE. |
|
<term>
Chart-like parsing
</term>
and
<term>
semantic-head-driven generation
</term>
emerge from
this
method .
|
#20923
Chart-like parsing and semantic-head-driven generation emerge from this method. |
|
Despite the large amount of theoretical work done on
<term>
non-constituent coordination
</term>
during the last two decades , many computational systems still treat
<term>
coordination
</term>
using adapted
<term>
parsing strategies
</term>
, in a similar fashion to the
<term>
SYSCONJ system
</term>
developed for
<term>
ATNs
</term>
.
This
paper reviews the theoretical literature , and shows why many of the theoretical accounts actually have worse coverage than accounts based on processing .
|
#21167
Despite the large amount of theoretical work done on non-constituent coordination during the last two decades, many computational systems still treat coordination using adapted parsing strategies, in a similar fashion to the SYSCONJ system developed for ATNs. This paper reviews the theoretical literature, and shows why many of the theoretical accounts actually have worse coverage than accounts based on processing. |
|
Finally , it shows how processing accounts can be described formally and declaratively in terms of
<term>
Dynamic Grammars
</term>
.
This
paper introduces a simple mixture
<term>
language model
</term>
that attempts to capture
<term>
long distance constraints
</term>
in a
<term>
sentence
</term>
or
<term>
paragraph
</term>
.
|
#21211
Finally, it shows how processing accounts can be described formally and declaratively in terms of Dynamic Grammars. This paper introduces a simple mixture language model that attempts to capture long distance constraints in a sentence or paragraph. |
|
Using the
<term>
BU recognition system
</term>
, experiments show a 7 % improvement in
<term>
recognition accuracy
</term>
with the
<term>
mixture trigram models
</term>
as compared to using a
<term>
trigram model
</term>
.
This
paper describes a method of
<term>
detecting speech repairs
</term>
that uses a
<term>
part-of-speech tagger
</term>
.
|
#21291
Using the BU recognition system, experiments show a 7% improvement in recognition accuracy with the mixture trigram models as compared to using a trigram model. This paper describes a method of detecting speech repairs that uses a part-of-speech tagger. |
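One crude way part-of-speech information can cue speech repairs is by flagging immediate word or tag repetitions ("I I want", "the a dog"); this sketch is a deliberately simplified stand-in for the paper's tagger-based method, and the tag set and trigger tags are assumptions:

```python
def detect_repairs(tagged):
    """Flag indices where a word repeats verbatim, or where two adjacent
    words share a POS tag that rarely self-repeats in fluent speech
    (here, hypothetically, determiners and pronouns)."""
    repairs = []
    for i in range(len(tagged) - 1):
        (w1, t1), (w2, t2) = tagged[i], tagged[i + 1]
        if w1 == w2 or (t1 == t2 and t1 in {"DT", "PRP"}):
            repairs.append(i)
    return repairs

detect_repairs([("I", "PRP"), ("I", "PRP"), ("want", "VBP")])  # -> [0]
```

A real detector would combine such repetition cues with the tagger's probabilities rather than a hard rule, but the repetition pattern is the intuition.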
|
Our
<term>
document understanding technology
</term>
is implemented in a system called
<term>
IDUS ( Intelligent Document Understanding System )
</term>
, which creates the data for a
<term>
text retrieval application
</term>
and the
<term>
automatic generation of hypertext links
</term>
.
This
paper summarizes the areas of research during
<term>
IDUS
</term>
development where we have found the most benefit from the
<term>
integration
</term>
of
<term>
image and text understanding
</term>
.
|
#21423
Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links. This paper summarizes the areas of research during IDUS development where we have found the most benefit from the integration of image and text understanding. |