One of the distinguishing features of a more <term> linguistically sophisticated representation of documents </term> over a <term> word set based representation </term> is that <term> linguistically sophisticated units </term> are , individually , more often good predictors of <term> document descriptors ( keywords ) </term> than single <term> words </term> are . This leads us to consider assigning <term> descriptors </term> from individual <term> phrases </term> rather than from the <term> weighted sum </term> of a <term> word set representation </term> .
It therefore shows that <term> statistical systems </term> can exploit <term> sophisticated representations of documents </term> , and lends some support to the use of more <term> linguistically sophisticated representations </term> for <term> document classification </term> . This paper reports on work done for the <term> LRE project SmTA double check </term> , which is creating a <term> PC based tool </term> to be used in the <term> technical abstracting industry </term> .
This paper proposes a model using <term> associative processors ( APs ) </term> for <term> real-time spoken language translation </term> .
We have already proposed a model , <term> TDMT ( Transfer-Driven Machine Translation ) </term> , that translates a <term> sentence </term> utilizing examples effectively and performs accurate <term> structural disambiguation </term> and <term> target word selection </term> . This paper will concentrate on the second requirement .
It is critical , therefore , for <term> Japanese revision support systems </term> to detect and to correct <term> homophone errors </term> . This paper proposes a method for detecting and correcting <term> Japanese homophone errors </term> in <term> compound nouns </term> .
This method can not only detect <term> Japanese homophone errors </term> in <term> compound nouns </term> , but can also find the correct candidates for the detected errors automatically .
Finding the correct candidates is one advantage of this method over existing methods .
The basic idea of this method is that a <term> compound noun </term> component places some restrictions on the <term> semantic categories </term> of the <term> adjoining words </term> .
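As a minimal sketch of this idea (illustrative only, not the paper's actual procedure, and using hypothetical toy dictionaries), a component whose semantic category is incompatible with its neighbour can be flagged and replaced by a same-reading candidate whose category fits:

    # Minimal illustrative sketch of the category-restriction idea; the
    # dictionaries below are hypothetical toy data, not the paper's resources.
    SEM_CATEGORY = {"機械": "artifact", "器械": "instrument", "翻訳": "process"}
    COMPATIBLE = {("artifact", "process")}            # neighbour categories judged plausible
    HOMOPHONES = {"器械": ["機械"], "機械": ["器械"]}  # components sharing a reading (kikai)

    def check_component(left, right):
        """Detect and, if possible, correct a homophone error in the compound left+right."""
        if (SEM_CATEGORY[left], SEM_CATEGORY[right]) in COMPATIBLE:
            return left                               # no error detected
        for cand in HOMOPHONES.get(left, []):         # try same-reading candidates
            if (SEM_CATEGORY[cand], SEM_CATEGORY[right]) in COMPATIBLE:
                return cand                           # plausible correction found
        return left                                   # error flagged, no candidate fits

    # e.g. check_component("器械", "翻訳") -> "機械" (the intended "machine translation")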
A <term> simulated annealing approach </term> is used to implement this <term> alignment algorithm </term> .
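Since the abstract does not spell out the search procedure, the following is only a generic simulated-annealing skeleton for such an alignment search; the cost function and neighbour move are placeholders supplied by the caller, not the paper's algorithm.

    import math
    import random

    def anneal(initial, cost, neighbour, t0=1.0, cooling=0.95, steps=1000):
        """Generic simulated annealing: minimise cost() via local neighbour moves."""
        state = best = initial
        t = t0
        for _ in range(steps):
            cand = neighbour(state)
            delta = cost(cand) - cost(state)
            # always accept improvements; accept worse moves with Boltzmann probability
            if delta <= 0 or random.random() < math.exp(-delta / t):
                state = cand
                if cost(state) < cost(best):
                    best = state
            t *= cooling                  # geometric cooling schedule
        return best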
In order to identify three types of <term> errors </term> , namely characters wrongly substituted , deleted or inserted in a <term> Japanese bunsetsu </term> or an <term> English word </term> , and to correct these <term> errors </term> , this paper proposes new methods using an <term> m-th order Markov chain model </term> for <term> Japanese kanji-kana characters </term> and <term> English alphabets </term> , assuming that the <term> Markov probability </term> of a correct chain of <term> syllables </term> or <term> kanji-kana characters </term> is greater than that of <term> erroneous chains </term> .
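In our notation (a sketch of the stated assumption, not necessarily the paper's exact formulation), an m-th order model scores a character chain $c_1 \dots c_n$ as

$$P(c_1 \dots c_n) = \prod_{i=1}^{n} P(c_i \mid c_{i-m}, \dots, c_{i-1}),$$

so a substitution, deletion or insertion is hypothesised where this product drops sharply, and the correction is the candidate chain with the highest probability.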
From the results of the experiments , it is concluded that the methods are useful for detecting as well as correcting these errors in <term> Japanese bunsetsu </term> and <term> English words </term> . This paper describes the enhancements made , within a <term> unification framework </term> based on <term> typed feature structures </term> , in order to support the linking of <term> lexical entries </term> to their <term> translation equivalents </term> .
To support this task , we have developed an <term> interactive environment </term> : <term> TGE </term> .
<term> Chart-like parsing </term> and <term> semantic-head-driven generation </term> emerge from this method .
Despite the large amount of theoretical work done on <term> non-constituent coordination </term> during the last two decades , many computational systems still treat <term> coordination </term> using adapted <term> parsing strategies </term> , in a similar fashion to the <term> SYSCONJ system </term> developed for <term> ATNs </term> . This paper reviews the theoretical literature , and shows why many of the theoretical accounts actually have worse coverage than accounts based on processing .
Finally , it shows how processing accounts can be described formally and declaratively in terms of <term> Dynamic Grammars </term> . This paper introduces a simple mixture <term> language model </term> that attempts to capture <term> long distance constraints </term> in a <term> sentence </term> or <term> paragraph </term> .
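A generic mixture of trigram components (our notation, not necessarily the paper's exact formulation) takes the form

$$P(w_i \mid w_1 \dots w_{i-1}) \approx \sum_{k} \lambda_k \, P_k(w_i \mid w_{i-2}, w_{i-1}),$$

where the mixture weights $\lambda_k$ might, for example, be adapted to the current sentence or paragraph, so that longer-range context can influence an otherwise local model.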
Using the <term> BU recognition system </term> , experiments show a 7 % improvement in <term> recognition accuracy </term> with the <term> mixture trigram models </term> as compared to using a <term> trigram model </term> . This paper describes a method of <term> detecting speech repairs </term> that uses a <term> part-of-speech tagger </term> .
Our <term> document understanding technology </term> is implemented in a system called <term> IDUS ( Intelligent Document Understanding System ) </term> , which creates the data for a <term> text retrieval application </term> and the <term> automatic generation of hypertext links </term> . This paper summarizes the areas of research during <term> IDUS </term> development where we have found the most benefit from the <term> integration </term> of <term> image and text understanding </term> .