|
</term>
,
<term>
verb group
</term>
, and so on
|
which
|
is inherently extensible to more sophisticated
|
#19983
A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit. |
|
</term>
. We then proceed to repeat results
|
which
|
show that standard
<term>
statistical models
|
#20102
We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations. |
|
<term>
LRE project SmTA double check
</term>
,
|
which
|
is creating a
<term>
PC based tool
</term>
|
#20177
This paper reports on work done for the LRE project SmTA double check, which is creating a PC based tool to be used in the technical abstracting industry. |
|
objects are
<term>
Chinese-English texts
</term>
,
|
which
|
are selected from different
<term>
language
|
#20605
Most importantly, the experimental objects are Chinese-English texts, which are selected from different language families. |
|
difficult to detect
<term>
error characters
</term>
|
which
|
are wrongly deleted and inserted . In order
|
#20634
In optical character recognition and continuous speech recognition of a natural language, it has been difficult to detect error characters which are wrongly deleted and inserted. |
|
judge three types of the
<term>
errors
</term>
,
|
which
|
are characters wrongly substituted , deleted
|
#20651
In order to judge three types of the errors, which are characters wrongly substituted, deleted or inserted in a Japanese bunsetsu and an English word, and to correct these errors, this paper proposes new methods using m-th order Markov chain model for Japanese kanji-kana characters and English alphabets, assuming that Markov probability of a correct chain of syllables or kanji-kana characters is greater than that of erroneous chains. |
|
documents
</term>
and the variety of applications
|
which
|
must be supported ,
<term>
document understanding
|
#21371
Because of the complexity of documents and the variety of applications which must be supported, document understanding requires the integration of image understanding with text understanding. |
|
Document Understanding System )
</term>
,
|
which
|
creates the data for a
<term>
text retrieval
|
#21406
Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links. |