Concordance

Query: which, 108 hits (5,035.2 per million)

</term> , <term> verb group </term> , and so on	which	is inherently extensible to more sophisticated	#19983 A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit.
</term> . We then proceed to repeat results	which	show that standard <term> statistical models	#20102 We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations.
<term> LRE project SmTA double check </term> ,	which	is creating a <term> PC based tool </term>	#20177 This paper reports on work done for the LRE project SmTA double check, which is creating a PC based tool to be used in the technical abstracting industry.
objects are <term> Chinese-English texts </term> ,	which	are selected from different <term> language	#20605 Most importantly, the experimental objects are Chinese-English texts, which are selected from different language families.
difficult to detect <term> error characters </term>	which	are wrongly deleted and inserted . In order	#20634 In optical character recognition and continuous speech recognition of a natural language, it has been difficult to detect error characters which are wrongly deleted and inserted.
judge three types of the <term> errors </term> ,	which	are characters wrongly substituted , deleted	#20651 In order to judge three types of the errors, which are characters wrongly substituted, deleted or inserted in a Japanese bunsetsu and an English word, and to correct these errors, this paper proposes new methods using m-th order Markov chain model for Japanese kanji-kana characters and English alphabets, assuming that Markov probability of a correct chain of syllables or kanji-kana characters is greater than that of erroneous chains.
documents </term> and the variety of applications	which	must be supported , <term> document understanding	#21371 Because of the complexity of documents and the variety of applications which must be supported, document understanding requires the integration of image understanding with text understanding.
Document Understanding System ) </term> ,	which	creates the data for a <term> text retrieval	#21406 Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links.

