|
According to our assumption , most of the
|
words
|
with similar
<term>
context features
</term>
|
#6166
According to our assumption, most of the words with similar context features in each author's corpus tend not to be synonymous expressions. |
|
encodes
<term>
honorifics
</term>
( respectful
|
words
|
) .
<term>
Honorifics
</term>
are used extensively
|
#8576
This paper proposes an annotating scheme that encodes honorifics (respectful words). |
|
ungrammatically , missing out or repeating
|
words
|
, breaking-off and restarting , speaking
|
#12685
When people use natural language in natural settings, they often use it ungrammatically, missing out or repeating words, breaking-off and restarting, speaking in fragments, etc. |
lr,21-2-C90-3072,bq |
dictionaries of word forms
</term>
instead of
<term>
|
words
|
</term>
. This approach is sufficient for
|
#16755
For different reasons, among which the speed of processing prevails, they are usually based on dictionaries of word forms instead of words. |
other,1-2-P01-1009,bq |
</term>
, and
<term>
besides
</term>
. These
<term>
|
words
|
</term>
appear frequently enough in
<term>
|
#1847
These words appear frequently enough in dialog to warrant serious attention, yet present natural language search engines perform poorly on queries containing them. |
other,1-8-C92-3165,bq |
practical systems . Detected
<term>
unknown
|
words
|
</term>
can be incrementally incorporated
|
#18244
Detected unknown words can be incrementally incorporated into the dictionary after the interaction with the user. |
other,11-1-P01-1009,bq |
analysis
</term>
for a large class of
<term>
|
words
|
</term>
called
<term>
alternative markers
</term>
|
#1826
This paper presents a formal analysis for a large class of words called alternative markers, which includes other (than), such (as), and besides. |
other,11-4-P82-1035,bq |
</term>
can be used to figure out
<term>
unknown
|
words
|
</term>
from
<term>
context
</term>
, constrain
|
#13067
These syntactic and semantic expectations can be used to figure out unknown words from context, constrain the possible word-senses of words with multiple meanings (ambiguity), fill in missing words (ellipsis), and resolve referents (anaphora). |
other,12-2-P06-2001,bq |
little
<term>
corpus
</term>
of 100,000
<term>
|
words
|
</term>
, the system guesses correctly not
|
#11239
After several experiments, and trained with a little corpus of 100,000 words, the system guesses correctly not placing commas with a precision of 96% and a recall of 98%. |
other,15-3-C04-1147,bq |
compute
<term>
similarity
</term>
between
<term>
|
words
|
</term>
or use
<term>
lexical affinity
</term>
|
#6365
In comparison with previous models, which either use arbitrary windows to compute similarity between words or use lexical affinity to create sequential models, in this paper we focus on models intended to capture the co-occurrence patterns of any pair of words or phrases at any distance in the corpus. |
other,15-4-P03-1051,bq |
segmented corpus
</term>
of about 110,000
<term>
|
words
|
</term>
. To improve the
<term>
segmentation
|
#4704
The language model is initially estimated from a small manually segmented corpus of about 110,000 words. |
other,15-5-A92-1027,bq |
</term>
based on the placement of
<term>
function
|
words
|
</term>
, and by
<term>
heuristic rules
</term>
|
#17681
This is facilitated through the use of phrase boundary heuristics based on the placement of function words, and by heuristic rules that permit certain kinds of phrases to be deduced despite the presence of unknown words. |
other,16-2-P04-2005,bq |
<term>
topic signature
</term>
is a set of
<term>
|
words
|
</term>
that tend to co-occur with it .
<term>
|
#6921
Given a particular concept, or word sense, a topic signature is a set of words that tend to co-occur with it. |
other,18-3-I05-5003,bq |
of speech information
</term>
of the
<term>
|
words
|
</term>
contributing to the
<term>
word matches
|
#8385
We also introduce a novel classification method based on PER which leverages part of speech information of the words contributing to the word matches and non-matches in the sentence. |
other,18-4-H01-1042,bq |
language essays
</term>
in less than 100
<term>
|
words
|
</term>
. Even more illuminating was the
|
#646
A language learning experiment showed that assessors can differentiate native from non-native language essays in less than 100 words. |
other,19-2-C92-4199,bq |
is proposed for identifying
<term>
unknown
|
words
|
</term>
, especially
<term>
personal names
</term>
|
#18305
In this paper, a new mechanism, based on the concept of sublanguage, is proposed for identifying unknown words, especially personal names, in Chinese newspapers. |
other,21-4-P82-1035,bq |
possible
<term>
word-senses
</term>
of
<term>
|
words
|
with multiple meanings
</term>
(
<term>
ambiguity
|
#13076
These syntactic and semantic expectations can be used to figure out unknown words from context, constrain the possible word-senses of words with multiple meanings (ambiguity), fill in missing words (ellipsis), and resolve referents (anaphora). |
other,21-6-A94-1026,bq |
semantic categories
</term>
of the
<term>
adjoining
|
words
|
</term>
. The method accurately determines
|
#20482
The basic idea of this method is that a compound noun component places some restrictions on the semantic categories of the adjoining words. |
other,22-1-A94-1007,bq |
<term>
but
</term>
and the equivalent
<term>
|
words
|
</term>
.
<term>
Syntactic analysis of the
|
#19697
The authors propose a model for analyzing English sentences including coordinate conjunctions such as and, or, but and the equivalent words. |
other,23-5-J05-4003,bq |
<term>
parallel corpus
</term>
( 100,000
<term>
|
words
|
</term>
) and exploiting a large
<term>
non-parallel
|
#9091
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000words) and exploiting a large non-parallel corpus. |