lr,17-4-C04-1116,bq |
context features
</term>
in each author 's
<term>
|
corpus
|
</term>
tend not to be
<term>
synonymous expressions
|
#6175
According to our assumption, most of the words with similar context features in each author's corpus tend not to be synonymous expressions. |
lr,50-3-C04-1147,bq |
phrases
</term>
at any distance in the
<term>
|
corpus
|
</term>
. The framework is flexible , allowing
|
#6400
In comparison with previous models, which either use arbitrary windows to compute similarity between words or use lexical affinity to create sequential models, in this paper we focus on models intended to capture the co-occurrence patterns of any pair of words or phrases at any distance in the corpus. |
lr,7-5-C04-1147,bq |
apply it in combination with a
<term>
terabyte
|
corpus
|
</term>
to answer
<term>
natural language tests
|
#6425
We apply it in combination with a terabyte corpus to answer natural language tests, achieving encouraging results. |
lr,30-2-C04-1192,bq |
for the
<term>
languages
</term>
in the
<term>
|
corpus
|
</term>
. The
<term>
wordnets
</term>
are aligned
|
#6480
The method exploits recent advances in word alignment and word clustering based on automatic extraction of translation equivalents, supported by available aligned wordnets for the languages in the corpus. |
lr,2-3-I05-4010,bq |
in detail . The resultant
<term>
bilingual
|
corpus
|
</term>
, 10.4 M
<term>
English words
</term>
|
#8255
The resultant bilingual corpus, 10.4M English words and 18.3M Chinese characters, is an authoritative and comprehensive text collection covering the specific and special domain of HK laws. |
lr,19-5-J05-4003,bq |
starting with a very small
<term>
parallel
|
corpus
|
</term>
( 100,000
<term>
words
</term>
) and
|
#9088
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. |
lr,29-5-J05-4003,bq |
and exploiting a large
<term>
non-parallel
|
corpus
|
</term>
. Thus , our method can be applied
|
#9098
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. |
lr,3-3-P05-1034,bq |
component
</term>
. We align a
<term>
parallel
|
corpus
|
</term>
, project the
<term>
source dependency
|
#9248
We align a parallel corpus, project the source dependency parse onto the target sentence, extract dependency treelet translation pairs, and train a tree-based ordering model. |
lr,11-4-P05-1074,bq |
extracted from a
<term>
bilingual parallel
|
corpus
|
</term>
to be ranked using
<term>
translation
|
#9729
We define a paraphrase probability that allows paraphrases extracted from a bilingual parallel corpus to be ranked using translation probabilities, and show how it can be refined to take contextual information into account. |
lr,7-2-P05-2016,bq |
required is a
<term>
sentence-aligned parallel
|
corpus
|
</term>
. All other
<term>
resources
</term>
|
#9803
The only bilingual resource required is a sentence-aligned parallel corpus. |
tech,4-1-N06-4001,bq |
strategies . We introduce a new
<term>
interactive
|
corpus
|
exploration tool
</term>
called
<term>
InfoMagnets
|
#10870
We introduce a new interactive corpus exploration tool called InfoMagnets. |
tech,4-2-N06-4001,bq |
InfoMagnets
</term>
aims at making
<term>
exploratory
|
corpus
|
analysis
</term>
accessible to researchers
|
#10881
InfoMagnets aims at making exploratory corpus analysis accessible to researchers who are not experts in text mining. |
lr,6-3-P06-1052,bq |
</term>
. We evaluate the algorithm on a
<term>
|
corpus
|
</term>
, and show that it reduces the degree
|
#11183
We evaluate the algorithm on a corpus, and show that it reduces the degree of ambiguity significantly while taking negligible runtime. |
lr,9-2-P06-2001,bq |
experiments , and trained with a little
<term>
|
corpus
|
</term>
of 100,000
<term>
words
</term>
, the
|
#11236
After several experiments, and trained with a little corpus of 100,000 words, the system correctly guesses where not to place commas, with a precision of 96% and a recall of 98%. |
lr,18-4-P06-2001,bq |
using a bigger and a more homogeneous
<term>
|
corpus
|
</term>
to train , that is , a bigger
<term>
|
#11300
Finally, we have shown that these results can be improved using a bigger and a more homogeneous corpus to train, that is, a bigger corpus written by one unique author. |
lr,27-4-P06-2001,bq |
</term>
to train , that is , a bigger
<term>
|
corpus
|
</term>
written by one unique
<term>
author
|
#11309
Finally, we have shown that these results can be improved using a bigger and a more homogeneous corpus to train, that is, a bigger corpus written by one unique author. |
lr,8-1-P06-2059,bq |
method of building
<term>
polarity-tagged
|
corpus
|
</term>
from
<term>
HTML documents
</term>
.
|
#11401
This paper proposes a novel method of building a polarity-tagged corpus from HTML documents. |
lr,9-5-P06-2059,bq |
experiment , the method could construct a
<term>
|
corpus
|
</term>
consisting of 126,610
<term>
sentences
|
#11464
In our experiment, the method could construct a corpus consisting of 126,610 sentences. |
lr,29-2-C88-2130,bq |
</term>
derived through analysis of our
<term>
|
corpus
|
</term>
.
<term>
Chart parsing
</term>
is
<term>
|
#15495
The model is embodied in a program, APT, that can reproduce segments of actual tape-recorded descriptions, using organizational and discourse strategies derived through analysis of our corpus. |
lr,15-2-C90-3063,bq |
co-occurrence patterns
</term>
in a large
<term>
|
corpus
|
</term>
. To a large extent , these
<term>
|
#16631
This paper presents an automatic scheme for collecting statistics on co-occurrence patterns in a large corpus. |