lr,12-2-P01-1008,bq |
identification of paraphrases
</term>
from a
<term>
|
corpus
|
of multiple English translations
</term>
|
#1789
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. |
lr,19-4-N03-1012,bq |
successfully classifies 73.2 % in a
<term>
German
|
corpus
|
</term>
of 2,284
<term>
SRHs
</term>
as either
|
#2521
An evaluation of our system against the annotated data shows that it successfully classifies 73.2% in a German corpus of 2,284 SRHs as either coherent or incoherent (given a baseline of 54.55%). |
lr,13-1-N03-2006,bq |
</term>
based on a small-sized
<term>
bilingual
|
corpus
|
</term>
, we use an out-of-domain
<term>
bilingual
|
#3093
In order to boost the translation quality of EBMT based on a small-sized bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an in-domain monolingual corpus. |
lr,10-5-N03-2025,bq |
Markov Model
</term>
is trained on a
<term>
|
corpus
|
</term>
automatically tagged by the first
|
#3368
Then, a Hidden Markov Model is trained on a corpus automatically tagged by the first learner. |
lr,19-2-N03-4010,bq |
candidates
</term>
from the given
<term>
text
|
corpus
|
</term>
. The operation of the
<term>
system
|
#3682
The demonstration will focus on how JAVELIN processes questions and retrieves the most likely answer candidates from the given text corpus. |
other,15-1-P03-1009,bq |
classes
</term>
from undisambiguated
<term>
|
corpus
|
data
</term>
. We describe a new approach
|
#3899
Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. |
lr,22-2-P03-1050,bq |
a small ( 10K sentences )
<term>
parallel
|
corpus
|
</term>
as its sole
<term>
training resources
|
#4469
The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources. |
lr,7-2-P03-1051,bq |
by a small
<term>
manually segmented Arabic
|
corpus
|
</term>
and uses it to bootstrap an
<term>
|
#4648
Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus. |
lr,9-1-P03-1068,bq |
of a large ,
<term>
semantically annotated
|
corpus
|
</term>
resource as a reliable basis for the
|
#4943
We describe the ongoing construction of a large, semantically annotated corpus resource as a reliable basis for the large-scale acquisition of word-semantic information, e.g. the construction of domain-independent lexica. |
lr,6-3-C04-1106,bq |
experiments conducted on a
<term>
multilingual
|
corpus
|
</term>
to estimate the number of
<term>
analogies
|
#5916
We report experiments conducted on a multilingual corpus to estimate the number of analogies among the sentences that it contains. |
lr,23-2-C04-1116,bq |
each author 's text as a coherent
<term>
|
corpus
|
</term>
. Our approach is based on the idea
|
#6137
This paper proposes a new methodology to improve the accuracy of a term aggregation system using each author's text as a coherent corpus. |
lr,50-3-C04-1147,bq |
phrases
</term>
at any distance in the
<term>
|
corpus
|
</term>
. The framework is flexible , allowing
|
#6400
In comparison with previous models, which either use arbitrary windows to compute similarity between words or use lexical affinity to create sequential models, in this paper we focus on models intended to capture the co-occurrence patterns of any pair of words or phrases at any distance in the corpus. |
lr,30-2-C04-1192,bq |
for the
<term>
languages
</term>
in the
<term>
|
corpus
|
</term>
. The
<term>
wordnets
</term>
are aligned
|
#6480
The method exploits recent advances in word alignment and word clustering based on automatic extraction of translation equivalents and supported by available aligned wordnets for the languages in the corpus. |
lr,2-3-I05-4010,bq |
in detail . The resultant
<term>
bilingual
|
corpus
|
</term>
, 10.4 M
<term>
English words
</term>
|
#8255
The resultant bilingual corpus, 10.4M English words and 18.3M Chinese characters, is an authoritative and comprehensive text collection covering the specific and special domain of HK laws. |
lr,19-5-J05-4003,bq |
starting with a very small
<term>
parallel
|
corpus
|
</term>
( 100,000
<term>
words
</term>
) and
|
#9088
We also show that a good-quality MT system can be built from scratch by starting with a very small parallel corpus (100,000 words) and exploiting a large non-parallel corpus. |
lr,3-3-P05-1034,bq |
component
</term>
. We align a
<term>
parallel
|
corpus
|
</term>
, project the
<term>
source dependency
|
#9248
We align a parallel corpus, project the source dependency parse onto the target sentence, extract dependency treelet translation pairs, and train a tree-based ordering model. |
lr,11-4-P05-1074,bq |
extracted from a
<term>
bilingual parallel
|
corpus
|
</term>
to be ranked using
<term>
translation
|
#9729
We define a paraphrase probability that allows paraphrases extracted from a bilingual parallel corpus to be ranked using translation probabilities, and show how it can be refined to take contextual information into account. |
lr,7-2-P05-2016,bq |
required is a
<term>
sentence-aligned parallel
|
corpus
|
</term>
. All other
<term>
resources
</term>
|
#9803
The only bilingual resource required is a sentence-aligned parallel corpus. |
tech,4-1-N06-4001,bq |
strategies . We introduce a new
<term>
interactive
|
corpus
|
exploration tool
</term>
called
<term>
InfoMagnets
|
#10870
We introduce a new interactive corpus exploration tool called InfoMagnets. |
lr,6-3-P06-1052,bq |
</term>
. We evaluate the algorithm on a
<term>
|
corpus
|
</term>
, and show that it reduces the degree
|
#11183
We evaluate the algorithm on a corpus, and show that it reduces the degree of ambiguity significantly while taking negligible runtime. |