identification of paraphrases
</term>
from a
<term>
corpus
of multiple English translations
</term>
#1790We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text.
lr,44-1-N03-1004,ak
for
<term>
answers
</term>
in multiple
<term>
corpora
</term>
. The
<term>
answering agents
</term>
#2351Motivated by the success of ensemble methods in machine learning and other areas of natural language processing, we developed a multi-strategy and multi-source approach to question answering which is based on combining the results from different answering agents searching for answers in multiple corpora.
lr,19-4-N03-1012,ak
successfully classifies 73.2 % in a
<term>
German
corpus
</term>
of 2.284
<term>
SRHs
</term>
as either
#2522An evaluation of our system against the annotated data shows that it successfully classifies 73.2% in a German corpus of 2.284 SRHs as either coherent or incoherent (given a baseline of 54.55%).
lr,12-1-N03-2006,ak
</term>
based on a
<term>
small-sized bilingual
corpus
</term>
, we use an
<term>
out-of-domain bilingual
#3094In order to boost the translation quality of EBMT based on a small-sized bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an in-domain monolingual corpus.
lr,19-1-N03-2006,ak
, we use an
<term>
out-of-domain bilingual
corpus
</term>
and , in addition , the
<term>
language
#3101In order to boost the translation quality of EBMT based on a small-sized bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an in-domain monolingual corpus.
lr,32-1-N03-2006,ak
model
</term>
of an
<term>
in-domain monolingual
corpus
</term>
. We conducted experiments with an
#3114In order to boost the translation quality of EBMT based on a small-sized bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an in-domain monolingual corpus.
lr,18-3-N03-2006,ak
of using an
<term>
out-of-domain bilingual
corpus
</term>
and the possibility of using the
<term>
#3144The two evaluation measures of the BLEU score and the NIST score demonstrated the effect of using an out-of-domain bilingual corpus and the possibility of using the language model.
lr,10-5-N03-2025,ak
Markov Model
</term>
is trained on a
<term>
corpus
</term>
automatically tagged by the first
#3369Then, a Hidden Markov Model is trained on a corpus automatically tagged by the first learner.
lr,19-2-N03-4010,ak
candidates
</term>
from the given
<term>
text
corpus
</term>
. The operation of the
<term>
system
#3683The demonstration will focus on how JAVELIN processes questions and retrieves the most likely answer candidates from the given text corpus.
other,15-1-P03-1009,ak
classes
</term>
from undisambiguated
<term>
corpus
data
</term>
. We describe a new approach
#3900Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data.
lr,15-5-P03-1031,ak
information
</term>
obtained from
<term>
dialogue
corpora
</term>
. Unlike conventional methods that
#4234This paper proposes a method for resolving this ambiguity based on statistical information obtained from dialogue corpora.
lr,17-2-P03-1050,ak
a
<term>
small ( 10K sentences ) parallel
corpus
</term>
as its sole
<term>
training resources
#4471The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.
lr,6-2-P03-1051,ak
by a
<term>
small manually segmented Arabic
corpus
</term>
and uses it to bootstrap an
<term>
#4650Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus.
lr,27-2-P03-1051,ak
</term>
from a
<term>
large unsegmented Arabic
corpus
</term>
. The
<term>
algorithm
</term>
uses a
#4670Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus.
lr,8-4-P03-1051,ak
estimated from a
<term>
small manually segmented
corpus
</term>
of about 110,000
<term>
words
</term>
#4702The language model is initially estimated from a small manually segmented corpus of about 110,000 words.
lr,18-5-P03-1051,ak
from a
<term>
155 million word unsegmented
corpus
</term>
, and re-estimate the
<term>
model
#4730To improve the segmentation accuracy, we use an unsupervised algorithm for automatically acquiring new stems from a 155 million word unsegmented corpus, and re-estimate the model parameters with the expanded vocabulary and training corpus.
lr,34-5-P03-1051,ak
<term>
vocabulary
</term>
and
<term>
training
corpus
</term>
. The resulting
<term>
Arabic word
#4743To improve the segmentation accuracy, we use an unsupervised algorithm for automatically acquiring new stems from a 155 million word unsegmented corpus, and re-estimate the model parameters with the expanded vocabulary and training corpus.
lr,15-6-P03-1051,ak
exact match accuracy
</term>
on a
<term>
test
corpus
</term>
containing 28,449
<term>
word tokens
#4761The resulting Arabic word segmentation system achieves around 97% exact match accuracy on a test corpus containing 28,449 word tokens.
lr,24-7-P03-1051,ak
can create a
<term>
small manually segmented
corpus
</term>
of the
<term>
language
</term>
of interest
#4794We believe this is a state-of-the-art performance and the algorithm can be used for many highly inflected languages provided that one can create a small manually segmented corpus of the language of interest.
lr,15-2-P03-1058,ak
</term>
from
<term>
English-Chinese parallel
corpora
</term>
, which are then used for disambiguating
#4840In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task.