#305 In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser.
tech,36-1-H01-1040,ak
text collections
</term>
via a standard
<term>
text
browser
</term>
. We describe how this information
#310 In this paper we show how two standard outputs from information extraction (IE) systems - named entity annotations and scenario templates - can be used to enhance access to text collections via a standard text browser.
tech,38-3-H01-1040,ak
increased potential of
<term>
IE-enhanced
text
browsers
</term>
. At MIT Lincoln Laboratory
#383 We also report results of a preliminary, qualitative user evaluation of the system, which while broadly positive indicates further work needs to be done on the interface to make users aware of the increased potential of IE-enhanced text browsers.
up to six extracts of translated newswire
text
. Some of the extracts were
<term>
expert
#695 Subjects were given a set of up to six extracts of translated newswire text.
other,20-2-P01-1008,ak
translations
</term>
of the same
<term>
source
text
</term>
. Our approach yields
<term>
phrasal
#1799 We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text.
tech,29-1-N03-1018,ak
progressing from
<term>
generation of true
text
</term>
through its transformation into the
#2700 In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system.
other,38-3-N03-1018,ak
translation lexicons
</term>
from printed
<term>
text
</term>
. We present an application of
<term>
#2783 We present an implementation of the model based on finite-state models, demonstrate the model's ability to significantly reduce character and word error rate, and provide evaluation results involving automatic extraction of translation lexicons from printed text.
other,13-2-N03-2003,ak
data
</term>
can be supplemented with
<term>
text
</term>
from the
<term>
web
</term>
filtered
#3042 In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.
other,24-1-N03-4010,ak
answering capability
</term>
on
<term>
free
text
</term>
. The demonstration will focus on
#3661 The JAVELIN system integrates a flexible, planning-based architecture with a variety of language processing modules to provide an open-domain question answering capability on free text.
lr,19-2-N03-4010,ak
answer candidates
</term>
from the given
<term>
text
corpus
</term>
. The operation of the
<term>
#3682 The demonstration will focus on how JAVELIN processes questions and retrieves the most likely answer candidates from the given text corpus.
lr,1-3-P03-1050,ak
training resources
</term>
. No
<term>
parallel
text
</term>
is needed after the
<term>
training
#4480 No parallel text is needed after the training phase.
lr,0-4-P03-1050,ak
phase
</term>
.
<term>
Monolingual , unannotated
text
</term>
can be used to further improve the
#4491 Monolingual, unannotated text can be used to further improve the stemmer by allowing it to adapt to a desired domain or genre.
lr,26-6-P03-1050,ak
affix lists
</term>
, and
<term>
human annotated
text
</term>
, in addition to an
<term>
unsupervised
#4562 Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
other,16-7-P03-1050,ak
average precision
</term>
over
<term>
unstemmed
text
</term>
, and 96 % of the performance of
#4588 Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above.
tech,7-1-H05-1032,ak
presents a
<term>
Bayesian model
</term>
for
<term>
text
summarization
</term>
, which explicitly
#5370 The paper presents a Bayesian model for text summarization, which explicitly encodes and exploits information on how human judgments are distributed over the text.
other,24-1-H05-1032,ak
judgments
</term>
are distributed over the
<term>
text
</term>
. Comparison is made against
<term>
#5387 The paper presents a Bayesian model for text summarization, which explicitly encodes and exploits information on how human judgments are distributed over the text.
other,12-2-H05-1032,ak
test data
</term>
from
<term>
Japanese news
texts
</term>
. It is found that the
<term>
Bayesian
#5403 Comparison is made against non-Bayesian summarizers, using test data from Japanese news texts.
other,13-1-I05-2013,ak
which takes as
<term>
input
</term>
a
<term>
raw
text
</term>
in
<term>
French
</term>
and produces
#6093 We present a tool, called ILIMP, which takes as input a raw text in French and produces as output the same text in which every occurrence of the pronoun il is tagged either with tag [ANA] for anaphoric or [IMP] for impersonal or expletive.
other,23-1-I05-2013,ak
produces as
<term>
output
</term>
the same
<term>
text
</term>
in which every
<term>
occurrence
</term>
#6102 We present a tool, called ILIMP, which takes as input a raw text in French and produces as output the same text in which every occurrence of the pronoun il is tagged either with tag [ANA] for anaphoric or [IMP] for impersonal or expletive.
commercial systems outputting unsegmented
texts
with , for instance ,
<term>
statistical
#6314 The use of BLEU at the character level eliminates the word segmentation problem: it makes it possible to directly compare commercial systems outputting unsegmented texts with, for instance, statistical MT systems which usually segment their outputs.