Concordance

ubiquitous and carries important information yet	it	is also time consuming to document . Given	#9 Oral communication is ubiquitous and carries important information yet it is also time consuming to document.
the <term> annotated data </term> shows that ,	it	successfully classifies 73.2 % in a <term>	#2514 An evaluation of our system against the annotated data shows that, it successfully classifies 73.2% in a German corpus of 2.284 SRHs as either coherent or incoherent (given a baseline of 54.55%).
black-box OCR systems </term> in order to make	it	more useful for <term> NLP tasks </term> .	#2738 The model is designed for use in error correction, with a focus on post-processing the output of black-box OCR systems in order to make it more useful for NLP tasks.
target recognition task </term> , but also that	it	is possible to get bigger performance gains	#3062 In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, but also that it is possible to get bigger performance gains from the data by using class-dependent interpolation of N-grams.
create a <term> word-trie </term> , transform	it	into a <term> minimal DFA </term> , then identify	#3199 We create a word-trie, transform it into a minimal DFA, then identify hubs.
as the <term> cohesion constraint </term> .	It	requires disjoint <term> English phrases </term>	#3244 We present a syntax-based constraint for word alignment, known as the cohesion constraint. It requires disjoint English phrases to be mapped to non-overlapping intervals in the French sentence.
algorithms </term> . The results show that	it	can provide a significant improvement in	#3276 The results show that it can provide a significant improvement in alignment quality.
inflow of multilingual , multimedia data .	It	gives users the ability to spend their	#3605 The TAP-XL Automated Analyst's Assistant is an application designed to help an English-speaking analyst write a topical report, culling information from a large inflow of multilingual, multimedia data. It gives users the ability to spend their time finding more data relevant to their task, and gives them translingual reach into other languages by leveraging human language technology.
central to our <term> IE paradigm </term> .	It	is based on : ( 1 ) an extended set of <term>	#3751 We also introduce a new way of automatically identifying predicate argument structures, which is central to our IE paradigm. It is based on: (1) an extended set of features; and (2) inductive decision tree learning.
Switchboard dialogues </term> and show that	it	compares well to Byron 's ( 2002 ) manually	#4030 We evaluate the system on twenty Switchboard dialogues and show that it compares well to Byron's (2002) manually tuned system.
</term> of <term> speech understanding </term> ,	it	is not appropriate to decide on a single	#4178 Since multiple candidates for the understanding result can be obtained for a user utterance due to the ambiguity of speech understanding, it is not appropriate to decide on a single understanding result after each user utterance.
statistical machine translation </term> and	it	uses an <term> English stemmer </term> and	#4458 The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.
improve the <term> stemmer </term> by allowing	it	to adapt to a desired <term> domain </term>	#4502 Monolingual, unannotated text can be used to further improve the stemmer by allowing it to adapt to a desired domain or genre.
manually segmented Arabic corpus </term> and uses	it	to bootstrap an <term> unsupervised algorithm	#4653 Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus.
English . Typically , information that makes	it	to a summary appears in many different <term>	#5207 Typically, information that makes it to a summary appears in many different lexical-syntactic forms in the input documents.
training data </term> . We demonstrate that	it	is feasible to create <term> training material	#5295 We demonstrate that it is feasible to create training material for problems in machine translation and that a mixture of supervised and unsupervised methods yields superior performance.
</term> from <term> Japanese news texts </term> .	It	is found that the <term> Bayesian approach	#5405 Comparison is made against non Bayesian summarizers, using test data from Japanese news texts. It is found that the Bayesian approach generally leverages performance of a summarizer, at times giving it a significant lead over non-Bayesian models.
<term> summarizer </term> , at times giving	it	a significant lead over <term> non-Bayesian	#5422 It is found that the Bayesian approach generally leverages performance of a summarizer, at times giving it a significant lead over non-Bayesian models.
version of our method and hypothesize that	it	can outperform a competitive <term> baseline	#5863 Currently, we present a topic-sensitive version of our method and hypothesize that it can outperform a competitive baseline, which compares the similarity of each sentence to the input question via IDF-weighted word overlap.
</term> of this <term> pronoun </term> , for which	it	does not make sense to look for an <term>	#6167 This tool is therefore designed to distinguish between the anaphoric occurrences of il, for which an anaphora resolution system has to look for an antecedent, and the expletive occurrences of this pronoun, for which it does not make sense to look for an antecedent.

