|
selection function
</term>
is presented ,
|
which
|
yields superior
<term>
feature vectors
</term>
|
#5368
Finally, a novel feature weighting and selection function is presented, which yields superior feature vectors and better word similarity performance. |
|
this paper is the first step in a project
|
which
|
aims to cluster and summarise
<term>
electronic
|
#5392
The work presented in this paper is the first step in a project which aims to cluster and summarise electronic discussions in the context of help-desk applications. |
|
WSD ) system
</term>
for
<term>
Dutch
</term>
|
which
|
combines
<term>
statistical classification
|
#5997
In this paper, we present a corpus-based supervised word sense disambiguation (WSD) system for Dutchwhich combines statistical classification (maximum entropy) with linguistic information. |
|
comparison with previous
<term>
models
</term>
,
|
which
|
either use arbitrary
<term>
windows
</term>
|
#6356
In comparison with previous models, which either use arbitrary windows to compute similarity between words or use lexical affinity to create sequential models, in this paper we focus on models intended to capture the co-occurrence patterns of any pair of words or phrases at any distance in the corpus. |
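As context for this entry, a minimal sketch (not the cited paper's implementation) of what "co-occurrence patterns of any pair of words or phrases at any distance" can mean in practice: pairs are counted whenever they share a sentence, with no fixed window size. The function name and toy corpus are illustrative assumptions.

```python
# Minimal sketch: count co-occurrences of word pairs at any distance
# within a sentence, rather than within a fixed-size window.
from collections import Counter
from itertools import combinations

def cooccurrence_counts(sentences):
    """Count how often each unordered word pair co-occurs, at any distance,
    within the same sentence of the corpus."""
    counts = Counter()
    for tokens in sentences:
        for w1, w2 in combinations(sorted(set(tokens)), 2):
            counts[(w1, w2)] += 1
    return counts

corpus = [["the", "cat", "sat"], ["the", "dog", "sat", "down"]]
print(cooccurrence_counts(corpus)[("sat", "the")])  # 2
```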
|
</term>
, a
<term>
probabilistic model
</term>
|
which
|
has performed well on
<term>
information
|
#6832
The information extraction system we evaluate is based on a linear-chain conditional random field (CRF), a probabilistic modelwhich has performed well on information extraction tasks because of its ability to capture arbitrary, overlapping features of the input in a Markov model. |
|
takes advantage of the different way in
|
which
|
<term>
word senses
</term>
are lexicalised
|
#6967
Our method takes advantage of the different way in which word senses are lexicalised in English and Chinese, and also exploits the large amount of Chinese text available in corpora and on the Web. |
|
an impediment to progress in the field ,
|
which
|
we address with this work . Experiments
|
#7588
The lack of automatic methods for scoring system output is an impediment to progress in the field, which we address with this work. |
|
instance ,
<term>
statistical MT systems
</term>
|
which
|
usually segment their
<term>
outputs
</term>
|
#7780
The use of BLEU at the character level eliminates the word segmentation problem: it makes it possible to directly compare commercial systems outputting unsegmented texts with, for instance, statistical MT systemswhich usually segment their outputs. |
|
classification method
</term>
based on
<term>
PER
</term>
|
which
|
leverages
<term>
part of speech information
|
#8377
We also introduce a novel classification method based on PERwhich leverages part of speech information of the words contributing to the word matches and non-matches in the sentence. |
|
</term>
. This article considers approaches
|
which
|
rerank the output of an existing
<term>
probabilistic
|
#8653
This article considers approaches which rerank the output of an existing probabilistic parser. |
|
</term>
or a
<term>
generative model
</term>
|
which
|
takes these
<term>
features
</term>
into account
|
#8752
The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative modelwhich takes these features into account. |
|
</term>
for the
<term>
boosting approach
</term>
|
which
|
takes advantage of the
<term>
sparsity of
|
#8872
The article also introduces a new algorithm for the boosting approachwhich takes advantage of the sparsity of the feature space in the parsing data. |
|
applicable to many other
<term>
NLP problems
</term>
|
which
|
are naturally framed as
<term>
ranking tasks
|
#8961
Although the experiments in this article are on natural language parsing (NLP), the approach should be applicable to many other NLP problemswhich are naturally framed as ranking tasks, for example, speech recognition, machine translation, or natural language generation. |
|
benefit to
<term>
language pairs
</term>
for
|
which
|
only scarce
<term>
resources
</term>
are available
|
#9114
Thus, our method can be applied with great benefit to language pairs for which only scarce resources are available. |
|
phrase-based statistical machine translation
</term>
|
which
|
allows for the
<term>
retrieval
</term>
of
|
#9135
In this paper we describe a novel data structure for phrase-based statistical machine translationwhich allows for the retrieval of arbitrarily long phrases while simultaneously using less memory than is required by current decoder implementations. |
|
the
<term>
machine translation task
</term>
,
|
which
|
can also be viewed as a
<term>
stochastic
|
#9487
Second, we describe the graphical model for the machine translation task, which can also be viewed as a stochastic tree-to-tree transducer. |
|
<term>
log-linear block bigram model
</term>
|
which
|
uses
<term>
real-valued features
</term>
(
|
#9598
We use a maximum likelihood criterion to train a log-linear block bigram modelwhich uses real-valued features (e.g. a language model score) as well as binary features based on the block identities themselves, e.g. block bigram features. |
|
statistical machine translation system
</term>
|
which
|
performs
<term>
tree-to-tree translation
</term>
|
#9786
We present a Czech-English statistical machine translation systemwhich performs tree-to-tree translation of dependency structures. |
|
Statistical Machine Translation ( SMT )
</term>
but
|
which
|
have not been addressed satisfactorily
|
#9944
In this paper we study a set of problems that are of considerable importance to Statistical Machine Translation (SMT) but which have not been addressed satisfactorily by the SMT research community. |
|
present a new
<term>
evaluation measure
</term>
|
which
|
explicitly models
<term>
block reordering
|
#10365
In this paper, we will present a new evaluation measurewhich explicitly models block reordering as an edit operation. |