#8120The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account.
tech,6-9-J05-1003,ak
The article also introduces a new
<term>
algorithm
</term>
for the
<term>
boosting approach
</term>
#8232The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data.
other,7-5-J05-1003,ak
We introduce a new method for the
<term>
reranking task
</term>
, based on the
<term>
boosting approach
#8131We introduce a new method for the reranking task, based on the boosting approach to ranking problems described in Freund et al. (1998).
tech,11-1-J05-1003,ak
which rerank the output of an existing
<term>
probabilistic parser
</term>
. The
<term>
base parser
</term>
produces
#8025This article considers approaches which rerank the output of an existing probabilistic parser.
tech,43-12-J05-1003,ak
<term>
machine translation
</term>
, or
<term>
natural language generation
</term>
. We present a novel method for discovering
#8344Although the experiments in this article are on natural language parsing (NLP), the approach should be applicable to many other NLP problems which are naturally framed as ranking tasks, for example, speech recognition, machine translation, or natural language generation.
model,40-4-J05-1003,ak
define a
<term>
derivation
</term>
or a
<term>
generative model
</term>
which takes these
<term>
features
</term>
#8115The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account.
other,10-4-J05-1003,ak
of our approach is that it allows a
<term>
tree
</term>
to be represented as an arbitrary
#8085The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account.
other,25-7-J05-1003,ak
additional 500,000
<term>
features
</term>
over
<term>
parse trees
</term>
that were not included in the original
#8189The method combined the log-likelihood under a baseline model (that of Collins [1999]) with evidence from an additional 500,000 features over parse trees that were not included in the original model.
other,26-4-J05-1003,ak
, without concerns about how these
<term>
features
</term>
interact or overlap and without the
#8101The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account.
measure(ment),14-8-J05-1003,ak
</term>
, a 13 % relative decrease in
<term>
F-measure error
</term>
over the
<term>
baseline model ’s
</term>
#8214The new model achieved 89.75% F-measure, a 13% relative decrease in F-measure error over the baseline model’s score of 88.2%.
tech,21-11-J05-1003,ak
simplicity and efficiency — to work on
<term>
feature selection methods
</term>
within
<term>
log-linear ( maximum-entropy
#8291We argue that the method is an appealing alternative—in terms of both simplicity and efficiency—to work on feature selection methods within log-linear (maximum-entropy) models.
other,21-2-J05-1003,ak
probabilities
</term>
that define an initial
<term>
ranking
</term>
of these
<term>
parses
</term>
. A second
#8049The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses.
model,34-7-J05-1003,ak
were not included in the original
<term>
model
</term>
. The new
<term>
model
</term>
achieved
#8198The method combined the log-likelihood under a baseline model (that of Collins [1999]) with evidence from an additional 500,000 features over parse trees that were not included in the original model.
model,25-11-J05-1003,ak
feature selection methods
</term>
within
<term>
log-linear ( maximum-entropy ) models
</term>
. Although the experiments in this
#8295We argue that the method is an appealing alternative—in terms of both simplicity and efficiency—to work on feature selection methods within log-linear (maximum-entropy) models.
other,17-3-J05-1003,ak
additional
<term>
features
</term>
of the
<term>
tree
</term>
as evidence . The strength of our
#8071A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence.
other,37-4-J05-1003,ak
overlap and without the need to define a
<term>
derivation
</term>
or a
<term>
generative model
</term>
#8112The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account.
tech,9-9-J05-1003,ak
a new
<term>
algorithm
</term>
for the
<term>
boosting approach
</term>
which takes advantage of the
<term>
#8235The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data.
model,2-3-J05-1003,ak
these
<term>
parses
</term>
. A second
<term>
model
</term>
then attempts to improve upon this
#8056A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence.
measure(ment),6-8-J05-1003,ak
<term>
model
</term>
achieved 89.75 %
<term>
F-measure
</term>
, a 13 % relative decrease in
<term>
#8206The new model achieved 89.75% F-measure, a 13% relative decrease in F-measure error over the baseline model’s score of 88.2%.
tech,15-10-J05-1003,ak
the obvious implementation of the
<term>
boosting approach
</term>
. We argue that the method is an
#8267Experiments show significant efficiency gains for the new algorithm over the obvious implementation of the boosting approach.