The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence.
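As a concrete illustration of this two-stage setup, the sketch below reranks a base parser's candidate list by combining its log-probability with a weighted sum over an arbitrary set of indicator features. All names here (Candidate, rerank_score, the weights dictionary) are illustrative assumptions for exposition, not the article's implementation.

from dataclasses import dataclass

@dataclass
class Candidate:
    tree: str                # candidate parse for one input sentence
    base_logprob: float      # log-probability under the base parser
    features: frozenset = frozenset()  # arbitrary set of indicator features

def rerank_score(cand, weights, w0=1.0):
    # Combine the base model's log-likelihood with weighted feature evidence.
    return w0 * cand.base_logprob + sum(weights.get(f, 0.0) for f in cand.features)

def rerank(candidates, weights):
    # With all feature weights at zero this reproduces the base parser's
    # initial ranking; the second model adjusts it through the weights.
    return sorted(candidates, key=lambda c: rerank_score(c, weights), reverse=True)

Training the second model then amounts to choosing the feature weights.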
The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account.
The method combined the log-likelihood under a baseline model (that of Collins [1999]) with evidence from an additional 500,000 features over parse trees that were not included in the original model. The new model achieved 89.75% F-measure, a 13% relative decrease in F-measure error over the baseline model's score of 88.2%.
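For readers checking the arithmetic, the quoted 13% figure follows directly from the two scores:

\[
\frac{(100 - 88.2) - (100 - 89.75)}{100 - 88.2} \;=\; \frac{11.8 - 10.25}{11.8} \;\approx\; 0.13,
\]

i.e., a roughly 13% relative decrease in F-measure error.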
The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data. Experiments show significant efficiency gains for the new algorithm over the obvious implementation of the boosting approach.
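One plausible way to exploit that sparsity (a sketch under assumptions, not the article's actual algorithm): because each parse tree activates only a small fraction of the feature space, an inverted index from feature to candidates lets a per-feature weight update touch only the candidates that actually contain the feature, rather than rescoring the full set. This reuses the hypothetical Candidate class from the earlier sketch.

from collections import defaultdict

def build_feature_index(candidates):
    # Inverted index: feature -> candidates containing it. With sparse
    # features, each posting list is short relative to the candidate set.
    index = defaultdict(list)
    for cand in candidates:
        for f in cand.features:
            index[f].append(cand)
    return index

def apply_weight_update(index, scores, feature, delta):
    # After a boosting-style round changes one feature's weight by `delta`,
    # refresh only the scores of candidates that contain that feature.
    for cand in index.get(feature, ()):
        scores[id(cand)] += delta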
Although the experiments in this article are on natural language parsing (NLP), the approach should be applicable to many other NLP problems which are naturally framed as ranking tasks, for example, speech recognition, machine translation, or natural language generation.