This perplexing fact needs both an explanation and a solution if the power of recently developed
<term>
NLP techniques
</term>
are to be successfully applied in
<term>
IR
</term>
.
#25828This perplexing fact needs both an explanation and a solution if the power of recently developed NLP techniques are to be successfully applied in IR.
other,35-7-A94-1011,ak
We then proceed to repeat results which show that standard
<term>
statistical models
</term>
are not particularly suitable for exploiting
<term>
linguistically sophisticated representations
</term>
, and show that a
<term>
statistically fitted rule-based model
</term>
provides significantly improved performance for
<term>
sophisticated representations
</term>
.
#26016We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations.
tech,45-3-A94-1011,ak
A novel method for adding
<term>
linguistic annotation
</term>
to
<term>
corpora
</term>
is presented which involves using a
<term>
statistical POS tagger
</term>
in conjunction with
<term>
unsupervised structure finding methods
</term>
to derive notions of
<term>
noun group
</term>
,
<term>
verb group
</term>
, and so on which is inherently extensible to more sophisticated
<term>
annotation
</term>
, and does not require a
<term>
pre-tagged corpus
</term>
to fit .
#25875A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit.
other,15-4-A94-1011,ak
One of the distinguishing features of a more
<term>
linguistically sophisticated representation
</term>
of
<term>
documents
</term>
over a
<term>
word set based representation
</term>
of them is that linguistically sophisticated units are more frequently individually good predictors of
<term>
document descriptors ( keywords )
</term>
than single
<term>
words
</term>
are .
#25902One of the distinguishing features of a more linguistically sophisticated representation of documents over a word set based representation of them is that linguistically sophisticated units are more frequently individually good predictors of document descriptors (keywords) than single words are.
model,25-6-A94-1011,ak
We investigate how sets of individually
<term>
high-precision rules
</term>
can result in a low
<term>
precision
</term>
when used together , and develop some theory about these probably-correct
<term>
rules
</term>
.
#25979We investigate how sets of individually high-precision rules can result in a low precision when used together, and develop some theory about these probably-correct rules.
other,33-4-A94-1011,ak
One of the distinguishing features of a more
<term>
linguistically sophisticated representation
</term>
of
<term>
documents
</term>
over a
<term>
word set based representation
</term>
of them is that linguistically sophisticated units are more frequently individually good predictors of
<term>
document descriptors ( keywords )
</term>
than single
<term>
words
</term>
are .
#25920One of the distinguishing features of a more linguistically sophisticated representation of documents over a word set based representation of them is that linguistically sophisticated units are more frequently individually good predictors of document descriptors (keywords) than single words are.
other,40-4-A94-1011,ak
One of the distinguishing features of a more
<term>
linguistically sophisticated representation
</term>
of
<term>
documents
</term>
over a
<term>
word set based representation
</term>
of them is that linguistically sophisticated units are more frequently individually good predictors of
<term>
document descriptors ( keywords )
</term>
than single
<term>
words
</term>
are .
#25927One of the distinguishing features of a more linguistically sophisticated representation of documents over a word set based representation of them is that linguistically sophisticated units are more frequently individually good predictors of document descriptors (keywords) than single words are.
tech,4-8-A94-1011,ak
It therefore shows that
<term>
statistical systems
</term>
can exploit
<term>
sophisticated representations
</term>
of
<term>
documents
</term>
, and lends some support to the use of more
<term>
linguistically sophisticated representations
</term>
for
<term>
document classification
</term>
.
#26023It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification.
other,18-7-A94-1011,ak
We then proceed to repeat results which show that standard
<term>
statistical models
</term>
are not particularly suitable for exploiting
<term>
linguistically sophisticated representations
</term>
, and show that a
<term>
statistically fitted rule-based model
</term>
provides significantly improved performance for
<term>
sophisticated representations
</term>
.
#25999We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations.
other,11-8-A94-1011,ak
It therefore shows that
<term>
statistical systems
</term>
can exploit
<term>
sophisticated representations
</term>
of
<term>
documents
</term>
, and lends some support to the use of more
<term>
linguistically sophisticated representations
</term>
for
<term>
document classification
</term>
.
#26030It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification.
tech,26-8-A94-1011,ak
It therefore shows that
<term>
statistical systems
</term>
can exploit
<term>
sophisticated representations
</term>
of
<term>
documents
</term>
, and lends some support to the use of more
<term>
linguistically sophisticated representations
</term>
for
<term>
document classification
</term>
.
#26045It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification.
model,6-6-A94-1011,ak
We investigate how sets of individually
<term>
high-precision rules
</term>
can result in a low
<term>
precision
</term>
when used together , and develop some theory about these probably-correct
<term>
rules
</term>
.
#25960We investigate how sets of individually high-precision rules can result in a low precision when used together, and develop some theory about these probably-correct rules.
other,11-5-A94-1011,ak
This leads us to consider the assignment of
<term>
descriptors
</term>
from individual
<term>
phrases
</term>
rather than from the
<term>
weighted sum
</term>
of a
<term>
word set representation
</term>
.
#25941This leads us to consider the assignment of descriptors from individual phrases rather than from the weighted sum of a word set representation.
tech,18-1-A94-1011,ak
The use of
<term>
NLP techniques
</term>
for
<term>
document classification
</term>
has not produced significant improvements in performance within the standard
<term>
term weighting statistical assignment paradigm
</term>
( Fagan 1987 ; Lewis , 1992bc ; Buckley , 1993 ) .
#25786The use of NLP techniques for document classification has not produced significant improvements in performance within the standard term weighting statistical assignment paradigm (Fagan 1987; Lewis, 1992bc; Buckley, 1993).
tech,3-1-A94-1011,ak
The use of
<term>
NLP techniques
</term>
for
<term>
document classification
</term>
has not produced significant improvements in performance within the standard
<term>
term weighting statistical assignment paradigm
</term>
( Fagan 1987 ; Lewis , 1992bc ; Buckley , 1993 ) .
#25771The use of NLP techniques for document classification has not produced significant improvements in performance within the standard term weighting statistical assignment paradigm (Fagan 1987; Lewis, 1992bc; Buckley, 1993).
measure(ment),13-6-A94-1011,ak
We investigate how sets of individually
<term>
high-precision rules
</term>
can result in a low
<term>
precision
</term>
when used together , and develop some theory about these probably-correct
<term>
rules
</term>
.
#25967We investigate how sets of individually high-precision rules can result in a low precision when used together, and develop some theory about these probably-correct rules.
model,10-7-A94-1011,ak
We then proceed to repeat results which show that standard
<term>
statistical models
</term>
are not particularly suitable for exploiting
<term>
linguistically sophisticated representations
</term>
, and show that a
<term>
statistically fitted rule-based model
</term>
provides significantly improved performance for
<term>
sophisticated representations
</term>
.
#25991We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations.
other,8-5-A94-1011,ak
This leads us to consider the assignment of
<term>
descriptors
</term>
from individual
<term>
phrases
</term>
rather than from the
<term>
weighted sum
</term>
of a
<term>
word set representation
</term>
.
#25938This leads us to consider the assignment of descriptors from individual phrases rather than from the weighted sum of a word set representation.
other,22-8-A94-1011,ak
It therefore shows that
<term>
statistical systems
</term>
can exploit
<term>
sophisticated representations
</term>
of
<term>
documents
</term>
, and lends some support to the use of more
<term>
linguistically sophisticated representations
</term>
for
<term>
document classification
</term>
.
#26041It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification.
tech,21-3-A94-1011,ak
A novel method for adding
<term>
linguistic annotation
</term>
to
<term>
corpora
</term>
is presented which involves using a
<term>
statistical POS tagger
</term>
in conjunction with
<term>
unsupervised structure finding methods
</term>
to derive notions of
<term>
noun group
</term>
,
<term>
verb group
</term>
, and so on which is inherently extensible to more sophisticated
<term>
annotation
</term>
, and does not require a
<term>
pre-tagged corpus
</term>
to fit .
#25851A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit.