other,45-3-A94-1011,bq |
inherently extensible to more sophisticated
<term>
|
annotation
|
</term>
, and does not require a
<term>
pre-tagged
|
#19990
A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit. |
other,8-8-A94-1011,bq |
statistical systems
</term>
can exploit
<term>
|
sophisticated representations of documents
|
</term>
, and lends some support to the use
|
#20142
It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification. |
other,18-7-A94-1011,bq |
particularly suitable for exploiting
<term>
|
linguistically sophisticated representations
|
</term>
, and show that a
<term>
statistically
|
#20114
We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations. |
other,32-3-A94-1011,bq |
notions of
<term>
noun group
</term>
,
<term>
|
verb group
|
</term>
, and so on which is inherently extensible
|
#19977
A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit. |
other,29-3-A94-1011,bq |
methods
</term>
to derive notions of
<term>
|
noun group
|
</term>
,
<term>
verb group
</term>
, and so
|
#19974
A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit. |
other,8-9-A94-1011,bq |
paper reports on work done for the
<term>
|
LRE project SmTA double check
|
</term>
, which is creating a
<term>
PC based
|
#20171
This paper reports on work done for the LRE project SmTA double check, which is creating a PC based tool to be used in the technical abstracting industry. |
tech,24-2-A94-1011,bq |
</term>
are to be successfully applied in
<term>
|
IR
|
</term>
. A novel method for adding
<term>
|
#19943
This perplexing fact needs both an explanation and a solution if the power of recently developed NLP techniques are to be successfully applied in IR. |
other,36-7-A94-1011,bq |
improved performance for sophisticated
<term>
|
representations
|
</term>
. It therefore shows that
<term>
statistical
|
#20132
We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations. |
tech,26-8-A94-1011,bq |
sophisticated representations
</term>
for
<term>
|
document classification
|
</term>
. This paper reports on work done
|
#20160
It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification. |
other,26-9-A94-1011,bq |
based tool
</term>
to be used in the
<term>
|
technical abstracting industry
|
</term>
. This paper proposes a model using
|
#20189
This paper reports on work done for the LRE project SmTA double check, which is creating a PC based tool to be used in the technical abstracting industry. |
other,20-5-A94-1011,bq |
from the
<term>
weighted sum
</term>
of a
<term>
|
word set representation
|
</term>
. We investigate how sets of individually
|
#20065
This leads us to consider the assignment of descriptors from individual phrases rather than from the weighted sum of a word set representation. |
other,25-6-A94-1011,bq |
theory about these probably-correct
<term>
|
rules
|
</term>
. We then proceed to repeat results
|
#20094
We investigate how sets of individually high-precision rules can result in a low precision when used together, and develop some theory about these probably-correct rules. |
tech,18-1-A94-1011,bq |
in performance within the standard
<term>
|
term weighting statistical assignment paradigm
|
</term>
( Fagan 1987 ; Lewis , 1992bc ; Buckley
|
#19901
The use of NLP techniques for document classification has not produced significant improvements in performance within the standard term weighting statistical assignment paradigm (Fagan 1987; Lewis, 1992bc; Buckley, 1993). |
other,40-4-A94-1011,bq |
descriptors ( keywords )
</term>
than single
<term>
|
words
|
</term>
are . This leads us to consider the
|
#20042
One of the distinguishing features of a more linguistically sophisticated representation of documents over a word set based representation of them is that linguistically sophisticated units are more frequently individually good predictors of document descriptors (keywords) than single words are. |
other,23-4-A94-1011,bq |
representation
</term>
of them is that
<term>
|
linguistically sophisticated units
|
</term>
are more frequently individually
|
#20025
One of the distinguishing features of a more linguistically sophisticated representation of documents over a word set based representation of them is that linguistically sophisticated units are more frequently individually good predictors of document descriptors (keywords) than single words are. |
tech,10-7-A94-1011,bq |
repeat results which show that standard
<term>
|
statistical models
|
</term>
are not particularly suitable for
|
#20106
We then proceed to repeat results which show that standard statistical models are not particularly suitable for exploiting linguistically sophisticated representations, and show that a statistically fitted rule-based model provides significantly improved performance for sophisticated representations. |
tech,16-2-A94-1011,bq |
if the power of recently developed
<term>
|
NLP techniques
|
</term>
are to be successfully applied in
|
#19935
This perplexing fact needs both an explanation and a solution if the power of recently developed NLP techniques are to be successfully applied in IR. |
tech,4-8-A94-1011,bq |
representations
</term>
. It therefore shows that
<term>
|
statistical systems
|
</term>
can exploit
<term>
sophisticated representations
|
#20138
It therefore shows that statistical systems can exploit sophisticated representations of documents, and lends some support to the use of more linguistically sophisticated representations for document classification. |
other,7-6-A94-1011,bq |
sets of individually high-precision
<term>
|
rules
|
</term>
can result in a
<term>
low precision
|
#20076
We investigate how sets of individually high-precision rules can result in a low precision when used together, and develop some theory about these probably-correct rules. |
tech,3-1-A94-1011,bq |
translation
</term>
use . The use of
<term>
|
NLP techniques
|
</term>
for
<term>
document classification
</term>
|
#19886
The use of NLP techniques for document classification has not produced significant improvements in performance within the standard term weighting statistical assignment paradigm (Fagan 1987; Lewis, 1992bc; Buckley, 1993). |