lr,52-3-A94-1011,bq A novel method for adding <term> linguistic annotation </term> to <term> corpora </term> is presented which involves using a <term> statistical POS tagger </term> in conjunction with <term> unsupervised structure finding methods </term> to derive notions of <term> noun group </term> , <term> verb group </term> , and so on which is inherently extensible to more sophisticated <term> annotation </term> , and does not require a <term> pre-tagged corpus </term> to fit .
lr,8-3-A94-1011,bq A novel method for adding <term> linguistic annotation </term> to <term> corpora </term> is presented which involves using a <term> statistical POS tagger </term> in conjunction with <term> unsupervised structure finding methods </term> to derive notions of <term> noun group </term> , <term> verb group </term> , and so on which is inherently extensible to more sophisticated <term> annotation </term> , and does not require a <term> pre-tagged corpus </term> to fit .
measure(ment),12-6-A94-1011,bq We investigate how sets of individually high-precision <term> rules </term> can result in a <term> low precision </term> when used together , and develop some theory about these probably-correct <term> rules </term> .
other,11-5-A94-1011,bq This leads us to consider the assignment of <term> descriptors </term> from individual <term> phrases </term> rather than from the <term> weighted sum </term> of a <term> word set representation </term> .
other,15-4-A94-1011,bq One of the distinguishing features of a more <term> linguistically sophisticated representation of documents </term> over a <term> word set based representation </term> of them is that <term> linguistically sophisticated units </term> are more frequently individually good predictors of <term> document descriptors ( keywords ) </term> than single <term> words </term> are .
other,16-5-A94-1011,bq This leads us to consider the assignment of <term> descriptors </term> from individual <term> phrases </term> rather than from the <term> weighted sum </term> of a <term> word set representation </term> .
other,18-7-A94-1011,bq We then proceed to repeat results which show that standard <term> statistical models </term> are not particularly suitable for exploiting <term> linguistically sophisticated representations </term> , and show that a <term> statistically fitted rule-based model </term> provides significantly improved performance for sophisticated <term> representations </term> .
other,20-5-A94-1011,bq This leads us to consider the assignment of <term> descriptors </term> from individual <term> phrases </term> rather than from the <term> weighted sum </term> of a <term> word set representation </term> .
other,22-8-A94-1011,bq It therefore shows that <term> statistical systems </term> can exploit <term> sophisticated representations of documents </term> , and lends some support to the use of more <term> linguistically sophisticated representations </term> for <term> document classification </term> .
other,23-4-A94-1011,bq One of the distinguishing features of a more <term> linguistically sophisticated representation of documents </term> over a <term> word set based representation </term> of them is that <term> linguistically sophisticated units </term> are more frequently individually good predictors of <term> document descriptors ( keywords ) </term> than single <term> words </term> are .
other,25-6-A94-1011,bq We investigate how sets of individually high-precision <term> rules </term> can result in a <term> low precision </term> when used together , and develop some theory about these probably-correct <term> rules </term> .
other,26-9-A94-1011,bq This paper reports on work done for the <term> LRE project SmTA double check </term> , which is creating a <term> PC based tool </term> to be used in the <term> technical abstracting industry </term> .
other,29-3-A94-1011,bq A novel method for adding <term> linguistic annotation </term> to <term> corpora </term> is presented which involves using a <term> statistical POS tagger </term> in conjunction with <term> unsupervised structure finding methods </term> to derive notions of <term> noun group </term> , <term> verb group </term> , and so on which is inherently extensible to more sophisticated <term> annotation </term> , and does not require a <term> pre-tagged corpus </term> to fit .
other,32-3-A94-1011,bq A novel method for adding <term> linguistic annotation </term> to <term> corpora </term> is presented which involves using a <term> statistical POS tagger </term> in conjunction with <term> unsupervised structure finding methods </term> to derive notions of <term> noun group </term> , <term> verb group </term> , and so on which is inherently extensible to more sophisticated <term> annotation </term> , and does not require a <term> pre-tagged corpus </term> to fit .
other,33-4-A94-1011,bq One of the distinguishing features of a more <term> linguistically sophisticated representation of documents </term> over a <term> word set based representation </term> of them is that <term> linguistically sophisticated units </term> are more frequently individually good predictors of <term> document descriptors ( keywords ) </term> than single <term> words </term> are .
other,36-7-A94-1011,bq We then proceed to repeat results which show that standard <term> statistical models </term> are not particularly suitable for exploiting <term> linguistically sophisticated representations </term> , and show that a <term> statistically fitted rule-based model </term> provides significantly improved performance for sophisticated <term> representations </term> .
other,40-4-A94-1011,bq One of the distinguishing features of a more <term> linguistically sophisticated representation of documents </term> over a <term> word set based representation </term> of them is that <term> linguistically sophisticated units </term> are more frequently individually good predictors of <term> document descriptors ( keywords ) </term> than single <term> words </term> are .
other,45-3-A94-1011,bq A novel method for adding <term> linguistic annotation </term> to <term> corpora </term> is presented which involves using a <term> statistical POS tagger </term> in conjunction with <term> unsupervised structure finding methods </term> to derive notions of <term> noun group </term> , <term> verb group </term> , and so on which is inherently extensible to more sophisticated <term> annotation </term> , and does not require a <term> pre-tagged corpus </term> to fit .
other,5-3-A94-1011,bq A novel method for adding <term> linguistic annotation </term> to <term> corpora </term> is presented which involves using a <term> statistical POS tagger </term> in conjunction with <term> unsupervised structure finding methods </term> to derive notions of <term> noun group </term> , <term> verb group </term> , and so on which is inherently extensible to more sophisticated <term> annotation </term> , and does not require a <term> pre-tagged corpus </term> to fit .
other,7-6-A94-1011,bq We investigate how sets of individually high-precision <term> rules </term> can result in a <term> low precision </term> when used together , and develop some theory about these probably-correct <term> rules </term> .