lr,9-2-P06-2001,bq |
experiments , and trained with a little
<term>
|
corpus
|
</term>
of 100,000
<term>
words
</term>
, the
|
#11236
After several experiments, and trained with a little corpus of 100,000 words, the system guesses correctly not placing commas with a precision of 96% and a recall of 98%. |
lr,8-1-P06-2059,bq |
method of building
<term>
polarity-tagged
|
corpus
|
</term>
from
<term>
HTML documents
</term>
.
|
#11401
This paper proposes a novel method of building polarity-tagged corpus from HTML documents. |
lr,29-2-C88-2130,bq |
</term>
derived through analysis of our
<term>
|
corpus
|
</term>
.
<term>
Chart parsing
</term>
is
<term>
|
#15495
The model is embodied in a program, APT, that can reproduce segments of actual tape-recorded descriptions, using organizational and discourse strategies derived through analysis of our corpus. |
lr,15-2-C90-3063,bq |
co-occurrence patterns
</term>
in a large
<term>
|
corpus
|
</term>
. To a large extent , these
<term>
|
#16631
This paper presents an automatic scheme for collecting statistics on co-occurrence patterns in a large corpus. |
lr-prod,26-4-H90-1060,bq |
</term>
from the
<term>
DARPA Resource Management
|
corpus
|
</term>
. This
<term>
performance
</term>
is
|
#17099
With only 12 training speakers for SI recognition, we achieved a 7.5% word error rate on a standard grammar and test set from the DARPA Resource Management corpus. |
lr,12-4-C92-1055,bq |
possible variations between the
<term>
training
|
corpus
|
</term>
and the real tasks are also taken
|
#17893
To make the proposed algorithm robust, the possible variations between the training corpus and the real tasks are also taken into consideration by enlarging the separation margin between the correct candidate and its competing members. |
lr,6-1-H92-1003,bq |
recently collected
<term>
spoken language
|
corpus
|
</term>
for the
<term>
ATIS ( Air Travel Information
|
#18532
This paper describes a recently collected spoken language corpus for the ATIS (Air Travel Information System) domain. |
lr,3-3-H92-1026,bq |
process
</term>
in a novel way . We use a
<term>
|
corpus
|
of bracketed sentences
</term>
, called a
|
#18946
We use a corpus of bracketed sentences, called a Treebank, in combination with decision tree building to tease out the relevant aspects of a parse tree that will determine the correct parse of a sentence. |
lr-prod,1-1-H92-1074,bq |
<term>
CSR ( Connected Speech Recognition )
|
corpus
|
</term>
represents a new
<term>
DARPA speech
|
#19533
The CSR (Connected Speech Recognition) corpus represents a new DARPA speech recognition technology development initiative to advance the state of the art in CSR. |
lr,52-3-A94-1011,bq |
, and does not require a
<term>
pre-tagged
|
corpus
|
</term>
to fit . One of the distinguishing
|
#19998
A novel method for adding linguistic annotation to corpora is presented which involves using a statistical POS tagger in conjunction with unsupervised structure finding methods to derive notions of noun group, verb group, and so on which is inherently extensible to more sophisticated annotation, and does not require a pre-tagged corpus to fit. |
lr-prod,15-3-H94-1014,bq |
word
</term><term>
Wall Street Journal text
|
corpus
|
</term>
. Using the
<term>
BU recognition system
|
#21261
The models were constructed using a 5K vocabulary and trained using a 76 million word Wall Street Journal text corpus. |