lr,16-6-H90-1060,bq |
Second , we show a significant improvement for
<term>
speaker adaptation ( SA )
</term>
using the new
<term>
SI corpus
</term>
and a small amount of
<term>
speech
</term>
from the new ( target )
<term>
speaker
</term>
.
|
#17135
Second, we show a significant improvement for speaker adaptation (SA) using the new SI corpus and a small amount of speech from the new (target) speaker. |
lr,20-4-H90-1060,bq |
With only 12
<term>
training speakers
</term>
for
<term>
SI recognition
</term>
, we achieved a 7.5 %
<term>
word error rate
</term>
on a standard
<term>
grammar
</term>
and
<term>
test set
</term>
from the
<term>
DARPA Resource Management corpus
</term>
.
|
#17090
With only 12 training speakers for SI recognition, we achieved a 7.5% word error rate on a standard grammar and test set from the DARPA Resource Management corpus. |
lr,22-4-H90-1060,bq |
With only 12
<term>
training speakers
</term>
for
<term>
SI recognition
</term>
, we achieved a 7.5 %
<term>
word error rate
</term>
on a standard
<term>
grammar
</term>
and
<term>
test set
</term>
from the
<term>
DARPA Resource Management corpus
</term>
.
|
#17092
With only 12 training speakers for SI recognition, we achieved a 7.5% word error rate on a standard grammar and test set from the DARPA Resource Management corpus. |
lr,23-6-H90-1060,bq |
Second , we show a significant improvement for
<term>
speaker adaptation ( SA )
</term>
using the new
<term>
SI corpus
</term>
and a small amount of
<term>
speech
</term>
from the new ( target )
<term>
speaker
</term>
.
|
#17142
Second, we show a significant improvement for speaker adaptation (SA) using the new SI corpus and a small amount of speech from the new (target) speaker. |
lr,27-2-H90-1060,bq |
First , we present a new paradigm for
<term>
speaker-independent ( SI ) training
</term>
of
<term>
hidden Markov models ( HMM )
</term>
, which uses a large amount of
<term>
speech
</term>
from a few
<term>
speakers
</term>
instead of the traditional practice of using a little
<term>
speech
</term>
from many
<term>
speakers
</term>
.
|
#17015
First, we present a new paradigm for speaker-independent (SI) training of hidden Markov models (HMM), which uses a large amount of speech from a few speakers instead of the traditional practice of using a little speech from many speakers. |
lr,27-3-H90-1060,bq |
In addition , combination of the
<term>
training speakers
</term>
is done by averaging the
<term>
statistics
</term>
of
<term>
independently trained models
</term>
rather than the usual pooling of all the
<term>
speech data
</term>
from many
<term>
speakers
</term>
prior to
<term>
training
</term>
.
|
#17061
In addition, combination of the training speakers is done by averaging the statistics of independently trained models rather than the usual pooling of all the speech data from many speakers prior to training. |
lr,41-2-H90-1060,bq |
First , we present a new paradigm for
<term>
speaker-independent ( SI ) training
</term>
of
<term>
hidden Markov models ( HMM )
</term>
, which uses a large amount of
<term>
speech
</term>
from a few
<term>
speakers
</term>
instead of the traditional practice of using a little
<term>
speech
</term>
from many
<term>
speakers
</term>
.
|
#17029
First, we present a new paradigm for speaker-independent (SI) training of hidden Markov models (HMM), which uses a large amount of speech from a few speakers instead of the traditional practice of using a little speech from many speakers. |
lr-prod,26-4-H90-1060,bq |
With only 12
<term>
training speakers
</term>
for
<term>
SI recognition
</term>
, we achieved a 7.5 %
<term>
word error rate
</term>
on a standard
<term>
grammar
</term>
and
<term>
test set
</term>
from the
<term>
DARPA Resource Management corpus
</term>
.
|
#17096
With only 12 training speakers for SI recognition, we achieved a 7.5% word error rate on a standard grammar and test set from the DARPA Resource Management corpus. |
measure(ment),1-5-H90-1060,bq |
This
<term>
performance
</term>
is comparable to our best condition for this test suite , using 109
<term>
training speakers
</term>
.
|
#17102
This performance is comparable to our best condition for this test suite, using 109 training speakers. |
measure(ment),1-8-H90-1060,bq |
Each
<term>
reference model
</term>
is transformed to the
<term>
space
</term>
of the
<term>
target speaker
</term>
and combined by
<term>
averaging
</term>
.
|
#17171
Each reference model is transformed to the space of the target speaker and combined by averaging. |
measure(ment),12-9-H90-1060,bq |
Using only 40
<term>
utterances
</term>
from the
<term>
target speaker
</term>
for
<term>
adaptation
</term>
, the
<term>
error rate
</term>
dropped to 4.1 % --- a 45 % reduction in
<term>
error
</term>
compared to the
<term>
SI
</term>
result .
|
#17199
Using only 40 utterances from the target speaker for adaptation, the error rate dropped to 4.1% --- a 45% reduction in error compared to the SI result. |
measure(ment),14-4-H90-1060,bq |
With only 12
<term>
training speakers
</term>
for
<term>
SI recognition
</term>
, we achieved a 7.5 %
<term>
word error rate
</term>
on a standard
<term>
grammar
</term>
and
<term>
test set
</term>
from the
<term>
DARPA Resource Management corpus
</term>
.
|
#17084
With only 12 training speakers for SI recognition, we achieved a 7.5% word error rate on a standard grammar and test set from the DARPA Resource Management corpus. |
model,16-3-H90-1060,bq |
In addition , combination of the
<term>
training speakers
</term>
is done by averaging the
<term>
statistics
</term>
of
<term>
independently trained models
</term>
rather than the usual pooling of all the
<term>
speech data
</term>
from many
<term>
speakers
</term>
prior to
<term>
training
</term>
.
|
#17050
In addition, combination of the training speakers is done by averaging the statistics of independently trained models rather than the usual pooling of all the speech data from many speakers prior to training. |
other,10-8-H90-1060,bq |
Each
<term>
reference model
</term>
is transformed to the
<term>
space
</term>
of the
<term>
target speaker
</term>
and combined by
<term>
averaging
</term>
.
|
#17180
Each reference model is transformed to the space of the target speaker and combined by averaging. |
other,13-3-H90-1060,bq |
In addition , combination of the
<term>
training speakers
</term>
is done by averaging the
<term>
statistics
</term>
of
<term>
independently trained models
</term>
rather than the usual pooling of all the
<term>
speech data
</term>
from many
<term>
speakers
</term>
prior to
<term>
training
</term>
.
|
#17047
In addition, combination of the training speakers is done by averaging the statistics of independently trained models rather than the usual pooling of all the speech data from many speakers prior to training. |
other,15-5-H90-1060,bq |
This
<term>
performance
</term>
is comparable to our best condition for this test suite , using 109
<term>
training speakers
</term>
.
|
#17116
This performance is comparable to our best condition for this test suite, using 109 training speakers. |
other,16-7-H90-1060,bq |
A
<term>
probabilistic spectral mapping
</term>
is estimated independently for each
<term>
training ( reference ) speaker
</term>
and the
<term>
target speaker
</term>
.
|
#17167
A probabilistic spectral mapping is estimated independently for each training (reference) speaker and the target speaker. |
other,24-9-H90-1060,bq |
Using only 40
<term>
utterances
</term>
from the
<term>
target speaker
</term>
for
<term>
adaptation
</term>
, the
<term>
error rate
</term>
dropped to 4.1 % --- a 45 % reduction in
<term>
error
</term>
compared to the
<term>
SI
</term>
result .
|
#17211
Using only 40 utterances from the target speaker for adaptation, the error rate dropped to 4.1% --- a 45% reduction in error compared to the SI result. |
other,3-4-H90-1060,bq |
With only 12
<term>
training speakers
</term>
for
<term>
SI recognition
</term>
, we achieved a 7.5 %
<term>
word error rate
</term>
on a standard
<term>
grammar
</term>
and
<term>
test set
</term>
from the
<term>
DARPA Resource Management corpus
</term>
.
|
#17073
With only 12 training speakers for SI recognition, we achieved a 7.5% word error rate on a standard grammar and test set from the DARPA Resource Management corpus. |
other,3-9-H90-1060,bq |
Using only 40
<term>
utterances
</term>
from the
<term>
target speaker
</term>
for
<term>
adaptation
</term>
, the
<term>
error rate
</term>
dropped to 4.1 % --- a 45 % reduction in
<term>
error
</term>
compared to the
<term>
SI
</term>
result .
|
#17190
Using only 40 utterances from the target speaker for adaptation, the error rate dropped to 4.1% --- a 45% reduction in error compared to the SI result. |