This paper reports on two contributions to <term> large vocabulary continuous speech recognition </term> .
First , we present a new paradigm for <term> speaker-independent ( SI ) training </term> of <term> hidden Markov models ( HMM ) </term> , which uses a large amount of <term> speech </term> from a few <term> speakers </term> instead of the traditional practice of using a little speech from many <term> speakers </term> .
In addition , combination of the <term> training speakers </term> is done by averaging the statistics of independently trained <term> models </term> rather than the usual <term> pooling </term> of all the <term> speech data </term> from many <term> speakers </term> prior to <term> training </term> .
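As a minimal illustrative sketch (not the paper's actual HMM training procedure), the contrast between the usual pooling of speech data and the paper's averaging of independently trained statistics can be shown for a single scalar Gaussian state; the function names and the one-Gaussian simplification are assumptions made here for illustration only.

```python
# Sketch: two ways to form a speaker-independent estimate of one
# Gaussian state's mean from per-speaker speech frames.
# Illustrative simplification (one scalar Gaussian), not the paper's method.

def pooled_mean(speaker_data):
    """Traditional approach: pool all frames from all speakers, then estimate."""
    all_frames = [x for frames in speaker_data for x in frames]
    return sum(all_frames) / len(all_frames)

def averaged_mean(speaker_data):
    """Averaging paradigm: estimate a mean per speaker independently,
    then average those statistics with equal weight per speaker."""
    per_speaker = [sum(frames) / len(frames) for frames in speaker_data]
    return sum(per_speaker) / len(per_speaker)

# Speaker A contributes many frames, speaker B few: pooling lets A dominate,
# averaging weights each speaker equally.
data = [[1.0, 1.0, 1.0, 1.0], [3.0]]
print(pooled_mean(data))    # (1+1+1+1+3)/5 = 1.4
print(averaged_mean(data))  # (1.0 + 3.0)/2 = 2.0
```

The difference matters when the amount of speech per speaker is uneven, as in the paper's setting of a large amount of speech from a few speakers.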
With only 12 <term> training speakers </term> for <term> SI recognition </term> , we achieved a 7.5 % <term> word error rate </term> on a standard <term> grammar </term> and <term> test set </term> from the <term> DARPA Resource Management corpus </term> .
This performance is comparable to our best condition for this <term> test suite </term> , using 109 <term> training speakers </term> .