Training Wideband Acoustic Models Using Mixed-Bandwidth Training Data for Speech Recognition
- Alex Acero
- Mike Seltzer
Trans. on Audio, Speech and Language Processing, Vol 15(1): pp. 235-245
One serious difficulty in the deployment of wideband
speech recognition systems for new tasks is the expense in both
time and cost of obtaining sufficient training data. A more economical
approach is to collect telephone speech and then restrict the
application to operate at the telephone bandwidth. However, this
generally results in suboptimal performance compared to a wideband
recognition system. In this paper, we propose a novel expectation-maximization
(EM) algorithm in which wideband acoustic
models are trained using a small amount of wideband speech and
a larger amount of narrowband speech. We show how this algorithm
can be incorporated into the existing training schemes of
hidden Markov model (HMM) speech recognizers. Experiments
performed using wideband speech and telephone speech demonstrate
that the proposed mixed-bandwidth training algorithm results
in significant improvements in recognition accuracy over conventional
training strategies when the amount of wideband data is
limited.
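
The abstract does not spell out the algorithm, so the following is only a toy illustration of the general idea, not the authors' method. It assumes each frame has two scalar features, a low-band value and a high-band value, and that a narrowband frame can be treated as a wideband frame whose high-band value is unobserved. Under that assumption, a standard missing-data EM loop fits a single two-dimensional Gaussian from a few complete frames plus many low-band-only frames. The function name, the two-feature layout, and the single-Gaussian model are all hypothetical simplifications; a real recognizer would use HMMs with many Gaussians over higher-dimensional features.

```python
def em_mixed_bandwidth(wide, narrow, iters=50):
    """Fit a 2-D Gaussian over (low, high) features.

    wide   -- list of (low, high) pairs from wideband frames (hypothetical layout)
    narrow -- list of low values from narrowband frames (high band missing)
    Returns ((mu_low, mu_high), (var_low, cov, var_high)).
    """
    m, n = len(wide), len(wide) + len(narrow)
    # Initialise every parameter from the wideband frames alone.
    mu0 = sum(l for l, _ in wide) / m
    mu1 = sum(h for _, h in wide) / m
    s00 = sum((l - mu0) ** 2 for l, _ in wide) / m
    s01 = sum((l - mu0) * (h - mu1) for l, h in wide) / m
    s11 = sum((h - mu1) ** 2 for _, h in wide) / m
    for _ in range(iters):
        # E-step: conditional mean and variance of the missing high band
        # given the observed low band, under the current Gaussian.
        gain = s01 / s00
        cond_var = s11 - gain * s01
        filled = [(l, mu1 + gain * (l - mu0)) for l in narrow]
        data = list(wide) + filled
        # M-step: re-estimate from expected sufficient statistics.
        mu0 = sum(l for l, _ in data) / n
        mu1 = sum(h for _, h in data) / n
        s00 = sum((l - mu0) ** 2 for l, _ in data) / n
        s01 = sum((l - mu0) * (h - mu1) for l, h in data) / n
        # The imputed frames also contribute their conditional variance.
        s11 = (sum((h - mu1) ** 2 for _, h in data) + len(narrow) * cond_var) / n
    return (mu0, mu1), (s00, s01, s11)


if __name__ == "__main__":
    wide_frames = [(0.0, 1.0), (1.0, 1.0), (2.0, 4.0)]   # made-up values
    narrow_frames = [3.0, 4.0, 5.0]                      # made-up values
    print(em_mixed_bandwidth(wide_frames, narrow_frames, iters=500))
```

In this sketch the narrowband frames shift the low-band statistics directly, and they shift the high-band mean only through the low/high correlation learned from the wideband frames, which is why a small amount of wideband data is still required.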
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.