Training Wideband Acoustic Models using Mixed-Bandwidth Training Data via Feature Bandwidth Extension

  • Alex Acero
  • Mike Seltzer

Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing

Published by Institute of Electrical and Electronics Engineers, Inc.

One serious difficulty in the deployment of wideband speech recognition
systems for new tasks is the expense, in both time and money,
of obtaining sufficient training data. A more economical approach
is to collect telephone speech and then restrict the application to
operate at the telephone bandwidth. However, this generally results
in sub-optimal performance. In this paper, we propose a
new algorithm for training wideband acoustic models that requires
only a small amount of wideband speech augmented by a larger
amount of narrowband speech. The algorithm operates by first
converting the narrowband features to wideband features through
a process called Feature Bandwidth Extension. The bandwidth-extended
features are then combined with available wideband data
to train the acoustic models using a modified version of the conventional
forward-backward algorithm. Experiments performed
using wideband speech and telephone speech demonstrate that the
proposed mixed-bandwidth training algorithm results in significant
improvements in recognition accuracy over conventional training
strategies when the amount of wideband data is limited.
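
The data flow described above (extend the narrowband features, then pool them with the wideband features for training) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the affine predictor stands in for Feature Bandwidth Extension, the first n_low entries of a wideband frame are assumed to play the role of the narrowband part, and the function names fit_affine, train_extender, extend and build_training_set are hypothetical. The modified forward-backward training step is not shown.

```python
from statistics import fmean


def fit_affine(xs, ys):
    """Closed-form least-squares fit of y ~= a * x + b for scalar x and y."""
    mx, my = fmean(xs), fmean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


def train_extender(wideband_frames, n_low):
    """Fit one affine predictor per upper-band dimension.

    Assumption: the first n_low entries of each wideband frame correspond to
    the narrowband part. The predictor input is the mean of those entries,
    which is a toy choice and not the paper's extension method.
    """
    summaries = [fmean(frame[:n_low]) for frame in wideband_frames]
    n_high = len(wideband_frames[0]) - n_low
    return [
        fit_affine(summaries, [frame[n_low + k] for frame in wideband_frames])
        for k in range(n_high)
    ]


def extend(narrow_frame, predictors):
    """Append predicted upper-band entries to a narrowband frame."""
    summary = fmean(narrow_frame)
    return list(narrow_frame) + [a * summary + b for a, b in predictors]


def build_training_set(wideband_frames, narrowband_frames, n_low):
    """Pool true wideband frames with bandwidth-extended narrowband frames."""
    predictors = train_extender(wideband_frames, n_low)
    extended = [extend(frame, predictors) for frame in narrowband_frames]
    return list(wideband_frames) + extended


# Small amount of wideband data, plus narrowband-only data (made-up numbers).
wideband = [[1.0, 3.0, 5.0], [2.0, 4.0, 7.0], [0.0, 2.0, 3.0]]
narrowband = [[4.0, 6.0]]
pooled = build_training_set(wideband, narrowband, n_low=2)
```

Here pooled holds the three original wideband frames followed by one extended frame of the same dimensionality, so a single wideband model could be trained on all of it. A real system would also need to account for how reliable the extended entries are, which is presumably what the modified training procedure addresses.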