Speech Enhancement using a Pitch Predictive Model
- Jasha Droppo
- Alex Acero
- Dong Yu
- Li Deng
- Mike Seltzer
- Ivan Tashev
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
Published by Institute of Electrical and Electronics Engineers, Inc.
In this paper we present two new methods for speech enhancement based on the previously published fine pitch model (FPM) for voiced speech. The first method (FPM-NE) uses the FPM indirectly: it produces a nonstationary noise estimate that can be used in any standard speech enhancement system. The second method (FPM-SE) uses the FPM directly to perform speech enhancement. We present a study of the behavior of the two methods on the standard Aurora 2 task, and demonstrate an average word error rate reduction of over 45% compared with the multi-style baseline.
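As a rough illustration of how a nonstationary noise estimate can plug into a standard enhancement system, the sketch below applies a Wiener-style gain whose noise power is allowed to change from frame to frame. It is not the FPM-NE method from the paper: the function names, the gain floor, and the choice of a Wiener-style gain as the "standard" system are all assumptions made for this example, and the noise estimate is simply taken as an input.

```python
def wiener_gain(noisy_power, noise_power, gain_floor=0.1):
    """Wiener-style gain for one time-frequency bin (hypothetical helper)."""
    # Power-subtraction estimate of the clean speech power, clipped at zero.
    speech_power = max(noisy_power - noise_power, 0.0)
    denom = speech_power + noise_power
    # Gain = speech / (speech + noise); pass the bin through if both are zero.
    gain = speech_power / denom if denom > 0.0 else 1.0
    # The floor (an assumed value) limits how far any bin is attenuated.
    return max(gain, gain_floor)


def enhance(noisy_frames, noise_frames, gain_floor=0.1):
    """Apply the gain bin by bin to a list of power-spectrum frames.

    noise_frames has the same shape as noisy_frames, so the noise
    estimate may differ in every frame (i.e. it is nonstationary).
    """
    return [
        [wiener_gain(y, n, gain_floor) * y for y, n in zip(y_frame, n_frame)]
        for y_frame, n_frame in zip(noisy_frames, noise_frames)
    ]
```

Because the noise estimate enters only as a per-frame input, any source of such estimates could in principle be substituted without changing the enhancement step itself.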
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Publication Downloads
Pitch and Voicing Estimates for Aurora 2
January 12, 2005
This archive consists of a set of pitch period and voicing estimates for utterances found in the Aurora 2 corpus [1], computed using the algorithm described in [2]. Currently, pitch estimates are available for test sets A and B, as well as the clean training data.
[1] H. G. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions," in ISCA ITRW ASR2000 "Automatic Speech Recognition: Challenges for the Next Millennium," Paris, France, September 2000.
[2] J. Droppo and A. Acero, "Maximum a Posteriori Pitch Tracking," in Proc. of the Int. Conf. on Spoken Language Processing, Sydney, Australia, December 1998.