Speech Enhancement using a Pitch Predictive Model

  • Jasha Droppo
  • Alex Acero
  • Dong Yu
  • Li Deng
  • Mike Seltzer

Proceedings of the International Conference on Acoustics, Speech, and Signal Processing

Published by Institute of Electrical and Electronics Engineers, Inc.

In this paper we present two new methods for speech enhancement based on the previously published fine pitch model (FPM) for voiced speech. The first method (FPM-NE) uses the FPM indirectly: it produces a nonstationary noise estimate that can be used in any standard speech enhancement system. The second method (FPM-SE) uses the FPM directly to perform speech enhancement. We present a study of the behavior of the two methods on the standard Aurora 2 task, and demonstrate an average word error rate reduction of over 45% compared with the multi-style baseline.
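To illustrate what it means for a nonstationary noise estimate to feed a standard enhancement system, the following is a minimal sketch, not the FPM-NE method itself. It assumes the noise estimate arrives as a per-frame, per-frequency-bin power value (here hand-written placeholder numbers) and applies a generic Wiener-style gain rule. The function name `wiener_gains`, the `floor` parameter, and the input layout are all assumptions made for this example and do not come from the paper.

```python
def wiener_gains(noisy_power, noise_power, floor=0.1):
    """Compute per-frame, per-bin suppression gains.

    noisy_power: list of frames, each a list of noisy-speech power values.
    noise_power: same shape; a time-varying (nonstationary) noise power
                 estimate. In this sketch it is a placeholder input, not
                 the output of the fine pitch model.
    floor:       hypothetical lower bound on the gain, used to limit
                 distortion from over-suppression.
    """
    gains = []
    for noisy_frame, noise_frame in zip(noisy_power, noise_power):
        frame_gains = []
        for y, n in zip(noisy_frame, noise_frame):
            # Estimate clean speech power by power subtraction, clipped at zero.
            s = max(y - n, 0.0)
            # Wiener-style gain: estimated speech power over noisy power.
            g = s / y if y > 0.0 else 0.0
            frame_gains.append(max(g, floor))
        gains.append(frame_gains)
    return gains


# Two frames, two frequency bins; the noise estimate differs between frames.
noisy = [[4.0, 1.0], [4.0, 1.0]]
noise = [[1.0, 2.0], [2.0, 0.5]]
gains = wiener_gains(noisy, noise)
```

Because the noise estimate is indexed by frame, the same noisy power yields different gains in the two frames, which is the practical benefit of a noise estimate that tracks changes over time rather than a single fixed noise spectrum.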

Publication Downloads

Pitch and Voicing Estimates for Aurora 2

January 12, 2005

This archive consists of a set of pitch period and voicing estimates, computed using the algorithm described in [2], for utterances in the Aurora 2 corpus [1]. Currently, pitch estimates are available for test sets A and B, as well as the clean training data.

[1] H. G. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions", in ISCA ITRW ASR2000 "Automatic Speech Recognition: Challenges for the Next Millennium", Paris, France, September 2000.

[2] J. Droppo and A. Acero, "Maximum a Posteriori Pitch Tracking", in Proc. of the Int. Conf. on Spoken Language Processing, Sydney, Australia, December 1998.