A Statistical Approach to Semi-supervised Speech Enhancement with Low-order Non Negative Matrix Factorization
- Shuayb Zarar
- Ivan Tashev
IEEE Int. Conf. Acoustics Speech and Signal Processing (ICASSP)
Published by IEEE - Institute of Electrical and Electronics Engineers
Compared to generic source separation, non-negative matrix factorization (NMF) for speech enhancement is relatively underexplored. When applied to speech enhancement, NMF lacks performance consistency (across runs and data samples), especially with small dictionaries. This limitation raises the need for higher-order representations, which increases computational cost. In this paper, we propose a statistical-estimation technique that attempts to bridge this gap. Our approach combines multiple low-order NMF decompositions of noisy speech to increase the overall enhancement performance. We show PESQ improvements of up to 0.24 beyond what is achievable by a single NMF parametrization and, at iso-performance levels, major reductions in computational cost.
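The idea of combining several low-order decompositions can be illustrated with a minimal sketch. The abstract does not specify the statistical estimator, so the example below makes several assumptions that are not taken from the paper: it uses multiplicative updates for the KL divergence, keeps a pre-trained speech dictionary fixed while learning noise bases from the noisy input, and simply averages the soft masks from several randomly initialized runs. The names `semi_supervised_mask` and `combined_enhancement` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def semi_supervised_mask(V, W_speech, noise_rank, n_iter=100, seed=0):
    """Soft speech mask for a magnitude spectrogram V (freq x frames).

    W_speech (freq x k_s) is held fixed; noise bases and all activations
    are learned from V with KL-divergence multiplicative updates
    (an assumption, not necessarily the paper's choice).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_s = W_speech.shape[1]
    # Stack fixed speech bases with randomly initialized noise bases.
    W = np.hstack([W_speech, rng.random((n_freq, noise_rank)) + EPS])
    H = rng.random((k_s + noise_rank, n_frames)) + EPS
    for _ in range(n_iter):
        WH = W @ H + EPS
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + EPS)
        WH = W @ H + EPS
        W_new = W * ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + EPS)
        W[:, k_s:] = W_new[:, k_s:]  # update only the noise bases
    speech = W[:, :k_s] @ H[:k_s]
    noise = W[:, k_s:] @ H[k_s:]
    return speech / (speech + noise + EPS)


def combined_enhancement(V, W_speech, noise_rank=2, n_runs=5):
    """Average the masks of several low-order runs and apply the result to V.

    Plain averaging is a stand-in for the paper's statistical estimator.
    """
    masks = [
        semi_supervised_mask(V, W_speech, noise_rank, seed=s)
        for s in range(n_runs)
    ]
    return np.mean(masks, axis=0) * V
```

Each run differs only in its random initialization, which is the source of the run-to-run inconsistency the abstract describes; averaging the masks is one simple way to reduce that variance while keeping every individual decomposition small.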
© IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.