A Statistical Approach to Semi-supervised Speech Enhancement with Low-order Non-negative Matrix Factorization

IEEE Int. Conf. Acoustics Speech and Signal Processing (ICASSP)

Published by IEEE - Institute of Electrical and Electronics Engineers

Publication

Compared to generic source separation, non-negative matrix factorization (NMF) for speech enhancement is relatively underexplored. When applied to speech enhancement, NMF lacks consistent performance across runs and data samples, especially with small dictionaries. This limitation creates a need for higher-order representations, which increase computational cost. In this paper, we propose a statistical-estimation technique that attempts to bridge this gap. Our approach combines multiple low-order NMF decompositions of noisy speech to increase overall enhancement performance. We show PESQ improvements of up to 0.24 beyond what a single NMF parametrization achieves and, at iso-performance levels, major reductions in computational cost.
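
The idea of combining several low-order decompositions can be illustrated with a minimal sketch. The abstract does not specify the statistical estimator, so the plain average of Wiener-style masks used here is an assumption standing in for it, as are the Euclidean multiplicative updates. The function names `semi_supervised_nmf_mask` and `combined_enhancement` and all parameter values are hypothetical.

```python
import numpy as np


def semi_supervised_nmf_mask(V, W_speech, n_noise, n_iter=100, seed=0, eps=1e-10):
    """Speech mask from one low-order semi-supervised NMF run.

    V        : non-negative magnitude spectrogram, shape (freq, time)
    W_speech : fixed, pre-trained speech dictionary, shape (freq, k_speech)
    n_noise  : number of noise bases learned from the noisy input itself
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    k_speech = W_speech.shape[1]
    W_noise = rng.random((n_freq, n_noise)) + eps
    H = rng.random((k_speech + n_noise, n_time)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # Multiplicative update of all activations (Euclidean cost).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the noise bases; the speech dictionary stays fixed.
        H_noise = H[k_speech:]
        W_noise *= (V @ H_noise.T) / (W @ H @ H_noise.T + eps)

    W = np.hstack([W_speech, W_noise])
    speech_part = W_speech @ H[:k_speech]
    total = W @ H + eps
    return speech_part / total  # values lie in [0, 1]


def combined_enhancement(V, W_speech, n_noise=2, n_runs=5):
    """Average the masks of several randomly initialized low-order runs.

    Assumption: a simple mean replaces the paper's statistical estimator.
    """
    masks = [
        semi_supervised_nmf_mask(V, W_speech, n_noise, seed=s)
        for s in range(n_runs)
    ]
    return np.mean(masks, axis=0) * V
```

Each run is cheap because the dictionaries are small, and the runs are independent, so they could be executed in parallel. Averaging is only one way to reduce the run-to-run variability the abstract describes.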