Sound Capture and Speech Enhancement for Communication and Distant Speech Recognition

In this talk we will discuss the general architecture of speech enhancement pipelines for hands-free telecommunication and distant speech recognition. We will cover both classical approaches based on statistical signal processing and deep learning approaches based on neural networks, illustrated with real-life examples from the speech enhancement audio pipelines in Kinect, HoloLens, and Teams.

Date:: April 28, 2021
Speakers:: Dr. Ivan J. Tashev and Dr. Sebastian Braun from the Audio and Acoustics Research Group at Microsoft Research – Redmond