Sound Capture and Speech Enhancement for Speech-Enabled Devices
In this talk we will give an overview of the acoustical design of sound capture systems and discuss the general architecture of speech enhancement pipelines for distant speech recognition. The talk will cover both classical algorithms based on statistical signal processing and deep learning approaches based on neural networks. It will be illustrated with real-life examples of the acoustical design and speech enhancement audio pipelines in Kinect, HoloLens, and Microsoft Teams.