Program dates: September 2020-January 2021
The ICASSP 2021 Deep Noise Suppression (DNS) challenge is designed to foster innovation in noise suppression, with the goal of achieving superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH 2020. We open-sourced training and test datasets for researchers to train their noise suppression models. We also open-sourced a subjective evaluation framework and used it to evaluate submissions and pick the final winners. Many researchers from academia and industry made significant contributions to push the field forward. The results of the INTERSPEECH DNS Challenge show that we still have a long way to go in achieving superior speech quality in challenging noisy conditions. In this challenge, we are adding over 20 hours of clean speech with singing and providing more information about the characteristics of the noise based on stationarity. We will also provide over 100,000 synthetic and real room impulse responses (RIRs) curated from other datasets.
We will have two tracks for this challenge:
- Real-Time Denoising track:
The noise suppressor must take less than the stride time Ts (in ms) to process a frame of size T (in ms) on an Intel Core i5 quad-core machine clocked at 2.4 GHz or an equivalent processor. For example, Ts = T/2 for 50% overlap between frames. The total algorithmic latency allowed, including the frame size T, the stride time Ts, and any look-ahead, must be less than or equal to 40 ms. For example, a frame length of 20 ms with a stride of 10 ms gives an algorithmic delay of 30 ms, which satisfies the latency requirement. A frame size of 32 ms with a stride of 16 ms gives an algorithmic delay of 48 ms, which does not satisfy the requirement because the total algorithmic latency exceeds 40 ms. If your frame size plus stride, T1 = T + Ts, is less than 40 ms, then you can use up to (40 - T1) ms of future information.
- Personalized Deep Noise Suppression (pDNS) track:
- Satisfy the Track 1 (Real-Time Denoising) requirements.
- You will have access to 2 minutes of speech from a particular speaker to extract speaker-related information that might be useful for improving the quality of the noise suppressor. The enhancement must be done on the noisy speech test segment of the same speaker.
- The enhanced speech using speaker information must be of better quality than the enhanced speech without using the speaker information.
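The Track 1 timing rules reduce to simple arithmetic, so it may help to check a candidate configuration before training. The following is a minimal sketch, not an official challenge tool: the function names and the `MAX_LATENCY_MS` constant are hypothetical, and it assumes the algorithmic latency is computed as frame size plus stride plus look-ahead, as in the examples given for the Real-Time Denoising track.

```python
# Hypothetical helper for checking the Track 1 timing rules.
# Assumption: algorithmic latency = frame size T + stride Ts + look-ahead (all in ms).

MAX_LATENCY_MS = 40.0  # latency limit stated in the Track 1 rules


def algorithmic_latency_ms(frame_ms, stride_ms, lookahead_ms=0.0):
    """Total algorithmic latency in ms: T + Ts + look-ahead."""
    return frame_ms + stride_ms + lookahead_ms


def meets_latency_rule(frame_ms, stride_ms, lookahead_ms=0.0):
    """True if the total algorithmic latency is at most 40 ms."""
    return algorithmic_latency_ms(frame_ms, stride_ms, lookahead_ms) <= MAX_LATENCY_MS


def lookahead_budget_ms(frame_ms, stride_ms):
    """Future context still available: 40 - (T + Ts), floored at zero."""
    return max(0.0, MAX_LATENCY_MS - (frame_ms + stride_ms))


def is_real_time(processing_ms_per_frame, stride_ms):
    """True if one frame is processed in less than the stride time Ts."""
    return processing_ms_per_frame < stride_ms


if __name__ == "__main__":
    # The two configurations discussed in the track description.
    for frame_ms, stride_ms in [(20.0, 10.0), (32.0, 16.0)]:
        print(
            f"T={frame_ms} ms, Ts={stride_ms} ms: "
            f"latency={algorithmic_latency_ms(frame_ms, stride_ms)} ms, "
            f"allowed={meets_latency_rule(frame_ms, stride_ms)}, "
            f"look-ahead budget={lookahead_budget_ms(frame_ms, stride_ms)} ms"
        )
```

The per-frame processing time passed to `is_real_time` would have to be measured on the reference hardware (an Intel Core i5 quad-core machine at 2.4 GHz or equivalent); this sketch does not measure it.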
Participants are forbidden from using the blind test set to retrain or tweak their models. They must not submit clips enhanced using any speech enhancement method that is not being submitted to ICASSP 2021 by the authors. Failing to adhere to these rules will lead to disqualification from the challenge.
Registration
Please send an email to [email protected] stating that you are interested in participating in the challenge. Please include the following details in your email:
- Names of the participants and name of the team captain
- Institution/Company
Prizes
The top three teams from each track will be awarded prizes as outlined in the description of the rules.
Contact us: If you have questions about this program, email us at [email protected].