Deep Noise Suppression Challenge – INTERSPEECH 2020

Region: Global

The application period is now closed.

Program dates: March-July 2020

Contact us: If you have questions about this program, email us at [email protected].

Other challenges:

Deep Noise Suppression Challenge – ICASSP 2021

Acoustic Echo Cancellation Challenge – ICASSP 2021

Acoustic Echo Cancellation Challenge – INTERSPEECH 2021

Acoustic Echo Cancellation Challenge – ICASSP 2022

Deep Noise Suppression Challenge – ICASSP 2022

The DNS Challenge at INTERSPEECH 2020 is intended to promote collaborative research in single-channel speech enhancement aimed at maximizing the perceptual quality and intelligibility of the enhanced speech. Speech quality will be evaluated using the online subjective evaluation framework ITU-T P.808. The challenge provides large datasets for training noise suppressors, but participants may use any datasets of their choice and may also augment their own datasets with the provided data. The challenge also provides an extensive test set containing both synthetic noisy speech and real recordings. The final evaluation will be conducted on a blind test set that is similar to the open-sourced test set. We also provide the model and inference scripts for a recently published baseline noise suppressor.

More details about the open-sourced data, the baseline noise suppressor, ITU-T P.808, and the DNS Challenge can be found here.

Submitted papers will fall under one of two tracks, based on computational complexity.

Real-Time Track:
This track focuses on low computational complexity. The algorithm must take less than T/2 (in ms) to process a frame of size T (in ms) on an Intel Core i5 quad-core machine clocked at 2.4 GHz or an equivalent processor. The frame length T must be less than or equal to 40 ms.
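The Real-Time Track constraint can be sketched as a simple check. This is only an illustration: the function names are hypothetical, and averaging the processing time over many frames is an assumption, since the rule above only states the per-frame bound. Timings are meaningful only when measured on the reference hardware described above.

```python
import time

MAX_FRAME_MS = 40.0  # upper bound on the frame length T stated in the track rules


def meets_real_time_constraint(frame_ms, processing_ms):
    """Return True if T <= 40 ms and the processing time is strictly below T/2."""
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    return frame_ms <= MAX_FRAME_MS and processing_ms < frame_ms / 2.0


def mean_processing_ms(process_frame, frames):
    """Average wall-clock time (in ms) that process_frame spends per frame.

    Assumption: averaging over frames is one reasonable way to estimate the
    per-frame cost; the challenge text does not prescribe a measurement method.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed_s = time.perf_counter() - start
    return 1000.0 * elapsed_s / len(frames)
```

For example, a method that uses 20 ms frames would need to process each frame in under 10 ms, so `meets_real_time_constraint(20.0, 9.9)` passes while `meets_real_time_constraint(20.0, 10.0)` does not.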

Non-Real-Time Track:
This track relaxes the constraints on computational time so that researchers can explore deeper models to attain exceptional speech quality.

In both tracks, the speech enhancement method may have a maximum look-ahead of 40 ms. To infer the current frame of length T (in ms), the algorithm can access any number of past frames but only 40 ms of future frames (T + 40 ms).
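The look-ahead limit can be illustrated with a little arithmetic. The sketch below is one reading of the rule, not an official definition: the helper names are hypothetical, and it assumes that each future frame advances the signal by one hop (the hop equals the frame length when frames do not overlap).

```python
MAX_LOOKAHEAD_MS = 40.0  # maximum look-ahead allowed in both tracks


def latest_accessible_ms(frame_start_ms, frame_ms, lookahead_ms=MAX_LOOKAHEAD_MS):
    """Latest point in the signal (in ms) usable when inferring the current frame.

    The current frame covers [frame_start_ms, frame_start_ms + frame_ms), and
    the rule permits up to 40 ms beyond its end (T + 40 ms).
    """
    return frame_start_ms + frame_ms + lookahead_ms


def lookahead_allowed(future_context_ms):
    """Return True if the amount of future context stays within the 40 ms limit."""
    return 0.0 <= future_context_ms <= MAX_LOOKAHEAD_MS


def max_future_frames(hop_ms):
    """Number of whole future hops that fit inside the 40 ms look-ahead.

    Assumption: each future frame adds exactly hop_ms of new signal.
    """
    if hop_ms <= 0:
        raise ValueError("hop_ms must be positive")
    return int(MAX_LOOKAHEAD_MS // hop_ms)
```

Under these assumptions, a model with a 10 ms hop could look at up to four future frames, and a frame that starts at 100 ms with T = 20 ms could use signal up to 160 ms.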

The blind test set will be provided to the participating teams on March 18th, 2020. The enhanced clips must be sent back to the organizers by March 22nd, 2020. The organizers will conduct a subjective evaluation using the ITU-T P.808 framework to determine the final ranking of the methods. Please visit Rules for more details.

Participants are forbidden from using the blind test set to retrain or tweak their models. They must not submit clips enhanced using any speech enhancement method that is not being submitted to INTERSPEECH 2020 by the authors. Failing to adhere to these rules will lead to disqualification from the challenge.

Registration

Please send an email to [email protected] stating that you are interested in participating in the challenge. Please include the following details in your email:

Names of the participants and name of the team captain
Institution/Company
Email

Prizes

The top three teams in each track will be awarded prizes as outlined in the description of the rules.

Please email us if you have any questions or need clarification about any aspect of the challenge.