Deep Noise Suppression Challenge – INTERSPEECH 2020

The accepted papers for the non-real-time track are given below. Papers were accepted through the normal INTERSPEECH peer-review process.
Place | Performance Rank | Team | Authors | Title
1 | 1 | Amazon Web Services | Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy | PoCoNet: Better speech enhancement with frequency-positional embeddings, semi-supervised conversational data, and biased loss
2 | 2 | Technische Universität Braunschweig, Goodix Technology | Maximilian Strake, Bruno Defraene, Kristoff Fluyt, Wouter Tirry, Tim Fingscheidt | INTERSPEECH 2020 Deep Noise Suppression Challenge: A Fully Convolutional Recurrent Network (FCRN) for Joint Dereverberation and Denoising
2 | 2 | Northwestern Polytechnical University | Yanxin Hu, Yun Liu, Shubo Lv, Mengtao Xing, Shimin Zhang, Yihui Fu, Jian Wu, Bihong Zhang, Lei Xie | DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement

The accepted papers for the real-time track are given below.

Place | Performance Rank | Team | Authors | Title
1 | 1 | Northwestern Polytechnical University | Yanxin Hu, Yun Liu, Shubo Lv, Mengtao Xing, Shimin Zhang, Yihui Fu, Jian Wu, Bihong Zhang, Lei Xie | DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement
2 | 2 | Amazon Web Services | Jean-Marc Valin, Umut Isik, Neerad Phansalkar, Ritwik Giri, Karim Helwani, Arvindh Krishnaswamy | A perceptually-motivated approach for low-complexity, real-time enhancement of fullband speech
3 | 3 | Technische Universität Braunschweig, Goodix Technology | Maximilian Strake, Bruno Defraene, Kristoff Fluyt, Wouter Tirry, Tim Fingscheidt | INTERSPEECH 2020 Deep Noise Suppression Challenge: A Fully Convolutional Recurrent Network (FCRN) for Joint Dereverberation and Denoising
4 | 4 | Westlake University, Inria Grenoble Rhône-Alpes | Xiaofei Li and Radu Horaud | Online monaural speech enhancement using delayed subband LSTM
5 | 8 | Carl von Ossietzky University | Nils L. Westhausen and Bernd T. Meyer | Dual-signal transformation LSTM network for real-time noise suppression

Phase 1 Results
Organization | Team # | Complexity | Synthetic MOS | Synthetic dMOS | Real Recordings MOS | Real Recordings dMOS | Synthetic Reverb MOS | Synthetic Reverb dMOS | Overall MOS | Overall dMOS | 95% CI
Amazon | 9 | NRT | 4.07 | 0.74 | 3.52 | 0.55 | 3.33 | 0.55 | 3.61 | 0.60 | 0.02
Amazon | 9 | RT | 3.92 | 0.59 | 3.51 | 0.53 | 3.16 | 0.38 | 3.52 | 0.51 | 0.02
Northwestern Polytechnical University, China | 29 | RT | 4.01 | 0.69 | 3.48 | 0.51 | 3.10 | 0.32 | 3.52 | 0.51 | 0.02
Microsoft-1 |  | NRT | 3.98 | 0.66 | 3.41 | 0.44 | 3.22 | 0.44 | 3.51 | 0.50 | 0.02
Northwestern Polytechnical University, China | 29 | NRT | 3.98 | 0.66 | 3.40 | 0.43 | 3.15 | 0.37 | 3.48 | 0.47 | 0.02
TU Braunschweig and Goodix Technology | 17 | NRT | 3.85 | 0.52 | 3.39 | 0.41 | 3.23 | 0.46 | 3.46 | 0.45 | 0.02
Sony and CMU | 14 | NRT | 3.86 | 0.53 | 3.42 | 0.44 | 3.16 | 0.39 | 3.46 | 0.45 | 0.02
TU Braunschweig and Goodix Technology | 17 | RT | 3.86 | 0.54 | 3.39 | 0.42 | 3.21 | 0.43 | 3.46 | 0.45 | 0.02
Institute of Acoustics, Chinese Academy of Sciences | 30 | NRT | 3.81 | 0.48 | 3.33 | 0.36 | 3.02 | 0.24 | 3.37 | 0.36 | 0.02
Supertone/Seoul National University | 20 | NRT | 3.75 | 0.43 | 3.28 | 0.31 | 3.12 | 0.34 | 3.36 | 0.35 | 0.02
Microsoft-2 |  | RT | 3.76 | 0.44 | 3.26 | 0.29 | 3.08 | 0.30 | 3.34 | 0.33 | 0.02
Westlake University, INRIA Grenoble Rhône-Alpes | 37 | RT | 3.67 | 0.35 | 3.30 | 0.33 | 3.02 | 0.24 | 3.32 | 0.31 | 0.02
CASIA and Johns Hopkins | 15 | NRT | 3.73 | 0.41 | 3.30 | 0.33 | 2.94 | 0.16 | 3.32 | 0.31 | 0.02
Institute of Automation, Chinese Academy of Sciences | 6 | NRT | 3.68 | 0.36 | 3.31 | 0.34 | 2.90 | 0.12 | 3.30 | 0.29 | 0.02
CASIA and Johns Hopkins | 15 | RT | 3.63 | 0.31 | 3.25 | 0.27 | 2.94 | 0.16 | 3.27 | 0.25 | 0.02
Shandong University of Technology | 18 | RT | 3.68 | 0.36 | 3.33 | 0.36 | 2.65 | -0.12 | 3.25 | 0.24 | 0.02
Supertone/Seoul National University | 20 | RT | 3.60 | 0.27 | 3.19 | 0.22 | 2.98 | 0.20 | 3.24 | 0.23 | 0.02
Carl von Ossietzky University Oldenburg | 22 | RT | 3.58 | 0.25 | 3.21 | 0.24 | 2.95 | 0.17 | 3.24 | 0.23 | 0.02
Sayint.ai | 25 | NRT | 3.74 | 0.42 | 3.25 | 0.27 | 2.62 | -0.16 | 3.21 | 0.20 | 0.02
Academia Sinica | 5 | NRT | 3.63 | 0.30 | 3.18 | 0.21 | 2.83 | 0.06 | 3.21 | 0.19 | 0.02
Facebook AI, INRIA | 41 | NRT | 3.67 | 0.34 | 3.19 | 0.21 | 2.78 | 0.00 | 3.20 | 0.19 | 0.02
Friedrich-Alexander-Universität Erlangen-Nürnberg | 40 | RT | 3.54 | 0.21 | 3.18 | 0.20 | 2.92 | 0.14 | 3.20 | 0.19 | 0.02
Institute of Acoustics, Chinese Academy of Sciences | 30 | RT | 3.50 | 0.17 | 3.10 | 0.13 | 2.90 | 0.12 | 3.15 | 0.14 | 0.02
Facebook AI, INRIA | 41 | RT | 3.61 | 0.28 | 3.08 | 0.10 | 2.70 | -0.07 | 3.12 | 0.11 | 0.02
CITIC Bank credit card center | 36 | RT | 3.46 | 0.14 | 3.04 | 0.07 | 2.70 | -0.07 | 3.06 | 0.05 | 0.02
Baseline-NSNet |  | RT | 3.49 | 0.17 | 3.00 | 0.03 | 2.64 | -0.14 | 3.03 | 0.02 | 0.02
Noisy Blind test set |  |  | 3.32 | 0 | 2.97 | 0 | 2.78 | 0 | 3.01 | 0 | 0.02
Phase 2 Results
Organization | Team # | Complexity | Synthetic MOS | Synthetic dMOS | Real Recordings MOS | Real Recordings dMOS | Synthetic Reverb MOS | Synthetic Reverb dMOS | Overall MOS | Overall dMOS | 95% CI
Amazon | 9 | NRT | 4.07 | 0.94 | 3.40 | 0.57 | 3.19 | 0.54 | 3.52 | 0.67 | 0.01
Northwestern Polytechnical University, China | 29 | RT | 4.00 | 0.87 | 3.37 | 0.54 | 2.94 | 0.30 | 3.42 | 0.57 | 0.01
Amazon | 9 | RT | 3.87 | 0.74 | 3.38 | 0.55 | 2.97 | 0.32 | 3.39 | 0.54 | 0.01
TU Braunschweig and Goodix Technology | 17 | NRT | 3.83 | 0.70 | 3.28 | 0.45 | 3.15 | 0.51 | 3.38 | 0.53 | 0.01
Northwestern Polytechnical University, China | 29 | NRT | 3.90 | 0.77 | 3.34 | 0.52 | 2.96 | 0.31 | 3.38 | 0.53 | 0.01
TU Braunschweig and Goodix Technology | 17 | RT | 3.83 | 0.69 | 3.27 | 0.44 | 3.11 | 0.47 | 3.36 | 0.52 | 0.01
Sony and CMU | 14 | NRT | 3.76 | 0.63 | 3.32 | 0.49 | 2.98 | 0.33 | 3.34 | 0.49 | 0.01
Blind test set |  |  | 3.13 | 0 | 2.83 | 0 | 2.64 | 0 | 2.85 | 0 | 0.01
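
The dMOS columns appear to be the MOS difference between a submission and the unprocessed blind test set row (the row whose dMOS is 0). That reading is an inference from the numbers, not something the page states. A minimal Python sketch under that assumption, using Phase 2 values copied from the table for the Northwestern Polytechnical University real-time entry (the dictionary keys and the function name `dmos` are illustrative):

```python
# Phase 2 MOS values copied from the results table.
# Keys are illustrative labels for the four MOS columns.
NOISY = {"synthetic": 3.13, "real": 2.83, "reverb": 2.64, "overall": 2.85}
NPU_RT = {"synthetic": 4.00, "real": 3.37, "reverb": 2.94, "overall": 3.42}


def dmos(enhanced, noisy):
    """Per-condition MOS gain over the noisy reference, rounded to 2 decimals.

    Assumption: dMOS = MOS(enhanced) - MOS(noisy blind test set).
    """
    return {key: round(enhanced[key] - noisy[key], 2) for key in enhanced}


if __name__ == "__main__":
    for condition, gain in dmos(NPU_RT, NOISY).items():
        print(f"{condition}: {gain:+.2f}")
```

For this entry the computed gains agree with the reported 0.87, 0.54, 0.30, and 0.57. Some other rows differ from the table by 0.01, which would be consistent with the published dMOS having been computed from unrounded MOS values.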