Unified stochastic engine (USE) for speech recognition

Xuedong Huang; Marie-Thérèse Belin; Fil Alleva; Mei-Yuh Hwang

1993 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-93) | April 1993

A unified stochastic engine (USE) that jointly optimizes both acoustic and language models is presented. In the USE, not only can one iteratively adjust language probabilities to fit the given acoustic representations, but one can also adjust acoustic models (including feature representation) guided by language constraints. From the language modeling point of view, the USE makes it possible to encode acoustically confusable words in the language probabilities. From the acoustic modeling point of view, the language-constraint approach makes it possible to focus on acoustic words for which language models lack enough discrimination capacity. The authors report preliminary experimental results for Wall Street Journal continuous 5000-word speaker-independent dictation. The error rate is reduced from 7.3% to 6.9% with the proposed method.
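
The abstract does not give formulas, so the following is only an illustrative sketch and not the USE training procedure itself. It assumes the standard decoding rule that scores a word by its acoustic log-likelihood plus a weighted language-model log-probability, which is the quantity where acoustic and language models interact. The word pair, the scores, and the names `decode`, `lm_weight`, and `relative_error_reduction` are hypothetical. The last line works out the relative error reduction implied by the reported 7.3% and 6.9% figures.

```python
import math


def decode(acoustic_loglik, lm_prob, lm_weight=1.0):
    """Return the word maximizing log P(X|W) + lm_weight * log P(W).

    This is the conventional combined score, assumed here for illustration;
    the abstract does not state the exact objective used by the USE.
    """
    return max(
        acoustic_loglik,
        key=lambda w: acoustic_loglik[w] + lm_weight * math.log(lm_prob[w]),
    )


def relative_error_reduction(baseline, improved):
    """Fraction of the baseline error rate removed by the improved system."""
    return (baseline - improved) / baseline


# Hypothetical scores for two acoustically confusable words.
acoustic_loglik = {"their": -10.2, "there": -10.0}
lm_prob = {"their": 0.6, "there": 0.4}

# With the language model, the language probabilities tip the decision.
best_with_lm = decode(acoustic_loglik, lm_prob)
# With the language model switched off, the acoustic score decides alone.
best_acoustic_only = decode(acoustic_loglik, lm_prob, lm_weight=0.0)

# Error rates reported in the abstract, in percent.
rer = relative_error_reduction(7.3, 6.9)
print(best_with_lm, best_acoustic_only, f"{rer:.1%}")
```

In this toy case the acoustic scores alone slightly favor one word while the language probabilities favor the other, which is the kind of acoustically confusable pair the abstract says language probabilities can be adjusted to handle. The reported drop from 7.3% to 6.9% corresponds to a relative error reduction of roughly 5.5%.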