A split lexicon approach for improved recognition of spoken names

Abhinav Sethy; Shrikanth Narayanan; Sarangarajan Parthasarathy

Speech Communication | September 2006, pp. 1126-1136

Recognition of spoken names is a challenging task for automatic speech recognition systems because the list of names for applications such as directory assistance tends to be on the order of several hundred thousand. This makes spoken name recognition a very high perplexity task. In this paper we propose the use of syllables as the acoustic unit for spoken name recognition based on reverse lookup schemes, and show how syllables can be used to improve recognition performance and reduce system perplexity. We present system design methodologies to address the problem of acoustic training data sparsity encountered when using longer units such as syllables. We illustrate our ideas first on a TIMIT-based continuous speech recognition problem and then focus on their application to spoken name recognition. Our results on the OGI spoken name corpus indicate that using syllable models in place of phoneme models can boost system accuracy significantly while helping to reduce system complexity.