An Unsupervised Adaptation Approach to Leveraging Feedback Loop Data by Using i-Vector for Data Clustering and Selection
- Qiang Huo ,
- Zhijie Yan
IEEE/ACM Transactions on Audio, Speech, and Language Processing |
We present a study of using unsupervised adaptation approaches to improve speech recognition accuracy of a deployed speech service by leveraging large-scale untranscribed speech data collected from a feedback loop (FBL). For a regular user with lots of adaptation utterances, conventional CMLLR-based adaptation can be used for personalization directly. For a casual user with a few adaptation utterances, we propose to use CMLLR-based adaptation by augmenting his / her adaptation utterances with utterances acoustically close to the user, which are selected from the FBL data by an i-vector based approach. For a new user, we propose to perform a CMLLR-based recognition of an unknown utterance by selecting a set of CMLLR transforms from the most similar cluster, which are pre-trained by using the utterances from the corresponding cluster generated by an i-vector based utterance clustering method from the FBL data. The effectiveness of the above approaches are confirmed by our experiments on a short message dictation task on smart phones.
© IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.