Speaker background models for connected digit password speaker verification
Likelihood-ratio or cohort-normalized scoring has been shown to be effective for improving the performance of speaker verification systems. An important problem in this connection is establishing principles for constructing the speaker background or cohort models that provide the most effective normalized scores. Several kinds of speaker background models are studied: individual speaker models, models constructed from the pooled utterances of different numbers of speakers, models selected on the basis of similarity to customer models, models constructed from random selections of speakers, and models constructed from databases recorded under conditions different from those of the customer models. The experimental results show that pooled models based on similarity to the reference speaker perform better than individual cohort models built from the same set of similar speakers. Pooled background models built from a small number of similar speakers perform best overall, though not significantly better than a random selection of 40 or more gender-balanced speakers whose training conditions match those of the reference speakers.
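As an illustration of the normalized scoring discussed above, the following minimal Python sketch contrasts a pooled background model with a set of individual cohort models. It assumes that each model has already produced a log-likelihood for the test utterance. The function names, the zero threshold, and the rule for combining individual cohort scores (the log of the mean cohort likelihood) are assumptions made for illustration and are not taken from the study.

```python
import math


def pooled_normalized_score(target_loglik, pooled_background_loglik):
    """Log-likelihood ratio against one pooled background model."""
    return target_loglik - pooled_background_loglik


def cohort_normalized_score(target_loglik, cohort_logliks):
    """Log-likelihood ratio against a set of individual cohort models.

    Assumption: the cohort term is the log of the mean cohort likelihood,
    computed with the log-sum-exp trick for numerical stability.
    """
    if not cohort_logliks:
        raise ValueError("at least one cohort model is required")
    peak = max(cohort_logliks)
    mean_scaled = sum(math.exp(v - peak) for v in cohort_logliks) / len(cohort_logliks)
    return target_loglik - (peak + math.log(mean_scaled))


def accept(score, threshold=0.0):
    """Accept the identity claim when the normalized score reaches the threshold.

    The default threshold of 0.0 is a placeholder, not a tuned value.
    """
    return score >= threshold


# Hypothetical log-likelihoods for one test utterance.
target = -10.0
print(pooled_normalized_score(target, -12.0))
print(cohort_normalized_score(target, [-12.0, -12.0]))
print(accept(pooled_normalized_score(target, -12.0)))
```

In both cases the claimed speaker's score is measured relative to the background rather than in absolute terms, which is what makes the choice of background speakers matter.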