Unsupervised speaker segmentation of telephone conversations.
- Aaron E. Rosenberg
- Allen Gorin
- Zhu Liu
- Sarangarajan Parthasarathy
ICSLP 2002
Organized by ISCA
A process for segmenting two-speaker telephone conversations by speaker, with no prior speaker models, is described and evaluated. The process consists of an initial segmentation using acoustic change and pause detection, followed by segment clustering and then iterative modeling of the segment clusters and resegmentation. The technique has been evaluated on six customer care conversations, each approximately 3 min long. It does not resolve short (< 2 s) or overlapping segments very well, but it detects longer segments (> 4 s) with miss rates on the order of 10% and confusion rates of 2% or less.
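
The clustering and iterative modeling stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, the clustering method, or the speaker models, so the farthest-pair seeding, the diagonal Gaussian models, and the names `avg_loglik` and `cluster_segments` are all assumptions made for illustration. The initial segmentation (acoustic change and pause detection) is assumed to be given as a list of frame ranges, and the sketch only relabels those fixed segments; it does not move segment boundaries as a full resegmentation would.

```python
import numpy as np


def avg_loglik(frames, mean, var):
    """Average per-frame log-likelihood under a diagonal Gaussian (assumed model)."""
    ll = -0.5 * (np.log(2 * np.pi * var) + (frames - mean) ** 2 / var)
    return ll.sum(axis=1).mean()


def cluster_segments(features, segments, n_iter=10):
    """Assign each (start, end) frame range to one of two speaker clusters.

    features: array of shape (n_frames, n_dims)
    segments: list of (start, end) frame indices from an initial segmentation
    """
    means = np.array([features[s:e].mean(axis=0) for s, e in segments])

    # Seed the two clusters with the pair of segments whose means are farthest apart.
    dists = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(dists), dists.shape)
    to_i = np.linalg.norm(means - means[i], axis=1)
    to_j = np.linalg.norm(means - means[j], axis=1)
    labels = (to_j < to_i).astype(int)
    if len(set(labels.tolist())) < 2:
        return labels  # all segments look identical; nothing to separate

    for _ in range(n_iter):
        # Model each cluster with a diagonal Gaussian over its pooled frames.
        models = []
        for k in (0, 1):
            pooled = np.concatenate(
                [features[s:e] for (s, e), lab in zip(segments, labels) if lab == k]
            )
            models.append((pooled.mean(axis=0), pooled.var(axis=0) + 1e-3))

        # Relabel every segment with the better-scoring cluster model.
        new_labels = np.array([
            int(avg_loglik(features[s:e], *models[1])
                > avg_loglik(features[s:e], *models[0]))
            for s, e in segments
        ])
        if np.array_equal(new_labels, labels) or len(set(new_labels.tolist())) < 2:
            break
        labels = new_labels
    return labels
```

On well-separated synthetic features with alternating speakers, the returned labels alternate between the two clusters. Which cluster receives label 0 or 1 is arbitrary, so only the grouping is meaningful.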