Aligning Meeting Recordings Via Adaptive Fingerprinting
- T. J. Tsai
- Andreas Stolcke
Proc. Interspeech
Published by ISCA - International Speech Communication Association
This paper proposes a robust and efficient way to temporally align a set of unsynchronized meeting recordings, such as might be collected by participants’ cell phones. We propose an adaptive audio fingerprint that is learned on the fly, in a completely unsupervised manner, to adapt to the characteristics of a given set of unaligned recordings. The design of the adaptive audio fingerprint is formulated as a series of optimization problems that can be solved very efficiently using eigenvector routines. We also propose a method of aligning sets of files that uses the cumulative evidence from previous alignments to help align the weakest matches. On challenging alignment scenarios extracted from the ICSI meeting corpus, the proposed alignment system achieves over 99% alignment accuracy at a 100 ms error tolerance.
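As a rough illustration of how a fingerprint might be learned with eigenvector routines and then used for alignment, the sketch below learns linear filters as the top eigenvectors of the covariance of stacked spectrogram frames, binarizes the filter outputs, and picks the frame offset with the highest bit agreement. The abstract does not specify the optimization objective, the bit encoding, or the matching procedure, so the maximum-variance criterion, the median threshold, the brute-force lag search, and the names `learn_filters`, `fingerprint`, and `estimate_offset` are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def _stack_context(spectrogram, context):
    """Flatten each run of `context` consecutive frames into one row vector."""
    n_frames = spectrogram.shape[1]
    return np.stack(
        [spectrogram[:, t:t + context].ravel() for t in range(n_frames - context + 1)]
    )


def learn_filters(spectrogram, context=4, n_filters=8):
    """Assumed objective: keep the directions of maximum variance,
    i.e. the top eigenvectors of the covariance of the stacked frames."""
    windows = _stack_context(spectrogram, context)
    windows = windows - windows.mean(axis=0)
    cov = windows.T @ windows / len(windows)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :n_filters].T  # shape: (n_filters, n_bins * context)


def fingerprint(spectrogram, filters, context=4):
    """Project stacked frames onto the filters and threshold each filter
    output at its median (an assumed encoding), giving one bit per filter."""
    features = _stack_context(spectrogram, context) @ filters.T
    return features > np.median(features, axis=0)


def estimate_offset(bits_a, bits_b, max_lag):
    """Return the lag (in frames) such that bits_a[t + lag] best matches bits_b[t]."""
    best_lag, best_score = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        a = bits_a[lag:] if lag >= 0 else bits_a
        b = bits_b if lag >= 0 else bits_b[-lag:]
        n = min(len(a), len(b))
        if n == 0:
            continue
        score = np.mean(a[:n] == b[:n])  # fraction of agreeing bits in the overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

In this sketch the filters would be learned once from the pooled recordings (for example, their spectrograms concatenated along time) so that every file is fingerprinted with the same filters. The returned lag is in frames and would need to be multiplied by the spectrogram hop size to obtain a time offset.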