This paper studies the temporal, spectral, and structural characteristics of Robin songs and syllables. Syllables in Robin songs are clustered using a distance measure defined as the average of aligned, linear predictive coding (LPC)-based frame-level differences. The syllable patterns inferred from the clustering results are used to improve the acoustic modelling of a hidden Markov model (HMM)-based Robin song detector. Experiments on the noisy Rocky Mountain Biological Laboratory Robin (RMBL-Robin) song corpus, which contains more than 75 minutes of recordings, show that the syllable pattern-based detector achieves a higher hit rate while maintaining a lower false alarm rate than a detector with a general model trained on all the syllables.
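As an illustration of the syllable distance described above, the following is a minimal sketch, not the authors' implementation. It assumes that the alignment is done with dynamic time warping (DTW) and that the frame-level difference is the Euclidean distance between feature vectors; the abstract specifies neither. The function name `aligned_average_distance` is hypothetical, and the inputs are assumed to be precomputed LPC-based feature matrices with one row per frame.

```python
import numpy as np


def aligned_average_distance(a, b):
    """Average frame-level distance along a DTW alignment path.

    a, b: arrays of shape (n_frames, n_features), e.g. LPC-derived vectors.
    Assumptions (not from the paper): DTW alignment, Euclidean local distance.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Pairwise Euclidean distance between every frame of a and every frame of b.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # cost[i, j]: minimal accumulated distance aligning a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    # steps[i, j]: number of aligned frame pairs on that minimal-cost path.
    steps = np.zeros((n + 1, m + 1), dtype=int)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Predecessors: diagonal match, advance in a only, advance in b only.
            candidates = [
                (cost[i - 1, j - 1], steps[i - 1, j - 1]),
                (cost[i - 1, j], steps[i - 1, j]),
                (cost[i, j - 1], steps[i, j - 1]),
            ]
            best_cost, best_steps = min(candidates)
            cost[i, j] = best_cost + local[i - 1, j - 1]
            steps[i, j] = best_steps + 1

    # Normalise the accumulated distance by the path length to get an average.
    return cost[n, m] / steps[n, m]
```

A pairwise matrix of such distances over all syllables could then be passed to any standard clustering routine. Note that this sketch minimises the accumulated distance and then divides by the path length, which is a common simplification rather than a direct minimisation of the average.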