Journal: IEEE Transactions on Multimedia

Local Wavelet Acoustic Pattern: A Novel Time–Frequency Descriptor for Birdsong Recognition

Abstract

Investigating the identity, distribution, and evolution of bird species is important for both biodiversity assessment and environmental conservation. The discrete wavelet transform (DWT) has been widely exploited to extract time-frequency features for acoustic signal analysis. Traditional approaches usually compute statistical measures (e.g., maximum, mean, standard deviation) of the DWT coefficients in each subband independently to yield the feature descriptor, without considering the intersubband correlation. A new acoustic descriptor, called the local wavelet acoustic pattern (LWAP), is proposed to characterize the correlation of the DWT coefficients in different subbands for birdsong recognition. First, we divide a variable-length birdsong segment into a number of fixed-duration texture windows. For each texture window, several LWAP descriptors are extracted. The vector of locally aggregated descriptors (VLAD) is then used to aggregate the set of LWAP descriptors into a single VLAD vector. Finally, principal component analysis (PCA) plus linear discriminant analysis (LDA) are employed to reduce the feature dimensionality for classification purposes. Experiments on two birdsong datasets show that the proposed LWAP descriptor outperforms other local descriptors, including linear predictive coding cepstral coefficients, Mel-frequency cepstral coefficients, perceptual linear prediction cepstral coefficients, chroma features, and prosody features. Furthermore, the proposed LWAP descriptor, followed by VLAD encoding, PCA plus LDA feature extraction, and a simple distance-based classifier, yields promising results that are competitive with those obtained by the state-of-the-art convolutional neural networks.
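The pipeline described in the abstract (fixed-duration texture windows, a local descriptor per window, VLAD aggregation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not define LWAP itself, so the sketch substitutes the traditional baseline it mentions: per-subband statistics (maximum, mean, standard deviation) of DWT coefficients, computed here with a hand-written Haar wavelet. The function names, the window length, the number of decomposition levels, and the randomly picked codebook are all hypothetical; a VLAD codebook would normally be learned, for example by k-means. The sketch also computes one descriptor per window and aggregates across windows, whereas the abstract extracts several descriptors per window. The PCA plus LDA step is omitted.

```python
import numpy as np

def split_into_windows(signal, win_len):
    """Cut a 1-D signal into non-overlapping fixed-length windows (tail dropped)."""
    n = len(signal) // win_len
    return signal[: n * win_len].reshape(n, win_len)

def haar_dwt_stats(window, levels=3):
    """Baseline descriptor (NOT LWAP): max/mean/std of |coefficients| per Haar subband."""
    approx = np.asarray(window, dtype=float)
    feats = []
    for _ in range(levels):
        if len(approx) % 2:                      # drop last sample so pairs line up
            approx = approx[:-1]
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)     # high-pass (detail) subband
        approx = (even + odd) / np.sqrt(2.0)     # low-pass (approximation) subband
        mag = np.abs(detail)
        feats += [mag.max(), mag.mean(), mag.std()]
    mag = np.abs(approx)                         # final approximation subband
    feats += [mag.max(), mag.mean(), mag.std()]
    return np.array(feats)

def vlad_encode(descriptors, centers):
    """Sum residuals to each descriptor's nearest center, flatten, L2-normalize."""
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    vlad = np.zeros_like(centers, dtype=float)
    for k in range(centers.shape[0]):
        members = descriptors[nearest == k]
        if len(members):
            vlad[k] = (members - centers[k]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Synthetic stand-in for a birdsong segment (assumption: random noise, 16000 samples).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

windows = split_into_windows(signal, win_len=1024)
descs = np.stack([haar_dwt_stats(w) for w in windows])

# Hypothetical codebook: 4 descriptors picked at random instead of k-means centers.
centers = descs[rng.choice(len(descs), size=4, replace=False)]
vlad = vlad_encode(descs, centers)
```

With 4 centers and 12-dimensional descriptors, the resulting VLAD vector has 4 × 12 = 48 entries, independent of how many windows the segment produced. That fixed length is what lets variable-length segments be compared with a simple distance-based classifier after dimensionality reduction.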