
Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species

Abstract

Feature extraction for Acoustic Bird Species Classification (ABSC) has traditionally relied on parametric representations developed specifically for speech signals, such as Mel Frequency Cepstral Coefficients (MFCC). However, the discriminative power of these features for ABSC could be enhanced by accounting for the vocal production mechanisms of birds and, in particular, the spectro-temporal structure of bird sounds. In this paper, a new front-end for ABSC is proposed that incorporates this information through the non-negative decomposition of bird sound spectrograms. It consists of two stages: short-time feature extraction and temporal feature integration. The first stage aims at providing a better spectral representation of bird sounds on a frame-by-frame basis, and two methods are evaluated for it. In the first method, cepstral-like features (NMF_CC) are extracted using a filter bank that is learned automatically by applying Non-Negative Matrix Factorization (NMF) to bird audio spectrograms. In the second method, the features (H_CC) are derived directly from the activation coefficients of the NMF spectrogram decomposition. The second stage summarizes the most relevant information contained in the short-time features by computing several statistical measures over long segments. The experiments show that using NMF_CC and H_CC in conjunction with temporal integration significantly improves the performance of a Support Vector Machine (SVM)-based ABSC system with respect to conventional MFCC.
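The two-stage front-end described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the NMF cost function, the number of basis vectors, the compression applied to the features, or which statistics are used for temporal integration. The sketch assumes Lee-Seung multiplicative updates with a Euclidean cost, log compression followed by a DCT-II (by analogy with MFCC) for both NMF_CC and H_CC, and mean and standard deviation over non-overlapping segments. All function names and parameter values are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Factor a non-negative matrix V (freq x frames) as W @ H.

    Assumption: Lee-Seung multiplicative updates for the Euclidean cost;
    the abstract does not say which NMF variant the paper uses.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps   # spectral basis vectors
    H = rng.random((rank, V.shape[1])) + eps   # per-frame activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis, as used for MFCC-style decorrelation."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def cepstral_like(energies, n_ceps, eps=1e-10):
    """Log-compress band values (bands x frames), then apply the DCT."""
    return dct_matrix(n_ceps, energies.shape[0]) @ np.log(energies + eps)

def integrate(features, seg_len):
    """Summarize short-time features (dims x frames) per segment.

    Assumption: mean and standard deviation over non-overlapping segments;
    the abstract only mentions "several statistical measures".
    """
    dims, n_frames = features.shape
    n_seg = n_frames // seg_len
    segs = features[:, :n_seg * seg_len].reshape(dims, n_seg, seg_len)
    return np.concatenate([segs.mean(axis=2), segs.std(axis=2)], axis=0)

# Synthetic stand-in for a magnitude spectrogram (64 bins x 100 frames).
rng = np.random.default_rng(1)
V = rng.random((64, 4)) @ rng.random((4, 100))

W, H = nmf(V, rank=8)

# Stage 1, method 1: use the learned basis W as a filter bank (NMF_CC-like).
nmf_cc = cepstral_like(W.T @ V, n_ceps=6)   # shape (6, 100)
# Stage 1, method 2: derive features from the activations H (H_CC-like).
h_cc = cepstral_like(H, n_ceps=6)           # shape (6, 100)

# Stage 2: temporal integration over segments of 20 frames.
segment_features = integrate(nmf_cc, seg_len=20)   # shape (12, 5)
```

In a full system, each column of `segment_features` would be one input vector to the SVM classifier. In practice the basis `W` would be learned once from training spectrograms and then held fixed, rather than re-estimated for each recording as in this toy example.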
