Conference: European Signal Processing Conference

Novel TEO-based Gammatone features for environmental sound classification


Abstract

In this paper, we propose using a modified Gammatone filterbank with the Teager Energy Operator (TEO) for the environmental sound classification (ESC) task. The TEO tracks energy as a function of both the amplitude and the frequency of an audio signal. It is therefore better suited to capturing energy variations in signals produced by a real physical system, such as environmental sounds that contain amplitude and frequency modulations. The proposed feature set uses a Gammatone filterbank because it represents characteristics of human auditory processing. We use two classifiers: a Gaussian Mixture Model (GMM) with cepstral features and a Convolutional Neural Network (CNN) with spectral features. We performed experiments on two datasets, ESC-50 and UrbanSound8K. We compared the TEO-based coefficients with Mel frequency cepstral coefficients (MFCC) and Gammatone cepstral coefficients (GTCC), where GTCC uses mean square energy. Using the GMM, the proposed TEO-based Gammatone cepstral coefficients (TEO-GTCC) and their score-level fusion with MFCC gave absolute improvements of 0.45 % and 3.85 %, respectively, in classification accuracy over MFCC on the ESC-50 dataset. Similarly, on the UrbanSound8K dataset, the proposed TEO-GTCC and their score-level fusion with GTCC gave absolute improvements of 1.40 % and 2.44 %, respectively, in classification accuracy over MFCC. Using the CNN, the score-level fusion of Gammatone spectral coefficients (GTSC) and the proposed TEO-based Gammatone spectral coefficients (TEO-GTSC) gave absolute improvements of 14.10 % and 14.52 % in classification accuracy over Mel filterbank energies (FBE) on the ESC-50 and UrbanSound8K datasets, respectively. This shows that the proposed TEO-based Gammatone features contain complementary information that is helpful for the ESC task.
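The abstract's claim that the TEO tracks energy as a function of both amplitude and frequency can be illustrated with a minimal sketch. It assumes the standard discrete-time form of the operator, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1), which the abstract itself does not spell out. The function name `teager_energy` and the test-tone parameters `A` and `omega` are hypothetical, and the operator is applied here to a raw sinusoid rather than to Gammatone subband outputs, so this is not the paper's feature pipeline.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    # Only interior samples have both neighbours, so the output is 2 samples shorter.
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# Hypothetical test tone: amplitude A, digital frequency omega (radians per sample).
A, omega = 0.5, 0.3
n = np.arange(200)
tone = A * np.cos(omega * n)

psi = teager_energy(tone)

# For a pure sinusoid the operator output is the constant A^2 * sin(omega)^2,
# so it grows with both the amplitude and the frequency of the tone.
expected = (A ** 2) * np.sin(omega) ** 2

# Same amplitude, higher frequency: the Teager energy increases,
# whereas the mean square energy (A^2 / 2) would stay essentially the same.
psi_higher = teager_energy(A * np.cos(2 * omega * n))
```

By contrast, the mean square energy of the same tone depends only on `A`, which is one way to read the abstract's distinction between the TEO-based features and the GTCC baseline.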
