Published in: Neurocomputing

Noise robust audio surveillance using reduced spectrogram image feature and one-against-all SVM


Abstract

This paper builds on the technique of extracting features from the spectrogram image of sound signals for automatic sound recognition. The spectrogram image is divided into blocks, and statistical distributions are extracted from each block as features. Compared with related work, however, we reduce the dimensionality of the feature vector by using mean and standard deviation values along the rows and columns of the blocks, without compromising classification accuracy. We demonstrate the technique in an audio surveillance application and evaluate its performance using four common multiclass support vector machine (SVM) classification techniques: one-against-all, one-against-one, decision directed acyclic graph, and adaptive directed acyclic graph. Experiments were carried out on an audio database with 10 sound classes, each containing multiple subclasses with intraclass diversity and interclass similarity in terms of signal properties. Under noisy conditions, the proposed reduced spectrogram image feature (RSIF) produced significantly better classification accuracy than the conventional log-compressed mel-frequency cepstral coefficients (MFCCs) and marginally better classification accuracy than linear MFCCs, which do not use any compression. The linear spectrogram image representation for feature extraction and the one-against-all multiclass SVM classification method were found to be the most noise robust. In addition, significantly improved results were obtained under noisy conditions when the RSIF was combined with linear MFCCs. © 2015 Elsevier B.V. All rights reserved.
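
The block-based reduction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the code below is one plausible reading and not the authors' implementation: it assumes the spectrogram is already available as a 2-D list of non-negative magnitudes, scales it linearly to [0, 1] (no log compression), takes the mean intensity of each block, and then keeps only the mean and standard deviation of those block means along each row and each column of the block grid. The function names (`normalize`, `block_means`, `reduced_features`) and the grid-size parameters are hypothetical.

```python
from statistics import fmean, pstdev


def normalize(image):
    """Linearly scale a 2-D list of magnitudes to the range [0, 1]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = hi - lo
    if span == 0:
        # A constant image carries no contrast; map every pixel to 0.
        return [[0.0 for _ in row] for row in image]
    return [[(value - lo) / span for value in row] for row in image]


def block_means(image, n_block_rows, n_block_cols):
    """Split the image into a grid of blocks and return each block's mean."""
    height, width = len(image), len(image[0])
    if n_block_rows > height or n_block_cols > width:
        raise ValueError("grid has more blocks than pixels along an axis")
    # Integer boundaries; block sizes may differ by one pixel.
    row_edges = [i * height // n_block_rows for i in range(n_block_rows + 1)]
    col_edges = [j * width // n_block_cols for j in range(n_block_cols + 1)]
    grid = []
    for r in range(n_block_rows):
        grid_row = []
        for c in range(n_block_cols):
            pixels = [
                image[y][x]
                for y in range(row_edges[r], row_edges[r + 1])
                for x in range(col_edges[c], col_edges[c + 1])
            ]
            grid_row.append(fmean(pixels))
        grid.append(grid_row)
    return grid


def reduced_features(image, n_block_rows, n_block_cols):
    """Assumed reduction: mean and std of block means per grid row and column.

    Returns 2 * (n_block_rows + n_block_cols) values, ordered as
    (mean, std) for each grid row, then (mean, std) for each grid column.
    """
    grid = block_means(normalize(image), n_block_rows, n_block_cols)
    columns = [list(col) for col in zip(*grid)]
    features = []
    for line in grid + columns:
        features.append(fmean(line))
        features.append(pstdev(line))  # population standard deviation
    return features
```

Under this reading, an R-by-C block grid yields 2(R + C) feature values instead of one or more statistics per block (at least R·C values), which is where the dimensionality reduction comes from. The resulting vector could then be passed to any multiclass classifier, such as a one-against-all SVM.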
