Annual Conference of the International Speech Communication Association (INTERSPEECH 2011)

Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs



Abstract

A novel and robust approach to content-based speech/non-speech audio classification is proposed, based on sparse representation (SR) features and Gaussian process classifiers (GPCs). Projections of noise-robust sparse representations of the audio signals, computed by L_1-norm minimization, are used as features. GPCs are used to learn and predict audio categories. Whereas Support Vector Machines (SVMs) make hyperparameter selection difficult, GPCs estimate their hyperparameters with a Bayesian selection criterion. Experimental results on real-world audio datasets show that the SR features are more robust to audio variants than mel-frequency cepstral coefficients (MFCCs), and that the proposed approach performs better than SVMs.
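
To make the L_1-norm minimization step concrete, here is a minimal sketch of sparse coding with iterative soft-thresholding (ISTA). It is an illustration only, not the paper's method: the abstract names neither the solver, the dictionary, nor the projection that turns the sparse code into features. The function names, the toy dictionary `D`, the penalty `lam`, and the iteration count are all assumptions. The sketch uses only the standard library so it runs without dependencies, and it does not cover the GPC stage.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(D, y, lam=0.1, n_iter=2000):
    """Approximately solve min_x 0.5 * ||y - D x||^2 + lam * ||x||_1 with ISTA.

    D is a list of rows (signal_dim x n_atoms); y has length signal_dim.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    n_rows, n_atoms = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so 1 / frob_sq is a safe (conservative) step size.
    frob_sq = sum(d * d for row in D for d in row)
    step = 1.0 / frob_sq
    x = [0.0] * n_atoms
    for _ in range(n_iter):
        # Residual r = D x - y
        r = [sum(D[i][j] * x[j] for j in range(n_atoms)) - y[i]
             for i in range(n_rows)]
        # Gradient of the quadratic term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(n_rows))
             for j in range(n_atoms)]
        # Gradient step followed by soft-thresholding (the L1 proximal step)
        x = [soft_threshold(x[j] - step * g[j], step * lam)
             for j in range(n_atoms)]
    return x


# Toy overcomplete dictionary: three axis-aligned atoms plus one mixed atom.
D = [
    [1.0, 0.0, 0.0, 0.6],
    [0.0, 1.0, 0.0, 0.8],
    [0.0, 0.0, 1.0, 0.0],
]
# The signal is exactly twice the last atom.
y = [1.2, 1.6, 0.0]

x = sparse_code(D, y)
print(x)
```

In this toy case the signal can be written with two axis-aligned atoms or with the single mixed atom. The L_1 penalty favours the second option, so the recovered code should put essentially all of its weight on the last coefficient.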
