International Conference on Metadata for Audio

HOW EFFICIENT IS MPEG-7 FOR GENERAL SOUND RECOGNITION?

Abstract

Our challenge is to analyze and classify the content of video soundtracks for indexing purposes. To this end, we compare the performance of MPEG-7 Audio Spectrum Projection (ASP) features, derived with several basis decomposition algorithms, against Mel-scale Frequency Cepstrum Coefficients (MFCC). For basis decomposition in the feature extraction stage, we evaluate three approaches: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Audio features are computed from these reduced vectors and fed into a continuous hidden Markov model (CHMM) classifier. We conclude that, under practical constraints, the established MFCC features yield better performance than MPEG-7 ASP for general sound recognition.
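To make the basis-decomposition step concrete, here is a minimal sketch of the PCA variant only, assuming a log-spectrum matrix with one row per audio frame. It is not the MPEG-7 reference extraction procedure: the function names `pca_basis` and `project`, the matrix sizes, and the random stand-in data are all hypothetical, and the ICA, NMF, and CHMM stages are left out.

```python
import numpy as np

def pca_basis(frames, n_components):
    """Estimate a PCA basis from a (n_frames, n_bands) spectrum matrix.

    Returns the per-band mean and a (n_bands, n_components) matrix whose
    columns are the leading principal directions.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components].T

def project(frames, mean, basis):
    """Project spectrum frames onto the reduced basis."""
    return (frames - mean) @ basis

# Hypothetical stand-in for a log-spectrum: 200 frames x 32 frequency bands.
rng = np.random.default_rng(0)
spectrum_db = rng.normal(size=(200, 32))

mean, basis = pca_basis(spectrum_db, n_components=8)
features = project(spectrum_db, mean, basis)  # reduced vectors, shape (200, 8)
```

In a pipeline like the one the abstract describes, reduced vectors of this kind would be the input to the classifier. The number of retained components (8 here) is an arbitrary choice for illustration.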