IEEE International Conference on Acoustics, Speech, and Signal Processing

COMPARISON OF MPEG-7 AUDIO SPECTRUM PROJECTION FEATURES AND MFCC APPLIED TO SPEAKER RECOGNITION, SOUND CLASSIFICATION AND AUDIO SEGMENTATION

Abstract

Our purpose is to evaluate the MPEG-7 Audio Spectrum Projection (ASP) features for general sound recognition against the well-established Mel-frequency cepstral coefficients (MFCC). The recognition tasks of interest are speaker recognition, sound classification, and audio segmentation using sound/speaker identification. For sound classification we use three approaches: the direct approach, the hierarchical approach without hints, and the hierarchical approach with hints. For audio segmentation, the MPEG-7 ASP features and the MFCCs are each used to train hidden Markov models (HMMs) for individual speakers and sounds. The trained sound/speaker models are then used to segment conversational speech involving a given subset of people in panel-discussion television programs. The results show that the MFCC approach yields a higher sound/speaker recognition rate than the MPEG-7 implementations.
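
The segmentation step described above (score the audio against each trained model and assign each stretch to the best-scoring one) can be illustrated with a minimal sketch. This is not the authors' implementation: for brevity each HMM is replaced by a single diagonal Gaussian, and the names `log_likelihood`, `segment`, `win`, the model labels, and the feature values are all hypothetical. The feature vectors stand in for either MFCC or ASP frames.

```python
import math


def log_likelihood(frames, mean, var):
    """Total log-likelihood of frames under a diagonal Gaussian.

    Assumption: this single Gaussian is a stand-in for the per-speaker
    HMM score used in the paper.
    """
    total = 0.0
    for x in frames:
        for xi, m, v in zip(x, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
    return total


def segment(frames, models, win=3):
    """Label each non-overlapping window with the best-scoring model,
    then merge adjacent windows that share a label.

    Returns a list of (start_frame, end_frame_exclusive, label).
    """
    segments = []
    for start in range(0, len(frames), win):
        chunk = frames[start:start + win]
        label = max(models, key=lambda name: log_likelihood(chunk, *models[name]))
        end = start + len(chunk)
        if segments and segments[-1][2] == label:
            # Same model as the previous window: extend that segment.
            segments[-1] = (segments[-1][0], end, label)
        else:
            segments.append((start, end, label))
    return segments


# Hypothetical 2-D feature vectors and models: name -> (mean, variance).
models = {
    "speaker_a": ([0.0, 0.0], [1.0, 1.0]),
    "speaker_b": ([5.0, 5.0], [1.0, 1.0]),
}
frames = [[0.1, -0.2]] * 6 + [[4.9, 5.2]] * 6
result = segment(frames, models)
```

With these toy inputs, `result` contains one segment per speaker, split at frame 6. A fuller version would swap `log_likelihood` for an HMM forward or Viterbi score, which also captures the temporal structure that a single Gaussian ignores.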