International Symposium on Intelligent Multimedia, Video and Speech Processing

Robust speech recognition based on the second-order difference cochlear model

Abstract

MFCC is a traditional speech feature widely used in speech recognition. The error rate of recognition systems based on MFCC and CDHMM is known to be very low in clean speech, but it increases greatly in noisy environments, especially under white noise. In this paper, we propose a new speech feature called the Auditory Spectrum Based Feature (ASBF), which is derived from the second-order difference cochlear model of the human auditory system. The new feature can track the speech formants, and its selection scheme is based on both the second-order difference cochlear model and a model of primary auditory nerve processing in the human auditory system. In our experiments, the performance of MFCC and ASBF is compared in both clean and noisy environments using a left-to-right CDHMM with 6 states and 5 Gaussian mixtures per state. The results show that ASBF is much more robust to noise than MFCC: when only 5 frequency components are used in the ASBF, the error rate is approximately 38% lower than that of the traditional 39-parameter MFCC at S/N = 10 dB with white noise.
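
The abstract compares ASBF against a 39-parameter MFCC front end. The exact configuration is not given here; 39 parameters conventionally means 13 static cepstral coefficients plus their first- and second-order time derivatives. The following is a minimal sketch of such a baseline front end under that assumption, using librosa purely for illustration (librosa is not part of the original work, and the file name is hypothetical).

# Sketch of a conventional 39-dimensional MFCC front end
# (13 static coefficients + delta + delta-delta). The exact baseline
# configuration is an assumption; the abstract only states 39 parameters.
import numpy as np
import librosa

def mfcc_39(wav_path, sr=16000):
    y, sr = librosa.load(wav_path, sr=sr)
    # 13 static MFCCs per frame, 25 ms window, 10 ms hop (typical values)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                                n_fft=int(0.025 * sr),
                                hop_length=int(0.010 * sr))
    delta = librosa.feature.delta(mfcc)            # first-order time derivative
    delta2 = librosa.feature.delta(mfcc, order=2)  # second-order time derivative
    feats = np.vstack([mfcc, delta, delta2])       # shape: (39, n_frames)
    return feats.T                                 # (n_frames, 39) for HMM training

if __name__ == "__main__":
    X = mfcc_39("utterance.wav")   # hypothetical input file
    print(X.shape)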
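
The recognizer used in the experiment is a left-to-right CDHMM with 6 states and 5 Gaussian mixtures per state. Below is a minimal sketch of such a model using hmmlearn's GMMHMM; the choice of hmmlearn, the word-level training/recognition outline, and the data names are illustrative assumptions, not the authors' setup.

# Sketch of a 6-state, 5-mixture left-to-right continuous-density HMM
# using hmmlearn (an illustrative stand-in for the recognizer described
# in the abstract; training data and model inventory are hypothetical).
import numpy as np
from hmmlearn.hmm import GMMHMM

N_STATES, N_MIX = 6, 5

def left_to_right_hmm(n_states=N_STATES, n_mix=N_MIX):
    model = GMMHMM(n_components=n_states, n_mix=n_mix,
                   covariance_type="diag", n_iter=20,
                   init_params="mcw",   # initialize only means, covariances, mixture weights
                   params="mcw")        # ...and update only those, keeping the topology fixed
    # Start in the first state; allow only self-loops and forward transitions.
    model.startprob_ = np.eye(n_states)[0]
    transmat = np.zeros((n_states, n_states))
    for i in range(n_states):
        transmat[i, i] = 0.5
        transmat[i, min(i + 1, n_states - 1)] += 0.5   # last state absorbs
    model.transmat_ = transmat
    return model

# Outline of isolated-unit recognition: one HMM per word, pick the best score.
def train(models, data):
    # data: {word: list of (n_frames, n_dims) feature arrays}, e.g. ASBF or MFCC
    for word, utterances in data.items():
        X = np.vstack(utterances)
        lengths = [len(u) for u in utterances]
        models[word].fit(X, lengths)

def recognize(models, feats):
    return max(models, key=lambda w: models[w].score(feats))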
