Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

Spectral-temporal receptive fields and MFCC balanced feature extraction for noisy speech recognition


Abstract

This paper proposes a new set of acoustic features based on spectral-temporal receptive fields (STRFs). The STRF is an analysis method for studying a physiological model of the mammalian auditory system in the spectral-temporal domain. It has two components: the rate (in Hz), which represents the temporal response, and the scale (in cycles/octave), which represents the spectral response. The proposed feature is derived from the STRF in three steps. First, the energy of each scale is calculated from the STRF. Second, the logarithm of each scale energy is taken. Finally, the discrete cosine transform is applied to the log energies to generate the proposed STRF feature. In our experiments, we combine the proposed STRF feature with conventional Mel-frequency cepstral coefficients (MFCCs) to verify its effectiveness. In a noise-free environment, the proposed feature increases the recognition rate by 17.48%. In noisy environments, the increase in the recognition rate ranges from 5% to 12%.
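The three steps named in the abstract (per-scale energy, logarithm, discrete cosine transform) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give the exact definitions, so several things here are assumptions: the STRF output is taken to be an array of response magnitudes indexed `[scale][rate][frequency]`, the energy of a scale is taken to be the sum of squared magnitudes over rate and frequency, the DCT is an unnormalized DCT-II, and "combining" with MFCCs is taken to mean simple concatenation. All function names are hypothetical.

```python
import math

def scale_energies(strf):
    """Energy per scale, assuming strf[s][r][f] holds response magnitudes
    for scale s, rate r, and frequency channel f (assumed layout)."""
    # Assumption: energy = sum of squared magnitudes over rate and frequency.
    return [sum(v * v for rate_row in scale for v in rate_row) for scale in strf]

def dct_ii(x):
    """Unnormalized DCT-II of a sequence (assumed DCT variant)."""
    n = len(x)
    return [
        sum(x[k] * math.cos(math.pi * (k + 0.5) * i / n) for k in range(n))
        for i in range(n)
    ]

def strf_feature(strf, floor=1e-12):
    """Scale energies -> log -> DCT, following the order given in the abstract."""
    # The floor avoids log(0) for silent scales; it is an assumption, not from the paper.
    log_energies = [math.log(max(e, floor)) for e in scale_energies(strf)]
    return dct_ii(log_energies)

def combine_with_mfcc(mfcc, strf_feat):
    """Assumed combination scheme: concatenate the two feature vectors."""
    return list(mfcc) + list(strf_feat)

# Toy input for one frame: 2 scales x 1 rate x 2 frequency channels.
frame_strf = [[[1.0, 0.0]], [[0.0, 1.0]]]
feature = strf_feature(frame_strf)
combined = combine_with_mfcc([0.5, -0.25, 0.125], feature)
```

In this sketch the feature has one coefficient per scale, so its length is set by how many scales the STRF analysis uses; the abstract does not state that number or whether the coefficients are truncated.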
