Annual Conference of the International Speech Communication Association

Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition

Abstract

Spectro-temporal filtering has previously been shown to yield features that increase the robustness of automatic speech recognition (ASR). We replace the spectro-temporal representation used in previous work with spectrograms that incorporate knowledge about signal processing in the human auditory system and that are derived from Power-Normalized Cepstral Coefficients (PNCCs). 2D Gabor filters are applied to these spectrograms to extract features, which are evaluated on a noisy digit recognition task. The filter bank is adapted to the new representation by optimizing the spectral modulation frequencies associated with each Gabor function. A comparison of the optimized parameters with the spectral modulation of vowels shows a good match between the optimized and expected frequency ranges. When processed with a non-linear neural net and combined with PNCCs, Gabor features decrease the error rate by at least 19% compared to the baseline and to PNCCs.
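
To make the idea of a spectral modulation frequency concrete, the following is a minimal sketch of a 2D Gabor filter applied to a toy spectrogram. It is not the authors' implementation: the Hann envelope, the function names `gabor_2d` and `apply_filter`, the filter sizes, and the modulation frequencies are all assumptions chosen for illustration, and the input is a synthetic spectral ripple rather than a PNCC-derived spectrogram.

```python
import cmath
import math


def gabor_2d(omega_k, omega_n, size_k, size_n):
    """Complex 2D Gabor filter: a plane-wave carrier under a Hann envelope.

    omega_k: spectral modulation frequency in radians per channel (assumed unit).
    omega_n: temporal modulation frequency in radians per frame (assumed unit).
    size_k, size_n: odd filter extents in channels and frames (hypothetical).
    """
    ck, cn = size_k // 2, size_n // 2
    kernel = []
    for k in range(size_k):
        # Hann window sampled so that it is strictly positive on the support
        hk = 0.5 - 0.5 * math.cos(2 * math.pi * (k + 1) / (size_k + 1))
        row = []
        for n in range(size_n):
            hn = 0.5 - 0.5 * math.cos(2 * math.pi * (n + 1) / (size_n + 1))
            carrier = cmath.exp(1j * (omega_k * (k - ck) + omega_n * (n - cn)))
            row.append(hk * hn * carrier)
        kernel.append(row)
    return kernel


def apply_filter(spec, kernel):
    """Correlate a spectrogram (channels x frames) with a kernel, zero-padded.

    Returns the real part of the response, one value per input cell.
    """
    num_k, num_n = len(spec), len(spec[0])
    size_k, size_n = len(kernel), len(kernel[0])
    ck, cn = size_k // 2, size_n // 2
    out = [[0.0] * num_n for _ in range(num_k)]
    for k in range(num_k):
        for n in range(num_n):
            acc = 0j
            for i in range(size_k):
                for j in range(size_n):
                    kk, nn = k + i - ck, n + j - cn
                    if 0 <= kk < num_k and 0 <= nn < num_n:
                        acc += spec[kk][nn] * kernel[i][j]
            out[k][n] = acc.real
    return out


# Toy spectrogram: 32 channels x 5 frames with a spectral ripple of
# period 8 channels that is constant over time (illustrative values only).
ripple = math.pi / 4
spec = [[math.cos(ripple * k)] * 5 for k in range(32)]

# One filter tuned to the ripple and one tuned to a different frequency.
matched = apply_filter(spec, gabor_2d(ripple, 0.0, 15, 3))
mismatched = apply_filter(spec, gabor_2d(3 * math.pi / 4, 0.0, 15, 3))

print("matched response at centre:   ", round(matched[16][2], 6))
print("mismatched response at centre:", round(mismatched[16][2], 6))
```

In this sketch the filter whose spectral modulation frequency equals the ripple frequency responds strongly in the interior of the spectrogram, while the mismatched filter responds only weakly. This selectivity is why the choice of spectral modulation frequencies matters when a filter bank is moved to a new spectrogram representation. A real front end would use a bank of such filters covering several spectral and temporal modulation frequencies.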
