Journal: IEEE Transactions on Audio, Speech, and Language Processing
Auditory Model-Based Design and Optimization of Feature Vectors for Automatic Speech Recognition



Abstract

Using spectral and spectro-temporal auditory models along with perturbation-based analysis, we develop a new framework to optimize a feature vector such that it emulates the behavior of the human auditory system. The optimization is carried out offline, based on the conjecture that the local geometries of the feature vector domain and the perceptual auditory domain should be similar. Using this principle along with a static spectral auditory model, we modify and optimize the static spectral mel frequency cepstral coefficients (MFCCs) without considering any feedback from the speech recognition system. We then extend the work to incorporate spectro-temporal auditory properties into the design of a new dynamic spectro-temporal feature vector. Using a spectro-temporal auditory model, we design and optimize the dynamic feature vector to incorporate the behavior of the human auditory response across time and frequency. We show that a significant improvement in automatic speech recognition (ASR) performance is obtained for any environmental condition, clean as well as noisy.
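The conjecture in the abstract, that small perturbations of the input should produce similar local distance patterns in the feature domain and in the perceptual domain, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' method: the abstract does not give the objective function or the auditory models, so `auditory_map` (a plain log compression), `make_feature_map` (a power-law compression with a tunable exponent), and `geometry_mismatch` are hypothetical stand-ins.

```python
import numpy as np

def local_distances(mapping, x, perturbations):
    """Squared distance, in the mapped domain, between x and each perturbed copy of x."""
    base = mapping(x)
    return np.array([np.sum((mapping(x + d) - base) ** 2) for d in perturbations])

def geometry_mismatch(feature_map, auditory_map, x, perturbations):
    """Relative mismatch between two local distance patterns, ignoring overall scale.

    Returns 0 when the feature-domain distances are a scalar multiple of the
    auditory-domain distances (hypothetical measure, chosen for simplicity).
    """
    d_feat = local_distances(feature_map, x, perturbations)
    d_aud = local_distances(auditory_map, x, perturbations)
    # Least-squares scalar that best aligns the feature distances with the auditory ones.
    scale = np.dot(d_feat, d_aud) / np.dot(d_feat, d_feat)
    return np.sum((d_aud - scale * d_feat) ** 2) / np.sum(d_aud ** 2)

def auditory_map(x):
    """Hypothetical stand-in for an auditory model: log compression of a power spectrum."""
    return np.log(x)

def make_feature_map(exponent):
    """Hypothetical tunable feature: power-law compression with the given exponent."""
    return lambda x: x ** exponent

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=8)                   # toy positive "spectrum"
perturbations = 1e-3 * rng.standard_normal((50, 8))  # small random perturbations

# Offline search over the feature parameter; no recognizer feedback is involved.
candidates = [0.1, 0.33, 0.5, 1.0]
scores = {p: geometry_mismatch(make_feature_map(p), auditory_map, x, perturbations)
          for p in candidates}
best_p = min(scores, key=scores.get)
self_score = geometry_mismatch(auditory_map, auditory_map, x, perturbations)
```

In this toy setting the search only compares distance patterns around a single point `x`; a realistic version would average the mismatch over many speech frames and use a real auditory model in place of the log compression.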
