Journal: IEEE Signal Processing Letters

Recognition of Reverberant Speech Using Frequency Domain Linear Prediction



Abstract

The performance of a typical automatic speech recognition (ASR) system degrades severely when it encounters speech from reverberant environments. Part of the reason for this degradation is that feature extraction techniques use analysis windows that are much shorter than typical room impulse responses. We present a feature extraction technique based on modeling temporal envelopes of the speech signal in narrow subbands using frequency domain linear prediction (FDLP). FDLP provides an all-pole approximation of the Hilbert envelope of the signal, obtained by applying linear prediction to the cosine transform of the signal. ASR experiments on speech data degraded with a number of room impulse responses (with varying degrees of distortion) show significant performance improvements for the proposed FDLP features compared to other robust feature extraction techniques (an average relative reduction of 24% in word error rate). Similar improvements are also obtained for far-field data, which contain natural reverberation in background noise. These results are achieved without any noticeable degradation in performance for clean speech.
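
The core FDLP step described in the abstract (linear prediction applied to the cosine transform of the signal, yielding an all-pole model of the temporal envelope) can be illustrated with a minimal full-band sketch. This is not the authors' implementation: the function name `fdlp_envelope`, the model order, the small diagonal loading term, and the use of the autocorrelation (Yule-Walker) method are all assumptions made for illustration, and the subband decomposition mentioned in the abstract is omitted.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz


def fdlp_envelope(x, order=20, loading=1e-4):
    """Illustrative full-band FDLP: all-pole estimate of the squared
    temporal (Hilbert) envelope of x. Names and defaults are hypothetical."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Cosine transform of the whole segment (DCT-II, orthonormal scaling).
    c = dct(x, type=2, norm="ortho")

    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])

    # Assumption: slight diagonal loading to keep the system well conditioned.
    r[0] *= 1.0 + loading

    # Yule-Walker equations for A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.
    a = solve_toeplitz(r[:order], -r[1 : order + 1])
    gain = r[0] + np.dot(a, r[1 : order + 1])  # prediction error power

    # In the DCT domain, "frequency" pi * (t + 0.5) / n corresponds to
    # time sample t, so the model spectrum is read as a temporal envelope.
    theta = np.pi * (np.arange(n) + 0.5) / n
    _, h = freqz([1.0], np.concatenate(([1.0], a)), worN=theta)
    return gain * np.abs(h) ** 2


if __name__ == "__main__":
    # Toy input: a tone burst with a Gaussian envelope centred at sample 300.
    t = np.arange(1000)
    x = np.exp(-0.5 * ((t - 300) / 40.0) ** 2) * np.cos(2 * np.pi * 0.1 * t)
    env = fdlp_envelope(x)
    print("envelope peak near sample", int(np.argmax(env)))
```

Because the model is fitted over the whole segment rather than over a short analysis window, the estimated envelope can span durations comparable to a room impulse response, which is the motivation given in the abstract. A subband version would, under the same assumptions, window the DCT coefficients into narrow bands and fit one all-pole model per band.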
