Journal: Computer Speech and Language

Features based on filtering and spectral peaks in autocorrelation domain for robust speech recognition

Abstract

In this paper, we propose a set of features derived by filtering and spectral peak extraction in the autocorrelation domain. We focus on the effect of additive noise on speech recognition. Under the assumption that the channel characteristics and additive noise are stationary, these new features improve the robustness of speech recognition in noisy conditions. In this approach, the autocorrelation sequence of a speech signal frame is computed first. In the second step, the autocorrelation sequence is filtered, and the short-time power spectrum of speech is then obtained from the speech signal through the fast Fourier transform. The power spectrum peaks are then found by differentiating the power spectrum with respect to frequency. The magnitudes of these peaks are projected onto the mel scale and passed through the filter bank. Finally, a set of cepstral coefficients is derived from the filter bank outputs. The effectiveness of the new features in noisy conditions is shown through a number of speech recognition experiments. A multi-speaker isolated-word recognition task and a multi-speaker continuous speech recognition task, with various artificially added noises such as factory, babble, car, and F16, were used in these experiments. A further set of experiments was carried out on the Aurora 2 task. The experimental results show significant improvements under noisy conditions compared with the results obtained using traditional feature extraction methods. We also report the results obtained by applying cepstral mean normalization to these methods to obtain features that are robust against both additive noise and channel distortion.
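The processing chain described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not specify the filter, so the sketch assumes a first-order difference across frames, chosen because an autocorrelation component that is identical in every frame (as stationary noise would ideally be) cancels in it. It also assumes the FFT is applied to the filtered autocorrelation sequence, which is one reading of the abstract. All function names, the frame sizes, the 8 kHz sample rate, and the filter and coefficient counts are hypothetical.

```python
import numpy as np


def autocorr_frames(signal, frame_len=256, hop=128):
    """One-sided biased autocorrelation of each frame, shape (n_frames, frame_len)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    out = np.empty((n_frames, frame_len))
    for i in range(n_frames):
        frame = signal[i * hop: i * hop + frame_len]
        full = np.correlate(frame, frame, mode="full")  # lags -(N-1)..(N-1)
        out[i] = full[frame_len - 1:] / frame_len       # keep lags 0..N-1
    return out


def filter_across_frames(acf):
    """Assumed filter: first-order difference along the frame axis.

    A component that is the same in every frame cancels exactly.
    """
    return acf[1:] - acf[:-1]


def power_spectrum(acf, n_fft=512):
    """Magnitude of the FFT of each (filtered) autocorrelation sequence."""
    return np.abs(np.fft.rfft(acf, n=n_fft, axis=1))


def spectral_peaks(spec):
    """Keep bins where the frequency derivative turns from + to -, zero the rest."""
    d = np.diff(spec, axis=1)
    is_peak = (d[:, :-1] > 0) & (d[:, 1:] <= 0)
    peaks = np.zeros_like(spec)
    peaks[:, 1:-1] = np.where(is_peak, spec[:, 1:-1], 0.0)
    return peaks


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers equally spaced on the mel scale."""
    max_mel = 2595.0 * np.log10(1.0 + (sample_rate / 2.0) / 700.0)
    mels = np.linspace(0.0, max_mel, n_filters + 2)
    edges_hz = 700.0 * (10.0 ** (mels / 2595.0) - 1.0)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    fb = np.zeros((n_filters, len(freqs)))
    for j in range(n_filters):
        lo, mid, hi = edges_hz[j:j + 3]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[j] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def cepstra(fbank_out, n_ceps=13):
    """DCT-II of the log filter-bank outputs (small floor avoids log(0))."""
    n = fbank_out.shape[1]
    k = np.arange(n_ceps)[:, None]
    j = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (j + 0.5) / n)
    return np.log(fbank_out + 1e-10) @ basis.T


def autocorr_peak_features(signal, sample_rate=8000, frame_len=256, hop=128,
                           n_fft=512, n_filters=23, n_ceps=13):
    """Autocorrelation -> filter -> FFT -> peaks -> mel filter bank -> cepstra."""
    acf = autocorr_frames(signal, frame_len, hop)
    spec = power_spectrum(filter_across_frames(acf), n_fft)
    fb = mel_filterbank(n_filters, n_fft, sample_rate)
    return cepstra(spectral_peaks(spec) @ fb.T, n_ceps)


# Usage on a synthetic one-second chirp with added white noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 440 * t * (1 + t)) + 0.1 * rng.standard_normal(8000)
features = autocorr_peak_features(signal)
print(features.shape)
```

The result has one row of cepstral coefficients per filtered frame. Cepstral mean normalization, which the abstract also mentions, would amount to subtracting the per-coefficient mean over frames from `features`.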
