Journal: Audio, Speech, and Language Processing, IEEE Transactions on

Extended VTS for Noise-Robust Speech Recognition

Abstract

Model compensation is a standard way of improving the robustness of speech recognition systems to noise. A number of popular schemes are based on vector Taylor series (VTS) compensation, which uses a linear approximation to represent the influence of noise on the clean speech. To compensate the dynamic parameters, the continuous time approximation is often used. This approximation uses a point estimate of the gradient, which fails to take into account that dynamic coefficients are a function of a number of consecutive static coefficients. In this paper, the accuracy of dynamic parameter compensation is improved by representing the dynamic features as a linear transformation of a window of static features. A modified version of VTS compensation is applied to the distribution of the window of static features and, importantly, their correlations. These compensated distributions are then transformed to distributions over standard static and dynamic features. With this improved approximation, it is also possible to obtain full-covariance corrupted speech distributions. This addresses the correlation changes that occur in noise. The proposed scheme outperformed the standard VTS scheme by 10% to 20% relative on a range of tasks.
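The step of turning a distribution over a window of static features into one over standard static and dynamic features can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the window length or the delta formula, so the sketch assumes the common regression-based delta coefficients with half-width `w`, and the names `delta_weights`, `static_delta_projection`, and `project_gaussian` are hypothetical. The VTS compensation of the window distribution itself is not shown; the code covers only the linear projection, which relies on the fact that a linear transform `D` of a Gaussian with mean `mu` and covariance `Sigma` is Gaussian with mean `D mu` and covariance `D Sigma D^T`.

```python
import numpy as np


def delta_weights(w):
    """Regression weights for frame offsets -w..w (assumed delta formula).

    delta_t = sum_{tau=1..w} tau * (x_{t+tau} - x_{t-tau}) / (2 * sum tau^2)
    """
    offsets = np.arange(-w, w + 1)
    norm = 2.0 * np.sum(np.arange(1, w + 1) ** 2)
    return offsets / norm


def static_delta_projection(dim, w):
    """Build D so that D @ window = [static_t; delta_t].

    `window` is the stacked vector [x_{t-w}; ...; x_{t+w}], each of size dim.
    """
    n_frames = 2 * w + 1
    select_centre = np.zeros(n_frames)
    select_centre[w] = 1.0  # picks the centre frame as the static feature
    rows = np.vstack([select_centre, delta_weights(w)])  # shape (2, n_frames)
    return np.kron(rows, np.eye(dim))  # shape (2 * dim, n_frames * dim)


def project_gaussian(mean_window, cov_window, D):
    """Map a Gaussian over the window to one over [static; delta]."""
    return D @ mean_window, D @ cov_window @ D.T


if __name__ == "__main__":
    dim, w = 3, 2
    D = static_delta_projection(dim, w)
    # Placeholder window statistics; in the scheme described above these
    # would be the noise-compensated mean and full covariance of the window.
    mean_window = np.zeros(dim * (2 * w + 1))
    cov_window = np.eye(dim * (2 * w + 1))
    mean_sd, cov_sd = project_gaussian(mean_window, cov_window, D)
    print(mean_sd.shape, cov_sd.shape)
```

Because the window covariance is projected as a whole, correlations between neighbouring frames carry through to the static and delta blocks, which is the property a point-estimate gradient cannot capture.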
