Journal: IEEE Transactions on Audio, Speech, and Language Processing

Noisy Constrained Maximum-Likelihood Linear Regression for Noise-Robust Speech Recognition

Abstract

Adaptive training is a widely used technique for building speech recognition systems on nonhomogeneous training data. Recently, there has been interest in applying these approaches to situations where there are significant levels of background noise in the training data. Various schemes for adaptive training are based on noise- or speaker-specific transforms of features to yield estimates of the clean speech. However, when there are high levels of background noise, these clean speech estimates may be poor, resulting in degradations in performance. In this paper, a new approach for adaptive training on noise-corrupted training data is presented. It extends a popular form of linear transform for model-based adaptation and adaptive training, constrained MLLR (CMLLR), to reflect additional uncertainty from noise-corrupted observations. This new form of adaptation transform is called noisy CMLLR (NCMLLR). NCMLLR uses a modified version of the generative model between clean speech and noisy observation, similar to factor analysis (FA). However, in contrast to FA, here the generative model describes an adaptation transform rather than a covariance matrix structure. The use of NCMLLR for adaptive training using an expectation–maximization approach is described. Discriminative adaptive training with NCMLLR is also described, based on the minimum phone error criterion. Experimental results comparing NCMLLR with standard adaptive training schemes are given on a noise-corrupted version of Resource Management, the ARPA 1994 CSRNAB Spoke 10 task, and in-car recorded data.
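
To make the difference between CMLLR and a transform that "reflects additional uncertainty" more concrete, the following is a minimal Python sketch. It is not taken from the paper. The first function is the usual CMLLR likelihood for a single diagonal-covariance Gaussian, log|det A| + log N(Ay + b; mean, var). The second is only an assumed illustration of the idea: the bias is treated as random with a diagonal variance `bias_var`, which is added to the model variance. The function names, the diagonal-covariance restriction, and the variance-inflation form are all assumptions made for illustration and may differ from the formulation in the paper.

```python
import numpy as np


def gaussian_logpdf_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance `var`."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def cmllr_loglik(y, A, b, mean, var):
    """Standard CMLLR: log|det A| + log N(A y + b; mean, var)."""
    _, logabsdet = np.linalg.slogdet(A)
    return logabsdet + gaussian_logpdf_diag(A @ y + b, mean, var)


def noisy_cmllr_loglik(y, A, b, bias_var, mean, var):
    """Hypothetical noisy variant (assumption, not the paper's exact form):
    the bias has variance `bias_var`, which inflates the model variance."""
    _, logabsdet = np.linalg.slogdet(A)
    return logabsdet + gaussian_logpdf_diag(A @ y + b, mean, var + bias_var)


if __name__ == "__main__":
    dim = 2
    A = np.eye(dim)                 # identity transform for a simple check
    b = np.zeros(dim)
    mean = np.zeros(dim)
    var = np.ones(dim)
    y = np.array([5.0, 5.0])        # observation far from the model mean

    print("CMLLR log-likelihood:      ", cmllr_loglik(y, A, b, mean, var))
    print("Noisy-variant log-likelihood:",
          noisy_cmllr_loglik(y, A, b, 4.0 * np.ones(dim), mean, var))
```

In this sketch, setting `bias_var` to zero recovers the standard CMLLR score, and a positive `bias_var` makes the model less harsh on transformed observations that land far from the mean, which is one way to read "additional uncertainty from noise-corrupted observations".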