Journal: IEEE Transactions on Audio, Speech, and Language Processing
An Improved VTS Feature Compensation using Mixture Models of Distortion and IVN Training for Noisy Speech Recognition

Abstract

In our previous work, we proposed a feature compensation approach using high-order vector Taylor series (VTS) approximation for noisy speech recognition. In this paper, we report new progress on making it more powerful and practical in real applications. First, mixtures of densities are used to enhance the distortion models of both additive noise and convolutional distortion. New formulations for maximum likelihood (ML) estimation of distortion model parameters and minimum mean squared error (MMSE) estimation of clean speech are derived and presented. Second, we improve the feature compensation in both efficiency and accuracy by applying higher-order information of the VTS approximation only to the noisy speech mean parameters, and by applying a temporal smoothing operation to the posterior probabilities of Gaussian mixture components in clean speech estimation. Finally, we design a procedure to perform irrelevant variability normalization (IVN) based joint training of a reference Gaussian mixture model (GMM) for feature compensation and hidden Markov models (HMMs) for acoustic modeling using VTS-based feature compensation. The effectiveness of our proposed approach is confirmed by experiments on the Aurora3 benchmark database for a real-world in-vehicle connected digits recognition task. Compared with the ETSI advanced front-end, our approach achieves significant recognition accuracy improvement across three “training-testing” conditions for four languages.
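
The abstract mentions temporal smoothing of the posterior probabilities of Gaussian mixture components before clean speech estimation. The sketch below is only a rough illustration of that idea and is not the paper's formulation: it assumes a diagonal-covariance GMM over noisy features, a simple moving-average smoother with a hypothetical `half_window` parameter, and a simplified estimator that subtracts a posterior-weighted mean shift rather than using the mixture distortion models or high-order VTS terms. All function and parameter names are made up for illustration.

```python
import math

def gmm_posteriors(frames, weights, means, variances):
    """Per-frame posterior of each diagonal-covariance Gaussian component."""
    posteriors = []
    for y in frames:
        log_scores = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for yd, md, vd in zip(y, mu, var):
                ll += -0.5 * (math.log(2.0 * math.pi * vd) + (yd - md) ** 2 / vd)
            log_scores.append(ll)
        peak = max(log_scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in log_scores]
        total = sum(exps)
        posteriors.append([e / total for e in exps])
    return posteriors

def smooth_posteriors(posteriors, half_window=1):
    """Assumed smoother: moving average over neighbouring frames, renormalised."""
    num_frames = len(posteriors)
    num_components = len(posteriors[0])
    smoothed = []
    for t in range(num_frames):
        lo, hi = max(0, t - half_window), min(num_frames, t + half_window + 1)
        avg = [sum(posteriors[s][k] for s in range(lo, hi)) / (hi - lo)
               for k in range(num_components)]
        total = sum(avg)
        smoothed.append([a / total for a in avg])
    return smoothed

def mmse_clean_estimate(frames, posteriors, noisy_means, clean_means):
    """Simplified estimate: x_hat[t] = y[t] - sum_k p(k|t) * (mu_y[k] - mu_x[k])."""
    estimates = []
    for y, post in zip(frames, posteriors):
        x_hat = []
        for d in range(len(y)):
            shift = sum(p * (my[d] - mx[d])
                        for p, my, mx in zip(post, noisy_means, clean_means))
            x_hat.append(y[d] - shift)
        estimates.append(x_hat)
    return estimates

# Toy one-dimensional example with two components (all values are invented).
frames = [[0.2], [0.4], [3.1], [2.8]]
weights = [0.5, 0.5]
noisy_means = [[0.5], [3.0]]
variances = [[1.0], [1.0]]
clean_means = [[0.0], [2.0]]

post = smooth_posteriors(gmm_posteriors(frames, weights, noisy_means, variances))
clean = mmse_clean_estimate(frames, post, noisy_means, clean_means)
```

Smoothing the posteriors before forming the estimate keeps the component weights from jumping abruptly between adjacent frames; the window length here is arbitrary and would need tuning in any real system.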