An Improved VTS Feature Compensation using Mixture Models of Distortion and IVN Training for Noisy Speech Recognition

Du J.; Huo Q.

Abstract

In our previous work, we proposed a feature compensation approach using high-order vector Taylor series (VTS) approximation for noisy speech recognition. In this paper, we report new progress on making it more powerful and practical in real applications. First, mixtures of densities are used to enhance the distortion models of both additive noise and convolutional distortion. New formulations for maximum likelihood (ML) estimation of distortion model parameters and minimum mean squared error (MMSE) estimation of clean speech are derived. Second, we improve both the efficiency and the accuracy of feature compensation by applying higher-order information of the VTS approximation only to the noisy speech mean parameters, and by applying a temporal smoothing operation to the posterior probabilities of the Gaussian mixture components in clean speech estimation. Finally, we design a procedure to perform irrelevant variability normalization (IVN) based joint training of a reference Gaussian mixture model (GMM) for feature compensation and hidden Markov models (HMMs) for acoustic modeling using VTS-based feature compensation. The effectiveness of our proposed approach is confirmed by experiments on the Aurora3 benchmark database for a real-world in-vehicle connected digits recognition task. Compared with the ETSI advanced front-end, our approach achieves significant recognition accuracy improvement across three “training-testing” conditions for four languages.
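
The abstract names three ingredients without giving formulas: a distortion model with additive noise and a convolutional (channel) term, temporal smoothing of Gaussian component posteriors, and an MMSE clean-speech estimate. The sketch below is only an illustration under stated assumptions and is not the authors' formulation. It assumes the log-spectral mismatch function commonly used in the VTS literature, y = x + h + log(1 + exp(n - x - h)), a plain moving average as the smoothing operation, and a posterior-weighted sum as the MMSE estimate. All function names and the `half_width` parameter are hypothetical.

```python
import math

def distort(x, n, h):
    """Assumed log-spectral mismatch function (not taken from the abstract):
    y = x + h + log(1 + exp(n - x - h)), applied per dimension.
    x: clean speech, n: additive noise, h: channel. No overflow guard."""
    return [xi + hi + math.log1p(math.exp(ni - xi - hi))
            for xi, ni, hi in zip(x, n, h)]

def smooth_posteriors(post, half_width=1):
    """Moving-average smoothing over frames (an assumed choice of smoother).
    post[t][k] is the posterior of component k at frame t.
    Each output row is renormalized so it sums to 1."""
    num_frames = len(post)
    num_comps = len(post[0])
    smoothed = []
    for t in range(num_frames):
        lo = max(0, t - half_width)
        hi = min(num_frames, t + half_width + 1)
        avg = [sum(post[s][k] for s in range(lo, hi)) / (hi - lo)
               for k in range(num_comps)]
        total = sum(avg)
        smoothed.append([a / total for a in avg])
    return smoothed

def mmse_estimate(post_t, cond_means):
    """Posterior-weighted sum of per-component clean-speech estimates
    for a single frame. cond_means[k] is the estimate under component k."""
    dim = len(cond_means[0])
    return [sum(p * m[d] for p, m in zip(post_t, cond_means))
            for d in range(dim)]
```

With equal clean speech and noise levels and a flat channel, `distort([0.0], [0.0], [0.0])` gives log 2 in each dimension, which is the expected 3 dB rise in power expressed in natural-log units.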

Bibliographic details

Source
Audio, Speech, and Language Processing, IEEE Transactions on | 2014, Issue 11 | pp. 1601-1611 | 11 pages
Authors
Du J.; Huo Q.
Affiliation
National Engineering Laboratory for Speech and Language Information Processing (NEL-SLIP), University of Science and Technology of China, Hefei, P. R. China
Original format
PDF
Language
English
Keywords
Approximation methods; Estimation; Hidden Markov models; Nonlinear distortion; Speech; Training; Vectors; Feature compensation; irrelevant variability normalization; mixture model of distortion; noisy speech recognition; vector Taylor series

Similar literature

1. A Feature Compensation Approach Using High-Order Vector Taylor Series Approximation of an Explicit Distortion Model for Noisy Speech Recognition [J]. Du J., Huo Q. Audio, Speech, and Language Processing, IEEE Transactions on. 2011, Issue 8
2. Model Compensation Approach Based on Nonuniform Spectral Compression Features for Noisy Speech Recognition [J]. Geng-Xin Ning, Gang Wei, Kam-Keung Chu. EURASIP journal on advances in signal processing. 2007, Issue 1
3. A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition [J]. Xiaodong Cui, Yifan Gong. IEEE transactions on audio, speech and language processing. 2007, Issue 4
4. IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition [C]. Jun Du, Qiang Huo. Annual conference of the International Speech Communication Association. 2012
5. Compensation for Nonlinear Distortion in Noise for Robust Speech Recognition [D]. Harvilla, Mark J. 2014
6. Particle Swarm Optimization Based Feature Enhancement and Feature Selection for Improved Emotion Recognition in Speech and Glottal Signals [O]. Hariharan Muthusamy, Kemal Polat, Sazali Yaacob
7. Model Compensation Approach Based on Nonuniform Spectral Compression Features for Noisy Speech Recognition [O]. Geng-Xin Ning, Gang Wei, Kam-Keung Chu. 2007
