Journal: IEEE Transactions on Audio, Speech, and Language Processing

Advances in Missing Feature Techniques for Robust Large-Vocabulary Continuous Speech Recognition



Abstract

Missing feature theory (MFT) has demonstrated great potential for improving the noise robustness of speech recognition. MFT has mostly been applied in the log-spectral domain, since this is the representation in which the reliability masks have a simple formulation. However, with diagonal covariance matrices in the log-spectral domain, recognition performance can only be maintained at the cost of drastically increasing the number of Gaussians. This paper shows that MFT can be applied to static and dynamic features in any feature domain that is a linear transform of the log-spectra. A crucial part of an MFT system is the computation of reliability masks from noisy data. The proposed system operates either on binary masks, where hard decisions are made about the reliability of the data, or on fuzzy masks, which use a soft decision criterion. Real-life deployments also require compensation for convolutional noise. Channel compensation in speech recognition typically involves estimating an additive shift in the log-spectral or cepstral domain. Because some features are considered unreliable, a maximum-likelihood estimation technique is integrated into the back-end recognition process of the MFT system to estimate the channel. The resulting MFT-based recognizer can therefore handle both additive and convolutional noise, and shows promising results on the Aurora4 large-vocabulary database.
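The distinction between binary and fuzzy masks, and the idea of estimating an additive channel shift from reliable features only, can be illustrated with a small sketch. This is not the paper's method: the abstract does not specify the mask estimator or the back-end estimation procedure. The function names (`binary_mask`, `fuzzy_mask`, `masked_channel_shift`), the SNR threshold, the sigmoid form and its `slope`, and the single aligned diagonal Gaussian per frame are all assumptions made for illustration.

```python
import numpy as np

def binary_mask(snr_db, threshold_db=0.0):
    """Hard decision (assumed rule): 1.0 where the local SNR estimate
    exceeds the threshold, 0.0 otherwise."""
    return (np.asarray(snr_db) > threshold_db).astype(float)

def fuzzy_mask(snr_db, threshold_db=0.0, slope=0.5):
    """Soft decision (assumed rule): sigmoid of the local SNR estimate,
    giving a reliability score strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-slope * (np.asarray(snr_db) - threshold_db)))

def masked_channel_shift(x, mu, var, mask):
    """Mask-weighted estimate of a per-channel additive shift h.

    Simplified model (an assumption, not the paper's back-end):
    each frame t has an aligned diagonal Gaussian with mean mu[t] and
    variance var[t], and reliable features satisfy x[t] ~ N(mu[t] + h, var[t]).
    Maximizing the mask-weighted log-likelihood over h gives
        h = sum_t w[t] * (x[t] - mu[t]) / sum_t w[t],  w[t] = mask[t] / var[t].
    All arrays have shape (frames, channels); returns shape (channels,).
    """
    w = mask / var
    num = (w * (x - mu)).sum(axis=0)
    den = w.sum(axis=0)
    # Channels with no reliable evidence get a zero shift.
    return np.where(den > 0, num / np.maximum(den, 1e-12), 0.0)
```

With a binary mask, unreliable features drop out of the estimate entirely; with a fuzzy mask, they contribute in proportion to their reliability score. In a full recognizer the means and variances would come from the acoustic model during decoding, which this sketch replaces with fixed per-frame values.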
