Annual Conference of the International Speech Communication Association (INTERSPEECH 2011)

Feature Normalization Using Structured Full Transforms for Robust Speech Recognition


Abstract

Classical mean and variance normalization (MVN) uses a diagonal transform and a bias vector to normalize the mean and variance of noisy features to reference values. Because MVN uses a diagonal transform, it ignores the correlation between feature dimensions. A full transform can exploit feature correlation, but its large number of parameters may not be estimated reliably from a short observation, e.g., a single utterance. We propose a novel structured full transform that has the same number of free parameters as a diagonal transform while still capturing the correlation between feature dimensions. The proposed structured transform can be estimated reliably from one utterance by maximizing the likelihood of the normalized features on a reference Gaussian mixture model. Experimental results on the Aurora-4 task show that the structured transform produces consistently better speech recognition results than the diagonal transform and also outperforms the advanced front-end (AFE) feature extractor.
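
As a rough illustration of the baseline the abstract starts from, the sketch below applies per-utterance MVN as a diagonal scale plus a bias vector and compares the parameter counts of a diagonal and an unconstrained full transform. It does not implement the proposed structured transform, whose form the abstract does not specify. The function names `mvn` and `count_params`, the default reference values (zero mean, unit variance), and the toy feature values are assumptions made for illustration only.

```python
import statistics


def mvn(frames, ref_mean=None, ref_std=None, eps=1e-8):
    """Per-utterance MVN: y[d] = scale[d] * x[d] + bias[d] for each dimension d.

    frames is a list of feature vectors (one list of floats per frame).
    The reference values default to zero mean and unit variance (an assumption).
    """
    dim = len(frames[0])
    if ref_mean is None:
        ref_mean = [0.0] * dim
    if ref_std is None:
        ref_std = [1.0] * dim

    columns = list(zip(*frames))  # one tuple of values per feature dimension
    scale, bias = [], []
    for d, col in enumerate(columns):
        mean = statistics.fmean(col)
        std = max(statistics.pstdev(col), eps)  # guard against zero variance
        a = ref_std[d] / std                    # diagonal entry of the transform
        scale.append(a)
        bias.append(ref_mean[d] - a * mean)     # bias that moves the mean to ref_mean

    normalized = [
        [scale[d] * frame[d] + bias[d] for d in range(dim)] for frame in frames
    ]
    return normalized, scale, bias


def count_params(dim):
    """Free parameters of a diagonal vs. an unconstrained full affine transform."""
    return {"diagonal": 2 * dim, "full": dim * dim + dim}


# Toy utterance with three frames and two feature dimensions (made-up values).
utterance = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
normalized, scale, bias = mvn(utterance)
print(normalized)
print(count_params(13))
```

Each dimension is scaled and shifted on its own here, which is why correlation between dimensions is ignored. A full transform would replace the per-dimension scale with a dense matrix, and its parameter count grows quadratically with the feature dimension.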
