Maximum Likelihood Normalization for Robust Speech Recognition

Abstract

It is well known that additive and channel noise cause shifts and scaling in MFCC features. Empirical normalization techniques that estimate and compensate for these effects, such as cepstral mean subtraction and variance normalization, have been shown to be useful. However, these empirical estimates may not be optimal. In this paper, we approach the problem from two directions: 1) we use more robust MFCC-based features that are less sensitive to additive and channel noise, and 2) we propose a maximum likelihood (ML) based approach to compensate for the noise effect. In addition, we propose the use of multi-class normalization, in which different normalization factors can be applied to different phonetic units. The combination of the robust features and ML normalization is particularly useful for the highly mismatched condition in the Aurora 3 corpus, resulting in a 15.8% relative improvement in the highly mismatched case and a 10.4% relative improvement on average over the three conditions.
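The empirical baseline named in the abstract, cepstral mean subtraction with variance normalization, can be sketched as follows. This is a minimal per-utterance illustration only, not the paper's ML-based method: the function name `cmvn`, the `eps` guard, the 13-coefficient feature size, and the synthetic shift and scale values are all assumptions made for the example.

```python
import numpy as np


def cmvn(features, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    features: array of shape (num_frames, num_coeffs), e.g. MFCC frames.
    eps is a small guard against division by zero (an assumption of this sketch).
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)


# Synthetic stand-in for MFCC frames: 200 frames, 13 coefficients (hypothetical sizes).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))

# Model the noise effect as a per-coefficient scaling and shift of the features.
scale = rng.uniform(0.5, 2.0, size=13)
shift = rng.normal(size=13)
noisy = clean * scale + shift

# After normalization, the shifted and scaled features line up with the clean ones.
max_diff = np.abs(cmvn(clean) - cmvn(noisy)).max()
print(f"max difference after CMVN: {max_diff:.2e}")
```

In this idealized setting the shift and scaling are constant over the utterance, so the empirical statistics remove them almost exactly. With real noise the distortion is not a clean affine map, which is the gap the abstract says the empirical estimates leave open.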
