Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Minimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features–A Theoretically Consistent Approach

Abstract

In this work, we consider the problem of feature enhancement for noise-robust automatic speech recognition (ASR). We propose a method for minimum mean-square error (MMSE) estimation of mel-frequency cepstral features, which is based on a minimum number of well-established, theoretically consistent statistical assumptions. More specifically, the method belongs to the class of methods relying on the statistical framework proposed in Ephraim and Malah’s original work (“Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 6, 1984). The method is general in that it allows MMSE estimation of mel-frequency cepstral coefficients (MFCCs), cepstral-mean subtracted (CMS) MFCCs, autoregressive-moving-average (ARMA)-filtered CMS-MFCCs, velocity coefficients, and acceleration coefficients. In addition, the method is easily modified to accommodate compressive non-linearities other than the logarithm traditionally used for MFCC computation. In terms of MFCC estimation performance, as measured by MFCC mean-square error, the proposed method performs as well as or better than other state-of-the-art methods. In terms of ASR performance, no statistical difference could be found between the proposed method and the state-of-the-art methods. We conclude that existing state-of-the-art MFCC feature enhancement algorithms within this class of algorithms, while theoretically suboptimal or based on theoretically inconsistent assumptions, perform close to optimally in the MMSE sense.
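One way to see why a single framework can plausibly cover MFCCs, CMS-MFCCs, and velocity coefficients at once is that, in a conventional front end, each of these is a linear function of the log-mel energies, and a conditional mean (the MMSE estimate) commutes with linear maps. The sketch below is only an illustration under assumed definitions, not the authors' implementation: the unnormalized DCT-II, the two-point central difference used for velocity, and all function names (`dct_matrix`, `mfcc_from_logmel`, `cepstral_mean_subtract`, `velocity`, `feature_chain`) are hypothetical choices made for this example.

```python
import math

def dct_matrix(num_ceps, num_bands):
    """Unnormalized DCT-II matrix (assumed form) from log-mel bands to cepstra."""
    return [[math.cos(math.pi * k * (m + 0.5) / num_bands)
             for m in range(num_bands)]
            for k in range(num_ceps)]

def mfcc_from_logmel(logmel_frames, num_ceps):
    """Apply the DCT to each frame of log-mel energies."""
    num_bands = len(logmel_frames[0])
    dct = dct_matrix(num_ceps, num_bands)
    return [[sum(dct[k][m] * frame[m] for m in range(num_bands))
             for k in range(num_ceps)]
            for frame in logmel_frames]

def cepstral_mean_subtract(ceps):
    """Subtract the per-coefficient mean taken over all frames (CMS)."""
    num_frames = len(ceps)
    num_ceps = len(ceps[0])
    means = [sum(frame[k] for frame in ceps) / num_frames
             for k in range(num_ceps)]
    return [[frame[k] - means[k] for k in range(num_ceps)] for frame in ceps]

def velocity(ceps):
    """Central-difference velocity coefficients; edge frames are clamped."""
    n = len(ceps)
    return [[(ceps[min(t + 1, n - 1)][k] - ceps[max(t - 1, 0)][k]) / 2.0
             for k in range(len(ceps[0]))]
            for t in range(n)]

def feature_chain(logmel_frames, num_ceps):
    """DCT, then CMS, then velocity: every stage is linear in the input."""
    return velocity(cepstral_mean_subtract(
        mfcc_from_logmel(logmel_frames, num_ceps)))
```

Because every stage is linear, applying `feature_chain` to the average of two log-mel sequences gives the average of the two outputs. The same property is what would let an MMSE estimate of the log-mel energies pass through these stages unchanged; the non-linear step (the logarithm, or another compressive function) sits before this chain and is not modeled here.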
