Journal: IEEE Transactions on Audio, Speech, and Language Processing

Speech Enhancement Based on Generalized Minimum Mean Square Error Estimators and Masking Properties of the Auditory System


Abstract

This paper studies the family of conditional minimum mean square error (MMSE) spectral estimators of the form $\left(E\left(X_p^{\alpha} / \vert X_p + D_p \vert\right)\right)^{1/\alpha}$, where $X_p$ is the clean speech spectrum and $D_p$ is the noise spectrum; this family is referred to as the generalized MMSE (GMMSE) estimator. The tradeoff between the degree of noise suppression and musical tone artifacts is examined for these estimators, as is the tradeoff in selecting $\alpha$ across noise spectral structure and signal-to-noise ratio (SNR) level. Members of the family include the Ephraim–Malah (EM) amplitude estimator and, for high SNRs, the Wiener filter. The colorless residual noise observed in the EM estimator is shown to be a characteristic of this general family. The estimators are then applied in an auditory enhancement scheme that uses the masking threshold of the human auditory system, yielding the GMMSE-auditory masking threshold (GMMSE-AMT) enhancement method. Finally, the proposed algorithms are evaluated in detail on the phonetically balanced TIMIT database and the National Gallery of the Spoken Word (NGSW) audio archive using subjective and objective speech quality measures. A detailed phoneme-based objective quality analysis shows that the proposed GMMSE-AMT outperforms the MMSE and log-MMSE enhancement methods.
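To make the estimator form in the abstract concrete, the following is a minimal sketch of a gain function for $(E(X_p^{\alpha} \mid \text{noisy observation}))^{1/\alpha}$. It rests on assumptions that the abstract does not state: the slash in the formula is read as conditioning, $X_p$ is taken to be the spectral amplitude, and speech and noise coefficients are modeled as independent complex Gaussians (the usual Ephraim–Malah setting), so that the posterior amplitude is Rician and its moments have a standard closed form. The names `gmmse_gain`, `xi` (a priori SNR), and `gamma_post` (a posteriori SNR) are hypothetical, and this may differ from the derivation in the paper itself.

```python
import math


def hyp1f1(a, b, z, tol=1e-15, max_terms=2000):
    """Confluent hypergeometric function 1F1(a; b; z) via its power series.

    Only used here with a > 0, b > 0, z >= 0, where all terms are positive.
    """
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= (a + n - 1) / (b + n - 1) * z / n
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def gmmse_gain(alpha, xi, gamma_post):
    """Gain G with G * |Y| = (E[A^alpha | Y])^(1/alpha), for alpha > 0.

    Assumption: A | Y is Rician (complex Gaussian speech and noise).
    xi is the a priori SNR, gamma_post the a posteriori SNR.
    """
    v = xi / (1.0 + xi) * gamma_post
    # Rician moment: E[A^alpha | Y] = lam^(alpha/2) * Gamma(1 + alpha/2)
    #                                 * 1F1(-alpha/2; 1; -v)
    # Kummer's transformation gives an all-positive series:
    # 1F1(-alpha/2; 1; -v) = exp(-v) * 1F1(1 + alpha/2; 1; v)
    a = 1.0 + alpha / 2.0
    moment = math.gamma(a) * math.exp(-v) * hyp1f1(a, 1.0, v)
    # sqrt(lam) / |Y| simplifies to sqrt(v) / gamma_post
    return math.sqrt(v) / gamma_post * moment ** (1.0 / alpha)


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, gmmse_gain(alpha, xi=1.0, gamma_post=2.0))
```

Under these assumptions, `alpha = 1` gives the amplitude (EM-type) gain, and for large `xi` the gain approaches the Wiener gain `xi / (1 + xi)`, in line with the special cases named in the abstract. The series evaluation is only meant for moderate values of `v`; very large values would overflow and need an asymptotic form instead.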
