Journal: IEEE Transactions on Audio, Speech and Language Processing

Improved Signal-to-Noise Ratio Estimation for Speech Enhancement

Abstract

This paper addresses the problem of single-microphone speech enhancement in noisy environments. State-of-the-art short-time noise reduction techniques are most often expressed as a spectral gain that depends on the signal-to-noise ratio (SNR). The well-known decision-directed (DD) approach drastically limits the level of musical noise, but the estimated a priori SNR is biased because it depends on the speech spectrum estimate from the previous frame. The gain function therefore matches the previous frame rather than the current one, which degrades noise reduction performance. The consequence of this bias is an annoying reverberation effect. We propose a two-step noise reduction (TSNR) technique that solves this problem while maintaining the benefits of the decision-directed approach. The a priori SNR estimate is refined by a second step that removes the bias of the DD approach, and with it the reverberation effect. However, classic short-time noise reduction techniques, including TSNR, introduce harmonic distortion in the enhanced speech because the estimators are unreliable at small signal-to-noise ratios. This is mainly due to the difficulty of estimating the noise power spectral density (PSD) in single-microphone schemes. To overcome this problem, we propose a method called harmonic regeneration noise reduction (HRNR). A nonlinearity is used to regenerate the degraded harmonics of the distorted signal in an efficient way. The resulting artificial signal is used to refine the a priori SNR, which in turn yields a spectral gain able to preserve the speech harmonics. These methods are analyzed, and results of objective tests and formal subjective tests comparing HRNR and TSNR are provided. HRNR brings a significant improvement over TSNR thanks to the preservation of harmonics.
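The abstract gives no formulas, so the following is only a minimal sketch of the DD estimator and a two-step refinement in their commonly cited textbook form, not the authors' implementation. It assumes a Wiener gain, a smoothing factor `beta = 0.98`, and per-bin power values passed as plain Python lists; the function names are hypothetical. HRNR is not covered here.

```python
def wiener_gain(prior_snr):
    """Wiener spectral gain for one frequency bin (assumed gain rule)."""
    return prior_snr / (1.0 + prior_snr)


def dd_prior_snr(noisy_power, noise_power, prev_clean_power, beta=0.98):
    """Decision-directed a priori SNR for one frame, one value per bin.

    noisy_power:      |X(p, k)|^2 of the current noisy frame
    noise_power:      estimated noise PSD per bin
    prev_clean_power: |S_hat(p-1, k)|^2 from the previous enhanced frame
    """
    snr = []
    for x2, n2, s2_prev in zip(noisy_power, noise_power, prev_clean_power):
        post_snr = x2 / n2  # a posteriori SNR of the current frame
        # Heavy weight on the previous frame is what causes the bias.
        snr.append(beta * s2_prev / n2 + (1.0 - beta) * max(post_snr - 1.0, 0.0))
    return snr


def tsnr_gain(noisy_power, noise_power, prev_clean_power, beta=0.98):
    """Two-step gain: DD estimate first, then re-estimate the a priori SNR
    from the current frame after applying the first-step gain."""
    dd_snr = dd_prior_snr(noisy_power, noise_power, prev_clean_power, beta)
    gains = []
    for x2, n2, xi in zip(noisy_power, noise_power, dd_snr):
        g_dd = wiener_gain(xi)                 # step 1: DD-based gain
        refined_snr = (g_dd ** 2) * x2 / n2    # step 2: refined a priori SNR
        gains.append(wiener_gain(refined_snr))
    return gains


# Speech onset in a single bin: silence in the previous frame, strong
# speech in the current one (noise power normalized to 1).
onset_noisy, onset_noise, onset_prev = [101.0], [1.0], [0.0]
dd_gain = wiener_gain(dd_prior_snr(onset_noisy, onset_noise, onset_prev)[0])
two_step_gain = tsnr_gain(onset_noisy, onset_noise, onset_prev)[0]
```

In this onset example the DD gain is roughly 0.67, because the estimate still leans on the silent previous frame, while the two-step gain is roughly 0.98, which is closer to what the current frame's high SNR warrants.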
