Conference: International Conference on Latent Variable Analysis and Signal Separation

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments


Abstract

We propose a joint framework that combines speech enhancement (SE) and voice activity detection (VAD) to improve speech intelligibility in low signal-to-noise ratio (SNR) environments. Deep neural networks (DNNs) have recently been adopted successfully as regression models for SE. Nonetheless, performance in harsh environments is not always satisfactory, because noise energy often dominates certain speech segments and causes speech distortion. Based on an analysis of frame-level SNR information in the training set, our approach consists of two steps: (1) a DNN-based VAD model is trained to generate frame-level speech/non-speech probabilities; and (2) the final enhanced speech features are obtained as a weighted sum of the estimated clean speech features, processed by incorporating the VAD information. Experimental results demonstrate that the proposed SE approach improves short-time objective intelligibility (STOI) by 0.161 and perceptual evaluation of speech quality (PESQ) by 0.333 over the already-good SE baseline systems at -5 dB SNR in babble noise.
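The abstract does not say which estimates enter the weighted sum in step (2) or how the weights are computed. As a minimal sketch only, the Python example below assumes two hypothetical per-frame feature estimates (`est_speech` and `est_nonspeech`, names invented here) and uses the frame-level VAD speech probability as the mixing weight. It illustrates the general idea of a VAD-weighted sum and is not the authors' exact formulation.

```python
import numpy as np


def vad_weighted_sum(est_speech, est_nonspeech, p_speech):
    """Blend two frame-level feature estimates with VAD speech probabilities.

    Assumed (hypothetical) inputs, not taken from the paper:
      est_speech    : (frames, dims) features favoured when speech is present
      est_nonspeech : (frames, dims) features favoured when speech is absent
      p_speech      : (frames,) speech probability per frame, in [0, 1]
    """
    est_speech = np.asarray(est_speech, dtype=float)
    est_nonspeech = np.asarray(est_nonspeech, dtype=float)
    # Clip to [0, 1] and add a trailing axis so the weight broadcasts over dims.
    p = np.clip(np.asarray(p_speech, dtype=float), 0.0, 1.0)[:, None]

    if est_speech.shape != est_nonspeech.shape:
        raise ValueError("the two estimates must have the same shape")
    if est_speech.shape[0] != p.shape[0]:
        raise ValueError("need exactly one VAD probability per frame")

    # Per-frame convex combination of the two estimates.
    return p * est_speech + (1.0 - p) * est_nonspeech
```

With this weighting, a frame whose speech probability is close to 1 follows `est_speech`, a frame close to 0 follows `est_nonspeech`, and uncertain frames receive a proportional mix of the two.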
