Conference: IEEE International Conference on Rebooting Computing

Speech Enhancement With Deep Neural Networks Using MoG Based Labels

Abstract

In this paper we present a mixture of Gaussians-deep neural network (MoG-DNN) algorithm for single-microphone speech enhancement. We combine the generative mixture of Gaussians (MoG) model with the discriminative deep neural network (DNN). The proposed algorithm consists of two phases: a training phase and a test phase. In the training phase, the clean speech power spectral density (PSD) is modeled as a MoG, which provides an unsupervised clustering of the speech signal. The database is then labeled to fit the given MoG, and a DNN is trained to classify noisy time-frame features into one of the Gaussians of the already inferred MoG. In the test phase, a speech presence probability (SPP) is obtained from the classification results. Using the SPP, soft spectral subtraction is then applied while the noise statistics are updated simultaneously. The generative, unsupervised MoG can be applied to any unknown database and also preserves the speech spectral structure. Furthermore, the discriminative DNN maintains the continuity of the speech. An experimental study shows that the proposed algorithm achieves higher objective measurement scores than other speech enhancement algorithms.
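
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names, the diagonal-covariance Gaussians on log-PSD features, the rule that a Gaussian counts as "speech active" in a frequency bin when its mean exceeds the noise level, the attenuation constant `beta`, and the smoothing factor `alpha` are all hypothetical. The class posteriors that a trained DNN would supply are passed in as a plain array.

```python
import numpy as np

def label_frames(log_psd, means, variances, weights):
    """Training phase: assign each clean frame to its most likely Gaussian.

    log_psd: (T, F) clean log-PSD frames; means, variances: (K, F); weights: (K,).
    Diagonal covariances are an assumption of this sketch.
    """
    diff = log_psd[:, None, :] - means[None, :, :]              # (T, K, F)
    log_lik = -0.5 * np.sum(diff ** 2 / variances
                            + np.log(2 * np.pi * variances), axis=2)
    return np.argmax(log_lik + np.log(weights), axis=1)         # (T,) labels

def speech_presence(posteriors, means, noise_log_psd):
    """Test phase: per-bin SPP as a posterior-weighted vote.

    posteriors: (T, K) class probabilities (from the DNN in the paper's setting).
    A Gaussian is treated as speech-active in a bin when its mean exceeds
    the current noise estimate (hypothetical rule, not from the source).
    """
    active = (means > noise_log_psd).astype(float)               # (K, F)
    return posteriors @ active                                   # (T, F)

def soft_subtract(noisy_log_psd, spp, beta):
    """Attenuate each bin by beta, scaled by the speech absence probability."""
    return noisy_log_psd - (1.0 - spp) * beta

def update_noise(noise_log_psd, noisy_frame, spp_frame, alpha=0.9):
    """Recursively update the noise estimate, faster where speech is unlikely."""
    w = (1.0 - alpha) * (1.0 - spp_frame)
    return (1.0 - w) * noise_log_psd + w * noisy_frame

# Toy example: two Gaussians over two frequency bins.
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
weights = np.array([0.5, 0.5])

labels = label_frames(np.array([[0.1, -0.2], [4.8, 5.3]]), means, variances, weights)

posteriors = np.array([[1.0, 0.0], [0.25, 0.75]])   # stand-in for DNN outputs
noise = np.array([1.0, 1.0])
spp = speech_presence(posteriors, means, noise)

noisy = np.array([[2.0, 2.0], [6.0, 6.0]])
enhanced = soft_subtract(noisy, spp, beta=4.0)
noise_next = update_noise(noise, noisy[0], spp[0])
```

In this toy setup the first frame is judged to be noise only, so it receives the full attenuation and also pulls the noise estimate toward its own level, while the second frame is mostly preserved. A bin with an SPP of 1 would leave the noise estimate unchanged.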