Journal: IEICE Transactions on Information and Systems

A Multi-Task Scheme for Supervised DNN-Based Single-Channel Speech Enhancement by Using Speech Presence Probability as the Secondary Training Target



Abstract

To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) have been investigated for estimating different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation. A mask such as the Wiener gain can be estimated directly, or derived from the estimated interference power spectral density (PSD) or the estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target that assists estimation of the main-task target. Domain-specific information is shared between the two tasks so that the network learns a more generalizable representation. Since the performance of a multi-task network is sensitive to the weights of the individual loss terms, homoscedastic uncertainty is introduced to learn these weights adaptively, which is shown to outperform fixed weighting. Simulation results show that the proposed multi-task scheme improves overall speech enhancement performance compared to conventional single-task methods, and that joint direct mask and SPP estimation yields the best performance among all the considered techniques.
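
The abstract names two mechanisms that a small example may make more concrete: deriving a Wiener gain from an SIR or interference PSD estimate, and weighting two task losses by homoscedastic uncertainty. The sketch below is illustrative only and is not taken from the paper. It assumes the textbook Wiener gain G = SIR / (1 + SIR) with SIR on a linear scale, and a commonly used uncertainty-weighting form in which each task i has a learnable log-variance s_i and contributes 0.5 * exp(-s_i) * L_i + 0.5 * s_i to the total loss. All function and parameter names are hypothetical, and the paper's exact loss formulation may differ.

```python
import math


def wiener_gain_from_sir(sir):
    """Textbook Wiener gain for a linear-scale (not dB) SIR: G = SIR / (1 + SIR)."""
    return sir / (1.0 + sir)


def wiener_gain_from_psd(noisy_psd, interference_psd, floor=0.0):
    """Wiener gain from an interference PSD estimate.

    Assumes speech PSD = noisy PSD - interference PSD, so
    G = 1 - interference_psd / noisy_psd, clipped to [floor, 1].
    """
    if noisy_psd <= 0.0:
        return floor
    gain = 1.0 - interference_psd / noisy_psd
    return min(1.0, max(floor, gain))


def uncertainty_weighted_loss(main_loss, spp_loss, log_var_main, log_var_spp):
    """Combine two task losses using learnable log-variances s_i = log(sigma_i^2).

    total = sum_i [0.5 * exp(-s_i) * L_i + 0.5 * s_i]
    The 0.5 * s_i term penalizes large variances, so the network cannot
    shrink the total loss by simply down-weighting every task.
    (Assumed form; in training, s_i would be optimized with the network.)
    """
    total = 0.0
    for loss, log_var in ((main_loss, log_var_main), (spp_loss, log_var_spp)):
        total += 0.5 * math.exp(-log_var) * loss + 0.5 * log_var
    return total
```

In this assumed form, for a fixed task loss L the per-task term is minimized at s = log(L), so a task with a larger loss receives a smaller weight exp(-s). In an actual training setup the two log-variances would be trainable parameters updated by the same optimizer as the network weights, rather than values passed in by hand.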
