A Multi-Task Scheme for Supervised DNN-Based Single-Channel Speech Enhancement by Using Speech Presence Probability as the Secondary Training Target

Lei WANG; Jie ZHU; Kangbo SUN


Abstract

To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) have been investigated for estimating different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation. Further, a mask such as the Wiener gain can be estimated directly or derived from the estimated interference power spectral density (PSD) or the estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target to assist the target estimation in the main task. Domain-specific information is shared between the two tasks to learn a more generalizable representation. Since the performance of a multi-task network is sensitive to the weight parameters of the loss function, homoscedastic uncertainty is introduced to learn the weights adaptively, which is shown to outperform the fixed weighting method. Simulation results show that the proposed multi-task scheme improves overall speech enhancement performance compared to the conventional single-task methods, and that joint direct mask and SPP estimation yields the best performance among all the considered techniques.
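The abstract does not give the loss formulation, so the following is only a minimal sketch of how an uncertainty-weighted two-task loss could look, assuming the commonly used regression form of homoscedastic uncertainty weighting, in which each task loss L_i is scaled by 1/(2*sigma_i^2) and a log(sigma_i) penalty is added. The function names, the use of mean squared error for both tasks, and the toy values are illustrative assumptions, not the authors' implementation. A helper for the standard Wiener-type gain SIR / (1 + SIR) is included to illustrate deriving a mask from an estimated SIR.

```python
import math


def wiener_gain_from_sir(sir):
    """Wiener-type gain G = SIR / (1 + SIR), with SIR on a linear (not dB) scale."""
    return sir / (1.0 + sir)


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def uncertainty_weighted_loss(task_losses, log_vars):
    """Sum of 0.5 * exp(-s_i) * L_i + 0.5 * s_i, where s_i = log(sigma_i^2).

    In training, each s_i would be a learnable parameter (assumption:
    one scalar per task), so the task weights adapt during optimization.
    """
    return sum(0.5 * math.exp(-s) * loss + 0.5 * s
               for loss, s in zip(task_losses, log_vars))


# Toy values for two time-frequency bins (illustrative only).
mask_loss = mse([0.8, 0.3], [1.0, 0.0])   # main task: mask estimation
spp_loss = mse([0.9, 0.2], [1.0, 0.0])    # secondary task: SPP estimation
total = uncertainty_weighted_loss([mask_loss, spp_loss], log_vars=[0.0, 0.0])
```

With both log-variances at zero, each task simply receives weight 0.5. For a fixed task loss L_i, the corresponding term is minimized at s_i = log(L_i), so a task with a larger loss is automatically down-weighted, which is what replaces hand-tuned fixed weights in this kind of scheme.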


Bibliographic information

Source: IEICE Transactions on Information and Systems, 2021, No. 11, 8 pages
Authors: Lei WANG; Jie ZHU; Kangbo SUN
Original format: PDF
Keywords: multi-task learning; supervised deep neural network; speech presence probability; dereverberation; noise reduction
Date indexed: 2022-08-19 03:23:07

