Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement


Abstract

Many deep learning-based speech enhancement algorithms are designed to minimize the mean-square error (MSE) in some transform domain between a predicted and a target speech signal. However, optimizing for MSE does not necessarily guarantee high speech quality or intelligibility, which is the ultimate goal of many speech enhancement algorithms. Additionally, little is known about the impact of the loss function on the emerging class of time-domain deep learning-based speech enhancement systems. We study how popular loss functions influence the performance of time-domain deep learning-based speech enhancement systems. First, we demonstrate that perceptually inspired loss functions might be advantageous over classical loss functions like MSE. Furthermore, we show that the learning rate is a crucial design parameter even for adaptive gradient-based optimizers, which has been generally overlooked in the literature. Also, we find that waveform-matching performance metrics must be used with caution because they can fail completely in certain situations. Finally, we show that a loss function based on scale-invariant signal-to-distortion ratio (SI-SDR) achieves good general performance across a range of popular speech enhancement evaluation metrics, which suggests that SI-SDR is a good candidate as a general-purpose loss function for speech enhancement systems.
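
To make the contrast between MSE and SI-SDR concrete, the following is a minimal sketch in plain Python, assuming the commonly used definition of SI-SDR (project the estimate onto the target, then take the energy ratio of the projection to the residual, in dB). It is not the authors' implementation; the function names, the `eps` stabilizer, and the synthetic signals are assumptions made for illustration only.

```python
import math


def mse(estimate, target):
    """Mean-square error between two equal-length waveforms."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB.

    Assumed definition: the estimate is projected onto the target, and
    SI-SDR is the energy of that projection over the energy of what is
    left. `eps` is a hypothetical stabilizer against division by zero.
    """
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    alpha = dot / (target_energy + eps)           # optimal scaling of the target
    projection = [alpha * t for t in target]      # target component of the estimate
    residual = [e - p for e, p in zip(estimate, projection)]
    signal_energy = sum(p * p for p in projection)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_energy + eps) / (residual_energy + eps))


def si_sdr_loss(estimate, target):
    """Loss to minimize: negative SI-SDR, so higher SI-SDR means lower loss."""
    return -si_sdr(estimate, target)


# Synthetic illustration (not data from the paper): a sinusoidal "clean"
# signal, a small deterministic disturbance, and a half-amplitude copy.
target = [math.sin(0.1 * n) for n in range(200)]
noise = [0.1 * math.cos(1.7 * n) for n in range(200)]
estimate = [t + v for t, v in zip(target, noise)]
rescaled = [0.5 * e for e in estimate]

print("MSE     original vs rescaled:", mse(estimate, target), mse(rescaled, target))
print("SI-SDR  original vs rescaled:", si_sdr(estimate, target), si_sdr(rescaled, target))
```

In this sketch, halving the amplitude of the estimate increases the MSE substantially while leaving SI-SDR essentially unchanged, which is the scale invariance the metric is named for. In a training setup, `si_sdr_loss` would typically be rewritten with the tensor operations of whichever deep learning framework is in use so that it is differentiable.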
