IEEE International Conference on Acoustics, Speech and Signal Processing

Mixup-breakdown: A Consistency Training Method for Improving Generalization of Speech Separation Models

Abstract

Deep-learning-based speech separation models suffer from poor generalization: even state-of-the-art models can fail abruptly when evaluated under mismatched conditions. To address this problem, we propose an easy-to-implement yet effective consistency-based semi-supervised learning (SSL) approach, namely Mixup-Breakdown training (MBT). It learns a teacher model to "breakdown" unlabeled inputs, and the estimated separations are interpolated to produce more useful pseudo "mixup" input-output pairs, on which consistency regularization can be applied to learn a student model. In our experiments, we evaluate MBT under various conditions with increasing degrees of mismatch, including unseen interfering speech, noise, and music, and compare MBT's generalization capability against state-of-the-art supervised learning and SSL approaches. The results indicate that MBT significantly outperforms several strong baselines, with up to 13.77% relative SI-SNRi improvement. Moreover, MBT adds only negligible computational overhead to standard training schemes.
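
The breakdown-then-mixup procedure described in the abstract can be made concrete with a short sketch. Below is a minimal, illustrative PyTorch rendering of one MBT step on an unlabeled mixture, assuming a two-speaker separation model and an exponential-moving-average (EMA) teacher in the style of Mean Teacher consistency training; the function and parameter names (`mbt_step`, `si_snr_loss`, `ema_update`, `alpha`) are hypothetical and not taken from the paper.

```python
# Illustrative sketch of one Mixup-Breakdown training (MBT) step.
# Assumptions (not from the paper text): a PyTorch model mapping a mixture
# of shape (batch, time) to two estimated sources of shape (batch, 2, time),
# an EMA teacher, and a negative-SI-SNR training loss.
import torch


def si_snr_loss(est, ref, eps=1e-8):
    """Negative scale-invariant SNR (signals assumed zero-mean)."""
    ref_energy = (ref * ref).sum(-1, keepdim=True) + eps
    proj = ((est * ref).sum(-1, keepdim=True) / ref_energy) * ref
    noise = est - proj
    ratio = (proj * proj).sum(-1) / ((noise * noise).sum(-1) + eps)
    return -(10 * torch.log10(ratio + eps)).mean()


@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Teacher weights track the student as an exponential moving average."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1 - decay)


def mbt_step(student, teacher, unlabeled_mix, optimizer, alpha=0.5):
    # 1) "Breakdown": the teacher separates the unlabeled mixture.
    with torch.no_grad():
        est = teacher(unlabeled_mix)                    # (batch, 2, time)

    # 2) "Mixup": interpolate the estimated sources into pseudo pairs.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    s1, s2 = est[:, 0], est[:, 1]
    pseudo_mix = lam * s1 + (1.0 - lam) * s2            # pseudo input
    pseudo_ref = torch.stack([lam * s1, (1.0 - lam) * s2], dim=1)

    # 3) Consistency: the student must recover the interpolated sources.
    loss = si_snr_loss(student(pseudo_mix), pseudo_ref)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)                        # teacher follows student
    return loss.item()
```

A full implementation would additionally wrap the loss in permutation-invariant training (PIT) and combine this consistency term with the ordinary supervised loss on labeled mixtures; those details are omitted here for brevity.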
