IEEE International Conference on Acoustics, Speech and Signal Processing

Mixup-breakdown: A Consistency Training Method for Improving Generalization of Speech Separation Models

Abstract

Deep-learning-based speech separation models suffer from poor generalization: even state-of-the-art models can fail abruptly when evaluated under mismatched conditions. To address this problem, we propose an easy-to-implement yet effective consistency-based semi-supervised learning (SSL) approach, namely Mixup-Breakdown training (MBT). It learns a teacher model to "breakdown" unlabeled inputs, and the estimated separations are interpolated to produce more useful pseudo "mixup" input-output pairs, on which consistency regularization can be applied to learn a student model. In our experiments, we evaluate MBT under various conditions with increasing degrees of mismatch, including unseen interfering speech, noise, and music, and compare MBT's generalization capability against state-of-the-art supervised learning and SSL approaches. The results indicate that MBT significantly outperforms several strong baselines, with up to 13.77% relative SI-SNRi improvement. Moreover, MBT adds only negligible computational overhead to standard training schemes.
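
The breakdown-then-mixup procedure described in the abstract can be made concrete with a short sketch. Below is a minimal, illustrative PyTorch rendering of one MBT step on an unlabeled mixture, assuming a two-speaker separation model and an exponential-moving-average (EMA) teacher in the style of Mean Teacher consistency training; the function and parameter names (`mbt_step`, `si_snr_loss`, `ema_update`, `alpha`) are hypothetical and not taken from the paper.

```python
# Illustrative sketch of one Mixup-Breakdown training (MBT) step.
# Assumptions (not from the paper text): a PyTorch model mapping a mixture
# of shape (batch, time) to two estimated sources of shape (batch, 2, time),
# an EMA teacher, and a negative-SI-SNR training loss.
import torch


def si_snr_loss(est, ref, eps=1e-8):
    """Negative scale-invariant SNR (signals assumed zero-mean)."""
    ref_energy = (ref * ref).sum(-1, keepdim=True) + eps
    proj = ((est * ref).sum(-1, keepdim=True) / ref_energy) * ref
    noise = est - proj
    ratio = (proj * proj).sum(-1) / ((noise * noise).sum(-1) + eps)
    return -(10 * torch.log10(ratio + eps)).mean()


@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Teacher weights track the student as an exponential moving average."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1 - decay)


def mbt_step(student, teacher, unlabeled_mix, optimizer, alpha=0.5):
    # 1) "Breakdown": the teacher separates the unlabeled mixture.
    with torch.no_grad():
        est = teacher(unlabeled_mix)                    # (batch, 2, time)

    # 2) "Mixup": interpolate the estimated sources into pseudo pairs.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    s1, s2 = est[:, 0], est[:, 1]
    pseudo_mix = lam * s1 + (1.0 - lam) * s2            # pseudo input
    pseudo_ref = torch.stack([lam * s1, (1.0 - lam) * s2], dim=1)

    # 3) Consistency: the student must recover the interpolated sources.
    loss = si_snr_loss(student(pseudo_mix), pseudo_ref)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)                        # teacher follows student
    return loss.item()
```

A full implementation would additionally wrap the loss in permutation-invariant training (PIT) and combine this consistency term with the ordinary supervised loss on labeled mixtures; those details are omitted here for brevity.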
