Journal: IEEE Transactions on Audio, Speech, and Language Processing

Towards Scaling Up Classification-Based Speech Separation



Abstract

Formulating speech separation as a binary classification problem has been shown to be effective. While good separation performance is achieved in matched test conditions using kernel support vector machines (SVMs), separation in unmatched conditions involving new speakers and environments remains a big challenge. A simple yet effective method to cope with the mismatch is to include many different acoustic conditions into the training set. However, large-scale training is almost intractable for kernel machines due to computational complexity. To enable training on relatively large datasets, we propose to learn more linearly separable and discriminative features from raw acoustic features and train linear SVMs, which are much easier and faster to train than kernel SVMs. For feature learning, we employ standard pre-trained deep neural networks (DNNs). The proposed DNN-SVM system is trained on a variety of acoustic conditions within a reasonable amount of time. Experiments on various test mixtures demonstrate good generalization to unseen speakers and background noises.
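The central idea in the abstract, mapping raw features into a space where a linear SVM suffices, can be illustrated with a small toy sketch. Everything below is an assumption for illustration only and is not the authors' system: the four XOR points stand in for raw acoustic feature vectors, the hidden layer uses hand-set weights (`HIDDEN_W`, `HIDDEN_B`) instead of weights learned by a pre-trained DNN, and the linear SVM is trained by plain full-batch subgradient descent on the regularized hinge loss.

```python
import math

# Toy XOR data: 2-D "raw features" with labels in {-1, +1}.
# These four points are not linearly separable in the raw space.
X_RAW = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [-1, 1, 1, -1]

# Hypothetical hidden layer with hand-set weights (roughly an OR unit and
# an AND unit). In the paper's setting these would be learned by a DNN.
HIDDEN_W = [[10.0, 10.0], [10.0, 10.0]]
HIDDEN_B = [-5.0, -15.0]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def hidden_features(x):
    """Sigmoid hidden-layer activations for one raw feature vector."""
    return [1.0 / (1.0 + math.exp(-(dot(w, x) + b)))
            for w, b in zip(HIDDEN_W, HIDDEN_B)]


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=5000):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]      # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # margin violated
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def accuracy(w, b, X, y):
    """Fraction of points whose predicted sign matches the label."""
    preds = [1 if dot(w, xi) + b >= 0.0 else -1 for xi in X]
    return sum(p == yi for p, yi in zip(preds, y)) / len(y)


X_HID = [hidden_features(x) for x in X_RAW]
raw_acc = accuracy(*train_linear_svm(X_RAW, Y), X_RAW, Y)
hid_acc = accuracy(*train_linear_svm(X_HID, Y), X_HID, Y)
print("linear SVM on raw features:   ", raw_acc)
print("linear SVM on hidden features:", hid_acc)
```

No linear classifier can label more than three of the four raw XOR points correctly, whereas the hidden-layer features are linearly separable, so the same linear SVM can separate them. Training cost for the linear model grows with the number of examples and feature dimensions rather than with a kernel matrix over all training pairs, which is the scaling argument the abstract makes.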
