Journal: IEEE Transactions on Audio, Speech, and Language Processing

Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization

Abstract

This paper proposes a method for performing blind source separation (BSS) and blind dereverberation (BD) at the same time for speech mixtures. In most previous studies, BSS and BD have been investigated separately. The separation performance of conventional BSS methods deteriorates as the reverberation time increases while many existing BD methods rely on the assumption that there is only one sound source in a room. Therefore, it has been difficult to perform both BSS and BD when the reverberation time is long. The proposed method uses a network, in which dereverberation and separation networks are connected in tandem, to estimate source signals. The parameters for the dereverberation network (prediction matrices) and those for the separation network (separation matrices) are jointly optimized. This enables a BD process to take a BSS process into account. The prediction and separation matrices are alternately optimized with each depending on the other; hence, we call the proposed method the conditional separation and dereverberation (CSD) method. Comprehensive evaluation results are reported, where all the speech materials contained in the complete test set of the TIMIT corpus are used. The CSD method improves the signal-to-interference ratio by an average of about 4 dB over the conventional frequency-domain BSS approach for reverberation times of 0.3 and 0.5 s. The direct-to-reverberation ratio is also improved by about 10 dB.
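The tandem structure described in the abstract can be illustrated with a minimal sketch for a single frequency bin. It assumes a delayed multichannel linear-prediction form for the dereverberation stage and a plain matrix multiply for the separation stage; the abstract does not give these details, so the function names (`tandem_forward`, `sir_db`), the `delay` parameter, and the array shapes are hypothetical. The sketch shows only the forward pass and a generic power-ratio metric, not the paper's joint optimization of the prediction and separation matrices.

```python
import numpy as np

def tandem_forward(x, G, W, delay=1):
    """Dereverberate, then separate, one frequency bin (illustrative only).

    x: (M, T) observed STFT frames from M microphones.
    G: (K, M, M) prediction matrices (assumed linear-prediction form).
    W: (N, M) separation matrix.
    delay: assumed prediction delay in frames (hypothetical parameter).
    """
    T = x.shape[1]
    y = x.astype(complex)  # work on a complex copy of the observation
    for k in range(G.shape[0]):
        shift = delay + k
        if shift >= T:
            break
        # Subtract the reverberation predicted from frames `shift` steps back.
        y[:, shift:] -= G[k].conj().T @ x[:, :T - shift]
    # Separation stage applied to the dereverberated signal.
    return W @ y

def sir_db(target, interference):
    """Power ratio in dB between a target and an interference component."""
    p_t = np.sum(np.abs(target) ** 2)
    p_i = np.sum(np.abs(interference) ** 2)
    return 10.0 * np.log10(p_t / p_i)
```

With all prediction matrices set to zero the first stage passes the observation through unchanged, so the output reduces to `W @ x`, which corresponds to a separation-only system without dereverberation.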
