Conference: International Joint Conference on Neural Networks
pcIRM: Complex Ideal Ratio Masking for Speaker-Independent Monaural Source Separation with Utterance Permutation Invariant Training

Abstract

Typical speech separation systems operate in the time-frequency (T-F) domain, enhancing the magnitude response and leaving the phase response unaltered. Recent studies, however, suggest that phase is important for perceptual quality, leading some researchers to enhance both the magnitude and phase spectra. Combining complex ideal ratio mask (cIRM) estimation with deep neural network (DNN) training has proven to be an effective way to improve speech separation. Meanwhile, the label ambiguity (or permutation) problem has become a major barrier to speaker-independent multi-talker source separation, which motivates new solutions. In this paper, to address speaker-independent monaural source separation, we propose a novel method called pcIRM, which performs cIRM estimation with utterance-level permutation invariant training (uPIT). Specifically, pcIRM is implemented with a deep bidirectional LSTM (Bi-LSTM) network and evaluated on the WSJ0-2mix dataset. We report separation results for the proposed method and compare them with those of existing state-of-the-art methods. Extensive experimental results demonstrate the advantages of the proposed pcIRM method in terms of the signal-to-distortion ratio (SDR) metric.
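The two ingredients named in the abstract, the cIRM and utterance-level permutation invariant training, can be illustrated with a small sketch. This is not the authors' implementation: it assumes the common definition of the cIRM as the complex ratio S / Y between a source spectrogram S and the mixture spectrogram Y, and it assumes a squared-error loss on the masked mixture. The function names (`cirm`, `utterance_loss`, `upit_loss`) and the `eps` stabilizer are hypothetical, and plain Python lists of complex numbers stand in for STFT arrays to keep the example dependency-free.

```python
from itertools import permutations


def cirm(source, mixture, eps=1e-8):
    """Complex ratio mask M = S / Y per T-F bin (assumed definition).

    `source` and `mixture` are lists of frames; each frame is a list of
    complex STFT bins. `eps` (an assumption) guards against division by zero.
    """
    return [
        [s * y.conjugate() / (abs(y) ** 2 + eps) for s, y in zip(s_frame, y_frame)]
        for s_frame, y_frame in zip(source, mixture)
    ]


def utterance_loss(mask, mixture, target):
    """Squared error between the masked mixture and a target source,
    summed over every frame and bin of the utterance."""
    return sum(
        abs(m * y - s) ** 2
        for m_frame, y_frame, s_frame in zip(mask, mixture, target)
        for m, y, s in zip(m_frame, y_frame, s_frame)
    )


def upit_loss(est_masks, mixture, targets):
    """Utterance-level PIT: choose ONE output-to-speaker assignment for the
    whole utterance (not one per frame) that minimizes the total loss.

    Returns (loss, perm), where output i is matched to targets[perm[i]].
    """
    n = len(targets)
    best = None
    for perm in permutations(range(n)):
        loss = sum(
            utterance_loss(est_masks[i], mixture, targets[perm[i]])
            for i in range(n)
        )
        if best is None or loss < best[0]:
            best = (loss, perm)
    return best


# Toy utterance: 2 speakers, 2 frames, 2 frequency bins.
s1 = [[1 + 1j, 2 - 1j], [0.5j, -1 + 0j]]
s2 = [[0.5 - 2j, 1 + 1j], [1 + 1j, 2 + 0.5j]]
mix = [[a + b for a, b in zip(f1, f2)] for f1, f2 in zip(s1, s2)]

# Pretend the network emitted the ideal masks in swapped speaker order.
masks = [cirm(s2, mix), cirm(s1, mix)]
loss, perm = upit_loss(masks, mix, [s1, s2])
```

In this toy case the search recovers the swapped assignment, which is the label ambiguity the abstract refers to: the network's output order carries no speaker identity, so the loss is evaluated under the best utterance-wide permutation. Because the mask is complex, applying it changes both the magnitude and the phase of each mixture bin.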