Published in: IEEE Transactions on Audio, Speech, and Language Processing

Complex Ratio Masking for Monaural Speech Separation

Abstract

Speech separation systems usually operate on the short-time Fourier transform (STFT) of noisy speech and enhance only the magnitude spectrum, leaving the phase spectrum unchanged. This practice stems from a long-held belief that the phase spectrum is unimportant for speech enhancement. Recent studies, however, suggest that phase is important for perceptual quality, leading some researchers to consider enhancing both the magnitude and phase spectra. We present a supervised monaural speech separation approach that simultaneously enhances the magnitude and phase spectra by operating in the complex domain. Our approach uses a deep neural network to estimate the real and imaginary components of the ideal ratio mask defined in the complex domain. We report separation results for the proposed method and compare them to related systems. The proposed approach improves over other methods when evaluated with several objective metrics, including the perceptual evaluation of speech quality (PESQ), and in a listening test in which subjects prefer the proposed approach at a rate of at least 69%.
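
As a rough illustration of what a ratio mask "defined in the complex domain" could look like, the sketch below computes the ideal mask as the complex quotient of the clean and noisy STFT coefficients, split into real and imaginary parts. The abstract does not give the formula, so this quotient form is an assumption here, as are the function names `complex_ratio_mask` and `apply_complex_mask`, the `eps` parameter, and the random toy spectrograms. The neural network that would estimate the mask from noisy speech is not shown. A magnitude-only mask is included for comparison, since it keeps the noisy phase.

```python
import numpy as np


def complex_ratio_mask(noisy, clean, eps=1e-12):
    """Return (real, imag) parts of a mask M such that M * noisy ~= clean.

    Assumed form: M = clean / noisy, expanded into real arithmetic.
    eps is a hypothetical guard against division by zero.
    """
    yr, yi = noisy.real, noisy.imag
    sr, si = clean.real, clean.imag
    denom = yr ** 2 + yi ** 2 + eps
    mask_real = (yr * sr + yi * si) / denom
    mask_imag = (yr * si - yi * sr) / denom
    return mask_real, mask_imag


def apply_complex_mask(noisy, mask_real, mask_imag):
    """Complex multiplication of the mask with the noisy STFT."""
    return (mask_real + 1j * mask_imag) * noisy


# Toy complex "spectrograms" (frequency bins x time frames), not real speech.
rng = np.random.default_rng(0)
shape = (4, 5)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noisy = clean + noise

# Complex-domain mask: corrects magnitude and phase together.
mask_real, mask_imag = complex_ratio_mask(noisy, clean)
complex_estimate = apply_complex_mask(noisy, mask_real, mask_imag)

# Magnitude-only baseline: matches the clean magnitude, keeps the noisy phase.
magnitude_mask = np.abs(clean) / (np.abs(noisy) + 1e-12)
magnitude_estimate = magnitude_mask * noisy

complex_error = np.max(np.abs(complex_estimate - clean))
magnitude_error = np.max(np.abs(magnitude_estimate - clean))
print("complex-mask error:  ", complex_error)
print("magnitude-mask error:", magnitude_error)
```

In this toy setting the ideal complex mask reconstructs the clean coefficients almost exactly, while the magnitude-only estimate retains an error caused purely by the unchanged noisy phase. An estimated mask would of course be less accurate than the ideal one used here.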