Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

The Phasebook: Building Complex Masks via Discrete Representations for Source Separation



Abstract

Deep learning based speech enhancement and source separation systems have recently reached unprecedented levels of quality, to the point that performance is reaching a new ceiling. Most systems rely on estimating the magnitude of a target source, either directly or by computing a real-valued mask to be applied to a time-frequency representation of the mixture signal. A limiting factor in such approaches is a lack of phase estimation: the phase of the mixture is most often used when reconstructing the estimated time-domain signal. We propose to estimate phase using "phasebook", a new type of layer based on a discrete representation of the phase difference between the mixture and the target. We also introduce "combook", a similar type of layer that directly estimates a complex mask. We present various training and inference schemes involving these representations, and explain in particular how to include them in an end-to-end learning framework. We also present an oracle study to assess upper bounds on performance for various types of masks using discrete phase representations. We evaluate the proposed methods on the wsj0-2mix dataset, a well-studied corpus for single-channel speaker-independent speaker separation, matching the performance of state-of-the-art mask-based approaches without requiring additional phase reconstruction steps.
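The idea of a discrete representation of the phase difference between mixture and target can be illustrated with a small oracle-style sketch. It is not the authors' implementation: the abstract does not specify the codebook values or size, so the uniform codebook below, the magnitude-ratio mask, and all function names (`uniform_phasebook`, `quantize_phase_difference`, `oracle_complex_mask`) are assumptions made for illustration. The sketch picks, for each time-frequency bin, the codebook phase closest to the true phase difference and combines it with a real-valued magnitude mask to form a complex mask.

```python
import numpy as np


def uniform_phasebook(num_codes):
    """Hypothetical codebook: num_codes phase values evenly spaced on [-pi, pi)."""
    return -np.pi + 2.0 * np.pi * np.arange(num_codes) / num_codes


def quantize_phase_difference(mixture, target, codebook):
    """Index of the codebook phase closest to angle(target) - angle(mixture).

    mixture, target: complex arrays of the same shape (e.g. STFT bins).
    Closeness is measured on the unit circle, so -pi and pi are neighbours.
    """
    # Wrapped phase difference between target and mixture, in (-pi, pi].
    phase_diff = np.angle(target * np.conj(mixture))
    # Chord length between unit phasors grows with circular distance.
    dist = np.abs(np.exp(1j * phase_diff[..., None]) - np.exp(1j * codebook))
    return np.argmin(dist, axis=-1)


def oracle_complex_mask(mixture, target, codebook, eps=1e-8):
    """Complex mask: oracle magnitude ratio times a quantized phase term.

    The magnitude-ratio mask is an assumption of this sketch; other
    real-valued masks could be substituted.
    """
    idx = quantize_phase_difference(mixture, target, codebook)
    magnitude = np.abs(target) / (np.abs(mixture) + eps)
    return magnitude * np.exp(1j * codebook[idx])
```

Multiplying the mask by the mixture spectrogram gives an estimate whose phase differs from the target phase by at most half the codebook spacing (pi / num_codes for the uniform codebook assumed here). In a learned system the codebook index would be predicted by a network from the mixture alone rather than computed from the target as in this oracle setting.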
