Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet

Abstract

In this work, we investigate whether the learned encoder of the end-to-end convolutional time-domain audio separation network (Conv-TasNet) is the key to its recent success, or whether the encoder can just as well be replaced by a deterministic, hand-crafted filterbank. Motivated by the resemblance of the trained Conv-TasNet encoder to auditory filterbanks, we propose to employ a deterministic gammatone filterbank. In contrast to a common gammatone filterbank, our filters are restricted to a length of 2 ms to allow for low-latency processing. Inspired by the encoder learned by Conv-TasNet, the proposed filterbank holds, in addition to the logarithmically spaced filters, multiple gammatone filters at the same center frequency with varying phase shifts. We show that replacing the learned encoder with our proposed multi-phase gammatone filterbank (MP-GTF) even leads to a scale-invariant source-to-noise ratio (SI-SNR) improvement of 0.7 dB. Furthermore, in contrast to using the learned encoder, we show that the number of filters can be reduced from 512 to 128 without loss of performance.
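
To make the idea of a short, multi-phase gammatone filterbank concrete, here is a minimal NumPy sketch. It is not the authors' implementation. Only the 2 ms filter length, the logarithmic spacing, the multiple phases per center frequency, and the total of 128 filters come from the abstract. Everything else is an assumption chosen for illustration: the 8 kHz sampling rate, the split into 32 center frequencies times 4 phases, the 100 Hz to 3900 Hz range, the filter order of 2, the Glasberg and Moore ERB bandwidth rule, the unit-norm scaling, the hop of 8 samples, and the ReLU in the hypothetical `encode` helper.

```python
import numpy as np


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def multi_phase_gammatone_filterbank(
    n_center_freqs=32,   # assumption: 32 x 4 = 128 filters
    n_phases=4,          # assumption: phases evenly spaced over [0, 2*pi)
    sample_rate=8000,    # assumption: 8 kHz, so 2 ms = 16 taps
    length_ms=2.0,       # from the abstract
    f_min=100.0,         # assumption
    f_max=3900.0,        # assumption: just below Nyquist
    order=2,             # assumption
):
    """Return an array of shape (n_center_freqs * n_phases, n_taps)."""
    n_taps = int(round(sample_rate * length_ms / 1000.0))

    # Shapes are arranged for broadcasting: (freq, phase, time).
    fc = np.geomspace(f_min, f_max, n_center_freqs)[:, None, None]
    phi = (2.0 * np.pi * np.arange(n_phases) / n_phases)[None, :, None]
    t = (np.arange(n_taps) / sample_rate)[None, None, :]

    # Gammatone impulse response: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t + phi)
    b = 1.019 * erb(fc)
    h = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t + phi)

    # Flatten so that filter index = freq_index * n_phases + phase_index,
    # then scale every filter to unit L2 norm.
    h = h.reshape(n_center_freqs * n_phases, n_taps)
    h /= np.linalg.norm(h, axis=1, keepdims=True)
    return h


def encode(signal, filters, hop=8):
    """Strided correlation of the signal with each filter, followed by ReLU."""
    n_taps = filters.shape[1]
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_taps)[::hop]
    return np.maximum(frames @ filters.T, 0.0)


if __name__ == "__main__":
    filters = multi_phase_gammatone_filterbank()
    mixture = np.random.default_rng(0).standard_normal(8000)  # 1 s of noise
    features = encode(mixture, filters)
    print(filters.shape, features.shape)
```

Because the filters are fixed, such an array could stand in for the weights of a learned convolutional encoder and be left out of training. With a ReLU after the filters, a filter and its negated copy (phase shifted by pi) pick up the positive and negative half-waves separately, which is one plausible reason to keep several phases per center frequency.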
