IEEE International Conference on Acoustics, Speech and Signal Processing

Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation


Abstract

Deep clustering (DC) and deep attractor networks (DANs) are data-driven approaches to monaural blind source separation. Both approaches provide astonishing single-channel performance but have not yet been generalized to block-online processing. When separating speech in a continuous stream with a block-online algorithm, it must be determined in each block which of the output streams belongs to which speaker. In this contribution we solve this block permutation problem by introducing an additional speaker identification embedding into the DAN model structure. We motivate this model decision by analyzing the embedding topology of DC and DANs and show that DC and DANs by themselves are not sufficient for speaker identification. This model structure (a) improves the signal-to-distortion ratio (SDR) over a DAN baseline and (b) provides up to 61% and up to 34% relative reduction in permutation error rate and re-identification error rate, respectively, compared to an i-vector baseline.
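The block permutation problem described in the abstract can be illustrated with a minimal sketch: each processing block produces one embedding per output stream, and streams are assigned to speakers by matching them against running reference embeddings. This is not the paper's implementation; the function names and the use of cosine similarity with a brute-force search over permutations are assumptions for illustration only.

```python
import itertools
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def align_block(ref_embs, block_embs):
    """Resolve the block permutation problem for one block (hypothetical
    helper): return the permutation of this block's output streams that
    maximizes total cosine similarity to the running per-speaker
    reference embeddings. perm[i] = index of the block stream assigned
    to reference speaker i."""
    best_perm, best_score = None, -math.inf
    for perm in itertools.permutations(range(len(block_embs))):
        score = sum(cosine(ref_embs[i], block_embs[j])
                    for i, j in enumerate(perm))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm

# Two reference speakers; the new block emitted its streams swapped.
refs = [[1.0, 0.0], [0.0, 1.0]]
block = [[0.1, 0.9], [0.9, 0.2]]
print(align_block(refs, block))  # → (1, 0): stream 1 is speaker 0
```

For two or three speakers the brute-force search over permutations is negligible; for larger numbers of streams a Hungarian-algorithm assignment (e.g. `scipy.optimize.linear_sum_assignment`) would replace the factorial loop.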
