Conference: Spoken Language Technology Workshop

Multimodal Attention Fusion for Target Speaker Extraction

Abstract

Target speaker extraction, which aims to extract a target speaker's voice from a mixture of voices using audio, visual, or locational clues, has received much interest. Recently, audio-visual target speaker extraction has been proposed, which extracts target speech by using complementary audio and visual clues. Although audio-visual target speaker extraction offers more stable performance than single-modality methods on simulated data, neither its adaptation to realistic situations nor its evaluation on real recorded mixtures has been fully explored. One of the major issues in handling realistic situations is how to make the system robust to clue corruption, because in real recordings the two clues may not be equally reliable, e.g., visual clues may be affected by occlusions. In this work, we propose a novel attention mechanism for multi-modal fusion, together with training methods, that makes it possible to effectively capture the reliability of the clues and weight the more reliable ones. Our proposals improve the signal-to-distortion ratio (SDR) by 1.0 dB over conventional fusion mechanisms on simulated data. Moreover, we record an audio-visual dataset of simultaneous speech with realistic visual clue corruption and show that audio-visual target speaker extraction with our proposals works successfully on real data.
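
The abstract gives no equations, so the following is only a minimal sketch of the general idea of attention-based clue fusion, not the authors' architecture. It assumes each clue is already a fixed-length embedding, scores each clue against a query vector with a scaled dot product, and mixes the clues with softmax weights so that a corrupted (here, all-zero) visual clue gets less weight. The names `attention_fuse`, `query`, and `clues` are hypothetical. The `sdr_db` helper uses a simplified energy-ratio definition of SDR, which may differ from the exact variant used in the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(query, clues):
    """Fuse clue embeddings with scaled dot-product attention.

    query: list of D floats (hypothetical; e.g. derived from the mixture).
    clues: dict mapping a clue name to a list of D floats.
    Returns the fused embedding and the per-clue attention weights.
    """
    d = len(query)
    names = list(clues)
    # One scalar score per clue: similarity between the query and the clue.
    scores = [
        sum(q * c for q, c in zip(query, clues[n])) / math.sqrt(d)
        for n in names
    ]
    weights = softmax(scores)
    # Weighted sum of the clue embeddings.
    fused = [
        sum(w * clues[n][i] for w, n in zip(weights, names))
        for i in range(d)
    ]
    return fused, dict(zip(names, weights))


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy."""
    signal = sum(r * r for r in reference)
    distortion = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal + eps) / (distortion + eps))


# Toy example: the visual clue is zeroed out to mimic an occlusion.
query = [1.0, 1.0, 1.0, 1.0]
clues = {"audio": [1.0, 1.0, 1.0, 1.0], "visual": [0.0, 0.0, 0.0, 0.0]}
fused, weights = attention_fuse(query, clues)
```

In this toy case the audio clue receives the larger weight because it matches the query while the zeroed visual clue does not. In a trained system the query and clue embeddings would be learned, and the weights would reflect reliability only to the extent that training exposes the model to corrupted clues.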
