Journal: IEEE Transactions on Audio, Speech, and Language Processing

Indeterminacy Free Frequency-Domain Blind Separation of Reverberant Audio Sources



Abstract

Blind separation of convolutive mixtures is a very complicated task with applications in many fields of speech and audio processing, such as hearing aids and man–machine interfaces. One proposed solution is frequency-domain independent component analysis. The main disadvantage of this method is the presence of permutation ambiguities among consecutive frequency bins, a problem that worsens as the reverberation time increases. This paper presents a new frequency-domain method that uses a simplified mixing model in which the impulse responses from one source to each microphone are expressed as scaled and delayed versions of one of these impulse responses. This assumption, based on the similarity among the waveforms of the impulse responses, is valid for a small microphone spacing. Under this model, separation is performed without any permutation or amplitude ambiguity among consecutive frequency bins. The new method aims mainly at separation, with only a small reduction of reverberation. Nevertheless, because the reverberation is included in the model, the method can perform separation over a wide range of reverberant conditions at very high speed. The separation quality is evaluated using a perceptually designed objective measure. In addition, an automatic speech recognition system is used to test the advantages of the algorithm in a real application. Very good results are obtained for both artificial and real mixtures, and they are significantly better than those of other standard blind source separation algorithms.
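The simplified mixing model described in the abstract can be illustrated with a small numerical sketch. It is not the authors' algorithm: the abstract does not say how the scale factors and delays are estimated, so here they are simply assumed to be known. The sketch also assumes two sources, two microphones, integer delays, circular convolution, and synthetic noise signals; the function name `separate_demo` and all parameter names are hypothetical. Under those assumptions, each frequency bin has a mixing matrix of the form [[1, 1], [a1·e^(-jωd1), a2·e^(-jωd2)]], and inverting it recovers the reverberant source images at the reference microphone with a fixed ordering and scale in every bin.

```python
import numpy as np

def separate_demo(n=4096, gains=(0.8, 0.9), delays=(3, -2), seed=0):
    """Toy illustration of a scaled-and-delayed mixing model (2 sources, 2 mics).

    Assumption: gains and delays are known here; a blind method would have
    to estimate them from the mixtures.
    Returns the maximum absolute reconstruction error in the time domain.
    """
    rng = np.random.default_rng(seed)

    # Two independent sources (white noise as a stand-in for audio).
    s = rng.standard_normal((2, n))

    # One synthetic reverberant impulse response per source: decaying noise.
    taps = np.arange(512)
    h = rng.standard_normal((2, 512)) * np.exp(-taps / 100.0)

    # Source images at the reference microphone (circular convolution via FFT).
    S_img = np.fft.rfft(s, n) * np.fft.rfft(h, n)

    # Second microphone: each image is scaled by a_j and delayed by d_j samples,
    # which is a multiplication by a_j * exp(-j * omega * d_j) in each bin.
    omega = 2.0 * np.pi * np.fft.rfftfreq(n)          # rad/sample
    a = np.asarray(gains, dtype=float)[:, None]
    d = np.asarray(delays, dtype=float)[:, None]
    r = a * np.exp(-1j * omega * d)                   # shape (2, n_bins)

    X1 = S_img[0] + S_img[1]                          # reference microphone
    X2 = r[0] * S_img[0] + r[1] * S_img[1]            # second microphone

    # Per-bin inverse of [[1, 1], [r0, r1]]. The same rule is used in every
    # bin, so outputs keep a consistent order and scale across frequency.
    det = r[1] - r[0]                                 # nonzero since gains differ
    Y0 = (r[1] * X1 - X2) / det
    Y1 = (X2 - r[0] * X1) / det

    y = np.fft.irfft(np.stack([Y0, Y1]), n)
    target = np.fft.irfft(S_img, n)
    return float(np.max(np.abs(y - target)))

if __name__ == "__main__":
    print("max reconstruction error:", separate_demo())
```

Note that the recovered signals are the reverberant images of the sources at the reference microphone, not the dry sources, which is consistent with the abstract's statement that the method targets separation rather than dereverberation. The default gains differ in magnitude, which keeps the per-bin determinant away from zero in this toy setup.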
