Journal: IEEE Transactions on Speech and Audio Processing

Accelerated speech source localization via a hierarchical search of steered response power


Abstract

Accurate and fast localization of multiple speech sound sources is of significant interest in applications such as conferencing systems. Approaches based on searching for local peaks of the steered response power have recently become popular, despite their known computational expense. Based on the observations that the wavelengths of sound from a speech source are comparable to the dimensions of the space being searched and that the source is broadband, we have developed an efficient search algorithm. Significant speedups are achieved by using coarse-to-fine strategies in both space and frequency. We apply the search algorithm to speed up simple delay-and-sum beamforming and steered response power phase-transform weighted (SRP-PHAT) source localization algorithms. A systematic series of comparisons with previous algorithms shows that the technique is much faster as well as robust and accurate. The performance of the algorithm can be further improved by using constraints from computer vision.
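The coarse-to-fine idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. It is a minimal illustration under stated assumptions: a 2-D, free-field room with no noise or reverberation, eight microphones on the walls, and spectra synthesized directly in the frequency domain. The function names (`srp`, `hierarchical_search`), the rule that keeps only frequencies whose wavelength spans at least four grid cells, and the refinement window of two cells on each side are all hypothetical choices made for this example.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def srp(spectra, freqs, mics, points, phat=True):
    """Steered response power at each candidate point.

    spectra: (M, F) complex microphone spectra, freqs: (F,) in Hz,
    mics: (M, 2) and points: (P, 2) positions in metres.
    """
    # PHAT weighting keeps only the phase of each spectrum bin
    X = spectra / np.maximum(np.abs(spectra), 1e-12) if phat else spectra
    # Propagation delay from every candidate point to every microphone: (P, M)
    delays = np.linalg.norm(points[:, None, :] - mics[None, :, :], axis=2) / C
    # Undo each channel's delay so a source at the candidate point adds coherently
    steer = np.exp(2j * np.pi * freqs[None, None, :] * delays[:, :, None])
    beam = np.sum(steer * X[None, :, :], axis=1)  # (P, F)
    return np.sum(np.abs(beam) ** 2, axis=1)      # (P,)


def hierarchical_search(spectra, freqs, mics, lo, hi, levels=3, n=17, phat=True):
    """Coarse-to-fine search in space and frequency (illustrative rule set)."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    for _ in range(levels):
        step = (hi - lo) / (n - 1)  # grid spacing per axis
        # Assumed rule: use only wavelengths of at least four grid cells,
        # so the response varies smoothly between neighbouring grid points
        band = freqs <= C / (4.0 * step.max())
        xs = np.linspace(lo[0], hi[0], n)
        ys = np.linspace(lo[1], hi[1], n)
        grid = np.stack(np.meshgrid(xs, ys, indexing="ij"), axis=-1).reshape(-1, 2)
        power = srp(spectra[:, band], freqs[band], mics, grid, phat)
        best = grid[np.argmax(power)]
        # Shrink the search region to two cells on each side of the peak
        lo, hi = best - 2.0 * step, best + 2.0 * step
    return best


# Synthetic test case: one broadband source in a 4 m x 4 m room
rng = np.random.default_rng(0)
fs, n_fft = 16000, 1024
freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
mics = np.array([[0, 0], [2, 0], [4, 0], [4, 2],
                 [4, 4], [2, 4], [0, 4], [0, 2]], float)
source = np.array([1.3, 2.6])
S = rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
tau = np.linalg.norm(mics - source, axis=1) / C
spectra = S[None, :] * np.exp(-2j * np.pi * freqs[None, :] * tau[:, None])

estimate = hierarchical_search(spectra, freqs, mics, lo=(0.0, 0.0), hi=(4.0, 4.0))
print("estimate:", estimate, "error (m):", np.linalg.norm(estimate - source))
```

With these settings the search evaluates 3 x 17 x 17 = 867 candidate points, and the coarsest level uses only the lowest few hundred hertz. A single uniform grid at the finest spacing reached here (about 1.6 cm) would need roughly 66,000 points over the same room, each evaluated at every frequency in the band.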
