2011 IEEE International Conference on Acoustics, Speech and Signal Processing

Real time speaker localization and detection system for camera steering in multiparticipant videoconferencing environments

Abstract

A real-time speaker localization and detection system for videoconferencing environments is presented. The system uses a recently proposed modified Steered Response Power - Phase Transform (SRP-PHAT) algorithm as its core processing scheme. The new SRP-PHAT functional has been shown to provide robust localization performance in indoor environments without requiring a very fine spatial grid, which reduces the computational cost of a practical implementation. Moreover, it has been demonstrated that the statistical distribution of location estimates when a speaker is active can be used to discriminate between speech and non-speech frames by means of a peakedness criterion. As a result, talking participants can be detected and located with significant accuracy within a common processing framework.
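
The peakedness idea in the abstract can be illustrated with a small sketch. The abstract does not say which peakedness measure the authors use, so the code below assumes sample excess kurtosis computed over a short window of azimuth estimates (in degrees). The function names, the window contents, and the threshold value are all hypothetical and chosen only for illustration. The intuition is that estimates cluster tightly around one direction while a participant is talking (a peaked distribution, with occasional outliers), and scatter across the room otherwise (a flat distribution).

```python
import math
import statistics


def excess_kurtosis(samples):
    """Sample excess kurtosis (population moments).

    Roughly 0 for Gaussian data, positive for peaked data with heavy
    tails, negative for flat (uniform-like) data.
    """
    if len(samples) < 2:
        raise ValueError("need at least two location estimates")
    n = len(samples)
    mean = statistics.fmean(samples)
    m2 = sum((x - mean) ** 2 for x in samples) / n  # second central moment
    if m2 == 0.0:
        # All estimates identical: treat as maximally peaked.
        return math.inf
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0


def is_speech_window(estimates, threshold=1.0):
    """Flag a window of location estimates as speech if it is peaked enough.

    `threshold` is an assumed value, not one taken from the paper; a real
    system would have to tune it on recorded data.
    """
    return excess_kurtosis(estimates) > threshold
```

For example, a window such as `[40.0] * 18 + [10.0, 150.0]` (most estimates at 40 degrees, two outliers) is flagged as speech under this assumption, while estimates spread evenly from 0 to 171 degrees are not.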
