
Audiovisual speaker localization in medium smart meeting room



Abstract

This work considers the problem of automatically selecting the current active speaker among more than thirty participants in a medium-sized meeting room. Video tracking and sound source localization techniques are implemented to record AVI files of speaker remarks in the developed smart meeting room. Video streams from five cameras are processed to register participants at fixed chair positions and to track the main speaker, using histogram comparison and an AdaBoosted cascade classifier for face detection. Multichannel sound source localization based on the GCC-PHAT method estimates the speaker position using four microphone arrays. At an SNR of 18 dB, the sound source localization rate was about 97% and the fine RMSE was below 0.23 m.
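The abstract names GCC-PHAT but gives no implementation details. The following is a minimal two-channel sketch of the general technique, not the authors' code. It estimates the time delay between one microphone pair, whereas the paper combines four microphone arrays. The function name `gcc_phat`, the 16 kHz sampling rate, and the synthetic white-noise signal are all assumptions made for illustration.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate how much `sig` lags `ref`, in seconds, using GCC-PHAT.

    Hypothetical helper for illustration; a positive result means `sig`
    arrives later than `ref`.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: discard the magnitude
    # and keep only the phase, which sharpens the correlation peak.
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12

    cc = np.fft.irfft(cross, n=n)

    # Optionally restrict the search to physically possible delays
    # (for example, microphone spacing divided by the speed of sound).
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / float(fs)


# Synthetic check: white noise delayed by 5 samples (assumed values).
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref))[: len(ref)]

tau = gcc_phat(sig, ref, fs)
delay_samples = round(tau * fs)
```

In a multi-array setup, pairwise delays such as `tau` would typically be converted into direction or position estimates from the known microphone geometry. That step depends on the room layout and is not described in the abstract.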
