Published in: 電子情報通信学会技術研究報告. 音声. Speech

Fusing audio and video information toward detection of speech events under real environments


Abstract

This paper proposes a method for detecting and separating speech events under multiple-sound-source conditions using audio and video information. To detect speech events, sound localization using a microphone array and human tracking by stereo vision are combined in a Bayesian network. The inference results of the Bayesian network give the time and location of speech events even when multiple sound sources are present. Based on the detected speech event information, a maximum likelihood adaptive beamformer is constructed, and the speech signal is separated from background noise and interference.
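The fusion step can be illustrated with a deliberately small sketch. The abstract does not give the paper's network structure or parameters, so everything below is an assumption: a single hidden node S (a speech event at a given location and time) with two binary observations, A (the microphone array localizes a sound source there) and V (stereo vision tracks a person there). The probability values and the names `P_SPEECH`, `P_AUDIO_GIVEN_S`, `P_VIDEO_GIVEN_S`, and `speech_posterior` are hypothetical, chosen only for illustration.

```python
# Minimal sketch of audio-video fusion with a two-observation Bayesian network
# (S -> A, S -> V). All numbers are hypothetical; the paper's actual network
# and parameters are not stated in the abstract.

P_SPEECH = 0.2                              # assumed prior P(S=1)
P_AUDIO_GIVEN_S = {True: 0.9, False: 0.3}   # assumed P(A=1 | S)
P_VIDEO_GIVEN_S = {True: 0.8, False: 0.1}   # assumed P(V=1 | S)


def likelihood(p_one: float, observed: bool) -> float:
    """Probability of a binary observation given P(observation = 1)."""
    return p_one if observed else 1.0 - p_one


def speech_posterior(audio_detected: bool, video_detected: bool) -> float:
    """Return P(S=1 | A, V) by enumerating both states of S."""
    joint = {}
    for s in (True, False):
        prior = P_SPEECH if s else 1.0 - P_SPEECH
        joint[s] = (
            prior
            * likelihood(P_AUDIO_GIVEN_S[s], audio_detected)
            * likelihood(P_VIDEO_GIVEN_S[s], video_detected)
        )
    # Normalize over S to turn the joint into a posterior.
    return joint[True] / (joint[True] + joint[False])


if __name__ == "__main__":
    for a in (True, False):
        for v in (True, False):
            print(f"audio={a!s:5} video={v!s:5} P(speech)={speech_posterior(a, v):.3f}")
```

Under these assumed numbers, a localized sound with no tracked person at the same location (for example, a loudspeaker or other interference) yields a much lower speech probability than a sound that coincides with a tracked person, which is the intuition behind combining the two modalities.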
