Journal of Advanced Computational Intelligence and Intelligent Informatics

Joint Audio-Visual Tracking Based on Dynamically Weighted Linear Combination of Probability State Density

Abstract

This paper proposes a method for stable, continuous speaker tracking that uses both visual and audio information, even when the input is interrupted by disturbances or occlusion caused by noise or varying illumination. With this method, the speaker's position is expressed as a likelihood distribution obtained by integrating the visual and audio information. First, the two modalities are integrated as a weighted linear combination of the probability density distributions estimated from the visual and audio observations. The weight is treated as a variable that varies in proportion to the maximum value of the probability density distribution obtained for each type of information. Next, a weighted linear combination of this fused result and the distribution from the previous time step is computed, and the outcome is taken as the likelihood distribution of the speaker's position. By changing the weights dynamically, the method can freely select or emphasize either type of information and thus maintain stable, continuous tracking even when the speaker momentarily cannot be detected due to occlusion, voice interruption, or noise. We conducted a series of speaker-tracking experiments using a circular microphone array and an omni-directional camera, and confirmed that speakers can be tracked stably and continuously despite occlusion or voice interruption.
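The fusion scheme described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names, the discretized azimuth grid, and the temporal blending factor `alpha` are all assumptions made for the sketch. It shows the two ideas the abstract states: observation weights proportional to each density's peak, and a weighted linear combination with the previous likelihood for temporal continuity.

```python
import numpy as np

def fuse_densities(p_vis, p_aud, p_prev, alpha=0.5):
    """Hypothetical sketch of dynamically weighted audio-visual fusion.

    p_vis, p_aud : observation probability densities over a discretized
                   position grid (e.g. azimuth bins), one per modality.
    p_prev       : likelihood distribution from the previous time step.
    alpha        : blending factor between the new observation and the
                   past distribution (an assumed parameter).
    """
    # Weight each modality in proportion to the peak of its density:
    # a flat (uninformative) density gets little influence.
    w_vis = p_vis.max()
    w_aud = p_aud.max()
    total = w_vis + w_aud
    if total == 0:
        # Both cues lost (occlusion AND silence): keep the past estimate.
        return p_prev
    w_vis, w_aud = w_vis / total, w_aud / total

    # Weighted linear combination of the two observation densities.
    p_obs = w_vis * p_vis + w_aud * p_aud

    # Blend with the previous likelihood for stable, continuous tracking.
    p_new = alpha * p_obs + (1 - alpha) * p_prev
    return p_new / p_new.sum()  # renormalize to a probability distribution
```

If, say, the camera is occluded, `p_vis` becomes flat or zero, its peak (and hence its weight) shrinks, and the audio density plus the previous estimate carry the track until the visual cue returns.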