
Neural networks based visual attention model for surveillance videos


Abstract

In this paper we propose a novel Computational Attention Model (CAM) that fuses bottom-up, top-down and salient-motion visual cues to compute visual salience in surveillance videos. When a system deals with a number of visual features/cues, combining or fusing them is always challenging. Since there is no commonly agreed natural way of combining the conspicuity maps obtained from different features (face and motion, for example), the challenge is to find the right mix of visual cues that yields a salience map closest to the corresponding gaze map. In the literature, many CAMs have used fixed weights to combine the different visual cues. This is computationally attractive, but it is a very crude way of combining the cues, and the weights are typically set in an ad hoc fashion. Therefore, in this paper we propose a machine learning approach that uses an Artificial Neural Network (ANN) to estimate these weights. The ANN is trained on gaze maps obtained by eye tracking in psycho-physical experiments. The learned weights are then used to combine the conspicuities of the different visual cues in our CAM, which is subsequently applied to surveillance videos. The proposed model is designed to consider the important visual cues typically present in surveillance videos and to combine their conspicuities via the ANN. The obtained results are encouraging and show a clear improvement over state-of-the-art CAMs.
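As a minimal, illustrative sketch of the fusion step described above (not the authors' implementation), the snippet below combines three conspicuity maps (bottom-up, top-down, salient motion) with weights estimated against gaze maps. A single linear layer trained by gradient descent stands in for the paper's ANN; all names, shapes and the training loop are assumptions for illustration only.

```python
import numpy as np

# Sketch of weighted fusion of conspicuity maps into a salience map,
# with the fusion weights estimated from gaze maps (MSE objective).
# This is an illustrative stand-in for the ANN described in the abstract.

def fuse(conspicuity_maps, weights):
    """Weighted sum of conspicuity maps: (3, H, W) x (3,) -> (H, W)."""
    return np.tensordot(weights, conspicuity_maps, axes=1)

def estimate_weights(train_maps, gaze_maps, lr=0.2, epochs=400):
    """Fit fusion weights so the fused map approximates the gaze map."""
    w = np.full(3, 1.0 / 3.0)                      # start from uniform weights
    for _ in range(epochs):
        grad = np.zeros(3)
        for cmaps, gaze in zip(train_maps, gaze_maps):
            err = fuse(cmaps, w) - gaze            # (H, W) residual
            grad += np.array([np.mean(err * cmaps[i]) for i in range(3)])
        w -= lr * grad / len(train_maps)
        w = np.clip(w, 0.0, None)                  # keep weights non-negative
        w /= w.sum() + 1e-8                        # normalise to sum to 1
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, W = 36, 64
    # Synthetic stand-ins for per-frame conspicuity maps and gaze maps;
    # the recovered weights should be close to true_w.
    train_maps = [rng.random((3, H, W)) for _ in range(20)]
    true_w = np.array([0.2, 0.5, 0.3])
    gaze_maps = [fuse(cmaps, true_w) for cmaps in train_maps]
    print("estimated fusion weights:", estimate_weights(train_maps, gaze_maps))
```

The actual model additionally computes the conspicuity maps themselves (e.g. face and motion channels) and uses a trained ANN rather than this single linear layer; the sketch only illustrates how learned weights can replace fixed, ad hoc ones in the fusion step.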
