
Neural networks based visual attention model for surveillance videos


Abstract

In this paper we propose a novel Computational Attention Model (CAM) that fuses bottom-up, top-down and salient-motion visual cues to compute visual salience in surveillance videos. When a system deals with a number of visual features/cues, combining or fusing them is always challenging. Since there is no commonly agreed natural way of combining the conspicuity maps obtained from different features (face and motion, for example), the challenge is to find the right mix of visual cues that yields a salience map closest to the corresponding gaze map. In the literature, many CAMs have used fixed weights to combine the different visual cues. This is computationally attractive, but it is a very crude way of combining the cues, and the weights are typically set in an ad hoc fashion. Therefore, in this paper we propose a machine learning approach that uses an Artificial Neural Network (ANN) to estimate these weights. The ANN is trained on gaze maps obtained by eye tracking in psycho-physical experiments. The learned weights are then used to combine the conspicuities of the different visual cues in our CAM, which is subsequently applied to surveillance videos. The proposed model is designed to consider the important visual cues typically present in surveillance videos and to combine their conspicuities via the ANN. The obtained results are encouraging and show a clear improvement over state-of-the-art CAMs.
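As a minimal, illustrative sketch of the fusion step described above (not the authors' implementation), the snippet below combines three conspicuity maps (bottom-up, top-down, salient motion) with weights estimated against gaze maps. A single linear layer trained by gradient descent stands in for the paper's ANN; all names, shapes and the training loop are assumptions for illustration only.

```python
import numpy as np

# Sketch of weighted fusion of conspicuity maps into a salience map,
# with the fusion weights estimated from gaze maps (MSE objective).
# This is an illustrative stand-in for the ANN described in the abstract.

def fuse(conspicuity_maps, weights):
    """Weighted sum of conspicuity maps: (3, H, W) x (3,) -> (H, W)."""
    return np.tensordot(weights, conspicuity_maps, axes=1)

def estimate_weights(train_maps, gaze_maps, lr=0.2, epochs=400):
    """Fit fusion weights so the fused map approximates the gaze map."""
    w = np.full(3, 1.0 / 3.0)                      # start from uniform weights
    for _ in range(epochs):
        grad = np.zeros(3)
        for cmaps, gaze in zip(train_maps, gaze_maps):
            err = fuse(cmaps, w) - gaze            # (H, W) residual
            grad += np.array([np.mean(err * cmaps[i]) for i in range(3)])
        w -= lr * grad / len(train_maps)
        w = np.clip(w, 0.0, None)                  # keep weights non-negative
        w /= w.sum() + 1e-8                        # normalise to sum to 1
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, W = 36, 64
    # Synthetic stand-ins for per-frame conspicuity maps and gaze maps;
    # the recovered weights should be close to true_w.
    train_maps = [rng.random((3, H, W)) for _ in range(20)]
    true_w = np.array([0.2, 0.5, 0.3])
    gaze_maps = [fuse(cmaps, true_w) for cmaps in train_maps]
    print("estimated fusion weights:", estimate_weights(train_maps, gaze_maps))
```

The actual model additionally computes the conspicuity maps themselves (e.g. face and motion channels) and uses a trained ANN rather than this single linear layer; the sketch only illustrates how learned weights can replace fixed, ad hoc ones in the fusion step.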
