Attentive models in vision: Computing saliency maps in the deep learning era

Marcella Cornia; Davide Abati; Lorenzo Baraldi; Andrea Palazzi; Simone Calderara; Rita Cucchiara



Abstract

Estimating the focus of attention of a person looking at an image or a video is a crucial step that can enhance many vision-based inference mechanisms; image segmentation and annotation, video captioning, and autonomous driving are some examples. The early stages of attentive behavior are typically bottom-up; reproducing the same mechanism means finding the saliency embodied in the images, i.e. which parts of an image pop out of a visual scene. This process has been studied for decades, both in neuroscience and in terms of computational models for reproducing the human cortical process. In the last few years, early models have been replaced by deep learning architectures, which outperform any early approach on public datasets. In this paper, we discuss the effectiveness of convolutional neural network (CNN) models in saliency prediction. We present a set of deep learning architectures developed by us, which can combine both bottom-up cues and higher-level semantics, and extract spatio-temporal features by means of 3D convolutions to model task-driven attentive behaviors. We show how these deep networks closely recall the early saliency models, although improved with the semantics learned from human ground truth. Finally, we present a use case in which saliency prediction is used to improve the automatic description of images.


Bibliographic record

Source
Intelligenza Artificiale | 2018, Issue 2 | 15 pages
Authors
Marcella Cornia; Davide Abati; Lorenzo Baraldi; Andrea Palazzi; Simone Calderara; Rita Cucchiara
Author affiliations

Department of Engineering "Enzo Ferrari", University of Modena and Reggio Emilia (all six authors)

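Indexing information
Original format: PDF
Language of text: English
Chinese Library Classification: Artificial intelligence theory
Keywords
Saliency; Human Attention; Neuroscience; Vision; Deep Learning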


Similar literature

1. Attentive models in vision: Computing saliency maps in the deep learning era [J]. Marcella Cornia, Davide Abati, Lorenzo Baraldi. Intelligenza Artificiale. 2018, Issue 2.
2. Synthesizing Supervision for Learning Deep Saliency Network without Human Annotation [J]. Zhang Dingwen, Han Junwei, Zhang Yu. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020, Issue 7.
3. Subword Attentive Model for Arabic Sentiment Analysis: A Deep Learning Approach [J]. Beseiso Majdi, Elmousalami Haytham. ACM Transactions on Asian and Low-Resource Language Information Processing. 2020, Issue 2.
4. Attentive Models in Vision: Computing Saliency Maps in the Deep Learning Era [C]. Marcella Cornia, Davide Abati, Lorenzo Baraldi. International Conference of the Italian Association for Artificial Intelligence. 2017.
5. Leveraging Model Flexibility and Deep Structure: Non-parametric and Deep Models for Computer Vision Processes with Applications to Deep Model Compression [D]. Rhodes, Anthony D. 2020.
6. Investigation of a Novel Deep Learning-Based Computed Tomography Perfusion Mapping Framework for Functional Lung Avoidance Radiotherapy [O]. Ge Ren, Sai-kit Lam, Jiang Zhang. 2021.
7. Attentive models in vision: Computing saliency maps in the deep learning era [O]. Marcella Cornia, Davide Abati, Lorenzo Baraldi. 2019.
