Frontiers in Computational Neuroscience

Distorted Low-Level Visual Features Affect Saliency-Based Visual Attention


Abstract

Introduction

Image distortions can draw attention away from the naturally salient regions of a scene (Redi et al., 2011). Viewers' performance in visual search tasks and their fixation patterns are also affected by the type and amount of distortion (Vu et al., 2008). In this paper, we discuss the view that distortions can substantially affect the performance of predictive models of visual attention, and we simulate the effects of distorted low-level visual features on saliency-based bottom-up visual attention. Saliency is a fast, pre-attentive mechanism for orienting visual attention to intrinsically important objects, which pop out more easily in a cluttered scene. Distortion of the low-level features that contribute to saliency may impair the readiness of the visual system to detect salient objects, which may have major implications for critical situations such as driving or locomotion. In everyday life, these distortions can be introduced by eye diseases such as cataract, or by spectacles, which may alter color perception (de Fez et al., 2002) or cause undesired optical effects such as blurring, non-uniform magnification, and image displacement (Barbero and Portilla, 2016). This paper addresses the extent to which each of these distorted saliency features may affect attentional performance by employing a biologically inspired predictive model of visual attention. In the following section, we briefly summarize the current state of computational work on visual attention models and propose a simple and influential model of saliency to examine the above hypothesis. Furthermore, we demonstrate with an example how the performance of the predictive saliency model is hindered on distorted images.
Models of visual attention

Despite the belief, widely shared among the general public, that we see everything around us, only a small fraction of the information registered by the visual system reaches levels of processing that mediate perception and directly influence behavior (Itti and Koch, 2000). Selective attention is the key to this process, which turns looking into seeing (Carrasco, 2011). But how does the visual system select and enhance the representation of one particular feature or spatial location over less relevant features and locations? Much evidence has accumulated in favor of a two-component framework for controlling where in a visual scene attention is deployed: a fast, bottom-up, image-based mechanism that biases the observer toward selecting stimuli based on their saliency, and a slower, top-down mechanism that uses task-dependent cues to direct the spotlight of attention under voluntary control.

Koch and Ullman (1985) introduced the idea of a saliency map to accomplish pre-attentive selection. A saliency map is an explicit two-dimensional map that encodes the saliency of visual objects in the environment based purely on the low-level visual attributes of those objects (Itti et al., 1998). Competition among neurons in this map gives rise to a single winning location that corresponds to the most salient object, which constitutes the next target. If this location is subsequently inhibited, the system automatically shifts to the next most salient location. These internal dynamics model the saccadic eye movements made during visual search. This purely computational hypothesis has received experimental support from many electrophysiological studies, including single-cell recordings from lateral intraparietal neurons of macaque monkeys, which responded to visual stimuli only when those stimuli were made salient (Gottlieb et al., 1998).
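The selection cycle described above (pick the most salient location, inhibit it, move on to the next) can be illustrated with a minimal sketch. This is not the implementation of Itti et al. (1998), which relies on a dynamical neural network; here the competition is reduced to a plain maximum search and inhibition to a hard suppression of a small disc around the winner. The function name `scan_saliency_map`, its parameters, and the toy map values are hypothetical and chosen only for illustration.

```python
def scan_saliency_map(saliency, n_fixations=3, inhibit_radius=1):
    """Return successive (row, col) winners from a 2D saliency map.

    Assumption: winner-take-all is approximated by a maximum search, and
    inhibition by setting a disc around the winner to -inf.
    """
    # Work on a float copy so the caller's map is left untouched.
    s = [[float(v) for v in row] for row in saliency]
    n_rows, n_cols = len(s), len(s[0])
    fixations = []
    for _ in range(n_fixations):
        # Winner-take-all: the location with the highest saliency wins.
        r, c = max(
            ((i, j) for i in range(n_rows) for j in range(n_cols)),
            key=lambda p: s[p[0]][p[1]],
        )
        fixations.append((r, c))
        # Inhibit the winner and its neighbourhood so that the next
        # iteration shifts to the next most salient location.
        for i in range(n_rows):
            for j in range(n_cols):
                if (i - r) ** 2 + (j - c) ** 2 <= inhibit_radius ** 2:
                    s[i][j] = float("-inf")
    return fixations


# Toy 5x5 map with three salient spots of decreasing strength.
toy_map = [[0.0] * 5 for _ in range(5)]
toy_map[1][1] = 0.9
toy_map[3][4] = 0.6
toy_map[4][0] = 0.3

print(scan_saliency_map(toy_map))  # [(1, 1), (3, 4), (4, 0)]
```

The returned sequence plays the role of a simulated scan path: each entry is the next target once the previous winner has been inhibited.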
Today, more than fifty quantitative models of saliency and fixation prediction, accumulated over the past 20 years, are available; some of them also try to incorporate top-down attention (Borji and Itti, 2013; Borji et al., 2013) or context-aware saliency detection (Goferman et al., 2012). However, not all of them are biologically plausible (Zhang and Sclaroff, 2013) or explain low-level visual features (Kümmerer et al., 2014), and the metrics used to compare the performance of these models often differ and are inconsistent with each other (Kümmerer et al., 2015; Gide et al., 2016). For the purpose of this paper, we chose the original model of saliency-based visual attention for rapid scene analysis by Itti et al. (1998) for its biological plausibility and its simple separation of low-level visual features, which allows us to examine the effect of different distortions on each feature map independently, as well as on the final saliency map. In the next section, we explain how this model can be used to simulate the effect of distortions on visual attention and visual search.

Attention and image distortions

Visual attention models have been used in many computer vision applications (Pal, 2016), including image and video compression and retrieval (Ouerhani et al., 2001; Li et al., 2011), multimedia technologies (Le Callet and Niebur, 2013), and
