IEEE Transactions on Neural Networks and Learning Systems

Bottom–Up Visual Saliency Estimation With Deep Autoencoder-Based Sparse Reconstruction



Abstract

Research on visual perception indicates that the human visual system is sensitive to center–surround (C–S) contrast in the bottom–up, saliency-driven attention process. Unlike the traditional contrast computation based on feature differences, reconstruction-based models estimate saliency starting from the original images themselves rather than seeking particular features. However, in existing reconstruction-based methods, the reconstruction parameters of each area are calculated independently, without taking their global correlation into account. In this paper, inspired by the powerful feature learning and data reconstruction ability of deep autoencoders, we construct a deep C–S inference network and train it with data sampled randomly from the entire image to obtain a unified reconstruction pattern for the current image. In this way, global competition in the sampling and learning processes can be integrated into the nonlocal reconstruction and saliency estimation of each pixel, which achieves better detection results than models that consider local and global rarity separately. Moreover, by learning from the current scene, the proposed model performs feature extraction and interaction simultaneously in an adaptive way, which yields better generalization to more types of stimuli. Experimental results show that, in accordance with different inputs, the network learns distinct basic features for saliency modeling in its code layer. Furthermore, in a comprehensive evaluation on several benchmark data sets, the proposed method outperforms existing state-of-the-art algorithms.
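The core idea of the abstract — train a single autoencoder on patches sampled randomly from the whole image, so that one unified reconstruction pattern scores every location, with rare (salient) regions reconstructing poorly — can be sketched as follows. This is a minimal, hypothetical illustration using a one-hidden-layer autoencoder with tied weights in plain NumPy; all function names and hyperparameters are our own, not the paper's actual deep C–S inference network.

```python
import numpy as np

def extract_patches(img, patch, n, rng):
    # Randomly sample n flattened patch x patch windows from the whole image
    # (the "global sampling" step: one training set for the entire scene).
    H, W = img.shape
    ys = rng.integers(0, H - patch + 1, n)
    xs = rng.integers(0, W - patch + 1, n)
    return np.stack([img[y:y + patch, x:x + patch].ravel()
                     for y, x in zip(ys, xs)])

def train_autoencoder(X, hidden, epochs=200, lr=0.1, rng=None):
    # One-hidden-layer autoencoder with tied weights and a sigmoid code
    # layer, trained by plain gradient descent on squared reconstruction
    # error. A stand-in for the paper's deep network.
    if rng is None:
        rng = np.random.default_rng(0)
    d = X.shape[1]
    W = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = 1.0 / (1.0 + np.exp(-(X @ W + b1)))   # encode
        E = H @ W.T + b2 - X                      # reconstruction error
        dH = (E @ W) * H * (1.0 - H)              # backprop through code
        gW = X.T @ dH + E.T @ H                   # tied-weight gradient
        W -= lr * gW / len(X)
        b1 -= lr * dH.mean(0)
        b2 -= lr * E.mean(0)
    return W, b1, b2

def saliency_map(img, W, b1, b2, patch, stride=2):
    # Saliency of a pixel = averaged reconstruction error of the patches
    # covering it: patterns common across the image reconstruct well (low
    # saliency), rare patterns reconstruct poorly (high saliency).
    H_img, W_img = img.shape
    sal = np.zeros_like(img, dtype=float)
    cnt = np.zeros_like(img, dtype=float)
    for y in range(0, H_img - patch + 1, stride):
        for x in range(0, W_img - patch + 1, stride):
            v = img[y:y + patch, x:x + patch].ravel()
            h = 1.0 / (1.0 + np.exp(-(v @ W + b1)))
            err = np.sum((h @ W.T + b2 - v) ** 2)
            sal[y:y + patch, x:x + patch] += err
            cnt[y:y + patch, x:x + patch] += 1
    return sal / np.maximum(cnt, 1)
```

On a synthetic image with a uniform background and a small distinct region, patches from the dominant background dominate training, so the learned reconstruction fits them well and the rare region stands out with high error — the global-competition effect the abstract describes.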


