IEEE Transactions on Cybernetics

Scene Categorization by Deeply Learning Gaze Behavior in a Semisupervised Context



Abstract

Accurately recognizing scene categories with sophisticated spatial configurations is a useful technique in computer vision and intelligent systems, e.g., scene understanding and autonomous driving. Deep recognition models have recently achieved competitive accuracies. Nevertheless, these deep architectures cannot explicitly characterize human visual perception, that is, the sequence of gaze allocation and the subsequent cognitive processes involved in viewing each scene. In this paper, a novel spatially aware aggregation network is proposed for scene categorization, where human gaze behavior is discovered in a semisupervised setting. In particular, as semantically labeling a large quantity of scene images is labor-intensive, a semisupervised and structure-preserved non-negative matrix factorization (NMF) is proposed to detect a set of visually/semantically salient regions in each scene. Afterward, the gaze shifting path (GSP) is engineered to characterize the process by which humans perceive each scene picture. To deeply describe each GSP, a novel spatially aware CNN termed SA-Net is developed. It accepts input regions with various shapes and statistically aggregates all the salient regions along each GSP. Finally, the learned deep GSP features from all scene images are fused into an image kernel, which is subsequently integrated into a kernel SVM to categorize different scenes. Comparative experiments on six scene image sets have shown the advantage of our method.
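To illustrate the decomposition underlying the salient-region detection step, the following is a minimal sketch of plain NMF solved by Lee-Seung multiplicative updates. It is not the paper's semisupervised, structure-preserved variant: the label constraints and structure-preservation terms are omitted, and the input matrix `V` is a stand-in for region feature descriptors.

```python
# Minimal NMF sketch via multiplicative updates (Frobenius objective).
# This illustrates only the basic factorization V ~ W @ H; the paper's
# semisupervised and structure-preserving terms are not modeled here.
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Factor a non-negative matrix V (m x n) into W (m x rank) @ H (rank x n)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Lee-Seung multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy usage: factor a random non-negative "feature" matrix.
V = np.abs(np.random.default_rng(1).random((40, 30)))
W, H = nmf(V, rank=5)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In the paper's pipeline, the factorization's basis/encoding would be further constrained by the available labels so that the recovered components correspond to salient regions; here the update rule only demonstrates the non-negativity-preserving optimization.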
