IEEE Conference on Computer Vision and Pattern Recognition

Deep Structured Scene Parsing by Learning with Image Descriptions

Abstract

This paper addresses a fundamental problem of scene understanding: How to parse the scene image into a structured configuration (i.e., a semantic object hierarchy with object interaction relations) that finely accords with human perception. We propose a deep architecture consisting of two networks: i) a convolutional neural network (CNN) extracting the image representation for pixelwise object labeling and ii) a recursive neural network (RNN) discovering the hierarchical object structure and the inter-object relations. Rather than relying on elaborative user annotations (e.g., manually labeling semantic maps and relations), we train our deep model in a weakly-supervised manner by leveraging the descriptive sentences of the training images. Specifically, we decompose each sentence into a semantic tree consisting of nouns and verb phrases, and facilitate these trees discovering the configurations of the training images. Once these scene configurations are determined, then the parameters of both the CNN and RNN are updated accordingly by back propagation. The entire model training is accomplished through an Expectation-Maximization method. Extensive experiments suggest that our model is capable of producing meaningful and structured scene configurations and achieving more favorable scene labeling performance on PASCAL VOC 2012 over other state-of-the-art weakly-supervised methods.
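As a rough illustration of the two-network design described above, the sketch below (PyTorch, not the authors' implementation) pairs a small CNN that produces per-pixel class scores with a recursive composer that merges region features into a hierarchy and scores inter-object relations. The layer sizes, class and relation counts, the greedy merging rule, and all names are assumptions made for illustration only; the EM-style weakly-supervised training against sentence-derived semantic trees is not shown.

```python
# Minimal sketch (assumed architecture, not the authors' code): a CNN for
# pixel-wise labeling plus a recursive network that composes regions into a
# hierarchy and predicts the relation between merged children.
import torch
import torch.nn as nn

NUM_CLASSES = 21     # e.g. PASCAL VOC 2012 categories (assumed)
NUM_RELATIONS = 8    # hypothetical inter-object relation vocabulary
FEAT_DIM = 64

class PixelLabelCNN(nn.Module):
    """CNN mapping an image to per-pixel class scores (semantic map)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, FEAT_DIM, 3, padding=1), nn.ReLU(),
            nn.Conv2d(FEAT_DIM, FEAT_DIM, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(FEAT_DIM, NUM_CLASSES, 1)

    def forward(self, image):                   # image: (B, 3, H, W)
        feat = self.backbone(image)             # (B, FEAT_DIM, H, W)
        return self.classifier(feat), feat      # per-pixel scores + features

class RecursiveComposer(nn.Module):
    """Recursive network: merge two region features into a parent node and
    score the relation between the two children."""
    def __init__(self):
        super().__init__()
        self.compose = nn.Linear(2 * FEAT_DIM, FEAT_DIM)
        self.relation = nn.Linear(2 * FEAT_DIM, NUM_RELATIONS)
        self.merge_score = nn.Linear(FEAT_DIM, 1)

    def forward(self, left, right):             # left/right: (FEAT_DIM,)
        pair = torch.cat([left, right], dim=-1)
        parent = torch.tanh(self.compose(pair)) # parent node representation
        rel_logits = self.relation(pair)        # relation between the children
        score = self.merge_score(parent)        # plausibility of this merge
        return parent, rel_logits, score

def greedy_parse(region_feats, composer):
    """Greedily build a binary hierarchy over object regions, keeping the
    highest-scoring merge at each step (one simple inference strategy)."""
    nodes, tree = list(region_feats), []
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                parent, rel, score = composer(nodes[i], nodes[j])
                if best is None or score.item() > best[0]:
                    best = (score.item(), i, j, parent, rel)
        _, i, j, parent, rel = best
        tree.append((i, j, rel.argmax().item()))        # record merge + relation
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [parent]
    return tree

if __name__ == "__main__":
    cnn, composer = PixelLabelCNN(), RecursiveComposer()
    scores, feat = cnn(torch.randn(1, 3, 64, 64))
    print(scores.shape)                                  # (1, NUM_CLASSES, 64, 64)
    # Pool CNN features over a few hypothetical object regions, then parse.
    regions = [feat[0, :, :32, :32].mean(dim=(1, 2)),
               feat[0, :, :32, 32:].mean(dim=(1, 2)),
               feat[0, :, 32:, :].mean(dim=(1, 2))]
    print(greedy_parse(regions, composer))
```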