Journal: Computer Vision and Image Understanding

Image Understanding using vision and reasoning through Scene Description Graph


Abstract

Two of the fundamental tasks in image understanding using text are caption generation and visual question answering (Antol et al., 2015). This work presents an intermediate knowledge structure that can be used for both tasks to improve interpretability. We call this knowledge structure a Scene Description Graph (SDG): a directed labeled graph that represents objects, actions, and regions together with their attributes, along with inferred concepts and semantic (from KM-Ontology (Clark et al., 2004)), ontological (i.e., superclass, hasProperty), and spatial relations. On this basis, a general architecture is proposed in which a system can represent both the content and the underlying concepts of an image using an SDG. The architecture is implemented using generic visual recognition techniques and commonsense reasoning to extract graphs from images. The utility of the generated SDGs is demonstrated in image captioning and image retrieval applications, and through examples in visual question answering. The experiments show that the extracted graphs capture the syntactic and semantic content of images with reasonable accuracy.
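To make the idea of a directed labeled graph concrete, the following is a minimal Python sketch of how such a structure might be stored and queried. It is an illustration only, not the authors' implementation: the class name, the node kinds, the example scene, and the edge labels "agent", "object", and "on" are all assumptions chosen for the example. Only "superclass" and "hasProperty" are relation names taken from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class SceneDescriptionGraph:
    """Toy directed labeled graph (hypothetical; not the paper's code)."""
    node_kind: dict = field(default_factory=dict)  # node name -> kind
    edges: list = field(default_factory=list)      # (source, label, target)

    def add_node(self, name, kind):
        self.node_kind[name] = kind

    def add_edge(self, source, label, target):
        # Refuse edges whose endpoints were never declared as nodes.
        for node in (source, target):
            if node not in self.node_kind:
                raise KeyError(f"unknown node: {node}")
        self.edges.append((source, label, target))

    def targets(self, source, label):
        """All nodes reached from `source` along edges labeled `label`."""
        return [t for s, l, t in self.edges if s == source and l == label]


def build_example():
    """Hand-built graph for an assumed image of a person riding a horse."""
    g = SceneDescriptionGraph()
    g.add_node("person", "object")
    g.add_node("horse", "object")
    g.add_node("ride", "action")
    g.add_node("field", "region")
    g.add_node("brown", "attribute")
    g.add_node("animal", "concept")          # inferred concept
    g.add_edge("ride", "agent", "person")    # semantic relation (assumed label)
    g.add_edge("ride", "object", "horse")    # semantic relation (assumed label)
    g.add_edge("horse", "hasProperty", "brown")  # ontological relation
    g.add_edge("horse", "superclass", "animal")  # ontological relation
    g.add_edge("person", "on", "horse")      # spatial relation (assumed label)
    return g


def caption(g, action):
    """Template caption built from one action node and its neighbors."""
    agent = g.targets(action, "agent")[0]
    obj = g.targets(action, "object")[0]
    props = " ".join(g.targets(obj, "hasProperty"))
    obj_phrase = f"{props} {obj}".strip()
    return f"a {agent} is performing '{action}' on a {obj_phrase}"


g = build_example()
answer = g.targets("ride", "object")  # "What is being ridden?"
text = caption(g, "ride")
```

In this sketch a question such as "what is being ridden?" reduces to following one labeled edge, and a caption is filled in from the same edges. That is one simple way a single intermediate structure could serve both tasks while keeping each answer traceable to explicit graph relations.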
