Journal: ETRI Journal

Improving visual relationship detection using linguistic and spatial cues


Abstract

Detecting visual relationships in an image is an important part of image understanding. It enables higher-level image understanding tasks, namely predicting the next scene and understanding what occurs in an image. A visual relationship comprises a subject, a predicate, and an object, and is related to visual, language, and spatial cues. The predicate describes the relationship between the subject and the object and can fall into categories such as prepositions and verbs. Even relationships that share the same predicate can differ greatly in visual appearance. This study improves upon a previous study, which uses language cues through two losses and a spatial cue that includes only individual information, by adding relative information on the subject and object. An architectural limitation is demonstrated and overcome so that all zero-shot visual relationships can be detected. A new problem is discovered, and an explanation of how it decreases performance is provided. Experiments are conducted on the VRD and VG datasets, and a significant improvement over previous results is obtained.
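To make the distinction between individual and relative spatial information concrete, the following is a minimal sketch and not the method from the paper. The abstract does not specify the feature encoding, so the `Box` type, the function names, and the particular features (normalized corners, offset and log-size ratios, IoU) are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical helper type); assumes x2 > x1 and y2 > y1."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1

    @property
    def area(self) -> float:
        return self.width * self.height


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0


def individual_features(box: Box, img_w: float, img_h: float) -> list[float]:
    """Per-box cue: corners and area normalized by image size (assumed encoding)."""
    return [
        box.x1 / img_w,
        box.y1 / img_h,
        box.x2 / img_w,
        box.y2 / img_h,
        box.area / (img_w * img_h),
    ]


def relative_features(subj: Box, obj: Box) -> list[float]:
    """Pairwise cue: object position and size relative to the subject (assumed encoding)."""
    dx = (obj.x1 - subj.x1) / subj.width      # horizontal offset in subject widths
    dy = (obj.y1 - subj.y1) / subj.height     # vertical offset in subject heights
    dw = math.log(obj.width / subj.width)     # log width ratio
    dh = math.log(obj.height / subj.height)   # log height ratio
    return [dx, dy, dw, dh, iou(subj, obj)]


# Example triple <person, next to, dog> with hypothetical boxes in a 100x100 image.
subject_box = Box(0, 0, 10, 10)
object_box = Box(5, 5, 15, 15)
spatial_cue = (
    individual_features(subject_box, 100, 100)
    + individual_features(object_box, 100, 100)
    + relative_features(subject_box, object_box)
)
```

In this sketch the individual features describe each box in isolation, while the relative features change only when the subject and object move or scale with respect to each other, which is the kind of information a predicate such as "next to" or "on" depends on.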
