Conference: IEEE Conference on Computer Vision and Pattern Recognition
Modeling Relationships in Referential Expressions with Compositional Modular Networks

Abstract

People often refer to entities in an image in terms of their relationships with other entities. For example, "the black cat sitting under the table" refers to both a black cat entity and its relationship with another entity, the table. Understanding these relationships is essential for interpreting and grounding such natural language expressions. Most prior work focuses on either grounding entire referential expressions holistically to one region, or localizing relationships based on a fixed set of categories. In this paper, we instead present a modular deep architecture capable of analyzing referential expressions into their component parts, identifying entities and relationships mentioned in the input expression and grounding them all in the scene. We call this approach Compositional Modular Networks (CMNs): a novel architecture that learns linguistic analysis and visual inference end-to-end. Our approach is built around two types of neural modules that inspect local regions and pairwise interactions between regions. We evaluate CMNs on multiple referential expression datasets, outperforming state-of-the-art approaches on all tasks.
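To make the two module types concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the abstract does not say how module outputs are combined, so summing the unary and pairwise scores is an assumption made here for illustration. The function names (`localization_scores`, `relationship_scores`, `ground`), the one-hot region features, the box centers, and the hand-set text vectors are all hypothetical stand-ins for what a trained network would learn end-to-end.

```python
from itertools import product


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def localization_scores(region_feats, text_vec):
    """Unary module (hypothetical): score each region against one entity phrase."""
    return [dot(feat, text_vec) for feat in region_feats]


def relationship_scores(centers, text_vec):
    """Pairwise module (hypothetical): score each ordered region pair
    from the spatial offset between the two box centers."""
    n = len(centers)
    scores = [[0.0] * n for _ in range(n)]
    for i, j in product(range(n), repeat=2):
        offset = (centers[i][0] - centers[j][0], centers[i][1] - centers[j][1])
        scores[i][j] = dot(offset, text_vec)
    return scores


def ground(region_feats, centers, q_subj, q_rel, q_obj):
    """Return the (subject, object) region pair with the highest total score.
    Assumption: total = subject score + relationship score + object score."""
    subj = localization_scores(region_feats, q_subj)
    obj = localization_scores(region_feats, q_obj)
    rel = relationship_scores(centers, q_rel)
    n = len(centers)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    return max(pairs, key=lambda p: subj[p[0]] + rel[p[0]][p[1]] + obj[p[1]])


# Toy scene for "the cat under the table". Features are one-hot [cat, table];
# centers are (x, y) in image coordinates, where a larger y is lower in the image.
region_feats = [[1, 0], [0, 1], [1, 0]]   # region 0: cat, 1: table, 2: another cat
centers = [(5, 8), (5, 4), (5, 1)]        # cat 0 is below the table, cat 2 above it

q_subj = [1, 0]      # hand-set vector standing in for the phrase "cat"
q_obj = [0, 1]       # hand-set vector standing in for the phrase "table"
q_rel = (0.0, 0.1)   # hand-set vector for "under": rewards subject below object

best = ground(region_feats, centers, q_subj, q_rel, q_obj)
print(best)  # (0, 1)
```

Both cats receive the same unary score for "cat", so only the pairwise term separates them: the cat below the table wins. This illustrates why inspecting pairwise interactions can matter in addition to inspecting local regions.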
