Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition

Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension


Abstract

Referring expression comprehension (REF) aims to identify a particular object in a scene from a natural language expression. Solving it requires joint reasoning over the textual and visual domains. Some popular referring expression datasets, however, fail to provide an ideal test bed for evaluating the reasoning ability of the models, mainly because 1) their expressions typically describe only some simple distinctive properties of the object and 2) their images contain limited distracting information. To bridge the gap, we propose a new dataset for visual reasoning in the context of referring expression comprehension with two main features. First, we design a novel expression engine rendering various reasoning logics that can be flexibly combined with rich visual properties to generate expressions with varying compositionality. Second, to better exploit the full reasoning chain embodied in an expression, we propose a new test setting that adds distracting images containing objects sharing similar properties with the referent, thus minimising the success rate of reasoning-free cross-domain alignment. We evaluate several state-of-the-art REF models, but find that none of them achieves promising performance. A proposed modular hard mining strategy performs best but still leaves substantial room for improvement.
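The test setting with distracting images can be illustrated with a minimal sketch. This is not the authors' evaluation code: the data layout (`scores` mapping each image id to scored candidate boxes), the function names, and the 0.5 IoU threshold are all assumptions made for illustration. The idea it shows is that a prediction counts as correct only if the top-scoring box across the target image and all distractor images lies in the target image and overlaps the ground-truth box sufficiently.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct(
    scores: Dict[str, List[Tuple[Box, float]]],
    target_image: str,
    gt_box: Box,
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the abstract
) -> bool:
    """Pick the best-scoring box over ALL images (target plus distractors).

    The prediction is correct only if that box is in the target image and
    overlaps the ground-truth box by at least `iou_threshold`.
    """
    best_image, best_box, best_score = None, None, float("-inf")
    for image_id, candidates in scores.items():
        for box, score in candidates:
            if score > best_score:
                best_image, best_box, best_score = image_id, box, score
    if best_image != target_image or best_box is None:
        return False
    return iou(best_box, gt_box) >= iou_threshold


# Hypothetical model scores: the distractor image contains a similar object
# that the model scores higher than the true referent.
with_distractor = {
    "target": [((0, 0, 10, 10), 0.7), ((20, 20, 30, 30), 0.2)],
    "distractor": [((0, 0, 10, 10), 0.9)],
}
target_only = {"target": with_distractor["target"]}
```

With these hypothetical scores, the model is right when only the target image is considered but wrong once the distractor is added, which is the kind of failure the proposed setting is meant to expose.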
