Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension

Abstract

Referring expression comprehension (REF) aims at identifying a particular object in a scene by a natural language expression. It requires joint reasoning over the textual and visual domains to solve the problem. Some popular referring expression datasets, however, fail to provide an ideal test bed for evaluating the reasoning ability of the models, mainly because 1) their expressions typically describe only some simple distinctive properties of the object and 2) their images contain limited distracting information. To bridge the gap, we propose a new dataset for visual reasoning in the context of referring expression comprehension with two main features. First, we design a novel expression engine rendering various reasoning logics that can be flexibly combined with rich visual properties to generate expressions with varying compositionality. Second, to better exploit the full reasoning chain embodied in an expression, we propose a new test setting by adding additional distracting images containing objects sharing similar properties with the referent, thus minimising the success rate of reasoning-free cross-domain alignment. We evaluate several state-of-the-art REF models, but find none of them can achieve promising performance. A proposed modular hard mining strategy performs the best but still leaves substantial room for improvement.

Bibliographic details

Source
IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10083-10092 (10 pages)
Authors
Zhenfang Chen; Peng Wang; Lin Ma; Kwan-Yee K. Wong; Qi Wu
Original format: PDF
Keywords
Cognition; Visualization; Task analysis; Cats; Engines; Semantics; Genetic expression

Similar literature

1. A task-performance evaluation of referring expressions in situated collaborative task dialogues [J]. Philipp Spanger, Ryu Iida, Takenobu Tokunaga. Language Resources and Evaluation, 2013, No. 4
2. Stacked Attention Networks for Referring Expressions Comprehension [J]. Yugang Li, Haibo Sun, Zhe Chen. Computers, Materials & Continua, 2020, No. 3
3. Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation [C]. Gen Luo, Yiyi Zhou, Xiaoshuai Sun. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
4. Referring Expression Comprehension for CLEVR-Ref+ Dataset [D]. Rathor, Kuldeep Singh. 2020
5. Identification of candidate drugs using tensor-decomposition-based unsupervised feature extraction in integrated analysis of gene expression between diseases and DrugMatrix datasets [O]. Y.-h. Taguchi
6. Referring Expression Comprehension: A Survey of Methods and Datasets [O]. Yanyuan Qiao, Chaorui Deng, Qi Wu. 2020
