Multi-task Compositional Network for Visual Relationship Detection

Zhan Yibing; Yu Jun; Yu Ting; Tao Dacheng


Abstract

Previous methods treat visual relationship detection as a combination of object detection and predicate detection. However, natural images likely contain hundreds of objects and thousands of object pairs. Relying only on object detection and predicate detection is insufficient for effective visual relationship detection because the significant relationships are easily overwhelmed by the dominant less-significant relationships. In this paper, we propose a novel subtask for visual relationship detection, the significance detection, as the complement of object detection and predicate detection. Significance detection refers to the task of identifying object pairs with significant relationships. Meanwhile, we propose a novel multi-task compositional network (MCN) that simultaneously performs object detection, predicate detection, and significance detection. MCN consists of three modules, an object detector, a relationship generator, and a relationship predictor. The object detector detects objects. The relationship generator provides useful relationships, and the relationship predictor produces significance scores and predicts predicates. Furthermore, MCN proposes a multimodal feature fusion strategy based on visual, spatial, and label features and a novel correlated loss function to deeply combine object detection, predicate detection, and significance detection. MCN is validated on two datasets: visual relationship detection dataset and visual genome dataset. The experimental results compared with state-of-the-art methods verify the competitiveness of MCN and the usefulness of significance detection in visual relationship detection.
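The role of significance detection described in the abstract can be illustrated with a minimal sketch. It assumes each candidate triple carries four scores in [0, 1] and that they are fused by simple multiplication. The `Candidate` class, the function names, the example scores, and the product rule are all hypothetical and chosen for illustration only; the abstract does not specify how MCN combines its outputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One (subject, predicate, object) triple with hypothetical scores in [0, 1]."""
    subject: str
    predicate: str
    obj: str
    subject_score: float       # object-detection confidence for the subject
    object_score: float        # object-detection confidence for the object
    predicate_score: float     # predicate-classification confidence
    significance_score: float  # confidence that the pair holds a significant relationship


def combined_score(c: Candidate, use_significance: bool = True) -> float:
    """Fuse per-task scores by multiplication (an assumed rule, not the paper's)."""
    score = c.subject_score * c.object_score * c.predicate_score
    if use_significance:
        score *= c.significance_score
    return score


def rank_relationships(cands: List[Candidate], top_k: int,
                       use_significance: bool = True) -> List[Candidate]:
    """Return the top_k candidates ordered by descending combined score."""
    ordered = sorted(cands, key=lambda c: combined_score(c, use_significance),
                     reverse=True)
    return ordered[:top_k]


# Made-up scores: the background pair is detected confidently but is not significant.
candidates = [
    Candidate("person", "ride", "horse", 0.90, 0.85, 0.80, 0.95),
    Candidate("sky", "above", "grass", 0.95, 0.95, 0.90, 0.10),
    Candidate("person", "wear", "hat", 0.90, 0.70, 0.75, 0.80),
]

for c in rank_relationships(candidates, top_k=2):
    print(c.subject, c.predicate, c.obj, round(combined_score(c), 4))
```

With these made-up scores, ranking on object and predicate scores alone puts the background pair (sky, above, grass) first. Adding the significance score moves it below both person-centred triples, which is the effect the abstract attributes to significance detection.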


Bibliographic details

Source
International Journal of Computer Vision | 2020, Issue 9 | 20 pages
Authors
Zhan Yibing; Yu Jun; Yu Ting; Tao Dacheng
Author affiliations

Hangzhou Dianzi Univ Sch Comp Sci & Technol Key Lab Complex Syst Modeling & Simulat Hangzhou Peoples R China;

Hangzhou Dianzi Univ Sch Comp Sci & Technol Key Lab Complex Syst Modeling & Simulat Hangzhou Peoples R China;

Hangzhou Dianzi Univ Sch Comp Sci & Technol Key Lab Complex Syst Modeling & Simulat Hangzhou Peoples R China;

Univ Sydney Fac Engn UBTECH Sydney Artificial Intelligence Ctr Sch Comp Sci 6 Cleveland St Darlington NSW 2008 Australia;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Visual relationship detection; Object detection; Predicate detection; Significance detection; Multi-task;


