Journal: International Journal of Computer Vision

Multi-task Compositional Network for Visual Relationship Detection



Abstract

Previous methods treat visual relationship detection as a combination of object detection and predicate detection. However, a natural image may contain hundreds of objects and thousands of object pairs. Relying only on object detection and predicate detection is therefore insufficient for effective visual relationship detection, because the significant relationships are easily overwhelmed by the far more numerous less-significant ones. In this paper, we propose a novel subtask for visual relationship detection, significance detection, as a complement to object detection and predicate detection. Significance detection is the task of identifying object pairs that have significant relationships. We also propose a novel multi-task compositional network (MCN) that simultaneously performs object detection, predicate detection, and significance detection. MCN consists of three modules: an object detector, a relationship generator, and a relationship predictor. The object detector detects objects, the relationship generator provides useful relationships, and the relationship predictor produces significance scores and predicts predicates. Furthermore, MCN uses a multimodal feature fusion strategy based on visual, spatial, and label features, together with a novel correlated loss function, to deeply combine object detection, predicate detection, and significance detection. MCN is validated on two datasets: the Visual Relationship Detection dataset and the Visual Genome dataset. Experimental comparisons with state-of-the-art methods verify the competitiveness of MCN and the usefulness of significance detection in visual relationship detection.
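To make the pair-explosion argument concrete, the following is a minimal sketch, not the MCN implementation. It enumerates every ordered (subject, object) pair and ranks the pairs by a combined score. The names `Detection`, `candidate_pairs`, and `rank_relationships`, the hand-written scores, and the choice to multiply the scores are all assumptions made for illustration; the abstract does not specify how MCN combines its outputs.

```python
from itertools import permutations
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical container for one detected object."""
    label: str
    score: float  # object-detector confidence in [0, 1]


def candidate_pairs(objects):
    """Enumerate every ordered (subject, object) index pair: n * (n - 1) pairs."""
    return list(permutations(range(len(objects)), 2))


def rank_relationships(objects, significance, predicates, top_k=3):
    """Rank candidate pairs by a combined score (illustrative only).

    significance: dict mapping (i, j) -> assumed significance score in [0, 1]
    predicates:   dict mapping (i, j) -> (predicate label, predicate score)
    Multiplying the scores is an assumption, not the paper's scoring rule.
    """
    ranked = []
    for i, j in candidate_pairs(objects):
        pred_label, pred_score = predicates[(i, j)]
        combined = (objects[i].score * objects[j].score
                    * pred_score * significance[(i, j)])
        ranked.append((combined, objects[i].label, pred_label, objects[j].label))
    ranked.sort(key=lambda r: r[0], reverse=True)
    return ranked[:top_k]


# Toy data: three objects give six ordered pairs, only one of which is significant.
objects = [Detection("person", 0.9), Detection("horse", 0.8), Detection("tree", 0.7)]
pairs = candidate_pairs(objects)
significance = {p: 0.1 for p in pairs}         # most pairs: low significance
predicates = {p: ("near", 0.5) for p in pairs}  # generic fallback predicate
significance[(0, 1)] = 0.9
predicates[(0, 1)] = ("ride", 0.8)

top = rank_relationships(objects, significance, predicates, top_k=1)
print(len(pairs), top[0][1:])
```

With 100 detected objects the same enumeration yields 9,900 ordered pairs, which is the scale at which a significance score becomes useful for pushing the few meaningful pairs to the top of the ranking.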
