Journal: Knowledge-Based Systems

RDBN: Visual relationship detection with inaccurate RGB-D images

Abstract

Traditional visual relationship detection methods train the semantic network on RGB information alone. This does not match how humans perceive the world, which combines RGB information with depth information, so these methods lack the generalization ability (zero-shot performance) needed to extract visual relationships in practical scenes. To solve this problem, this paper proposes a novel visual relationship detection framework based on RGB-D images. Because accurate depth maps are difficult to obtain for complex scenes, we propose a method based on a fuzzy strategy to represent the depth features of inaccurate depth maps without relying on manual depth annotations. In particular, we formulate the RGB-Depth-Balanced-Network (RDBN), which processes RGB features and the corresponding estimated depth maps simultaneously to counter the inaccuracy of the depth maps, and which extracts semantic information with monocular RGB images as its only input. We conduct ablation experiments to analyze the function of the different visual components and to demonstrate the effectiveness of RDBN. Furthermore, we show that RDBN outperforms state-of-the-art visual relationship detection methods on the Visual Relationship Dataset (VRD) and the UnRel Dataset, both on zero-shot visual relationship detection under specific depth conditions and on image retrieval among unusual relationships. © 2020 Elsevier B.V. All rights reserved.
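
The abstract does not specify the fuzzy strategy, so the following is only an illustrative sketch of the general idea of describing an unreliable estimated depth map with soft memberships rather than hard depth orderings. It is not the RDBN formulation. The function names, the use of a median over each bounding box, the triangular membership shape, and the `tolerance` parameter are all assumptions made for this example.

```python
from statistics import median


def region_depth(depth_map, box):
    """Median estimated depth inside box = (x0, y0, x1, y1), end-exclusive.

    The median is an assumed choice: it is less sensitive to outlier pixels
    in a noisy depth estimate than the mean.
    """
    x0, y0, x1, y1 = box
    values = [depth_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return median(values)


def fuzzy_depth_relation(d_subj, d_obj, tolerance=0.2):
    """Soft memberships for the subject's depth relative to the object.

    `tolerance` is a hypothetical parameter: the relative depth gap at which
    the ordering is fully trusted. Smaller gaps are shared between
    "same_depth" and one of the ordered classes.
    """
    scale = max(d_subj, d_obj, 1e-6)
    diff = (d_obj - d_subj) / scale  # > 0 means the subject is nearer the camera
    t = max(-1.0, min(1.0, diff / tolerance))  # clamp to [-1, 1]
    return {
        "in_front": max(0.0, t),
        "same_depth": 1.0 - abs(t),
        "behind": max(0.0, -t),
    }


if __name__ == "__main__":
    # Toy 2x4 "estimated" depth map: left half near, right half far.
    depth = [[1.0, 1.0, 3.0, 3.0],
             [1.0, 1.0, 3.0, 3.0]]
    subj = region_depth(depth, (0, 0, 2, 2))
    obj = region_depth(depth, (2, 0, 4, 2))
    print(fuzzy_depth_relation(subj, obj))
```

The three memberships always sum to 1, so a small error in the estimated depth shifts weight gradually between classes instead of flipping a hard "in front" / "behind" label.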
