Journal of visual communication & image representation
Bi-projection for 360° image object detection bridged by RoI Searcher

Abstract

© 2022 Elsevier Inc. Object detection on 360° images is a vital component of 3D environment perception. Existing methods either treat panoramic images (usually represented in equirectangular projection, ERP) as normal field-of-view (FoV) images and endure the distortions, or project them into a less-distorted format and narrow the FoV, leading to unsatisfactory performance in practical applications. To solve this problem, we propose a dual-projection 360° object detection network named Bip R-CNN, consisting of three modules: a bi-projection feature extractor, a cross-projection region-of-interest (RoI) searcher, and a classification and regression predictor. Specifically, we extract the equirectangular and corresponding dual-cubemap features simultaneously from the input images. In addition, Projection-Inter Feature Fusion and Projection-Intra Feature Fusion are designed to allow mutual interaction between the bi-projective features and to promote the integration of features at different scales, respectively. In the proposed cross-projection RoI Searcher, we search for the bounding box (BBox) locations on the cubemap from the corresponding ERP spherical proposals, bridging the RoIs of the two projection formats at the feature level. Finally, the cube proposals are used to detect objects in the last predictor module. Considering the scarcity of existing panoramic datasets (only indoor scenes), we propose an efficient approach to convert conventional datasets into annotated panoramic datasets without manual intervention, increasing the diversity of panoramic datasets. Extensive experiments are conducted on synthetic and real-world datasets with spherical criteria, demonstrating the superiority of our method over other state-of-the-art solutions.
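The abstract relies on relating positions in the ERP image to positions on cubemap faces. As a rough illustration of that geometric relationship only, the sketch below maps an ERP pixel to a cube face and in-face coordinates. It is not the paper's RoI Searcher, and it does not model the dual-cubemap; the axis convention (front face along +z, y pointing up), the face names, and the function names are all assumptions made for this example.

```python
import math

def erp_pixel_to_direction(u, v, width, height):
    """Map an ERP pixel (u, v) to a unit direction (x, y, z).

    Assumed convention: the image centre looks along +z, y points up.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi)
    lat = (0.5 - v / height) * math.pi        # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return x, y, z

def direction_to_cube_face(x, y, z):
    """Pick the cube face by dominant axis; return (face, a, b).

    a and b are in-face coordinates in [-1, 1]; face names are
    hypothetical labels chosen for this sketch.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if az >= ax and az >= ay:
        face = "front" if z > 0 else "back"
        a, b = (x / az, y / az) if z > 0 else (-x / az, y / az)
    elif ax >= ay:
        face = "right" if x > 0 else "left"
        a, b = (-z / ax, y / ax) if x > 0 else (z / ax, y / ax)
    else:
        face = "up" if y > 0 else "down"
        a, b = (x / ay, -z / ay) if y > 0 else (x / ay, z / ay)
    return face, a, b

def erp_pixel_to_cube(u, v, width, height):
    """Compose the two steps: ERP pixel -> direction -> cube face coords."""
    return direction_to_cube_face(*erp_pixel_to_direction(u, v, width, height))
```

Under these assumptions, the centre pixel of a 2048×1024 ERP image lands at the centre of the front face, and pixels on the top row land on the up face. A box-level search such as the one the abstract describes would need more than this per-pixel mapping, since a spherical proposal can span several faces.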
