Journal: Sensors

Grid Based Spherical CNN for Object Detection from Panoramic Images



Abstract

Recently proposed spherical convolutional neural networks (SCNNs) have shown advantages over conventional planar CNNs in classifying spherical images. However, two factors hamper their application to object detection. First, a convolution in S2 (the two-dimensional sphere in three-dimensional space) or SO(3) (the three-dimensional special orthogonal group) loses an object’s location. Second, an excessively large bandwidth is required to preserve a small object’s information on the sphere, because the S2/SO(3) convolution must be performed on the whole sphere rather than on a local image patch. In this study, we propose a novel grid-based spherical CNN (G-SCNN) for detecting objects in spherical images. According to the input bandwidth, a spherical image is transformed into a conformal grid map that serves as the input to the S2/SO(3) convolution, and an object’s bounding box is scaled to cover an adequate area of the grid map. This addresses the second problem. For the first problem, we use a planar region proposal network (RPN) with a data augmentation strategy that increases rotation invariance. We have also created a dataset of 600 street-view panoramic images captured by a vehicle-borne panoramic camera. The dataset, named the WHU (Wuhan University) panoramic dataset, contains 5636 objects of interest annotated with class and bounding box. Results on this dataset show that our grid-based method performs substantially better than the original SCNN at detecting objects in spherical images and that it outperforms several mainstream object detection networks, such as Faster R-CNN and SSD.
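To make the bandwidth problem concrete, the sketch below resamples an equirectangular panorama onto a 2b x 2b equiangular grid for bandwidth b and estimates how many grid cells a pixel bounding box spans. It is only an illustration under stated assumptions: the sampling convention is one common equiangular scheme (often called a Driscoll-Healy grid), the panorama is assumed to be equirectangular, and the function names `dh_grid`, `sample_panorama`, and `cells_covered` are hypothetical. It does not reproduce the conformal grid map or the box scaling used in G-SCNN.

```python
import numpy as np

def dh_grid(b):
    """Colatitudes and longitudes of a 2b x 2b equiangular grid for bandwidth b."""
    j = np.arange(2 * b)
    theta = np.pi * (2 * j + 1) / (4 * b)  # colatitude, strictly inside (0, pi)
    phi = 2 * np.pi * j / (2 * b)          # longitude in [0, 2*pi)
    return theta, phi

def sample_panorama(pano, b):
    """Nearest-neighbour resample of an equirectangular image (H x W [x C]).

    Assumes rows map linearly to colatitude and columns to longitude.
    """
    h, w = pano.shape[:2]
    theta, phi = dh_grid(b)
    rows = np.minimum((theta / np.pi * h).astype(int), h - 1)
    cols = np.minimum((phi / (2 * np.pi) * w).astype(int), w - 1)
    return pano[np.ix_(rows, cols)]

def cells_covered(box_w_px, box_h_px, pano_w, pano_h, b):
    """Approximate (width, height) of a pixel box measured in grid cells."""
    return box_w_px * 2 * b / pano_w, box_h_px * 2 * b / pano_h
```

For example, with an illustrative 1024 x 512 panorama and b = 64, a 40 x 80 pixel box spans roughly 5 x 20 grid cells, which suggests why small objects need either a larger bandwidth or a rescaled bounding box. These numbers are made up for illustration and are not taken from the paper.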
